Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning | |
Li, ZF (Li, Zhenfeng)[ 1 ]; Hu, L (Hu, Lun)[ 2 ]; Tang, ZH (Tang, Zehai)[ 1 ]; Zhao, C (Zhao, Cheng)[ 1 ] | |
Journal | FRONTIERS IN GENETICS |
2021 | |
Volume | 12; Issue: 3; Pages: 1-11 |
Keywords | HIV-1 protease; cleavage site prediction; positive-unlabeled learning; biased SVM; substrate specificity |
ISSN | 1664-8021 |
DOI | 10.3389/fgene.2021.658078 |
Abstract | Understanding the substrate specificity of HIV-1 protease plays an essential role in the prevention of HIV infection. A variety of computational models have thus been developed to predict substrate sites that are cleaved by HIV-1 protease, but most of them follow a supervised learning scheme that builds classifiers by treating experimentally verified cleavable sites as positive samples and unknown sites as negative samples. However, the negative set may contain noise, since some of the unknown sites may be false negatives. Hence, the prediction results of these classifiers are biased, and their performance is lower than it could be. In this work, unknown substrate sites are regarded as unlabeled samples instead of negative ones. We propose a novel positive-unlabeled learning algorithm, namely PU-HIV, for the effective prediction of HIV-1 protease cleavage sites. Features used by PU-HIV are encoded from different perspectives of substrate sequences, including amino acid identities, coevolutionary patterns, and chemical properties. By adjusting the weights of the errors generated by positive and unlabeled samples, a biased support vector machine classifier can be built to complete the prediction task. In comparison with state-of-the-art prediction models, benchmarking experiments using cross-validation and independent tests demonstrated the superior performance of PU-HIV in terms of AUC, PR-AUC, and F-measure. With PU-HIV, it is therefore possible to identify previously unknown but physiologically existing substrate sites that can be cleaved by HIV-1 protease, providing valuable insights into the design of novel HIV-1 protease inhibitors for HIV treatment. |
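WOS Accession Number | WOS:000638171400001 |
Content Type | Journal Article |
Source URL | [http://ir.xjipc.cas.cn/handle/365002/7817] |
Collection | Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory |
Corresponding Author | Hu, L (Hu, Lun)[ 2 ] |
Author Affiliations | 1. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China; 2. Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China |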
Recommended Citation (GB/T 7714) | Li, ZF, Hu, L, Tang, ZH, et al. Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning[J]. FRONTIERS IN GENETICS, 2021, 12(3): 1-11. |
APA | Li, ZF, Hu, L, Tang, ZH, & Zhao, C. (2021). Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning. FRONTIERS IN GENETICS, 12(3), 1-11. |
MLA | Li, ZF, et al. "Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning". FRONTIERS IN GENETICS 12.3 (2021): 1-11. |
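
The abstract describes PU-HIV as a biased support vector machine that weights the errors made on positive and unlabeled samples differently. This record does not give the formulation, so the sketch below is only an illustration of the general idea, not the authors' implementation. It assumes the commonly used biased-SVM objective, 0.5·||w||² + C_pos·(hinge loss on positives) + C_unl·(hinge loss on unlabeled samples), with C_pos larger than C_unl so that misclassifying a verified cleavage site costs more than scoring an unlabeled site as cleavable. The names `one_hot`, `biased_objective`, `subgradient_step`, the cost values, the learning rate, and the three octapeptides are all hypothetical placeholders. Only the amino acid identity encoding is sketched; the coevolutionary and chemical features mentioned in the abstract are omitted.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(peptide):
    """Encode a peptide as a flat 0/1 vector with 20 entries per residue."""
    vec = []
    for residue in peptide:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def biased_objective(w, b, X, y, c_pos, c_unl):
    """Assumed objective: 0.5*||w||^2 plus class-weighted hinge losses.

    y[i] is +1 for a verified positive and -1 for an unlabeled sample,
    which is treated as a provisional negative with the smaller cost c_unl.
    """
    total = 0.5 * sum(wj * wj for wj in w)
    for x, label in zip(X, y):
        cost = c_pos if label > 0 else c_unl
        total += cost * max(0.0, 1.0 - label * score(w, b, x))
    return total


def subgradient_step(w, b, X, y, c_pos, c_unl, lr):
    """One full-batch subgradient descent step on the objective above."""
    grad_w = list(w)  # gradient of the regularizer 0.5*||w||^2 is w itself
    grad_b = 0.0
    for x, label in zip(X, y):
        if label * score(w, b, x) < 1.0:  # sample violates the margin
            cost = c_pos if label > 0 else c_unl
            for j, xj in enumerate(x):
                grad_w[j] -= cost * label * xj
            grad_b -= cost * label
    new_w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return new_w, b - lr * grad_b


# Made-up octapeptides for illustration only (not real experimental data).
peptides = ["AAAFPAAA", "AAAGGAAA", "CCCFPCCC"]
labels = [1, -1, -1]  # first is "verified positive", the rest are unlabeled
X = [one_hot(p) for p in peptides]

w, b = [0.0] * len(X[0]), 0.0
for _ in range(100):
    w, b = subgradient_step(w, b, X, labels, c_pos=10.0, c_unl=1.0, lr=0.01)

for p, x in zip(peptides, X):
    print(p, round(score(w, b, x), 3))
```

The ratio between `c_pos` and `c_unl` is the tunable part of this sketch: a larger ratio tolerates more unlabeled samples on the positive side of the boundary, which is how hidden positives in the unlabeled set can still receive high scores. A library SVM that accepts per-class weights, such as scikit-learn's `SVC` with its `class_weight` parameter, could play the same role in practice.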