Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning
Li, ZF (Li, Zhenfeng)[ 1 ]; Hu, L (Hu, Lun)[ 2 ]; Tang, ZH (Tang, Zehai)[ 1 ]; Zhao, C (Zhao, Cheng)[ 1 ]
Journal: FRONTIERS IN GENETICS
Year: 2021
Volume: 12; Issue: 3; Pages: 1-11
Keywords: HIV-1 protease; cleavage site prediction; positive-unlabeled learning; biased SVM; substrate specificity
ISSN: 1664-8021
DOI: 10.3389/fgene.2021.658078
Abstract

Understanding the substrate specificity of HIV-1 protease plays an essential role in the prevention of HIV infection. A variety of computational models have thus been developed to predict substrate sites that are cleaved by HIV-1 protease, but most of them follow a supervised learning scheme, building classifiers that treat experimentally verified cleavable sites as positive samples and unknown sites as negative samples. However, the negative set may contain noise, because some unknown sites may in fact be cleavable (false negatives). As a result, the predictions are biased and the classifiers are less accurate than they could be. In this work, unknown substrate sites are regarded as unlabeled samples instead of negative ones. We propose a novel positive-unlabeled learning algorithm, PU-HIV, for the effective prediction of HIV-1 protease cleavage sites. The features used by PU-HIV are encoded from different perspectives of substrate sequences, including amino acid identities, coevolutionary patterns, and chemical properties. By adjusting the weights of the errors generated by positive and unlabeled samples, a biased support vector machine classifier is built to complete the prediction task. In benchmarking experiments using cross-validation and independent tests, PU-HIV outperformed state-of-the-art prediction models in terms of AUC, PR-AUC, and F-measure. With PU-HIV, it is thus possible to identify previously unknown but physiologically existing substrate sites that can be cleaved by HIV-1 protease, providing valuable insights for designing novel HIV-1 protease inhibitors for HIV treatment.
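
The error-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the PU-HIV implementation: it assumes 8-residue peptides, encodes only amino acid identities (the coevolutionary and chemical features are omitted), and trains a linear classifier by stochastic subgradient descent on a weighted hinge loss. The names `one_hot_octamer`, `score`, and `train_biased_svm`, the parameters `c_pos`, `c_unl`, `lam`, and `lr`, and the toy peptides are all hypothetical and chosen for illustration only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_octamer(peptide):
    """Encode amino acid identities: 8 positions x 20 residues = 160 features.

    Assumption: sites are represented as 8-residue peptides.
    """
    if len(peptide) != 8:
        raise ValueError("expected an 8-residue peptide")
    vec = [0.0] * (8 * len(AMINO_ACIDS))
    for pos, residue in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(residue)] = 1.0
    return vec


def score(w, b, x):
    """Linear decision value w.x + b; positive means 'predicted cleavable'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_biased_svm(X, y, c_pos=10.0, c_unl=1.0, lam=0.01, lr=0.01,
                     epochs=200, seed=0):
    """Linear SVM with class-dependent error weights (illustrative only).

    Approximately minimises
        lam/2 * ||w||^2 + sum_i c_i * max(0, 1 - y_i * (w.x_i + b)),
    where c_i = c_pos for labeled positives (y = +1) and c_i = c_unl for
    unlabeled samples treated as provisional negatives (y = -1).
    Choosing c_pos > c_unl makes errors on known positives cost more.
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            cost = c_pos if y[i] > 0 else c_unl
            violated = y[i] * score(w, b, X[i]) < 1.0
            # The regularisation term shrinks w on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient step, scaled by the class weight.
                w = [wj + lr * cost * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * cost * y[i]
    return w, b


# Toy peptides for illustration only; they are not data from the paper.
positives = ["SQNYPIVQ", "ARVLAEAM", "ATIMMQRG"]
# The first unlabeled peptide duplicates a known positive, mimicking a
# cleavable site that would be mislabeled as negative in a supervised setup.
unlabeled = ["SQNYPIVQ", "GGGGGGGG", "KKKKDDDD", "PPPPSSSS", "TTTTEEEE"]

X = [one_hot_octamer(p) for p in positives + unlabeled]
y = [1] * len(positives) + [-1] * len(unlabeled)
w, b = train_biased_svm(X, y)

for peptide in sorted(set(positives + unlabeled)):
    print(peptide, round(score(w, b, one_hot_octamer(peptide)), 2))
```

In this toy setup, the peptide that appears in both sets still receives a positive score, because an error on its labeled-positive copy is penalised ten times more heavily than an error on its unlabeled copy. In practice the two weights would be tuned, for example by cross-validation.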

WOS Accession Number: WOS:000638171400001
Content Type: Journal article
Source URL: http://ir.xjipc.cas.cn/handle/365002/7817
Collection: Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory
Corresponding Author: Hu, L (Hu, Lun)[ 2 ]
Author Affiliations: 1. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China
2. Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
Recommended Citation
GB/T 7714
Li, ZF, Hu, L, Tang, ZH, et al. Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning[J]. FRONTIERS IN GENETICS, 2021, 12(3): 1-11.
APA Li, ZF, Hu, L, Tang, ZH, & Zhao, C. (2021). Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning. FRONTIERS IN GENETICS, 12(3), 1-11.
MLA Li, ZF, et al. "Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning". FRONTIERS IN GENETICS 12.3 (2021): 1-11.