
Journal of Zhejiang Agricultural Sciences ›› 2026, Vol. 67 ›› Issue (2): 552-556. DOI: 10.16178/j.issn.0528-9017.20250739
Received: 2025-10-11
Online: 2026-02-28
Published: 2026-03-07
WU Jianxiong, BAO Yufeng. Study progress of protein function prediction based on sequence information and machine learning[J]. Journal of Zhejiang Agricultural Sciences, 2026, 67(2): 552-556.
URL: http://www.zjnykx.cn/EN/10.16178/j.issn.0528-9017.20250739
| Year of publication | Method | Evaluation metric (BP) | Evaluation metric (CC) | Evaluation metric (MF) |
|---|---|---|---|---|
| 2020 | DeepGOPlus | Fmax=0.390 | Fmax=0.614 | Fmax=0.557 |
| 2021 | NCL+mask BLAST | F1=0.378 | F1=0.475 | F1=0.496 |
| 2022 | Wei2GO | Fmax=0.520 | Fmax=0.580 | Fmax=0.390 |

Table 1 Protein function prediction methods based on sequence homology (BP: biological process; CC: cellular component; MF: molecular function)
| Year of publication | Method | Evaluation metric (BP) | Evaluation metric (CC) | Evaluation metric (MF) |
|---|---|---|---|---|
| 2018 | GOLabeler | Fmax=0.372; AUPR=0.236 | Fmax=0.586; AUPR=0.697 | Fmax=0.691; AUPR=0.549 |
| 2020 | DeepAdd | Fmax=0.345; AUC=0.896 | Fmax=0.547; AUC=0.958 | Fmax=0.516; AUC=0.912 |
| 2020 | FFPred-GAN | Fmax=0.567 | Fmax=0.755 | Fmax=0.750 |
| 2021 | Global-ProtEnc | Fmax=0.523 | Fmax=0.636 | Fmax=0.515 |
| 2022 | FUTUSA | F1=0.532 | - | - |
| 2022 | GAT-GO | Fmax=0.501; AUPR=0.381 | Fmax=0.542; AUPR=0.479 | Fmax=0.637; AUPR=0.660 |
| 2022 | SPROF-GO | Fmax=0.335; AUPR=0.247 | Fmax=0.725; AUPR=0.765 | Fmax=0.647; AUPR=0.622 |
| 2023 | ProteInfer | Fmax=0.647; AUPR=0.622 | Fmax=0.335; AUPR=0.247 | Fmax=0.647; AUPR=0.622 |
| 2023 | PFP-SCGCN | Fmax=0.706 | Fmax=0.796 | Fmax=0.787 |
| 2024 | AnnoPRO | Fmax=0.609; AUPR=0.574 | Fmax=0.746; AUPR=0.749 | Fmax=0.763; AUPR=0.755 |
| 2024 | TransFew | Fmax=0.449; AUPR=0.244 | Fmax=0.726; AUPR=0.455 | Fmax=0.666; AUPR=0.363 |
| 2024 | PU-GO | Fmax=0.556 | Fmax=0.734 | Fmax=0.569 |
| 2024 | GORetriever | Fmax=0.545 | Fmax=0.653 | Fmax=0.659 |

Table 2 Protein function prediction methods based on sequence feature extraction
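
Most entries in Tables 1 and 2 report Fmax, which is the largest F-measure obtained over all score thresholds when precision and recall are averaged per protein, as is common in CAFA-style evaluations. The sketch below is only an illustration of that definition and is not the evaluation code of any method listed in the tables. The function name `fmax`, the dictionary-based input layout, the 0.01 threshold step, and the toy GO identifiers are assumptions introduced here, and the sketch omits the propagation of annotations to ancestor terms that real benchmarks usually apply.

```python
def fmax(predictions, truths, thresholds=None):
    """Protein-centric Fmax (illustrative sketch).

    predictions: dict mapping protein id -> dict of {GO term: score in [0, 1]}
    truths:      dict mapping protein id -> set of true GO terms
    """
    if thresholds is None:
        # Assumed threshold grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]

    # Only proteins with at least one true annotation are evaluated.
    evaluated = {p: t for p, t in truths.items() if t}
    if not evaluated:
        return 0.0

    best = 0.0
    for t in thresholds:
        precisions = []   # one value per protein with >= 1 prediction at t
        recall_sum = 0.0  # summed over all evaluated proteins
        for protein, true_terms in evaluated.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)

        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(evaluated)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


if __name__ == "__main__":
    # Toy data with hypothetical term identifiers.
    toy_predictions = {
        "P1": {"GO:A": 0.9, "GO:B": 0.4},
        "P2": {"GO:A": 0.8, "GO:C": 0.2},
    }
    toy_truths = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:C"}}
    print(round(fmax(toy_predictions, toy_truths), 3))  # 0.857
```

In the toy example the best threshold is any value up to 0.2, where averaged precision is 0.75 and averaged recall is 1.0, giving F = 6/7. Entries in the tables labelled F1 instead correspond to the F-measure at a single fixed threshold rather than the maximum over thresholds.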
