Chapter 4

Review of Imbalanced Data Learning for Protein Methylation Prediction

ZEJIN DING and YAN-QING ZHANG

4.1 Introduction

Protein posttranslational modifications (PTMs) introduce diverse modifications to the polypeptide chain of a protein and play important roles in many biological processes, such as influencing structural and functional diversity, determining cellular plasticity and dynamics, and impacting transcription activity [1, 2]. Protein methylation is one important type of PTM; it is a reversible modification that takes place predominantly on arginine and lysine residues [3]. Since the discovery of methylation in the mid-1960s [4], researchers have found that it contributes significantly to various biological processes, such as transcriptional regulation, RNA processing, signal transduction, DNA repair, genome stability, and heterochromatin compaction.

However, the molecular mechanism underlying methylation is still poorly understood. A genome-wide search for methylated substrates is needed to unravel the many unknown functions of protein arginine methyltransferases (PRMTs) in biological processes and cellular components. Yet experimentally verifying every residue in human proteins in the laboratory is too costly and time-consuming to be feasible. Therefore, effective computational learning methods for predicting methylation sites will greatly help researchers expedite the process of finding potentially methylated residues in new ...

