Abstract of Soeren Brunak

An integrated computational approach is needed to face the challenge of the functional assignment of thousands of new gene products derived from different sequencing projects. Standard functional assignment by homology using proteins of known function is very powerful, but still leaves unassigned proteins belonging to families without known function (orphan families), or isolated sequences (orphan sequences). The number of orphan families and sequences will increase over time since experimental functional analysis is highly demanding in time and effort.

Function is a complex, multilevel phenomenon in which the different levels (chemical, biochemical, cellular, organismal and developmental) are interwoven. We present an indirect approach in which predicted structural features, putative protein modifications, sorting signals, and gene expression data from DNA array experiments are integrated and used to infer the functional class.

References:

Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites, H. Nielsen, J. Engelbrecht, S. Brunak and G. von Heijne, Protein Eng., 10, 1-6, 1997.

Machine learning approaches to the prediction of signal peptides and other protein sorting signals, H. Nielsen, S. Brunak and G. von Heijne, Protein Eng., 11, 3-9, 1999.

Prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility, J.E. Hansen, O. Lund, N. Tolstrup, K. Rapacki and S. Brunak, Glycoconjugate J., 15, 115-130, 1998.

Sequence and structure-based prediction of eukaryotic protein phosphorylation sites, N. Blom, S. Gammeltoft, and S. Brunak, J. Mol. Biol., 294, 1351-1362, 1999.

Bioinformatics: The Machine Learning Approach, P. Baldi and S. Brunak, MIT Press, Cambridge, Mass. 351 p., 1998.