BCL::Align: multiple sequence alignment and fold recognition

Multiple sequence alignment and fold recognition have become key computational tools for inferring the evolutionary history of proteins and predicting their structures from amino acid sequence. These methods allow information about newly discovered sequences to be gathered through comparison with experimentally studied proteins of known structure and function.

The tools used by sequence alignment and fold recognition programs differ. Most multiple sequence alignment programs measure alignment quality using scoring functions that combine substitution matrices with various gap penalties. Fold recognition methods share some of these tools but place significant emphasis on secondary structure prediction.
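As a minimal sketch of such a scoring function, the Python below scores two already-aligned rows using a substitution function and an affine gap penalty (one cost to open a gap, a smaller cost to extend it). The names `score_alignment` and `toy_substitution` and all numeric values are assumptions made for illustration; they are not taken from BCL::Align or from any published substitution matrix.

```python
def score_alignment(row_a, row_b, substitution, gap_open=-10.0, gap_extend=-1.0):
    """Score two equal-length aligned rows; '-' marks a gap.

    Hypothetical helper: the penalties are placeholder values.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    total = 0.0
    gap_in = None  # row holding the currently open gap: 'a', 'b', or None
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue  # column containing only gaps contributes nothing
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            # Extending an open gap is cheaper than opening a new one.
            total += gap_extend if gap_in == side else gap_open
            gap_in = side
        else:
            total += substitution(a, b)
            gap_in = None
    return total


def toy_substitution(a, b):
    """Stand-in for a real substitution matrix lookup (illustrative values)."""
    return 2.0 if a == b else -1.0


if __name__ == "__main__":
    print(score_alignment("ACD-F", "ACDEF", toy_substitution))
    print(score_alignment("AC--F", "ACDEF", toy_substitution))
```

In practice the toy function would be replaced by a lookup in a real substitution matrix, and the gap penalties would be tuned on reference alignments.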

Sequence alignment and fold recognition ask different questions, and each places a different significance on each bioinformatics tool. With the growing number of sequence analysis tools being developed, it is also difficult to know which methods are best suited to a given analysis. Because each program has its own algorithm and scoring functions, a comprehensive analysis using all existing sequence alignment and fold recognition programs to answer these questions would be difficult.

BCL::Align is a multiple sequence alignment program that incorporates several tools for sequence analysis, including various substitution matrices, gap penalties, secondary structure predictions, and chemical properties. BCL::Align is unique in that the weight of each of these parameters is adjustable, which allows it to be used for both sequence alignment and fold recognition. As a result, it can be used to determine which tools are most important in sequence alignment and how their importance differs in fold recognition.
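The idea of adjustable weights can be illustrated with a small Python sketch in which the score for pairing two positions is a weighted sum of independent terms. Everything here is an assumption made for illustration: the `Position` type, the three term functions, the weight values, and the hydrophobicity numbers are hypothetical and do not reproduce BCL::Align's actual scoring function or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    residue: str           # one-letter amino acid code
    ss: str                # predicted secondary structure: 'H', 'E', or 'C'
    hydrophobicity: float  # any per-residue chemical property scale


def identity_term(p, q):
    """Toy stand-in for a substitution matrix lookup."""
    return 1.0 if p.residue == q.residue else 0.0


def ss_term(p, q):
    """Reward agreement of predicted secondary structure."""
    return 1.0 if p.ss == q.ss else 0.0


def property_term(p, q):
    """Penalize differences in a chemical property."""
    return -abs(p.hydrophobicity - q.hydrophobicity)


TERMS = {
    "identity": identity_term,
    "secondary_structure": ss_term,
    "chemical_property": property_term,
}


def pair_score(p, q, weights):
    """Weighted sum of all terms; a missing weight counts as zero."""
    return sum(weights.get(name, 0.0) * term(p, q) for name, term in TERMS.items())


if __name__ == "__main__":
    p = Position("A", "H", 1.8)  # illustrative property values
    q = Position("L", "H", 3.8)
    sequence_weights = {"identity": 1.0}
    fold_weights = {"identity": 0.2, "secondary_structure": 1.0, "chemical_property": 0.1}
    print(pair_score(p, q, sequence_weights))
    print(pair_score(p, q, fold_weights))
```

With the sequence-oriented weights, the two different residues score no better than any other mismatch; with the fold-oriented weights, their shared helical prediction dominates. Changing only the weight dictionary is what lets one scoring scheme serve both tasks in this sketch.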

BCL::Align ranked best in alignment accuracy (Cline score of 22.90 for sequences in the Twilight Zone) when compared with Align-m, ClustalW, T-Coffee, and MUSCLE using the SABmark reference alignment test set. ROC curve analysis indicates that BCL::Align correctly recognizes protein folds with over 80% accuracy. The flexibility of the program allows it to be optimized for specific classes of proteins (e.g., membrane proteins) or fold families (e.g., TIM-barrel proteins). BCL::Align is free for academic use and available online at http://www.meilerlab.org/.

Sequence Alignment

Alumni Project Members: Sten Heinze, Elizabeth Nguyen, Marcin J. Skwark