biotite.sequence.align

This subpackage provides functionality for sequence alignments.

The two central classes involved are SubstitutionMatrix and Alignment:

Every function that performs an alignment requires a SubstitutionMatrix that provides similarity scores for each symbol combination of two alphabets (usually both alphabets are equal). The alphabets in the SubstitutionMatrix must match or extend the alphabets of the sequences to be aligned.
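The idea of a score table spanning two alphabets can be illustrated with a minimal pure-Python sketch. It does not use biotite: the class ToySubstitutionMatrix, its methods, and the match/mismatch values are hypothetical and only mimic the concept described above.

```python
class ToySubstitutionMatrix:
    """Hypothetical stand-in that maps symbol pairs to similarity scores."""

    def __init__(self, alphabet1, alphabet2, scores):
        self.alphabet1 = list(alphabet1)
        self.alphabet2 = list(alphabet2)
        # scores[i][j] is the score for alphabet1[i] paired with alphabet2[j]
        if len(scores) != len(self.alphabet1) or any(
            len(row) != len(self.alphabet2) for row in scores
        ):
            raise ValueError("Score table shape does not match the alphabets")
        self.scores = scores

    def get_score(self, symbol1, symbol2):
        """Look up the similarity score for one symbol pair."""
        i = self.alphabet1.index(symbol1)
        j = self.alphabet2.index(symbol2)
        return self.scores[i][j]

    def covers(self, sequence1, sequence2):
        """Check that the matrix alphabets match or extend the symbols used."""
        return (
            set(sequence1) <= set(self.alphabet1)
            and set(sequence2) <= set(self.alphabet2)
        )


nucleotides = "ACGT"
# Arbitrary illustrative values: match = 5, mismatch = -4
identity_scores = [
    [5 if a == b else -4 for b in nucleotides] for a in nucleotides
]
matrix = ToySubstitutionMatrix(nucleotides, nucleotides, identity_scores)
```

In this sketch, covers() expresses the requirement that every symbol of the sequences to be aligned must appear in the corresponding matrix alphabet.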

An alignment cannot be directly represented as a list of Sequence objects, since a gap indicates the absence of any symbol. Instead, the aligning functions return one or more Alignment instances. These objects contain the original sequences and a trace that describes which positions (indices) in the sequences are aligned. Optionally, they also contain the similarity score.
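The following pure-Python sketch illustrates how a trace separates the ungapped sequences from the alignment itself. It assumes a trace with one row per alignment column and one index per sequence, where -1 marks a gap. The helper render_gapped is hypothetical and not part of biotite.

```python
GAP = -1  # assumed marker for "no symbol of this sequence in this column"


def render_gapped(sequences, trace):
    """Turn sequences plus a trace into gapped strings, one per sequence."""
    lines = []
    for seq_index, sequence in enumerate(sequences):
        symbols = []
        for column in trace:
            position = column[seq_index]
            symbols.append("-" if position == GAP else sequence[position])
        lines.append("".join(symbols))
    return lines


seq1 = "ACGT"
seq2 = "AGT"
# Each row is one alignment column: (index in seq1, index in seq2)
trace = [(0, 0), (1, GAP), (2, 1), (3, 2)]
gapped = render_gapped([seq1, seq2], trace)
print("\n".join(gapped))
```

The original sequences stay free of gap symbols. Only the trace records that position 1 of the first sequence has no partner in the second.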

The aligning functions are usually C-accelerated, reducing the computation time substantially.

Substitution matrices

SubstitutionMatrix A SubstitutionMatrix is the foundation for scoring in sequence alignments.

Aligners

align_optimal Perform an optimal alignment of two sequences based on the dynamic programming algorithm.
align_multiple Perform a multiple sequence alignment using a progressive alignment algorithm.
align_ungapped Align two sequences without introduction of gaps.
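As a rough illustration of the dynamic programming approach mentioned for optimal alignment, here is a Needleman-Wunsch-style sketch in pure Python that computes only the best global alignment score. It is not biotite's implementation: the function name, the fixed match/mismatch scores, and the linear gap penalty are assumptions made for this example.

```python
def global_alignment_score(seq1, seq2, match=5, mismatch=-4, gap_penalty=-10):
    """Best global alignment score with a linear gap penalty (toy example)."""
    rows, cols = len(seq1) + 1, len(seq2) + 1
    # table[i][j] holds the best score for aligning seq1[:i] with seq2[:j]
    table = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty prefix costs one gap per symbol
    for i in range(1, rows):
        table[i][0] = i * gap_penalty
    for j in range(1, cols):
        table[0][j] = j * gap_penalty
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq1[i - 1] == seq2[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + pair,      # align both symbols
                table[i - 1][j] + gap_penalty,   # gap in seq2
                table[i][j - 1] + gap_penalty,   # gap in seq1
            )
    return table[-1][-1]


best = global_alignment_score("ACGT", "AGT")
```

A full aligner would additionally backtrack through the table to recover the trace of one or more optimal alignments.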

Alignments

Alignment An Alignment object stores information about which symbols of n sequences are aligned to each other, along with the corresponding alignment score.
get_codes Get the sequence codes for the alignment.
get_symbols Similar to get_codes(), but contains the decoded symbols instead of codes.
get_sequence_identity Calculate the sequence identity for an alignment.
score Calculate the similarity score of an alignment.
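To make the last two entries more concrete, the sketch below derives an identity value and a similarity score from a pairwise trace in pure Python. It does not call biotite. The helper names, the definition of identity as the fraction of all columns with identical symbols, and the scoring values are assumptions chosen for illustration. The library functions may use different definitions and options.

```python
GAP = -1  # assumed gap marker in the trace


def column_identity(seq1, seq2, trace):
    """Fraction of alignment columns in which both symbols are identical."""
    matches = sum(
        1 for i, j in trace
        if i != GAP and j != GAP and seq1[i] == seq2[j]
    )
    return matches / len(trace)


def toy_score(seq1, seq2, trace, match=5, mismatch=-4, gap_penalty=-10):
    """Sum of pair scores over all columns, with a linear gap penalty."""
    total = 0
    for i, j in trace:
        if i == GAP or j == GAP:
            total += gap_penalty
        elif seq1[i] == seq2[j]:
            total += match
        else:
            total += mismatch
    return total


seq1, seq2 = "ACGT", "AGT"
trace = [(0, 0), (1, GAP), (2, 1), (3, 2)]
identity = column_identity(seq1, seq2, trace)
alignment_score = toy_score(seq1, seq2, trace)
```

Here three of the four columns contain identical symbols, and the single gap column contributes the gap penalty to the score.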