Alignment

class biotite.sequence.align.Alignment(sequences, trace, score=None)

Bases: object

An Alignment object stores which symbols of n sequences are aligned to each other, together with the corresponding alignment score.

Instead of saving a list of aligned symbols, this class saves the original n sequences that were aligned and a so-called trace, which indicates the aligned symbols of these sequences. The trace is an (m x n) ndarray with alignment length m and sequence count n. Each element of the trace is the index of the aligned symbol in the corresponding sequence. A gap is represented by the value -1.
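
The trace layout described above can be illustrated with a minimal sketch in plain Python. It does not use Biotite: render_gapped is a hypothetical helper written for this illustration, the sequences are plain strings rather than Sequence objects, and the trace is a list of rows rather than an ndarray.

```python
GAP_INDEX = -1  # value that marks a gap in the trace


def render_gapped(sequences, trace, gap_character="-"):
    """Return one gapped string per sequence.

    Hypothetical helper for illustration only, not part of Biotite.
    """
    gapped = []
    for col, sequence in enumerate(sequences):
        # Each trace row is one alignment position; row[col] is the
        # index into the sequence in this column, or -1 for a gap.
        symbols = [
            gap_character if row[col] == GAP_INDEX else sequence[row[col]]
            for row in trace
        ]
        gapped.append("".join(symbols))
    return gapped


sequences = ["CGTCAT", "TCATGC"]
trace = [
    [0, -1], [1, -1], [2, 0], [3, 1],
    [4, 2], [5, 3], [-1, 4], [-1, 5],
]
gapped = render_gapped(sequences, trace)
print("\n".join(gapped))
```

Each row of the trace corresponds to one alignment position, and each column corresponds to one of the original sequences.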

Furthermore, this class provides multiple utility functions for conversion into strings to make the alignment human-readable.

Unless an Alignment object is the result of a multiple sequence alignment, the object will contain only two sequences.

All attributes of this class are publicly accessible.

Parameters:
sequences : list

A list of aligned sequences.

trace : ndarray, dtype=int, shape=(m,n)

The alignment trace.

score : int, optional

Alignment score.

Attributes:
sequences : list

A list of aligned sequences.

trace : ndarray, dtype=int, shape=(m,n)

The alignment trace.

score : int

Alignment score.

Examples

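>>> from biotite.sequence import NucleotideSequence
>>> from biotite.sequence.align import SubstitutionMatrix, align_optimal
>>> seq1 = NucleotideSequence("CGTCAT")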
>>> seq2 = NucleotideSequence("TCATGC")
>>> matrix = SubstitutionMatrix.std_nucleotide_matrix()
>>> ali = align_optimal(seq1, seq2, matrix)[0]
>>> print(ali)
CGTCAT--
--TCATGC
>>> print(ali.trace)
[[ 0 -1]
 [ 1 -1]
 [ 2  0]
 [ 3  1]
 [ 4  2]
 [ 5  3]
 [-1  4]
 [-1  5]]
>>> print(ali[1:4].trace)
[[ 1 -1]
 [ 2  0]
 [ 3  1]]
>>> print(ali[1:4, 0:1].trace)
[[1]
 [2]
 [3]]

static from_strings(sequence_strings, sequence_factory, gap_character='-')

Create an Alignment from strings that represent aligned sequences.

DEPRECATED: Use Alignment.from_strings() instead.

Parameters:
sequence_strings : list of str

The strings, where each one represents a sequence (with gaps) in an alignment. All strings must have the same length.

sequence_factory : Callable (str -> Sequence)

Callable that takes a sequence string (with gaps already removed) and produces a Sequence object.

gap_character : str, optional

This character is interpreted as a gap.

Returns:
alignment : Alignment

The created alignment.

Examples

>>> from biotite.sequence import ProteinSequence
>>> from biotite.sequence.align import Alignment
>>> alignment = Alignment.from_strings(
...     [
...         "BIQTITE",
...         "-IQLITE"
...     ],
...     ProteinSequence,
... )
>>> print(alignment)
BIQTITE
-IQLITE
>>> print(alignment.sequences[0])
BIQTITE
>>> print(alignment.sequences[1])
IQLITE
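
The example above shows that the stored sequences no longer contain gap characters. The following is a rough sketch of that step in plain Python, assuming only what the parameter descriptions state (gaps are removed before sequence_factory is called). The helper ungap is hypothetical and not part of Biotite, and the built-in str stands in for a real factory such as ProteinSequence.

```python
def ungap(sequence_strings, sequence_factory, gap_character="-"):
    """Strip gaps from each string and pass the result to the factory.

    Hypothetical helper for illustration only, not part of Biotite.
    """
    # The parameter description requires equally long input strings.
    if len({len(s) for s in sequence_strings}) > 1:
        raise ValueError("All strings must have the same length")
    return [
        sequence_factory(s.replace(gap_character, ""))
        for s in sequence_strings
    ]


# 'str' is a stand-in factory; a real call would pass e.g. ProteinSequence
sequences = ungap(["BIQTITE", "-IQLITE"], str)
print(sequences)
```

The factory only ever receives gap-free strings, so it needs no knowledge of the gap character.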

get_gapped_sequences()

Get the string representation of the gapped sequences.

Returns:
sequences : list of str

The list of gapped sequence strings. The order is the same as in Alignment.sequences.

static trace_from_strings(sequence_strings, gap_character='-')

Create a trace from strings that represent aligned sequences.

Parameters:
sequence_strings : list of str

The strings, where each one represents a sequence (with gaps) in an alignment.

gap_character : str, optional

This character is interpreted as a gap.

Returns:
trace : ndarray, dtype=int, shape=(n,2)

The created trace.

See also

from_strings

Directly creates an Alignment object.
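
As an illustration of how a trace relates to gapped strings, here is a minimal sketch in plain Python. It is not Biotite's implementation: trace_from_gapped is a hypothetical helper, it returns a list of rows instead of an ndarray, and the equal-length check is an assumption carried over from the from_strings parameter description.

```python
def trace_from_gapped(sequence_strings, gap_character="-"):
    """Build a trace (list of rows) from gapped strings.

    Hypothetical helper for illustration only, not part of Biotite.
    """
    length = len(sequence_strings[0])
    # Assumption: all strings have the same length.
    if any(len(s) != length for s in sequence_strings):
        raise ValueError("All strings must have the same length")
    # Next unused index in each ungapped sequence
    next_index = [0] * len(sequence_strings)
    trace = []
    for pos in range(length):
        row = []
        for col, string in enumerate(sequence_strings):
            if string[pos] == gap_character:
                row.append(-1)  # a gap is represented by -1
            else:
                row.append(next_index[col])
                next_index[col] += 1
        trace.append(row)
    return trace


trace = trace_from_gapped(["BIQTITE", "-IQLITE"])
for row in trace:
    print(row)
```

Every non-gap character advances the running index of its own sequence, so each column of the trace counts up through that sequence and is interrupted only by -1 entries.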