Alignment
- class biotite.sequence.align.Alignment(sequences, trace, score=None)
Bases: object
An Alignment object stores which symbols of n sequences are aligned to each other, together with the corresponding alignment score.
Instead of saving a list of aligned symbols, this class saves the original n sequences that were aligned and a so-called trace, which indicates the aligned symbols of these sequences. The trace is an (m x n) ndarray with alignment length m and sequence count n. Each element of the trace is an index into the corresponding sequence. A gap is represented by the value -1.
Furthermore, this class provides multiple utility functions for conversion into strings in order to make the alignment human-readable.
Unless an Alignment object is the result of a multiple sequence alignment, the object will contain only two sequences.
All attributes of this class are publicly accessible.
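To make the trace layout concrete, the following minimal sketch writes out by hand the trace from the example further down and reads the aligned symbol pairs from it. It does not use biotite: plain Python strings stand in for the sequence objects, and all variable names are illustrative assumptions.

```python
import numpy as np

# Plain strings standing in for the two aligned sequences (assumption:
# real Alignment objects hold biotite Sequence objects instead)
seq1 = "CGTCAT"
seq2 = "TCATGC"

# One row per alignment column, one column per sequence; -1 marks a gap
trace = np.array([
    [ 0, -1],
    [ 1, -1],
    [ 2,  0],
    [ 3,  1],
    [ 4,  2],
    [ 5,  3],
    [-1,  4],
    [-1,  5],
])

# Keep only the rows in which neither sequence has a gap
ungapped = trace[(trace != -1).all(axis=1)]

# Translate the index pairs back into the symbols they point to
aligned_pairs = [(seq1[i], seq2[j]) for i, j in ungapped]
print(aligned_pairs)
```

Because the trace stores indices rather than symbols, the original sequences stay untouched and a gap never needs a placeholder symbol in the sequence itself.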
- Parameters:
- sequences : list
A list of aligned sequences.
- trace : ndarray, dtype=int, shape=(m,n)
The alignment trace.
- score : int, optional
Alignment score.
Examples
>>> from biotite.sequence import NucleotideSequence
>>> from biotite.sequence.align import SubstitutionMatrix, align_optimal
>>> seq1 = NucleotideSequence("CGTCAT")
>>> seq2 = NucleotideSequence("TCATGC")
>>> matrix = SubstitutionMatrix.std_nucleotide_matrix()
>>> ali = align_optimal(seq1, seq2, matrix)[0]
>>> print(ali)
CGTCAT--
--TCATGC
>>> print(ali.trace)
[[ 0 -1]
 [ 1 -1]
 [ 2  0]
 [ 3  1]
 [ 4  2]
 [ 5  3]
 [-1  4]
 [-1  5]]
>>> print(ali[1:4].trace)
[[ 1 -1]
 [ 2  0]
 [ 3  1]]
>>> print(ali[1:4, 0:1].trace)
[[1]
 [2]
 [3]]
- Attributes:
- sequences : list
A list of aligned sequences.
- trace : ndarray, dtype=int, shape=(m,n)
The alignment trace.
- score : int
Alignment score.
- get_gapped_sequences()
Get the string representation of the gapped sequences.
- Returns:
- sequences : list of str
The list of gapped sequence strings. The order is the same as in Alignment.sequences.
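How gapped strings follow from the sequences and the trace can be sketched in plain Python. The helper `gapped_strings` below is hypothetical and is not biotite's implementation; it assumes the sequences are plain strings and the trace is given as rows of indices with -1 for gaps.

```python
def gapped_strings(sequences, trace):
    """Render one gapped string per sequence (illustrative helper only)."""
    result = []
    for col, seq in enumerate(sequences):
        # Walk down the trace column belonging to this sequence:
        # -1 becomes a gap character, any other value indexes the sequence
        symbols = ["-" if row[col] == -1 else seq[row[col]] for row in trace]
        result.append("".join(symbols))
    return result


# Trace of the pairwise example: rows are alignment columns
example_trace = [
    [0, -1], [1, -1], [2, 0], [3, 1],
    [4, 2], [5, 3], [-1, 4], [-1, 5],
]
gapped = gapped_strings(["CGTCAT", "TCATGC"], example_trace)
print("\n".join(gapped))
```

The strings come back in the same order as the input sequences, which mirrors the ordering guarantee stated for the return value above.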
- static trace_from_strings(seq_str_list)
Create a trace from strings that represent aligned sequences.
- Parameters:
- seq_str_list : list of str
The strings, where each one represents a sequence (with gaps) in an alignment. A '-' character is interpreted as a gap.
- Returns:
- trace : ndarray, dtype=int, shape=(m,n)
The created trace.
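The conversion from gapped strings to a trace can be sketched as follows. The function `trace_from_gapped` is a hypothetical stand-in written for illustration, not the library's code; it assumes that all strings have equal length and that the result has one row per alignment column and one column per input string.

```python
import numpy as np


def trace_from_gapped(seq_str_list):
    """Build a trace from equally long gapped strings (illustrative only)."""
    length = len(seq_str_list[0])
    if any(len(s) != length for s in seq_str_list):
        raise ValueError("All gapped strings must have the same length")

    # Start with gaps everywhere, then fill in the sequence indices
    trace = np.full((length, len(seq_str_list)), -1, dtype=int)
    for col, seq_str in enumerate(seq_str_list):
        pos = 0  # index into the ungapped sequence
        for row, symbol in enumerate(seq_str):
            if symbol != "-":
                trace[row, col] = pos
                pos += 1
    return trace


rebuilt = trace_from_gapped(["CGTCAT--", "--TCATGC"])
print(rebuilt)
```

Each non-gap character advances a per-sequence counter, so the value stored in the trace is always the position of that symbol in the ungapped sequence.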
Gallery
Customized visualization of a multiple sequence alignment
Finding homologous regions in two genomes
Finding homologs of a gene in a genome
Phylogenetic tree of a protein family
Hydropathy and conservation of ion channels
Dendrogram of a protein family
Homology search and multiple sequence alignment
Fetching and aligning a protein from different species
Display sequence similarity in a heat map
Plot epitope mapping data onto protein sequence alignments
Mutual information as measure for coevolution of residues
Quantifying gene expression from RNA-seq data
Sequence logo of sequences with equal length
Biotite color schemes for protein sequences
Statistics of local alignments and the E-value
Structural alignment of orthologous proteins using ‘Protein Blocks’