to_sequence
- biotite.structure.to_sequence(atoms, allow_hetero=False)
Convert each chain in a structure into a sequence.
- Parameters:
- atoms : AtomArray or AtomArrayStack
The structure. May contain multiple chains. Each chain must be either a peptide or a nucleic acid.
- allow_hetero : bool, optional
If true, residues inside an amino acid or nucleotide chain that have no one-letter code are replaced by the respective ‘any’ symbol (“X” or “N”, respectively). The same applies to amino acids in nucleotide chains and vice versa. By default, an exception is raised.
- Returns:
- sequences : list of Sequence, length=n
The sequence for each chain in the structure.
- chain_start_indices : ndarray, shape=(n,), dtype=int
The atom index where each chain starts.
Notes
Residues are considered amino acids or nucleotides based on their appearance in info.amino_acid_names() or info.nucleotide_names(), respectively.
Examples
>>> sequences, chain_starts = to_sequence(atom_array)
>>> print(sequences)
[ProteinSequence("NLYIQWLKDGGPSSGRPPPS")]
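The effect of allow_hetero can be illustrated with a toy sketch in plain Python. This is not Biotite's implementation: the lookup table AMINO_ACIDS, the helper chain_to_letters, the residue name "LIG" and the use of ValueError are all assumptions made for illustration, and only the peptide case (placeholder "X") is shown.

```python
# Toy subset of three-letter codes; Biotite's real tables are far larger.
AMINO_ACIDS = {"ALA": "A", "GLY": "G", "SER": "S", "LEU": "L"}


def chain_to_letters(res_names, allow_hetero=False):
    """Map the residue names of one peptide chain to one-letter codes.

    Hypothetical helper that mimics the documented rule: residues without
    a one-letter code become "X" if allow_hetero is true, otherwise an
    exception is raised (ValueError here, as a stand-in).
    """
    letters = []
    for name in res_names:
        if name in AMINO_ACIDS:
            letters.append(AMINO_ACIDS[name])
        elif allow_hetero:
            letters.append("X")  # 'any' symbol for a peptide chain
        else:
            raise ValueError(f"Residue {name!r} has no one-letter code")
    return "".join(letters)


# Hypothetical chain with one residue ("LIG") that has no one-letter code.
chain = ["GLY", "ALA", "LIG", "SER"]
print(chain_to_letters(chain, allow_hetero=True))
```

With allow_hetero left at its default, the same call raises instead of inserting a placeholder, which mirrors the default behavior described under Parameters.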
Gallery

Searching for structural homologs in a protein structure database