get_sequence
- biotite.structure.io.pdbx.get_sequence(pdbx_file, data_block=None)
Get the protein and nucleotide sequences from the entity_poly.pdbx_seq_one_letter_code_can entry.
Supported polymer types (_entity_poly.type) are: 'polypeptide(D)', 'polypeptide(L)', 'polydeoxyribonucleotide', 'polyribonucleotide' and 'polydeoxyribonucleotide/polyribonucleotide hybrid'. Uracil is converted to Thymine.
- Parameters:
- pdbx_file : CIFFile or CIFBlock or BinaryCIFFile or BinaryCIFBlock
The file object.
- data_block : str, optional
The name of the data block. Default is the first (and in most cases the only) data block of the file. If the data block object is passed directly to pdbx_file, this parameter is ignored.
- Returns:
- sequence_dict : Dictionary of Sequences
Dictionary keys are derived from entity_poly.pdbx_strand_id (equivalent to atom_site.auth_asym_id). Dictionary values are sequences.
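The key derivation described above can be illustrated with a small plain-Python sketch. It is not Biotite's implementation: the helper name `sequences_by_chain` and the input rows are hypothetical, and the values here are plain strings rather than Biotite sequence objects. It assumes that pdbx_strand_id holds a comma-separated list of chain IDs, as is usual in mmCIF files, and that Uracil is replaced only for nucleotide polymer types.

```python
def sequences_by_chain(entity_rows):
    """Map each chain ID to the one-letter sequence of its entity.

    entity_rows is a hypothetical list of
    (pdbx_strand_id, polymer_type, one_letter_code) tuples.
    """
    sequence_dict = {}
    for strand_ids, polymer_type, one_letter_code in entity_rows:
        # Multi-line CIF values may contain line breaks: remove all whitespace
        sequence = "".join(one_letter_code.split())
        # Assumption: every supported non-peptide type is a nucleotide type
        if not polymer_type.startswith("polypeptide"):
            sequence = sequence.replace("U", "T")
        # One entity may be shared by several chains, e.g. "A,B"
        for chain_id in strand_ids.split(","):
            sequence_dict[chain_id.strip()] = sequence
    return sequence_dict


# Made-up toy rows, for illustration only
rows = [
    ("A,B", "polypeptide(L)", "MKTAYIAK"),
    ("C", "polyribonucleotide", "GGAUCC"),
]
result = sequences_by_chain(rows)
```

In this toy input, chains A and B share one entity and therefore receive the same sequence, while the RNA chain C is stored with Thymine in place of Uracil.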
Notes
The entity_poly.pdbx_seq_one_letter_code_can field contains the initial complete sequence. If the structure represents a truncated or spliced version of this initial sequence, the structure will include only a subset of the initial sequence. Use biotite.structure.get_residues to retrieve only the residues that are represented in the structure.
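As a rough sketch of the comparison described in this note, the following helper counts the residues in the full sequence and the residues actually present in the model. The function name `compare_full_and_observed` and its arguments are hypothetical. The sketch assumes that the file has been downloaded locally, that `get_structure` is called with its default author fields so that chain IDs match the dictionary keys, and that dropping hetero atoms is an acceptable approximation of the polymer chain (modified residues stored as HETATM records would be excluded too).

```python
def compare_full_and_observed(cif_path, chain_id):
    """Return (length of the full sequence, number of observed residues)."""
    # Imported here so the helper can be defined without Biotite installed
    import biotite.structure as struc
    import biotite.structure.io.pdbx as pdbx

    pdbx_file = pdbx.CIFFile.read(cif_path)

    # Full sequence as deposited, independent of what was resolved
    full_sequences = pdbx.get_sequence(pdbx_file)

    # Residues that are actually represented in the first model
    atoms = pdbx.get_structure(pdbx_file, model=1)
    chain = atoms[(atoms.chain_id == chain_id) & ~atoms.hetero]
    res_ids, res_names = struc.get_residues(chain)

    return len(full_sequences[chain_id]), len(res_ids)
```

If the second number is smaller than the first, part of the deposited sequence is not represented in the structure, for example an unresolved terminus or loop.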
Gallery

Mutual information as measure for coevolution of residues