get_sequence
- biotite.structure.io.pdbx.get_sequence(pdbx_file, data_block=None)
Get the protein and nucleotide sequences from the
entity_poly.pdbx_seq_one_letter_code_can entry. Supported polymer types (
_entity_poly.type) are: 'polypeptide(D)', 'polypeptide(L)', 'polydeoxyribonucleotide', 'polyribonucleotide' and 'polydeoxyribonucleotide/polyribonucleotide hybrid'. Uracil is converted to thymine.
- Parameters:
- pdbx_file : CIFFile or CIFBlock or BinaryCIFFile or BinaryCIFBlock
The file object.
- data_block : str, optional
The name of the data block. Defaults to the first (and usually the only) data block of the file. If a data block object is passed directly to pdbx_file, this parameter is ignored.
- Returns:
- sequence_dict : Dictionary of Sequences
Dictionary keys are derived from
entity_poly.pdbx_strand_id (equivalent to atom_site.auth_asym_id). Dictionary values are sequences.
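The mapping described above can be illustrated with a minimal pure-Python sketch. It is not biotite's implementation: the helper build_sequence_dict and its tuple-based input are hypothetical, the values are plain strings rather than Sequence objects, and it assumes that pdbx_strand_id holds a comma-separated list of chain IDs, that the uracil conversion applies only to nucleotide polymer types, and that unsupported polymer types are skipped.

```python
# Illustrative sketch only, NOT biotite's implementation.
# It mimics the documented behaviour on plain Python data.

PROTEIN_TYPES = {"polypeptide(D)", "polypeptide(L)"}
NUCLEOTIDE_TYPES = {
    "polydeoxyribonucleotide",
    "polyribonucleotide",
    "polydeoxyribonucleotide/polyribonucleotide hybrid",
}


def build_sequence_dict(entity_poly_rows):
    """Map each chain ID to a one-letter sequence string.

    entity_poly_rows is a hypothetical list of tuples
    (type, pdbx_seq_one_letter_code_can, pdbx_strand_id).
    """
    sequence_dict = {}
    for poly_type, seq, strand_ids in entity_poly_rows:
        # Multi-line CIF values may contain line breaks: drop all whitespace
        seq = "".join(seq.split())
        if poly_type in NUCLEOTIDE_TYPES:
            # Assumption: U -> T is applied to nucleotide sequences only
            seq = seq.replace("U", "T")
        elif poly_type not in PROTEIN_TYPES:
            # Assumption: unsupported polymer types are skipped
            continue
        # One entity may be shared by several chains, e.g. "A,B"
        for chain_id in strand_ids.split(","):
            sequence_dict[chain_id.strip()] = seq
    return sequence_dict


# Made-up example rows, not taken from a real structure file
rows = [
    ("polypeptide(L)", "NLYIQWLKDG\nGPSSGRPPPS", "A,B"),
    ("polyribonucleotide", "GGCUAGCC", "C"),
    ("other", "XXXX", "D"),
]
sequences = build_sequence_dict(rows)
for chain_id, seq in sequences.items():
    print(chain_id, seq)
```

In this sketch, chains A and B share one entry because they belong to the same entity, and the chain with polymer type 'other' gets no key.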
Notes
The
entity_poly.pdbx_seq_one_letter_code_can field contains the initial complete sequence. If the structure represents a truncated or spliced version of this initial sequence, the structure will include only a subset of it. Use biotite.structure.get_residues to retrieve only the residues that are represented in the structure.
Gallery
Mutual information as measure for coevolution of residues