get_sse
- biotite.structure.io.pdbx.get_sse(pdbx_file, data_block=None, match_model=None)
Get the secondary structure from a PDBx file.
- Parameters:
- pdbx_file : CIFFile or CIFBlock or BinaryCIFFile or BinaryCIFBlock
The file object. The following categories are required:
entity_poly
struct_conf (if alpha-helices are present)
struct_sheet_range (if beta-strands are present)
atom_site (if match_model is set)
- data_block : str, optional
The name of the data block. Defaults to the first (and usually only) data block of the file. If a data block object is passed directly to pdbx_file, this parameter is ignored.
- match_model : None, optional
If a model number is given, only secondary structure elements for residues that are resolved in the given model are kept. This means that secondary structure elements for residues that would not appear in a corresponding AtomArray from get_structure() are removed. By default, all residues in the sequence are kept.
- Returns:
- sse_dict : dict of str -> ndarray, dtype=str
The dictionary maps the chain ID (derived from auth_asym_id) to the secondary structure of the respective chain.
"a": alpha-helix
"b": beta-strand
"c": coil or not an amino acid
Each secondary structure element corresponds to the label_seq_id of the atom_site category. This means that the 0-th position of the array corresponds to the residue in atom_site with label_seq_id 1.
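The index convention above (array position 0 corresponds to label_seq_id 1) can be illustrated with a small sketch. It does not call biotite. The dictionary is a hand-written stand-in with the same shape as the return value, and the helper names `sse_at` and `sse_segments` are hypothetical, not part of the library.

```python
import numpy as np

# Hypothetical stand-in for the dictionary returned by get_sse():
# chain ID -> array of one-letter secondary structure codes
sse_dict = {"A": np.array(list("ccaaaacbbbcc"))}


def sse_at(sse_dict, chain_id, label_seq_id):
    """Return the code for a 1-based label_seq_id (position = label_seq_id - 1)."""
    return str(sse_dict[chain_id][label_seq_id - 1])


def sse_segments(sse):
    """Group a per-residue array into (code, first_seq_id, last_seq_id) runs."""
    if len(sse) == 0:
        return []
    # Positions at which the code differs from the preceding residue
    change = np.flatnonzero(sse[1:] != sse[:-1]) + 1
    starts = np.concatenate(([0], change))
    stops = np.concatenate((change, [len(sse)]))
    # Convert 0-based half-open ranges to 1-based inclusive label_seq_id ranges
    return [
        (str(sse[start]), int(start) + 1, int(stop))
        for start, stop in zip(starts, stops)
    ]


segments = sse_segments(sse_dict["A"])
```

Reporting the runs as 1-based inclusive ranges keeps them directly comparable with label_seq_id values in atom_site.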
Examples
>>> import os.path
>>> file = CIFFile.read(os.path.join(path_to_structures, "1aki.cif"))
>>> sse = get_sse(file, match_model=1)
>>> print(sse)
{'A': array(['c', 'c', 'c', 'c', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'c', 'c', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'c', 'c', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'c', 'a', 'a', 'a', 'a', 'c', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'c', 'a', 'a', 'a', 'a', 'c', 'c', 'c', 'c', 'c', 'c'], dtype='<U1')}
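A long per-residue array like this one is often easier to read as a summary. The following is a minimal sketch, using only NumPy, of how the fraction of each code could be computed for one chain. The helper `sse_composition` is a hypothetical name, and the input here is a short made-up array rather than real output.

```python
import numpy as np


def sse_composition(sse):
    """Return the fraction of residues per secondary structure code."""
    codes, counts = np.unique(sse, return_counts=True)
    return {str(code): int(count) / len(sse) for code, count in zip(codes, counts)}


# Made-up example array in the same format as one dictionary value
composition = sse_composition(np.array(["a", "a", "b", "c"]))
```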
If only secondary structure elements for resolved residues are requested, the length of the returned array matches the number of peptide residues in the structure.
>>> file = CIFFile.read(os.path.join(path_to_structures, "3o5r.cif"))
>>> print(len(get_sse(file, match_model=1)["A"]))
128
>>> atoms = get_structure(file, model=1)
>>> atoms = atoms[filter_amino_acids(atoms) & (atoms.chain_id == "A")]
>>> print(get_residue_count(atoms))
128
Gallery

Three ways to get the secondary structure of a protein