biotite.structure.io.pdb.PDBFile

class biotite.structure.io.pdb.PDBFile[source]

Bases: biotite.TextFile

This class represents a PDB file.

Using PDBxFile instead of this class is encouraged.

This class only provides support for reading/writing the pure atom information (ATOM, HETATM, MODEL and ENDMDL records). TER records cannot be written.
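As a rough illustration of what "pure atom information" means, the sketch below keeps only lines whose record name (columns 1-6 of a PDB line) is one of the four supported records. The helper `filter_atom_records` and the toy input lines are hypothetical and written for this example; they are not part of Biotite and do not show how PDBFile is implemented.

```python
# Record names listed above as supported by this class.
ATOM_RECORDS = ("ATOM", "HETATM", "MODEL", "ENDMDL")


def filter_atom_records(lines):
    """Keep only lines whose record name (columns 1-6) is supported.

    Hypothetical helper for illustration, not part of Biotite.
    """
    kept = []
    for line in lines:
        record_name = line[:6].strip()
        if record_name in ATOM_RECORDS:
            kept.append(line)
    return kept


# Toy input, made up for this example.
example_lines = [
    "HEADER    TOY EXAMPLE",
    "MODEL        1",
    "ATOM      1  N   ALA A   1       0.000   0.000   0.000  1.00  0.00           N",
    "TER",
    "ENDMDL",
]

kept = filter_atom_records(example_lines)
print(len(kept))  # prints 3 (MODEL, ATOM and ENDMDL remain)
```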

See also

PDBxFile

Examples

Load a *.pdb file, modify the structure and save the new structure into a new file:

>>> import os.path
>>> from biotite.structure import rotate
>>> from biotite.structure.io.pdb import PDBFile
>>> file = PDBFile.read(os.path.join(path_to_structures, "1l2y.pdb"))
>>> array_stack = file.get_structure()
>>> array_stack_mod = rotate(array_stack, [1,2,3])
>>> file = PDBFile()
>>> file.set_structure(array_stack_mod)
>>> file.write(os.path.join(path_to_directory, "1l2y_mod.pdb"))

copy()

Create a deep copy of this object.

Returns
copy

A copy of this object.

get_coord(model=None)[source]

Get only the coordinates of the PDB file.

Parameters
model : int, optional

If this parameter is given, the function will return a 2D coordinate array from the atoms corresponding to the given model number (starting at 1). Negative values are used to index models starting from the last model instead of the first model. If this parameter is omitted, a 3D coordinate array containing all models will be returned, even if the structure contains only one model.

Returns
coord : ndarray, shape=(m,n,3) or shape=(n,3), dtype=float

The coordinates read from the ATOM and HETATM records of the file.
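The model numbering described for the model parameter (1-based, with negative values counting from the last model) can be sketched as a mapping to a 0-based index. The function `resolve_model_index` is hypothetical and written for this illustration; in particular, raising IndexError for 0 or out-of-range values is an assumption of the sketch, not documented Biotite behavior.

```python
def resolve_model_index(model, model_count):
    """Map a 1-based or negative model number to a 0-based index.

    Hypothetical helper; the error handling is an assumption.
    """
    if model == 0:
        raise IndexError("model numbers start at 1")
    if model > 0:
        index = model - 1            # 1 is the first model
    else:
        index = model_count + model  # -1 is the last model
    if not 0 <= index < model_count:
        raise IndexError(
            f"file has {model_count} models, model {model} was requested"
        )
    return index


# For a file with 38 models:
print(resolve_model_index(1, 38))   # prints 0
print(resolve_model_index(-1, 38))  # prints 37
```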

Notes

Note that get_coord() may output more coordinates than the atom array (stack) returned by the corresponding get_structure() call contains. The reason is that get_structure() filters altloc IDs, while get_coord() does not.

Examples

Read an AtomArrayStack from multiple PDB files, where each PDB file contains the same atoms but different positions. This is an efficient approach when a trajectory is spread into multiple PDB files, as done e.g. by the Rosetta modeling software.

For the purpose of this example, the PDB files are created from an existing AtomArrayStack.

>>> import os.path
>>> from tempfile import gettempdir
>>> file_names = []
>>> for i in range(atom_array_stack.stack_depth()):
...     pdb_file = PDBFile()
...     pdb_file.set_structure(atom_array_stack[i])
...     file_name = os.path.join(gettempdir(), f"model_{i+1}.pdb")
...     pdb_file.write(file_name)
...     file_names.append(file_name)
>>> print(file_names)
['...model_1.pdb', '...model_2.pdb', ..., '...model_38.pdb']

Now the PDB files are used to create an AtomArrayStack, where each PDB file contributes one model.

Construct a new AtomArrayStack with annotations taken from one of the created files used as template and coordinates from all of the PDB files.

>>> import numpy as np
>>> from biotite.structure import from_template
>>> template_file = PDBFile.read(file_names[0])
>>> template = template_file.get_structure()
>>> coord = []
>>> for file_name in file_names:
...     pdb_file = PDBFile.read(file_name)
...     coord.append(pdb_file.get_coord(model=1))
>>> new_stack = from_template(template, np.array(coord))

The newly created AtomArrayStack should now be equal to the AtomArrayStack the PDB files were created from.

>>> print(np.allclose(new_stack.coord, atom_array_stack.coord))
True

get_model_count()[source]

Get the number of models contained in the PDB file.

Returns
model_count : int

The number of models.

get_structure(model=None, altloc='first', extra_fields=[], include_bonds=False)[source]

Get an AtomArray or AtomArrayStack from the PDB file.

This function parses standard base-10 PDB files as well as hybrid-36 PDB files.

Parameters
model : int, optional

If this parameter is given, the function will return an AtomArray from the atoms corresponding to the given model number (starting at 1). Negative values are used to index models starting from the last model instead of the first model. If this parameter is omitted, an AtomArrayStack containing all models will be returned, even if the structure contains only one model.

altloc : {'first', 'occupancy', 'all'}
This parameter defines how altloc IDs are handled:
  • 'first' - Use atoms that have the first altloc ID appearing in a residue.

  • 'occupancy' - Use atoms that have the altloc ID with the highest occupancy for a residue.

  • 'all' - Use all atoms. Note that this leads to duplicate atoms. When this option is chosen, the altloc_id annotation array is added to the returned structure.
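The three altloc options listed above can be illustrated on a toy list of atoms. This is a minimal sketch, not Biotite's implementation: the function `select_altloc`, the tuple layout and the toy data are hypothetical, and summing occupancies per altloc ID within a residue is an assumption about how "highest occupancy" could be decided.

```python
from collections import defaultdict

# Toy atoms: (res_id, atom_name, altloc_id, occupancy).
# An empty altloc_id means the atom has no alternative location.
atoms = [
    (1, "CA", "", 1.0),
    (1, "CB", "A", 0.3),
    (1, "CB", "B", 0.7),
    (2, "CA", "", 1.0),
]


def select_altloc(atoms, mode="first"):
    """Filter atoms by altloc ID (hypothetical helper)."""
    if mode == "all":
        return list(atoms)  # duplicates are kept
    # Collect the atoms that carry an altloc ID, grouped by residue
    alt_atoms_by_res = defaultdict(list)
    for atom in atoms:
        if atom[2]:
            alt_atoms_by_res[atom[0]].append(atom)
    # Choose one altloc ID per residue
    chosen = {}
    for res_id, alt_atoms in alt_atoms_by_res.items():
        if mode == "first":
            chosen[res_id] = alt_atoms[0][2]
        elif mode == "occupancy":
            # Assumption: compare summed occupancy per altloc ID
            totals = defaultdict(float)
            for atom in alt_atoms:
                totals[atom[2]] += atom[3]
            chosen[res_id] = max(totals, key=totals.get)
        else:
            raise ValueError(f"unknown altloc mode: {mode!r}")
    return [a for a in atoms if not a[2] or a[2] == chosen[a[0]]]


print([a[2] for a in select_altloc(atoms, "first")])      # prints ['', 'A', '']
print([a[2] for a in select_altloc(atoms, "occupancy")])  # prints ['', 'B', '']
print(len(select_altloc(atoms, "all")))                   # prints 4
```

With 'first' or 'occupancy' the toy structure has three atoms, while 'all' keeps four; an unfiltered coordinate read would likewise see all four positions.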

extra_fields : list of str, optional

The strings in the list are optional annotation categories that should be stored in the output array or stack. These are valid values: 'atom_id', 'b_factor', 'occupancy' and 'charge'.

include_bonds : bool, optional

If set to true, a BondList will be created for the resulting AtomArray containing the bond information from the file. All bonds have BondType.ANY, since the PDB format does not support bond orders.

Returns
array : AtomArray or AtomArrayStack

The return type depends on the model parameter.

classmethod read(file, *args, **kwargs)

Parse a file (or file-like object).

Parameters
file : file-like object or str

The file to be read. Alternatively a file path can be supplied.

Returns
file_object : File

An instance of the respective File subclass representing the parsed file.

static read_iter(file)

Create an iterator over each line of the given text file.

Parameters
file : file-like object or str

The file to be read. Alternatively a file path can be supplied.

Yields
line : str

The current line in the file.

set_structure(array, hybrid36=False)[source]

Set the AtomArray or AtomArrayStack for the file.

This also makes use of the optional annotation arrays 'atom_id', 'b_factor', 'occupancy' and 'charge'. If the atom array (stack) contains the annotation 'atom_id', these values will be used for atom numbering instead of continuous numbering.

Parameters
array : AtomArray or AtomArrayStack

The array or stack to be saved into this file. If a stack is given, each array in the stack is saved as a separate model.

hybrid36 : bool, optional

Defines whether the file should be written in hybrid-36 format.
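For readers unfamiliar with hybrid-36, the sketch below shows the general encoding idea: numbers that fit into the fixed-width field stay decimal, and larger numbers continue in base 36, first with upper-case and then with lower-case letters. The function `hy36_encode` is written for this illustration and is not Biotite's implementation; as a simplifying assumption, it handles only non-negative values.

```python
import string

UPPER_DIGITS = string.digits + string.ascii_uppercase
LOWER_DIGITS = string.digits + string.ascii_lowercase


def _to_base36(value, digits):
    """Convert a non-negative integer into a base-36 string."""
    result = ""
    while value:
        value, remainder = divmod(value, 36)
        result = digits[remainder] + result
    return result or digits[0]


def hy36_encode(width, value):
    """Encode a non-negative integer into a field of the given width.

    Illustrative sketch of the hybrid-36 scheme.
    """
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    decimal_limit = 10 ** width
    if value < decimal_limit:
        return str(value).rjust(width)  # plain right-aligned decimal
    value -= decimal_limit
    block_size = 26 * 36 ** (width - 1)  # values starting with a letter
    offset = 10 * 36 ** (width - 1)      # skip values starting with a digit
    if value < block_size:
        return _to_base36(value + offset, UPPER_DIGITS)
    value -= block_size
    if value < block_size:
        return _to_base36(value + offset, LOWER_DIGITS)
    raise ValueError("value out of range for this field width")


# Atom serial numbers use a 5-character field
print(hy36_encode(5, 99999))     # prints 99999
print(hy36_encode(5, 100000))    # prints A0000
print(hy36_encode(5, 43770016))  # prints a0000
```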

Notes

If array has an associated BondList, CONECT records are also written for all non-water hetero residues and all inter-residue connections.
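To make the CONECT records mentioned above more concrete, the following sketch formats such records according to the PDB file format layout: the record name in columns 1-6, followed by the atom serial number and up to four bonded serial numbers, each in a 5-character column. The function `format_conect` is hypothetical and only illustrates the record layout; it does not show how Biotite selects or writes the bonds.

```python
def format_conect(atom_serial, bonded_serials):
    """Format CONECT lines for one atom (hypothetical helper).

    A single record holds at most four bonded atoms, so longer
    lists are split over several records.
    """
    lines = []
    for start in range(0, len(bonded_serials), 4):
        chunk = bonded_serials[start:start + 4]
        line = "CONECT" + f"{atom_serial:>5d}"
        line += "".join(f"{serial:>5d}" for serial in chunk)
        lines.append(line)
    return lines


conect_lines = format_conect(1, [2, 3, 4, 5, 6])
for line in conect_lines:
    print(line)
# prints:
# CONECT    1    2    3    4    5
# CONECT    1    6
```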

write(file)

Write the contents of this object into a file (or file-like object).

Parameters
file : file-like object or str

The file to be written to. Alternatively a file path can be supplied.