biotite.structure.io.pdbx.PDBxFile

class biotite.structure.io.pdbx.PDBxFile

Bases: biotite.TextFile, collections.abc.MutableMapping

This class represents a PDBx/mmCIF file.

The categories of the file can be accessed using the get_category()/set_category() methods. The content of each category is represented by a dictionary keyed by entry name (e.g. label_entity_id in atom_site). The values are strings for non-looped categories and 1-D numpy arrays of string objects for looped categories.

A category can be changed or added using set_category(): If a string-valued dictionary is provided, a non-looped category is created; if an array-valued dictionary is given, a looped category is created. In the latter case, all arrays must have the same length.
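The looped/non-looped rule can be checked before a dictionary is handed to set_category(). The following is a minimal sketch rather than biotite code: classify_category is a hypothetical helper, plain lists stand in for the 1-D numpy arrays so that the snippet has no dependencies, and rejecting dictionaries that mix strings and arrays is an assumption made here.

```python
def classify_category(category_dict):
    """Classify a category dictionary as 'non-looped' or 'looped'.

    Hypothetical helper for illustration; it is not part of biotite.
    """
    values = list(category_dict.values())
    # All values are plain strings: one value per entry, no loop needed
    if all(isinstance(v, str) for v in values):
        return "non-looped"
    # Assumption: mixing string values and columns is treated as an error
    if any(isinstance(v, str) for v in values):
        raise ValueError("cannot mix string values and array values")
    # Every column of a looped category must have the same number of rows
    lengths = {len(v) for v in values}
    if len(lengths) > 1:
        raise ValueError(f"columns differ in length: {sorted(lengths)}")
    return "looped"


# Lists stand in for numpy arrays of strings
entry = {"id": "1L2Y"}
authors = {
    "name": ["Neidigh, J.W.", "Fesinmeyer, R.M.", "Andersen, N.H."],
    "ordinal": ["1", "2", "3"],
}
print(classify_category(entry))
print(classify_category(authors))
```

Running such a check first turns a ragged dictionary into an early, readable error instead of a malformed loop in the written file.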

Alternatively, the content of this file can be read and written using dictionary-like indexing: you can provide either a data block and a category, or only a category, in which case the first data block is used.
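The two indexing forms can be illustrated with a toy MutableMapping. This is a sketch under stated assumptions, not the class's implementation: BlockedCategories is a hypothetical stand-in, the tuple order (block, category) is assumed from the wording above, and category contents are stored as plain dictionaries instead of file lines.

```python
from collections.abc import MutableMapping


class BlockedCategories(MutableMapping):
    """Toy container: data block name -> category name -> category dict."""

    def __init__(self):
        # Regular dicts keep insertion order, so the "first" block is well defined
        self._blocks = {}

    def _resolve(self, index):
        # A tuple is read as (block, category); this order is an assumption
        if isinstance(index, tuple):
            return index
        if not self._blocks:
            raise KeyError(index)
        # Only a category was given: fall back to the first data block
        return next(iter(self._blocks)), index

    def __getitem__(self, index):
        block, category = self._resolve(index)
        return self._blocks[block][category]

    def __setitem__(self, index, category_dict):
        block, category = self._resolve(index)
        self._blocks.setdefault(block, {})[category] = category_dict

    def __delitem__(self, index):
        block, category = self._resolve(index)
        del self._blocks[block][category]

    def __iter__(self):
        for block, categories in self._blocks.items():
            for category in categories:
                yield block, category

    def __len__(self):
        return sum(len(c) for c in self._blocks.values())


demo = BlockedCategories()
demo["1L2Y", "entry"] = {"id": "1L2Y"}           # block and category
demo["citation_author"] = {"name": ["Neidigh, J.W."]}  # first block implied
print(demo["1L2Y", "citation_author"]["name"])
```

Because only the five abstract methods are defined, the inherited mapping methods (get(), pop(), update() and so on) come from MutableMapping itself.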

Notes

This class is also able to detect and parse multiline entries in the file. However, when writing a category, values are never split across multiple lines, which can result in long lines.

This class creates category dictionaries lazily: When reading the file, only the line positions of all categories are recorded. The time-consuming task of dictionary creation is deferred until get_category() is called.
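The lazy strategy can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the parser used by the class: LazyCategoryIndex is a hypothetical name, categories are assumed to be separated by "#" lines, multiline values are ignored, and lists are returned where the class would return numpy arrays.

```python
import shlex

DEMO = """\
data_1L2Y
_entry.id   1L2Y
#
loop_
_citation_author.name
_citation_author.ordinal
'Neidigh, J.W.' 1
'Fesinmeyer, R.M.' 2
#
"""


class LazyCategoryIndex:
    """Record line spans on construction, build dictionaries on request."""

    def __init__(self, text):
        self._lines = text.splitlines()
        self._spans = {}  # category -> (start, stop, is_looped)
        start, name, looped = None, None, False
        for i, line in enumerate(self._lines):
            if line.startswith("loop_"):
                start, name, looped = i, None, True
            elif line.startswith("_"):
                if start is None:
                    start = i
                if name is None:
                    # "_entry.id" -> category name "entry"
                    name = line[1:].split(".", 1)[0]
            elif line.startswith(("#", "data_")):
                # A separator closes the category that is currently open
                if name is not None:
                    self._spans[name] = (start, i, looped)
                start, name, looped = None, None, False
        if name is not None:
            self._spans[name] = (start, len(self._lines), looped)

    def get_category(self, category):
        if category not in self._spans:
            return None
        start, stop, looped = self._spans[category]
        # The expensive tokenizing happens only here, for one category
        lines = [l for l in self._lines[start:stop] if l.strip()]
        if not looped:
            result = {}
            for line in lines:
                key, value = shlex.split(line)
                result[key.split(".", 1)[1]] = value
            return result
        keys = [l.strip().split(".", 1)[1] for l in lines if l.startswith("_")]
        rows = [shlex.split(l) for l in lines[1:] if not l.startswith("_")]
        return {k: [row[i] for row in rows] for i, k in enumerate(keys)}


index = LazyCategoryIndex(DEMO)
print(index.get_category("citation_author")["name"])
```

The constructor touches every line once but never tokenizes values, so a caller that needs a single category from a large file pays the parsing cost only for that category.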

Examples

Read the file and get author names:

>>> import os.path
>>> file = PDBxFile.read(os.path.join(path_to_structures, "1l2y.cif"))
>>> author_dict = file.get_category("citation_author", block="1L2Y")
>>> print(author_dict["name"])
['Neidigh, J.W.' 'Fesinmeyer, R.M.' 'Andersen, N.H.']

Dictionary style indexing, no specification of data block:

>>> print(file["citation_author"]["name"])
['Neidigh, J.W.' 'Fesinmeyer, R.M.' 'Andersen, N.H.']

Get the structure from the file:

>>> arr = get_structure(file)
>>> print(type(arr).__name__)
AtomArrayStack
>>> arr = get_structure(file, model=1)
>>> print(type(arr).__name__)
AtomArray

Modify atom array and write it back into the file:

>>> arr_mod = rotate(arr, [1,2,3])
>>> set_structure(file, arr_mod)
>>> file.write(os.path.join(path_to_directory, "1l2y_mod.cif"))

clear() -> None.  Remove all items from D.
copy()

Create a deep copy of this object.

Returns
copy

A copy of this object.

get(k[, d]) -> D[k] if k in D, else d.  d defaults to None.
get_block_names()

Get the names of all data blocks in the file.

Returns
blocks : list

List of data block names.

get_category(category, block=None, expect_looped=False)

Get the dictionary for a given category.

Parameters
category : string

The name of the category. The leading underscore is omitted.

block : string, optional

The name of the data block. Default is the first (and usually the only) data block of the file.

expect_looped : bool, optional

If set to true, the returned dictionary always contains arrays, provided the category exists: if the category is non-looped, each array contains exactly one element.

Returns
category_dict : dict of (str or ndarray, dtype=str) or None

An entry-keyed dictionary. The values are strings for non-looped categories and arrays of strings for looped categories. Returns None if the data block does not contain the given category.
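The effect of expect_looped can be mimicked on an already retrieved dictionary. The helper below is a hypothetical sketch, not biotite code; single-element lists stand in for the single-element arrays described above.

```python
def as_looped(category_dict):
    """Wrap each string value in a one-element list; leave columns unchanged."""
    # A missing category stays None, as with get_category()
    if category_dict is None:
        return None
    return {
        key: [value] if isinstance(value, str) else value
        for key, value in category_dict.items()
    }


print(as_looped({"id": "1L2Y"}))
```

Code that iterates over rows can then treat looped and non-looped categories uniformly, without checking the value type first.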

items() -> a set-like object providing a view on D's items
keys() -> a set-like object providing a view on D's keys
pop(k[, d]) -> v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

classmethod read(file)

Read a PDBx/mmCIF file.

Parameters
file : file-like object or str

The file to be read. Alternatively a file path can be supplied.

Returns
file_object : PDBxFile

The parsed file.

static read_iter(file)

Create an iterator over each line of the given text file.

Parameters
file : file-like object or str

The file to be read. Alternatively a file path can be supplied.

Yields
line : str

The current line in the file.
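A generator that accepts either a path or a file-like object might look as follows. This is a sketch of the general pattern, not the library's implementation: iter_lines is a hypothetical name, and yielding lines with their trailing newline (as Python file iteration does) is an assumption.

```python
import io


def iter_lines(file):
    """Yield the lines of a text file given as a path (str) or file-like object."""
    if isinstance(file, str):
        # A path was supplied: open the file and close it when iteration ends
        with open(file, "r") as handle:
            yield from handle
    else:
        # A file-like object was supplied: the caller keeps ownership of it
        yield from file


for line in iter_lines(io.StringIO("data_demo\n_entry.id demo\n")):
    print(line.rstrip("\n"))
```

Iterating this way avoids loading the whole file into memory, which matters when only a few lines of a large file are of interest.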

set_category(category, category_dict, block=None)

Set the content of a category.

If the category already exists, all lines corresponding to the category are replaced. Otherwise a new category is created and its lines are appended at the end of the data block.

Parameters
category : string

The name of the category. The leading underscore is omitted.

category_dict : dict

The category content. The dictionary must have strings (subcategories) as keys and strings or ndarray objects as values.

block : string, optional

The name of the data block. Default is the first (and usually the only) data block of the file. If the block is not yet contained in the file, a new block is appended at the end of the file.

setdefault(k[, d]) -> D.get(k,d), also set D[k]=d if k not in D
update([E, ]**F) -> None.  Update D from mapping/iterable E and F.

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v

values() -> an object providing a view on D's values
write(file)

Write the contents of this object into a file (or file-like object).

Parameters
file : file-like object or str

The file to be written to. Alternatively a file path can be supplied.