, model=None, data_block=None, altloc='first', extra_fields=None, use_author_fields=True)

Create an AtomArray or AtomArrayStack from the atom_site category in a PDBxFile.


The file object.

model : int, optional

If this parameter is given, the function returns an AtomArray built from the atoms of the given model number (starting at 1). Negative values index models starting from the last model instead of the first. If this parameter is omitted, an AtomArrayStack containing all models is returned, even if the structure contains only one model.
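The model numbering described above can be illustrated with a minimal sketch. The helper `resolve_model_index` is hypothetical and not part of the library; it only shows how a 1-based model number, with negative values counting from the end, could map to a 0-based position. Rejecting 0 is an assumption made for this sketch.

```python
def resolve_model_index(model, model_count):
    """Map a 1-based model number to a 0-based index (hypothetical helper)."""
    if model == 0:
        # Assumption: numbering starts at 1, so 0 has no meaning here
        raise ValueError("Model numbers start at 1; 0 is not valid")
    if model > 0:
        # Positive numbers count from the first model
        index = model - 1
    else:
        # Negative numbers count from the last model
        index = model_count + model
    if not 0 <= index < model_count:
        raise ValueError(
            f"Model {model} is out of range for {model_count} models"
        )
    return index
```

With this convention, `model=1` selects the first model and `model=-1` selects the last one.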

data_block : str, optional

The name of the data block. Default is the first (and usually only) data block of the file.

altloc : {'first', 'occupancy', 'all'}
This parameter defines how altloc IDs are handled:
  • 'first' - Use atoms that have the first altloc ID appearing in a residue.

  • 'occupancy' - Use atoms that have the altloc ID with the highest occupancy for a residue.

  • 'all' - Use all atoms. Note that this leads to duplicate atoms. When this option is chosen, the altloc_id annotation array is added to the returned structure.
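The three altloc modes can be sketched on plain Python data. This is an illustration under stated assumptions, not the library's implementation: `filter_altloc` is a hypothetical helper, atoms are represented as dictionaries, residues are identified by `res_id` alone, `'.'` and `'?'` mark atoms without an altloc ID, and the 'occupancy' mode compares the summed occupancy of each altloc ID within a residue.

```python
NO_ALTLOC = (".", "?")  # assumed placeholders for "no altloc ID"

def filter_altloc(atoms, altloc="first"):
    """Select atoms according to the altloc mode (hypothetical helper)."""
    if altloc not in ("first", "occupancy", "all"):
        raise ValueError(f"Unknown altloc mode: {altloc!r}")
    if altloc == "all":
        # Keep everything, including duplicate atoms
        return list(atoms)
    # Sum occupancies per altloc ID for each residue; dictionaries
    # preserve insertion order, so the first key is the first ID seen
    totals = {}
    for atom in atoms:
        alt = atom["altloc_id"]
        if alt in NO_ALTLOC:
            continue
        res_totals = totals.setdefault(atom["res_id"], {})
        res_totals[alt] = res_totals.get(alt, 0.0) + atom["occupancy"]
    # Choose one altloc ID per residue
    chosen = {}
    for res_id, res_totals in totals.items():
        if altloc == "first":
            chosen[res_id] = next(iter(res_totals))
        else:
            chosen[res_id] = max(res_totals, key=res_totals.get)
    # Atoms without an altloc ID are always kept
    return [
        atom for atom in atoms
        if atom["altloc_id"] in NO_ALTLOC
        or atom["altloc_id"] == chosen[atom["res_id"]]
    ]
```

Only the 'all' mode can return several copies of the same atom, which is why the altloc ID is needed to tell them apart in that case.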

extra_fields : list of str, optional

The strings in the list are entry names that are added as additional annotation arrays. The annotation category name is the same as the PDBx subcategory name, and the array type is always str. The special field identifiers 'atom_id', 'b_factor', 'occupancy' and 'charge' are exceptions: they convert the corresponding subcategory into an annotation array with an appropriate type.
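The difference between ordinary entries and the special field identifiers can be sketched as follows. The helper `read_extra_field` and the `SPECIAL_FIELDS` table are hypothetical; the mapping to the atom_site subcategories `id`, `B_iso_or_equiv`, `occupancy` and `pdbx_formal_charge` and the chosen Python types are assumptions made for illustration, and placeholder values such as `'?'` are not handled.

```python
# Assumed mapping: special identifier -> (atom_site subcategory, type)
SPECIAL_FIELDS = {
    "atom_id": ("id", int),
    "b_factor": ("B_iso_or_equiv", float),
    "occupancy": ("occupancy", float),
    "charge": ("pdbx_formal_charge", int),
}

def read_extra_field(columns, field):
    """Return (annotation name, values) for one extra field.

    `columns` maps subcategory names to lists of raw strings.
    """
    if field in SPECIAL_FIELDS:
        # Special identifiers are converted to a numeric type
        subcategory, convert = SPECIAL_FIELDS[field]
        return field, [convert(value) for value in columns[subcategory]]
    # Any other entry keeps its subcategory name and stays a string
    return field, [str(value) for value in columns[field]]
```

In this sketch, requesting `'b_factor'` yields floats, whereas requesting a plain subcategory name such as `'pdbx_PDB_ins_code'` yields the raw strings.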

use_author_fields : bool, optional

Some fields can be read from two alternative sources; for example, both label_seq_id and auth_seq_id describe the ID of the residue. While the label_xxx fields can be used as official pointers to other categories in the PDBxFile, the auth_xxx fields are set by the author(s) of the structure and are consistent with the corresponding values in PDB files. If use_author_fields is true, the annotation arrays are read from the auth_xxx fields (if applicable), otherwise from the label_xxx fields. If the requested field is not available, the respective other field is used as a fallback.
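The fallback between auth_xxx and label_xxx fields can be sketched in a few lines. The helper `pick_field` is hypothetical and operates on a plain dictionary of columns rather than on a PDBxFile; it only illustrates the order in which the two sources are tried.

```python
def pick_field(columns, name, use_author_fields=True):
    """Return the auth_ or label_ variant of a field, with fallback.

    `columns` maps field names to lists of values (hypothetical input).
    """
    if use_author_fields:
        prefixes = ("auth_", "label_")
    else:
        prefixes = ("label_", "auth_")
    # Try the preferred source first, then the alternative one
    for prefix in prefixes:
        key = prefix + name
        if key in columns:
            return columns[key]
    raise KeyError(f"Neither auth_{name} nor label_{name} is available")
```

For a field that exists only as `label_atom_id`, the call `pick_field(columns, "atom_id")` still succeeds because the label_xxx variant is used as the fallback.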

array : AtomArray or AtomArrayStack

The return type depends on the model parameter.


>>> import os.path
>>> file = PDBxFile.read(os.path.join(path_to_structures, "1l2y.cif"))
>>> arr = get_structure(file, model=1)
>>> print(len(arr))