lddt

biotite.structure.lddt(reference, subject, aggregation='all', atom_mask=None, partner_mask=None, inclusion_radius=15, distance_bins=(0.5, 1.0, 2.0, 4.0), exclude_same_residue=True, exclude_same_chain=False, filter_function=None, symmetric=False)

Calculate the local Distance Difference Test (lDDT) score of a structure with respect to its reference. [1]

Parameters:
reference : AtomArray

The reference structure.

subject : AtomArray or AtomArrayStack or ndarray, dtype=float, shape=(n,3) or shape=(m,n,3)

The structure(s) to evaluate with respect to reference. The number of atoms must be the same as in reference. Alternatively, coordinates can be provided directly as ndarray.

aggregation : {‘all’, ‘chain’, ‘residue’, ‘atom’} or ndarray, shape=(n,), dtype=int, optional

Defines on which scale the lDDT score is calculated.

  • ‘all’: The score is computed over all contacts.

  • ‘chain’: The score is calculated for each chain separately.

  • ‘residue’: The score is calculated for each residue separately.

  • ‘atom’: The score is calculated for each atom separately.

Alternatively, an array of aggregation bins can be provided, i.e. each contact is assigned to the corresponding bin.

atom_mask : ndarray, shape=(n,), dtype=bool, optional

If given, contacts are only computed for the masked atoms. Atoms excluded by the mask do not have any contacts, so their lDDT is NaN in case of aggregation="atom". Providing this mask can significantly speed up the computation if the lDDT is of interest only for certain chains, residues or atoms.

partner_mask : ndarray, shape=(n,), dtype=bool, optional

If given, only contacts to the masked atoms are considered. While atom_mask does not alter the lDDT for the masked atoms, partner_mask does, as for each atom only the masked atoms are considered as potential contact partners.

inclusion_radius : float, optional

Only pairwise atom distances that lie within this radius in reference are considered.

distance_bins : list of float, optional

The distance bins for the score calculation, i.e. if a distance deviation is within the first bin, the score is 1; if it is outside all bins, the score is 0.

exclude_same_residue : bool, optional

If true, only atom distances between different residues are considered. Otherwise, atom distances within the same residue are included as well.

exclude_same_chain : bool, optional

If true, only atom distances between different chains are considered. Otherwise, atom distances within the same chain are included as well.

filter_function : Callable(ndarray, shape=(n,2), dtype=int -> ndarray, shape=(n,), dtype=bool), optional

Used for custom contact filtering, if the other parameters are not sufficient. A function that takes an array of contact atom indices and returns a mask that is True for all contacts that should be retained. All other contacts are not considered for lDDT computation.

symmetric : bool, optional

If set to true, the lDDT score is computed symmetrically. This means both contacts found in the reference and subject structure are considered. Hence the score is independent of which structure is given as reference and subject. Note that in this case subject must be an AtomArray as well. By default, only contacts in the reference are considered.

Returns:
lddt : float or ndarray, dtype=float

The lDDT score for each model and aggregation bin. The shape depends on subject and aggregation: If subject is an AtomArrayStack (or equivalent coordinate ndarray), a dimension representing the models is added. If aggregation is not 'all', a second dimension with a length equal to the number of aggregation bins (i.e. number of chains, residues, etc.) is added. If both an AtomArray as subject and aggregation='all' are passed, a float is returned.

Notes

The lDDT score measures how well the pairwise atom distances in a model match the corresponding distances in a reference. Hence, like rmspd(), it is superimposition-free, but instead of capturing the global deviation, it considers only the local environment within the inclusion_radius.
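
The idea described above can be illustrated with a minimal pure-Python sketch. It is not biotite's implementation: toy_lddt is a hypothetical helper that handles only a single model and the equivalent of aggregation='all', and the treatment of deviations that fall exactly on a bin boundary is an assumption. Each contact is assumed to score the fraction of distance bins its deviation does not exceed.

```python
import bisect
import math
from itertools import combinations


def toy_lddt(ref_coord, sub_coord, res_ids,
             inclusion_radius=15.0, distance_bins=(0.5, 1.0, 2.0, 4.0)):
    """Global lDDT-like score (hypothetical helper, not part of biotite)."""
    scores = []
    for i, j in combinations(range(len(ref_coord)), 2):
        # Mimic exclude_same_residue=True
        if res_ids[i] == res_ids[j]:
            continue
        # Contacts are defined by the distances in the reference only
        ref_dist = math.dist(ref_coord[i], ref_coord[j])
        if ref_dist > inclusion_radius:
            continue
        deviation = abs(math.dist(sub_coord[i], sub_coord[j]) - ref_dist)
        # Number of bins the deviation exceeds (boundary handling assumed)
        n_exceeded = bisect.bisect_left(distance_bins, deviation)
        scores.append(1 - n_exceeded / len(distance_bins))
    # Without any contact the score is undefined
    return sum(scores) / len(scores) if scores else float("nan")


# Three atoms on a line, each in its own residue
res_ids = [1, 2, 3]
ref = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
# The third atom is displaced by 1.5 along the x-axis
moved = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.5, 0.0, 0.0)]
identical_score = toy_lddt(ref, ref, res_ids)
moved_score = toy_lddt(ref, moved, res_ids)

# An atom outside the inclusion radius forms no contacts,
# so displacing it does not change the score
ref_far = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
sub_far = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (30.0, 20.0, 0.0)]
far_score = toy_lddt(ref_far, sub_far, res_ids)

print(identical_score, round(moved_score, 3), far_score)
```

In this sketch the two contacts involving the displaced atom deviate by 1.5 and therefore each satisfy two of the four bins, while the unaffected contact satisfies all of them.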

Note that by default, hydrogen atoms are also considered in the distance calculation. If this is undesired, remove the hydrogen atoms prior to the calculation.
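
The difference between the atom_mask and partner_mask parameters can be illustrated with another pure-Python sketch of a per-atom score. It is a simplified reading of the parameter descriptions, not biotite's implementation: toy_atom_lddt is a hypothetical helper, every atom is assumed to belong to its own residue, and the bin boundary handling is an assumption.

```python
import bisect
import math


def toy_atom_lddt(ref, sub, atom_mask=None, partner_mask=None,
                  inclusion_radius=15.0, distance_bins=(0.5, 1.0, 2.0, 4.0)):
    """Per-atom lDDT-like scores (hypothetical helper, not part of biotite)."""
    n = len(ref)
    atom_mask = [True] * n if atom_mask is None else atom_mask
    partner_mask = [True] * n if partner_mask is None else partner_mask
    result = []
    for i in range(n):
        scores = []
        for j in range(n):
            # Atom i needs to pass atom_mask, its partner j needs to pass partner_mask
            if i == j or not atom_mask[i] or not partner_mask[j]:
                continue
            ref_dist = math.dist(ref[i], ref[j])
            if ref_dist > inclusion_radius:
                continue
            deviation = abs(math.dist(sub[i], sub[j]) - ref_dist)
            n_exceeded = bisect.bisect_left(distance_bins, deviation)
            scores.append(1 - n_exceeded / len(distance_bins))
        # Atoms without contacts get NaN
        result.append(sum(scores) / len(scores) if scores else float("nan"))
    return result


ref = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
sub = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.5, 0.0, 0.0)]

unmasked = toy_atom_lddt(ref, sub)
# Restrict the calculation to the first atom: its score is unchanged
only_first = toy_atom_lddt(ref, sub, atom_mask=[True, False, False])
# Ignore the displaced third atom as contact partner: scores change
no_third_partner = toy_atom_lddt(ref, sub, partner_mask=[True, True, False])

print(unmasked)          # [0.75, 0.75, 0.5]
print(only_first)        # [0.75, nan, nan]
print(no_third_partner)  # [1.0, 1.0, 0.5]
```

In this sketch, atom_mask only selects the atoms a score is reported for, whereas partner_mask changes the set of contacts and therefore the scores themselves.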

References

Examples

Calculate the global lDDT of all models to the first model:

>>> reference = atom_array_stack[0]
>>> subject = atom_array_stack[1:]
>>> print(lddt(reference, subject))
[0.799 0.769 0.792 0.836 0.799 0.752 0.860 0.769 0.825 0.777 0.760 0.787
 0.790 0.783 0.804 0.842 0.769 0.797 0.757 0.852 0.811 0.786 0.805 0.755
 0.734 0.794 0.771 0.778 0.842 0.772 0.815 0.789 0.828 0.750 0.826 0.739
 0.760]

Calculate the residue-wise lDDT for a single model:

>>> subject = atom_array_stack[1]
>>> print(lddt(reference, subject, aggregation="residue"))
[0.599 0.692 0.870 0.780 0.830 0.881 0.872 0.658 0.782 0.901 0.888 0.885
 0.856 0.795 0.847 0.603 0.895 0.878 0.871 0.789]

As an example of custom aggregation, calculate the lDDT for each chemical element:

>>> unique_elements = np.unique(reference.element)
>>> element_bins = np.array(
...     [np.where(unique_elements == element)[0][0] for element in reference.element]
... )
>>> element_lddt = lddt(reference, subject, aggregation=element_bins)
>>> for element, lddt_for_element in zip(unique_elements, element_lddt):
...     print(f"{element}: {lddt_for_element:.3f}")
C: 0.837
H: 0.770
N: 0.811
O: 0.808

If the reference structure has more atoms resolved than the subject structure, the missing atoms can be indicated with NaN values:

>>> reference = atom_array_stack[0]
>>> subject = atom_array_stack[1].copy()
>>> # Simulate the situation where the first residue is missing in the subject
>>> subject.coord[subject.res_id == 1] = np.nan
>>> global_lddt = lddt(reference, subject)
>>> print(f"{global_lddt:.3f}")
0.751