lddt
- biotite.structure.lddt(reference, subject, aggregation='all', atom_mask=None, partner_mask=None, inclusion_radius=15, distance_bins=(0.5, 1.0, 2.0, 4.0), exclude_same_residue=True, exclude_same_chain=False, filter_function=None, symmetric=False)
Calculate the local Distance Difference Test (lDDT) score of a structure with respect to its reference. [1]
- Parameters:
- referenceAtomArray
The reference structure.
- subjectAtomArray or AtomArrayStack or ndarray, dtype=float, shape=(n,3) or shape=(m,n,3)
The structure(s) to evaluate with respect to reference. The number of atoms must be the same as in reference. Alternatively, coordinates can be provided directly as ndarray.
- aggregation{‘all’, ‘chain’, ‘residue’, ‘atom’} or ndarray, shape=(n,), dtype=int, optional
Defines on which scale the lDDT score is calculated.
‘all’: The score is computed over all contacts.
‘chain’: The score is calculated for each chain separately.
‘residue’: The score is calculated for each residue separately.
‘atom’: The score is calculated for each atom separately.
Alternatively, an array of aggregation bins can be provided, i.e. each contact is assigned to the corresponding bin.
- atom_maskndarray, shape=(n,), dtype=bool, optional
If given, the contacts are only computed for the masked atoms. Atoms excluded by the mask do not have any contacts and their lDDT would be NaN in case of aggregation="atom". Providing this mask can significantly speed up the computation if the lDDT is of interest only for certain chains/residues/atoms.
- partner_maskndarray, shape=(n,), dtype=bool, optional
If given, only contacts to the masked atoms are considered. While atom_mask does not alter the lDDT for the masked atoms, partner_mask does, as for each atom only the masked atoms are considered as potential contact partners.
- inclusion_radiusfloat, optional
Only pairwise atom distances within this radius in reference are considered.
- distance_binslist of float, optional
The distance bins for the score calculation, i.e. if a distance deviation is within the first bin, the score is 1; if it is outside all bins, the score is 0.
- exclude_same_residuebool, optional
If true, only atom distances between different residues are considered. Otherwise, also atom distances within the same residue are included.
- exclude_same_chainbool, optional
If true, only atom distances between different chains are considered. Otherwise, also atom distances within the same chain are included.
- filter_functionCallable(ndarray, shape=(n,2), dtype=int -> ndarray, shape=(n,), dtype=bool), optional
Used for custom contact filtering, if the other parameters are not sufficient. A function that takes an array of contact atom indices and returns a mask that is True for all contacts that should be retained. All other contacts are not considered for lDDT computation.
- symmetricbool, optional
If set to true, the lDDT score is computed symmetrically. This means that contacts found in the reference as well as contacts found in the subject structure are considered. Hence the score is independent of which structure is given as reference and which as subject. Note that in this case subject must be an AtomArray as well. By default, only contacts in the reference are considered.
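As an illustration of the filter_function contract described above (an (n,2) array of atom index pairs goes in, a boolean mask of length n comes out), here is a minimal NumPy-only sketch. The helper name make_index_filter and the toy contact array are hypothetical and not part of Biotite.

```python
import numpy as np


def make_index_filter(allowed_indices):
    """Build a filter that keeps contacts whose two atoms are both allowed.

    Hypothetical helper for illustration; not part of Biotite.
    """
    allowed = np.asarray(allowed_indices)

    def filter_function(contacts):
        # contacts has shape (n, 2): each row holds the two atom indices
        # of one contact. A contact is kept only if both indices are allowed.
        return np.isin(contacts, allowed).all(axis=1)

    return filter_function


# Toy contact list: four contacts between atoms 0..6
contacts = np.array([[0, 1], [0, 5], [2, 3], [5, 6]])
keep = make_index_filter([0, 1, 2, 3])(contacts)
print(keep)
```

A filter built this way could presumably be passed as filter_function=make_index_filter(indices), with indices taken from the reference structure, for example np.where(reference.atom_name == "CA")[0].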
- Returns:
- lddtfloat or ndarray, dtype=float
The lDDT score for each model and aggregation bin. The shape depends on subject and aggregation: If subject is an AtomArrayStack (or equivalent coordinate ndarray), a dimension depicting each model is added. If aggregation is not 'all', a second dimension with the length equal to the number of aggregation bins is added (i.e. number of chains, residues, etc.). If both an AtomArray as subject and aggregation='all' are passed, a float is returned.
Notes
The lDDT score measures how well the pairwise atom distances in a model match the corresponding distances in a reference. Hence, like rmspd(), it works superimposition-free, but instead of capturing the global deviation, only the local environment within the inclusion_radius is considered.
Note that by default, hydrogen atoms are also considered in the distance calculation. If this is undesired, the hydrogen atoms can be removed prior to the calculation.
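The principle described in these notes can be sketched in plain NumPy. This is a simplified illustration and not Biotite's implementation: the function name simple_lddt is hypothetical, the residue/chain exclusion and masking options are ignored, and the use of inclusive comparisons (<=) for the radius and the thresholds is an assumption.

```python
import numpy as np


def simple_lddt(ref_coord, sub_coord, inclusion_radius=15.0,
                distance_bins=(0.5, 1.0, 2.0, 4.0)):
    """Toy global lDDT for two (n, 3) coordinate arrays.

    Hypothetical sketch; no residue/chain exclusion, masks or aggregation.
    """
    ref_coord = np.asarray(ref_coord, dtype=float)
    sub_coord = np.asarray(sub_coord, dtype=float)
    # All pairwise distances within each structure, shape (n, n)
    ref_dist = np.linalg.norm(ref_coord[:, None] - ref_coord[None, :], axis=-1)
    sub_dist = np.linalg.norm(sub_coord[:, None] - sub_coord[None, :], axis=-1)
    # Contacts: pairs of distinct atoms within the radius in the reference
    not_self = ~np.eye(len(ref_coord), dtype=bool)
    contact_mask = (ref_dist <= inclusion_radius) & not_self
    deviation = np.abs(ref_dist - sub_dist)[contact_mask]
    if deviation.size == 0:
        return float("nan")
    # Per contact: fraction of thresholds within which the deviation stays
    thresholds = np.asarray(distance_bins, dtype=float)
    per_contact = (deviation[:, None] <= thresholds[None, :]).mean(axis=1)
    return float(per_contact.mean())


ref = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [6.0, 0.0, 0.0]])
# A pure translation leaves all internal distances unchanged,
# so no superimposition is needed and the score stays at 1.0
shifted = ref + np.array([10.0, -5.0, 2.0])
print(simple_lddt(ref, shifted))
```

Because only internal distances enter the calculation, rigid-body movements of the subject do not change the score, whereas moving a single atom lowers the score of every contact it takes part in.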
References
Examples
Calculate the global lDDT of all models to the first model:
>>> reference = atom_array_stack[0]
>>> subject = atom_array_stack[1:]
>>> print(lddt(reference, subject))
[0.799 0.769 0.792 0.836 0.799 0.752 0.860 0.769 0.825 0.777 0.760 0.787
 0.790 0.783 0.804 0.842 0.769 0.797 0.757 0.852 0.811 0.786 0.805 0.755
 0.734 0.794 0.771 0.778 0.842 0.772 0.815 0.789 0.828 0.750 0.826 0.739
 0.760]
Calculate the residue-wise lDDT for a single model:
>>> subject = atom_array_stack[1]
>>> print(lddt(reference, subject, aggregation="residue"))
[0.599 0.692 0.870 0.780 0.830 0.881 0.872 0.658 0.782 0.901 0.888 0.885
 0.856 0.795 0.847 0.603 0.895 0.878 0.871 0.789]
As an example of custom aggregation, calculate the lDDT for each chemical element:
>>> unique_elements = np.unique(reference.element)
>>> element_bins = np.array(
...     [np.where(unique_elements == element)[0][0] for element in reference.element]
... )
>>> element_lddt = lddt(reference, subject, aggregation=element_bins)
>>> for element, lddt_for_element in zip(unique_elements, element_lddt):
...     print(f"{element}: {lddt_for_element:.3f}")
C: 0.837
H: 0.770
N: 0.811
O: 0.808
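The aggregation bins in the example above map each atom to the index of its element among the unique elements. As a possible shortcut, np.unique can return this mapping directly via return_inverse=True. The sketch below uses a small made-up element array in place of reference.element so that it runs without a structure.

```python
import numpy as np

# Made-up stand-in for reference.element
elements = np.array(["N", "C", "C", "O", "H", "C"])

# unique_elements is sorted; element_bins[i] is the position of
# elements[i] within unique_elements, i.e. one integer bin per atom
unique_elements, element_bins = np.unique(elements, return_inverse=True)

print(unique_elements)
print(element_bins)
```

The resulting integer array has one entry per atom and should be usable as the aggregation argument in the same way as the element_bins array built with np.where above.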
If the reference structure has more atoms resolved than the subject structure, the missing atoms can be indicated with NaN values:
>>> reference = atom_array_stack[0]
>>> subject = atom_array_stack[1].copy()
>>> # Simulate the situation where the first residue is missing in the subject
>>> subject.coord[subject.res_id == 1] = np.nan
>>> global_lddt = lddt(reference, subject)
>>> print(f"{global_lddt:.3f}")
0.751