to_mol#

biotite.interface.rdkit.to_mol(atoms, kekulize=False, use_dative_bonds=False, include_extra_annotations=(), explicit_hydrogen=None)[source]#

Convert an AtomArray or AtomArrayStack into a rdkit.Chem.rdchem.Mol.

Parameters:
atomsAtomArray or AtomArrayStack

The molecule to be converted. Must have an associated BondList.

kekulizebool, optional

If set to true, aromatic bonds are represented by single, double and triple bonds. By default, aromatic bond types are converted to rdkit.rdchem.BondType.AROMATIC.

use_dative_bondsbool, optional

If set to true, BondType.COORDINATION bonds are translated to rdkit.rdchem.BondType.DATIVE bonds instead of rdkit.rdchem.BondType.SINGLE bonds. This may have the undesired side effect that a rdkit.Chem.rdchem.KekulizeException is raised for some molecules, when the returned rdkit.Chem.rdchem.Mol is kekulized.

include_extra_annotationslist of str, optional

Names of annotation arrays in atoms that are added as atom-level property with the same name to the returned rdkit.Chem.rdchem.Mol. These properties can be accessed with rdkit.Chem.rdchem.Mol.GetProp(). Note that standard annotations (e.g. 'chain_id', 'atom_name', 'res_name') are always included per default. These standard annotations can be accessed with rdkit.Chem.rdchem.Atom.GetPDBResidueInfo() for each atom in the returned rdkit.Chem.rdchem.Mol.

explicit_hydrogenbool, optional

If set to true, the conversion process expects that all hydrogen atoms are explicit, i.e. each each hydrogen atom must be part of the AtomArray. If set to false, the conversion process treats all hydrogen atoms as implicit. By default, explicit hydrogen atoms are only assumed if any hydrogen atoms are present in atoms.

Returns:
molrdkit.Chem.rdchem.Mol

The RDKit molecule. If the input atoms is an AtomArrayStack, all models are included as conformers with conformer IDs starting from 0.

Raises:
BadStructureError

If the input atoms does not have an associated BondList. Also raises a BadStructureError, if explicit_hydrogen is set to False despite hydrogen atoms being present in atoms.

Notes

The atoms in the return value are in the same order as the input atoms, i.e. indices pointing to the rdkit.Chem.rdchem.Mol can be used to point to the same atoms in the AtomArray.

Examples

>>> from rdkit.Chem import MolToSmiles
>>> alanine_atom_array = residue("ALA")
>>> mol = to_mol(alanine_atom_array)
>>> print(MolToSmiles(mol))
[H]OC(=O)C([H])(N([H])[H])C([H])([H])[H]

By default, 'atom_name' is stored in RDKit’s PDBResidueInfo grouping for each atom. We can access it manually as below

>>> for atom in mol.GetAtoms():
...     print(atom.GetPDBResidueInfo().GetName())
N
CA
C
O
CB
OXT
H
H2
HA
HB1
HB2
HB3
HXT