biotite.database.pubchem.SimilarityQuery

class biotite.database.pubchem.SimilarityQuery(threshold=0.9, conformation_based=False, **kwargs)

Bases: StructureQuery

A query that searches for all structures similar to the given input structure.

Exactly one of the input structure parameters smiles, smarts, inchi, sdf or cid must be given.

Parameters
threshold : float, optional

The minimum required Tanimoto similarity for a match. Must be between 0 (no similarity) and 1 (complete match).

conformation_based : bool, optional

If set to True, the similarity is computed based on the 3D conformation. By default, only the elements and the bonds between the atoms are considered for the similarity computation.

smiles : str, optional

The query SMILES string.

smarts : str, optional

The query SMARTS pattern.

inchi : str, optional

The query InChI string.

sdf : str, optional

A query structure as an SDF-formatted string. Usually from_atoms() is used to create the SDF from an AtomArray.

cid : int, optional

The query structure given as CID.

number : int, optional

The maximum number of matches that this query may return. By default, the PubChem default value is used, which can be considered unlimited.

Notes

The conformation-based similarity measure uses shape-Tanimoto and color-Tanimoto scores [1].
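As an illustration of the kind of score that threshold is compared against, the sketch below computes a Tanimoto similarity for two binary fingerprints, each represented as a set of "on" bit positions: the number of shared bits divided by the number of bits set in either fingerprint. The function name, the example bit sets and the handling of two empty fingerprints are assumptions made for this sketch. It is neither PubChem's fingerprinting nor Biotite's code, and it covers only the default, non-conformation-based case.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints: treated as identical here (arbitrary choice)
        return 1.0
    return len(a & b) / len(union)


# Hypothetical fingerprints, not derived from real molecules
query = {1, 4, 7, 9}
candidate = {1, 4, 7, 12}

score = tanimoto(query, candidate)   # 3 shared bits / 5 bits in the union
is_match = score >= 0.9              # compare against the default threshold
```

With these example sets the candidate falls below the default threshold of 0.9, whereas a fingerprint compared with itself always scores 1, which is why threshold=1.0 only admits complete matches.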

References

[1]

S. Kim, P. A. Thiessen, T. Cheng, B. Yu, E. E. Bolton, “An update on PUG-REST: RESTful interface for programmatic access to PubChem,” Nucleic Acids Research, vol. 46, pp. W563-W570, July 2018. doi: 10.1093/nar/gky294

Examples

>>> from biotite.database.pubchem import SimilarityQuery, search
>>> from biotite.structure.info import residue
>>> # CID of alanine
>>> print(search(SimilarityQuery(cid=5950, threshold=1.0, number=5)))
[5950, ..., ..., ..., ...]
>>> # AtomArray of alanine
>>> atom_array = residue("ALA")
>>> print(search(SimilarityQuery.from_atoms(atom_array, threshold=1.0, number=5)))
[5950, ..., ..., ..., ...]

classmethod from_atoms(atoms, *args, **kwargs)

Create a query using the given query structure.

Parameters
atoms : AtomArray or AtomArrayStack

The query structure.

**kwargs : dict, optional

See the constructor for additional options.

get_files()

Get the POST file payload for this query.

Returns
params : dict (str -> object)

The file payload.

get_input_url_path()

Get the input part of the request URL.

Returns
get_input_url_path : str

The input part of the request URL. Must not contain slash characters at the beginning and end of the string.

get_params()

Get the POST payload for this query.

Returns
params : dict (str -> object)

The payload.

search_options()

Get additional search options for the POST payload.

PROTECTED: Override when inheriting.

Returns
options : dict (str -> object)

The keys are automatically converted from snake case to the camel case required by the request parameters.
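The key conversion mentioned above can be pictured with the following minimal sketch. It assumes that the target is upper camel case, as used by PUG REST option names such as MaxRecords and Threshold. The helper names and the example keys are hypothetical and do not reproduce Biotite's internal implementation.

```python
def to_camel_case(name):
    """Convert a snake_case name to upper camel case (assumed convention)."""
    return "".join(word.capitalize() for word in name.split("_"))


def convert_options(options):
    """Return a copy of the options dict with converted keys."""
    return {to_camel_case(key): value for key, value in options.items()}


# Hypothetical option names, chosen only to show the conversion
converted = convert_options({"max_records": 5, "threshold": 90})
```

A subclass overriding search_options() would therefore write its keys in snake case and leave the conversion to the base class.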

search_type()

Get the type of performed search for the request input part.

PROTECTED: Override when inheriting.

Returns
search_type : str

The search type for the input part, i.e. the part directly after compound/.
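To show how the search type fits into the input part of the request URL, here is a rough sketch that joins the pieces into a path without leading or trailing slashes. The function name is hypothetical. The value fastsimilarity_2d is used as an example of a PUG REST search type and is an assumption here, since the value this class actually returns is not stated above. Likewise, placing the CID directly in the path is an assumption based on the general PUG REST URL layout.

```python
def build_input_path(search_type, key, value=None):
    """Join the input part of a PUG REST URL (illustrative helper only)."""
    parts = ["compound", search_type, key]
    if value is not None:
        parts.append(str(value))
    # Strip stray slashes so the result never starts or ends with "/"
    return "/".join(part.strip("/") for part in parts)


# Example values: 2D similarity search type (assumed) and the CID of alanine
path = build_input_path("fastsimilarity_2d", "cid", 5950)

# The input part is followed by the operation and output format
url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/" + path + "/cids/TXT"
```

In this layout the search type is the path segment directly after compound/, which matches the description of the return value above.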