biotite.database.pubchem.fetch

biotite.database.pubchem.fetch(cids, format='sdf', target_path=None, as_structural_formula=False, overwrite=False, verbose=False, throttle_threshold=0.5, return_throttle_status=False)

Download structure files from PubChem in various formats.

This function requires an internet connection.

Parameters
cids : int or iterable object of int

A single compound ID (CID) or a list of CIDs of the structure(s) to be downloaded.

format : {‘sdf’, ‘asnt’, ‘asnb’, ‘xml’, ‘json’, ‘jsonp’, ‘png’}

The format of the files to be downloaded.

as_structural_formula : bool, optional

If set to true, the structural formula is downloaded instead of a 3D conformer. This means that the coordinates lie in the xy-plane and represent the positions the atoms would have in a structural formula representation.

target_path : str, optional

The target directory of the downloaded files. By default, the file content is stored in a file-like object (StringIO for text formats or BytesIO for binary formats).

overwrite : bool, optional

If true, existing files will be overwritten. Otherwise, a file is downloaded only if it does not yet exist in the specified target directory or if the existing file is empty.

verbose : bool, optional

If set to true, the function will output the download progress.

throttle_threshold : float or None, optional

A value between 0 and 1. If the load of either the request time or the request count exceeds this value, the execution is halted. See ThrottleStatus for more information. If None is given, the execution is never halted.

return_throttle_status : bool, optional

If set to true, the ThrottleStatus of the final request is also returned.
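
The threshold rule described for throttle_threshold can be sketched as follows. This is not Biotite's implementation: LoadStatus is a hypothetical stand-in for ThrottleStatus, the attribute names count and time are assumptions, and "halted" is assumed to mean a temporary pause of wait_time seconds.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadStatus:
    # Hypothetical stand-in for ThrottleStatus.
    # Both loads are assumed to be fractions between 0 and 1.
    count: float
    time: float


def wait_if_busy(status, threshold=0.5, wait_time=1.0, sleep=time.sleep):
    """Pause if either load exceeds the threshold.

    Returns True if a pause happened, False otherwise.
    A threshold of None disables pausing entirely.
    """
    if threshold is None:
        return False
    if status.count > threshold or status.time > threshold:
        sleep(wait_time)
        return True
    return False
```

With the default threshold of 0.5, a status with a request-time load of 0.7 would trigger a pause, while loads of 0.2 and 0.3 would not.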

Returns
files : str or StringIO or BytesIO or list of (str or StringIO or BytesIO)

The file path(s) to the downloaded files. If a single CID was given in cids, a single string is returned. If a list (or other iterable object) was given, a list of strings is returned. If no target_path was given, the file contents are stored in either StringIO or BytesIO objects.

throttle_status : ThrottleStatus

The ThrottleStatus obtained from the server response. If multiple CIDs are requested, the ThrottleStatus of the final response is returned. This can be used for custom request throttling, for example. Only returned if return_throttle_status is set to true.
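
As an illustration of custom request throttling, the sketch below disables the built-in halting and instead pauses in proportion to the reported load. Several points are assumptions rather than documented behavior: that the files and the status are returned together as a tuple, and that the status exposes count and time loads between 0 and 1. The helper fetch_with_backoff and its parameters fetch_func, max_delay and sleep are hypothetical names; fetch_func is expected to accept the same arguments as fetch.

```python
import time


def fetch_with_backoff(cids, fetch_func, max_delay=2.0, sleep=time.sleep,
                       **kwargs):
    """Fetch CIDs one by one, pausing in proportion to the server load."""
    files = []
    for cid in cids:
        # Turn off the built-in halting and request the status instead
        file, status = fetch_func(
            cid,
            throttle_threshold=None,
            return_throttle_status=True,
            **kwargs,
        )
        files.append(file)
        # Assumption: `count` and `time` are loads between 0 and 1
        load = max(status.count, status.time)
        # A fully loaded server leads to the maximum delay
        sleep(max_delay * load)
    return files
```

Passing fetch as fetch_func would issue one request per CID, which trades throughput for a status update after every download.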

Examples

>>> import os.path
>>> from biotite.database.pubchem import fetch
>>> file = fetch(2244, "sdf", path_to_directory)
>>> print(os.path.basename(file))
2244.sdf
>>> files = fetch([2244, 5950], "sdf", path_to_directory)
>>> print([os.path.basename(file) for file in files])
['2244.sdf', '5950.sdf']
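
The examples above receive file paths because a target directory is given. Code that should also work without target_path has to handle StringIO and BytesIO objects, and both single results and lists. The helpers below are a minimal sketch of that handling; read_fetched and as_list are hypothetical names and not part of Biotite.

```python
from io import BytesIO, StringIO


def as_list(files):
    """Wrap a single result in a list; pass iterables through as a list."""
    if isinstance(files, (str, StringIO, BytesIO)):
        return [files]
    return list(files)


def read_fetched(item):
    """Return the content of one item returned by fetch.

    In-memory objects yield str (StringIO) or bytes (BytesIO).
    File paths are read as bytes, since binary formats are possible.
    """
    if isinstance(item, (StringIO, BytesIO)):
        # No target_path was given: the content is already in memory
        return item.getvalue()
    # target_path was given: the item is the path of the downloaded file
    with open(item, "rb") as handle:
        return handle.read()
```

A call such as `[read_fetched(f) for f in as_list(result)]` then works regardless of whether one CID or several were requested.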