biotite.database.entrez.fetch

biotite.database.entrez.fetch(uids, target_path, suffix, db_name, ret_type, ret_mode='text', overwrite=False, verbose=False, mail='')

Download files from the NCBI Entrez database in various formats.

The data for each UID will be fetched into a separate file.

A list of valid database, retrieval type and mode combinations can be found under https://www.ncbi.nlm.nih.gov/books/NBK25499/table/chapter4.T._valid_values_of__retmode_and/?report=objectonly

This function requires an internet connection.

Parameters:
uids : str or iterable object of str

A single unique identifier (UID) or a list of UIDs of the file(s) to be downloaded.

target_path : str

The target directory of the downloaded files.

suffix : str

The file suffix of the downloaded files. This value is independent of the retrieval type.

db_name : str

E-utility database name.

ret_type : str

Retrieval type.

ret_mode : str, optional

Retrieval mode. (Default: 'text')

overwrite : bool, optional

If true, existing files will be overwritten. Otherwise the respective file will only be downloaded if the file does not exist yet in the specified target directory. (Default: False)

verbose : bool, optional

If true, the function will output the download progress. (Default: False)

mail : str, optional

A mail address that is appended to the HTTP request. This address is contacted in case you query the NCBI server too often. This only works if the mail address is registered. (Default: '')
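
The effect of the overwrite parameter can be illustrated without a network connection. The following is a minimal sketch of the documented rule only, not the library's implementation; should_download is a hypothetical helper introduced for illustration, and the file name is a placeholder.

```python
import os
import tempfile


def should_download(file_path, overwrite=False):
    # Hypothetical helper (not part of biotite): a file is fetched if
    # overwriting is requested or if it does not exist yet.
    return overwrite or not os.path.isfile(file_path)


with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "1L2Y_A.fa")
    # No file yet, so a download would be required
    first_call = should_download(file_path)
    # Simulate a previous download with placeholder content
    with open(file_path, "w") as file:
        file.write(">placeholder\n")
    # The file exists now, so it would be skipped by default
    second_call = should_download(file_path)
    # With overwrite=True the existing file would be replaced
    forced_call = should_download(file_path, overwrite=True)

print(first_call, second_call, forced_call)
```

Under this reading, repeated calls with the default overwrite=False only contact the server for files that are still missing from the target directory.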

Returns:
files : str or list of str

The file path(s) to the downloaded files. If a single string (a single UID) was given in uids, a single string is returned. If a list (or other iterable object) was given, a list of strings is returned.

Examples

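>>> import os.path
>>> from biotite.database.entrez import fetch
>>> # path_to_directory is a placeholder for an existing target directory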
>>> files = fetch(["1L2Y_A","3O5R_A"], path_to_directory, suffix="fa",
...               db_name="protein", ret_type="fasta")
>>> print([os.path.basename(file) for file in files])
['1L2Y_A.fa', '3O5R_A.fa']
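
The return convention (a string for a single UID, a list for an iterable of UIDs) and the file naming seen in the output above can be mimicked offline. This is a sketch under assumptions: expected_files is a hypothetical helper, not part of biotite, the "downloads" directory is a placeholder, and the "<UID>.<suffix>" naming is inferred from the example output.

```python
import os.path


def expected_files(uids, target_path, suffix):
    # Hypothetical helper (not part of biotite): mirrors the documented
    # rule that a single str yields a str and an iterable yields a list.
    single = isinstance(uids, str)
    uid_list = [uids] if single else list(uids)
    # Assumed naming scheme "<UID>.<suffix>", as in the example output
    files = [os.path.join(target_path, uid + "." + suffix) for uid in uid_list]
    return files[0] if single else files


one = expected_files("1L2Y_A", "downloads", "fa")
many = expected_files(("1L2Y_A", "3O5R_A"), "downloads", "fa")
print(os.path.basename(one))
print([os.path.basename(file) for file in many])
```

Checking for str first matters because a string is itself iterable; without that check a single UID would be split into characters.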