biotite.database.rcsb.search
- biotite.database.rcsb.search(query, return_type='entry', range=None, sort_by=None)
Get all PDB IDs that meet the given query requirements, via the RCSB search API.
This function requires an internet connection.
- Parameters:
- query : Query
The search query.
- return_type : {'entry', 'assembly', 'polymer_entity', 'non_polymer_entity', 'polymer_instance'}, optional
The type of the returned identifiers:
'entry': Only the PDB ID is returned (e.g. 'XXXX'). These can be used directly as input to fetch().
'assembly': The PDB ID appended with the assembly ID is returned (e.g. 'XXXX-1').
'polymer_entity': The PDB ID appended with the entity ID of polymers is returned (e.g. 'XXXX_1').
'non_polymer_entity': The PDB ID appended with the entity ID of non-polymeric entities is returned (e.g. 'XXXX_1').
'polymer_instance': The PDB ID appended with the chain ID (more exactly, the 'asym_id') is returned (e.g. 'XXXX.A').
- range : tuple(int, int), optional
If this parameter is specified, only the PDB IDs in this range are selected from all matching PDB IDs and returned (pagination). The range is zero-indexed and the stop value is exclusive.
- sort_by : str, optional
If specified, the returned PDB IDs are sorted by the values of the given field name in descending order. A complete list of the available fields is documented at https://search.rcsb.org/structure-search-attributes.html and https://search.rcsb.org/chemical-search-attributes.html.
- Returns:
- ids : list of str
A list of strings containing all PDB IDs that meet the query requirements.
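Because every return type other than 'entry' appends a suffix to the PDB ID, callers often need to split the returned strings again. The following is a minimal sketch of one way to do that, based only on the identifier formats listed under return_type. The names `ParsedId`, `SEPARATORS` and `split_identifier` are hypothetical helpers for illustration and are not part of Biotite.

```python
from typing import NamedTuple, Optional


class ParsedId(NamedTuple):
    """Hypothetical container for a split identifier (not a Biotite type)."""
    pdb_id: str
    suffix: Optional[str]


# Separator between PDB ID and suffix for each return type,
# taken from the examples in the return_type description
# ('XXXX', 'XXXX-1', 'XXXX_1', 'XXXX.A').
SEPARATORS = {
    "entry": None,
    "assembly": "-",
    "polymer_entity": "_",
    "non_polymer_entity": "_",
    "polymer_instance": ".",
}


def split_identifier(identifier, return_type):
    """Split an identifier into its PDB ID and its suffix (if any)."""
    sep = SEPARATORS[return_type]
    if sep is None:
        # 'entry' identifiers are bare PDB IDs
        return ParsedId(identifier, None)
    # Split at the last separator, so a PDB ID that itself contains
    # the separator character would not be cut in the wrong place
    pdb_id, found, suffix = identifier.rpartition(sep)
    if not found:
        raise ValueError(
            f"'{identifier}' does not look like a '{return_type}' identifier"
        )
    return ParsedId(pdb_id, suffix)


print(split_identifier("1I0T.B", "polymer_instance"))
```

The `pdb_id` field of the result is in the form that the 'entry' return type produces, which is the form described above as suitable input to fetch().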
Examples
>>> from biotite.database.rcsb import FieldQuery, search
>>> query = FieldQuery("reflns.d_resolution_high", less_or_equal=0.6)
>>> print(sorted(search(query)))
['1EJG', '1I0T', '2GLT', '3NIR', '3P4J', '4JLJ', '5D8V', '5NW3', '7ATG', '7R0H']
>>> print(search(query, sort_by="rcsb_accession_info.initial_release_date"))
['7R0H', '7ATG', '5NW3', '5D8V', '4JLJ', '3P4J', '3NIR', '1I0T', '1EJG', '2GLT']
>>> print(search(
...     query, range=(1,4), sort_by="rcsb_accession_info.initial_release_date"
... ))
['7ATG', '5NW3', '5D8V']
>>> print(sorted(search(query, return_type="polymer_instance")))
['1EJG.A', '1I0T.A', '1I0T.B', '2GLT.A', '3NIR.A', '3P4J.A', '3P4J.B', '4JLJ.A', '4JLJ.B', '5D8V.A', '5NW3.A', '7ATG.A', '7ATG.B', '7R0H.A']
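The range parameter can also be used to walk through a large result set page by page. The sketch below is one possible way to do this and is not part of Biotite: `iter_pages`, `example_ids` and `fake_search` are hypothetical names. So that the block runs without an internet connection, `fake_search` stands in for search() and slices the sorted result list from the example above. The loop assumes that a range reaching past the last match yields a shorter or empty list, as list slicing does; that behavior is an assumption here.

```python
def iter_pages(fetch_page, page_size):
    """Yield successive pages of IDs until a short or empty page appears.

    fetch_page is any callable that takes a (start, stop) tuple,
    zero-indexed with an exclusive stop, and returns a list of IDs.
    """
    start = 0
    while True:
        page = fetch_page((start, start + page_size))
        if not page:
            return
        yield page
        if len(page) < page_size:
            # A short page means there is nothing left to request
            return
        start += page_size


# Offline stand-in for search(): the IDs from the sorted example above
example_ids = ['7R0H', '7ATG', '5NW3', '5D8V', '4JLJ',
               '3P4J', '3NIR', '1I0T', '1EJG', '2GLT']


def fake_search(id_range):
    """Mimic the range semantics: zero-indexed, exclusive stop."""
    start, stop = id_range
    return example_ids[start:stop]


for page in iter_pages(fake_search, page_size=4):
    print(page)
```

With a live connection, the stand-in could be replaced by something like `lambda r: search(query, range=r, sort_by="rcsb_accession_info.initial_release_date")`. Passing sort_by keeps the order stable between requests, so that consecutive pages do not overlap.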