SequenceQuery

class biotite.database.rcsb.SequenceQuery(sequence, scope, min_identity=0.0, max_expect_value=10000000.0)

Bases: SingleQuery

A query for protein, DNA or RNA molecules whose sequence is similar to a given input sequence. The similarity search is performed with MMseqs2.

Parameters:
sequence : Sequence or str

The input sequence. If sequence is a NucleotideSequence and the scope is 'rna', 'T' is automatically replaced by 'U'.

scope : {'protein', 'dna', 'rna'}

The type of molecule to find.

min_identity : float, optional

A match is only returned if the sequence identity between the match and the input sequence exceeds this value. Must be between 0 and 1. By default (0.0), the sequence identity is ignored.

max_expect_value : float, optional

A match is only returned if its expect value (E-value) does not exceed this value. The default (10000000.0) is so large that the E-value is effectively ignored.
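
The two thresholds and the 'T' to 'U' replacement described above can be pictured with a small stand-alone sketch. It does not call Biotite or the RCSB servers: the `Hit` class, `normalize_for_scope()` and `passes()` are hypothetical names introduced only for illustration, the hit values are made up, and the exact handling of values that lie precisely on a threshold is an assumption taken from the wording of the parameter descriptions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search hit; not a Biotite or RCSB type."""
    pdb_id: str
    identity: float  # sequence identity as a fraction between 0 and 1
    e_value: float   # expect value reported for the match


def normalize_for_scope(sequence: str, scope: str) -> str:
    """Mimic the documented 'T' -> 'U' replacement for the 'rna' scope."""
    sequence = sequence.upper()
    return sequence.replace("T", "U") if scope == "rna" else sequence


def passes(hit: Hit, min_identity: float = 0.0,
           max_expect_value: float = 1e7) -> bool:
    """Keep a hit whose identity exceeds min_identity and whose
    E-value does not exceed max_expect_value."""
    return hit.identity > min_identity and hit.e_value <= max_expect_value


# Made-up hits, chosen so that no value sits exactly on a threshold.
hits = [
    Hit("AAAA", identity=0.99, e_value=1e-12),
    Hit("BBBB", identity=0.60, e_value=1e-5),
    Hit("CCCC", identity=0.97, e_value=50.0),
]

# Tight thresholds: only the close, statistically significant match remains.
strict = [h.pdb_id for h in hits
          if passes(h, min_identity=0.95, max_expect_value=1.0)]

# Default thresholds: neither filter removes anything.
relaxed = [h.pdb_id for h in hits if passes(h)]

rna = normalize_for_scope("ACGT", "rna")

print(strict)   # ['AAAA']
print(relaxed)  # ['AAAA', 'BBBB', 'CCCC']
print(rna)      # ACGU
```

In the real query the filtering happens on the RCSB servers; the sketch only shows how the two parameters narrow the result set independently of each other.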

Notes

MMseqs2 is run on the RCSB servers.

Examples

>>> sequence = "NLYIQWLKDGGPSSGRPPPS"
>>> query = SequenceQuery(sequence, scope="protein", min_identity=0.95)
>>> print(sorted(search(query)))
['1L2Y', '2LDJ', '9G22', '9G2N', '9G2O', '9G31', '9G32', '9GDL', '9GDN', '9GDT', '9GDU', '9GE1']
get_content()

Get the query content, i.e. the data belonging to the 'query' attribute in the RCSB search API.

This content is converted into JSON by the search() and count() functions.

Returns:
content : dict

The content dictionary for the 'query' attribute.
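
As a rough illustration of how such a content dictionary might end up in a search request, the following sketch builds a dictionary by hand and serializes it with the standard `json` module. The field names (`type`, `service`, `parameters`, `identity_cutoff`, `evalue_cutoff`, `sequence_type`, `value`, `return_type`) are assumptions modelled on the sequence service of the RCSB search API; the dictionary actually returned by get_content() may differ, so inspect the real return value before relying on any key.

```python
import json

# Hand-written stand-in for the return value of get_content().
# All keys below are assumptions, not taken from the Biotite source.
content = {
    "type": "terminal",
    "service": "sequence",
    "parameters": {
        "value": "NLYIQWLKDGGPSSGRPPPS",
        "sequence_type": "protein",
        "identity_cutoff": 0.95,       # corresponds to min_identity
        "evalue_cutoff": 10000000.0,   # corresponds to max_expect_value
    },
}

# The content is placed under the 'query' attribute of the request,
# which is then converted into JSON.
request = {"query": content, "return_type": "entry"}
payload = json.dumps(request, indent=2)
print(payload)
```

With Biotite installed, the real dictionary for a query can be examined in the same way, for example with `json.dumps(query.get_content(), indent=2)`; this requires no network access because nothing is sent until search() or count() is called.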