SequenceQuery

class biotite.database.rcsb.SequenceQuery(sequence, scope, min_identity=0.0, max_expect_value=10000000.0)

Bases: SingleQuery

A query for protein, DNA or RNA molecules whose sequence is similar to a given input sequence. The similarity search is performed with MMseqs2.

Parameters:
sequence : Sequence or str

The input sequence. If sequence is a NucleotideSequence and the scope is 'rna', 'T' is automatically replaced by 'U'.

scope : {'protein', 'dna', 'rna'}

The type of molecule to find.

min_identity : float, optional

A match is returned only if the sequence identity between the match and the input sequence exceeds this value. The value must be between 0 and 1. With the default of 0.0, the sequence identity is not used as a filter.

max_expect_value : float, optional

A match is returned only if its expect value (E-value) does not exceed this value. The default of 10000000.0 is large enough that the E-value is effectively ignored.

Notes

MMseqs2 is run on the RCSB servers.

Examples

>>> sequence = "NLYIQWLKDGGPSSGRPPPS"
>>> query = SequenceQuery(sequence, scope="protein", min_identity=0.8)
>>> print(sorted(search(query)))
['1L2Y', '1RIJ', '2JOF', '2LDJ', '2LL5', '2MJ9', '3UC7', '3UC8']
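
As a rough mental model of the two thresholds described under Parameters, the following sketch filters a list of made-up hits in plain Python. The `Hit` class, the `passes_filters` helper, the entry IDs and all numbers are hypothetical. The actual filtering is done by MMseqs2 on the RCSB servers and is not implemented this way in biotite.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search hit; not a biotite or RCSB type."""
    entry_id: str
    identity: float  # fraction of identical positions, between 0 and 1
    evalue: float    # expect value reported for the alignment


def passes_filters(hit, min_identity=0.0, max_expect_value=1e7):
    """Mirror the documented wording: the identity must exceed
    `min_identity` and the E-value must not exceed `max_expect_value`."""
    return hit.identity > min_identity and hit.evalue <= max_expect_value


# Made-up values for illustration only
hits = [
    Hit("AAAA", identity=0.95, evalue=1e-12),
    Hit("BBBB", identity=0.60, evalue=1e-4),
    Hit("CCCC", identity=0.85, evalue=5.0),
]

# Only the identity threshold is set; the E-value keeps its permissive default
kept = [h.entry_id for h in hits if passes_filters(h, min_identity=0.8)]

# Both thresholds are set
strict = [
    h.entry_id
    for h in hits
    if passes_filters(h, min_identity=0.8, max_expect_value=1e-3)
]

print(kept)
print(strict)
```

Tightening `max_expect_value` in addition to `min_identity` removes hits that are similar in composition but statistically weak, which in this toy data is the third hit.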

get_content()

Get the query content, i.e. the data belonging to the 'query' attribute in the RCSB search API.

This content is converted into JSON by the search() and count() functions.

Returns:
content : dict

The content dictionary for the 'query' attribute.
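
To make the relation between the content dictionary and the JSON request more concrete, here is a minimal sketch that builds such a dictionary by hand and serializes it with the standard `json` module. The key names (`type`, `service`, `parameters`, `value`, `sequence_type`, `identity_cutoff`, `evalue_cutoff`, `return_type`) are assumed from the RCSB search API's sequence service; the dictionary actually returned by `get_content()` may differ in detail. `build_request()` is a hypothetical helper and not part of biotite.

```python
import json


def build_request(content, return_type="polymer_entity"):
    """Hypothetical helper: wrap a query content dict in a request body.

    The surrounding keys ('query', 'return_type') are assumed from the
    RCSB search API and are not taken from biotite's source.
    """
    return {"query": content, "return_type": return_type}


# Hand-written stand-in for what get_content() might return (assumed keys)
content = {
    "type": "terminal",
    "service": "sequence",
    "parameters": {
        "value": "NLYIQWLKDGGPSSGRPPPS",
        "sequence_type": "protein",
        "identity_cutoff": 0.8,
        "evalue_cutoff": 1e7,
    },
}

# The dict is plain Python data, so json.dumps() can serialize it directly
request_json = json.dumps(build_request(content), indent=2)
print(request_json)
```

Because the content is an ordinary dictionary, it can be inspected or logged before a request is sent, which is convenient when a query returns unexpected results.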