biotite.database.rcsb.SequenceQuery

class biotite.database.rcsb.SequenceQuery(sequence, scope, min_identity=0.0, max_expect_value=10000000.0)

Bases: SingleQuery

A query for protein/DNA/RNA molecules with a sequence similar to a given input sequence using MMseqs2.

Parameters
sequence : Sequence or str

The input sequence. If sequence is a NucleotideSequence and the scope is 'rna', 'T' is automatically replaced by 'U'.

scope : {'protein', 'dna', 'rna'}

The type of molecule to find.

min_identity : float, optional

A match is returned only if the sequence identity between the match and the input sequence exceeds this value. Must be between 0 and 1. By default, the sequence identity is ignored.

max_expect_value : float, optional

A match is returned only if its expect value (E-value) does not exceed this value. The default is large enough that the E-value is effectively ignored.
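The parameter rules above can be sketched in plain Python. The helper below is hypothetical and not part of biotite: `check_query_arguments` and `VALID_SCOPES` are names invented for illustration. It also applies the T-to-U replacement to a plain string, whereas the documentation only states this behavior for NucleotideSequence input.

```python
# Hypothetical illustration of the documented parameter rules.
# This is NOT biotite's implementation.
VALID_SCOPES = {"protein", "dna", "rna"}


def check_query_arguments(sequence, scope, min_identity=0.0):
    """Validate the arguments and return the sequence text to search for."""
    if scope not in VALID_SCOPES:
        raise ValueError(f"'{scope}' is not a valid scope")
    if not 0.0 <= min_identity <= 1.0:
        raise ValueError("min_identity must be between 0 and 1")
    sequence = str(sequence)
    if scope == "rna":
        # Assumption: mimic the documented 'T' -> 'U' replacement
        # on a plain string for the sake of the example
        sequence = sequence.replace("T", "U")
    return sequence


print(check_query_arguments("ACGT", "rna"))  # prints "ACGU"
```

Doing such checks before sending a query keeps malformed requests from reaching the remote service.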

Notes

The MMseqs2 search runs on the RCSB servers, so no local MMseqs2 installation is needed.

Examples

>>> from biotite.database.rcsb import SequenceQuery, search
>>> sequence = "NLYIQWLKDGGPSSGRPPPS"
>>> query = SequenceQuery(sequence, scope="protein", min_identity=0.8)
>>> print(sorted(search(query)))
['1L2Y', '1RIJ', '2JOF', '2LDJ', '2LL5', '2MJ9', '3UC7', '3UC8']

get_content()

Get the query content, i.e. the data belonging to the 'query' attribute in the RCSB search API.

This content is converted into JSON by the search() and count() functions.

Returns
content : dict

The content dictionary for the 'query' attribute.
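To make the conversion into JSON concrete, here is a minimal sketch of what such a content dictionary might look like and how it could be serialized. The function `sketch_sequence_content` is a hypothetical stand-in for get_content(). The key names (`service`, `sequence_type`, `identity_cutoff`, `evalue_cutoff`) and the surrounding `return_type` field are assumptions about the RCSB search API request format, not values confirmed by this page.

```python
import json


def sketch_sequence_content(sequence, scope, min_identity=0.0,
                            max_expect_value=10000000.0):
    """Hypothetical stand-in for SequenceQuery.get_content()."""
    # All key names below are assumptions about the RCSB search API
    return {
        "type": "terminal",
        "service": "sequence",
        "parameters": {
            "value": sequence,
            "sequence_type": scope,
            "identity_cutoff": min_identity,
            "evalue_cutoff": max_expect_value,
        },
    }


content = sketch_sequence_content(
    "NLYIQWLKDGGPSSGRPPPS", "protein", min_identity=0.8
)
# The content becomes the value of the 'query' attribute of the request
request_body = json.dumps({"query": content, "return_type": "entry"}, indent=2)
print(request_body)
```

Keeping the content as a plain dictionary until the last step lets the same object be embedded in different request bodies, for example one that returns identifiers and one that only counts matches.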