biotite.structure.pseudoknots
- biotite.structure.pseudoknots(base_pairs, scores=None, max_pseudoknot_order=None)
Identify the pseudoknot order for each base pair in a given set of base pairs.
By default, the algorithm removes base pairs until the remaining base pairs are completely nested, i.e. no pseudoknots appear. The pseudoknot order of the removed base pairs is incremented and the procedure is repeated with these base pairs. Base pairs are removed in a way that maximizes the number of remaining base pairs. Alternatively, an optional score can be provided for each individual base pair, in which case the total score of the remaining base pairs is maximized instead.
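To make "completely nested" concrete, the following is a minimal pure-Python sketch of a crossing test. The helper names `crosses` and `is_nested` are hypothetical and not part of Biotite; the sketch assumes that each base takes part in at most one base pair.

```python
def crosses(p, q):
    """Return True if base pairs p and q cross, i.e. form a pseudoknot."""
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    # After sorting, i < k. The pairs cross when q opens inside p
    # but closes outside of it.
    return i < k < j < l


def is_nested(pairs):
    """Return True if no two base pairs in `pairs` cross."""
    pairs = list(pairs)
    return not any(
        crosses(pairs[a], pairs[b])
        for a in range(len(pairs))
        for b in range(a + 1, len(pairs))
    )


example = [(0, 4), (1, 3), (2, 5)]
print(is_nested(example))      # False: (2, 5) crosses both other pairs
print(is_nested(example[:2]))  # True: (1, 3) lies inside (0, 4)
```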
- Parameters:
- base_pairs : ndarray, dtype=int, shape=(n,2)
The base pairs to determine the pseudoknot order of. Each row represents indices from two paired bases. The structure of the ndarray is equal to the structure of the output of base_pairs(), where the indices represent the beginning of the residues.
- scores : ndarray, dtype=int, shape=(n,), optional
The score for each base pair. By default, the score of each base pair is 1.
- max_pseudoknot_order : int, optional
The maximum pseudoknot order to be found. If a base pair would be of a higher order, its order is specified as -1. By default, the algorithm is run until all base pairs have an assigned pseudoknot order.
- Returns:
- pseudoknot_order : ndarray, dtype=int, shape=(m,n)
The pseudoknot order of the input base_pairs. Multiple solutions that maximize the number of base pairs or the given score, respectively, may be possible. Therefore, all m individual solutions are returned, one per row.
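The reason several solutions can exist is easiest to see on two mutually crossing base pairs: either one can be kept. The sketch below enumerates all best-scoring nested subsets by brute force. This is not the dynamic programming algorithm Biotite uses, the function `best_nested_subsets` is hypothetical, and positive scores are assumed.

```python
from itertools import combinations


def crosses(p, q):
    """Return True if base pairs p and q cross."""
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    return i < k < j < l


def best_nested_subsets(pairs, scores=None):
    """Return every crossing-free subset (as index tuples) with top score."""
    if scores is None:
        scores = [1] * len(pairs)  # default: every base pair counts as 1
    best_score, best = -1, []
    for size in range(len(pairs) + 1):
        for subset in combinations(range(len(pairs)), size):
            if any(crosses(pairs[a], pairs[b])
                   for a, b in combinations(subset, 2)):
                continue  # subset still contains a pseudoknot
            total = sum(scores[i] for i in subset)
            if total > best_score:
                best_score, best = total, [subset]
            elif total == best_score:
                best.append(subset)  # tie: an equally good solution
    return best


pairs = [(0, 2), (1, 3)]
print(best_nested_subsets(pairs))          # [(0,), (1,)]
print(best_nested_subsets(pairs, [1, 2]))  # [(1,)]
```

With equal scores both single-pair subsets tie, which corresponds to m = 2 solutions; giving the second pair a higher score removes the tie.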
See also
Notes
The dynamic programming approach by Smit et al. [1] is applied to detect pseudoknots. The algorithm was originally developed to remove pseudoknots from a structure. However, if it is run iteratively on the removed knotted pairs, it can be used to identify the pseudoknot order.
The pseudoknot order is defined as the minimum number of base pair set decompositions resulting in a nested structure [2]. Therefore, there are no pseudoknots between base pairs with the same pseudoknot order.
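The iterative procedure described in these notes can be sketched in pure Python as follows. This is an illustration under stated assumptions, not Biotite's implementation: the dynamic programming step is replaced by a brute-force search, only the first optimal subset is used when several tie, and the names `largest_nested_subset` and `pseudoknot_orders` are hypothetical.

```python
from itertools import combinations


def crosses(p, q):
    """Return True if base pairs p and q cross."""
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    return i < k < j < l


def largest_nested_subset(pairs, indices):
    """Return one largest crossing-free subset of `indices` (brute force)."""
    for size in range(len(indices), -1, -1):
        for subset in combinations(indices, size):
            if not any(crosses(pairs[a], pairs[b])
                       for a, b in combinations(subset, 2)):
                return subset
    return ()


def pseudoknot_orders(pairs, max_order=None):
    """Assign an order to each pair; -1 marks pairs beyond `max_order`."""
    orders = [-1] * len(pairs)
    remaining = list(range(len(pairs)))
    order = 0
    while remaining and (max_order is None or order <= max_order):
        kept = largest_nested_subset(pairs, remaining)
        for i in kept:
            orders[i] = order
        # The removed (knotted) pairs are processed again at the next order
        remaining = [i for i in remaining if i not in kept]
        order += 1
    return orders


pairs = [(0, 4), (1, 3), (2, 5)]
print(pseudoknot_orders(pairs))               # [0, 0, 1]
print(pseudoknot_orders(pairs, max_order=0))  # [0, 0, -1]
```

The brute-force search is exponential in the number of base pairs, so it is only suitable for small illustrations like this one.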
References
Examples
Remove the pseudoknotted base pair for the sequence ABCbac, where each uppercase letter and its corresponding lowercase letter represent a base pair:
Define the base pairs as ndarray:
>>> import numpy as np
>>> from biotite.structure import pseudoknots, dot_bracket
>>> basepairs = np.array([[0, 4],
...                       [1, 3],
...                       [2, 5]])
Find the unknotted base pairs, optimizing for the maximum number of base pairs:
>>> print(pseudoknots(basepairs, max_pseudoknot_order=0))
[[ 0  0 -1]]
This indicates that the base pair Cc is a pseudoknot.
Given the length of the sequence (6 bases), we can also represent the unknotted structure in dot bracket notation:
>>> print(dot_bracket(basepairs, 6, max_pseudoknot_order=0)[0])
((.)).
If the maximum pseudoknot order is not restricted, the order of the knotted pairs is determined and can be represented using dot bracket letter notation:
>>> print(pseudoknots(basepairs))
[[0 0 1]]
>>> print(dot_bracket(basepairs, 6)[0])
(([))]
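How the pseudoknot orders map onto the bracket symbols can be illustrated with a small pure-Python sketch. The function `to_dot_bracket` is hypothetical and is not how Biotite's dot_bracket() is implemented. It assumes that order 0 is written with round brackets and order 1 with square brackets, as in the outputs shown for this example, and that curly and angle brackets follow for orders 2 and 3.

```python
def to_dot_bracket(pairs, orders, length):
    """Render base pairs as a string, one bracket type per pseudoknot order."""
    brackets = ["()", "[]", "{}", "<>"]  # assumed symbols for orders 0 to 3
    symbols = ["."] * length
    for (i, j), order in zip(pairs, orders):
        if order < 0:
            continue  # order -1: the pair is left out, both bases stay "."
        opening, closing = brackets[order]
        first, second = sorted((i, j))
        symbols[first], symbols[second] = opening, closing
    return "".join(symbols)


pairs = [(0, 4), (1, 3), (2, 5)]
print(to_dot_bracket(pairs, [0, 0, 1], 6))   # (([))]
print(to_dot_bracket(pairs, [0, 0, -1], 6))  # ((.)).
```

Orders above 3 raise an IndexError in this sketch, since it defines no further symbols.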
Gallery
Plotting the base pairs of a tRNA-like-structure
Comparison of a tRNA-like-structure with a tRNA