AnnotatedSequence
- class biotite.sequence.AnnotatedSequence(annotation, sequence, sequence_start=1)
Bases: Copyable

An AnnotatedSequence is a combination of a Sequence and an Annotation.

Indexing an AnnotatedSequence with a slice returns another AnnotatedSequence with the corresponding subannotation and the subsequence, corrected for the sequence start: with the default sequence start of 1, indexing starts at 1. The sequence start of the newly created AnnotatedSequence is the start of the slice. Integer indices are also allowed, in which case the corresponding symbol of the sequence is returned (also corrected for the sequence start). In both cases the index must be in range of the sequence; for example, if the sequence start is 1, index 0 is not allowed. In contrast to the behavior of Sequence objects, negative indices do not index from the end of the sequence. Both index types can also be used to modify the sequence.

Another option is indexing with a Feature (preferably from the Annotation in the same AnnotatedSequence). In this case, the sequence described by the location(s) of the Feature is returned. When a Feature is used as the index for assigning a sequence, the new sequence replaces the symbols at the locations of the Feature. Note that the replacing sequence must have the same length as the sequence selected by the Feature index.

- Parameters:
- sequence : Sequence
The sequence. Usually a NucleotideSequence or ProteinSequence.
- annotation : Annotation
The annotation corresponding to sequence.
- sequence_start : int, optional
By default, the first symbol of the sequence corresponds to location 1 of the features in the annotation. The location of the first symbol can be changed by setting this parameter. Negative values are not supported yet.
Examples
Creating an annotated sequence
>>> from biotite.sequence import (
...     AnnotatedSequence, Annotation, Feature, Location, NucleotideSequence
... )
>>> sequence = NucleotideSequence("ATGGCGTACGATTAGAAAAAAA")
>>> feature1 = Feature("misc_feature", [Location(1,2), Location(11,12)],
...                    {"note" : "walker"})
>>> feature2 = Feature("misc_feature", [Location(16,22)], {"note" : "poly-A"})
>>> annotation = Annotation([feature1, feature2])
>>> annot_seq = AnnotatedSequence(annotation, sequence)
>>> print(annot_seq.sequence)
ATGGCGTACGATTAGAAAAAAA
>>> for f in sorted(list(annot_seq.annotation)):
...     print(f.qual["note"])
walker
poly-A
Indexing with integers, note the sequence start correction
>>> print(annot_seq[2])
T
>>> print(annot_seq.sequence[2])
G
Indexing with slices
>>> annot_seq2 = annot_seq[:16]
>>> print(annot_seq2.sequence)
ATGGCGTACGATTAG
>>> for f in annot_seq2.annotation:
...     print(f.qual["note"])
walker
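The sequence start correction used in the integer and slice examples can be modeled with a little arithmetic. The following is a minimal sketch in plain Python, not Biotite's implementation: the helper names `location_to_index` and `slice_by_location` are hypothetical, a `str` stands in for the sequence, and the range checks are an assumption based on the rule that indices must be in range of the sequence.

```python
def location_to_index(location, sequence_start=1, length=None):
    """Convert a location into a 0-based Python index (hypothetical helper)."""
    index = location - sequence_start
    # No wrap-around: a location before the sequence start is an error
    if index < 0 or (length is not None and index >= length):
        raise IndexError(f"Location {location} is out of range")
    return index


def slice_by_location(symbols, start=None, stop=None, sequence_start=1):
    """Return (subsequence, new_sequence_start) for the range [start, stop)."""
    if start is None:
        start = sequence_start
    if stop is None:
        stop = sequence_start + len(symbols)
    first = start - sequence_start
    last = stop - sequence_start
    if first < 0 or last > len(symbols) or first > last:
        raise IndexError(f"Slice {start}:{stop} is out of range")
    # The start of the slice becomes the sequence start of the result
    return symbols[first:last], start


symbols = "ATGGCGTACGATTAGAAAAAAA"
print(symbols[location_to_index(2, length=len(symbols))])  # T
print(slice_by_location(symbols, stop=16))  # ('ATGGCGTACGATTAG', 1)
```

With a sequence start of 1, location 2 maps to Python index 1, which is why `annot_seq[2]` and `annot_seq.sequence[2]` return different symbols.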
Indexing with features
>>> print(annot_seq[feature1])
ATAT
>>> print(annot_seq[feature2])
AAAAAAA
>>> print(annot_seq.sequence)
ATGGCGTACGATTAGAAAAAAA
>>> annot_seq[feature1] = NucleotideSequence("CCCC")
>>> print(annot_seq.sequence)
CCGGCGTACGCCTAGAAAAAAA
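To make the length requirement for feature assignment concrete, here is a rough sketch of how getting and setting by locations could work. It is plain Python under simplifying assumptions, not Biotite's code: `get_by_locations` and `set_by_locations` are hypothetical names, locations are inclusive `(first, last)` tuples instead of Location objects, and only forward-strand locations are handled.

```python
def get_by_locations(symbols, locations, sequence_start=1):
    """Concatenate the symbols covered by inclusive (first, last) locations."""
    parts = []
    for first, last in locations:
        # Both ends are inclusive, hence the + 1 on the stop index
        parts.append(symbols[first - sequence_start : last - sequence_start + 1])
    return "".join(parts)


def set_by_locations(symbols, locations, replacement, sequence_start=1):
    """Return a copy of symbols with the located positions replaced."""
    covered = sum(last - first + 1 for first, last in locations)
    if len(replacement) != covered:
        raise ValueError(
            f"Replacement has length {len(replacement)}, expected {covered}"
        )
    result = list(symbols)
    offset = 0
    for first, last in locations:
        n = last - first + 1
        start = first - sequence_start
        # Consume the replacement piece by piece, one location at a time
        result[start : start + n] = replacement[offset : offset + n]
        offset += n
    return "".join(result)


symbols = "ATGGCGTACGATTAGAAAAAAA"
walker = [(1, 2), (11, 12)]
print(get_by_locations(symbols, walker))          # ATAT
print(set_by_locations(symbols, walker, "CCCC"))  # CCGGCGTACGCCTAGAAAAAAA
```

The replacement is split across the locations in order, so a replacement of any other length cannot be mapped onto the feature unambiguously.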
- Attributes:
- sequence : Sequence
The represented sequence.
- annotation : Annotation
The annotation corresponding to sequence.
- sequence_start : int
The location of the first symbol in the sequence.
- copy()
Create a deep copy of this object.
- Returns:
- copy
A copy of this object.
- reverse_complement(sequence_start=1)
Create the reverse complement of the annotated sequence.
This method accurately converts the position and the strand of the annotation. The information on the sequence start is lost.
- Parameters:
- sequence_startint, optional
The location of the first symbol in the reverse complement sequence.
- Returns:
- The reverse complement of the annotated sequence.
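The position conversion can be illustrated with a small coordinate calculation. This is a sketch of one way to derive the mapped locations, written in plain Python with hypothetical helper names (`reverse_complement_str`, `map_location`); it does not model the strand flip and should not be read as the library's actual code.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement_str(symbols):
    """Reverse complement of an unambiguous DNA string."""
    return symbols.translate(COMPLEMENT)[::-1]


def map_location(first, last, length, old_start=1, new_start=1):
    """Map an inclusive (first, last) location onto the reverse complement."""
    last_index = length - 1
    # The old last position becomes the new first position, and vice versa
    new_first = last_index - (last - old_start) + new_start
    new_last = last_index - (first - old_start) + new_start
    return new_first, new_last


symbols = "ATGGCGTACGATTAGAAAAAAA"
rev = reverse_complement_str(symbols)
first, last = map_location(16, 22, len(symbols))
print(rev)                            # TTTTTTTCTAATCGTACGCCAT
print(first, last, rev[first - 1 : last])  # 1 7 TTTTTTT
```

The poly-A stretch at locations 16 to 22 of the original ends up as a poly-T stretch at locations 1 to 7 of the reverse complement.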
Gallery
Finding homologs of a gene in a genome
Identification of a binding site by sequence conservation