AnnotatedSequence

class biotite.sequence.AnnotatedSequence(annotation, sequence, sequence_start=1)

Bases: Copyable

An AnnotatedSequence is a combination of a Sequence and an Annotation.

Indexing an AnnotatedSequence with a slice returns another AnnotatedSequence containing the corresponding subannotation and subsequence. Indices are corrected for the sequence start, i.e. with the default sequence start of 1, indexing starts at 1. The sequence start of the newly created AnnotatedSequence is the start of the slice. Integer indices are also allowed, in which case the corresponding symbol of the sequence is returned (also corrected for the sequence start). In both cases the index must be in range of the sequence, e.g. if the sequence start is 1, index 0 is not allowed. In contrast to Sequence objects, negative indices do not count from the end of the sequence. Both index types can also be used to modify the sequence.
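The index correction described above can be illustrated with a minimal pure-Python sketch. This is not Biotite's implementation: the helper names `to_zero_based` and `slice_symbols` are hypothetical, and a plain string stands in for the Sequence.

```python
def to_zero_based(index, sequence_start, length):
    """Map a sequence-start-corrected index to a zero-based offset.

    Hypothetical helper: indices below the sequence start (including
    negative ones) are rejected instead of counting from the end.
    """
    offset = index - sequence_start
    if offset < 0 or offset >= length:
        raise IndexError(f"Index {index} is out of range")
    return offset


def slice_symbols(symbols, sequence_start, start=None, stop=None):
    """Return (subsequence, new_sequence_start) for a corrected slice.

    The stop is exclusive, as in ordinary Python slices.
    """
    first = sequence_start if start is None else start
    end = sequence_start + len(symbols) if stop is None else stop
    lo = first - sequence_start
    hi = end - sequence_start
    if lo < 0 or hi > len(symbols):
        raise IndexError("Slice is out of range")
    # The start of the slice becomes the sequence start of the result
    return symbols[lo:hi], first


symbols = "ATGGCGTACGATTAGAAAAAAA"
print(symbols[to_zero_based(2, 1, len(symbols))])  # T
print(slice_symbols(symbols, 1, stop=16))          # ('ATGGCGTACGATTAG', 1)
print(slice_symbols(symbols, 1, start=16))         # ('AAAAAAA', 16)
```

With a sequence start of 1, index 2 selects the second symbol, and index 0 or -1 raises an IndexError in this sketch.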

Another option is indexing with a Feature (preferably from the Annotation in the same AnnotatedSequence). In this case the sequence described by the location(s) of the Feature is returned. When a Feature is used as the index for assigning a sequence to an AnnotatedSequence, the new sequence replaces the symbols at the locations of the Feature. Note that the replacing sequence must have the same length as the sequence selected by the Feature index.
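As a rough illustration of feature-based indexing, the following pure-Python sketch joins and replaces the symbols covered by a list of inclusive `(first, last)` locations. It is an assumption-laden simplification, not Biotite's code: the helper names are hypothetical, a plain string stands in for the Sequence, and only forward-strand locations are considered.

```python
def get_by_locations(symbols, locations, sequence_start=1):
    """Concatenate the symbols covered by inclusive (first, last) locations."""
    parts = []
    for first, last in sorted(locations):
        lo = first - sequence_start
        hi = last - sequence_start + 1  # inclusive end -> exclusive slice stop
        parts.append(symbols[lo:hi])
    return "".join(parts)


def set_by_locations(symbols, locations, replacement, sequence_start=1):
    """Return a copy of symbols with the located positions replaced."""
    covered = sum(last - first + 1 for first, last in locations)
    if len(replacement) != covered:
        raise ValueError("Replacement must have the same length as the selection")
    result = list(symbols)
    pos = 0
    for first, last in sorted(locations):
        n = last - first + 1
        lo = first - sequence_start
        # Consume the next n replacement symbols for this location
        result[lo:lo + n] = replacement[pos:pos + n]
        pos += n
    return "".join(result)


symbols = "ATGGCGTACGATTAGAAAAAAA"
locations = [(1, 2), (11, 12)]
print(get_by_locations(symbols, locations))          # ATAT
print(set_by_locations(symbols, locations, "CCCC"))  # CCGGCGTACGCCTAGAAAAAAA
```

A replacement of a different length, for example "CCC" here, raises a ValueError in this sketch, mirroring the length requirement stated above.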

Parameters:
annotation : Annotation

The annotation corresponding to sequence.

sequence : Sequence

The sequence. Usually a NucleotideSequence or ProteinSequence.

sequence_start : int, optional

By default, the first symbol of the sequence corresponds to location 1 of the features in the annotation. The location of the first symbol can be changed by setting this parameter. Negative values are not supported yet.

Attributes:
annotation : Annotation

The annotation corresponding to sequence.

sequence : Sequence

The represented sequence.

sequence_start : int

The location of the first symbol in the sequence.

See also

Annotation

An annotation separated from a sequence.

Sequence

A sequence separated from an annotation.

Examples

Creating an annotated sequence

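>>> from biotite.sequence import (
...     AnnotatedSequence, Annotation, Feature, Location, NucleotideSequence
... )
>>> sequence = NucleotideSequence("ATGGCGTACGATTAGAAAAAAA")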
>>> feature1 = Feature("misc_feature", [Location(1,2), Location(11,12)],
...                    {"note" : "walker"})
>>> feature2 = Feature("misc_feature", [Location(16,22)], {"note" : "poly-A"})
>>> annotation = Annotation([feature1, feature2])
>>> annot_seq = AnnotatedSequence(annotation, sequence)
>>> print(annot_seq.sequence)
ATGGCGTACGATTAGAAAAAAA
>>> for f in sorted(list(annot_seq.annotation)):
...     print(f.qual["note"])
walker
poly-A

Indexing with integers, note the sequence start correction

>>> print(annot_seq[2])
T
>>> print(annot_seq.sequence[2])
G

Indexing with slices

>>> annot_seq2 = annot_seq[:16]
>>> print(annot_seq2.sequence)
ATGGCGTACGATTAG
>>> for f in annot_seq2.annotation:
...     print(f.qual["note"])
walker

Indexing with features

>>> print(annot_seq[feature1])
ATAT
>>> print(annot_seq[feature2])
AAAAAAA
>>> print(annot_seq.sequence)
ATGGCGTACGATTAGAAAAAAA
>>> annot_seq[feature1] = NucleotideSequence("CCCC")
>>> print(annot_seq.sequence)
CCGGCGTACGCCTAGAAAAAAA

copy()

Create a deep copy of this object.

Returns:
copy

A copy of this object.

reverse_complement(sequence_start=1)

Create the reverse complement of the annotated sequence.

This method converts the positions and strands of the features in the annotation accordingly. The original sequence start is not preserved; the result uses the given sequence_start instead.

Parameters:
sequence_start : int, optional

The location of the first symbol in the reverse complement sequence.

Returns:
rev_sequence : Sequence

The reverse complement of the annotated sequence.
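
The coordinate conversion performed for a reverse complement can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: `reverse_location` and `reverse_complement_symbols` are hypothetical helpers, locations are inclusive `(first, last)` pairs, and the strand is encoded as +1 (forward) or -1 (reverse).

```python
# Translation table for the four unambiguous DNA symbols
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement_symbols(symbols):
    """Complement each symbol, then reverse the order."""
    return symbols.translate(COMPLEMENT)[::-1]


def reverse_location(first, last, strand, seq_len, old_start=1, new_start=1):
    """Map an inclusive location onto the reverse complement strand.

    (seq_len - 1) is the last zero-based index, (last - old_start)
    converts a location to an index, and adding new_start converts
    the mirrored index back to a location.
    """
    rev_first = (seq_len - 1) - (last - old_start) + new_start
    rev_last = (seq_len - 1) - (first - old_start) + new_start
    return rev_first, rev_last, -strand


symbols = "ATGGCGTACGATTAGAAAAAAA"
rev = reverse_complement_symbols(symbols)
print(rev[:7])                                      # TTTTTTT
print(reverse_location(16, 22, 1, len(symbols)))    # (1, 7, -1)
print(reverse_location(1, 2, 1, len(symbols)))      # (21, 22, -1)
```

In this sketch the poly-A stretch at locations 16 to 22 on the forward strand ends up at locations 1 to 7 on the reverse strand, where it reads as poly-T.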