read_alignment_from_cigar
#
- biotite.sequence.align.read_alignment_from_cigar(cigar, position, reference_sequence, segment_sequence)[source]#
Create an
Alignment
from a CIGAR string.- Parameters:
- cigarstr
The CIGAR string.
- positionint
0-based position of the first aligned base in the reference. 0-based equivalent to the
POS
field in the SAM/BAM file.- reference_sequenceSequence
The reference sequence.
- segment_sequenceSequence
The segment, read or query sequence.
- Returns:
- alignmentAlignment
The alignment.
See also
Notes
This function expects that the segment_sequence was taken from the SAM/BAM file, hence hard-clipped bases are not part of the sequence. Therefore, hard clipped bases are simply ignored in the CIGAR string.
Examples
>>> ref = NucleotideSequence("TATAAAAGGTTTCCGACCGTAGGTAGCTGA") >>> seg = NucleotideSequence("CCCCGGTTTGACCGTATGTAG") >>> print(read_alignment_from_cigar("9M2D12M", 3, ref, seg)) AAAAGGTTTCCGACCGTAGGTAG CCCCGGTTT--GACCGTATGTAG >>> print(read_alignment_from_cigar("4X5=2D7=1X4=", 3, ref, seg)) AAAAGGTTTCCGACCGTAGGTAG CCCCGGTTT--GACCGTATGTAG
Explicit terminal deletions are also possible. Note that in this case the deleted positions count as aligned bases with respect to the position parameter.
>>> print(read_alignment_from_cigar("3D9M2D12M4D", 0, ref, seg)) TATAAAAGGTTTCCGACCGTAGGTAGCTGA ---CCCCGGTTT--GACCGTATGTAG----
If bases in the segment sequence are soft-clipped, they do not appear in the alignment. Furthermore, the start of the reference sequence must be adapted.
>>> print(read_alignment_from_cigar("4S5M2D12M", 7, ref, seg)) GGTTTCCGACCGTAGGTAG GGTTT--GACCGTATGTAG
Hard-clipped bases are not part of the segment sequence. Hence H operations are completely ignored.
>>> seg = NucleotideSequence("GGTTTGACCGTATGTAG") >>> print(read_alignment_from_cigar("4H5M2D12M", 7, ref, seg)) GGTTTCCGACCGTAGGTAG GGTTT--GACCGTATGTAG
Reading from BAM codes is also possible.
>>> seg = NucleotideSequence("CCCCGGTTTGACCGTATGTAG") >>> op_tuples = [ ... (CigarOp.MATCH, 9), ... (CigarOp.DELETION, 2), ... (CigarOp.MATCH, 12) ... ] >>> print(read_alignment_from_cigar(op_tuples, 3, ref, seg)) AAAAGGTTTCCGACCGTAGGTAG CCCCGGTTT--GACCGTATGTAG