FastaFile
- class biotite.sequence.io.fasta.FastaFile(chars_per_line=80)
Bases: TextFile, MutableMapping
This class represents a file in FASTA format.
A FASTA file contains so-called header lines, beginning with >, that describe the following sequence. The corresponding sequence starts at the line after the header line and ends at the next header line or at the end of the file. The header along with its sequence forms an entry.
This class is used in a dictionary-like manner, implementing the MutableMapping interface: headers (without the leading >) are used as keys, and strings containing the sequences are the corresponding values. Entries can be accessed using indexing; del deletes the entry at the given index.
- Parameters:
- chars_per_lineint, optional
The number of characters in a line containing sequence data after which a line break is inserted. Only relevant when adding sequences to a file. Default is 80.
Examples
>>> import os.path
>>> file = FastaFile()
>>> file["seq1"] = "ATACT"
>>> print(file["seq1"])
ATACT
>>> file["seq2"] = "AAAATT"
>>> print(file)
>seq1
ATACT
>seq2
AAAATT
>>> print(dict(file.items()))
{'seq1': 'ATACT', 'seq2': 'AAAATT'}
>>> for header, seq in file.items():
...     print(header, seq)
seq1 ATACT
seq2 AAAATT
>>> del file["seq1"]
>>> print(dict(file.items()))
{'seq2': 'AAAATT'}
>>> file.write(os.path.join(path_to_directory, "test.fasta"))
- copy()
Create a deep copy of this object.
- Returns:
- copy
A copy of this object.
- classmethod read(file, chars_per_line=80)
Read a FASTA file.
- Parameters:
- filefile-like object or str
The file to be read. Alternatively a file path can be supplied.
- chars_per_lineint, optional
The number of characters in a line containing sequence data after which a line break is inserted. Only relevant when adding sequences to a file. Default is 80.
- Returns:
- file_objectFastaFile
The parsed file.
- static read_iter(file)
Create an iterator over each sequence of the given FASTA file.
- Parameters:
- filefile-like object or str
The file to be read. Alternatively a file path can be supplied.
- Yields:
- headerstr
The header of the current sequence.
- seq_strstr
The current sequence as string.
Notes
This approach gives the same results as FastaFile.read(file).items(), but is slightly faster and much more memory efficient.
- write(file)
Write the contents of this object into a file (or file-like object).
- Parameters:
- filefile-like object or str
The file to be written to. Alternatively a file path can be supplied.
- static write_iter(file, items, chars_per_line=80)
Iterate over the given items and write each item into the specified file.
In contrast to write(), the lines of text are not stored in an intermediate TextFile, but are written directly to the file. Hence, this static method may save a large amount of memory when a large file is written, especially if the items are provided as a generator.
- Parameters:
- filefile-like object or str
The file to be written to. Alternatively a file path can be supplied.
- itemsgenerator or array-like of tuple(str, str)
The entries to be written into the file. Each entry consists of a header string and a sequence string.
- chars_per_lineint, optional
The number of characters in a line containing sequence data after which a line break is inserted. Only relevant when adding sequences to a file. Default is 80.
Notes
This method does not check whether the given identifiers are unambiguous.
Gallery

Customized visualization of a multiple sequence alignment

Fetching and aligning a protein from different species

Plot epitope mapping data onto protein sequence alignments