LetterAlphabet
- class biotite.sequence.LetterAlphabet(symbols)
Bases: Alphabet
LetterAlphabet is an Alphabet subclass specialized for letter-based alphabets, such as DNA or protein sequence alphabets. The alphabet size is limited to the 94 printable, non-whitespace characters. Internally, the symbols are stored as bytes objects. Encoding and decoding are considerably faster than with a plain Alphabet, because both are implemented with NumPy and Cython and need no dictionary lookup.
- Parameters:
- symbols : iterable object or str or bytes
The symbols that are allowed in this alphabet. The code of a symbol is the index of that symbol in this list.
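To make the code assignment concrete, the following pure-Python sketch mirrors the rule that a symbol's code is its index. It does not use biotite: the helper name `symbol_codes` is hypothetical, and the check assumes that "printable, non-whitespace" refers to the ASCII characters '!' through '~'.

```python
# The 94 printable, non-whitespace ASCII characters: '!' (33) to '~' (126).
# Assumption: this is the character range meant by the class description.
ALLOWED = [chr(i) for i in range(33, 127)]


def symbol_codes(symbols):
    """Pair each symbol with its code, i.e. its index in `symbols`.

    Hypothetical helper for illustration; not part of biotite.
    """
    symbols = list(symbols)
    for symbol in symbols:
        if symbol not in ALLOWED:
            raise ValueError(
                f"{symbol!r} is not a printable, non-whitespace character"
            )
    return list(enumerate(symbols))


print(len(ALLOWED))          # 94
print(symbol_codes("ACGT"))  # [(0, 'A'), (1, 'C'), (2, 'G'), (3, 'T')]
```

Because the code is simply the position, reordering the symbols changes every code, so two alphabets with the same symbols in a different order are not interchangeable.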
- decode(code, as_bytes=False)
Use the alphabet to decode a symbol code.
- Parameters:
- code : int
The symbol code to be decoded.
- Returns:
- symbol : object
The symbol corresponding to code.
- Raises:
- AlphabetError
If code is not a valid code in the alphabet.
- decode_multiple(code, as_bytes=False)
Decode a sequence code into a list of symbols.
- Parameters:
- code : ndarray, dtype=uint8
The sequence code to decode. Works fastest if an ndarray is provided.
- as_bytes : bool, optional
If true, the output array will contain bytes (dtype 'S1'). Otherwise, the output array will contain str (dtype 'U1').
- Returns:
- symbols : ndarray, dtype='U1' or dtype='S1'
The decoded list of symbols.
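The effect of `as_bytes` can be illustrated with a small stand-in that works on plain Python lists instead of NumPy arrays. This is only a sketch of the documented behavior: `decode_multiple_sketch` is a hypothetical name, and the real method returns an ndarray rather than a list.

```python
def decode_multiple_sketch(symbols, code, as_bytes=False):
    """Decode a sequence of codes by indexing into `symbols`.

    Hypothetical helper for illustration; not part of biotite.
    Returns 1-byte `bytes` objects if `as_bytes` is true,
    otherwise 1-character `str` objects.
    """
    # Keep the symbols as bytes, one byte per code (codes are non-negative).
    table = symbols.encode("ascii")
    decoded = bytes(table[c] for c in code)
    if as_bytes:
        # Slicing keeps each element a bytes object of length 1.
        return [decoded[i:i + 1] for i in range(len(decoded))]
    return list(decoded.decode("ascii"))


print(decode_multiple_sketch("ACGT", [2, 0, 3]))                 # ['G', 'A', 'T']
print(decode_multiple_sketch("ACGT", [2, 0, 3], as_bytes=True))  # [b'G', b'A', b'T']
```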
- encode(symbol)
Use the alphabet to encode a symbol.
- Parameters:
- symbol : object
The object to encode into a symbol code.
- Returns:
- code : int
The symbol code of symbol.
- Raises:
- AlphabetError
If symbol is not in the alphabet.
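The single-symbol contract of encode and decode (index lookup plus an error for unknown input) can be sketched in plain Python. The names `encode_sketch`, `decode_sketch` and `AlphabetErrorSketch` are hypothetical stand-ins and do not reproduce biotite's implementation.

```python
class AlphabetErrorSketch(Exception):
    """Stand-in for the AlphabetError described above (hypothetical name)."""


def encode_sketch(symbols, symbol):
    """Return the code of `symbol`, i.e. its index in `symbols`."""
    try:
        return symbols.index(symbol)
    except ValueError:
        raise AlphabetErrorSketch(f"{symbol!r} is not in the alphabet") from None


def decode_sketch(symbols, code):
    """Return the symbol stored at index `code`."""
    if not 0 <= code < len(symbols):
        raise AlphabetErrorSketch(f"{code!r} is not a valid code")
    return symbols[code]


dna = ("A", "C", "G", "T")
code = encode_sketch(dna, "G")
print(code)                      # 2
print(decode_sketch(dna, code))  # G
```

Encoding followed by decoding returns the original symbol, and both directions fail loudly instead of returning a placeholder.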
- encode_multiple(symbols, dtype=None)
Encode multiple symbols.
- Parameters:
- symbols : iterable object or str or bytes
The symbols to encode. The method is fastest when an ndarray, str or bytes object containing the symbols is provided, instead of e.g. a list.
- dtype : dtype, optional
Present for compatibility with the superclass. The value is ignored.
- Returns:
- code : ndarray
The sequence code.
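One way to encode many symbols without a dictionary is a 256-entry lookup table indexed by byte value. The sketch below uses the standard library's `bytes.translate` for this; it illustrates the general idea only and is not biotite's NumPy/Cython code. The name `encode_multiple_sketch` and the sentinel value 255 are assumptions made for the example.

```python
def encode_multiple_sketch(symbols, text):
    """Encode `text` into a list of codes using a byte lookup table.

    Hypothetical helper for illustration; not part of biotite.
    """
    # 255 marks "not in the alphabet"; it is safe as a sentinel because
    # an alphabet of at most 94 symbols never uses a code that large.
    table = bytearray([255]) * 256
    for code, byte in enumerate(symbols.encode("ascii")):
        table[byte] = code
    # Replace every input byte by its table entry in a single pass.
    codes = text.encode("ascii").translate(bytes(table))
    if 255 in codes:
        raise ValueError("text contains a symbol that is not in the alphabet")
    return list(codes)


print(encode_multiple_sketch("ACGT", "GATTACA"))  # [2, 0, 3, 3, 0, 1, 0]
```

The table is built once per alphabet, so the cost per encoded symbol is a single array access rather than a hash lookup.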
- extends(alphabet)
Check whether this alphabet extends another alphabet.
- Parameters:
- alphabet : Alphabet
The potential parent alphabet.
- Returns:
- result : bool
True if this object extends alphabet, False otherwise.
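The text above does not spell out what "extends" means. The sketch below assumes one plausible reading: an alphabet extends a parent if the parent's symbols appear at the start of its own symbol list in the same order, so that shared symbols keep identical codes. Both this interpretation and the name `extends_sketch` are assumptions, not a statement of biotite's implementation.

```python
def extends_sketch(symbols, parent_symbols):
    """Return True if `parent_symbols` is a prefix of `symbols`.

    Hypothetical helper; assumes "extends" means that every parent
    symbol keeps the same code in the extending alphabet.
    """
    symbols = tuple(symbols)
    parent_symbols = tuple(parent_symbols)
    return symbols[:len(parent_symbols)] == parent_symbols


print(extends_sketch("ACGTN", "ACGT"))  # True
print(extends_sketch("ACGT", "ACGTN"))  # False
print(extends_sketch("TGCA", "ACGT"))   # False
```

Under this reading, a sequence code produced with the parent alphabet stays valid when reinterpreted with the extending alphabet.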
- get_symbols()
Get the symbols in the alphabet.
- Returns:
- symbols : tuple
The symbols.
- is_letter_alphabet()
Check whether the symbols in this alphabet are single printable letters. If so, the alphabet could be expressed by a LetterAlphabet.
- Returns:
- is_letter_alphabet : bool
True if all symbols in the alphabet are 'str' or 'bytes', have length 1 and are printable.
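The three conditions listed above (type, length, printability) can be written out as a short check. This sketch is independent of biotite: `is_letter_alphabet_sketch` is a hypothetical name, and "printable" is assumed to mean the ASCII characters '!' through '~', in line with the 94-character limit of LetterAlphabet.

```python
def is_letter_alphabet_sketch(symbols):
    """Return True if every symbol is a single printable str or bytes.

    Hypothetical helper for illustration; not part of biotite.
    """
    for symbol in symbols:
        if not isinstance(symbol, (str, bytes)):
            return False
        if len(symbol) != 1:
            return False
        # Indexing bytes yields an int; a str needs ord().
        value = symbol[0] if isinstance(symbol, bytes) else ord(symbol)
        # Assumption: printable means ASCII 33 ('!') to 126 ('~').
        if not 33 <= value <= 126:
            return False
    return True


print(is_letter_alphabet_sketch(["A", "C", "G", "T"]))  # True
print(is_letter_alphabet_sketch(["Ala", "Cys"]))        # False
print(is_letter_alphabet_sketch([0, 1, 2]))             # False
```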
Gallery

Searching for structural homologs in a protein structure database