StringArrayEncoding
- class biotite.structure.io.pdbx.StringArrayEncoding(strings: ... = None, data_encoding: ... = None, offset_encoding: ... = None)
Bases: Encoding
Encoding that compresses an array of strings into an array of indices that point to the unique strings in that array.
The unique strings themselves are stored as part of the StringArrayEncoding as a single concatenated string. The start index of each unique string within the concatenated string is stored in an offset array.
- Parameters:
- strings : ndarray, optional
The unique strings that are used for encoding. If omitted, the unique strings are determined from the data the first time encode() is called.
- data_encoding : list of Encoding, optional
The encodings that are applied to the index array. If omitted, the array is directly encoded into bytes without further compression.
- offset_encoding : list of Encoding, optional
The encodings that are applied to the offset array. If omitted, the array is directly encoded into bytes without further compression.
Examples
>>> data = np.array(["apple", "banana", "cherry", "apple", "banana", "apple"])
>>> print(data)
['apple' 'banana' 'cherry' 'apple' 'banana' 'apple']
>>> # By default the indices would directly be encoded into bytes
>>> # However, the indices should be printed here -> data_encoding=[]
>>> encoding = StringArrayEncoding(data_encoding=[])
>>> encoded = encoding.encode(data)
>>> print(encoding.strings)
['apple' 'banana' 'cherry']
>>> print(encoded)
[0 1 2 0 1 0]
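The index/offset scheme described above can be illustrated without Biotite. The following is a minimal pure-Python sketch, not the library's implementation: the helper names `encode_strings()` and `decode_strings()` are hypothetical, and storing one extra trailing offset that marks the end of the last string is an assumption of this sketch.

```python
def encode_strings(values):
    """Return (indices, string_data, offsets) for a list of strings."""
    unique = []    # unique strings in order of first appearance
    position = {}  # maps each unique string to its index in `unique`
    indices = []
    for value in values:
        if value not in position:
            position[value] = len(unique)
            unique.append(value)
        indices.append(position[value])

    # Record where each unique string starts in the concatenated string.
    # Assumption: a final offset marks the end of the last string.
    offsets = [0]
    for string in unique:
        offsets.append(offsets[-1] + len(string))
    string_data = "".join(unique)
    return indices, string_data, offsets


def decode_strings(indices, string_data, offsets):
    """Invert encode_strings()."""
    unique = [
        string_data[offsets[i]:offsets[i + 1]]
        for i in range(len(offsets) - 1)
    ]
    return [unique[i] for i in indices]


data = ["apple", "banana", "cherry", "apple", "banana", "apple"]
indices, string_data, offsets = encode_strings(data)
print(indices)      # [0, 1, 2, 0, 1, 0]
print(string_data)  # applebananacherry
print(offsets)      # [0, 5, 11, 17]
```

Slicing the concatenated string at consecutive offsets recovers the unique strings, so `decode_strings(indices, string_data, offsets)` returns the original list.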
- Attributes:
- strings : ndarray
- data_encoding : list of Encoding
- offset_encoding : list of Encoding
- decode(data)
Apply the inverse of this encoding to the given data.
- Parameters:
- data : ndarray or bytes
The data to be decoded.
- Returns:
- decoded_data : ndarray
The decoded data.
- static deserialize(content)
Create this component by deserializing the given content.
- Parameters:
- content : str or dict
The content to be deserialized. The type of this parameter depends on the file format. In case of CIF files, this is the text of the lines that represent this component. In case of BinaryCIF files, this is a dictionary parsed from the MessagePack data.
- encode(data)
Apply this encoding to the given data.
- Parameters:
- data : ndarray
The data to be encoded.
- Returns:
- encoded_data : ndarray or bytes
The encoded data.
- serialize()
Convert this component into a Python object that can be written to a file.
- Returns:
- content : str or dict
The serialized content. The type of this return value depends on the file format. In case of CIF files, this is the text of the lines that represent this component. In case of BinaryCIF files, this is a dictionary that can be encoded into MessagePack.
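To give a rough idea of what such a dictionary might contain for a string array, the sketch below builds one by hand using only the standard library. The field names (`kind`, `dataEncoding`, `stringData`, `offsetEncoding`, `offsets`) and the type code `3` for 32-bit signed integers follow the StringArray encoding of the BinaryCIF specification; that `serialize()` returns exactly this layout is an assumption, and the helpers `pack_int32()` and `unpack_int32()` are hypothetical.

```python
import struct


def pack_int32(values):
    """Pack integers as little-endian 32-bit signed values."""
    return struct.pack(f"<{len(values)}i", *values)


def unpack_int32(raw):
    """Invert pack_int32()."""
    return list(struct.unpack(f"<{len(raw) // 4}i", raw))


offsets = [0, 5, 11, 17]

# Assumed layout, modelled on the BinaryCIF StringArray encoding
content = {
    "kind": "StringArray",
    # How the index array is turned into bytes
    "dataEncoding": [{"kind": "ByteArray", "type": 3}],
    # All unique strings, concatenated
    "stringData": "applebananacherry",
    # How the offset array is turned into bytes
    "offsetEncoding": [{"kind": "ByteArray", "type": 3}],
    "offsets": pack_int32(offsets),
}

# Recover the unique strings from the dictionary alone
restored_offsets = unpack_int32(content["offsets"])
unique = [
    content["stringData"][start:stop]
    for start, stop in zip(restored_offsets[:-1], restored_offsets[1:])
]
print(unique)  # ['apple', 'banana', 'cherry']
```

Every value in this dictionary is a string, a list, a nested dictionary, or a `bytes` object, so it could be passed to a MessagePack encoder without further conversion.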
- static subcomponent_class()
Get the class of the components that are stored in this component.
- Returns:
- subcomponent_class : type
The class of the subcomponent. If this component already represents the lowest level, i.e. it does not contain subcomponents, None is returned.
- static supercomponent_class()
Get the class of the component that contains this component.
- Returns:
- supercomponent_class : type
The class of the supercomponent. If this component already represents the highest level, i.e. it is not contained in another component, None is returned.