GlycoCT
A parser for GlycoCT{condensed} format.
GlycoCT{condensed} is a multi-line format for representing glycan structures and compositions, published in [1]. The format is intended to be human-readable and easily compressed, and it includes a canonicalization algorithm that ensures each glycan structure has a single representation.
GlycoCT{condensed} can represent glycan structures with ambiguous
or repeating sub-units. The specification includes additional section directives with
support for stochastic sub-units as well as disjoint subgraphs, though these have not
been implemented in glypy.
References
- [1] Herget, S., Ranzinger, R., Maass, K., & Lieth, C.-W. V. D. (2008).
GlycoCT-a unifying sequence format for carbohydrates. Carbohydrate Research, 343(12), 2162–2171. https://doi.org/10.1016/j.carres.2008.03.011
High Level Functions
- glypy.io.glycoct.dump(structure, buffer=None)
Serialize the Glycan into GlycoCT{condensed}, using buffer to store the result. If buffer is None, then the function will operate on a newly created StringIO object.
- Parameters:
structure (Glycan) – The structure to serialize
buffer (file-like or None) – The stream to write the serialized structure to. If None, uses an instance of StringIO
- Return type:
file-like or str if buffer is None
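The buffer handling described for dump can be sketched with the standard library alone. This is a minimal illustration of the documented contract (write into the caller's buffer, or create a StringIO and return its text), not glypy's implementation; the function name dump_lines and its lines argument are hypothetical stand-ins for the real serializer.

```python
from io import StringIO


def dump_lines(lines, buffer=None):
    """Write text lines to `buffer`, mimicking the documented dump() contract.

    Hypothetical helper: `lines` stands in for the serialized structure.
    """
    return_text = buffer is None
    if return_text:
        # No buffer supplied: operate on a newly created StringIO
        buffer = StringIO()
    for line in lines:
        buffer.write(line + "\n")
    if return_text:
        # The caller gave no buffer, so hand back the text itself
        return buffer.getvalue()
    # The caller gave a buffer, so hand the same object back
    return buffer


text = dump_lines(["RES", "1b:b-dglc-HEX-1:5"])
out = StringIO()
same = dump_lines(["RES"], out)
```

Returning the text only when no buffer was supplied keeps the one-argument call convenient while still letting callers stream several structures into a single open file.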
- glypy.io.glycoct.load(stream, structure_class=<class 'glypy.structure.glycan.Glycan'>, allow_repeats=True, allow_multiple=True)
Read all structures from the provided text stream.
- glypy.io.glycoct.dumps(structure)
Serialize the Glycan into GlycoCT{condensed}, returning the text as a string.
Examples
>>> from glypy.io import glycoct
>>> glycoct.loads("""RES
1b:x-dglc-HEX-1:5
2s:n-acetyl
3b:b-dglc-HEX-1:5
4s:n-acetyl
5b:b-dman-HEX-1:5
6b:a-dman-HEX-1:5
7b:b-dglc-HEX-1:5
8s:n-acetyl
9b:a-lgal-HEX-1:5|6:d
10b:b-dgal-HEX-1:5
11b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
12s:n-glycolyl
13b:b-dglc-HEX-1:5
14s:n-acetyl
15b:b-dgal-HEX-1:5
16s:n-acetyl
17b:b-dglc-HEX-1:5
18s:n-acetyl
19b:a-dman-HEX-1:5
20b:b-dglc-HEX-1:5
21s:n-acetyl
22b:a-lgal-HEX-1:5|6:d
23b:b-dgal-HEX-1:5
24b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
25s:n-glycolyl
26b:b-dglc-HEX-1:5
27s:n-acetyl
28b:a-lgal-HEX-1:5|6:d
29b:b-dgal-HEX-1:5
30b:a-dgro-dgal-NON-2:6|1:a|2:keto|3:d
31s:n-acetyl
32b:a-lgal-HEX-1:5|6:d
LIN
1:1d(2+1)2n
2:1o(4+1)3d
3:3d(2+1)4n
4:3o(4+1)5d
5:5o(3+1)6d
6:6o(2+1)7d
7:7d(2+1)8n
8:7o(3+1)9d
9:7o(4+1)10d
10:10o(3+2)11d
11:11d(5+1)12n
12:6o(4+1)13d
13:13d(2+1)14n
14:13o(4+1)15d
15:15d(2+1)16n
16:5o(4+1)17d
17:17d(2+1)18n
18:5o(6+1)19d
19:19o(2+1)20d
20:20d(2+1)21n
21:20o(3+1)22d
22:20o(4+1)23d
23:23o(3+2)24d
24:24d(5+1)25n
25:19o(6+1)26d
26:26d(2+1)27n
27:26o(3+1)28d
28:26o(4+1)29d
29:29o(3+2)30d
30:30d(5+1)31n
31:1o(6+1)32d
""")
>>>
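To make the layout of the record above easier to read, the following standard-library sketch splits a GlycoCT{condensed} record into its RES and LIN sections and picks apart individual lines. It is an illustration of the text format only, assuming a plain record without REP or UND sections; the helper names and regular expressions are hypothetical and are not part of glypy.

```python
import re

# "<index><type>:<descriptor>", e.g. "1b:x-dglc-HEX-1:5" or "2s:n-acetyl"
RES_LINE = re.compile(r"^(\d+)([a-z]):(.+)$")
# "<id>:<parent><type>(<parent pos>+<child pos>)<child><type>", e.g. "2:1o(4+1)3d"
LIN_LINE = re.compile(r"^(\d+):(\d+)([a-z])\(([^+]+)\+([^)]+)\)(\d+)([a-z])$")


def split_sections(text):
    """Group the lines of one record under their all-caps section header."""
    sections = {}
    current = None
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.isalpha() and line.isupper():
            current = line
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def parse_residues(lines):
    """Map residue index to (type letter, descriptor)."""
    residues = {}
    for line in lines:
        index, kind, descriptor = RES_LINE.match(line).groups()
        residues[int(index)] = (kind, descriptor)
    return residues


def parse_links(lines):
    """Return (parent index, parent position, child position, child index)."""
    links = []
    for line in lines:
        _, parent, _, parent_pos, child_pos, child, _ = LIN_LINE.match(line).groups()
        # Positions stay as strings because they may be ambiguous, e.g. "-1"
        links.append((int(parent), parent_pos, child_pos, int(child)))
    return links


# The first four residues of the example above
record = """RES
1b:x-dglc-HEX-1:5
2s:n-acetyl
3b:b-dglc-HEX-1:5
4s:n-acetyl
LIN
1:1d(2+1)2n
2:1o(4+1)3d
3:3d(2+1)4n
"""

sections = split_sections(record)
residues = parse_residues(sections["RES"])
links = parse_links(sections["LIN"])
```

Here residue entries typed b are base monosaccharides and those typed s are substituents, so the second link joins residue 1 at position 4 to position 1 of residue 3.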
Object-Oriented Interface
- class glypy.io.glycoct.GlycoCTReader(stream, structure_class=<class 'glypy.structure.glycan.Glycan'>, allow_repeats=True, completes=True)
Parse GlycoCT{condensed} text data into Glycan objects.
The parser implements the Iterator interface, yielding successive glycans from a text stream separated by empty lines.
The parser can understand fully specified and partially ambiguous structures. When allow_repeats is True and a REP section is encountered, it will be expanded to its minimum multiplicity, or 1 if the minimum is unknown. UND sections will be connected to the main graph by AmbiguousLink instead of Link objects.
- Variables:
allow_repeats (bool) – Whether or not to permit REP sections. Defaults to True
completes (bool) – Whether or not to translate the built graph into a Glycan object. Defaults to True
handle (file-like) – The text file being read from
in_repeat (bool) – Indicates the parser is currently parsing a REP section's sub-graph
in_undetermined (bool) – Indicates the parser is currently parsing a UND section's sub-graph
postponed (list) – Holds all the deferred operations for the top-most graph as callable objects
root (Monosaccharide) – The root node of the produced graph
state (str) – The current state of the parser's state machine
repeats (dict) – Maps RES section index to RepeatedGlycoCTSubgraph
undetermineds (dict) – Maps UND section index to UndeterminedGlycoCTSubgraph
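The iteration behaviour described for the reader, where successive records in a stream are separated by empty lines, can be sketched as a small generator. This is an assumption-laden illustration of the record-splitting idea only: iter_records is a hypothetical name, and the real GlycoCTReader drives a state machine rather than pre-splitting the text.

```python
from io import StringIO


def iter_records(stream):
    """Yield each run of consecutive non-empty lines as one text record."""
    block = []
    for line in stream:
        line = line.rstrip("\n")
        if line.strip():
            block.append(line)
        elif block:
            # An empty line closes the record collected so far
            yield "\n".join(block)
            block = []
    if block:
        # The final record need not be followed by an empty line
        yield "\n".join(block)


stream = StringIO(
    "RES\n1b:b-dglc-HEX-1:5\n"
    "\n"
    "RES\n1b:a-dman-HEX-1:5\n"
)
records = list(iter_records(stream))
```

Because the generator only advances the stream as records are requested, a large file can be processed one structure at a time.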
- glypy.io.glycoct.GlycoCTWriter
alias of UNDOrderRespectingGlycoCTWriter
Implementation Details
- class glypy.io.glycoct.RepeatedGlycoCTSubgraph(graph_index, repeat_index, internal_linkage=None, external_linkage=None, multitude=None, graph=None, parent=None)
Implements the machinery for representing a repeated subgraph in GlycoCT.
- Variables:
graph_index (int)
repeat_index (int) – The i-th repeating subgraph in the graph.
internal_linkage (object) – The linkage connecting two repetitions of the subgraph
external_linkage (object) – The linkage connecting the final repetition to the outside nodes.
multitude (RepeatedMultitude) – Holds the lower and upper bounds on the number of times this subgraph may be repeated.
repetitions (OrderedDict) – The repetitions of this subgraph, materialized during postprocess()
postponed (deque) – A queue of post-processing callbacks.
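The rule stated for the reader, that a repeat is expanded to its minimum multiplicity or to 1 when the minimum is unknown, can be illustrated as follows. This sketch assumes the repeat range is written as lower-upper with -1 marking an unknown bound (for example 1-3 or -1--1), which is how I read the GlycoCT format and should be checked against the specification; parse_multitude and expansion_count are hypothetical helpers, not the RepeatedMultitude class.

```python
import re


def parse_multitude(text):
    """Parse 'lower-upper' into a pair, mapping the assumed -1 marker to None."""
    match = re.fullmatch(r"(-?\d+)-(-?\d+)", text)
    if match is None:
        raise ValueError("not a repeat range: %r" % (text,))
    lower, upper = (int(value) for value in match.groups())
    return (
        None if lower == -1 else lower,
        None if upper == -1 else upper,
    )


def expansion_count(lower):
    """Number of copies to materialize: the minimum, or 1 if it is unknown."""
    return lower if lower is not None else 1


known = parse_multitude("2-5")
unknown = parse_multitude("-1--1")
counts = (expansion_count(known[0]), expansion_count(unknown[0]))
```

Expanding to the lower bound produces the smallest concrete structure consistent with the record, which is why an unknown minimum falls back to a single copy.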