Monosaccharide
Represents individual saccharide residues and their associated functions. Monosaccharides are the basic unit of structural representation, possessing graph node-like properties.
Monosaccharide Objects
- class glypy.structure.monosaccharide.Monosaccharide(anomer=None, configuration=None, stem=None, superclass=None, ring_start=-1, ring_end=-1, modifications=None, links=None, substituent_links=None, composition=None, reduced=None, id=None, fast=False)
Represents a single monosaccharide molecule and its relationships with other molecules through Link objects.
Link objects are stored in links for connections to other Monosaccharide instances, building a Glycan structure as a graph of Monosaccharide objects. Link objects connecting the Monosaccharide instance to Substituent objects are stored in substituent_links.
Both links and substituent_links are instances of OrderedMultiMap, where the key is the index of the carbon atom in the carbohydrate backbone that hosts the bond. An index of x or -1 represents an unknown location.
Warning
While Monosaccharide objects expose their modifications, links, and substituent_links attributes as mutable, you should treat them as read-only. The methods for altering their contents, add_substituent(), add_monosaccharide(), add_modification(), drop_substituent(), drop_monosaccharide(), and drop_modification(), are responsible for handling these mutations for you. Link methods like Link.apply() and Link.break_link() are used internally.
- Variables:
anomer (Anomer) – An entry of Anomer that corresponds to the linkage type of the carbohydrate backbone. Is an entry of a class based on Enum.
superclass (SuperClass) – An entry of SuperClass that corresponds to the number of carbons in the carbohydrate backbone of the monosaccharide. Controls the base composition of the instance and the number of positions open to be linked to or modified. Is an entry of a class based on Enum.
configuration (Configuration or {‘d’, ‘l’, ‘x’, ‘missing’, None}) – An entry of Configuration which corresponds to the optical stereomer state of the instance. Is an entry of a class based on Enum. May possess more than one value.
stem (Stem) – Corresponds to the bond conformation of the carbohydrate backbone. Is an entry of a class based on Enum. May possess more than one value.
ring_start (int) – The index of the carbon of the carbohydrate backbone that starts a ring. A value of -1, 'x', or None corresponds to an unknown start. A value of 0 refers to a linear chain.
ring_end (int) – The index of the carbon of the carbohydrate backbone that ends a ring. A value of -1, 'x', or None corresponds to an unknown end. A value of 0 refers to a linear chain.
stereocode (Stereocode) – The stereochemistry of all carbons of the monosaccharide’s backbone ring/chain.
reducing_end (ReducedEnd) – The reducing end terminal group of the monosaccharide if the monosaccharide is uncyclized.
modifications (OrderedMultiMap) – The mapping of sites to Modification entries. Directly modifies the instance’s composition.
links (OrderedMultiMap) – The mapping of sites to Link entries that refer to other Monosaccharide instances.
substituent_links (OrderedMultiMap) – The mapping of sites to Link entries that refer to Substituent instances.
composition (Composition) – An instance of Composition corresponding to the elemental composition of self and its immediate modifications. If not provided, this will be inferred from field values.
reduced (ReducedEnd) – An instance of ReducedEnd, or the value True, represents a reduced sugar. May be inferred from modifications if “aldi” is present.
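The position-keyed storage used by links and substituent_links can be pictured with a small stand-in. The sketch below is not glypy's OrderedMultiMap: PositionMultiMap and its methods are hypothetical names, used only to illustrate a mapping in which one backbone position may hold several entries and 'x' or -1 marks an unknown position.

```python
from collections import OrderedDict


class PositionMultiMap:
    """Toy multimap: each backbone position maps to a list of attached items.

    Hypothetical stand-in for illustration; not glypy's OrderedMultiMap.
    """

    UNKNOWN = -1  # the documentation treats 'x' and -1 as "unknown position"

    def __init__(self):
        self._store = OrderedDict()

    def _key(self, position):
        # Normalize 'x' to -1 so both spellings share one bucket.
        return self.UNKNOWN if position == "x" else position

    def add(self, position, item):
        self._store.setdefault(self._key(position), []).append(item)

    def __getitem__(self, position):
        # Return a copy so callers cannot mutate the stored bucket.
        return list(self._store.get(self._key(position), []))

    def items(self):
        # Yield flat (position, item) pairs, grouped by position.
        for position, bucket in self._store.items():
            for item in bucket:
                yield position, item


links = PositionMultiMap()
links.add(4, "link to GlcNAc")
links.add(6, "link to Fuc")
links.add("x", "link at unknown site")
links.add(4, "second link at carbon 4")

pairs = list(links.items())
```

Returning a copy from the lookup mirrors the advice in the warning above: callers read the mapping, while mutation goes through dedicated methods.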
Connection Enumeration
- Monosaccharide.parents(links=False)
Returns the Monosaccharide instances which are considered the ancestors of self, as a list of (position, parent) pairs.
- Parameters:
links (bool) – Whether to return the Link objects, or their parents. Defaults to False.
- Returns:
list of (position, parent) pairs, where:
position (int) – Location of the bond to the parent Monosaccharide
parent (Monosaccharide) – Monosaccharide at position
- Monosaccharide.children(links=False)
Returns the Monosaccharide instances which are considered the descendants of self, as a list of (position, child) pairs.
>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> ch = n_linked_core.root.children()
>>> ch[0]
(4, RES
1b:b-dglc-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n)
- Parameters:
links (bool) – Whether to return the Link objects, or their children. Defaults to False.
- Returns:
list of (position, child) pairs, where:
position (int) – Location of the bond to the child Monosaccharide
child (Monosaccharide) – Monosaccharide at position
- Monosaccharide.substituents()
Returns an iterator over all substituents attached to self by a Link object stored in substituent_links.
- Returns:
list of (position, substituent) pairs, where:
position (int) – Location of the bond to the substituent
substituent (Substituent) – Substituent at position
Adding and Removing Connections and Modifications
- Monosaccharide.add_monosaccharide(monosaccharide, position=-1, max_occupancy=0, child_position=-1, parent_loss=None, child_loss=None)
Adds a Monosaccharide and associated Link to links at the site given by position.
>>> from glypy import monosaccharides
>>> hexnac = monosaccharides.HexNAc
>>> hex = monosaccharides.Hex
>>> hexnac.add_monosaccharide(hex, 1)
RES
1b:x-xx-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> hexnac.links[1][0].child
RES
1b:x-xx-HEX-1:5
- Parameters:
monosaccharide (Monosaccharide) – The monosaccharide to add.
position (int or 'x') – The location to add the Monosaccharide link to links. Defaults to -1.
child_position (int) – The location to add the link to in monosaccharide’s links. Defaults to -1.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0, per the signature.
parent_loss (Composition or str) – The elemental composition removed from self.
child_loss (Composition or str) – The elemental composition removed from monosaccharide.
- Raises:
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements.
- Returns:
self, for chain calls
- Return type:
Monosaccharide
- Monosaccharide.add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)
Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.
>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hex.add_substituent("n-acetyl", 2, parent_loss="OH")
RES
1b:x-xx-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> hexnac == hex
True
- Parameters:
substituent (str or Substituent) – The substituent to add. If passed a str, it will be translated into an instance of Substituent.
position (int or 'x') – The location to add the Substituent link to substituent_links. Defaults to -1.
child_position (int) – The location to add the link to in substituent’s links. Defaults to 1, per the signature. Substituent indices are currently not checked.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0, per the signature.
parent_loss (Composition or str) – The elemental composition removed from self.
child_loss (Composition or str) – The elemental composition removed from substituent.
- Raises:
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements.
- Returns:
self, for chain calls
- Return type:
Monosaccharide
- Monosaccharide.add_modification(modification, position, max_occupancy=0)
Adds a modification instance to modifications at the site given by position. This directly modifies composition, consequently changing mass().
- Parameters:
modification (str or Modification) – The modification to add. If passed a str, it will be translated into an instance of Modification.
position (int or 'x') – The location to add the Modification to.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0, per the signature.
- Raises:
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements.
- Returns:
self, for chain calls
- Return type:
Monosaccharide
- Monosaccharide.drop_monosaccharide(position, refund=True)
Remove the glycosidic bond at position, detaching a connected Monosaccharide.
If there is more than one glycosidic bond at position, an error will be raised.
>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> n_linked_core.root.drop_monosaccharide(4)
RES
1b:b-dglc-HEX-1:5
2s:n-acetyl
LIN
1:1d(2+1)2n
>>> n_linked_core.mass()
221.08993720321
- Parameters:
position (int) – The position to drop the monosaccharide from.
refund (bool) – Passed to break_link().
- Returns:
self, for chain calls
- Return type:
Monosaccharide
- Monosaccharide.drop_substituent(position, substituent=None, refund=True)
Remove the substituent at position.
If substituent is None, then the first substituent found at position is removed.
>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hexnac.drop_substituent(2)
RES
1b:x-xx-HEX-1:5
>>> hexnac == hex
True
- Parameters:
position (int) – The position to drop the substituent from.
substituent (Substituent) – The Substituent to remove. If None, the first substituent found at position will be removed.
refund (bool) – Passed to break_link().
- Raises:
IndexError – If position is not a valid carbohydrate backbone position.
ValueError – If substituent is not found at position.
- Returns:
self, for chain calls
- Return type:
Monosaccharide
- Monosaccharide.drop_modification(position, modification)
Remove the modification at position.
- Parameters:
position (int) – The position to drop the modification from.
modification (Modification) – The Modification to remove.
- Raises:
IndexError – If position is not a valid carbohydrate backbone position.
ValueError – If modification is not found at position.
- Returns:
self, for chain calls
- Return type:
Monosaccharide
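The add and drop methods above share one contract: check the position against the backbone, check occupancy against max_occupancy, mutate, then return self so calls can be chained. The following is a minimal sketch of that contract only. ToyResidue, its add() and drop() methods, and its attachments dictionary are hypothetical and do not reproduce glypy's Link handling or composition bookkeeping.

```python
class ToyResidue:
    """Hypothetical residue that tracks attachments per backbone position."""

    def __init__(self, n_carbons):
        self.n_carbons = n_carbons
        self.attachments = {}

    def is_occupied(self, position):
        # Number of items currently attached at this position.
        return len(self.attachments.get(position, []))

    def add(self, item, position=-1, max_occupancy=0):
        if position > self.n_carbons:
            raise IndexError("position %r exceeds the backbone" % (position,))
        if self.is_occupied(position) > max_occupancy:
            raise ValueError("position %r is already occupied" % (position,))
        self.attachments.setdefault(position, []).append(item)
        return self  # returning self enables chained calls

    def drop(self, position, item=None):
        bucket = self.attachments.get(position, [])
        if not bucket:
            raise ValueError("nothing attached at position %r" % (position,))
        if item is None:
            bucket.pop(0)        # remove the first attachment found
        else:
            bucket.remove(item)  # raises ValueError if item is absent
        return self


hexose = ToyResidue(6)
hexose.add("n-acetyl", 2).add("sulfate", 6)

try:
    hexose.add("methyl", 2)  # position 2 is taken and max_occupancy is 0
except ValueError as err:
    conflict = str(err)

# Raising max_occupancy permits a second item at the same position.
hexose.add("methyl", 2, max_occupancy=1).drop(2, "n-acetyl")
```

With max_occupancy left at 0, the check only passes for an empty position, which matches the documented ValueError condition of a position being occupied by more than max_occupancy elements.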
Position Occupancy
- Monosaccharide.is_occupied(position)
Checks to see if a particular backbone position is occupied by a Modification, Substituent, or Link to another Monosaccharide.
- Monosaccharide.open_attachment_sites(max_occupancy=0)
When attaching Monosaccharide instances to other objects, bonds are formed between the carbohydrate backbone and the other object. If a site is already bound, the occupying object fills that space on the backbone and prevents other objects from binding there.
Currently, only the availability of the hydroxyl group is considered. As there is no hydroxyl attached to the ring-ending carbon, that carbon should not be considered an open site.
If any existing attached units have unknown positions, no known positions can be provided. In that case the list of open positions will be a list of -1s whose length is the number of open sites.
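The rules described for open attachment sites can be sketched as a plain function. This is an illustration under stated assumptions, not glypy's implementation: open_sites() and its parameters are hypothetical, and it assumes each attachment at an unknown position consumes exactly one otherwise open site.

```python
def open_sites(n_carbons, ring_end, occupied, unknown_count=0):
    """List backbone positions still available for attachment.

    n_carbons     -- number of backbone carbons (6 for a hexose)
    ring_end      -- ring-closing carbon, which carries no free hydroxyl
    occupied      -- set of positions that already host something
    unknown_count -- attachments whose position is unknown (assumption:
                     each one uses up a single open site)
    """
    candidates = [
        carbon
        for carbon in range(1, n_carbons + 1)
        if carbon != ring_end and carbon not in occupied
    ]
    if unknown_count:
        # Positions can no longer be named, only counted.
        remaining = max(len(candidates) - unknown_count, 0)
        return [-1] * remaining
    return candidates


known = open_sites(6, ring_end=5, occupied={2})
ambiguous = open_sites(6, ring_end=5, occupied={2}, unknown_count=1)
```

Here known names the four free carbons, while ambiguous only reports how many sites remain, because the unknown attachment could sit on any of them.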
Equality Comparison
Monosaccharide objects support the equality comparison operators == and !=. They also support hashing, using the hash() value of Monosaccharide.id.
- Monosaccharide.exact_ordering_equality(other, substituents=True, visited=None)
Performs equality testing between two monosaccharides where the exact position (and ordering by sort) of links must match between the input Monosaccharide objects.
- Return type:
bool
- Monosaccharide.topological_equality(other, substituents=True, visited=None)
Performs equality testing between two monosaccharides where the exact ordering of child links does not have to match between the input Monosaccharide objects, so long as an exact match of the subtrees is found.
- Return type:
bool
- Monosaccharide.__eq__(other)
Test for equality between Monosaccharide instances. First try scalar equality of fields, and then compare descendants.
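The difference between the two comparison modes is easiest to see on a toy tree. The sketch below is a simplified illustration and not glypy's algorithm: Node and both functions are hypothetical, substituents are omitted, and the topological version assumes positions are ignored entirely when matching subtrees.

```python
class Node:
    """Hypothetical residue: a name plus a list of (position, child) pairs."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def exact_ordering_equal(a, b):
    # Children are sorted by position, then compared pairwise:
    # both the position and the subtree must agree.
    if a.name != b.name or len(a.children) != len(b.children):
        return False
    left = sorted(a.children, key=lambda pair: pair[0])
    right = sorted(b.children, key=lambda pair: pair[0])
    return all(
        pos_a == pos_b and exact_ordering_equal(child_a, child_b)
        for (pos_a, child_a), (pos_b, child_b) in zip(left, right)
    )


def topological_equal(a, b):
    # Each child of a must match some not-yet-used child of b.
    if a.name != b.name or len(a.children) != len(b.children):
        return False
    unmatched = [child for _, child in b.children]
    for _, child_a in a.children:
        for index, child_b in enumerate(unmatched):
            if topological_equal(child_a, child_b):
                del unmatched[index]
                break
        else:
            return False  # no partner subtree found for child_a
    return True


# Same branches, but attached at swapped positions.
left = Node("Man", [(3, Node("Man")), (6, Node("Man", [(2, Node("GlcNAc"))]))])
right = Node("Man", [(3, Node("Man", [(2, Node("GlcNAc"))])), (6, Node("Man"))])

same_exact = exact_ordering_equal(left, right)
same_topology = topological_equal(left, right)
```

The two trees carry identical subtrees, so the topological check succeeds, while the exact check fails because the branch at position 3 differs.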
Serialization
- Monosaccharide.serialize(name='glycoct')
Convert this object into text using the requested textual encoding.
- classmethod Monosaccharide.register_serializer(name, method)
Add method as name to the set of serializers to pick from in serialize().
- Parameters:
name (str) – The name of the serializer.
method (Callable) – A callable object that when called with a Monosaccharide returns a str.
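The pairing of register_serializer() and serialize() follows a name-to-callable registry pattern. The sketch below shows that pattern on a hypothetical ToyResidue class with made-up serializer names; how glypy stores its serializers internally is not stated here and is not implied by this example.

```python
class ToyResidue:
    """Hypothetical class demonstrating a serializer registry."""

    _serializers = {}  # shared by all instances: name -> callable

    def __init__(self, name):
        self.name = name

    @classmethod
    def register_serializer(cls, name, method):
        # method must accept an instance and return a str.
        cls._serializers[name] = method

    def serialize(self, name="plain"):
        # Look up the requested encoding and apply it to this instance.
        return self._serializers[name](self)


ToyResidue.register_serializer("plain", lambda residue: residue.name)
ToyResidue.register_serializer("upper", lambda residue: residue.name.upper())

text = ToyResidue("GlcNAc").serialize("upper")
```

Registering on the class rather than on an instance means a serializer added once is available to every object of that type.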
Mass Spectrometry Utilities
- Monosaccharide.total_composition()
Computes the sum of the composition of self and each of its linked Substituent objects.
- Return type:
Composition
- Monosaccharide.mass(average=False, charge=0, mass_data=None, substituents=True)
Calculates the total mass of self.
- Parameters:
average (bool, optional, defaults to False) – Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.
charge (int, optional, defaults to 0) – If charge is non-zero, m/z is calculated, where m is the theoretical mass and z is charge.
mass_data (dict, optional) – If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.
substituents (bool, optional, defaults to True) – Whether or not to include substituents’ masses.
- Return type:
float
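The effect of the average and charge arguments can be illustrated with a hand-rolled calculation for a free hexose, C6H12O6. This sketch is independent of glypy: toy_mass() is a hypothetical helper, the element tables hold only three elements, and the charged case assumes the common convention m/z = (m + z × proton mass) / |z|, which may differ from how glypy treats charge carriers.

```python
# Monoisotopic and average atomic masses for the three elements needed.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
AVERAGE = {"C": 12.0107, "H": 1.00794, "O": 15.9994}
PROTON = 1.00727646677


def toy_mass(composition, average=False, charge=0):
    """Sum elemental masses; return m/z when charge is non-zero."""
    table = AVERAGE if average else MONOISOTOPIC
    neutral = sum(table[element] * count for element, count in composition.items())
    if charge == 0:
        return neutral
    # Assumed convention: add one proton mass per unit of charge.
    return (neutral + charge * PROTON) / abs(charge)


hexose = {"C": 6, "H": 12, "O": 6}

monoisotopic = toy_mass(hexose)              # about 180.0634
average = toy_mass(hexose, average=True)     # about 180.1559
singly_charged = toy_mass(hexose, charge=1)  # about 181.0707
```

The monoisotopic value uses the lightest stable isotope of each element, so it is lower than the abundance-weighted average mass.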
Miscellaneous
- Monosaccharide.clone(prop_id=False, fast=True, monosaccharide_type=None)
Copies just this Monosaccharide and its Substituent objects, creating a separate instance with the same data. All mutable data structures are duplicated and distinct from the original.
Does not copy any links, as this would cause recursive duplication of the entire Glycan graph.
- Parameters:
prop_id (bool) – Whether to copy id from self to the new instance.
fast (bool) – Whether to use the fast-path initialization process in Monosaccharide.__init__().
monosaccharide_type (type) – A subclass of Monosaccharide to use.
- Return type:
Monosaccharide
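The copying rules for clone() can be sketched on a toy class: mutable per-residue data is duplicated, while links to neighbouring residues are left out so the copy does not drag the whole graph along. ToyResidue and its attributes are hypothetical, and the choice of None for an id that is not propagated is an assumption of this sketch.

```python
import copy


class ToyResidue:
    """Hypothetical residue with modifications, substituents, and links."""

    def __init__(self, name, id=None):
        self.name = name
        self.id = id
        self.modifications = {}
        self.substituents = {}
        self.links = {}  # position -> neighbouring residues

    def clone(self, prop_id=False):
        duplicate = ToyResidue(self.name, id=self.id if prop_id else None)
        # Deep copies keep the clone's containers independent of the original.
        duplicate.modifications = copy.deepcopy(self.modifications)
        duplicate.substituents = copy.deepcopy(self.substituents)
        # links are deliberately not copied: following them would
        # duplicate every connected residue recursively.
        return duplicate


root = ToyResidue("GlcNAc", id=1)
root.substituents[2] = ["n-acetyl"]
root.links[4] = [ToyResidue("GlcNAc", id=2)]

duplicate = root.clone()
duplicate.substituents[2].append("sulfate")  # does not affect root
```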
Explicit Uncyclized Reducing Ends and Labels
- class glypy.structure.monosaccharide.ReducedEnd(composition=None, substituents=None, valence=1, id=None)
Represents the composition shift and conformation change created by reducing a Monosaccharide.
- Variables:
composition (Composition) – The elemental composition of the reducing end reduction modification.
links (OrderedMultiMap) – The attached substituents.
valence (int) – Number of substituents this node can host.
id (int) – Unique identifier.
There is also a class attribute, name, for comparison with aldi.
- add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)
Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.
- Parameters:
substituent (str or Substituent) – The substituent to add. If passed a str, it will be translated into an instance of Substituent.
position (int or 'x') – The location to add the Substituent link to substituent_links. Defaults to -1.
child_position (int) – The location to add the link to in substituent’s links. Defaults to 1, per the signature. Substituent indices are currently not checked.
max_occupancy (int, optional) – The maximum number of items acceptable at position. Defaults to 0, per the signature.
parent_loss (Composition or str) – The elemental composition removed from self.
child_loss (Composition or str) – The elemental composition removed from substituent.
- Raises:
IndexError – position exceeds the bounds set by superclass.
ValueError – position is occupied by more than max_occupancy elements.
- children()
Returns an iterator over the nodes which are considered the descendants of self.
- clone(prop_id=True)
Make a deep copy of self.
- Parameters:
prop_id (bool) – Whether to copy over id.
- Return type:
ReducedEnd
- drop_substituent(position, substituent=None, refund=True)
Remove the substituent at position.
If substituent is None, then the first substituent found at position is removed.
- Parameters:
position (int) – The position to drop the substituent from.
substituent (Substituent) – The Substituent to remove. If None, the first substituent found at position will be removed.
refund (bool) – Passed to break_link().
- Raises:
IndexError – If position exceeds valence.
ValueError – If substituent is not found at position.
- Returns:
self, for chaining calls
- Return type:
ReducedEnd
- is_occupied(position)
Checks to see if a particular position is occupied by a Substituent.
- Parameters:
position (int) – The position to check for occupancy. Passing -1 checks for undetermined attachments.
- Returns:
The number of occupants at position, or float('inf') if position exceeds valence.
- Return type:
numeric
- mass(average=False, charge=0, mass_data=None)
Calculates the total mass of self.
- Parameters:
average (bool, optional, defaults to False) – Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.
charge (int, optional, defaults to 0) – If charge is non-zero, m/z is calculated, where m is the theoretical mass and z is charge.
mass_data (dict, optional) – If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.
- Return type:
float
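The float('inf') return value documented for ReducedEnd.is_occupied() is worth a short illustration: an infinite occupancy makes any position beyond valence fail a comparison against max_occupancy, whatever limit the caller passes. The sketch below models only that idea. ToyReducedEnd and can_attach() are hypothetical and are not part of glypy.

```python
class ToyReducedEnd:
    """Hypothetical reduced end with a fixed number of attachment slots."""

    def __init__(self, valence=1):
        self.valence = valence
        self.links = {}  # position -> list of attached substituents

    def is_occupied(self, position):
        # Positions past the valence report infinite occupancy.
        if position > self.valence:
            return float("inf")
        return len(self.links.get(position, []))

    def can_attach(self, position, max_occupancy=0):
        # Hypothetical helper: the same comparison an add method would make.
        return self.is_occupied(position) <= max_occupancy


end = ToyReducedEnd(valence=1)
end.links[1] = ["label"]

occupied_count = end.is_occupied(1)
beyond_valence = end.is_occupied(2)
```

Because infinity compares greater than every finite limit, no special case is needed for out-of-range positions in the occupancy check itself.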