Glycoinformatics

Investigators

William S. York, Professor of Biochemistry and Molecular Biology, Senior Investigator
René Ranzinger, Assistant Research Scientist, Project manager
Sena Arpinar, Scientific Computing Professional, Software developer
Brent Weatherly, Scientific Computing Professional, Software developer
Lovina Dmello, Scientific Computing Professional, Software developer

Overview

The Glycoinformatics core has developed software tools, ontologies and databases to facilitate efficient acquisition, processing, analysis, sharing and dissemination of the data produced by the other cores. Several tools and databases have been generated as part of the center and are provided to the community at no charge.

Details

To make the information generated by the center accessible to the scientific community, it is necessary to process, interpret, organize, and store data from diverse sources, and to make them available in a form that is transparent to clients with expertise in a broad range of disciplines. These challenges are being addressed by the Glycoinformatics core. The goal is to develop a bioinformatics resource that includes databases and ontologies along with associated computational tools that facilitate efficient acquisition, description, analysis, sharing and dissemination of the data contained therein. This represents a major challenge, as the potential of these data to explain important biological phenomena will only be fully realized if they are examined in the context of the vast amounts of other data that are becoming available. Therefore, a major emphasis is placed on data structures and tools that have a high degree of interoperability with the computational infrastructure now being developed for the storage and analysis of glycomics, genomics and proteomics data. During the funding period (2013-2018) the Glycoinformatics core worked on several subprojects:

Figure 1: Start logo of GRITS Toolbox version 1.2

GRITS Toolbox is an extensible software system for processing, interpreting and archiving glycomics MS data. This platform allows glycomics MS data (MS/MS, LC-MS/MS, TIM or MS profile data) to be loaded together with metadata describing the project, the analyzed sample and the experimental procedures used. The integrated data interpretation module, called Glycomics Elucidation and Annotation Tool (GELATO), annotates experimental data with glycan and glycan fragment structures from candidate databases. An extensive set of graphical user interfaces can then be used to visualize, review, modify and export the annotated data, or to compare the annotations of different samples side by side. The software can be freely downloaded from the project website. The GRITS software is based on the extensible Eclipse framework, enabling other research groups to develop their own extensions (so-called plugins) that add new functionality to our software. So far, the following collaborative plugin projects have been initiated: (1) Dr. Kazuhiro Aoki: NIH-funded extension for the annotation of glycolipid MS data; (2) Dr. Parastoo Azadi: NIH-funded extension for the annotation of permethylated glycopeptide MS data; and (3) Prof. Ten Feizi: Wellcome Trust-funded extension for the interpretation and visualization of glycan array data.
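The general idea of annotating MS peaks with candidate glycans can be illustrated with a small sketch. This is not GELATO's actual algorithm or code; it is a minimal illustration, assuming native (underivatized) glycans, singly sodiated ions, and a simple ppm tolerance. The function names, the candidate names and the example peak list are hypothetical.

```python
# Illustrative sketch only: match observed m/z values to candidate glycan
# compositions by theoretical mass. Not the GELATO implementation.

# Monoisotopic residue masses (Da) of native, underivatized monosaccharides.
RESIDUE_MASS = {
    "Hex": 162.05282,
    "HexNAc": 203.07937,
    "dHex": 146.05791,
    "NeuAc": 291.09542,
}
WATER = 18.01056       # added once for the free reducing-end glycan
SODIUM_ION = 22.98922  # mass of Na+ (sodium atom minus one electron)


def neutral_mass(composition):
    """Monoisotopic mass of a glycan given as {residue name: count}."""
    return WATER + sum(RESIDUE_MASS[res] * n for res, n in composition.items())


def sodiated_mz(composition, charge=1):
    """Theoretical m/z of the [M + z Na]^z+ ion (assumed adduct type)."""
    return (neutral_mass(composition) + charge * SODIUM_ION) / charge


def annotate(peaks_mz, candidates, tol_ppm=10.0):
    """Return {observed m/z: [names of candidates within tol_ppm]}."""
    annotations = {}
    for mz in peaks_mz:
        hits = []
        for name, composition in candidates.items():
            theoretical = sodiated_mz(composition)
            if abs(mz - theoretical) / theoretical * 1e6 <= tol_ppm:
                hits.append(name)
        annotations[mz] = hits
    return annotations


# Hypothetical candidate database with two N-glycan compositions.
candidates = {
    "Man5GlcNAc2": {"Hex": 5, "HexNAc": 2},
    "Man6GlcNAc2": {"Hex": 6, "HexNAc": 2},
}

# Hypothetical peak list: one peak near sodiated Man5GlcNAc2, one unmatched.
result = annotate([1257.4226, 1000.0], candidates)
for mz, names in result.items():
    print(mz, names)
```

A real annotation tool also has to handle derivatization (for example permethylation), multiple adducts and charge states, and fragment ions, none of which are covered here.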

Figure 2: Three major databases with manually curated glycan structures have been created: a database for GSL glycans (285 structures), a database for O-glycans (217 structures) and a database for N-glycans (1190 structures).

Manually curated glycan structure databases have been developed and populated in collaboration with Prof. Mike Tiemeyer's group and Dr. Kazuhiro Aoki. These databases contain expert-verified mammalian glycan structures and are an integral part of the GRITS Toolbox software. They are currently used for the annotation of glycan MS spectra. There are three major databases: a glycosphingolipid database with 285 glycans, an O-glycan database with 217 glycans and an N-glycan database with 1190 glycans. Although the databases are far from complete, they are freely available and provide a solid basis for users to develop customized databases.

The Minimum Information Required for A Glycomics Experiment (MIRAGE) initiative is an expert group funded by the Beilstein Institute to provide minimum information guidelines for reporting the results of glycomics experiments. The aim of these guidelines is not to tell scientists how to perform their experiments, but rather to list the key information that should be reported along with experimental datasets in order to make them understandable and reproducible. Prof. York and Dr. Ranzinger are both founding members and project coordinators of the MIRAGE project. The group consists of more than 20 experts worldwide and has published multiple guidelines since its start in 2011:

  • Mass spectrometry guideline (PMID: 23378518)
  • Glycan microarray guideline (PMID: 27993942)

Figure 4: Logo of the glycan structures registry - GlyTouCan.

GlyTouCan is the international glycan structure repository. It allows users to register glycan structures and assigns a globally unique accession number to each glycan, independent of the level of information provided (e.g. fully defined structures or incomplete structural representations comprising only the glycan's glycosyl composition or topology without linkage information). A prototype of the glycan registry was developed at the CCRC. When Prof. Aoki-Kinoshita acquired funding from the Japanese government to create a fully functional version, we transferred the source code to our Japanese colleagues, who modified and extended it to create a fully functional repository (GlyTouCan). They also developed a web-based front end that can be used to access the repository via a web browser at https://glytoucan.org.

Key publications

MIRAGE: the minimum information required for a glycomics experiment.

York WS, Agravat S, Aoki-Kinoshita KF, McBride R, Campbell MP, Costello CE, Dell A, Feizi T, Haslam SM, Karlsson N, Khoo KH, Kolarich D, Liu Y, Novotny M, Packer NH, Paulson JC, Rapp E, Ranzinger R, Rudd PM, Smith DF, Struwe WB, Tiemeyer M, Wells L, Zaia J, Kettner C.
Glycobiology. 2014 May;24(5):402-6. doi: 10.1093/glycob/cwu018. Epub 2014 Mar 20.
PMID: 24653214

Qrator: a web-based curation tool for glycan structures.

Eavenson M, Kochut KJ, Miller JA, Ranzinger R, Tiemeyer M, Aoki K, York WS.
Glycobiology. 2015 Jan;25(1):66-73. doi: 10.1093/glycob/cwu090. Epub 2014 Aug 27.
PMID: 25165068

GlyTouCan 1.0 – The international glycan structure repository.

Aoki-Kinoshita K, Agravat S, Aoki NP, Arpinar S, Cummings RD, Fujita A, Fujita N, Hart GM, Haslam SM, Kawasaki T, Matsubara M, Moreman KW, Okuda S, Pierce M, Ranzinger R, Shikanai T, Shinmachi D, Solovieva E, Suzuki Y, Tsuchiya S, Yamada I, York WS, Zaia J, Narimatsu H.
Nucleic Acids Res. 2016 Jan 4;44(D1):D1237-42. doi: 10.1093/nar/gkv1041. Epub 2015 Oct 17.
PMID: 26476458