The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing (...) OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform and new ontologies being created on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved. (shrink)
Biological ontologies are used to organize, curate, and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies Foundry was created to address this by facilitating the development, harmonization, application, and sharing of ontologies, guided by a set of overarching principles. One challenge in reaching these goals was that the (...) OBO principles were not originally encoded in a precise fashion, and interpretation was subjective. Here we show how we have addressed this by formally encoding the OBO principles as operational rules and implementing a suite of automated validation checks and a dashboard for objectively evaluating each ontology’s compliance with each principle. This entailed a substantial effort to curate metadata across all ontologies and to coordinate with individual stakeholders. We have applied these checks across the full OBO suite of ontologies, revealing areas where individual ontologies require changes to conform to our principles. Our work demonstrates how a sizable federated community can be organized and evaluated on objective criteria that help improve overall quality and interoperability, which is vital for the sustenance of the OBO project and towards the overall goals of making data FAIR. Competing Interest StatementThe authors have declared no competing interest. (shrink)
Bio-ontologies are essential tools for accessing and analyzing the rapidly growing pool of plant genomic and phenomic data. Ontologies provide structured vocabularies to support consistent aggregation of data and a semantic framework for automated analyses and reasoning. They are a key component of the Semantic Web. This paper provides background on what bio-ontologies are, why they are relevant to botany, and the principles of ontology development. It includes an overview of ontologies and related resources that are relevant to plant science, (...) with a detailed description of the Plant Ontology (PO). We discuss the challenges of building an ontology that covers all green plants (Viridiplantae). Key results: Ontologies can advance plant science in four keys areas: 1. comparative genetics, genomics, phenomics, and development, 2. taxonomy and systematics, 3. semantic applications and 4. education. Conclusions: Bio-ontologies offer a flexible framework for comparative plant biology, based on common botanical understanding. As genomic and phenomic data become available for more species, we anticipate that the annotation of data with ontology terms will become less centralized, while at the same time, the need for cross-species queries will become more common, causing more researchers in plant science to turn to ontologies. (shrink)
The Planteome project provides a suite of reference and species-specific ontologies for plants and annotations to genes and phenotypes. Ontologies serve as common standards for semantic integration of a large and growing corpus of plant genomics, phenomics and genetics data. The reference ontologies include the Plant Ontology, Plant Trait Ontology, and the Plant Experimental Conditions Ontology developed by the Planteome project, along with the Gene Ontology, Chemical Entities of Biological Interest, Phenotype and Attribute Ontology, and others. The project also provides (...) access to species-specific Crop Ontologies developed by various plant breeding and research communities from around the world. We provide integrated data on plant traits, phenotypes, and gene function and expression from 95 plant taxa, annotated with reference ontology terms. (shrink)
As biological and biomedical research increasingly reference the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO’s motivation, (...) content, structure, adoption, and governance approach. (shrink)
The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are (1) to help unify the divergent and isolated efforts in ontology development by promoting high quality open-source, standards-based tools to create, manage, and use ontologies, (2) to create (...) new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists to work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and understand human disease. (shrink)
The Plant Ontology (PO) is a community resource consisting of standardized terms, definitions, and logical relations describing plant structures and development stages, augmented by a large database of annotations from genomic and phenomic studies. This paper describes the structure of the ontology and the design principles we used in constructing PO terms for plant development stages. It also provides details of the methodology and rationale behind our revision and expansion of the PO to cover development stages for all plants, particularly (...) the land plants (bryophytes through angiosperms). As a case study to illustrate the general approach, we examine variation in gene expression across embryo development stages in Arabidopsis and maize, demonstrating how the PO can be used to compare patterns of expression across stages and in developmentally different species. Although many genes appear to be active throughout embryo development, we identified a small set of uniquely expressed genes for each stage of embryo development and also between the two species. Evaluating the different sets of genes expressed during embryo development in Arabidopsis or maize may inform future studies of the divergent developmental pathways observed in monocotyledonous versus dicotyledonous species. The PO and its annotation databasemake plant data for any species more discoverable and accessible through common formats, thus providing support for applications in plant pathology, image analysis, and comparative development and evolution. (shrink)
The Common Anatomy Reference Ontology (CARO) is being developed to facilitate interoperability between existing anatomy ontologies for different species, and will provide a template for building new anatomy ontologies. CARO has a structural axis of classification based on the top-level nodes of the Foundational Model of Anatomy. CARO will complement the developmental process sub-ontology of the GO Biological Process ontology, using it to ensure the coherent treatment of developmental stages, and to provide a common framework for the model organism communities (...) to classify developmental structures. Definitions for the types and relationships are being generated by a consortium of investigators from diverse backgrounds to ensure applicability to all organisms. CARO will support the coordination of cross-species ontologies at all levels of anatomical granularity by cross-referencing types within the cell type ontology (CL) and the Gene Ontology (GO) Cellular Component ontology. A complete cross-species CARO could be utilized in other ontologies for cross-product generation. (shrink)
Vaccine research, as well as the development, testing, clinical trials, and commercial uses of vaccines involve complex processes with various biological data that include gene and protein expression, analysis of molecular and cellular interactions, study of tissue and whole body responses, and extensive epidemiological modeling. Although many data resources are available to meet different aspects of vaccine needs, it remains a challenge how we are to standardize vaccine annotation, integrate data about varied vaccine types and resources, and support advanced vaccine (...) data analysis and inference. To address these problems, the community-based Vaccine Ontology (VO) has been developed through collaboration with vaccine researchers and many national and international centers and programs, including the National Center for Biomedical Ontology (NCBO), the Infectious Disease Ontology (IDO) Initiative, and the Ontology for Biomedical Investigations (OBI). VO utilizes the Basic Formal Ontology (BFO) as the top ontology and the Relation Ontology (RO) for definition of term relationships. VO is represented in the Web Ontology Language (OWL) and edited using the Protégé-OWL. Currently VO contains more than 2000 terms and relationships. VO emphasizes on classification of vaccines and vaccine components, vaccine quality and phenotypes, and host immune response to vaccines. These reflect different aspects of vaccine composition and biology and can thus be used to model individual vaccines. More than 200 licensed vaccines and many vaccine candidates in research or clinical trials have been modeled in VO. VO is being used for vaccine literature mining through collaboration with the National Center for Integrative Biomedical Informatics (NCIBI). Multiple VO applications will be presented. (shrink)