The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies or ‘ontologies’. Unfortunately, the very success of this approach has led to a proliferation of ontologies which itself creates obstacles to integration. The Open Biomedical Ontologies (OBO) consortium has set in train a strategy to overcome this problem. Existing (...) OBO ontologies, including the Gene Ontology, are undergoing a process of coordinated reform and new ontologies being created on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable, logically well-formed, and to incorporate accurate representations of biological reality. We describe the OBO Foundry initiative, and provide guidelines for those who might wish to become involved. (shrink)
Throughout the biological and biomedical sciences there is a growing need for, prescriptive ‘minimum information’ (MI) checklists specifying the key information to include when reporting experimental results are beginning to find favor with experimentalists, analysts, publishers and funders alike. Such checklists aim to ensure that methods, data, analyses and results are described to a level sufficient to support the unambiguous interpretation, sophisticated search, reanalysis and experimental corroboration and reuse of data sets, facilitating the extraction of maximum value from data sets (...) them. However, such ‘minimum information’ MI checklists are usually developed independently by groups working within representatives of particular biologically- or technologically-delineated domains. Consequently, an overview of the full range of checklists can be difficult to establish without intensive searching, and even tracking thetheir individual evolution of single checklists may be a non-trivial exercise. Checklists are also inevitably partially redundant when measured one against another, and where they overlap is far from straightforward. Furthermore, conflicts in scope and arbitrary decisions on wording and sub-structuring make integration difficult. This presents inhibit their use in combination. Overall, these issues present significant difficulties for the users of checklists, especially those in areas such as systems biology, who routinely combine information from multiple biological domains and technology platforms. To address all of the above, we present MIBBI (Minimum Information for Biological and Biomedical Investigations); a web-based communal resource for such checklists, designed to act as a ‘one-stop shop’ for those exploring the range of extant checklist projects, and to foster collaborative, integrative development and ultimately promote gradual integration of checklists. (shrink)
As biological and biomedical research increasingly reference the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO’s motivation, (...) content, structure, adoption, and governance approach. (shrink)
The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are (1) to help unify the divergent and isolated efforts in ontology development by promoting high quality open-source, standards-based tools to create, manage, and use ontologies, (2) to create (...) new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists to work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and understand human disease. (shrink)
Biological ontologies are used to organize, curate, and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies Foundry was created to address this by facilitating the development, harmonization, application, and sharing of ontologies, guided by a set of overarching principles. One challenge in reaching these goals was that the (...) OBO principles were not originally encoded in a precise fashion, and interpretation was subjective. Here we show how we have addressed this by formally encoding the OBO principles as operational rules and implementing a suite of automated validation checks and a dashboard for objectively evaluating each ontology’s compliance with each principle. This entailed a substantial effort to curate metadata across all ontologies and to coordinate with individual stakeholders. We have applied these checks across the full OBO suite of ontologies, revealing areas where individual ontologies require changes to conform to our principles. Our work demonstrates how a sizable federated community can be organized and evaluated on objective criteria that help improve overall quality and interoperability, which is vital for the sustenance of the OBO project and towards the overall goals of making data FAIR. Competing Interest StatementThe authors have declared no competing interest. (shrink)
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that (...) has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility. (shrink)