How Biologists Map the Evolution of Life's Vocabulary
Imagine a library containing every scientific discovery about biology, from the inner workings of a cell to the complex interactions between organisms. Now imagine that every book in this library uses a slightly different language, with the same words meaning different things in different contexts. This is the challenge facing modern biologists in the age of big data.
Biological ontologies—structured vocabularies that define concepts and their relationships—have emerged as a powerful solution to this problem. But like living organisms themselves, these ontologies are not static; they evolve, expand, and adapt over time.
In this article, we'll explore how scientists track the temporal distribution of terminologies in biological ontologies and why this process is crucial for everything from understanding disease to protecting against future pandemics.
In information science, an ontology is a formal representation of knowledge within a domain, consisting of concepts and their relationships 2 . Think of it as a detailed map of how ideas connect to one another. In biology, ontologies provide standardized terms that allow researchers worldwide to speak the same language when describing their findings. The most famous of these is the Gene Ontology (GO), which classifies gene functions across species 4 .
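To make "concepts and their relationships" concrete, here is a minimal sketch of an ontology as a small graph in Python. The term IDs (`EX:...`), the labels, and the `ancestors` helper are all hypothetical and chosen for illustration; real ontologies such as GO are distributed in dedicated formats and carry many more relationship types than the single "is a" link shown here.

```python
from dataclasses import dataclass

@dataclass
class Term:
    term_id: str
    label: str
    is_a: tuple = ()  # IDs of the more general parent terms

# Hypothetical toy ontology; these are NOT real Gene Ontology identifiers.
ontology = {
    "EX:0001": Term("EX:0001", "biological process"),
    "EX:0002": Term("EX:0002", "signaling", is_a=("EX:0001",)),
    "EX:0003": Term("EX:0003", "immune signaling", is_a=("EX:0002",)),
}

def ancestors(term_id, terms):
    """Return every term reachable by following is_a links upward."""
    seen = set()
    stack = list(terms[term_id].is_a)
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms[parent].is_a)
    return seen
```

Calling `ancestors("EX:0003", ontology)` walks from the specific term up to the most general one, which is what lets a search for a broad concept also find data annotated with its narrower descendants.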
These ontologies don't spring to life fully formed—they develop over time through community efforts. The OBO Foundry serves as a central hub for these ontologies, establishing principles to ensure they work together harmoniously 4 7 . Like a standards organization for the language of biology, the OBO Foundry ensures that an anatomy term means the same thing to a fly geneticist as it does to a human medical researcher.
The temporal distribution of terminologies—how terms are added, modified, or retired over time—reveals important patterns in scientific progress. New terms often emerge in response to technological advances or groundbreaking discoveries. For instance, the COVID-19 pandemic prompted the development of the Coronavirus Infectious Disease Ontology (CIDO) to standardize research on the novel virus 7 .
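One simple way to picture a temporal distribution is to count how many terms were added in each year and how many were still active at a given point. The sketch below assumes a hypothetical list of records, each holding a placeholder term ID, the year the term was added, and the year it was retired (or `None`). The data and function names are invented for illustration and do not come from any real ontology release.

```python
from collections import Counter

# Hypothetical records: (term ID, year added, year retired or None).
history = [
    ("EX:0001", 2018, None),
    ("EX:0002", 2018, 2021),
    ("EX:0003", 2020, None),
    ("EX:0004", 2020, None),
    ("EX:0005", 2021, None),
]

def additions_per_year(records):
    """Count how many terms were added in each year."""
    return Counter(added for _, added, _ in records)

def active_terms(records, year):
    """Count terms that exist and are not yet retired at the end of `year`."""
    return sum(
        1
        for _, added, retired in records
        if added <= year and (retired is None or retired > year)
    )
```

Plotting `additions_per_year(history)` against time would show bursts of new vocabulary, while `active_terms` tracks the net size of the ontology once retirements are taken into account.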
Tracking these changes is not merely academic; it has real-world implications. Beginning in 2026, new U.S. government guidelines will require synthetic DNA providers to screen orders for sequences known to contribute to pathogenicity 1 9 . Without up-to-date ontologies that accurately classify dangerous sequences, this crucial biosecurity effort would be impossible.
The expansion of biological ontologies reflects the accelerating pace of biological discovery and the increasing need for standardized terminology.
When government policy called for screening synthetic DNA sequences for pathogenic potential, researchers faced a problem: existing biological ontologies lacked the granular terminology to distinguish dangerous sequences from harmless ones 1 9 . A team of scientists therefore set out to expand the Gene Ontology's terms describing microbial pathogenesis.
1. The team canvassed thousands of scientific publications investigating microbial pathogens of humans, animals, and plants 1 .
2. They collected sequences that enable microbes to exploit hosts, often called virulence factors 9 .
3. Sequences were compared and grouped according to which host biological processes they subvert and the consequences for the host 1 9 .
4. Based on these groupings, the team developed new terms to capture varied pathogenic functions 9 .
5. These terms were systematically applied to their dataset of sequences to test their utility 1 .
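The grouping and annotation steps described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the team's actual pipeline: the sequence names, the host-process labels, the placeholder term IDs, and both function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical curated records: (sequence name, host process it subverts).
curated = [
    ("seqA", "toll-like receptor signaling"),
    ("seqB", "interferon-mediated signaling"),
    ("seqC", "toll-like receptor signaling"),
    ("seqD", "autophagy"),
]

# Hypothetical mapping from a host process to a placeholder term ID.
term_for_process = {
    "toll-like receptor signaling": "EX:0101",
    "interferon-mediated signaling": "EX:0102",
}

def group_by_host_process(records):
    """Group sequence names by the host process they subvert."""
    groups = defaultdict(list)
    for name, process in records:
        groups[process].append(name)
    return dict(groups)

def annotate(records, term_map):
    """Apply terms to sequences; collect sequences that no term covers yet."""
    annotated, unannotated = {}, []
    for name, process in records:
        term = term_map.get(process)
        if term is None:
            unannotated.append(name)
        else:
            annotated[name] = term
    return annotated, unannotated
```

In this toy setup, any sequence left in the `unannotated` list signals a gap in the vocabulary, which is the kind of feedback that applying new terms to a dataset is meant to provide.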
The project resulted in a significant expansion of the Gene Ontology, particularly within the framework of "symbiont-mediated perturbation of host process" 9 . The researchers developed specialized terms that precisely describe how pathogens interfere with host immune systems. For example, while a broad term like "subverting host innate immune signaling" existed before, the team created more specific terms that distinguish between different mechanisms of immune evasion 1 .
| Ontology Term ID | Biological Process Description |
|---|---|
| GO:0141105 | Symbiont-mediated suppression of host toll-like receptor signal transduction |
| GO:0140886 | Symbiont-mediated suppression of host interferon-mediated signaling pathway |
| GO:0141074 | Symbiont-mediated suppression of host cGAS-STING signal transduction |
| GO:0085034 | Symbiont-mediated suppression of host NF-κB cascade |
These more specific terms directly support the 2026 biosecurity guidelines by giving synthetic DNA providers the vocabulary they need to rapidly assess customer orders for pathogenic capacity 9 .
| Time Period | Development in Pathogenesis Ontology |
|---|---|
| Pre-2018 | Limited terminology for nonviral pathogenic sequences |
| 2018-2022 | Creation of preliminary FunSoCs and PathGO vocabularies |
| 2022-2024 | Integration of terms into Gene Ontology framework |
| October 2026 | Planned implementation of new U.S. synthetic DNA screening guidelines |
Building and maintaining biological ontologies requires a specialized set of conceptual and software tools. These resources enable researchers to create, visualize, and manage the complex networks of terms that constitute modern ontologies.
| Tool Name | Type | Primary Function |
|---|---|---|
| OBO-Edit | Software | Editing ontologies in OBO format 4 |
| Protégé | Software | Ontology editor and knowledge base framework 2 |
| Relation Ontology (RO) | Conceptual Framework | Standardized relationships for OBO ontologies 8 |
| Basic Formal Ontology (BFO) | Conceptual Framework | Upper-level ontology providing high-level categories 7 |
| VOWL | Visualization | Visual Notation for OWL Ontologies 2 |
| GraphDB | Database | Triple store for handling massive ontology datasets |
The Relation Ontology (RO) deserves special mention as it provides the "grammar" that connects terms across different biological ontologies. With over 140 OBO Foundry ontologies using RO, it serves as the linguistic backbone for interdisciplinary research in biology 8 . RO defines fundamental relationships like "part of," "located in," and "has function," allowing computers to reason across different biological domains.
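To illustrate what "reasoning" over such relationships can mean, the sketch below infers new "part of" facts from stated ones by following chains, relying on the fact that "part of" is a transitive relation. The example triples and the `transitive_closure` function are assumptions made for illustration; production reasoners work on full OWL semantics rather than a hand-written loop like this one.

```python
# Illustrative (subject, relation, object) triples; simplified for the example.
triples = [
    ("mitochondrial membrane", "part of", "mitochondrion"),
    ("mitochondrion", "part of", "cytoplasm"),
    ("cytoplasm", "part of", "cell"),
    ("enzyme X", "located in", "mitochondrion"),  # hypothetical entity
]

def transitive_closure(triples, relation):
    """Return all stated and inferred triples for one transitive relation."""
    edges = {}
    for subject, rel, obj in triples:
        if rel == relation:
            edges.setdefault(subject, set()).add(obj)

    inferred = set()
    for start in edges:
        seen = set()
        stack = list(edges[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(edges.get(node, ()))
        inferred.update((start, relation, target) for target in seen)
    return inferred
```

Running `transitive_closure(triples, "part of")` yields facts that were never written down explicitly, such as the mitochondrial membrane being part of the cell. Triples using other relations are ignored here, since each relation carries its own logical rules.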
Community collaboration tools are equally important. The OBO Foundry Dashboard helps track ontology alignment with OBO principles 8 , while regular community events like the International Conference on Biological and Biomedical Ontology (ICBO) provide forums for researchers to coordinate their efforts 5 .
OBO-Edit: Specialized editor for OBO format ontologies with visualization capabilities.
Protégé: Comprehensive ontology development platform with extensive plugin ecosystem.
Relation Ontology (RO): Standardized relationship framework connecting biological concepts.
"An ontology is a description of the concepts and relationships that can formally exist for an agent or a community of agents" 2 .
This definition highlights the living nature of ontologies: they evolve as our understanding deepens and new challenges emerge. The expansion of pathogenesis terminology exemplifies this broader trend, in which terms are continuously refined and specialized to meet emerging scientific and policy needs.
The recent approval of Principle 19 (Stability of Term Meaning) by the OBO Foundry underscores the community's awareness of both the need for evolution and the importance of stability 8 . Like a constitution for biological language, this principle sets standards for preserving semantic consistency even as ontologies grow.
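One way to see the tension between growth and stability is to compare two releases of an ontology and separate terms that were added or removed from terms whose definitions changed. The sketch below is a hypothetical check, not an OBO Foundry tool: the release contents, the placeholder IDs, and the `diff_releases` function are invented for illustration.

```python
# Hypothetical releases: placeholder term ID -> textual definition.
release_old = {
    "EX:0001": "A process by which a symbiont alters a host process.",
    "EX:0002": "Suppression of host immune signaling by a symbiont.",
    "EX:0003": "A term that is dropped in the next release.",
}
release_new = {
    "EX:0001": "A process by which a symbiont alters a host process.",
    "EX:0002": "Suppression of host innate immune signaling by a symbiont.",
    "EX:0004": "A newly added, more specific term.",
}

def diff_releases(old, new):
    """Report terms added, removed, or redefined between two releases."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        term for term in old.keys() & new.keys() if old[term] != new[term]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

Additions are the normal sign of an ontology growing, whereas entries in the `changed` list are the ones a curator would want to review by hand, because a reworded definition may or may not alter what the term means.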
As biological research generates ever more data, the role of ontologies in making this information findable, accessible, interoperable, and reusable (FAIR) will only increase 5 . The temporal patterns of terminology addition—which concepts emerge when—provide a unique window into the progress of science itself, revealing how our understanding of life becomes increasingly sophisticated with each passing year.
From protecting against engineered pathogens to integrating knowledge across species, the careful tracking and visualization of terminology distribution in biological ontologies represents more than an academic exercise—it is fundamental to the future of biological discovery and its application to human health.