Visualizing the Flow of Knowledge

How Biologists Map the Evolution of Life's Vocabulary

Introduction

Imagine a library containing every scientific discovery about biology, from the inner workings of a cell to the complex interactions between organisms. Now imagine that every book in this library uses a slightly different language, with the same words meaning different things in different contexts. This is the challenge facing modern biologists in the age of big data.

Biological ontologies—structured vocabularies that define concepts and their relationships—have emerged as a powerful solution to this problem. But like living organisms themselves, these ontologies are not static; they evolve, expand, and adapt over time.

In this article, we'll explore how scientists track the temporal distribution of terminologies in biological ontologies and why this process is crucial for everything from understanding disease to protecting against future pandemics.

What Are Biological Ontologies and Why Do They Change?

The Science of Organizing Biological Knowledge

In information science, an ontology is a formal representation of knowledge within a domain, consisting of concepts and their relationships [2]. Think of it as a detailed map of how ideas connect to one another. In biology, ontologies provide standardized terms that allow researchers worldwide to speak the same language when describing their findings. The most famous of these is the Gene Ontology (GO), which describes the functions of genes and their products across species [4].
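
To make "concepts and their relationships" concrete, here is a minimal sketch that models a toy ontology as a dictionary of terms, each pointing to its parents through an is_a link. The `Term` class, the `ancestors` helper, and the `EX:` identifiers are all invented for illustration; they are not real Gene Ontology entries or any library's API.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One concept in a toy ontology (hypothetical structure)."""
    term_id: str
    name: str
    is_a: list[str] = field(default_factory=list)  # IDs of parent terms


# Invented example terms; real ontologies hold tens of thousands.
ONTOLOGY = {
    "EX:0001": Term("EX:0001", "biological process"),
    "EX:0002": Term("EX:0002", "signaling", ["EX:0001"]),
    "EX:0003": Term("EX:0003", "immune signaling", ["EX:0002"]),
}


def ancestors(term_id: str, ontology: dict[str, Term]) -> set[str]:
    """Return every term reachable by following is_a links upward."""
    found: set[str] = set()
    stack = list(ontology[term_id].is_a)
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(ontology[parent].is_a)
    return found


print(sorted(ancestors("EX:0003", ONTOLOGY)))  # ['EX:0001', 'EX:0002']
```

Walking upward like this is what lets a search for a broad concept also retrieve everything annotated with its more specific descendants.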

These ontologies don't spring to life fully formed; they develop over time through community efforts. The OBO Foundry serves as a central hub for these ontologies, establishing principles to ensure they work together harmoniously [4, 7]. Like a standards organization for the language of biology, the OBO Foundry ensures that an anatomy term means the same thing to a fly geneticist as it does to a human medical researcher.

Why Timing Matters in Terminology

The temporal distribution of terminologies (how terms are added, modified, or retired over time) reveals important patterns in scientific progress. New terms often emerge in response to technological advances or groundbreaking discoveries. For instance, the COVID-19 pandemic prompted the development of the Coronavirus Infectious Disease Ontology (CIDO) to standardize research on the novel virus [7].
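
One simple way to observe terms being added, modified, or retired is to compare two snapshots of the same ontology. The sketch below assumes each snapshot has already been reduced to a mapping from term ID to definition text; the `diff_releases` function and the sample data are hypothetical.

```python
def diff_releases(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {term_id: definition} snapshots of an ontology."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "retired": sorted(old.keys() - new.keys()),
        "modified": sorted(
            tid for tid in old.keys() & new.keys() if old[tid] != new[tid]
        ),
    }


# Invented snapshots of the same toy ontology at two points in time.
release_2020 = {
    "EX:0001": "Evasion of host immunity.",
    "EX:0002": "Entry into host cell.",
    "EX:0003": "Legacy catch-all term.",
}
release_2024 = {
    "EX:0001": "Evasion of the host immune response by a symbiont.",
    "EX:0002": "Entry into host cell.",
    "EX:0004": "Suppression of host interferon signaling.",
}

changes = diff_releases(release_2020, release_2024)
print(changes)
# {'added': ['EX:0004'], 'retired': ['EX:0003'], 'modified': ['EX:0001']}
```

In practice, OBO-format ontologies usually flag a retired term as obsolete rather than deleting it, so a real comparison would likely check that flag as well as missing IDs.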

Tracking these changes is not merely academic; it has real-world implications. Beginning in 2026, new U.S. government guidelines will require synthetic DNA providers to screen orders for sequences known to contribute to pathogenicity [1, 9]. Without up-to-date ontologies that accurately classify dangerous sequences, this crucial biosecurity effort would be impossible.

Growth of Biological Ontologies Over Time

The expansion of biological ontologies reflects the accelerating pace of biological discovery and the increasing need for standardized terminology.

Case Study: Expanding the Ontology of Pathogenesis

The Experiment: Mapping the Language of Infection

When government policies identified a need to screen synthetic DNA sequences for pathogenic potential, researchers faced a problem: existing biological ontologies lacked the granular terminology to distinguish dangerous sequences from harmless ones [1, 9]. A team of scientists undertook a project to expand the Gene Ontology's terms describing microbial pathogenesis.

Methodology

1. Literature review: The team canvassed thousands of scientific publications investigating microbial pathogens of humans, animals, and plants [1].
2. Sequence collection: They collected sequences that enable microbes to exploit hosts, often called virulence factors [9].
3. Functional grouping: Sequences were compared and grouped according to which host biological processes they subvert and the consequences for the host [1, 9].
4. Term development: Based on these groupings, the team developed new terms to capture varied pathogenic functions [9].
5. Application and testing: These terms were systematically applied to their dataset of sequences to test their utility [1].
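
The functional grouping step can be pictured as bucketing sequences by the pair (host process subverted, consequence for the host). The sketch below is an illustration only, not the team's actual pipeline: the records, labels, and the `group_by_function` helper are all invented.

```python
from collections import defaultdict

# Invented records: (sequence label, host process subverted, consequence for host).
records = [
    ("seq_A", "innate immune signaling", "immune evasion"),
    ("seq_B", "innate immune signaling", "immune evasion"),
    ("seq_C", "cytoskeleton organization", "cell invasion"),
]


def group_by_function(rows):
    """Group sequence labels by (host process, consequence) pairs."""
    groups = defaultdict(list)
    for label, process, consequence in rows:
        groups[(process, consequence)].append(label)
    return dict(groups)


groups = group_by_function(records)
for key, members in groups.items():
    print(key, members)
```

Each resulting bucket is a candidate for its own ontology term: a group that holds many sequences but has no matching term suggests a gap in the vocabulary.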

What the Researchers Discovered

The project resulted in a significant expansion of the Gene Ontology, particularly within the framework of "symbiont-mediated perturbation of host process" [9]. The researchers developed specialized terms that precisely describe how pathogens interfere with host immune systems. For example, while a broad term like "subverting host innate immune signaling" existed before, the team created more specific terms that distinguish between different mechanisms of immune evasion [1].

Examples of New Specific Immune Evasion Terms

| Ontology Term ID | Biological Process Description |
| --- | --- |
| GO:0141105 | Symbiont-mediated suppression of host toll-like receptor signal transduction |
| GO:0140886 | Symbiont-mediated suppression of host interferon-mediated signaling pathway |
| GO:0141074 | Symbiont-mediated suppression of host cGAS-STING signal transduction |
| GO:0085034 | Symbiont-mediated suppression of host NF-κB cascade |

This directly supports the 2026 biosecurity guidelines by giving synthetic DNA providers the vocabulary needed to rapidly assess sequences ordered by their customers for pathogenic capacity [9].

Timeline of Pathogenesis Terminology Development

| Time Period | Development in Pathogenesis Ontology |
| --- | --- |
| Pre-2018 | Limited terminology for nonviral pathogenic sequences |
| 2018-2022 | Creation of preliminary FunSoCs and PathGO vocabularies |
| 2022-2024 | Integration of terms into Gene Ontology framework |
| October 2026 | Planned implementation of new U.S. synthetic DNA screening guidelines |

[Figure: Distribution of Pathogenesis Terms by Category]

The Scientist's Toolkit: Resources for Ontology Development

Building and maintaining biological ontologies requires a specialized set of conceptual and software tools. These resources enable researchers to create, visualize, and manage the complex networks of terms that constitute modern ontologies.

Essential Tools for Ontology Development and Visualization

| Tool Name | Type | Primary Function |
| --- | --- | --- |
| OBO-Edit | Software | Editing ontologies in OBO format [4] |
| Protégé | Software | Ontology editor and knowledge base framework [2] |
| Relation Ontology (RO) | Conceptual Framework | Standardized relationships for OBO ontologies [8] |
| Basic Formal Ontology (BFO) | Conceptual Framework | Upper-level ontology providing high-level categories [7] |
| VOWL | Visualization | Visual Notation for OWL Ontologies [2] |
| GraphDB | Database | Triple store for handling massive ontology datasets |

The Relation Ontology (RO) deserves special mention as it provides the "grammar" that connects terms across different biological ontologies. With over 140 OBO Foundry ontologies using RO, it serves as the linguistic backbone for interdisciplinary research in biology [8]. RO defines fundamental relationships like "part of," "located in," and "has function," allowing computers to reason across different biological domains.
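
As a hedged illustration of what "allowing computers to reason" can mean, the sketch below treats "part of" as a transitive relation and derives the facts it implies from a few stated ones. The triples and the `infer_transitive` helper are invented for the example and do not use RO's actual identifiers or any reasoner's API.

```python
# Invented (subject, relation, object) triples in the spirit of RO relations.
TRIPLES = [
    ("mitochondrion", "part_of", "cytoplasm"),
    ("cytoplasm", "part_of", "cell"),
    ("cell", "part_of", "tissue"),
]


def infer_transitive(triples, relation):
    """Return all (subject, object) pairs implied by a transitive relation."""
    direct = {}
    for subj, rel, obj in triples:
        if rel == relation:
            direct.setdefault(subj, set()).add(obj)

    inferred = set()
    for start in direct:
        stack = list(direct[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(direct.get(node, ()))
        inferred.update((start, target) for target in seen)
    return inferred


pairs = infer_transitive(TRIPLES, "part_of")
print(("mitochondrion", "tissue") in pairs)  # True
```

Nobody stated that the mitochondrion is part of the tissue; the program derived it. Shared relation definitions are what make this kind of inference safe across ontologies maintained by different groups.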

Community collaboration tools are equally important. The OBO Foundry Dashboard helps track ontology alignment with OBO principles [8], while regular community events like the International Conference on Biological and Biomedical Ontology (ICBO) provide forums for researchers to coordinate their efforts [5].

OBO-Edit: Specialized editor for OBO format ontologies with visualization capabilities.

Protégé: Comprehensive ontology development platform with extensive plugin ecosystem.

Relation Ontology: Standardized relationship framework connecting biological concepts.

The Future of Biological Ontologies

"An ontology is a description of the concepts and relationships that can formally exist for an agent or a community of agents" [2].

This definition highlights the living nature of ontologies: they evolve as our understanding deepens and new challenges emerge. The expansion of pathogenesis terminology exemplifies this broader trend, the continuous refinement and specialization of terms to meet emerging scientific and policy needs.

The recent approval of Principle 19 (Stability of Term Meaning) by the OBO Foundry underscores the community's awareness of both the need for evolution and the importance of stability [8]. Like a constitution for biological language, this principle sets standards for preserving semantic consistency even as ontologies grow.

Looking Ahead

As biological research generates ever more data, the role of ontologies in making this information findable, accessible, interoperable, and reusable (FAIR) will only increase [5]. The temporal patterns of terminology addition (which concepts emerge when) provide a unique window into the progress of science itself, revealing how our understanding of life becomes increasingly sophisticated with each passing year.
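
A first look at "which concepts emerge when" can be as simple as counting terms by creation year. The sketch below assumes the ontology is available as OBO-style text in which each term carries a `creation_date` line; not every term in a real file will have one. The sample stanzas, dates, and the `terms_per_year` helper are invented.

```python
from collections import Counter

# Invented OBO-style stanzas; the creation_date values are made up.
SAMPLE_OBO = """\
[Term]
id: EX:0001
name: example process A
creation_date: 2019-03-02T10:00:00Z

[Term]
id: EX:0002
name: example process B
creation_date: 2022-07-15T09:30:00Z

[Term]
id: EX:0003
name: example process C
creation_date: 2022-11-01T14:45:00Z
"""


def terms_per_year(obo_text: str) -> Counter:
    """Count terms by the year in their creation_date line."""
    counts: Counter = Counter()
    for line in obo_text.splitlines():
        if line.startswith("creation_date:"):
            value = line.split(":", 1)[1].strip()
            counts[int(value[:4])] += 1
    return counts


counts = terms_per_year(SAMPLE_OBO)
for year in sorted(counts):
    print(year, counts[year])
```

Plotting these yearly counts, overall or per branch of the ontology, gives the kind of growth curve that shows where a field's vocabulary is expanding fastest.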

From protecting against engineered pathogens to integrating knowledge across species, the careful tracking and visualization of terminology distribution in biological ontologies represents more than an academic exercise—it is fundamental to the future of biological discovery and its application to human health.

References