Methods Description for CompBio

To reference CompBio in a publication, please use the following text:


CompBio 1.0, PercayAI, LLC, 4220 Duncan Ave Suite 201, St. Louis, MO 63110 USA


Tool Description

CompBio performs a literature analysis to identify relevant biological processes and
pathways represented by the differentially expressed entities (genes, proteins, miRNAs,
or metabolites). This is accomplished with an automated Biological Knowledge
Generation Engine (BKGE) that extracts all abstracts from PubMed that reference
entities of interest (or their synonyms), using contextual language processing and a
biological language dictionary that is not restricted to fixed pathway and ontology
knowledge bases. Conditional probability analysis is utilized to compute the statistical
enrichment of biological concepts (processes/pathways) over those that occur by
random sampling. Related concepts built from the list of differentially expressed entities
are further clustered into higher-level themes (e.g., biological pathways/processes, cell
types and structures, etc.).


Scoring Description

Within CompBio, scoring of gene, concept, and overall theme enrichment is
accomplished using a multi-component function referred to as the Normalized
Enrichment Score (NES). The first component utilizes an empirical p-value derived
from several thousand random entity lists of comparable size to the user's input entity
list to define the rarity of a given entity-concept event. The second component,
effectively representing the fold enrichment, is based on the ratio of the concept
enrichment score to the mean of that concept’s enrichment score across the set of
randomized entity data. As such, the NES reflects both the rarity of the concept event
associated with an entity list, as well as its degree of overall enrichment. Based on
these empirical criteria, observed entity-concept scores above 10.0, 100.0, and 1,000.0
are labeled as moderate, marked, or high, respectively, in level of enrichment above
random. Themes scoring above 500.0, 1,000.0, and 5,000.0 are labeled similarly.
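
The two scoring components and the labeling thresholds described above can be illustrated with a minimal Python sketch. This is not CompBio's implementation: the function names, the add-one smoothing of the empirical p-value, and the use of None for scores below the lowest threshold are all assumptions made for illustration. The way CompBio combines the two components into the final NES is not specified here, so the sketch leaves them separate.

```python
from statistics import mean


def empirical_p_value(observed, random_scores):
    """Share of random-list scores at least as large as the observed score.

    Add-one smoothing (an assumption) keeps the estimate above zero.
    """
    hits = sum(1 for score in random_scores if score >= observed)
    return (hits + 1) / (len(random_scores) + 1)


def fold_enrichment(observed, random_scores):
    """Ratio of the observed score to the mean score over the random lists."""
    return observed / mean(random_scores)


def enrichment_label(score, thresholds=(10.0, 100.0, 1000.0)):
    """Map a score to a label using the thresholds quoted in the text.

    Defaults are the entity-concept thresholds; pass (500.0, 1000.0, 5000.0)
    for themes. Returns None below the lowest threshold (an assumption, since
    the text does not name that case).
    """
    label = None
    for threshold, name in zip(thresholds, ("moderate", "marked", "high")):
        if score > threshold:
            label = name
    return label


# Toy numbers only; real runs would use several thousand random entity lists.
random_scores = [1.0, 2.0, 3.0, 4.0]
print(empirical_p_value(5.0, random_scores))  # 0.2
print(fold_enrichment(5.0, random_scores))    # 2.0
print(enrichment_label(150.0))                # marked
```

With only four random lists, the smallest p-value this estimator can report is 0.2, which is one reason a large number of random lists is needed to resolve rare events.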


© 2019 PercayAI  
