• Gizem Özbek
  • August 12th 2021


Under the hg38 assembly, 20,442 coding genes, 23,982 non-coding genes and 15,228 pseudogenes are annotated (1), and genome-wide analyses such as whole exome/genome sequencing (WES/WGS) can report variants in any of them. Knowing the phenotypic associations of these genes in humans and model organisms is critical when interpreting the different types of variants (SNPs, indels, SVs) found in them.

Several open databases that catalog gene-phenotype relationships and describe the phenotypes in detail are actively used in computational and clinical workflows. Among these resources, OMIM (Online Mendelian Inheritance in Man®) and Orphanet are the first to come to mind, particularly for single-gene diseases. On average, 741 new titles are added to the OMIM database each year and 7,381 are updated (Table 1 & Figure 1).


Table 1. Newly added and updated title statistics in OMIM per year (2)














Figure 1. Distribution of newly added and updated statistics in OMIM (2)

The phenotypic associations of genes with Mendelian inheritance (~6000) are becoming clearer every day thanks to the widespread use of next-generation sequencing in routine medical practice (Figure 2). However, curation-based open databases such as OMIM are not enough on their own: automated literature mining followed by expert curation is also necessary to keep up with this rate of discovery.

Figure 2. OMIM Entry Statistics - Number of Entries in OMIM (3)


HOPE (Harmonization of Ontologies by PairEnd)

HOPE is a semi-automatic, curated clinical knowledgebase. The clinical terms that make up phenotypes are gathered from different ontologies [HPO (4), SNOMED CT (5), Orphanet (6), MeSH (7), etc.] through literature mining, and their relationships with genes are then curated. HOPE is therefore one of the most critical components feeding the GENIUS algorithm, which produces the list of clinically relevant candidate variants.

NGS Cloud accepts clinical information as HPO terms. In the backend, however, phenotype-gene matching relies not only on the HPO annotations but also on their extension through HOPE.

One of the essential advantages of this system is that it overcomes the limitations of the individual resources and lets users take gene-phenotype matches reported in the literature into account during evaluation, as shown in the example in Figure 3.


Figure 3. Gene matching for “Molar tooth sign on MRI (HP:0002419)” using HOPE
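
The idea of extending HPO annotations with harmonized terms from other ontologies can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not how HOPE or NGS Cloud is implemented: the post does not describe HOPE's schema, so the three lookup tables, the `match_genes` helper, the `EXAMPLE:` term IDs and the `GENE_A` to `GENE_D` symbols are all hypothetical placeholders. Only the HPO ID HP:0002419 comes from Figure 3.

```python
from collections import defaultdict

# Hypothetical toy data: genes annotated directly to an HPO term.
HPO_ANNOTATIONS = {
    "HP:0002419": {"GENE_A", "GENE_B"},
}

# Hypothetical harmonization table: equivalent terms in other ontologies.
HARMONIZED_TERMS = {
    "HP:0002419": [("Orphanet", "EXAMPLE:0001"), ("MeSH", "EXAMPLE:0002")],
}

# Hypothetical curated gene links for those non-HPO terms.
CURATED_LINKS = {
    ("Orphanet", "EXAMPLE:0001"): {"GENE_B", "GENE_C"},
    ("MeSH", "EXAMPLE:0002"): {"GENE_D"},
}


def match_genes(hpo_id):
    """Return {gene: sorted list of sources} for one HPO term.

    Genes come from the direct HPO annotations plus any genes linked
    to harmonized terms in other ontologies.
    """
    sources = defaultdict(set)
    # Step 1: direct HPO annotation matches.
    for gene in HPO_ANNOTATIONS.get(hpo_id, ()):
        sources[gene].add("HPO")
    # Step 2: matches reached through harmonized terms.
    for term in HARMONIZED_TERMS.get(hpo_id, ()):
        ontology = term[0]
        for gene in CURATED_LINKS.get(term, ()):
            sources[gene].add(ontology)
    return {gene: sorted(found) for gene, found in sorted(sources.items())}


if __name__ == "__main__":
    for gene, found in match_genes("HP:0002419").items():
        print(gene, ", ".join(found))
```

In this toy setup, `GENE_C` and `GENE_D` would be missed by a lookup that used the HPO annotations alone. Keeping the source of each match alongside the gene lets a reviewer see which resource supports it.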


  1. (Last accessed: 08.01.2021)
  2. (Last accessed: 08.01.2021)
  3. (Last accessed: 08.01.2021)
  4. Köhler S, Gargano M, Matentzoglu N, et al. The Human Phenotype Ontology in 2021 [published online ahead of print, 2020 Dec 2]. Nucleic Acids Res. 2020;gkaa1043. doi:10.1093/nar/gkaa1043
  5. (Last accessed: 08.01.2021)
  6. Orphanet: an online database of rare diseases and orphan drugs. Copyright, INSERM 1997. Available at (Last accessed: 08.01.2021)
  7. (Last accessed: 08.01.2021)