Comparative Genomics

Life on Earth has been evolving for billions of years. Living organisms are found in virtually every environment, surviving and thriving under extreme heat, cold, radiation, pressure, salt, acidity, and darkness. Many of these environments are colonised exclusively by “simple” microorganisms, and the only nutrients available come from chemical compounds that only microbes can use. Their unparalleled genetic and metabolic diversity and range of environmental adaptations indicate that microbes long ago “solved” many problems for which scientists and engineers are still actively seeking solutions, including carbon capture and nitrogen fixation.

The secrets to these adaptations are encoded in their genomes, which contain all the necessary instructions for building functioning organisms. The first complete bacterial genome was deciphered in 1995. Since then, the number of complete genomes sequenced has grown exponentially.

Our methods, at a glance

High-throughput community profiling using short- and long-read amplicon sequencing
Genome annotation: EffectiveDB, SIMAP, pCOMP, GenSkew, ConsPred
Functional genomics: NVT
Comparative genomics: Gepard, PICA, PhenDB, DeepNOG, VOGDB
Metagenomics: probeBase, probeCheck, HoloVir

Genome sequence map, artist's impression. © Tetiana Lazunova

Powerful computers and sophisticated bioinformatics software are the keys to unlocking the potential of these massive amounts of genomic data for medical and environmental applications.

The majority of microorganisms and viruses have not yet been cultivated in the laboratory. We therefore use culture-independent approaches such as amplicon sequencing and metagenomics, which can be applied directly to nucleic acids isolated from any environment, to study the diversity, structure, and functional potential of microbial communities. State-of-the-art sequencing technologies combined with computational methods often allow us to reconstruct complete genomes from metagenomes and to predict phenotypic traits from these genome sequences.

CeMESS is involved in a wide range of genome sequencing and metagenomics projects and has established efficient tools and workflows for interpreting (meta)genomic data. We help maintain and improve genomic data in public databases, and we create new software to push the limits of accuracy and throughput in computational genomics.