Evolution of Microbial Genomics: Conceptual Shifts over a Quarter Century
Eugene V. Koonin¹, Kira S. Makarova¹, Yuri I. Wolf¹
¹National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD 20894, USA.
Prokaryote genomics started in earnest in 1995, with the complete sequences of two small bacterial genomes, those of Haemophilus influenzae and Mycoplasma genitalium. Over the next quarter century, the prokaryote genome database has grown exponentially, with no saturation in sight. For most of these 25 years, genome sequencing remained limited to cultivable microbes. Together with next-generation sequencing methods, advances in metagenomics and single-cell genomics have lifted this limitation, providing for an increasingly unbiased characterization of the global prokaryote diversity. Advances in computational genomics followed the progress of genome sequencing, even if occasionally lagging behind. Several major new branches of bacteria and archaea were discovered, including Asgard archaea, the apparent closest relatives of eukaryotes, and expansive groups of bacteria and archaea with small genomes that are thought to be symbionts of other prokaryotes. Comparative analysis of numerous prokaryote genomes spanning a wide range of evolutionary distances changed the conceptual foundations of microbiology, replacing the notion of species genomes with fixed gene sets by that of dynamic pangenomes, and the notion of a single Tree of Life (ToL) by that of a statistical tree-like trend among individual gene trees. Strides were also made towards a theory and quantitative laws of prokaryote genome evolution.
Key words: prokaryote genome evolution, pangenome, phylogenetic trees, metagenomics, single-cell genomics, gene gain and loss.