TBG-Project Miklos Balint and team

Functional Environmental Genomics

Metagenomic monitoring of soil communities (MetaInvert)

Institution: Senckenberg Gesellschaft für Naturforschung



There is a tremendous diversity of invertebrates right under our feet: a single square meter of soil in Germany might contain hundreds of species. Much of this diversity is unknown: it is estimated that for every known species there are about 20 that are still undescribed. These invertebrates are important contributors to ecosystem services essential for humanity, such as disease control, nutrient cycling, soil formation, or even primary production.

There are serious concerns that this diversity is being lost rapidly. Since soil biodiversity loss likely impacts soil ecosystem services, national and European authorities are calling for soil biodiversity to be understood and monitored. However, extensive soil biodiversity monitoring is not yet feasible because methods are lacking that can describe invertebrate community composition, biomass, and functional diversity on a massive scale (see Fig. 1).

In the MetaInvert project, we are developing the genomic resources and the laboratory and bioinformatic tools needed for metagenomics-based monitoring of soil invertebrate communities. We also aim to start the standardization process for the relevant approaches through ISO.


Fig. 1. Typical sample of soil microarthropods, containing hundreds of individuals from probably tens of species. Identifying such samples through microscopy requires highly trained specialists and is time-consuming.


MetaInvert establishes a database of genomes for several hundred soil invertebrate species: springtails, oribatid mites, nematodes, potworms, myriapods, and several other groups. We have already obtained genome sequences from over 120 species (Fig. 2). As of July 2020, over 100 additional species are being processed for sequencing.

Fig. 2. Taxonomic relationships of over 80 soil invertebrates genome-sequenced within the MetaInvert project. The tree reflects the project status as of May 2020. Species sequenced since then (~40) and species being processed for sequencing are not included. The tree was generated with the Interactive Tree of Life (iTOL) tool.

Genome data in MetaInvert are mostly generated from single individuals. Most of these individuals were collected and identified at the Soil Zoology Division, Senckenberg Museum of Natural History Görlitz. Most of the genomes are sequenced as Illumina short reads. The genomes have a coverage between 2x and 40x and varying completeness (Fig. 3). We are also experimenting with PacBio Sequel II for long-read sequencing. The MetaInvert database serves to improve species identification through shotgun metagenomics and to reveal evolutionary relationships, special adaptations, and host-microbiome associations.
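As a rough illustration of what the coverage values above mean, average sequencing depth is commonly estimated as the number of reads times the read length, divided by the genome size. The short Python sketch below applies this estimate; the function name and the input numbers are hypothetical and are not taken from the MetaInvert data.

```python
def estimated_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Estimate average sequencing depth as (reads x read length) / genome size.

    All arguments are in counts of reads or base pairs. This is the usual
    back-of-the-envelope estimate; it ignores duplicates, contamination,
    and uneven coverage along the genome.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Hypothetical example: 4 million reads of 150 bp against a 300 Mbp genome.
depth = estimated_coverage(n_reads=4_000_000, read_length=150, genome_size=300_000_000)
print(f"estimated coverage: {depth:.1f}x")
```

With these illustrative inputs the estimate sits at the lower end of the 2x to 40x range mentioned above.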

Fig. 3. Completeness of the genomes of 80 soil invertebrate species sequenced until May 1, 2020, estimated with BUSCO counts.


A test project with mock communities is currently underway to check the accuracy and efficiency of the shotgun metagenomic approach. This project constitutes the MSc research of Alexandra Schmidt (Goethe University, Frankfurt). Mock communities were generated from a large variety of genome-sequenced species (oribatid and gamasine mites, nematodes, springtails, enchytraeids, tardigrades) at the Soil Zoology Division. Mocks were set up to allow for sensitivity and specificity analyses with regard to taxonomic identification and the recovery of biomass information from read numbers. 
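The sensitivity and specificity analyses mentioned above can be sketched for a single mock community as follows. This is a minimal illustration, not the project's actual pipeline: it assumes that detection is treated as presence or absence per species, that every detected species is part of the reference database, and that the species labels are made-up placeholders.

```python
def detection_metrics(expected, detected, reference):
    """Sensitivity and specificity of species detection in one mock community.

    expected  -- species deliberately placed in the mock community
    detected  -- species reported by the metagenomic classification
    reference -- all species in the reference database (assumed to
                 contain both of the sets above)
    """
    expected, detected, reference = set(expected), set(detected), set(reference)
    tp = len(expected & detected)              # present and detected
    fn = len(expected - detected)              # present but missed
    fp = len(detected - expected)              # absent but reported
    tn = len(reference - expected - detected)  # absent and not reported
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Placeholder labels only; not real MetaInvert species or results.
reference = {"sp_A", "sp_B", "sp_C", "sp_D", "sp_E", "sp_F"}
expected = {"sp_A", "sp_B", "sp_C", "sp_D"}
detected = {"sp_A", "sp_B", "sp_C", "sp_E"}

sens, spec = detection_metrics(expected, detected, reference)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy case three of the four mock species are recovered and one of the two absent species is falsely reported. Recovering biomass information from read numbers would need an additional, quantitative comparison that is not covered by this sketch.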

“BE-Spring: Discovering Collembola biodiversity on grasslands with emerging genomic and metagenomic tools” is a DFG-funded project within the Biodiversity Exploratories. It runs from 2020 to 2023 and constitutes the PhD research of Anna Küchler. BE-Spring complements the MetaInvert genome data with genomes of the most abundant soil springtails from the 150 Exploratories grassland plots to establish the Springtail Genome Reference Database (SGRD) of Germany. BE-Spring implements the shotgun metagenomic approach established within MetaInvert to understand
    • how environmental drivers and the intensification of grassland use influence springtail community composition and function, and
    • how springtails relate to other biological components of the grassland ecosystems.

Group expertise / Methods

  • environmental DNA
  • Metabarcoding
  • Soil arthropod collection
  • Collembola taxonomy
  • Morphology
  • Imagery: phase contrast and DIC high-end microscope
  • SEM
  • Phylogenetics
  • Python
  • R
  • machine learning
  • metagenomics
  • community ecology