Metagenomics approaches to understanding soil health in environmental research - a review
Facultad de Ciencias Agropecuarias, Departamento de Ciencias Biológicas, Universidad Nacional de Colombia, Colombia
Department of Neurology, Pathology, Johns Hopkins University School of Medicine, United States
Juan Diego Duque Zapata   

Submission date: 2023-01-10
Final revision date: 2023-03-06
Acceptance date: 2023-04-05
Online publication date: 2023-04-05
Publication date: 2023-05-19
Soil Sci. Ann., 2023, 74(1), 163080
Soil supplies nutrients and water to a wide range of ecosystems, so understanding soil health and quality is necessary for its preservation. Because microorganisms are highly abundant and closely involved in the degradation of organic matter and in biogeochemical cycles, they respond rapidly to environmental changes and can therefore serve as bioindicators of soil health. However, 97% of microorganisms are unculturable, which leaves a gap in the taxonomic and functional knowledge about them. Metagenomics has narrowed this gap by extracting DNA directly from soil, which allows such non-culturable microorganisms to be characterized. This technique can be considered one of the most impactful in soil health research, given that it allows exploration of the biodiversity, community structure, and potential functions of microbial communities from distinct environments. In addition, metagenomics has had an impact in areas such as “OneHealth” and EcoGenomics, enabling the formation of international projects. The aim of this paper is to show how metagenomics can be used to assess soil quality and health through the taxonomic and functional identification of the microorganisms present in the soil.