Shotgun metagenomics for infectious diseases

Overview

Shotgun metagenomic sequencing is an untargeted next generation sequencing approach that allows researchers to study microbial diversity in different environments. It can help provide infectious disease researchers with important information about difficult-to-cultivate organisms and identify novel pathogens.

What is shotgun metagenomics?

Shotgun metagenomic sequencing is a method that allows researchers to study microbial diversity in different natural and artificial environments. This next generation sequencing (NGS) approach is untargeted, meaning that it provides access to the full genetic content of a sample. One advantage is that it yields genomic data from difficult-to-cultivate organisms, which cannot be readily studied using conventional culture-based techniques.

Shotgun metagenomic sequencing consists of four main steps: DNA extraction, library preparation, sequencing, and bioinformatic analysis. The first step, DNA extraction, is important because all downstream analyses in the shotgun metagenomics workflow depend on the quality of the input DNA. Next, library preparation gets the extracted DNA ready for sequencing. Shotgun metagenomics requires DNA to be fragmented (sheared) into smaller pieces before sequencing. NGS library preparation also involves adding index sequences to DNA molecules so that multiple samples can be pooled and sequenced in parallel (multiplexed sequencing), which saves time and lowers costs.

Finally, after sequencing is completed, the sequencing reads need to be assigned back to their samples of origin (demultiplexed) and bioinformatically analyzed. Since this is an untargeted NGS method, the resulting data include fragmented sequences from all the DNA present in the sample. The bioinformatic analysis step of a shotgun metagenomics workflow varies with the aim of the project, but it usually includes assembling these short sequences into longer contiguous sequences (contigs), which can then be analyzed downstream for a variety of applications, e.g., identifying novel variants and pathogens. To find out more about shotgun metagenomics, download the NGS 101 Application guide.
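
The demultiplexing step can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not a production workflow: the index sequences, sample names, and reads are hypothetical, and the index is assumed to sit at the start of each read. In practice, indexes are typically read separately and handled by dedicated demultiplexing software.

```python
from collections import defaultdict

# Hypothetical index-to-sample mapping; real index sequences come from the
# library preparation kit and the sample sheet.
INDEX_TO_SAMPLE = {
    "ACGTAC": "sample_A",
    "TGCATG": "sample_B",
}


def demultiplex(reads, index_to_sample, index_length=6):
    """Group reads by sample using an exact match on a leading index.

    Assumes (for illustration only) that every read begins with its index.
    Reads with an unrecognized index are placed in an "undetermined" bin.
    """
    bins = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        sample = index_to_sample.get(index, "undetermined")
        bins[sample].append(insert)
    return dict(bins)


# Toy pooled reads: a 6-base index followed by the sequenced insert.
reads = [
    "ACGTACGGATTACA",
    "TGCATGCCGTAAGT",
    "ACGTACTTGACCAG",
    "NNNNNNAAAAAAAA",
]

demultiplexed = demultiplex(reads, INDEX_TO_SAMPLE)
for sample, inserts in sorted(demultiplexed.items()):
    print(sample, inserts)
```

Exact matching keeps the sketch short. Real tools usually tolerate a small number of index mismatches, which this example does not attempt.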

How is shotgun metagenomics used to study infectious diseases?

Shotgun metagenomics is used by infectious disease researchers to identify new pathogens, track transmission, and detect genomic mutations [1-3]. Since shotgun metagenomic sequencing is an untargeted NGS approach, it can be especially beneficial for identifying novel or emerging infectious microbes [1,2].

This approach allows researchers to assemble the reads generated via sequencing into complete or near-complete genomes. Doing so can help identify where mutations have occurred and determine where, geographically, infectious disease outbreaks started. Recently, shotgun metagenomic sequencing has been used to track the mpox virus (MPXV) [3].
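
As a rough illustration of how an assembled genome can reveal where mutations have occurred, the Python sketch below compares an assembled sequence against a reference, position by position. The sequences are short, made-up examples (not MPXV data), and the function name `find_substitutions` is hypothetical. The sketch assumes the two sequences are already aligned and equal in length, so it reports substitutions only, not insertions or deletions.

```python
def find_substitutions(reference, assembled):
    """Return (position, reference_base, observed_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned to the
    same length; insertions and deletions are out of scope for this sketch.
    """
    if len(reference) != len(assembled):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(
            zip(reference, assembled), start=1
        )
        if ref_base != alt_base
    ]


# Hypothetical toy sequences for illustration only.
reference = "ATGGCGTACGTTAGC"
assembled = "ATGGCATACGTTGGC"

substitutions = find_substitutions(reference, assembled)
for position, ref_base, alt_base in substitutions:
    print(f"{ref_base}{position}{alt_base}")
```

Real analyses rely on dedicated alignment and variant-calling tools. The comparison here only conveys the idea of locating differences relative to a reference.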

Benefits and challenges of shotgun metagenomics

Benefits of shotgun metagenomic sequencing include its scalability, its ability to provide both functional and taxonomic information from all the DNA present in a sample, and the versatility of the data gathered. Although this method generates a more diverse dataset than a targeted NGS approach, it can be significantly more expensive. However, sequencing costs have declined consistently in recent years [4].

This approach may not be ideal if researchers hope to obtain a large number of sequences from rare members of the microbial community or rare alleles. While deep sequencing has been shown to work around this difficulty, the added costs of increasing sequencing depth may not be justifiable for all laboratories, and targeted NGS may be a reasonable alternative. Enrichment approaches used in targeted NGS, including amplicon sequencing and hybridization capture, are commonly employed in infectious disease research to identify new variants of pathogens and to track outbreaks in wastewater [5-7].
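
The trade-off between sequencing depth and rare community members can be made concrete with some back-of-the-envelope arithmetic. The Python sketch below assumes, as a deliberate simplification, that reads are sampled in direct proportion to an organism's share of the DNA in the sample. It ignores genome size, host DNA, and extraction or library bias. The abundance and read counts are hypothetical.

```python
import math
from fractions import Fraction


def expected_reads(total_reads, relative_abundance):
    """Expected number of reads from an organism at a given relative abundance."""
    return total_reads * relative_abundance


def reads_required(target_reads, relative_abundance):
    """Total reads needed so the expected yield from a rare organism
    reaches target_reads, under proportional sampling."""
    if not 0 < relative_abundance <= 1:
        raise ValueError("relative_abundance must be in (0, 1]")
    return math.ceil(target_reads / relative_abundance)


# Hypothetical scenario: an organism making up 0.01% of the sample DNA.
# Fraction keeps the arithmetic exact.
abundance = Fraction(1, 10_000)

yield_at_10m = expected_reads(10_000_000, abundance)
depth_for_50k = reads_required(50_000, abundance)

print(f"Expected reads from 10 million total: {yield_at_10m}")
print(f"Total reads needed for 50,000 target reads: {depth_for_50k}")
```

Under these assumptions, the required depth scales inversely with abundance, which is why enrichment-based targeted approaches can be more economical for rare targets.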

Deciding between NGS approaches like amplicon sequencing and shotgun metagenomics largely depends on your research objectives.

Amplicon-based NGS vs. shotgun metagenomic sequencing

Applications of targeted and untargeted NGS approaches vary, and the benefits of each approach are listed in Table 1. Amplicon sequencing requires information about the targeted gene(s) or genomes before primers or probes can be designed. If that information is available, this approach can be a powerful tool for investigating and monitoring the evolution of microbes, pathogens, alleles, etc. Because of its intrinsically targeted nature, amplicon sequencing avoids unwanted host sequences, which can be problematic in shotgun metagenomic sequencing. This approach also takes advantage of a number of well-established pipelines for library construction and sequence analysis.

Shotgun metagenomics is well suited for researchers who are investigating the functional potential of a microbial community, identifying novel pathogens, or aiming to understand how species co-vary. This approach can lead to high sequencing costs and often requires sophisticated bioinformatic analyses. However, the high-resolution data it generates can provide a system-level view of the genetic material in a sample.

Table 1. Key benefits of amplicon-based NGS and shotgun metagenomic sequencing.

Amplicon-based NGS
  • Cost-effective
  • Targeted sequencing generates data only for sequences of interest
  • Well-established pipelines for library prep and bioinformatic analyses

Shotgun metagenomic sequencing
  • Functional and taxonomic genes sequenced
  • Untargeted approach allows for the identification of novel pathogens and genes
  • Data on bacteria, archaea, eukaryotes, and viruses

One specific type of amplicon-based sequencing often employed by researchers for applications that include infectious disease investigations is 16S rRNA gene sequencing. This approach predates shotgun metagenomic sequencing and relies on primers to target the 16S rRNA gene, which all bacteria and archaea have in their genome. By targeting this specific gene, researchers capture the taxonomic diversity of these microbes in their sample.

16S rRNA gene sequencing is particularly useful for characterizing microbial communities and allows researchers to obtain a large amount of sequencing data only on microbes of interest (bacteria and archaea) without sequencing other organisms (e.g., fungi, viruses). Because 16S rRNA gene sequencing relies heavily on primers, primer design is extremely important. It has been shown that poorly designed primers can result in an incomplete or biased view of the microbial community [8,10]. Further, taxonomic resolution should be taken into consideration when deciding between 16S rRNA gene sequencing and shotgun metagenomics. If a more precise taxonomic resolution is needed, shotgun metagenomics is typically the preferred approach.
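
To show how primer choice can bias the picture of a community, the Python sketch below checks whether a primer matches the binding site of several taxa, using the standard IUPAC degenerate base codes. The primers, taxon names, and binding-site sequences are hypothetical and chosen only for illustration. Real primer evaluation also considers mismatch tolerance, melting temperature, and the reverse primer.

```python
# Standard IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_matches(primer, site):
    """True if every site base is allowed by the primer code at that position."""
    return len(primer) == len(site) and all(
        base in IUPAC[code] for code, base in zip(primer, site)
    )


def amplified_taxa(primer, binding_sites):
    """Names of the taxa whose binding site the primer matches exactly."""
    return [
        taxon for taxon, site in binding_sites.items()
        if primer_matches(primer, site)
    ]


# Hypothetical primer binding sites for three taxa.
binding_sites = {
    "taxon_1": "GTGCCAGCA",
    "taxon_2": "GTGTCAGCC",
    "taxon_3": "GTGCCAGCG",
}

degenerate_hits = amplified_taxa("GTGYCAGCM", binding_sites)  # Y = C/T, M = A/C
strict_hits = amplified_taxa("GTGCCAGCA", binding_sites)      # no degenerate bases

print("Degenerate primer amplifies:", degenerate_hits)
print("Non-degenerate primer amplifies:", strict_hits)
```

In this toy case, the degenerate primer recovers more taxa than the non-degenerate one, yet still misses a taxon whose binding site falls outside the encoded variation. Those taxa would simply be absent from the resulting community profile.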

While 16S rRNA gene sequencing can provide information about which microbes are present in a given environment, it cannot tell you what they can do (i.e., it cannot assess their functional potential). This information can be obtained via shotgun metagenomic sequencing. Further, because shotgun metagenomics is untargeted, it captures sequences from all the microbes present in a sample, including those without 16S rRNA genes, such as viruses and fungi.

References

  1. Vijayvargiya P, Jeraldo PR, Thoendel MJ, et al. Application of metagenomic shotgun sequencing to detect vector-borne pathogens in clinical blood samples. PLoS One. 2019;14(10):e0222915.
  2. Chen H, Li J, Yan S, et al. Identification of pathogen(s) in infectious diseases using shotgun metagenomic sequencing and conventional culture: a comparative study. PeerJ. 2021;9:e11699.
  3. Isidro J, Borges V, Pinto M, et al. Phylogenomic characterization and signs of microevolution in the 2022 multi-country outbreak of monkeypox virus. Nat Med. 2022;28(8):1569-1572.
  4. Wetterstrand KA. The Cost of Sequencing a Human Genome. 2021; https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost. Accessed 10 Oct, 2022.
  5. Spurbeck RR, Minard-Smith A, Catlin L. Feasibility of neighborhood and building scale wastewater-based genomic epidemiology for pathogen surveillance. Sci Total Environ. 2021;789:147829.
  6. Fontenele RS, Kraberger S, Hadfield J, et al. High-throughput sequencing of SARS-CoV-2 in wastewater provides insights into circulating variants. Water Res. 2021;205:117710.
  7. Karthikeyan S, Levy JI, De Hoff P, et al. Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission. Nature. 2022;609(7925):101-108.
  8. Fredriksson NJ, Hermansson M, Wilen BM. The choice of PCR primers has great impact on assessments of bacterial community diversity and dynamics in a wastewater treatment plant. PLoS One. 2013;8(10):e76431.
  9. Hong S, Bunge J, Leslin C, et al. Polymerase chain reaction primers miss half of rRNA microbial diversity. ISME J. 2009;3(12):1365-1373.
  10. Klindworth A, Pruesse E, Schweer T, et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 2013;41(1):e1.