Since 16S rRNA sequences were first recovered from environmental samples in 1990 and their value for microbial community research was recognized, the field has expanded rapidly. Community research has now entered the high-throughput sequencing era, and current work sits at the intersection of next-generation (short-read) amplicon sequencing and long-read amplicon sequencing.
Next-Generation Amplicon Sequencing
The most common targets of next-generation amplicon sequencing are the bacterial 16S ribosomal RNA (16S rRNA) gene, the 18S rRNA gene, and the fungal internal transcribed spacer (ITS), all of which carry taxonomic information. However, NGS techniques provide limited information for microbiome function prediction because of their short read lengths: short-read 16S amplicons are limited to roughly 500-600 bp, so at most three consecutive variable regions can be analyzed. Choosing which variable regions to amplify is therefore a major challenge and usually means a compromise that loses information. As a result, NGS emphasizes microbial composition at the phylum or genus level and overall community diversity. It discriminates poorly at the species level and cannot distinguish closely related strains.
Long-read sequencing technologies (Oxford Nanopore and PacBio SMRT sequencing) can easily cover all nine variable regions of the 16S gene, about 1,500 bp in total, which preserves as much of the potential for species identification as possible. Taking PacBio SMRT sequencing as an example, the extra-long read length and circular template allow the inserted full-length 16S fragment to be sequenced repeatedly. Cross-checking these repeated passes reduces random errors and yields high-quality circular consensus sequencing (CCS) reads. PacBio full-length 16S rRNA gene sequencing has been found feasible for multiple sample types, such as gut, soil, and water microbiomes. The number of taxa identified at the species level was 2.3-15 times higher than at the phylum level, with detectable abundance as low as 0.05%.
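The idea that repeated passes over the same insert cancel out random errors can be illustrated with a toy example. The sketch below is not the PacBio CCS algorithm, which aligns subreads and uses a statistical error model; it assumes the passes are already aligned and of equal length, and simply takes a per-position majority vote. The function name `majority_consensus` and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes.

    Toy illustration only: real CCS callers align subreads and model
    insertions and deletions, which this sketch does not attempt.
    """
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")

    consensus = []
    for column in zip(*passes):
        # Pick the most frequent base observed at this position.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical passes over the same insert; two carry one random error each.
passes = [
    "ACGTTGCA",
    "ACGTAGCA",  # substitution at index 4
    "ACGTTGCA",
    "ACCTTGCA",  # substitution at index 2
    "ACGTTGCA",
]
consensus = majority_consensus(passes)
print(consensus)
```

Because the two errors fall at different positions, each is outvoted by the other four passes, which is the intuition behind accuracy improving as the number of passes grows.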
Long-read amplicon sequencing goes further than short-read technology. Beyond species abundance at the phylum and genus level, it supports association with other omics data and offers enough resolution to explore cooperation and competition among species. Full-length 16S sequencing can provide a more comprehensive strain-level analysis, which matters for multi-omics association and follow-up studies. In the future, full-length 16S sequencing may gradually replace short-read sequencing as the main tool for predicting microbiome function.
How Does Long-Read 16S Sequencing Facilitate Multi-Omics?
Full-length 16S sequencing can be combined with metagenome analysis: full-length 16S sequencing describes the microbial community composition, while metagenomics validates the results. Metagenomics also provides gene and functional annotation, so differences between sample communities can be resolved at the functional level.
16S amplicon sequencing can also be combined with metabolomics for association analysis. By correlating metabolite profiles with the species abundance distribution obtained from 16S sequencing, a relationship network spanning different molecular levels can be built with the help of statistical models and used to explore potential core regulatory factors.
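One simple way to build such a network is to compute a rank correlation between every taxon and every metabolite across samples and keep the strong pairs as edges. The sketch below assumes Spearman correlation as the statistical model (the text does not name one) and uses only the Python standard library. The taxa, metabolites, values, and the 0.8 threshold are all hypothetical, and a real analysis would also need multiple-testing correction and far more samples.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def association_edges(abundance, metabolites, threshold=0.8):
    """Return (taxon, metabolite, rho) for pairs with |rho| >= threshold."""
    edges = []
    for taxon, a in abundance.items():
        for metabolite, m in metabolites.items():
            rho = spearman(a, m)
            if abs(rho) >= threshold:
                edges.append((taxon, metabolite, round(rho, 3)))
    return edges


# Hypothetical data: five samples, relative abundances and metabolite levels.
abundance = {
    "taxon_A": [0.10, 0.20, 0.30, 0.40, 0.50],
    "taxon_B": [0.50, 0.40, 0.30, 0.20, 0.10],
}
metabolites = {
    "metabolite_X": [1.0, 2.1, 2.9, 4.2, 5.0],
    "metabolite_Y": [3.0, 1.0, 4.0, 1.5, 2.0],
}
edges = association_edges(abundance, metabolites)
for edge in edges:
    print(edge)
```

In this toy data set, `metabolite_X` rises with `taxon_A` and falls with `taxon_B`, so both pairs become edges, while `metabolite_Y` shows no monotonic trend with either taxon and is left out of the network.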
Long-read full-length 16S sequencing has strong potential as a new generation of microbial community research tool and is likely to become essential for future strain-level research. Full-length amplicon sequencing captures the sequence of all variable regions, which improves both the resolution of species identification and the accuracy and comprehensiveness of microbial identification in samples, and so gives a more realistic picture of microbial community structure.