In Germany, researchers reported on the deeply sequenced metagenome and metatranscriptome of a complex biogas-producing microbial community from an agricultural production-scale biogas plant located in North Rhine-Westphalia. The team assembled the metagenome and, as an example application, showed that the assembly recovers most genes involved in methane metabolism, a key pathway involving methanogenesis performed by methanogenic Archaea. This result indicates that sequencing coverage is sufficient for most downstream analyses.
The researchers from Bielefeld University led by Andreas Bremges report that “production of biogas takes place under anaerobic conditions and involves microbial decomposition of organic matter. Most of the participating microbes are still unknown and non-cultivable. Accordingly, shotgun metagenome sequencing currently is the method of choice to obtain insights into community composition and the genetic repertoire.”
“Sequenced at least one order of magnitude deeper than previous studies, our metagenome data will enable new insights into community composition and the genetic potential of important community members,” the research team said. “Moreover, mapping of transcripts to reconstructed genome sequences will enable the identification of active metabolic pathways in target organisms.”
What is interesting in this development? Not only do we have a picture of an entire microbial community involved in producing biogas, the who’s there and who does what part, but we also have a novel publishing strategy for science, in which the researchers share huge troves of underlying data through the publication process. “By sharing our data, we want to actively encourage its reuse. This will hopefully result in novel biological and biotechnological insights, eventually enabling a more efficient biogas production.”
In the publication package, raw sequencing data are available in the European Nucleotide Archive (ENA) under study accession PRJEB8813 (http://www.ebi.ac.uk/ena/data/view/PRJEB8813). The datasets supporting the results of the article are available in GigaScience’s GigaDB. The complete workflow is organized in a single GNU Makefile and available on GitHub, so all data and results can be reproduced by a simple invocation of make. To further support reproducibility, the team bundled all tools and dependencies into one Docker container available on DockerHub; running docker run executes the aforementioned Makefile inside the container. Reproduction requires roughly 89 GiB of memory and 83 GiB of storage, and takes less than 24 hours on 32 CPU cores. Excluding the KEGG analysis, which relies on a commercial license of the KEGG database, all steps are performed using free and open-source software.
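For readers who want to attempt the reproduction, the resource figures quoted above can be turned into a quick host check before starting the container. The following is a minimal sketch, not part of the published workflow: the function names preflight and probe_host are hypothetical, the memory probe assumes a Linux host, and only the 89 GiB, 83 GiB, and 32-core figures come from the article.

```python
import os
import shutil

# Figures quoted in the article; everything else here is an illustrative assumption.
REQUIRED_MEMORY_GIB = 89
REQUIRED_STORAGE_GIB = 83
REFERENCE_CPU_CORES = 32

GIB = 1024 ** 3


def preflight(memory_bytes, free_disk_bytes, cpu_cores):
    """Return a list of warnings; an empty list means the host looks adequate."""
    warnings = []
    if memory_bytes < REQUIRED_MEMORY_GIB * GIB:
        warnings.append(
            f"memory: {memory_bytes / GIB:.1f} GiB available, "
            f"about {REQUIRED_MEMORY_GIB} GiB needed"
        )
    if free_disk_bytes < REQUIRED_STORAGE_GIB * GIB:
        warnings.append(
            f"storage: {free_disk_bytes / GIB:.1f} GiB free, "
            f"about {REQUIRED_STORAGE_GIB} GiB needed"
        )
    if cpu_cores < REFERENCE_CPU_CORES:
        warnings.append(
            f"cpu: {cpu_cores} cores, fewer than the {REFERENCE_CPU_CORES} "
            "used for the quoted 24-hour estimate, so the run may take longer"
        )
    return warnings


def probe_host(path="."):
    """Probe the current host (hypothetical helper; the sysconf names assume Linux)."""
    memory_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_disk_bytes = shutil.disk_usage(path).free
    cpu_cores = os.cpu_count() or 1
    return memory_bytes, free_disk_bytes, cpu_cores
```

On a Linux machine, calling preflight(*probe_host()) returns the list of shortfalls for the current directory's filesystem. Keeping the comparison separate from the probing makes the thresholds easy to check with made-up numbers.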