Metacoder documentation

Introduction

Metacoder is an R package for parsing, plotting, and manipulating large taxonomic data sets, such as those generated by modern high-throughput sequencing methods like metabarcoding (e.g. amplification metagenomics, 16S metagenomics). It provides a tree-based visualization called a “heat tree”, which uses color and size to depict statistics for every taxon in a taxonomy. It also provides functions for common tasks in microbiome bioinformatics on data in the taxmap format defined by the metacoder package, such as:

Summing read counts/abundance per taxon
Converting counts to proportions and rarefying counts using vegan
Comparing the abundance (or other characteristics) of groups of samples (e.g., experimental treatments) per taxon
Combining data for groups of samples
Simulated PCR, via EMBOSS primersearch, for testing primer specificity and coverage of taxonomic groups
Converting common microbiome formats for data and reference databases into the objects defined by the metacoder package
Converting between the phyloseq format and the metacoder format
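
As a rough illustration of the first two tasks and of a heat tree, the sketch below builds a tiny taxmap object and summarizes it. The data frame, its column names (otu_id, lineage, s1, s2), and the table names tax_abund and total are made up for this example; only parse_tax_data, calc_taxon_abund, calc_obs_props, and heat_tree come from metacoder, and the arguments shown are one plausible way to call them rather than the only one.

```r
library(metacoder)

# Hypothetical toy data: three OTUs with read counts in two samples (s1, s2).
otus <- data.frame(
  otu_id  = c("otu_1", "otu_2", "otu_3"),
  lineage = c("Bacteria;Proteobacteria;Gammaproteobacteria",
              "Bacteria;Proteobacteria;Alphaproteobacteria",
              "Bacteria;Firmicutes;Bacilli"),
  s1 = c(10, 5, 0),
  s2 = c(0, 5, 20),
  stringsAsFactors = FALSE
)
samples <- c("s1", "s2")

# Parse the table into a taxmap object; the taxonomy is read from "lineage".
obj <- parse_tax_data(otus, class_cols = "lineage", class_sep = ";")

# Sum read counts per taxon, including the counts of all of its subtaxa.
obj$data$tax_abund <- calc_taxon_abund(obj, "tax_data", cols = samples)

# Convert per-OTU counts to proportions of each sample's total reads.
otu_props <- calc_obs_props(obj, "tax_data", cols = samples)

# Total abundance per taxon across both samples, stored under a unique name
# so it can be referred to unambiguously when plotting.
obj$data$tax_abund$total <- rowSums(obj$data$tax_abund[, samples])

# Heat tree: node size shows the number of OTUs in each taxon,
# node color shows the total read count.
tree_plot <- heat_tree(obj,
                       node_label = taxon_names,
                       node_size = n_obs,
                       node_color = total)
```

Printing tree_plot draws the tree. With real data, the sample columns would typically come from a sample metadata table rather than a hard-coded vector.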

Relationship with other packages

Many of these operations can be done using other packages like phyloseq, which also provides tools for diversity analysis. The main strength of metacoder is that its functions use the flexible data types it defines, which have powerful parsing and subsetting abilities that take into account the hierarchical relationship between taxa and user-defined data. In general, metacoder is more of an abstracted tool kit, whereas phyloseq has more specialized functions for community diversity data, but both can do similar things. I encourage you to try both to see which fits your needs and style best. You can also combine the two in a single analysis by converting between the two data types when needed using the parse_phyloseq and as_phyloseq functions.
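
A minimal sketch of such a round trip is shown below. It assumes the phyloseq package is installed (it is distributed through Bioconductor, not CRAN) and uses GlobalPatterns, an example data set that ships with phyloseq; the variable names tm and ps are arbitrary.

```r
library(metacoder)

# Only run the conversion if phyloseq is available.
if (requireNamespace("phyloseq", quietly = TRUE)) {
  # Load an example phyloseq object shipped with phyloseq.
  data("GlobalPatterns", package = "phyloseq")

  # phyloseq -> taxmap: the tables become entries in tm$data.
  tm <- parse_phyloseq(GlobalPatterns)

  # taxmap -> phyloseq: with default arguments, as_phyloseq looks for
  # the tables that parse_phyloseq created.
  ps <- as_phyloseq(tm)
}
```

Objects built by other means may need the table and ID column arguments of as_phyloseq set explicitly; see ?as_phyloseq.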

Installation

This project is available on CRAN and can be installed like so:

install.packages("metacoder")

You can also install the development version for the newest features, bugs, and bug fixes:

install.packages("devtools")
devtools::install_github("grunwaldlab/metacoder")

Dependencies

The function that simulates PCR requires primersearch from the EMBOSS (Rice, Longden, and Bleasby 2000) tool kit to be installed. This is not an R package, so it is not automatically installed. Type ?primersearch after installing and loading metacoder for installation instructions.
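
Before running the simulated PCR, it can help to check whether the executable is visible from R. The snippet below is only a sketch using base R; it assumes the program is installed under the name primersearch and is on the system PATH, which may not match how every EMBOSS installation is set up.

```r
# Look for the primersearch executable on the system PATH.
# Sys.which returns an empty string when the program is not found.
primersearch_path <- Sys.which("primersearch")
has_primersearch <- unname(nzchar(primersearch_path))

if (has_primersearch) {
  message("primersearch found at: ", primersearch_path)
} else {
  message("primersearch not found; install EMBOSS before simulating PCR.")
}
```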

Future development

Metacoder is under active development and many new features are planned. Some improvements that are being explored include:

Barcoding gap analysis and associated plotting functions
A function to aid in retrieving appropriate sequence data from NCBI for in silico PCR from whole genome sequences
Graphing of different node shapes in heat trees, possibly including pie graphs or PhyloPics
Adding the ability to plot specific edge lengths in the heat trees so they can be used for phylogenetic trees
Adding more data import and export functions to make parsing and writing common formats easier

To see the details of what is being worked on, check out the issues tab of the metacoder GitHub site.

Acknowledgements

Metacoder’s major dependencies are taxize, vegan, igraph, dplyr, and ggplot2.

Feedback and contributions

We would like to hear users’ thoughts on the package and about any errors they run into. Please report errors, questions, or suggestions on the issues tab of the metacoder GitHub site. We also welcome contributions via a GitHub pull request. You can also talk with us using our Google Groups site.

References

Rice, Peter, Ian Longden, and Alan Bleasby. 2000. “EMBOSS: The European Molecular Biology Open Software Suite.” Trends in Genetics 16 (6): 276–277.

Metacoder's documentation by The Grunwald lab and the USDA Horticultural Crops Research Unit is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://github.com/grunwaldlab/metacoder_documentation.