Skip to contents

Before You Start

  • In the following example, we demonstrate key package functionality using the test dataset that is included in the package
  • You can follow along because all test data and associated CSV input files are contained in the package.
  • Additional examples are also available under the [Example vignettes]tab

Components of the test dataset

  1. Paired-end short read amplicon data
  • Files: S1_R1.fastq.gz, S1_R2.fastq.gz, S2_R1.fastq.gz, S2_R1.fastq.gz
  • The files must be zipped and end in either R1.fastq.gz , or R2.fastq.gz and each sample must have both R1 and R2 files.
  1. metadata.csv
  • New row for each unique sample
  • Samples entered twice if samples contain two pooled metabarcodes, as in the test data template
  1. primerinfo_params.csv
  • New row for each unique barcode and associated primer sequence
  • Optional cutadapt and DADA2 parameters
  1. Taxonomy databases
  • UNITE fungal database (abridged version)
  • oomyceteDB

1. Format of the paired-end read files

  • The package takes forward and reverse Illumina short read sequence data.
  • Format of file names To avoid errors, the only characters that are acceptable in sample names are letters and numbers. Characters can be separated by underscores, but no other symbols. The files must end with the suffix R1.fastq.gz or R2.fastq.gz.

2. Format of metadata file

  • The format of the CSV file named metadata.csv is simple.
  • A template is here.
  • The only two required columns (with these headers) are:
    • Column 1. sample_name
    • Column 2. primer_info

Additional columns

  • Other columns should be pasted after these two columns.
  • They can be referenced later during the analysis steps and save a step of loading metadata later.

Notes

  • S1 and S2 are come from a rhododendron rhizobiome dataset and are a random subset of the full set of reads.
  • S1 and S2 are included twice in the ‘metadata.csv’ sheet. This is because these two samples contain pooled metabarcodes (ITS1 and rps10).
  • To demultiplex and run both analyses in tandem, include the same sample twice under sample_name, and then change the primer_name.

Here is what the metadata.csv looks like for this test dataset:

sample_name primer_name organism
S1 rps10 Cry
S2 rps10 Cin
S1 its Cry
S2 its Cin

3. Format of primer and parameters file

DADA2 Primer sequence information and user-defined parameters are placed in primerinfo_params.csv

  • To simplify how functions are called, user will provide parameters within this input file.

  • We recommend using the template linked here.

  • Remember to add any additional optional DADA2 parameters you want to use.

  • The only three required columns (with these headers) are:

    • Column 1. sample_name
    • Column 2. forward (primer sequence)
    • Column 3. reverse(primer sequence)
Example template for primerinfo_params.csv
primer_name forward reverse already_trimmed minCutadaptlength multithread verbose maxN maxEE_forward maxEE_reverse truncLen_forward truncLen_reverse truncQ minLen maxLen minQ trimLeft trimRight rm.lowcomplex minOverlap maxMismatch min_asv_length
rps10 GTTGGTTAGAGYARAAGACT ATRYYTAGAAAGAYTYGAACT FALSE 100 TRUE FALSE 1.00E+05 5 5 0 0 5 150 Inf 0 0 0 0 15 0 50
its CTTGGTCATTTAGAGGAAGTAA GCTGCGTTCTTCATCGATGC FALSE 50 TRUE FALSE 1.00E+05 5 5 0 0 5 50 Inf 0 0 0 0 15 0 50

For more info on parameter specifics, see Documentation

4. Reference Database Format

Databases are downloaded from these sources and then placed in the from the user-specified data folder where raw data files and csv files are located.

Additional Notes

  • Computer specifications may be a limiting factor.
  • If you are using the SILVA or UNITE databases for taxonomic assignment steps, an ordinary personal computer (unless it has sufficient RAM) may not have enough memory for the taxonomic assignment steps, even with few samples.
  • The test databases and reads are subsetted and the following example should run on a personal computer with at least 16 GB of RAM.
  • If computer crashes during the taxonomic assignment step, you will need to switch to a computer with sufficient memory.
  • You must ensure that you have enough storage to save intermediate files in a temporary directory (default) or user-specified directory before proceeding.

Installation

Dependencies:
First install cutadapt program following the instructions here: https://cutadapt.readthedocs.io/en/stable/installation.html

Let’s locate where the cutadapt executable is.

You must do this from a Terminal window:

#If you installed with pip or pipx, or homebrew, run this command from a Terminal window
which cutadapt
cutadapt --version

If you followed the cutadapt installation instructions to create a conda environment called cutadapt (change to whatever you named your environment), to install it in.

Open up a Terminal window and type these commands:

#Run commands from a Terminal window
conda activate cutadapt
which cutadapt
cutadapt --version

Second, make sure the following R packages are installed:

To install the development version of package (while submission to CRAN is in progress):

#Here we install demulticoder (instructions will be updated once available through CRAN)
devtools::install_github("grunwaldlab/demulticoder")
library("demulticoder")

#Let's make sure other packages are loaded:
library("devtools")
library("dada2")
library("phyloseq")
library("metacoder")

Loading the Package

For now, the package will be loaded by retrieving it from GitHub. Submission to CRAN is in progress.

Reorganize data tables and set-up data directory structure

The sample names, primer sequences, and other metadata are reorganized in preparation for running Cutadapt to remove primers.

output<-prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = "~/demulticoder_test", #Run commands from a Terminal window", 
  overwrite_existing = TRUE)

Remove primers with Cutadapt

Before running Cutadapt, please ensure that you have installed it

cut_trim(
  output,
  cutadapt_path = "/usr/bin/cutadapt", #CHANGE LOCATION TO YOUR LOCAL INSTALLATION
  overwrite_existing = TRUE)
#> Running Cutadapt 3.5 for its sequence data
#> Running Cutadapt 3.5 for rps10 sequence data

ASV inference step

Raw reads will be merged and ASVs will be inferred

make_asv_abund_matrix(
  output,
  overwrite_existing = TRUE)
#> 710804 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Forward read of primer pair its
#> Sample 1 - 1479 reads in 660 unique sequences.
#> Sample 2 - 1215 reads in 613 unique sequences.
#> 724230 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Reverse read of primer pair its
#> Sample 1 - 1479 reads in 1019 unique sequences.
#> Sample 2 - 1215 reads in 814 unique sequences.

#> 824778 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#> Convergence after  2  rounds.
#> Error rate plot for the Forward read of primer pair rps10
#> Sample 1 - 1429 reads in 933 unique sequences.
#> Sample 2 - 1506 reads in 1018 unique sequences.
#> 821851 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Reverse read of primer pair rps10

#> Sample 1 - 1429 reads in 1044 unique sequences.
#> Sample 2 - 1506 reads in 1284 unique sequences.

#> $its
#> [1] "/tmp/Rtmpb5UUUj/demulticoder_run/asvabund_matrixDADA2_its.RData"
#> 
#> $rps10
#> [1] "/tmp/Rtmpb5UUUj/demulticoder_run/asvabund_matrixDADA2_rps10.RData"

Taxonomic assignment step

Using the core assignTaxonomy function from DADA2, taxonomic assignments will be given to ASVs.

assign_tax(
  output,
  asv_abund_matrix,
  retrieve_files=TRUE,
  overwrite_existing = TRUE)
#>   samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1          S1_R1_its  2564     1479      1425      1431   1315    1315
#> 2          S2_R1_its  1996     1215      1143      1122   1063    1063
#>   samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1        S1_R1_rps10  1830     1429      1429      1422   1420    1420
#> 2        S2_R1_rps10  2090     1506      1505      1505   1503    1503

Format of output matrices

  • The default output is a CSV file per metabarcode with the inferred ASVs and sequences, taxonomic assignments, and bootstrap supports provided by DADA2. These data can then be used as input for downstream steps.
  • To view these files, locate the files with the prefix final_asv_abundance_matrix in your specified output directory.
  • In this example, it will be in the home directory within demulticoder_test.
  • You can then open your matrices using Excel
  • We recommend checking the column with the ASV sequences to make sure there aren’t truncated sequences *It is important to also review the column showing the taxonomic assignments of each ASV and their associated bootstrap supports.

We can also load the matrices into our R environment and view then from R Studio.

We do that here with the rps10 matrix

rps10_matrix<-read.csv("~/demulticoder_test/final_asv_abundance_matrix_rps10.csv", header=TRUE)
head(rps10_matrix)
#>   asv_id
#> 1  asv_1
#> 2  asv_2
#> 3  asv_3
#> 4  asv_4
#> 5  asv_5
#>                                                                                                                                                                                                                                                                                                                                                                                                                                                    sequence
#> 1 GAAAATCTTTGTGTCGGTGGTTCAAATCCACCTCCAGACAAATTAATTTTTTAAAACTTATGTTTATATTAAGAATTACATTTAAATCTATTAAAAAAATAAATAACATTAAACCTATTTTATTAAATCAAAAAAATTTAAATAAATTTAATAATATTAAAATAAATGGAATATTTAAAACAAAAAATAAAAATAAAATTTTTACAGTATTAAAATCACCACATGTTAATAAAAAAGCACGTGAACATTTTATTTATAAAAATTTTACACAAAAAGTTGAAATTAAATGTTTAAATATTTTTCAATTATTAAATTTTTTAATTTTGATTAAAAAAATCTTAACAGAAAATTTTATTATTACTACAAAAATAATTAAACAATAATTAATATATGCTTATAGCTTAATGGATAAAGCATTAGATTGCGGATCTACAAAATGAA
#> 2    GAAAATCTTTGTGTCGGTGGTTCAAGTCCGCCTCCAAACATCTTATAAAAATGTATGTATATTTTACGAATTAGTTTTAAATCAGTAGAAAAAATAAATGAATTAAAAAAAAATATCTCAAAAATAAAAAAAATATATAATTTTAAAAATTTAAAAGTAAATGGTATTTTTAGAACAAAAAATAAAAATAAAATTTTTACATTATTAAAATCACCCCATGTTAATAAAAAATCACGTGAACATTTTATATATAAAAATTATATTCAAAAAATTGATTTAAATTTTTCAAATTTTTTTCAATTATTAAATTTTTTAGTAATTTTAAAAAAAATATTATCTAAAGATTTTTTAATCAATATCAAAATTATTAAAAAAAATAAAAAAAATATGCTTATAGCTTAATGGATAAAGCGTTAGATTGCGGATCTATAAAATGAA
#> 3      GAAAATCTTTGTGTCGGTGGTTCAAGTCCACCTCCAGACAAAAATAATAACTATGTATATTTTAAGAATAACTTTTAAATCAATACAAAAAATAAATAATATTAAAAAAAATTTATTAAAATTAAAAAAAATAAATAAATTTAAAAATATTCAAATAAAAGGAATATTTAAAATAAAAGATAAAAATAAAATTTTTACTTTATTAAAATCACCTCACGTAAATAAAAAATCTCGTGAACATTTTATTTATAAAAATTATACTCAAAAAATAGATGTAAAATTTTCAAATATTATTGAATTATTTAATTTTATAATACTTATTAAAAAAGTTTTAACTAAAAATTTTATAATAAATTTTAAAATTATAAAATATAATAAAAAAAAAATGCTTATAGCTTAATGGATAAAGCGTTAGATTGCGGATCTATAAAATGAA
#> 4  GAAAATCTTTGTGTCGGTGGTTCAAATCCGCCTCCAAACAAAAATATAAATTATTTTATATGTATATTTTAAGAATAAGTTTTAAATCTGTATCAAAAATAGATAAAATTAAAAAAGAATTATCAAGATTAAAAAAAATATATAATTTTAAAAATATAACAATTAATGGTATTTTTAAAACAAAAAATAAAGATAAAATTTTTACATTATTAAAATCACCTCATGTTAATAAAAAATCTCGCGAACATTTTATTTATAAAAATTATGTACAAAAAATAGATTTAAATTTTATTAATATTTTTCAATTAATAAATTTTTTAATAATTTTAAAAAAAAAATTATCAAAAAATGTATTAATAAATGTAAAAATAATTAAAAAATAAAAAAATATGCTTATAGCTTAATGGATAAAGCGTTAGATTGCGGATCTATAAAATGAA
#> 5    GAAAATCTTTGTGTCGGTGGTTCAAGTCCGCCTCCAAACATCTTATAAAAATGTATGTATATTTTACGAATTAGTTTTAAATCAGTAGAAAAAATAAATGAATTAAAAAAAAATATCTCAAAAATAAAAAAAATATATAATTTTAAAAATTTAAAAGTAAAATGGTATTTTTAGAACAAAAAATAAAAATAAAATTTTTACATTATTAAAATCACCCCATGTTAATAAAAAATCACGTGAACATTTTATATATAAAAATTATATTCAAAAAATTGATTTAAATTTTTCAAATTTTTTTCAATTATTAAATTTTTTAGTAATTTTAAAAAAAACATTATCTAAAGATTTTTTAATCAATATTAAAATTATTAAAAAAAATAAAAAAATATGCTTATAGCTTAATGGATAAAGCGTTAGATTGCGGATCTATAAAATGAA
#>                                                                                                                                                                                                                   dada2_tax
#> 1 Stramenopila--100--Kingdom;Oomycota--100--Phylum;Oomycetes--100--Class;Peronosporales--26--Order;Peronosporaceae--26--Family;Phytophthora--23--Genus;Phytophthora_sp.kununurra--13--Species;oomycetedb_650--13--Reference
#> 2                    Stramenopila--100--Kingdom;Oomycota--100--Phylum;Oomycetes--100--Class;Pythiales--100--Order;Pythiaceae--100--Family;Pythium--100--Genus;Pythium_dissotocum--55--Species;oomycetedb_749--26--Reference
#> 3 Stramenopila--100--Kingdom;Oomycota--100--Phylum;Oomycetes--100--Class;Pythiales--100--Order;Pythiaceae--100--Family;Globisporangium--100--Genus;Globisporangium_cylindrosporum--26--Species;oomycetedb_41--26--Reference
#> 4                   Stramenopila--100--Kingdom;Oomycota--100--Phylum;Oomycetes--100--Class;Pythiales--94--Order;Pythiaceae--94--Family;Pythium--89--Genus;Pythium_aphanidermatum--19--Species;oomycetedb_727--19--Reference
#> 5                    Stramenopila--100--Kingdom;Oomycota--100--Phylum;Oomycetes--100--Class;Pythiales--100--Order;Pythiaceae--100--Family;Pythium--100--Genus;Pythium_oopapillum--69--Species;oomycetedb_771--69--Reference
#>   S1_rps10 S2_rps10
#> 1      842       17
#> 2      578      220
#> 3        0      587
#> 4        0      516
#> 5        0      163

We can also inspect the first rows of the ITS1 matrix

its_matrix<-read.csv("~/demulticoder_test/final_asv_abundance_matrix_its.csv", header=TRUE)
head(its_matrix)
#>   asv_id
#> 1  asv_1
#> 2  asv_2
#> 3  asv_3
#> 4  asv_4
#> 5  asv_5
#> 6  asv_6
#>                                                                                                                                                                                                                                                                                                                                 sequence
#> 1                                              AAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAGTGAATCTTCAAAGTCGGCTCGTCAGATTGTGCTGGTGGAGACACATGTGCACGTCTACGAGTCGCAAACCCACACACCTGTGCATCTATGACTCTGAGTGCCGCTTTGCATGGCCCCTTGATTTGGGCCTGGCGCTCGAGTACTTTCACACACTCTCGAATGTAATGGAATGTCTTCTTGTGCATAACGTACAAACAGAAACAACTTTCAACAACGGATCTCTTGGCTCTC
#> 2                                              AAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAGTGAATCTTCAAAGTCGGCTCGTCGGATTGTGCTGGTGGAGACACATGTGCACGTCTACGAGTCGCAAACCCACACACCTGTGCATCTATGACTCTGAGTGCCGCTTTGCATGGCCCCTTGATTTGGGCCTGGCGCTCGAGTACTTTCACACACTCTCGAATGTAATGGAATGTCTTGTTGTGCATAACGTACAAACAGAAACAACTTTCAACAACGGATCTCTTGGCTCTC
#> 3                                              AAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAGTGAATCTTCAAAGTCGGCTCGTCGGATTGTGCTGGTGGAGACACATGTGCACGTCTACGAGTCGCAAACCCACACACCTGTGCATCTATGACTCTGAGTGCCGCTTTGCATGGCCCCTTGATTTGGGCCTGGCGCTCGAGTACTTTCACACACTCTCGAATGTAATGGAATGACTTGTTGTGCATAACGTACAAACAGAAACAACTTTCAACAACGGATCTCTTGGCTCTC
#> 4 AAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAACGAGTTCAACTGGGGTCTGTGGGGATTCGATGCTGGCCTCCGGGCATGTGCTCGTCCCCCGGACCTCACATCCACTCACAACCCCTGTGCATCATGAGAGGTGTGGTCCTCCGTCATGGGAGCAGCTCCCCTCCGGGCGTGTTTGTACACCCCGGGCGGTCGAGCCGGGACTGCACTTCGACGCCTTTACACAAACCTTTGAATCAGTGATGTAGAATGTCATGGCCAGCGATGGTCGAACTTTAAATACAACTTTCAACAACGGATCTCTTGGCTCTC
#> 5                                              AAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAGTGAATCTTCAAAGTCGGCTCGTCGGATTGTGCTGGTGGAGACACATGTGCACATCTACGAGTCGCAAACCCACACACCTGTGCATCTATGACTCTGAGTGCCGCTTTGCATGGCCCCTTGATTTGGGCCTGGCGCTCGAGTACTTTCACACACTCTCGAATGTAATGGAATGTCTTGTTGTGCATAACGTACAAACAGAAACAACTTTCAACAACGGATCTCTTGGCTCTC
#> 6                                                                                                    AAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTAAAAATGAAGCCGGGAAACCGGTCCTTCTAACCCTTGATTATCTTAATTTGTTGCTTTGGTGGGCCGCGCAAGCACCGGCTTCGGCTGGATCGTGCCCGCCAGAGGACCACAACTCTGTATCAAATGTCGTCTGAGTACTATATAATAGTTAAAACTTTCAACAACGGATCTCTTGGTTCTG
#>                                                                                                                                                                        dada2_tax
#> 1 Fungi--100--Kingdom;Basidiomycota--100--Phylum;Agaricomycetes--100--Class;Sebacinales--100--Order;unidentified--98--Family;unidentified--98--Genus;Sebacinales_sp--98--Species
#> 2 Fungi--100--Kingdom;Basidiomycota--100--Phylum;Agaricomycetes--100--Class;Sebacinales--100--Order;unidentified--97--Family;unidentified--97--Genus;Sebacinales_sp--97--Species
#> 3 Fungi--100--Kingdom;Basidiomycota--100--Phylum;Agaricomycetes--100--Class;Sebacinales--100--Order;unidentified--99--Family;unidentified--99--Genus;Sebacinales_sp--99--Species
#> 4 Fungi--100--Kingdom;Basidiomycota--62--Phylum;Agaricomycetes--59--Class;Hymenochaetales--21--Order;Repetobasidiaceae--8--Family;Rickenella--8--Genus;Rickenella_sp--8--Species
#> 5 Fungi--100--Kingdom;Basidiomycota--100--Phylum;Agaricomycetes--100--Class;Sebacinales--100--Order;unidentified--96--Family;unidentified--96--Genus;Sebacinales_sp--96--Species
#> 6         Fungi--100--Kingdom;Ascomycota--100--Phylum;Leotiomycetes--98--Class;Helotiales--98--Order;unidentified--86--Family;unidentified--86--Genus;Helotiales_sp--86--Species
#>   S1_its S2_its
#> 1    436      0
#> 2    405      0
#> 3      0    299
#> 4    209     65
#> 5      0    200
#> 6     16    116

Reformat ASV matrix as taxmap and phyloseq objects after optional filtering of low abundance ASVs

objs<-convert_asv_matrix_to_objs(output, minimum_bootstrap = 0, save_outputs = TRUE)

Objects can now be used for downstream data analysis

Here we make heattrees using taxmap object.

First we make a heat tree to examine fungal community composition using our ITS1 data.

objs$taxmap_its %>%
  filter_taxa(! grepl(x = taxon_names, "NA", ignore.case = TRUE)) %>%
  metacoder::heat_tree(node_label = taxon_names,
                       node_size = n_obs,
                       node_color = n_obs,
                       node_color_axis_label = "ASV count",
                       node_size_axis_label = "Total Abundance of Taxa",
                       layout = "da", initial_layout = "re")

Now we make a heat tree to showcase oomycete community composition using our rps10 data

objs$taxmap_rps10 %>%
  filter_taxa(! grepl(x = taxon_names, "NA", ignore.case = TRUE)) %>%
  metacoder::heat_tree(node_label = taxon_names,
                       node_size = n_obs,
                       node_color = n_obs,
                       node_color_axis_label = "ASV count",
                       node_size_axis_label = "Total Abundance of Taxa",
                       layout = "da", initial_layout = "re")

We can also do a variety of analyses, if we convert to phyloseq object

Here we demonstrate how to make a stacked bar plot of the relative abundance of taxa by sample for the ITS1-barcoded samples

data <- objs$phyloseq_its %>%
  phyloseq::transform_sample_counts(function(x) {x/sum(x)} ) %>% 
  phyloseq::psmelt() %>%                                        
  dplyr::filter(Abundance > 0.02) %>%                      
  dplyr::arrange(Genus)                                      

abund_plot <- ggplot2::ggplot(data, ggplot2::aes(x = Sample, y = Abundance, fill = Genus)) + 
  ggplot2::geom_bar(stat = "identity", position = "stack", color = "black", size = 0.2) +
  ggplot2::scale_fill_viridis_d() +
  ggplot2::theme_minimal() +
  ggplot2::labs(
    y = "Relative Abundance",
    title = "Relative abundance of taxa by sample",
    fill = "Genus"
  ) +
  ggplot2::theme(
    axis.text.x = ggplot2::element_text(angle = 90, hjust = 1, vjust = 0.5, size = 14),
    panel.grid.major = ggplot2::element_blank(),
    panel.grid.minor = ggplot2::element_blank(),
    legend.position = "top",
    legend.text = ggplot2::element_text(size = 14),
    legend.title = ggplot2::element_text(size = 14),  
    strip.text = ggplot2::element_text(size = 14),
    strip.background = ggplot2::element_blank()
  ) +
  ggplot2::guides(
    fill = ggplot2::guide_legend(
      reverse = TRUE,
      keywidth = 1,
      keyheight = 1,
      title.position = "top",
      title.hjust = 0.5,  # Center the legend title
      label.theme = ggplot2::element_text(size = 10)  
    )
  )

print(abund_plot)

Finally, we demonstrate how to make a stacked bar plot of the relative abundance of taxa by sample for the rps10-barcoded samples

data <- objs$phyloseq_rps10 %>%
  phyloseq::transform_sample_counts(function(x) {x/sum(x)} ) %>% 
  phyloseq::psmelt() %>%                                        
  dplyr::filter(Abundance > 0.02) %>%                      
  dplyr::arrange(Genus)                                      

abund_plot <- ggplot2::ggplot(data, ggplot2::aes(x = Sample, y = Abundance, fill = Genus)) + 
  ggplot2::geom_bar(stat = "identity", position = "stack", color = "black", size = 0.2) +
  ggplot2::scale_fill_viridis_d() +
  ggplot2::theme_minimal() +
  ggplot2::labs(
    y = "Relative Abundance",
    title = "Relative abundance of taxa by sample",
    fill = "Genus"
  ) +
  ggplot2::theme(
    axis.text.x = ggplot2::element_text(angle = 90, hjust = 1, vjust = 0.5, size = 14),
    panel.grid.major = ggplot2::element_blank(),
    panel.grid.minor = ggplot2::element_blank(),
    legend.position = "top",
    legend.text = ggplot2::element_text(size = 14),
    legend.title = ggplot2::element_text(size = 14),  # Adjust legend title size
    strip.text = ggplot2::element_text(size = 14),
    strip.background = ggplot2::element_blank()
  ) +
  ggplot2::guides(
    fill = ggplot2::guide_legend(
      reverse = TRUE,
      keywidth = 1,
      keyheight = 1,
      title.position = "top",
      title.hjust = 0.5,  # Center the legend title
      label.theme = ggplot2::element_text(size = 10)  # Adjust the size of the legend labels
    )
  )

print(abund_plot)