Assign taxonomy functions
tryRC = FALSE,
verbose = FALSE,
multithread = FALSE,
retrieve_files = FALSE,
overwrite_existing = FALSE,
db_rps10 = "oomycetedb.fasta",
db_its = "fungidb.fasta",
db_16S = "bacteriadb.fasta",
db_other1 = "otherdb1.fasta",
db_other2 = "otherdb2.fasta"
- analysis_setup
An object containing directory paths and data tables, produced by the prepare_reads function
- asv_abund_matrix
ASV abundance matrix.
- tryRC
Whether to try reverse complementing sequences during taxonomic assignment
- verbose
Logical, indicating whether to display verbose output
- multithread
Logical, indicating whether to use multithreading
- retrieve_files
Logical, indicating whether to copy files from the temp directory to the output directory
- overwrite_existing
Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.
- db_rps10
The reference database for the rps10 locus
- db_its
The reference database for the ITS locus
- db_16S
The reference database for the 16S locus
- db_other1
The reference database for a different locus 1 (assumes format is like SILVA DB entries)
- db_other2
The reference database for a different locus 2 (assumes format is like SILVA DB entries)
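The tryRC argument refers to reverse complementing sequences. As a rough illustration of what that operation does to a sequence, here is a minimal base R sketch. It is not the package's implementation: reverse_complement is a hypothetical helper introduced only for this example, and it assumes sequences made up solely of unambiguous A, C, G, and T bases.

```r
# Hypothetical helper for illustration only; not part of demulticoder.
# Assumes each sequence contains only A, C, G, T (upper or lower case).
reverse_complement <- function(seqs) {
  vapply(seqs, function(s) {
    # Swap every base for its complement (A<->T, C<->G)
    complemented <- chartr("ACGTacgt", "TGCAtgca", s)
    # Reverse the order of the complemented bases
    paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

rc <- reverse_complement(c("ATGCC", "GGTTAA"))
print(rc)
#> [1] "GGCAT"  "TTAACC"
```

When reads may be in the opposite orientation to the reference database, comparing the reverse complement as well gives such sequences a chance to match.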
# Assign taxonomies to ASVs on a per barcode basis
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  tempdir_path = tempdir(),
  tempdir_id = "demulticoder_run_temp",
  overwrite_existing = TRUE
)
#> Rows: 2 Columns: 23
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl (4): already_trimmed, count_all_samples, multithread, verbose
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 2 Columns: 23
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl (4): already_trimmed, count_all_samples, multithread, verbose
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 4 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): sample_name, primer_name, organism
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Creating output directory: /tmp/RtmpRgOGJ7/demulticoder_run_temp/prefiltered_sequences
overwrite_existing = TRUE
#> Running Cutadapt 3.5 for its sequence data
#> Read in 2564 paired-sequences, output 1479 (57.7%) filtered paired-sequences.
#> Read in 1996 paired-sequences, output 1215 (60.9%) filtered paired-sequences.
#> Running Cutadapt 3.5 for rps10 sequence data
#> Read in 1830 paired-sequences, output 1429 (78.1%) filtered paired-sequences.
#> Read in 2090 paired-sequences, output 1506 (72.1%) filtered paired-sequences.
overwrite_existing = TRUE
#> 710804 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#> selfConsist step 2
#> selfConsist step 3
#> Convergence after 3 rounds.
#> Error rate plot for the Forward read of primer pair its
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1479 reads in 660 unique sequences.
#> Sample 2 - 1215 reads in 613 unique sequences.
#> 724230 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#> selfConsist step 2
#> selfConsist step 3
#> Convergence after 3 rounds.
#> Error rate plot for the Reverse read of primer pair its
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1479 reads in 1019 unique sequences.
#> Sample 2 - 1215 reads in 814 unique sequences.
#> 1315 paired-reads (in 21 unique pairings) successfully merged out of 1416 (in 32 pairings) input.
#> Duplicate sequences in merged output.
#> 1063 paired-reads (in 25 unique pairings) successfully merged out of 1108 (in 28 pairings) input.
#> Duplicate sequences detected and merged.
#> Identified 0 bimeras out of 38 input sequences.
#> 824778 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#> selfConsist step 2
#> Convergence after 2 rounds.
#> Error rate plot for the Forward read of primer pair rps10
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1429 reads in 933 unique sequences.
#> Sample 2 - 1506 reads in 1018 unique sequences.
#> 821851 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#> selfConsist step 2
#> selfConsist step 3
#> Convergence after 3 rounds.
#> Error rate plot for the Reverse read of primer pair rps10
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1429 reads in 1044 unique sequences.
#> Sample 2 - 1506 reads in 1284 unique sequences.
#> 1420 paired-reads (in 2 unique pairings) successfully merged out of 1422 (in 4 pairings) input.
#> 1503 paired-reads (in 5 unique pairings) successfully merged out of 1504 (in 6 pairings) input.
#> Identified 0 bimeras out of 5 input sequences.
#> $its
#> [1] "/tmp/RtmpRgOGJ7/demulticoder_run_temp/asvabund_matrixDADA2_its.RData"
#> $rps10
#> [1] "/tmp/RtmpRgOGJ7/demulticoder_run_temp/asvabund_matrixDADA2_rps10.RData"
overwrite_existing = TRUE
#> Duplicate sequences detected and merged.
#> samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1 S1_R1_its 2564 1479 1425 1431 1315 1315
#> 2 S2_R1_its 1996 1215 1143 1122 1063 1063
#> samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1 S1_R1_rps10 1830 1429 1429 1422 1420 1420
#> 2 S2_R1_rps10 2090 1506 1505 1505 1503 1503
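The tracking tables above can be summarized as the share of input read pairs that remain after chimera removal. The following is a small base R sketch under stated assumptions: the counts are copied by hand from the printed tables rather than read from any package object, and the pct_retained column name is introduced here only for illustration.

```r
# Read counts copied from the tracking tables printed above.
track <- data.frame(
  samplename_barcode = c("S1_R1_its", "S2_R1_its", "S1_R1_rps10", "S2_R1_rps10"),
  input   = c(2564, 1996, 1830, 2090),
  nonchim = c(1315, 1063, 1420, 1503)
)

# Percentage of input read pairs still present after chimera removal
# (pct_retained is a hypothetical column name, not a package output).
track$pct_retained <- round(100 * track$nonchim / track$input, 1)
print(track)
```

With these counts, the two its samples retain roughly half of their input reads (51.3% and 53.3%), while the two rps10 samples retain 77.6% and 71.9%. Most of the loss occurs between the input and filtered columns.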