
Filter ASV abundance matrix and convert to taxmap and phyloseq objects

Usage

convert_asv_matrix_to_objs(
  analysis_setup,
  min_read_depth = 0,
  minimum_bootstrap = 0,
  save_outputs = FALSE
)

Arguments

analysis_setup

An object containing directory paths and data tables, produced by the prepare_reads function.

min_read_depth

ASV filter parameter. If the mean read depth of an ASV across all samples is less than this threshold, the ASV is filtered out.

minimum_bootstrap

Threshold for the bootstrap support value of taxonomic assignments. Taxonomic assignments with bootstrap support below this minimum are set to N/A.
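The two thresholds above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy tables, their column names, and the use of NA as the placeholder for unsupported assignments are all assumptions made for illustration, and the label the package actually writes may differ.

```r
# Toy ASV abundance table: rows are ASVs, columns are samples (made-up counts).
asv_counts <- matrix(
  c(120, 80,
      3,  1,
      0, 40),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("asv_1", "asv_2", "asv_3"), c("S1", "S2"))
)

# Hypothetical value standing in for the min_read_depth argument.
min_read_depth <- 10

# Keep only ASVs whose mean read depth across samples reaches the threshold.
mean_depth <- rowMeans(asv_counts)
filtered_counts <- asv_counts[mean_depth >= min_read_depth, , drop = FALSE]

# Toy taxonomic assignments with made-up bootstrap support values (0-100).
# The column layout here is an assumption, not the package's internal format.
assignments <- data.frame(
  asv_id    = c("asv_1", "asv_1", "asv_3", "asv_3"),
  rank      = c("genus", "species", "genus", "species"),
  taxon     = c("Phytophthora", "Phytophthora infestans",
                "Fusarium", "Fusarium oxysporum"),
  bootstrap = c(98, 45, 90, 82),
  stringsAsFactors = FALSE
)

# Hypothetical value standing in for the minimum_bootstrap argument.
minimum_bootstrap <- 80

# Blank out assignments whose bootstrap support falls below the threshold.
masked <- assignments
masked$taxon[masked$bootstrap < minimum_bootstrap] <- NA_character_

print(filtered_counts)
print(masked)
```

With both thresholds left at their default of 0, neither step removes or masks anything, which matches the "filtered by minimum read depth: 0" messages in the example output.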

Value

ASV matrix converted to taxmap and phyloseq objects

Examples

# Convert final matrix to taxmap and phyloseq objects for downstream analysis steps
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  tempdir_path = tempdir(),
  tempdir_id = "demulticoder_run_temp",
  overwrite_existing = TRUE
)
#> Rows: 2 Columns: 23
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl  (4): already_trimmed, count_all_samples, multithread, verbose
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 2 Columns: 23
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl  (4): already_trimmed, count_all_samples, multithread, verbose
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 4 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): sample_name, primer_name, organism
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Creating output directory: /tmp/RtmpRgOGJ7/demulticoder_run_temp/prefiltered_sequences

cut_trim(
  analysis_setup,
  cutadapt_path = "/usr/bin/cutadapt",
  overwrite_existing = TRUE
)
#> Running Cutadapt 3.5 for its sequence data 
#> Read in 2564 paired-sequences, output 1479 (57.7%) filtered paired-sequences.
#> Read in 1996 paired-sequences, output 1215 (60.9%) filtered paired-sequences.
#> Running Cutadapt 3.5 for rps10 sequence data 
#> Read in 1830 paired-sequences, output 1429 (78.1%) filtered paired-sequences.
#> Read in 2090 paired-sequences, output 1506 (72.1%) filtered paired-sequences.

make_asv_abund_matrix(
  analysis_setup,
  overwrite_existing = TRUE
)
#> 710804 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Forward read of primer pair its 
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1479 reads in 660 unique sequences.
#> Sample 2 - 1215 reads in 613 unique sequences.
#> 724230 total bases in 2694 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Reverse read of primer pair its 
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1479 reads in 1019 unique sequences.
#> Sample 2 - 1215 reads in 814 unique sequences.
#> 1315 paired-reads (in 21 unique pairings) successfully merged out of 1416 (in 32 pairings) input.
#> Duplicate sequences in merged output.
#> 1063 paired-reads (in 25 unique pairings) successfully merged out of 1108 (in 28 pairings) input.

#> Duplicate sequences detected and merged.
#> Identified 0 bimeras out of 38 input sequences.
#> 824778 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#> Convergence after  2  rounds.
#> Error rate plot for the Forward read of primer pair rps10 
#> Warning: log-10 transformation introduced infinite values.
#> Sample 1 - 1429 reads in 933 unique sequences.
#> Sample 2 - 1506 reads in 1018 unique sequences.
#> 821851 total bases in 2935 reads from 2 samples will be used for learning the error rates.
#> Initializing error rates to maximum possible estimate.
#> selfConsist step 1 ..
#>    selfConsist step 2
#>    selfConsist step 3
#> Convergence after  3  rounds.
#> Error rate plot for the Reverse read of primer pair rps10 
#> Warning: log-10 transformation introduced infinite values.

#> Sample 1 - 1429 reads in 1044 unique sequences.
#> Sample 2 - 1506 reads in 1284 unique sequences.
#> 1420 paired-reads (in 2 unique pairings) successfully merged out of 1422 (in 4 pairings) input.
#> 1503 paired-reads (in 5 unique pairings) successfully merged out of 1504 (in 6 pairings) input.

#> Identified 0 bimeras out of 5 input sequences.

#> $its
#> [1] "/tmp/RtmpRgOGJ7/demulticoder_run_temp/asvabund_matrixDADA2_its.RData"
#> 
#> $rps10
#> [1] "/tmp/RtmpRgOGJ7/demulticoder_run_temp/asvabund_matrixDADA2_rps10.RData"
#> 
assign_tax(
  analysis_setup,
  asv_abund_matrix,
  retrieve_files = FALSE,
  overwrite_existing = TRUE
)
#> Duplicate sequences detected and merged.
#>   samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1          S1_R1_its  2564     1479      1425      1431   1315    1315
#> 2          S2_R1_its  1996     1215      1143      1122   1063    1063
#>   samplename_barcode input filtered denoisedF denoisedR merged nonchim
#> 1        S1_R1_rps10  1830     1429      1429      1422   1420    1420
#> 2        S2_R1_rps10  2090     1506      1505      1505   1503    1503
objs <- convert_asv_matrix_to_objs(
  analysis_setup
)
#> Rows: 38 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): asv_id, sequence, dada2_tax
#> dbl (2): S1_its, S2_its
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> For its dataset 
#> Taxmap object saved in: /tmp/RtmpRgOGJ7/taxmap_obj_its.RData 
#> Phyloseq object saved in: /tmp/RtmpRgOGJ7/phylo_obj_its.RData 
#> ASVs filtered by minimum read depth: 0 
#> For taxonomic assignments, if minimum bootstrap was set to: 0 assignments were set to 'Unsupported' 
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#> Rows: 5 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): asv_id, sequence, dada2_tax
#> dbl (2): S1_rps10, S2_rps10
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> For rps10 dataset 
#> Taxmap object saved in: /tmp/RtmpRgOGJ7/taxmap_obj_rps10.RData 
#> Phyloseq object saved in: /tmp/RtmpRgOGJ7/phylo_obj_rps10.RData 
#> ASVs filtered by minimum read depth: 0 
#> For taxonomic assignments, if minimum bootstrap was set to: 0 assignments were set to 'Unsupported' 
#> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~