Prepare reads for primer trimming using Cutadapt
Source: R/prepare_reads_count_primers.R
prepare_reads.Rd
Usage
prepare_reads(
data_directory = "data",
output_directory = "output",
tempdir_path = NULL,
tempdir_id = "demulticoder_run",
multithread = FALSE,
overwrite_existing = FALSE
)
Arguments
- data_directory
Path to the directory containing the raw FASTQ files (forward and reverse reads), metadata.csv, and primerinfo_params.csv. Default is "data".
- output_directory
User-specified directory for outputs. Default is "output".
- tempdir_path
Path to a temporary directory. If NULL, a temporary directory path will be identified using the tempdir() command.
- tempdir_id
ID for temporary directories. Default is "demulticoder_run". The user can provide any helpful ID, whether it be a date or specific name for the run.
- multithread
Logical, indicating whether to use multithreading for certain operations. Default is FALSE.
- overwrite_existing
Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.
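As a rough illustration of the arguments above, the following sketch checks that a data directory holds the two CSV files named under data_directory, and shows how tempdir_path and tempdir_id might combine into a run directory. The helpers check_inputs() and resolve_run_dir() are hypothetical and are not part of demulticoder. The <tempdir_path>/<tempdir_id> layout is an assumption inferred from the "Creating output directory" message in the example output, not a documented guarantee.

```r
# Hypothetical helper (not part of demulticoder): return the names of the
# required input files that are missing from a data directory.
check_inputs <- function(data_directory = "data") {
  required <- c("metadata.csv", "primerinfo_params.csv")
  required[!file.exists(file.path(data_directory, required))]
}

# Hypothetical helper: assumes the run directory is <tempdir_path>/<tempdir_id>.
resolve_run_dir <- function(tempdir_path = NULL,
                            tempdir_id = "demulticoder_run") {
  if (is.null(tempdir_path)) {
    tempdir_path <- tempdir()  # base R: per-session temporary directory
  }
  file.path(tempdir_path, tempdir_id)
}

# Demo: a directory that contains metadata.csv but no primerinfo_params.csv.
demo_dir <- file.path(tempdir(), "demo_data")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "metadata.csv"))

missing_files <- check_inputs(demo_dir)
run_dir <- resolve_run_dir(tempdir_id = "demulticoder_run_temp")
```

Running a check like this before calling prepare_reads() can surface a misplaced input file early, before any reads are processed.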
Value
A list containing data tables, including metadata, primer sequences to search for based on orientation, paths for trimming reads, and user-defined parameters for all subsequent steps.
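Because the return value is an ordinary R list, base R tools are enough for a first look at it. The sketch below runs on a stand-in list, since the element names returned by prepare_reads() are not listed on this page. The helper summarize_setup() and the stand-in names metadata and params are hypothetical.

```r
# Hypothetical helper: one row per list element, giving its class and,
# for data frames, its number of rows.
summarize_setup <- function(x) {
  data.frame(
    element = names(x),
    class = vapply(x, function(el) class(el)[1], character(1),
                   USE.NAMES = FALSE),
    n_rows = vapply(x, function(el) {
      if (is.data.frame(el)) nrow(el) else NA_integer_
    }, integer(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Stand-in for the object returned by prepare_reads(); names are invented.
stand_in <- list(
  metadata = data.frame(sample_name = c("S1", "S2"),
                        primer_name = c("p1", "p2")),
  params = list(multithread = FALSE)
)

overview <- summarize_setup(stand_in)
print(overview)
```

For a real run, the same helper could be called on the object assigned from prepare_reads(), or names() and str() could be used directly.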
Examples
# Pre-filter raw reads and parse the metadata and primer information to prepare
# for primer trimming and filtering
analysis_setup <- prepare_reads(
data_directory = system.file("extdata", package = "demulticoder"),
output_directory = tempdir(),
tempdir_path = tempdir(),
tempdir_id = "demulticoder_run_temp",
overwrite_existing = TRUE
)
#> Rows: 2 Columns: 22
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl (3): already_trimmed, multithread, verbose
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 2 Columns: 22
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): primer_name, forward, reverse
#> dbl (16): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl (3): already_trimmed, multithread, verbose
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 4 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): sample_name, primer_name, organism
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Creating output directory: /tmp/Rtmp23cMn7/demulticoder_run_temp/prefiltered_sequences