Prepare reads for primer trimming using 'Cutadapt'

Usage

prepare_reads(
  data_directory = "data",
  output_directory = tempdir(),
  tempdir_path = NULL,
  tempdir_id = "demulticoder_run",
  overwrite_existing = FALSE
)

Arguments

data_directory: Path to the directory containing the raw FASTQ files (forward and reverse reads), metadata.csv, and primerinfo_params.csv. Default is "data".
output_directory: User-specified directory for outputs. Default is tempdir().
tempdir_path: Path to a temporary directory. If NULL, the path returned by tempdir() is used.
tempdir_id: ID for temporary directories. The user can provide any helpful ID, such as a date or a specific name for the run. Default is "demulticoder_run".
overwrite_existing: Logical, indicating whether to remove or overwrite existing files and directories from previous runs. Default is FALSE.
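
Because data_directory must hold several kinds of input, it can help to check the folder before calling prepare_reads(). The following is a minimal sketch in base R, not part of demulticoder: the helper name check_data_directory() is hypothetical, and the FASTQ file-name pattern (.fastq or .fq, optionally gzipped) is an assumption that may not match every naming scheme the package accepts.

```r
# Hypothetical helper (not part of demulticoder): checks that a directory
# contains the inputs described for the data_directory argument.
check_data_directory <- function(data_directory = "data") {
  if (!dir.exists(data_directory)) {
    stop("Directory not found: ", data_directory)
  }

  # The two CSV files named in the argument description
  required <- c("metadata.csv", "primerinfo_params.csv")
  missing_files <- required[!file.exists(file.path(data_directory, required))]

  # Assumption: FASTQ files end in .fastq or .fq, optionally followed by .gz
  fastq_files <- list.files(
    data_directory,
    pattern = "\\.(fastq|fq)(\\.gz)?$",
    ignore.case = TRUE
  )

  list(
    missing_files = missing_files,
    n_fastq = length(fastq_files),
    ok = length(missing_files) == 0 && length(fastq_files) > 0
  )
}
```

Calling check_data_directory("data") returns a list whose ok element is TRUE only when both CSV files are present and at least one FASTQ file matches the assumed pattern. It does not validate the contents of any file.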

Value

A list containing data tables, including metadata, primer sequences to search for based on orientation, paths for trimming reads, and user-defined parameters for all subsequent steps.

Examples

# \donttest{
# Pre-filter raw reads and parse the metadata and primer information to
# prepare for primer trimming and filtering
analysis_setup <- prepare_reads(
  data_directory = system.file("extdata", package = "demulticoder"),
  output_directory = tempdir(),
  overwrite_existing = TRUE
)
#> Existing files found in the output directory. Overwriting existing files.
#> Rows: 2 Columns: 25
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (3): primer_name, forward, reverse
#> dbl (18): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl  (4): already_trimmed, count_all_samples, multithread, verbose
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 2 Columns: 25
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (3): primer_name, forward, reverse
#> dbl (18): minCutadaptlength, maxN, maxEE_forward, maxEE_reverse, truncLen_fo...
#> lgl  (4): already_trimmed, count_all_samples, multithread, verbose
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Rows: 4 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (3): sample_name, primer_name, organism
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> Creating output directory: /tmp/RtmpAtZc28/demulticoder_run/prefiltered_sequences

# }