Generate multiple mock data files based on a specification list. Useful for creating complete test datasets for bioinformatics workflows. This function automatically creates a shared genome context for biological files.

Usage

sn_generate_mockdata_batch(spec, base_dir = ".", seed = 123, compress = NULL)

Arguments

spec

List. Specification of the files to generate. Each element is a list with a 'datatype' entry and, optionally, 'output_file', 'size', 'n_records', and 'options' entries. If 'output_file' is NULL, a temporary file is created.
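As a rough illustration of the structure described above, the sketch below builds a small spec and checks it before it is passed on. The helper `check_spec()` is hypothetical and not part of the package; it only assumes the field names listed for `spec`.

```r
# Hypothetical helper (NOT part of the package): checks that each spec entry
# is a list with a 'datatype' and uses only the documented field names.
check_spec <- function(spec) {
  allowed <- c("datatype", "output_file", "size", "n_records", "options")
  for (i in seq_along(spec)) {
    entry <- spec[[i]]
    if (!is.list(entry) || is.null(entry[["datatype"]])) {
      stop("spec[[", i, "]] must be a list with a 'datatype' element")
    }
    unknown <- setdiff(names(entry), allowed)
    if (length(unknown) > 0) {
      stop("spec[[", i, "]] has unknown fields: ",
           paste(unknown, collapse = ", "))
    }
  }
  invisible(TRUE)
}

spec <- list(
  list(datatype = "fasta", output_file = "reference.fa", size = "small"),
  list(datatype = "gtf")  # no output_file, so a temporary file would be used
)
check_spec(spec)
```

Checking the spec up front is optional; it simply surfaces a misspelled field name before any files are written.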

base_dir

Character. Base directory for output files (default: current directory). Only used when output_file is specified in spec.

seed

Integer. Random seed for reproducible generation.
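The effect of a fixed seed can be illustrated with base R alone. The function `mock_sequence()` below is a hypothetical stand-in, not the package's generator; it only shows how setting a seed before random sampling makes the output repeatable.

```r
# Hypothetical stand-in (NOT the package's generator): draws a random DNA
# string after fixing the random seed.
mock_sequence <- function(n, seed = 123) {
  set.seed(seed)
  paste(sample(c("A", "C", "G", "T"), n, replace = TRUE), collapse = "")
}

a <- mock_sequence(20, seed = 123)
b <- mock_sequence(20, seed = 123)

# Same seed and same arguments give the same sequence
identical(a, b)
```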

compress

Logical or NULL. Default compression setting for files (default: NULL, which auto-detects).

Value

Character vector of paths to the generated files, returned invisibly.
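Because the value is returned invisibly, nothing is printed at the console unless the result is assigned or wrapped in parentheses. The sketch below uses a hypothetical stand-in, `fake_batch()`, that mimics only the invisible return and generates no files.

```r
# Hypothetical stand-in (NOT the real function): returns its input invisibly.
fake_batch <- function(paths) invisible(paths)

fake_batch(c("reference.fa", "annotation.gtf"))  # nothing is printed

files <- fake_batch(c("reference.fa", "annotation.gtf"))
print(files)                                     # the assigned value prints normally

(fake_batch("reference.fa"))                     # parentheses force printing
```

Assigning the result, as in the examples on this page, is the usual way to keep the paths for later steps.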

Examples

if (FALSE) { # \dontrun{
# Generate a complete test dataset with specific files
spec <- list(
  list(datatype = "fasta", output_file = "reference.fa", size = "small"),
  list(
    datatype = "fastq", output_file = "reads_R1.fastq.gz",
    options = list(read_type = "R1"), size = "medium"
  ),
  list(
    datatype = "fastq", output_file = "reads_R2.fastq.gz",
    options = list(read_type = "R2"), size = "medium"
  ),
  list(datatype = "gtf", output_file = "annotation.gtf", size = "small")
)
files <- sn_generate_mockdata_batch(spec, base_dir = "test_data/")

# Generate temporary compatible files
temp_spec <- list(
  list(datatype = "fasta", size = "small"),
  list(datatype = "fastq", options = list(read_type = "R1")),
  list(datatype = "fastq", options = list(read_type = "R2")),
  list(datatype = "gtf", size = "small")
)
temp_files <- sn_generate_mockdata_batch(temp_spec)

# Clean up temporary files when done
sn_cleanup_mockdata_examples()
} # }