Generate multiple mock data files from a specification list. Useful for creating complete test datasets for bioinformatics workflows. The function automatically creates a single genome context that is shared by the biological files in the batch.
Arguments
- spec
List. Specification of the files to generate. Each element should be a list with 'datatype' and, optionally, 'output_file', 'size', 'n_records', and 'options'. If 'output_file' is NULL, a temporary file is created for that element.
- base_dir
Character. Base directory for output files (default: current directory). Only used when output_file is specified in spec.
- seed
Integer. Random seed for reproducible generation.
- compress
Logical. Default compression setting for files (default: auto-detect).
See also
Other mock data generation: sn_cleanup_mockdata_examples(), sn_generate_mockdata(), sn_generate_rnaseq_dataset(), sn_get_example_value_with_mockdata()
Examples
if (FALSE) { # \dontrun{
  # Generate a complete test dataset with specific files
  spec <- list(
    list(datatype = "fasta", output_file = "reference.fa", size = "small"),
    list(
      datatype = "fastq", output_file = "reads_R1.fastq.gz",
      options = list(read_type = "R1"), size = "medium"
    ),
    list(
      datatype = "fastq", output_file = "reads_R2.fastq.gz",
      options = list(read_type = "R2"), size = "medium"
    ),
    list(datatype = "gtf", output_file = "annotation.gtf", size = "small")
  )
  files <- sn_generate_mockdata_batch(spec, base_dir = "test_data/")

  # Generate temporary compatible files
  temp_spec <- list(
    list(datatype = "fasta", size = "small"),
    list(datatype = "fastq", options = list(read_type = "R1")),
    list(datatype = "fastq", options = list(read_type = "R2")),
    list(datatype = "gtf", size = "small")
  )
  temp_files <- sn_generate_mockdata_batch(temp_spec)

  # Clean up temporary files when done
  sn_cleanup_mockdata_examples()
} # }
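
Because every spec element must be a list that names a 'datatype', it can help to check a spec before generating anything. The following is a minimal sketch in base R, not part of the package: validate_spec() is a hypothetical helper, the set of allowed fields is taken from the 'spec' argument description above, and the n_records and seed values are illustrative assumptions. The generator is only called if it is available in the current session.

```r
# Hypothetical helper (not provided by the package): check that every spec
# element is a list with a 'datatype' and only uses the documented fields.
validate_spec <- function(spec) {
  allowed <- c("datatype", "output_file", "size", "n_records", "options")
  for (i in seq_along(spec)) {
    entry <- spec[[i]]
    # [[ ]] is used instead of $ to avoid partial name matching
    if (!is.list(entry) || is.null(entry[["datatype"]])) {
      stop(sprintf("spec element %d must be a list with a 'datatype'", i))
    }
    unknown <- setdiff(names(entry), allowed)
    if (length(unknown) > 0) {
      stop(sprintf(
        "spec element %d has unknown fields: %s",
        i, paste(unknown, collapse = ", ")
      ))
    }
  }
  invisible(spec)
}

# Illustrative spec; the n_records value is an assumption
spec <- list(
  list(datatype = "fasta", size = "small"),
  list(datatype = "fastq", options = list(read_type = "R1"), n_records = 1000)
)
validate_spec(spec)

# Call the generator only when it is available; the seed value is arbitrary
if (exists("sn_generate_mockdata_batch")) {
  files <- sn_generate_mockdata_batch(spec, seed = 42)
}
```

Checking the spec up front means a misspelled field name is reported as an error before any files are written.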