Load a directory of 3-column counts.tab.gz files produced by polyApipe.py into a SingleCellExperiment for analysis. Each cell is given a unique name, and colData includes the batch. The 'batch_regex' argument can be used to extract readable sample names from the filenames.

load_peaks_counts_dir_into_sce(
  counts_file_dir,
  peak_info_file,
  output,
  batch_regex = "(.*)\\.tab\\.gz",
  ...
)

Arguments

counts_file_dir

The directory of counts files (.tab.gz) output by the polyApipe.py script (usually named <something>_counts)

peak_info_file

GTF formatted peak file _specifically_ as output from polyApipe.py

output

Where to save the sce object on disk using saveHDF5SummarizedExperiment. This path should not already exist; the directory will be created.

batch_regex

A regex string used to extract a readable sample name from each counts file in counts_file_dir. This becomes the batch name (and the prefix of the cell name) for each cell in colData. If not specified, the filename up to but excluding '.tab.gz' is used. The regex should have one capture group '()' and pull out something unique for every file. (Default = '(.*)\.tab\.gz')

...

Other parameters passed through to load_peaks_counts_files_into_sce/load_peaks_counts_into_sce
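
To illustrate what a 'batch_regex' with one capture group pulls out of a filename, here is a minimal base R sketch. The filenames and the helper extract_batch() are hypothetical, and using sub() with a "\\1" replacement is an assumption about how such a regex could be applied, not a description of the package's internals.

```r
# Hypothetical counts filenames, for illustration only.
files <- c("FullRun_sampleA.tab.gz", "FullRun_sampleB.tab.gz")

# The documented default regex and a custom one with a single capture group.
default_regex <- "(.*)\\.tab\\.gz"
custom_regex  <- "FullRun_(.*)\\.tab\\.gz"

# Hypothetical helper: keep only the text matched by the capture group.
# (Assumption: the package may extract batch names differently.)
extract_batch <- function(files, regex) {
  sub(regex, "\\1", files)
}

default_batches <- extract_batch(files, default_regex)
custom_batches  <- extract_batch(files, custom_regex)

print(default_batches)  # filenames minus the '.tab.gz' suffix
print(custom_batches)   # only the part after 'FullRun_'

# The regex should yield something unique for every file.
stopifnot(anyDuplicated(custom_batches) == 0)
```

Checking a regex against the actual filenames this way before loading can catch patterns that collapse two files onto the same batch name.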

See also

Other peak counts loading functions: load_peaks_counts_files_into_sce(), load_peaks_counts_into_sce()

Examples

counts_dir <- dirname(system.file("extdata", "demo_dataset/demo_counts/SRR5259354_demo.tab.gz", package = "polyApiper"))
peak_info_file <- system.file("extdata", "demo_dataset/demo_polyA_peaks.gff", package = "polyApiper")

if (FALSE) {
  sce <- load_peaks_counts_dir_into_sce(counts_dir,
    peak_info_file = peak_info_file,
    output = "demodirsce",
    min_reads_per_barcode = 1) # Required for tiny demo data.

  sce <- load_peaks_counts_dir_into_sce("myexpr_counts/",
    "my_expr_polyA_peaks.gff",
    "./myexpr_sce")

  sce <- load_peaks_counts_dir_into_sce("data/MyExperimentFullRun20190516_counts/",
    peak_info_file = "data/MyExperimentFullRun20190516_polyA_peaks.gff",
    output = "data/MyExperimentFullRun20190516_sce",
    batch_regex = 'FullRun_(.*)\\.tab\\.gz')
}
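
Because the object is written to 'output' with saveHDF5SummarizedExperiment, it can be read back in a later session with the matching loader, loadHDF5SummarizedExperiment() from the HDF5Array package. The sketch below is only an illustration: reload_sce_if_present() is a hypothetical helper, and "demodirsce" is the output path from the first example above, assumed to have been created by an earlier run.

```r
# Hypothetical helper: reload a previously saved object if its directory exists.
reload_sce_if_present <- function(path) {
  # Nothing to load if the output directory was never created.
  if (!dir.exists(path)) {
    return(NULL)
  }
  # HDF5Array provides the loader that pairs with saveHDF5SummarizedExperiment.
  if (!requireNamespace("HDF5Array", quietly = TRUE)) {
    stop("The HDF5Array package is needed to reload the saved object.")
  }
  HDF5Array::loadHDF5SummarizedExperiment(path)
}

# Returns NULL unless "demodirsce" was created by an earlier run.
sce <- reload_sce_if_present("demodirsce")
```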