The default format for the CSV files in a R/qtl2 bundle is:
matrix of individuals × (markers/phenotypes/covariates/phenotype covariates/etc.)
(A) (f/F)ile(s) in the R/qtl2 bundle could however
which means the system needs to "un-transpose" the file(s) before processing.
Currently, the system does this by reading all the files of a particular type, and then "un-transposing" the entire thing. This leads to a very slow system.
This issue proposes to do the quality control/assurance processing on each file in isolation, where possible - this will allow parallelisation/multiprocessing of the QC checks.
The main considerations that need to be handled are as follows:
We should probably detail the type of QC checks done for each type of file