Investigate and fix "qtl2::calc_genoprob()" failing with "negative length vectors are not allowed"
Tags
- assigned: Flisso
- type: bug
- status: in progress
- keywords: cross, qtl2, calc_genoprob, bugs
Description
Running qtl2 on a subset of the genotype and founder CSV files to generate founder-aware smoothed genotypes. The script crashes with the following error message:
calc_genoprob failing with negative length vectors are not allowed
For reference, see the "qtl2_hmm_pipeline.R" script.
Key findings from the run and the error:
- Map and IDs were consistent:
  - 50,000 markers
  - no duplicate marker IDs
  - monotonically increasing cM
- Genotype dimensions:
  - HS genotypes: 1499 x 50000
  - Founder genotypes: 8 x 50000
- Error cause matched integer-length overflow conditions:
  - The original workflow tried to allocate a genotype-probability object of roughly 1499 * 50000 * 36 = 2,698,200,000 elements, which exceeds R’s 32-bit vector-length limit (2,147,483,647) and triggers the "negative length vectors are not allowed" error.
- The workaround was to split the files into chunks of 5,000 lines, but the calc_genoprob() runtime is still the bottleneck.
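The size arithmetic behind the overflow can be checked in a few lines of base R. This is only a sketch: `n_states()`, `genoprob_cells()` and `max_chunk_markers()` are hypothetical helpers, not part of qtl2 or of qtl2_hmm_pipeline.R. It assumes the 36 is the number of unordered founder pairs for 8 founders (8 * 9 / 2) and that the 5,000-line chunks are split along markers.

```r
# Hypothetical helpers (not part of qtl2): size arithmetic for a
# genotype-probability array of individuals x genotype states x markers.

int_max <- .Machine$integer.max   # largest 32-bit signed integer

# Number of unordered founder pairs (homozygous + heterozygous states).
n_states <- function(n_founders) n_founders * (n_founders + 1) / 2

# Total cells, computed in double precision so that the product itself
# cannot overflow the way 32-bit integer arithmetic would.
genoprob_cells <- function(n_ind, n_markers, n_founders) {
  as.numeric(n_ind) * as.numeric(n_markers) * n_states(n_founders)
}

# Largest number of markers per chunk that keeps the array under int_max.
max_chunk_markers <- function(n_ind, n_founders) {
  floor(int_max / (as.numeric(n_ind) * n_states(n_founders)))
}

full  <- genoprob_cells(1499, 50000, 8)   # full run from this report
chunk <- genoprob_cells(1499, 5000, 8)    # one 5,000-marker chunk (assumed)

cat("full run cells:         ", format(full, big.mark = ","), "\n")
cat("full run exceeds limit: ", full > int_max, "\n")
cat("chunk cells:            ", format(chunk, big.mark = ","), "\n")
cat("chunk exceeds limit:    ", chunk > int_max, "\n")
cat("max markers per chunk:  ", max_chunk_markers(1499, 8), "\n")
```

Under these assumptions a 5,000-marker chunk stays well below the limit, so the chunk size could be raised if fewer, larger chunks turn out to be faster.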
What is already solved
- [x] error: "calc_genoprob failing with negative length vectors are not allowed"
TODO
- [] Re-run the script per specified chunks
- [] Evaluate the smoothed output for validity and interpretability
- [] Use the proximal/distal founder-aware markers to extract SNPs from the original geno file
- [] or extend a function in the script to perform this
- [] Test the results with GEMMA and R/qtl2 mapping
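For the first TODO item, a per-chunk re-run could look roughly like the sketch below. It is not the code in qtl2_hmm_pipeline.R: `chunk_ranges()` and `run_chunk()` are hypothetical helpers, and the 5,000-marker chunk size and the output file names are assumptions. It also presumes the data are already loaded as a qtl2 cross2 object. Only `marker_names()`, `pull_markers()` and `calc_genoprob()` are qtl2 functions.

```r
# Hypothetical helper: split 1..n into consecutive index ranges of at most
# `size` elements. Pure base R, no qtl2 needed.
chunk_ranges <- function(n, size) {
  starts <- seq(1, n, by = size)
  lapply(starts, function(s) s:min(s + size - 1, n))
}

# Hypothetical helper: run calc_genoprob() on one block of markers and save
# the result to disk instead of holding every chunk in memory.
# Assumes `cross` is a qtl2 cross2 object (e.g. from qtl2::read_cross2()).
# Extra arguments in `...` are passed through to qtl2::calc_genoprob().
run_chunk <- function(cross, marker_idx, out_file, ...) {
  markers <- qtl2::marker_names(cross)[marker_idx]
  sub     <- qtl2::pull_markers(cross, markers)  # keep only these markers
  probs   <- qtl2::calc_genoprob(sub, ...)
  saveRDS(probs, out_file)
  invisible(out_file)
}

# Plan for the dimensions in this report: 50,000 markers, blocks of 5,000.
ranges <- chunk_ranges(50000, 5000)
cat("number of chunks:", length(ranges), "\n")

# Usage once a cross2 object is available (not run here):
# for (i in seq_along(ranges)) {
#   run_chunk(cross, ranges[[i]], sprintf("genoprob_chunk_%02d.rds", i))
# }
```

Splitting along markers hides the flanking markers outside each block from the HMM, so probabilities near chunk boundaries may differ from a full run. This is worth checking when evaluating the smoothed output.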