Investigate and fix "qtl2::calc_genoprob()" failing with "negative length vectors are not allowed"

Tags

  • Assigned: Flisso
  • type: bug
  • status: in progress
  • keywords: cross, qtl2, calc_genoprob, bugs

Description

Running a subset of the genotype and founder CSV files through qtl2 to generate founder-aware smoothed genotypes. The script crashes with the following error message:

calc_genoprob failing with "negative length vectors are not allowed"

For reference, see the "qtl2_hmm_pipeline.R" script.

The key findings from the run and the error were:

  • Map and IDs were consistent:
      - 50,000 markers
      - no duplicate marker IDs
      - monotonically increasing cM positions
  • Genotype dimensions:
      - HS genotypes: 1499 x 50000
      - Founder genotypes: 8 x 50000
  • Error cause matched integer-length overflow conditions:
      - The original workflow tried to allocate a genotype-probability object of roughly 1499 * 50000 * 36 = 2,698,200,000 elements, which exceeds R’s 32-bit vector-length limit (2,147,483,647) and causes the "negative length vectors are not allowed" error.
  • The workaround was to split the files into chunks of 5000 lines. The remaining problem is the calc_genoprob() runtime.
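
The size estimate above can be checked with a few lines of arithmetic. This is only a sketch, written in Python for convenience rather than taken from "qtl2_hmm_pipeline.R". It assumes that the factor 36 is the number of unordered founder pairs for 8 founders (8 * 9 / 2), that the failing length is held in a signed 32-bit integer somewhere in the call chain, and that "5000 lines" means 5000 markers per chunk. The helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647


def n_genotype_states(n_founders):
    """Unordered founder pairs, including homozygotes: n * (n + 1) / 2."""
    return n_founders * (n_founders + 1) // 2


def genoprob_cells(n_ind, n_markers, n_states):
    """Number of elements in an individuals x states x markers array."""
    return n_ind * n_markers * n_states


def as_int32(n):
    """Value seen if n is stored in a signed 32-bit integer (wraps around)."""
    return ctypes.c_int32(n).value


states = n_genotype_states(8)                # 36
full = genoprob_cells(1499, 50000, states)   # whole marker set at once
chunk = genoprob_cells(1499, 5000, states)   # assumed: 5000 markers per chunk

print("full :", full, "overflows:", full > INT32_MAX, "as int32:", as_int32(full))
print("chunk:", chunk, "overflows:", chunk > INT32_MAX)
```

Under these assumptions the full-size count wraps around to a negative number when squeezed into 32 bits, which would be consistent with the "negative length" wording of the error, while a 5000-marker chunk stays well below the limit.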

What is already solved

  • [x] error: "calc_genoprob failing with negative length vectors are not allowed"

TODO

  • [ ] Re-run the script on the specified chunks
  • [ ] Evaluate the smoothed output for validity and interpretability
  • [ ] Use the proximal/distal founder-aware markers to extract SNPs from the original geno file
  • [ ] Or extend the script with a function that performs this extraction
  • [ ] Test the results with GEMMA and R/qtl2 mapping
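
For the chunked re-run, the marker ranges can be planned up front. The sketch below is illustrative only and is not part of "qtl2_hmm_pipeline.R". It assumes chunks are cut along the marker axis, 5000 markers at a time, and `marker_chunks` is a hypothetical helper name.

```python
def marker_chunks(n_markers, chunk_size):
    """Return 1-based inclusive (start, end) ranges covering 1..n_markers."""
    return [
        (start, min(start + chunk_size - 1, n_markers))
        for start in range(1, n_markers + 1, chunk_size)
    ]


chunks = marker_chunks(50000, 5000)
for i, (start, end) in enumerate(chunks, start=1):
    print(f"chunk {i:02d}: markers {start}-{end}")
```

If chunks are cut this way, each HMM run only sees the markers inside its own range, so probabilities near chunk boundaries may differ from a whole-chromosome run. That may be worth including when evaluating the validity of the smoothed output.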