Edit this page | Blame

GEMMA output differs from R/qtl2

Tags

Description

When running trait BXD_21526 results differ significantly.

So I confirm I am getting the same results as Dave in GN for GEMMA (see Conclusion below).

Tasks

GeneNetwork

I run GEMMA for precompute on the command line and that I confirmed to be the same as what we see in the browser. This suggests either data or method is different with Dave's approach.

I confirmed that gemma in GN matches Dave's results. It is interesting to see that running without LOCO has some impact, but not as bad as the R/qtl2 difference. First we should check the genotype files to see if they match. I checked that the phenotypes match.

Our inputs are different if I count genotypes (first yours, the other on production):

     1  2184941 B
     2  2132744 D
     3   628980 H
     1  2195662 B
     2  2142959 D
     3   650168 H

The number of rows/markers is the same. So we probably added some genometypes, but if we miss one that would matter. Dave you can find the file in /home/wrk/BXD.geno on tux02 if you want to look.

I notice that we don't use H in the R/qtl2 control file. That might make a difference though it probably won't explain what we see now. BTW I also correlated the LOD scores from GEMMA and R/qtl2 in the spreadsheet and at 0.7 that is too low. So it is probably not just a magnitude problem. The results differ a lot in your spreadsheet.

Next step is that I need to run R/qtl2 using the script in your dropbox and see what Karl's code does. The exercise does not hurt because it will help us bring R/qtl2 to GN.

R/qtl2

R/qtl2 is packaged in guix and can be run in a shell with

guix shell -C r r-qtl2
> library(qtl2)
> bxd <- read_cross2(file = "bxd_cancer_new_GN_July_2024.json")
Warning messages:
1: In recode_geno(sheet, genotypes) :
  630519 genotypes treated as missing: "H", "U"
2: In matrix(as.numeric(unlist(pheno)), ncol = nc) :
  NAs introduced by coercion
3: In check_cross2(output) : Physical map out of order on chr 1, 2, 11, 19

The first warning matches above. If data is missing it may be filtered out. We'll have to check for that. The third warning I am not sure about. Probably a ranking of markers.

Conclusion

It turned out that R/qtl was running HK - so it was a QTL mapping rather than an LMM.

(made with skribilo)