
Databases Getting Out of Whack

This issue refers to precomputed scores generated by the ancient reaper module that runs as a script:

We'll create a new issue:

Tags

  • assigned: pjotrp
  • priority: high
  • type: bug, enhancement
  • status: unclear
  • keywords: database, gemma, reaper

Let's use Gemma instead of Reaper

Zachary:

If we're using GEMMA, we'll need to recalculate the Max LRS scores for all other traits using GEMMA as well, so I think we should just do this with qtlreaper for now. Otherwise we'll have a bunch of qtlreaper scores mixed with GEMMA scores, and the user will have no way of telling them apart. Also, storing the full results (what Rob calls the "full vector model") will require a fundamental change to the way we store this data and should be postponed, since the high-priority immediate issue is to ensure that the stored Max LRS values aren't wrong.
As for Mean, that should be simple, since it's just taking the average of sample values immediately after an update.
The main thing I'm unsure how to do (though I know it's possible, since Bonface already did something like this with GEMMA) is making the code run in the background after an update. It's probably simpler than I'm thinking, though.
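
A minimal sketch of the two pieces discussed above, in Python: recomputing the mean right after an update, and doing that work in the background so the update call returns immediately. Everything named here (`precomputed`, `trait_mean`, `recompute`, `on_trait_update`, `max_lrs_fn`) is hypothetical and not taken from the GeneNetwork code base. It also assumes missing sample values are represented as `None`, and a thread pool is only one of several ways to run work in the background; the existing GEMMA setup may do it differently.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean

# Hypothetical in-memory stand-in for the table of precomputed values.
precomputed = {}

# A single worker keeps recomputations from running concurrently.
_executor = ThreadPoolExecutor(max_workers=1)


def trait_mean(sample_values):
    """Average the sample values, skipping missing (None) entries."""
    present = [v for v in sample_values if v is not None]
    return fmean(present) if present else None


def recompute(trait_id, sample_values, max_lrs_fn=None):
    """Refresh the stored mean, and Max LRS if a mapper is supplied."""
    entry = {"mean": trait_mean(sample_values)}
    if max_lrs_fn is not None:
        # max_lrs_fn is a placeholder for whatever wraps qtlreaper or GEMMA.
        entry["max_lrs"] = max_lrs_fn(sample_values)
    precomputed[trait_id] = entry
    return entry


def on_trait_update(trait_id, sample_values, max_lrs_fn=None):
    """Queue the recomputation and return a Future without waiting for it."""
    # Copy the values so later edits by the caller do not affect the job.
    return _executor.submit(recompute, trait_id, list(sample_values), max_lrs_fn)
```

The update handler would call `on_trait_update(...)` and carry on; calling `.result()` on the returned Future is only needed where the caller has to wait for the new values.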

Pjotr:

Currently qtlreaper runs globally in one of Arthur's scripts.
When we recompute on uploading the data we can use GEMMA. That is the plan. But let the team do qtlreaper first. We don't want too many moving parts.

Rob:

Yes, that and more. These values are displayed in search results and used to sort by expression mean and peak LRS (using pathetically old code and genotypes). Now if GEMMA were wicked fast we could recompute the 60 million BXD vector results and store that as a big juicy TRANSFORMATIVE blob of data. There is a big paper in doing just that. Reaper is just wrong at this point. We have LMM; we should use it.