Simplify `dataset.py` in GeneNetwork2
keywords: technical debt
The entire file is a mess, and we need to break it up into smaller, self-contained pieces of logic.
As part of this, the idea is to begin with the
and split it into various chunks that:
* compute the `self.sample_list`
* retrieve the `sample_ids` values from the database using the `self.sample_list` values computed above
* retrieve `trait_sample_data` using the `sample_ids` retrieved above. This step can have a number of helper functions that build the appropriate query for each of the `dataset_type` values ("Publish", "Geno", "ProbeSet", "Temp")
* compute the `self.trait_data` from the `trait_sample_data` above
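
One possible shape for the per-`dataset_type` query helpers is a dispatch table that maps each type to a small builder function. This is only a sketch: the table names, the `sample_id` column, and the names `QUERY_BUILDERS` and `trait_sample_data_query` are all hypothetical placeholders, not what `dataset.py` or the database actually use, and the real queries would likely differ far more between types than they do here.

```python
from typing import Callable, Dict, Sequence, Tuple

# A query is the SQL text plus the parameters to bind to it.
Query = Tuple[str, Tuple[int, ...]]


def _query_for(table: str) -> Callable[[Sequence[int]], Query]:
    """Return a query builder for one (hypothetical) data table."""

    def build(sample_ids: Sequence[int]) -> Query:
        if not sample_ids:
            # "IN ()" is not valid SQL, so refuse to build an empty query.
            raise ValueError("sample_ids must not be empty")
        placeholders = ", ".join(["%s"] * len(sample_ids))
        sql = f"SELECT * FROM {table} WHERE sample_id IN ({placeholders})"
        return sql, tuple(sample_ids)

    return build


# Placeholder table names, one builder per dataset_type value.
QUERY_BUILDERS: Dict[str, Callable[[Sequence[int]], Query]] = {
    "Publish": _query_for("publish_placeholder"),
    "Geno": _query_for("geno_placeholder"),
    "ProbeSet": _query_for("probeset_placeholder"),
    "Temp": _query_for("temp_placeholder"),
}


def trait_sample_data_query(dataset_type: str,
                            sample_ids: Sequence[int]) -> Query:
    """Pick the builder for dataset_type and build the query."""
    try:
        builder = QUERY_BUILDERS[dataset_type]
    except KeyError:
        raise ValueError(f"Unknown dataset_type: {dataset_type!r}") from None
    return builder(sample_ids)
```

With a table like this, supporting a new `dataset_type` means adding one entry, and each builder can be unit-tested without a database connection.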
We can split each of the steps above into one or more methods.
To help with moving away from using classes, we can ensure that each of these methods returns the values computed/retrieved (in addition to setting the class member variables).
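
As a rough illustration of methods that both set the member variable and return the value, here is a minimal sketch. The class `DataSetSketch`, its method names, the row shape `(trait_name, sample_id, value)`, and the injected `fetch_sample_ids` / `fetch_trait_rows` callables are all assumptions made for the example; they stand in for the real class and its database access.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Row = Tuple[str, int, float]  # assumed shape: (trait_name, sample_id, value)


class DataSetSketch:
    """Hypothetical stand-in for the dataset class, not the real one."""

    def __init__(self, dataset_type: str, all_samples: Sequence[str],
                 fetch_sample_ids: Callable[[Sequence[str]], List[int]],
                 fetch_trait_rows: Callable[[str, Sequence[int]], List[Row]]):
        self.dataset_type = dataset_type
        self._all_samples = list(all_samples)
        self._fetch_sample_ids = fetch_sample_ids
        self._fetch_trait_rows = fetch_trait_rows
        self.sample_list: List[str] = []
        self.trait_data: Dict[str, List[float]] = {}

    def compute_sample_list(
            self, requested: Optional[Sequence[str]] = None) -> List[str]:
        """Set self.sample_list and also return it."""
        if requested is None:
            samples = list(self._all_samples)
        else:
            wanted = set(requested)
            samples = [s for s in self._all_samples if s in wanted]
        self.sample_list = samples
        return samples

    def retrieve_sample_ids(self, sample_list: Sequence[str]) -> List[int]:
        """Look up database ids for the given sample names."""
        return self._fetch_sample_ids(sample_list)

    def retrieve_trait_sample_data(
            self, sample_ids: Sequence[int]) -> List[Row]:
        """Fetch the raw rows for this dataset_type."""
        return self._fetch_trait_rows(self.dataset_type, sample_ids)

    def compute_trait_data(self, rows: Sequence[Row]) -> Dict[str, List[float]]:
        """Group values by trait; set self.trait_data and also return it."""
        trait_data: Dict[str, List[float]] = {}
        for trait, _sample_id, value in rows:
            trait_data.setdefault(trait, []).append(value)
        self.trait_data = trait_data
        return trait_data

    def load_trait_data(
            self, requested: Optional[Sequence[str]] = None
    ) -> Dict[str, List[float]]:
        """Chain the four steps, passing return values along explicitly."""
        samples = self.compute_sample_list(requested)
        sample_ids = self.retrieve_sample_ids(samples)
        rows = self.retrieve_trait_sample_data(sample_ids)
        return self.compute_trait_data(rows)


# Usage with in-memory fakes instead of a database.
fake_ids = {"BXD1": 1, "BXD2": 2}
ds = DataSetSketch(
    "Publish", ["BXD1", "BXD2"],
    fetch_sample_ids=lambda names: [fake_ids[n] for n in names],
    fetch_trait_rows=lambda dtype, ids: [("trait_a", i, float(i)) for i in ids],
)
result = ds.load_trait_data()
```

Because every step takes its inputs as arguments and returns its output, each method body could later be lifted into a free function with little change, and the member variables become a thin cache that can be dropped once callers use the return values.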