- Genotype database
document created on Jun 03 2022 by Arun Isaac, last updated 3 weeks ago by Pjotr Prins
Database layout
genodb is an immutable functional database built on the LMDB key-value store. An immutable database may sound like an oxymoron, but is indeed possible and practical. More precisely...
- Setting up Local Development Database
document created on Aug 19 2022 by Frederick Muriuki Muriithi, last updated on Dec 03 2023 by Pjotr Prins
...--protocol tcp -u root
```
Create a database db_webqtl_s
```
MariaDB [mysql]> CREATE DATABASE db_webqtl_s;
```
Load the small database dump into the database. You may find the small database either...
- Make xapian index rebuild conditional on database checksums
✓ issue opened on Jun 01 2023 by Arun Isaac, last updated on Dec 19 2023 by Arun Isaac; 3 of 3 tasks done
...conditional on database checksums
* assigned: arun
Currently, we unconditionally rebuild the xapian index once every day regardless of whether the database has actually changed over the last day. Not...
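A minimal sketch of the idea, assuming per-table checksums are already available as a mapping (for instance gathered with MariaDB's `CHECKSUM TABLE`; that choice is an assumption, not taken from the issue). The names `combined_checksum`, `rebuild_if_changed` and the state file are hypothetical.

```python
import hashlib
import json


def combined_checksum(table_checksums):
    """Fold per-table checksums into one digest, independent of dict order."""
    payload = json.dumps(sorted(table_checksums.items())).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def rebuild_if_changed(table_checksums, state_file, rebuild):
    """Call rebuild() only when the checksum differs from the stored one.

    state_file is a pathlib.Path holding the checksum of the last build.
    Returns True if a rebuild was triggered, False if it was skipped.
    """
    current = combined_checksum(table_checksums)
    previous = state_file.read_text().strip() if state_file.exists() else None
    if current == previous:
        return False
    rebuild()
    # Record the checksum only after the rebuild has finished without raising.
    state_file.write_text(current)
    return True
```

Writing the checksum after the rebuild means a failed rebuild is retried on the next run rather than silently skipped.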
- Database Migrations
issue opened on Oct 19 2022 by Frederick Muriuki Muriithi
Database Migrations
## Tags
* assigned:
* type: feature
* priority: high
* keywords: database migrations
* status: pending
## Description
There might need to be some form of database migration...
- MariaDB
document created on Oct 26 2021 by Pjotr Prins, last updated on Feb 20 2024 by Pjotr Prins
...of the database running on production. We do this by restoring backups of the production database into the MariaDB database directory. Here's how.
Backups are managed using Borg as the ibackup user. First...
- Invoking SQLite3: CLI
document created 7 weeks ago by Frederick Muriuki Muriithi
...modes, do:
```
$ sqlite3
SQLite version 3.40.0 2022-11-16 12:10:08
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
- Tux02 Production
issue opened on Oct 07 2021 by Pjotr Prins, last updated on Dec 14 2022 by Pjotr Prins; 17 of 22 tasks done
...borg backups
* [ ] create check list for manual testing
* [ ] look at performance
## Info
We have a protocol for updating GN2 on Tux02.
### Restore database from backup
Databases no longer get copied.
- Tests for the genodb genotype database
✓ issue opened on Aug 30 2022 by Arun Isaac, last updated on Sep 05 2022 by Alexander_Kabui
...database
Write tests for the genodb genotype database Common Lisp code.
=> https://git.genenetwork.org/GeneNetwork/cl-gn cl-gn repository which contains the genodb code
Perhaps, imitate similar tests...
- Implementing Efficient Database Caching for Query Responses in GN-LLM system
issue opened on Jan 17 2024 by Alexander_Kabui, last updated on Jan 25 2024 by Alexander_Kabui; 2 of 3 tasks done
...task aims to enhance the performance and responsiveness of our GN-LLM (Large Language Model) system by incorporating a robust database caching mechanism. The focus will be on utilizing a database...
- Developer links for
document created on May 06 2022 by Arun Isaac, last updated on Oct 16 2022 by Pjotr Prins
.../archive/dump-genenetwork-database/latest/sql.svg Continuous SQL schema visualization
=> https://ci.genenetwork.org/archive/dump-genenetwork-database/latest/rdf.svg Continuous RDF schema visualization
- Errors, defects and missing data in the database
issue opened on Oct 27 2022 by Arun Isaac, last updated on Oct 30 2022 by Arun Isaac
...database, which we try to track in this issue. These are best fixed directly in the database rather than by working around them in code.
## LRS values listed as 0.000
Some LRS values in the database...
- Move Uploader to tux02
issue opened on Mar 12 2024 by Frederick Muriuki Muriithi, last updated on Mar 12 2024 by Frederick Muriuki Muriithi
.../lib/mysql3307
socket = /var/run/mysqld/mysqld3307.sock
...
```
### SQLite
- [ ] Provide separate path for the SQLite database file
- [ ] Run migrations on SQLite database file...
- Virtuoso
document created on Aug 27 2021 by Pjotr Prins, last updated 6 weeks ago by Munyoki Kilyungi
...MySQL database
See also
=> ../RDF/genenetwork-sql-database-to-rdf
To dump data into a ttl file, first make sure that you are in the guix environment in the "dump-genenetwork-database" repository...
- Installation
document created on Jun 18 2023 by Pjotr Prins, last updated on Feb 25 2024 by Pjotr Prins
...15s
UMask=007
PrivateTmp=false
```
## Load the small database in MySQL
Currently we have two databases for deployment,
'db_webqtl_s' is the small testing database containing experiments
from BXD mice...
- MGAMMA Convert
issue opened 4 weeks ago by Pjotr Prins, last updated 2 weeks ago by Artyom Bologov; 7 of 11 tasks done
...that information
```
{type: "GRM", version:0.01, float: true, symmetric: true}
```
* [ ] Support genodb database format:
=> ../../topics/database/genotype-database See the genotype-database topic
- LMDB Phenotype/Genotype Store
document created on Jun 22 2023 by Munyoki Kilyungi, last updated on Jun 30 2023 by Alexander_Kabui
...and completeness in a data store?
* [C] guile bindings for lmdb for important stuff
* [B] Using hashes to track updates on database---proposal
Alex:
* Fetching all the phenotype data from the database...
- Full text search
✓ issue opened on Jun 30 2022 by Arun Isaac, last updated on Feb 13 2023 by Arun Isaac; 1 of 1 tasks done
...-database repo using guile-xapian.
=> https://xapian.org/ Xapian search engine library
=> https://git.genenetwork.org/arunisaac/dump-genenetwork-database dump-genenetwork-database repository...
- Database: `ProbeSetSE` Schema Bug
issue opened 8 weeks ago by Frederick Muriuki Muriithi, last updated 8 weeks ago by Frederick Muriuki Muriithi
Database: `ProbeSetSE` Schema Bug
## Tags
* type: bug
* priority: critical
* status: open
* keywords: database, mariadb, schema
* assigned:
## Description
The schemas are defined as follows...
- Deploying gn-auth
document created on Mar 04 2024 by Frederick Muriuki Muriithi, last updated on Mar 12 2024 by Frederick Muriuki Muriithi
...yoyo apply --database sqlite:////home/fredm/auth-run-migrations.db ./migrations/auth/
[20221103_01_js9ub-initialise-the-auth-entic-oris-ation-database]
Shall I apply this migration? [Ynvdaqjk?]: Y...
- Use genodb in genenetwork
issue opened on Jul 19 2022 by Arun Isaac, last updated on Mar 31 2023 by Arun Isaac
This will serve as an example based on which the team can port the rest of genenetwork to genodb.
=> https://issues.genenetwork.org/topics/genotype-database Design and use of the genodb database
- Upload GeneWiki RDF metadata to CD
✓ issue opened on Apr 17 2023 by Munyoki Kilyungi, last updated on Apr 18 2023 by Munyoki Kilyungi
...-database$ curl -I localhost:9082/sparql
curl: (7) Failed to connect to localhost port 9082: Connection refused
```
### Resolution
There was a database format mismatch due to a Virtuoso upgrade. Now...
- GeneNetwork SQL Database to RDF
document created on Mar 24 2023 by Pjotr Prins, last updated on Apr 04 2023 by Arun Isaac
GeneNetwork SQL Database to RDF
We use RDF in virtuoso to handle metadata for GN using
=> https://github.com/genenetwork/dump-genenetwork-database
See also
=> ../systems/virtuoso
- QC: Fix Integration Tests
✓ issue opened on Jan 09 2024 by Frederick Muriuki Muriithi, last updated on Feb 27 2024 by Frederick Muriuki Muriithi
...a new database for the test session, enabling the tests to run unhindered, but also without tainting the production redis databases.
### Update 2024-02-27
The system was updated to use prefixed keys...
- CLI Utility Scripts
document created on May 29 2023 by Frederick Muriuki Muriithi, last updated on May 30 2023 by Frederick Muriuki Muriithi
...for the
auth(entic|oris)ation database and the MariaDB database.
You could also run the script directly with:
```sh
python3 -m scripts.migrate_existing_data AUTHDBPATH MYSQLDBURI
```
where `AUTHDBPATH`...
- Fire up system container for GN
document created on Mar 02 2024 by Pjotr Prins, last updated 6 weeks ago by Pjotr Prins
...see there is no GN database yet.
```
/gnu/store/xj4bfqch8zs3sfzvj65ykbvnpprwaj7f-mariadb-10.10.2/bin/mysql -e 'show databases'
```
MariaDB initialized a new database in /var/lib/mysql. We need to stop...
- Materialised Views for Correlations
✓ issue opened on Oct 19 2022 by Frederick Muriuki Muriithi, last updated on Dec 19 2022 by Frederick Muriuki Muriithi
...CI/CD) database to get similar results.
The problem here is that the migration might be moot if the data is then moved out of the database, as is being planned.
### Queries to Materialise
Possible...
- Read Samples/Cases/Individuals From Database
✓ issue opened on Jan 20 2024 by Frederick Muriuki Muriithi, last updated on Feb 27 2024 by Frederick Muriuki Muriithi
...database.
This bug is even "encoded" in
=> https://gitlab.com/fredmanglis/gnqc_py/-/blob/6200a60eb6f04a5d50bfe0ad366674dc49a08119/README.org#L26 the original specifications.
> - check strain headers...
- Fix Broken UTF-8 characters in our Database
✓ issue opened on Nov 29 2022 by Munyoki Kilyungi, last updated on Aug 29 2023 by Munyoki Kilyungi
Fix Broken UTF-8 characters in our Database
## Tags
* assigned: bonfacem, arthur
* type: database
* priority: high
## Description
We have jumbled up text in our database and this has been the case...
- GeneNetwork Uploader Requirements
document created on Feb 22 2024 by Frederick Muriuki Muriithi, last updated on Feb 22 2024 by Frederick Muriuki Muriithi
...database. This implies use of a data staging area, or even a separate testing database to hold the data. There might need to be a GeneNetwork system with access to the staging area or testing database...
- Improve Menu Generation and Move it to GN3
✓ issue opened on May 16 2022 by Frederick Muriuki Muriithi, last updated on Jun 23 2022 by Frederick Muriuki Muriithi; 2 of 3 tasks done
...database within loops.
This ruins the performance of the system significantly.
The queries should be reworked, and the code should be moved to GN3 since it does database access.
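The difference between querying inside a loop and issuing one bulk query can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database to stay self-contained (GeneNetwork itself uses MariaDB), and the `trait` table and function names are hypothetical.

```python
import sqlite3


def fetch_one_by_one(conn, ids):
    """Anti-pattern: one database round trip per loop iteration."""
    rows = []
    for trait_id in ids:
        cur = conn.execute("SELECT id, name FROM trait WHERE id = ?", (trait_id,))
        rows.extend(cur.fetchall())
    return rows


def fetch_in_bulk(conn, ids):
    """Single query for all ids, using a parameterised IN clause."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    query = f"SELECT id, name FROM trait WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(query, tuple(ids)).fetchall()


def demo():
    """Build a tiny hypothetical table and run both variants on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trait (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO trait VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
    )
    return fetch_one_by_one(conn, [1, 3]), fetch_in_bulk(conn, [1, 3])
```

Both functions return the same rows; the bulk variant simply replaces N round trips with one.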
### TODOs
* [ ] Rework...
- GeneNetwork Hacking Documentation
document created on Mar 11 2022 by Frederick Muriuki Muriithi, last updated on Jul 18 2022 by Frederick Muriuki Muriithi
...2022-07-18
### Platforms
Stored in the *GeneChip* table in the database.
* TODO: Elaborate what these are once you understand them
### Groups
These are in the *InbredSet* table in the database.
* What...
- Clean up Authorisation
✓ issue opened on Nov 19 2021 by BonfaceKilz, last updated on Dec 04 2023 by Frederick Muriuki Muriithi; 4 of 4 tasks done
...extra value
* [x] Fetch complete list of samples from database and genotype file
instead of only fetching that list from the database. Look at trait
page for reference.
* [X] Extend idea of csv...
- Databases Getting Out of Wack
issue opened on Mar 03 2022 by jgart, last updated on Mar 16 2023 by Pjotr Prins
...database, gemma, reaper
## Let's use Gemma instead of Reaper
Zachary:
> If we're using GEMMA, we'll need to recalculate all other trait Max LRS scores using
> GEMMA as well (so I think we should just...
- Phenotype Correlation Error
✓ issue opened on Sep 28 2022 by Zachary Sloan, last updated on Nov 28 2023 by Frederick Muriuki Muriithi
.../production/gene/wqflask/base/trait.py", line 599, in retrieve_trait_info
raise KeyError(repr(trait.name)
KeyError: "'1422223_at' information is not found in the database."
```
so far, triangulated...
- Precompute steps
document created 9 days ago by Pjotr Prins, last updated 25 hours ago by Pjotr Prins
...database and can be loaded into a SQL database on demand. This is all to be able to distribute data and make sure we only compute once.
At this point we can write
```
{"2":9.40338,"3":10.196,"4"...
- Restore backup
document created on Feb 07 2023 by Pjotr Prins, last updated on Apr 25 2023 by Munyoki Kilyungi
...large, and we'll muck it out at some point. The total mariadb database at this point is 430Gb.
### Move DB in place
To move the new DB in place we first have to stop mariadb, move the old database out...
- My Software Development Journey so far,
document created on Dec 04 2023 by fetche-lab, last updated on Jan 04 2024 by Lisso_
...database. This presents itself as a window of opportunity to improve the functionality of the uploader, where a user can directly update the names when discovering them to be missing in the database.
- Quality Control of Data in Uploaded R/qtl2 Bundles
issue opened on Feb 02 2024 by Frederick Muriuki Muriithi, last updated on Feb 16 2024 by Frederick Muriuki Muriithi; 10 of 15 tasks done
...database, prior to attempting to parse the file and load data into the database
* [ ] If listed samples/cases do not exist in database, verify they are all listed in the "geno" file(s)
### [ ] phenose...
- GN1 Time machines
issue opened on Oct 07 2021 by Pjotr Prins, last updated on Jul 01 2022 by Arun Isaac
...databases, source code and etc files to set up the containers. Start with the most recent one and see if you can get that to run on Penguin2. After that we'll do the others. The databases are named...
- Data Uploads: Zero Representation
✓ issue opened on Feb 08 2022 by BonfaceKilz, last updated on Nov 11 2022 by Munyoki Kilyungi
...gn2 the value still remains as "x" and isn't updated in the database.
Also, ATM you cannot edit a value to "x", which is similar to removing
a field in the database.
#### Wed 25 May 2022 22:01:44 EAT...
- Clean Up
issue opened on Jan 02 2022 by Pjotr Prins, last updated on Jul 01 2022 by Arun Isaac
...database administration
* keywords: database, mariadb
## Description
Find all larger tables
```
SELECT TABLE_SCHEMA,TABLE_NAME,DATA_LENGTH FROM information_schema.TABLES WHERE DATA_LENGTH>10000...
- Support searching using SNP names
issue opened on Feb 15 2023 by Arun Isaac
...SNP name in an external database and resolve it to coordinates before searching. This is a needless extra step and can be automated.
Implementing this will require us to have a database (perhaps dbSNP)...
- Dump GeneWiki data
✓ issue opened on Mar 30 2023 by Munyoki Kilyungi, last updated on Apr 17 2023 by Munyoki Kilyungi
...GeneWiki data comes from the GeneRIF database maintained by NCBI. In GeneNetwork, this is stored in GeneRIF_BASIC. [Authorised] Users of GN can add their own entries so that they are associated...
- Add HTML Page for ProbeSet Page
✓ issue opened on Dec 14 2023 by Munyoki Kilyungi, last updated on Jan 09 2024 by Munyoki Kilyungi
...were url-encoded to form valid urls. See: 4a62e1781692, 56d09222742c, and 03df1227c419 from:
=> https://git.genenetwork.org/gn-transform-databases/ gn-transform-databases
Other Relevant PR's/Commits...
- Login issues with gn-auth
issue opened on Mar 01 2024 by Pjotr Prins, last updated on Mar 02 2024 by Frederick Muriuki Muriithi
...52 cursor.execute("INSERT INTO users VALUES (?, ?, ?)",
2024-03-02 01:53:52 sqlite3.OperationalError: attempt to write a readonly database
```
Looks like the container cannot write to the database.
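The same error message can be reproduced outside the container, which may help when checking a fix. The sketch below opens a database read-only through an SQLite URI and attempts an insert. The three-column `users` table is hypothetical (the log above only shows three placeholders), and a read-only URI is just one way to trigger the error; inside a container, missing write permission on the database file or its directory is a likely cause, though that is an assumption here.

```python
import os
import sqlite3
import tempfile


def try_write_readonly():
    """Return the error message raised when writing via a read-only connection."""
    path = os.path.join(tempfile.mkdtemp(), "auth.db")
    # Create the database with a normal read-write connection.
    rw = sqlite3.connect(path)
    rw.execute("CREATE TABLE users (id TEXT, email TEXT, name TEXT)")
    rw.commit()
    rw.close()
    # Reopen the same file read-only; writes must now fail.
    ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        ro.execute("INSERT INTO users VALUES (?, ?, ?)", ("1", "a@example.org", "A"))
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        ro.close()
    return None
```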
- Minor Phenotype Page UI updates
✓ issue opened on Nov 30 2023 by Munyoki Kilyungi, last updated on Dec 01 2023 by Munyoki Kilyungi; 6 of 6 tasks done
...-logP"(LOD-Score), peak-location, effect-size
Resolved in: 77f9036298e8, abe23c624c66, a850acb21152, 7b2a0e1be7d8 in:
=> https://git.genenetwork.org/gn-transform-databases/tree/ gn-transform-databases...
- Add Metadata To The Trait Page (RDF)
document created on Sep 28 2022 by Munyoki Kilyungi, last updated on Dec 03 2023 by Pjotr Prins
...Trait Page (RDF)
Fri 30 Sep 2022 11:48:41 EAT
## Introduction
We are migrating the GN2 relational database to a plain text and RDF database. Matrix-like data (E.g. fetching sample data for a given...
- ProbeSetData
issue opened on Dec 31 2021 by Pjotr Prins, last updated on Mar 13 2023 by Pjotr Prins; 15 of 17 tasks done
...the database to a new partition:
```
root@tux01:/export4/local/home/mariadb/database/db_webqtl# rsync -vaP /var/lib/mysql/db_webqtl/* . --delete --bwlimit=20M
```
Note I throttle the speed because the...
- Add mouse data-set
issue opened on Jun 30 2022 by BonfaceKilz, last updated on Apr 18 2023 by Munyoki Kilyungi; 0 of 4 tasks done
...-database/blob/master/csv-dump.scm
Remaining tasks:
* [ ] Share latest changes.
* [ ] Test the script in a copy of the production database.
* [ ] Make this more generic
* [ ] Integrate with GN2
The...
- Hanging database
issue opened on Dec 21 2021 by Pjotr Prins, last updated on Mar 11 2023 by Pjotr Prins
...'hanging'
In the last 12 hours GN2 monitoring shows the website is responding intermittently. A quick check shows the database is blocking. Rather than simply restarting the database - which is known...
- Authentication/authorisation design
document created on Oct 17 2022 by Pjotr Prins, last updated on Dec 04 2023 by Frederick Muriuki Muriithi
...manageable. Any changes to the privileges shall require a system redeployment.
## Other Implementation Concerns
* Local database should be independent from other services and copied as a file (SQLite...
- Quality Control Project
✓ issue opened on Nov 19 2021 by Arthur Centeno, last updated on Jul 20 2022 by Frederick Muriuki Muriithi
...able to use it as
something that is in the database. GN1 does some of that. This is
where Arun comes in - we need to have a common handler for data that
is in the database and data that is in escrow.
- Editing Case-Attributes
document created on Jul 07 2023 by Frederick Muriuki Muriithi, last updated on Mar 13 2024 by Frederick Muriuki Muriithi
...endpoints.
## Database
The existing database tables of concern to us are:
* InbredSet
* CaseAttribute
* StrainXRef
* Strain
* CaseAttributeXRefNew
We can fetch case-attribute data from the database...
- Improving Metadata Audit
document created on Jul 25 2023 by Frederick Muriuki Muriithi, last updated on Aug 11 2023 by Frederick Muriuki Muriithi
...some interesting opportunities, e.g. showing when a trait was last edited and by whom.
## Notes
### Saving Diffs in Database
It turns out, we only store diffs in the database that have been approved.
- Wrong CSV in ITP_10001 longevity dataset
✓ issue opened on Apr 11 2022 by BonfaceKilz, last updated on Apr 12 2022 by BonfaceKilz
...database, some characters are inserted with control sequences that need to be stripped out. Here's a current snippet of what that looks like:
```
JL00005,896.000000,x,x,896,4/22/04,,4OHPBN_J,Oct,,0^M,M,JL...
- Editing Metadata [Improvements to Make]
✓ issue opened on Mar 31 2022 by BonfaceKilz, last updated on Apr 18 2023 by Munyoki Kilyungi; 7 of 11 tasks done
...Published Database*
* [X] Hmm, this header is not very good-- *Edit Trait for Published Database.* a. We mean the word "edit" as a verb, but status of this word is ambiguous. b. Published Database...
- Some correlations running very slowly
issue opened on Mar 06 2023 by Zachary Sloan, last updated on Mar 07 2023 by Pjotr Prins
...database
## Description
Some correlations (it specifically seems to be ones done against ProbeSet databases) are running extremely slowly.
After looking into this, the cause seemed to be a specific...
- Partial Correlation
document created on Oct 15 2021 by Frederick Muriuki Muriithi, last updated on May 16 2022 by Frederick Muriuki Muriithi
...all and Add in search results.
Pick 3 and hit 'Partial'
Put one each in X, Y and Z columns
And compute against database (lower half).
That gives you a list of hits.
## Members
* fredm
* pjotp
* alex...
- Adding Species
document created 3 weeks ago by Frederick Muriuki Muriithi, last updated 3 weeks ago by Frederick Muriuki Muriithi
...nlm.nih.gov/ Go to NCBI
* In the "All Databases" drop-down, select "Taxonomy"
* In the search box, enter the species' FullName, e.g. Caenorhabditis elegans
* Click "Search"
* If the species exists...
- GN2 Time Machines
issue opened on Aug 19 2022 by Pjotr Prins, last updated on Sep 05 2022 by Alexander_Kabui; 4 of 10 tasks done
...'show databases'
```
### Mariadb database from backup
We have daily incremental backups on P2, Tux02 and Epysode. First restore the files with
```
. ~/.borg-pass
cd /export2/tux01-restore
borg extract...
- Precompute mapping input data
document created on Mar 20 2023 by Pjotr Prins, last updated 9 days ago by Pjotr Prins
...about how locations are stored. We don't actually
> database locations in the ProbeSetXRef table - we only database the
> peak Locus marker name. This is then cross-referenced against the Geno...
- Capture Data on the BXDs in RDF
✓ issue opened on Mar 23 2022 by Frederick Muriuki Muriithi, last updated on Oct 12 2022 by Munyoki Kilyungi
...metadata from RDF
Work on dumping RDF has already been done in:
=> https://github.com/genenetwork/dump-genenetwork-database dump-genenetwork-database
Also, vector/matrix data should be put in lmdb...
- AI Community Symposium at iHub
document created on Jan 26 2023 by Brian Muhia, last updated on Jan 30 2023 by Pjotr Prins
...entails converting a traditional SQL database, comprising over 80 tables, to RDF, the language of the semantic web. The goal of this conversion is to leverage the benefits of RDF databases, which are...
- Guix system containers and how we use them
document created on Mar 31 2023 by Arun Isaac, last updated 7 weeks ago by Pjotr Prins
...are completely ephemeral, that is, they have no persistent state. But often, we have services that need to retain some state. Think a database server needing to persist its database directory, a web...
- ProbeSE
issue opened on Dec 30 2021 by Pjotr Prins, last updated on Jul 01 2022 by Arun Isaac
...database, mariadb, innodb, ProbeSE
## Description
Zach pointed out that ProbeSE is used on GN1 with
=> http://gn1.genenetwork.org/webqtl/main.py?FormID=showProbeInfo&database=HC_M2_0606_P&ProbeSetID...
- Upload Strains
✓ issue opened on Dec 06 2023 by Frederick Muriuki Muriithi, last updated on Feb 27 2024 by Frederick Muriuki Muriithi; 5 of 5 tasks done
...for the strains (think InbredSet)
* [x] UI to select the CSV file with the strains data
* [x] UI to select the way to interpret the CSV file
* [x] Code to insert the new strains into the database
- Global search does not close connections properly (and is slow)
✓ issue opened on Mar 23 2022 by Frederick Muriuki Muriithi, last updated on Oct 14 2022 by Arun Isaac
...generating 6Mb of log file info.
In fact, every row in the table has a SQL query that does not close the connection properly.
## Resolution
The new xapian search does not use the SQL database at...
- Remove everything elastic search
✓ issue opened on Oct 22 2021 by Pjotr Prins, last updated on Apr 01 2022 by BonfaceKilz
Remove everything elastic search
We are no longer using that database
Seems related to
=> ../issues/remove-elastic-search...
- LMM precomputed scores
issue opened on Mar 16 2023 by Pjotr Prins, last updated 9 days ago by Pjotr Prins; 0 of 1 tasks done
Interestingly, this ties in with our xapian search and fast querying of value ranges.
# Tags
* assigned: pjotrp
* priority: high
* type: bug, enhancement
* status: ongoing
* keywords: database, gemma...
- Add Cis-Trans plot
issue opened on Apr 07 2023 by Pjotr Prins
See also
=> ../../topics/systems/mariadb/precompute-mapping-input-data.gmi
# Tags
* assigned: zsloan, pjotrp
* priority: medium
* type: enhancement
* status: unclear
* keywords: database, gemma
- Troubleshoot CD Menu Failure
✓ issue opened on Apr 21 2023 by Munyoki Kilyungi, last updated on Apr 26 2023 by Munyoki Kilyungi
...in CD fails. This is because the database in CD is out of sync with the one in production. In particular:
```
2023-04-21 11:54:41 MySQLdb._exceptions.OperationalError: (1054, "Unknown column 'Family'...
- Delete Rejected Diffs from Database
✓ issue opened on Jul 25 2023 by Frederick Muriuki Muriithi, last updated on Jul 25 2023 by Frederick Muriuki Muriithi
...Database
## Tags
* type: feature request
* status: closed
* assigned: fredm
* keywords: editing, metadata audit
* priority: high
## Description
The rejected diffs will be maintained, but will simply...
- Do Bulk Query for Correlation Results' Display
✓ issue opened on Oct 21 2022 by Frederick Muriuki Muriithi, last updated on Oct 24 2022 by Frederick Muriuki Muriithi
.../wqflask/wqflask/correlation/show_corr_results.py#L112-L220 This loop
in lines 118 to 120 (call to `create_trait(...)`) queries the database at least once every iteration, which leads to performance...
- DOL group mapping issues
✓ issue opened on Apr 07 2022 by Zachary Sloan, last updated on Apr 20 2022 by Zachary Sloan
...addressed yet with displaying mitochondrial markers in the GN2 figure. GEMMA is outputting the results, so I suspect this is because mitochondria isn't included in the databased list of chromosomes...
- Profiling Python code
document created on Dec 03 2023 by Pjotr Prins
...define how to connect to the database.
* `the-script.py` is the name of the python script to be run under the profiler
The output can be redirected, e.g.
* env [various-env-vars] python3 -m cProfile...
- This document has some useful SQL tricks
document created on Aug 08 2023 by Munyoki Kilyungi, last updated on Aug 11 2023 by Munyoki Kilyungi
- tux01 running out of RAM
✓ issue opened on Sep 03 2022 by Pjotr Prins, last updated 13 days ago by Pjotr Prins; 1 of 6 tasks done
...(2.523 sec). So it is fine now. It might be that on reboot the table got fixed, but we'll check the tables anyway. First take a look at the state of the engine itself as described in
=> ../database-...
- Fetch trait names for phenotypes
issue opened on Mar 22 2022 by Frederick Muriuki Muriithi
...database.
### Describe the solution you'd like
For example, downloading the data as CSV via the web interface includes the following header information:
```
Record ID,10620
Symbol,WMZTrgtQuadTime...
- MySQLdb._exceptions.OperationalError: (1040, 'Too many connections')
✓ issue opened on Aug 29 2022 by Munyoki Kilyungi, last updated on Sep 05 2022 by Alexander_Kabui; 2 of 2 tasks done
...(1040, 'Too many connections')
## Tags
* assigned: bonfacem, fredm, aruni
* type: bug
* keywords: mysql, database
## Tasks
* [x] Figure out root cause
* [x] Send patch
## Description
See the stack...
- Update production
document created on Dec 07 2022 by Pjotr Prins, last updated on Dec 09 2022 by Pjotr Prins
...database in a file 'data.ttl' we can test it for correctness with:
```
tux01:~$ rapper --input turtle --count dump.ttl
rapper: Parsing URI file:///home/wrk/dump.ttl with parser turtle
rapper: Parsing...
- Rework Fetching Settings
✓ issue opened on Sep 25 2022 by Munyoki Kilyungi, last updated on Oct 11 2023 by Munyoki Kilyungi
...settings from wqflask.database, appropriately called "get_setting". Perhaps, when this task is being worked on, we should move that "get_setting" to a more appropriately named module.
Getting rid...
- Capture state of phenotypes in a HASH
issue opened on Mar 23 2022 by Frederick Muriuki Muriithi
.../551 From GitHub
To represent the state of the database we need to start using HASH values or UUIDs. For phenotypes we should create these for phenotype columns within a dataset - i.e. the column one...
- Edit OAuth2 Clients
✓ issue opened 3 weeks ago by Frederick Muriuki Muriithi, last updated 2 weeks ago by Frederick Muriuki Muriithi
...recent updates to use JWT in place of simple "AuthorizationCode" tokens, we needed to update the database to ensure the OAuth2 clients had the appropriate grant types set up.
It turns out, at least...
- Simplify `dataset.py` in GeneNetwork2
issue opened on Sep 13 2022 by Frederick Muriuki Muriithi
.../wqflask/base/data_set.py#L740-L832
and split it into various chunks, that
* compute the `self.sample_list`
* retrieve `sample_ids` values from the database using the `self.sample_list` values computed...
- Upload probeset metadata
✓ issue opened on Jun 19 2023 by Munyoki Kilyungi, last updated on Jun 26 2023 by Munyoki Kilyungi
...-database$ ls -lah data/
total 3.2G
drwxr-xr-x 2 bonfacem bonfacem 4.0K Jun 16 08:04 .
drwxr-xr-x 9 bonfacem bonfacem 4.0K Jun 15 07:04 ..
-rw-r--r-- 1 bonfacem bonfacem 1.2G Jun 16 07:48 dump-...
- Fix load-rdf.scm script
✓ issue opened on Apr 05 2023 by Munyoki Kilyungi, last updated on Apr 05 2023 by Munyoki Kilyungi
...-pipe _ _ _ . _)
load-rdf.scm:117:24: In procedure call-with-pipe:
Invocation of program failed ("isql")
```
See the following for the fix:
=> https://github.com/genenetwork/dump-genenetwork-database/...
- Rewrite qc and qc-uploads in Python3
✓ issue opened on Apr 11 2022 by Frederick Muriuki Muriithi, last updated on Nov 28 2023 by Frederick Muriuki Muriithi
...to the database, to link it properly.
### Answers to Questions
#### Question 01
The first field will be treated as text, and will not undergo any verification
#### Question 02
The line-endings will...
- Data Upload Process
document created 2 weeks ago by Munyoki Kilyungi, last updated 2 weeks ago by Munyoki Kilyungi
...Study for Breast Cancer Dataset".
# Challenges Faced and Solutions
During the data upload process, I encountered several challenges that required solutions. One challenge was identified as a database...
- Correlations fail for at least some ProbeSet datasets (as the target dataset)
✓ issue opened on Oct 17 2022 by Zachary Sloan, last updated on Nov 22 2022 by Frederick Muriuki Muriithi
...the database, leading to issues with the final path.
In this case, the dataset name used to generate the file is (** Note the forward slash **):
```
EPFL/ETHZ BXD Liver Proteome CD-HFD (Nov19)
```
The...
- Queries for fetching/editing metadata
document created on Jul 22 2023 by Munyoki Kilyungi
- Temp traits don't seem to be handled by the authorization system
✓ issue opened on Jul 28 2023 by Zachary Sloan, last updated on Feb 27 2024 by Frederick Muriuki Muriithi
...publicly readable. This is necessary since the "Temp" traits are not attached to any resources. It is also because unlike all the other traits, "Temp" traits are not saved in the database, rather...
- Modifying dump macros
document created on Jul 01 2023 by Munyoki Kilyungi
- Running postgres in a Guix container
document created on Jan 17 2023 by Pjotr Prins
...this
```
. ~/opt/postgresql14/etc/profile
psql test
\dt
etc etc
```
## More
=> https://fluca1978.github.io/2021/09/30/GNU_GUIX_PostgreSQL.html
=> https://guix.gnu.org/cookbook/en/html_node/A-Database-...
- Slow Correlations and UI crashes
✓ issue opened on Oct 18 2021 by BonfaceKilz, last updated on Jul 08 2022 by Alexander_Kabui
According to Rob, GN1 does not rely on a cache. Instead it is
computing from a materialized view of the database that is
intentionally designed for a fast web service.
# Notes
### Tue, 12 April 2022...
- Autogenerate documentation: trees, and labels
✓ issue opened on Jun 23 2023 by Munyoki Kilyungi, last updated on Oct 11 2023 by Munyoki Kilyungi
...NextGenDatabases
* priority: high
* keywords: RDF, GNSOC2023
See this
=> https://github.com/genenetwork/dump-genenetwork-database/pull/11
Given an s-expression say:
```
(define-dump dump-species...
- Utility Scripts
document created on Jun 05 2023 by Frederick Muriuki Muriithi, last updated on Dec 03 2023 by Pjotr Prins
...utility scripts manually to set up certain things that do not lend themselves to automation very well.
This is especially relevant for any script that might need to interact with the SQLite database.
- Genenetwork3 Effective UID
✓ issue opened on Jun 05 2023 by Frederick Muriuki Muriithi, last updated on Jun 09 2023 by Frederick Muriuki Muriithi
...KeyError: 'getpwuid(): uid not found: 1000'
2023-06-05 03:46:38
2023-06-05 03:46:48 [2023-06-05 03:46:48,918] ERROR in errors: unable to open database file
2023-06-05 03:46:48 unable to open database...
- Ontologies
document created on Jul 31 2023 by Munyoki Kilyungi, last updated on Oct 11 2023 by Munyoki Kilyungi
- Orchestration and fallbacks
document created on Sep 02 2022 by Pjotr Prins, last updated on Oct 25 2022 by Pjotr Prins
...Partial synchronization between data sources
The only way we *can* scale is by adding machines. But the system is not yet ready for that. Also getting rid of monolithic primary databases in favor...
- Cool Interfaces We Should Emulate
issue opened on Apr 13 2022 by BonfaceKilz, last updated on Apr 14 2022 by BonfaceKilz
...tool, or manually?
[Dave] All plots/visualizations/tools in the Mouse Phenome Database are dynamic and interactive, written in either D3.js or with HighCharts, depending on the complexity of the plot.
- Xapian search
document created on May 02 2023 by Arun Isaac, last updated on Dec 03 2023 by Pjotr Prins
...It retrieves data using several SQL queries and indexes them to build the index. Due to the enormous size of the GeneNetwork database, this is quite an expensive operation and relies on various tricks...
- Migrate User Accounts from Redis to new Auth DB
✓ issue opened on Dec 22 2022 by Frederick Muriuki Muriithi, last updated on May 22 2023 by Frederick Muriuki Muriithi
...to register anew and their access details reconfirmed.
--------------------
Currently, on GN2, user details are stored in Redis. We need to migrate these to the new auth database (SQLite3) in order to...
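A minimal sketch of what such a migration step could look like, assuming user details are held as JSON strings keyed by user ID. The table name `users`, its columns, and the JSON field names `email_address` and `full_name` are hypothetical and only for illustration; a plain dict stands in for the Redis data so the example runs without a Redis server.

```python
import json
import sqlite3


def migrate_users(redis_users, conn):
    """Copy user records from a Redis-style mapping into a SQLite table.

    `redis_users` maps user IDs to JSON strings (hypothetical layout).
    Returns the number of rows in the users table after the migration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users("
        "user_id TEXT PRIMARY KEY, email TEXT UNIQUE NOT NULL, name TEXT)"
    )
    rows = []
    for user_id, raw in redis_users.items():
        details = json.loads(raw)
        rows.append((user_id, details["email_address"], details.get("full_name")))
    # One transaction for the whole batch; INSERT OR IGNORE makes reruns harmless.
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO users(user_id, email, name) VALUES (?, ?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Stand-in for the data held in Redis (assumed shape, not the real key layout).
sample = {
    "u1": json.dumps({"email_address": "a@example.org", "full_name": "A"}),
    "u2": json.dumps({"email_address": "b@example.org"}),
}
conn = sqlite3.connect(":memory:")
migrated = migrate_users(sample, conn)
```

Running the migration a second time leaves the table unchanged, which is convenient if the step has to be retried.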
- GNSoC 2023
document created on Jun 21 2023 by Pjotr Prins, last updated on Aug 24 2023 by Pjotr Prins
...Nextgen databases
lmdb+RDF
* lead: Bonface
* team: Fred, Alex
* contact: Pjotr
git repo genenetwork3
=> ../../topics/next-gen-databases/design-doc Design doc
### Week 1
* RDF dumps
* Parsing S-exp...
- Adding Quantitative Tracks Using BigWig Files
document created on Nov 14 2023 by cel7t, last updated on Dec 03 2023 by Pjotr Prins
...chrom.sizes files for the database
Use the fetchChromSizes binary to create chrom.sizes files for the existing wig files
=> http://hgdownload.soe.ucsc.edu/admin/exe/ fetchChromSizes binary location...
- ProbeData
issue opened on Dec 30 2021 by Pjotr Prins, last updated on Mar 14 2023 by Arun Isaac
...database, mariadb, innodb, ProbeData
## Description
Probe level data is used to examine the correlation structure among the
N probes that have the same nominal target. Sometimes several probes
are...
- Xapian indexing
document created on Oct 30 2022 by Arun Isaac, last updated on Dec 03 2023 by Pjotr Prins
...to the enormous size of the GeneNetwork database, indexing it in a reasonable amount of time is a tricky process that calls for careful identification and optimization of the performance bottlenecks.
- Fallbacks and backups
issue opened on Aug 31 2021 by Pjotr Prins, last updated on Apr 25 2023 by Munyoki Kilyungi; 19 of 24 tasks done
...too. Incremental copies work with rsync - so that is fast. To restore the full MariaDB database from a local borg repo takes a few minutes:
```
wrk@epysode:/export/restore_tux01$ time borg extract -v...
- Error when fetching SNPs in a search page
✓ issue opened on Sep 28 2022 by Munyoki Kilyungi, last updated on Apr 18 2023 by Munyoki Kilyungi
...local database, you get an error because the "RatSnpPattern" table does not exist:
```
ERROR:wqflask:http://localhost:5004/snp_browser?first_run=true&species=mouse&gene_name=BG976607&limit_strains...
- MariaDB: Move to InnoDB Engine
issue opened on Dec 28 2021 by Pjotr Prins, last updated 3 weeks ago by Pjotr Prins; 0 of 8 tasks done
...database, mariadb, innodb
## Report
With the SQL database we need to move from MyISAM to InnoDB format,
mostly to stop the problem of full table locks. Also I expect the
occasional crashes we see to...
- Add to Collection Error
✓ issue opened on Oct 09 2023 by Frederick Muriuki Muriithi, last updated on Nov 28 2023 by Frederick Muriuki Muriithi
..."Calculate Correlations" accordion
* Set Method = "sample r"
* Set Database = "GTEXv8 Human Kidney-Cortex RNA-Seq (Feb20) TPM log2"
* Set "Limit to" = 500
* Set Samples = GTEx_v8
* Set Type = Pearson...
- Genewiki conversion
issue opened on Aug 26 2022 by Pjotr Prins, last updated on Sep 05 2022 by Alexander_Kabui; 0 of 3 tasks done
...to (1) migrate the existing genewiki data in the database to named markdown documents in that repository and (2) create a rendered page that is found through
=> https://genenetwork.org/doc/genes/BRCA2...
- Phenotype Naming Conventions
document created on Nov 23 2022 by Munyoki Kilyungi, last updated on Dec 03 2023 by Pjotr Prins
...IMPOSE. For better or for worse, we are apparently one of the major curators for formats for phenotype abbreviations. Perhaps we need to formalize this with the Phenome Database team.
Given the above...
- Understanding GN's Classification Scheme
document created on Aug 29 2023 by Munyoki Kilyungi, last updated on Aug 31 2023 by Munyoki Kilyungi
- Queries and Prepared Statements in Python
document created on Sep 27 2022 by Frederick Muriuki Muriithi, last updated on Dec 03 2023 by Pjotr Prins
- Handling Tissue in Uploader
issue opened 7 weeks ago by Frederick Muriuki Muriithi, last updated 7 weeks ago by Frederick Muriuki Muriithi; 2 of 2 tasks done
...is via the `ProbeFreeze`
table that refers to the `InbredSet` table that then refers to the `Species`
table. Even with that, on the **Tux02** database, we have 48 tissues that are
not connected to any...
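The reference chain named above (`ProbeFreeze` to `InbredSet` to `Species`) can be walked in a query to list tissues that have no path through it. The sketch below is illustrative only: it uses an in-memory SQLite database as a stand-in for the real MariaDB instance, and the `Tissue` table plus all column names (`TissueId`, `InbredSetId`, `SpeciesId`) are assumptions, not taken from the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema with assumed column names, for illustration only.
conn.executescript("""
CREATE TABLE Species(Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE InbredSet(Id INTEGER PRIMARY KEY, SpeciesId INTEGER);
CREATE TABLE Tissue(Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ProbeFreeze(Id INTEGER PRIMARY KEY, TissueId INTEGER, InbredSetId INTEGER);
INSERT INTO Species VALUES (1, 'mouse');
INSERT INTO InbredSet VALUES (10, 1);
INSERT INTO Tissue VALUES (100, 'Liver'), (101, 'Orphan tissue');
INSERT INTO ProbeFreeze VALUES (1000, 100, 10);
""")

# A tissue is "unlinked" when no ProbeFreeze row leads from it,
# via InbredSet, to a Species row.
UNLINKED = """
SELECT t.Id, t.Name FROM Tissue AS t
WHERE NOT EXISTS (
  SELECT 1 FROM ProbeFreeze AS pf
  JOIN InbredSet AS i ON pf.InbredSetId = i.Id
  JOIN Species AS s ON i.SpeciesId = s.Id
  WHERE pf.TissueId = t.Id)
ORDER BY t.Id
"""
unlinked = conn.execute(UNLINKED).fetchall()
```

With the toy data, only the tissue that has no `ProbeFreeze` row is returned.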
- Fire up system container for GN-QA System
document created 3 days ago by Munyoki Kilyungi, last updated 3 days ago by Munyoki Kilyungi
..."/tmp"))
(environment-variable
(name "AUTHLIB_INSECURE_TRANSPORT")
(value "true"))))
(mappings (list database-mapping...
- slow text search query
issue opened on Mar 12 2023 by Pjotr Prins
- Designing an issue tracker on gemini
✓ issue opened on Jul 25 2021 by Pjotr Prins, last updated on Feb 02 2022 by Pjotr Prins
...-threads/blob/main/issues/database-not-responding.gmi
## Process
We leverage git to pull out dates and people contributions (pjotrp wrote ...) for display in a document/web page. This page is generated...
- Export Uploaded Data to LMDB and RDF Stores
issue opened on Nov 03 2023 by Frederick Muriuki Muriithi, last updated on Nov 14 2023 by Frederick Muriuki Muriithi; 0 of 6 tasks done
.../genenetwork3/pull/130 2: Munyoki's Pull request
=> https://github.com/BonfaceKilz/gn-dataset-dump 3: Dataset -> LMDB export repository
=> https://github.com/genenetwork/dump-genenetwork-database 4...
- R/qtl JSONDecodeError
✓ issue opened on Dec 23 2022 by Pjotr Prins, last updated on Dec 24 2022 by Pjotr Prins; 3 of 3 tasks done
.../lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/home/gn2/gn3_production/genenetwork3/gn3/db_utils.py", line 55, in xapian_database
db = xapian.Database(path)...
- Annotate traits page with metadata from RDF
✓ issue opened on Sep 30 2022 by Munyoki Kilyungi, last updated on Dec 15 2022 by Munyoki Kilyungi; 16 of 16 tasks done
...have accession id's
* [X] Refactor the dataset fetch fn in GN3 to use the Maybe Monad
* [X] Write tests for the above
* [X] Test on test database upstream - if this is set-up
* [X] Submit patches...
- Reasons There is HTML and CSS in GN3
document created on Jul 26 2023 by Frederick Muriuki Muriithi
...to the database, but you would still need the user to authenticate themselves (to prevent randos from registering clients willy-nilly).
## Footnotes
=> https://oauth.net/2/grant-types/ fn:grant-types...
- Backup Drops
document created on Oct 27 2022 by Pjotr Prins, last updated on Feb 18 2024 by Pjotr Prins
...proves pretty resilient over time. Only on the Synology server I can't get it to work because of some cron permission issue.
# Tags
* assigned: pjotrp
* keywords: systems, backup, sheepdog, database...
- Automated Testing
document created on Oct 12 2022 by Munyoki Kilyungi, last updated on Dec 03 2023 by Pjotr Prins
...(among others):
* Each API endpoint responds within a specified amount of time
* Select computation-heavy functions respond within a specified amount of time for given data
* Database-querying...
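A time-bound check of the kind listed above could be sketched as follows. The helper `assert_responds_within` and the function `sample_computation` are hypothetical names introduced for illustration; this is one possible shape for such a test, not the project's actual test harness.

```python
import time


def assert_responds_within(func, limit_seconds, *args, **kwargs):
    """Call func and fail if it takes longer than limit_seconds."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    assert elapsed <= limit_seconds, (
        f"{func.__name__} took {elapsed:.3f}s, limit is {limit_seconds}s"
    )
    return result


def sample_computation(n):
    """Stand-in for a computation-heavy function under test."""
    return sum(i * i for i in range(n))


total = assert_responds_within(sample_computation, 1.0, 1000)
```

The same helper could wrap an HTTP request to an API endpoint; wall-clock limits should be generous enough to tolerate a loaded CI machine.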
- Automated Testing
✓ issue opened on Feb 10 2022 by Frederick Muriuki Muriithi, last updated on Oct 12 2022 by Munyoki Kilyungi; 0 of 10 tasks done
...(among others):
* Each API endpoint responds within a specified amount of time
* Select computation-heavy functions respond within a specified amount of time for given data
* Database-querying...
- Fetch trait data using genofiles
issue opened on Jul 11 2023 by Alexander_Kabui, last updated on Jul 13 2023 by Alexander_Kabui; 2 of 3 tasks done
...database does not have all genotype files when fetching sample data use genotypes to fetch trait data given a dataset and the trait
Having fetched the sample names of a given group from the genofiles...
- OAuth2
document created on May 29 2023 by Frederick Muriuki Muriithi, last updated on Jun 07 2023 by Frederick Muriuki Muriithi
...database. It does perform some quality-control on the data before upload. Currently, this application is not available to the general public due to the potential to mess up the data. Once the auth...
- Editing Data
document created on Nov 23 2021 by BonfaceKilz, last updated on Jul 25 2023 by Frederick Muriuki Muriithi
...-u webqtlout db_webqtl < metadata_audit.sql
```
And check this works
```
select * FROM information_schema.COLUMNS WHERE table_schema=DATABASE() AND TABLE_NAME='metadata_audit';
```
For everything to...
- Developing against GeneNetwork
document created on Jun 18 2023 by Pjotr Prins, last updated on Dec 03 2023 by Pjotr Prins
...-pjotr.genenetwork.org/api/v_pre1/gen_dropdown
```
Check the logs. If you see ERROR 1054 (42S22): Unknown column
'InbredSet.Family' in 'field list', you may be running against the small
database.
### Run...
- ProbeSetXRef
issue opened on Dec 29 2021 by Pjotr Prins, last updated on Jul 01 2022 by Arun Isaac
...database, mariadb, innodb
* type: enhancement, documentation
* assigned: pjotrp
* status: unclear
* priority: medium
## Table ProbeSetXRef
Juggling indexes and transforming to InnoDB led to a massive...
- Migrate GN1 Clustering
document created on Jul 18 2021 by Pjotr Prins, last updated on Mar 22 2022 by Frederick Muriuki Muriithi
...'category': 'C57BL/6J +'}, ..., {'value': 1.3089193078506003, 'category': 'C57BL/6J +'}]]
```
but that did work as expected.
Paused on heatmap generation to first test out the database access code.