
Meeting Notes

2024-07-12

Plan for this week:

  • @shelby use Claude Sonnet with R2R RAG engine with 1000 papers and fix bugs.
  • @shelby final run through for paper 1 before Pjotr's review
  • @shelby and @bmunyoki review dissertation paper for Masters
  • @shelby @bmunyoki @alexm to define the problem with RDF triple stores
  • @alexm integrate the markdown parser
  • @alexm rewrite UI code using htmx
  • @bmunyoki investigate why xapian index isn't getting rebuilt
  • @bmunyoki investigate discrepancies between wiki and rif search
  • @jnduli update the generif_basic table from NCBI
  • @jnduli blog post of preference for documentation

We have qa.genenetwork.com. We need a copy set up at `qa.genenetwork.com/paper1` so that we always have the system that was used for paper 1. How?

Nice to Haves

  • @bmunyoki Nice to have tag for paper1
  • @bmunyoki fix CI job that transforms gn tables to TTL

2024-06-24

Plan for this Week:

  • CANCELED: @bmunyoki Remove boolean prefixes from search where it makes sense.
  • DONE: @bmunyoki GeneWiki + GeneRIF search in production. Mostly needs to be run in prod to see impact.
  • DONE: @jnduli Children process termination when we kill the main index-genenetwork script
  • CANCELED: @bmunyoki Follow up on getting virtuoso child pages in production
  • IN PROGRESS: @alexm push endpoints for editing and committing markdown files
  • DONE: @all Reply to survey from Shelby
  • DONE: @jnduli Fix JS import orders (without messing up the rest of Genenetwork)
  • DONE: @jnduli fix search results when nothing is found
  • CANCELED: @jnduli test out running guix cron jobs locally
  • NOT DONE: @jnduli mention our indexing documentation in gn2 README

Note: For qa.genenetwork.com, we chose to pause work on this until papers are done.

Review for last week

  • DONE: @bmunyoki rebuild guix container with new mcron changes
  • WIP: @jnduli attempts to make UI change that shows all supported keys in the search: Blocked because our JS imports aren't ordered correctly and using `boolean_prefixes` means our searches don't work as we'd expect.
  • WIP: @bmunyoki create an issue with all the problems experienced with search and potential solutions. Make sure it has replication steps, and plans for solutions. Issue was created but we need to get a better understanding for how cis and trans searches work.
  • TODO: @bmunyoki and @jnduli genewiki indexing: PR for WIKI indexing is completed, but we didn't test it out due to the outage caused by RAM and our script. We don't have a way to easily instrument how much RAM our process uses and how to kill the process.
  • DONE: @bmunyoki demos and documents how to run and test guix cron job for indexing
  • DONE: @bmunyoki trains @jnduli on how to review patchsets from emails
  • DONE: @jnduli Follow up notes on setting up local index-genenetwork search
  • DONE: @alexm handling with graduation, AFK
  • TODO: @bmunyoki follow up with Rob to make sure he tests search after everything is complete: He got some feedback. Rob is out of town but wants RIF and Wiki search by July 2nd.

Nice to haves:

  • TODO: minor: bonfacem makes sure that mypy/pylint in CI runs against the index-genenetwork script.
  • TODO: @bmunyoki follow up how do we make sure that xapian prefix changes in code retrigger xapian indexing?

  • howto: for xapian prefix changes, let's maintain a hash for the file and store it in xapian.
  • howto: for RDF changes, since we have ttl files, if these ever change we trigger the script. It's also nice to be able to automatically load data into virtuoso if a ttl file changes.
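A minimal sketch of the file-hash idea, assuming the hash is kept in some key/value store. Here a plain dict stands in for that store; the notes suggest keeping it in xapian, and that wiring is not shown. The names `file_digest`, `needs_reindex`, `record_digest` and the key `prefix-file-sha256` are hypothetical.

```python
import hashlib
from pathlib import Path

# Hypothetical key under which the last indexed digest is stored.
DIGEST_KEY = "prefix-file-sha256"


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_reindex(path: Path, metadata: dict) -> bool:
    """True when the file differs from what was recorded at the last index."""
    return metadata.get(DIGEST_KEY) != file_digest(path)


def record_digest(path: Path, metadata: dict) -> None:
    """Store the current digest; call this after a successful index run."""
    metadata[DIGEST_KEY] = file_digest(path)
```

The cron job could call `needs_reindex` first and skip the rebuild when it returns False. The same check would work for the ttl files.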

2024-06-21

Outage for 2024-06-20

What happened?

  • 2024-06-19: Dev experienced intermittent ssh connections, and got logged out
  • 2024-06-19: Dev assumed this wasn't related to the server problems and clocked out
  • 2024-06-20: Another team mate experienced problems accessing git.genenetwork.org and reported this to the genenetwork sphere channel
  • 2024-06-20: We noted that CI and CD services were also down
  • 2024-06-20: We assumed it could be a DNS issue and followed up with our providers??
  • 2024-06-20: Realized it was because the server was out of RAM. Killing processes resolved the issue

What can we learn from this?

  • If we experience network problems, communicate to other team members.
  • Our gunicorn processes are expensive, each taking 17GB.
  • We don't have monitoring and alerting for our server resources.
  • Shouldn't the OOM killer have helped with this?
  • How much memory/resources do the scripts we run use? This can help us know the impact before running in production.
  • If we start multiple processes from python, similar to the index-genenetwork script, does sending SIGTERM or SIGKILL kill the child processes or will they remain as orphans? Note: there were orphans, so we should investigate ways of killing the script completely next time.
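For the memory question, one rough way to get a number before running in production is to run the script once on a test machine and read the peak resident set size from the standard library `resource` module. This is a sketch, not a measurement we did; the function name `peak_child_rss_kib` is hypothetical. Caveats: `ru_maxrss` is in kilobytes on Linux, it reports the largest single waited-for child process rather than the sum over concurrent workers, and it never decreases within one parent process.

```python
import resource
import subprocess


def peak_child_rss_kib(command: list[str]) -> int:
    """Run command to completion and return the peak RSS among children.

    The value comes from RUSAGE_CHILDREN, so it covers every child this
    process has waited for so far, not only the command just run.
    """
    subprocess.run(command, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
```

Since index-genenetwork starts several worker processes, the real footprint is closer to this number multiplied by the number of workers alive at the same time.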

How do we prevent something similar from happening in the future?

  • Ask whether anything on our server is slow and attempt to inspect this. Add documentation for quick bash scripts to run for this.
  • Reduce the number of gunicorn processes to reduce their memory footprint. How did we end up with the number of processes we currently have? What impact will reducing this have on our users?
  • Attempt to get an estimated memory footprint for `index-genenetwork` and use this to determine when it's safe to run the script or not. This can even end up integrated into the cron job.
  • Create some alerting mechanism with sane thresholds that sends to a common channel/framework, e.g. when CPU usage > 90%, memory usage > 90% etc. This allows someone to be on the lookout in case drastic action needs to be taken.
  • Python doesn't kill child processes when the parent receives SIGTERM. This means that when testing, we were creating more and more orphaned processes. Investigate how to propagate the SIGTERM signal to all child processes.
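One option for the SIGTERM problem, sketched under the assumption that we control how the script is launched: start it as the leader of its own process group and signal the whole group instead of the single pid. The workers inherit the group, so they receive the signal too. This is not what index-genenetwork does today, and the function names are hypothetical. It is POSIX only, and a worker that moves itself to another group would be missed.

```python
import os
import signal
import subprocess


def start_in_own_group(command: list[str]) -> subprocess.Popen:
    """Start command as the leader of a new session and process group."""
    return subprocess.Popen(command, start_new_session=True)


def terminate_group(process: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """SIGTERM every process in the group; SIGKILL the group after the grace period.

    Returns the leader's return code (negative signal number if it was
    killed by a signal).
    """
    # With start_new_session=True the group id equals the leader's pid.
    os.killpg(process.pid, signal.SIGTERM)
    try:
        return process.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        os.killpg(process.pid, signal.SIGKILL)
        return process.wait()
```

From a shell, the equivalent is sending the signal to the negative process group id with `kill`. The alternative is a SIGTERM handler inside the script that terminates its own workers before exiting.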

2024-06-18

Agenda

  • Last week review: DONE, made good progress on what we planned. We need to make sure we're more aware of the TODOs we had.
  • Plan for this week: DONE, LET'S GOOO!
  • Reviewing patches: added to week's goals.
  • Search bugs discussions: Boni will create an issue with all details and we'll plan further on how to attack this.

If we have a plan for the week, and something comes up that breaks our plan:

  • Are we aware that it broke our plans?
  • Communicate the impact this may have on the plan.

Plan for Week

  • TODO: @bmunyoki rebuild guix container with new mcron changes
  • TODO: @jnduli attempts to make UI change that shows all supported keys in the search
  • TODO: @bmunyoki create an issue with all the problems experienced with search and potential solutions. Make sure it has replication steps, and plans for solutions.
  • TODO: @bmunyoki follows up to make sure RDF changes are visible in production and fix issues that come up
  • TODO: @bmunyoki and @jnduli genewiki indexing
  • TODO: @bmunyoki demos and documents how to run and test guix cron job for indexing
  • TODO: @bmunyoki trains @jnduli on how to review patchsets from emails
  • TODO: @jnduli attempts to add stronger types to index-genenetwork script, to make it explicit that we're using MonadicDicts
  • TODO: @jnduli Documentation improvements for GN2 and GN3 and auth? None done last week.
  • TODO: @jnduli Follow up notes on setting up local index-genenetwork search

Nice to haves:

  • TODO: minor: bonfacem makes sure that mypy in CI runs against the index-genenetwork script.
  • TODO: @bmunyoki improve search documentation and fix bugs in the frontend: binary term search doesn't work as expected
  • TODO: @bmunyoki follow up with Rob to make sure he tests search after everything is complete
  • TODO: @bmunyoki follow up how do we make sure that xapian prefix changes in code retrigger xapian indexing?

2024-06-11

Agenda

  • Local checks to do before PRs:

In gn3, make sure to run:

pylint main.py setup.py wsgi.py setup_commands tests gn3 scripts

TODO jnduli run this: mypy --show-error-codes .

DONE: led to a PR that fixed all mypy errors in index-genenetwork

pytest -k unit_test

  • Set up new DB before sync + fixing any problems that occur: DONE, no errors after Jnduli set up new DB.

Generif Indexing

  • Probeset data exists in SQL. Generif metadata exists in RDF.
  • Write code that queries RDF for Generif metadata and enriches the existing Probeset query.
  • DONE: checksums in Generif rdf output
  • TODO: jnduli look at xapian docs and their example for python bindings: DONE, led to a WIP PR for wiki data indexing and a local custom script to search the index without the need to run genenetwork web service
  • TODO: bonfacem makes sure tux02 indexing works: DONE, there's a working patchset sent to Arun for review.
  • TODO: bonfacem make changes to mcron and guix-machines: DONE, there's a working patch set sent to Arun. Follow up needs to be done for the learnings gained

How would GeneWiki work?

  • GeneWiki = GeneRif data from NCBI
  • Workflow would be similar to Generif Indexing. We need to figure out if we'll need an extra RDF query or if we can modify the existing SPARQL query.
  • TODO: jnduli attempts to add stronger types to index-genenetwork script, to make it explicit that we're using MonadicDicts: NOT DONE, got stuck trying to run index-genenetwork locally and removing global variables from the script.
  • TODO: bonfacem makes sure that mypy in CI runs against the index-genenetwork script: NOT DONE, needs a separate PR that will be sent to Arun.