Orchestration and fallbacks
After the Penguin2 crash in Aug. 2022 it has become increasingly clear how hard it is to deploy GeneNetwork. GNU Guix helps a great deal with dependencies, but it does not handle orchestration between machines and services well. We also need to look to the future.
What is GN today in terms of services
- [X] Main GN2 server (Python, 20+ processes, 3+ instances: depends on all below)
- [X] Matching GN3 server and REST endpoint (Python: fewer dependencies)
- [X] Mariadb
- [X] redis
- [X] virtuoso (@aruni)
- [X] GN-proxy (Racket, authentication handler: redis, mariadb)
- [X] Alias proxy (Racket, gene aliases from Wikidata)
- [X] opar server
- [+] Jupyter, R-shiny and Julia notebooks, nb-hub server
- [X] BNW server (@efraimf)
- [+] UCSC browser (@efraimf)
- [X] GN1 instances (older python, 12 instances in principle, 2 running today)
- [ ] Access to HPC for GEMMA (coming)
- [+] Backup services (sheepdog, rsync, borg)
- [+] monitoring services (incl. systemd, gunicorn, shepherd, sheepdog); see the poller sketch below
- [ ] mail server
- [X] https certificates
- [X] http(s) proxy (nginx)
- [X] CI/CD services (with GitHub webhooks)
- [+] git server (gitea or cgit)
- [X] file server (formerly IPFS)
- [ ] SPARQL endpoint
Somewhat decoupled services:
- [X] genecup
- [X] R/shiny power service (Dave)
- [ ] biohackrxiv
- [ ] hegp
- [ ] covid19
- [ ] guix publish server (runs on penguin2, needs tux02 @efraimf)
I am still missing a few! All run by a man and his diligent dog.
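Most of these services answer over HTTP, so a first line of monitoring can be a small poller in the spirit of sheepdog that simply asks every endpoint whether it is alive. A minimal sketch, using placeholder localhost URLs and ports rather than the real instance addresses:

```python
#!/usr/bin/env python3
"""Minimal poller in the spirit of sheepdog: ask each HTTP endpoint
whether it is alive and report the result.  The URLs and paths are
placeholders, not the real GN instance addresses."""
import urllib.request
import urllib.error

SERVICES = {
    "GN2":      "http://localhost:8082/",            # placeholder port
    "GN3 REST": "http://localhost:8083/api/version", # placeholder path
    "GN-proxy": "http://localhost:8080/",
    "nginx":    "http://localhost:80/",
}

def check(name, url, timeout=5):
    """Return True if the endpoint answers with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            up = resp.status < 400
    except (urllib.error.URLError, OSError):
        up = False
    print(("UP  " if up else "DOWN"), name, url)
    return up

if __name__ == "__main__":
    results = [check(name, url) for name, url in SERVICES.items()]
    # a non-zero exit code lets cron/shepherd/sheepdog raise an alert
    raise SystemExit(0 if all(results) else 1)
```

A wrapper such as cron, shepherd or sheepdog can then turn the non-zero exit status into an alert.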
For the future, the orchestration needs to be more robust and resilient. This means:
- A fallback for every service on a separate machine (see the sketch after this list)
- Improved privacy protection for (future) human data
- Separate servers serving different data sources
- Partial synchronization between data sources
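To make the first point concrete: a fallback does not need heavy tooling to begin with. A health check that prefers the primary machine and switches to a standby when it stops answering already covers a lot. A minimal sketch, with placeholder host names and port rather than the actual machines:

```python
"""Sketch of a per-service failover check: use the primary instance,
and switch to a standby on a separate machine when it stops answering.
Host names, port and health path are placeholders."""
import urllib.request
import urllib.error

CANDIDATES = [
    "http://primary.example.org:8082",   # main machine
    "http://standby.example.org:8082",   # fallback on a separate machine
]

def healthy(base_url, timeout=5):
    """A backend counts as healthy if its root URL answers at all."""
    try:
        urllib.request.urlopen(base_url + "/", timeout=timeout)
        return True
    except (urllib.error.URLError, OSError):
        return False

def pick_backend():
    """Return the first healthy backend, or None when all are down."""
    for url in CANDIDATES:
        if healthy(url):
            return url
    return None

if __name__ == "__main__":
    print(pick_backend() or "no backend available")
```

In production nginx (already in the list above) can be configured to do this kind of failover at the proxy level; the sketch only shows the selection logic.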
The only way we *can* scale is by adding machines, but the system is not yet ready for that. Getting rid of monolithic primary databases in favor of files also helps synchronization.
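Once the primary data lives in plain files rather than inside a running database, partial synchronization between machines reduces to copying the files whose checksums differ. A hedged sketch of that idea (directory paths are illustrative, not the real layout):

```python
"""Sketch of file-based partial synchronization: only copy the files
whose checksums differ between a source and a mirror directory.
Directory paths are illustrative, not the real layout."""
import hashlib
import shutil
from pathlib import Path

def sha256(path, chunk=1 << 20):
    """Checksum a file in chunks so large data files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while block := handle.read(chunk):
            digest.update(block)
    return digest.hexdigest()

def sync(source: Path, mirror: Path):
    """Copy files from source to mirror when missing or changed."""
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = mirror / src.relative_to(source)
        if not dst.exists() or sha256(src) != sha256(dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            print("synced", src.relative_to(source))

if __name__ == "__main__":
    sync(Path("/data/genotypes"), Path("/mirror/genotypes"))
```

In practice rsync and borg, already part of the backup stack, do this far better; the point is that file-level data can be mirrored piecemeal, which is much harder with a live monolithic database.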