
GNSoC 2023

GN Summer of Code

Introduction

We are running a GN Summer of Code in small teams.

  • Runs July + August 2023
  • Weekly plenary where projects present progress: Thursdays, 9am EU / 10am EAT
  • Projects should be (slightly) out of comfort zone
  • Use gemtext documentation
  • Option to publish progress by the end as a BLOG or on BioHackrXiv

CI for guix-bioinformatics (guix pull)

  • lead: Arun
  • team: Efraim, Pjotr, Sarthak

Making GN deployment rock solid

git repo genenetwork-machines, guix-bioinformatics

Week 1

  • Proposal written

- guix pull on guix-bioinformatics: gemma is broken with the updated guix (Pjotr)

  • Efraim: guix GN2, so we can have a channel
  • Next step build substitutes for guix-bioinformatics

- once built they are shared

  • And create a GN3 channel (Efraim?)

Week 2

  • RISC-V port progressing with node and zig 0.10
  • guix-bioinformatics now has CI!
  • ~700 packages, 240 are broken ;)

Week 3

  • Arun gives a presentation on laminar using guix-forge: slides:
  • New system is simpler and has reproducibility issues
  • Efraim is doing channels for GN2 and GN3
  • Sarthak will try to run guix-forge

Week 4

  • Arun added channels to CI
  • localhost
  • cgit
  • Efraim: gn2 -> channel; Arun tested

Week 5

  • Added Klaus server for git
  • Fixed channels to work with Python 3.9 instead of 3.10

Week 6

  • Towards a new workbench with cgit support

Week 7

  • Arun has cgit running on his own server - soon bringing up tux02 after resolving https

Week 8

  • cgit deployment at https://git.genenetwork.org/
  • guix forge

Week 9

  • Discussion on importers for guix forge
  • Talked about propagator networks - part of Seattle presentation

More

  • CI/CD is up and running again (and broken)
  • Rethink: channels and pull channels are used for CI/CD
  • Move unused packages elsewhere

Nextgen databases

lmdb+RDF

  • lead: Bonface
  • team: Fred, Alex
  • contact: Pjotr

git repo genenetwork3

Week 1

  • RDF dumps
  • Parsing S-exp -> markdown
  • Hashing tables (Fred)

- automated updates

  • Some progress on sample data from SQL -> lmdb (Alex)
  • Next week: guile bindings for lmdb

- improving RDF
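
The table hashing for automated updates noted above could look roughly like the sketch below. It is only an illustration, not the code Fred wrote: the function names `table_digest` and `needs_update` are hypothetical, and it assumes a table is available as an iterable of row tuples. The idea is to compute an order-independent digest per table and to regenerate a dump only when that digest changes.

```python
import hashlib


def table_digest(rows):
    """Return a SHA-256 hex digest of a table, independent of row order.

    `rows` is assumed to be an iterable of tuples (hypothetical input shape).
    """
    # Join fields with a unit separator so ("ab", "c") differs from ("a", "bc").
    encoded = sorted("\x1f".join(str(value) for value in row) for row in rows)
    digest = hashlib.sha256()
    for line in encoded:
        digest.update(line.encode("utf-8"))
        digest.update(b"\x1e")  # record separator between rows
    return digest.hexdigest()


def needs_update(rows, previous_digest):
    """True when the table content differs from the previously stored digest."""
    return table_digest(rows) != previous_digest
```

Sorting the serialised rows before hashing means a re-ordered SQL result does not trigger a spurious update; only a changed, added or removed row does.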

Week 2

  • RDF structure to markdown dump
  • Fred is running SPARQL queries
  • Alex is adding lmdb phenotype API endpoints

Week 3

  • Bonface demos new documentation & code

Week 4

  • Settled on prefixes for terms and IDs
  • Updated man pages
  • Started work on lmdb+guile:

Week 5

  • Metadata - renamed prefixes
  • Short names gn: gnt:
  • Updated virtuoso
  • parsing geno files - lmdb support
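
As a rough illustration of how the short names `gn:` and `gnt:` are used, the sketch below expands a prefixed name into a full IRI. The namespace IRIs in `PREFIXES` are assumptions made for this example (these notes do not state them), and `expand` is a hypothetical helper, not part of genenetwork3.

```python
# Assumed namespace IRIs, for illustration only; check the RDF dumps for the real ones.
PREFIXES = {
    "gn": "http://genenetwork.org/id/",     # identifiers (assumed)
    "gnt": "http://genenetwork.org/term/",  # terms/predicates (assumed)
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'gn:Mus_musculus' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local
```

Keeping identifiers and terms under separate short prefixes makes SPARQL queries and the markdown dumps easier to read than full IRIs.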

Week 6

  • Improvements on RDF

Week 7

  • RDF improvements with ontology
  • Inconsistencies and privacy discussion today

Week 8

  • Transformed most tables now
  • guile-lmdb started by Alex

Week 9

  • New renaming and modelling
  • GeneRif
  • Working on unique IDs
  • SKOS

LLMs & metadata (RDF)

  • lead: Shelby
  • team: Priscilla
  • contact: Bonface, Pjotr, Rupert

Week 1

  • Created issue page
  • Downloading publications (Priscilla)
  • Flask server
  • Next: Connecting OpenAI
  • Create matrix room

Week 2

  • OpenAI API is working
  • Shelby is integrating into a Flask interface for GeneNetwork
  • Using a pubmed UI style

Week 3

  • Shelby shows code
  • Plan to host
  • Priscilla is working on SLA and document acquisition
  • Hosting GN Q&A

Week 4

  • working on container
  • fetched 1000 publications
  • JSON documentation on references

Week 5

  • Guix container for LLM
  • Expose container to Rupert
  • Add a password

Week 6

  • Very close to a working Flask app that Rupert can try next week

Week 7

  • Working Flask app ready for testing - Rupert will have a go

Week 8

  • Working prototype!
  • references as JSON

Week 9

  • Shelby showed working references and challenges
  • System is ready for viewing by Rob

API to access data from GN

  • lead: Rupert
  • team: Flavia
  • contacts: Bonface, Zach, Fred

Documentation and adding endpoints

git repo gn-docs & genenetwork3 & SPARQL

Week 1

  • Mapping out the API
  • Ideas on structuring
  • Questions on GN
  • Next: unify access to information
  • collecting questions from users
  • settle on form of API
  • create example URLs for mouse

Week 2

  • GraphQL: Arun gives a mumi demo - the schema allows for (partial) queries and for querying the schema itself
  • Pjotr convenience API demo - add endpoints in results
  • Flavia added questions in gn-doc - e.g. for synteny search

Week 3

Week 4

  • Start working on endpoints
  • R API - reference GN/API
  • synteny

Week 5

  • Added back-end support for wikidata - finding inconsistencies

Week 6

  • API ready for running in a production environment
  • Using latest RDF

Week 7

  • Test version of API is running at https://luna.genenetwork.org/api/v2.0/
  • We'll continue building up facilities

Week 8

  • R script to parse API by Rupert
  • Move forward with populations/strains and datasets

Week 9

  • Progressed API software to include groups

Editing data

  • lead: Fred
  • team: Arthur, Rupert, Zach
  • contacts: Rob

Week 1

  • Edit phenotype metadata works
  • Next: phenotype values and testing on live

Week 2

  • Fixing issues
  • Meeting on requirements from Arthur and Zach

Week 4

  • Editing works!
  • Discussed the approval procedure and edit button for everyone

Week 5

  • Editing has gone into production - fixing issues
  • Discussion on REST API

Week 6

  • Progress on authorization and editing

Week 7

  • Fred has moved code into a new repo: https://github.com/genenetwork/gn-auth

Week 8

  • gn-auth is building locally and needs to go on the forge
  • case attributes

Week 9

  • gn-auth continuation
  • case-attribute editing progress and challenges

Guix parametrization

  • lead: Sarthak
  • team: Pjotr, Gabor
  • contacts: Ludo

Week 1

  • Next: focus on statically built packages optimized for arch.

Week 2

  • Looking into GeneNetwork3 service
  • Enumerated types

Week 3

  • Preparing for BLOG on S-exp

Week 4

  • BLOG is at the stage where we are discussing naming conventions

Week 5

  • Proposed DSL for parameters

Week 6

  • Posted BLOG and started implementation

Week 7

  • Agreed on final deliverables for GSoC

Week 8

  • Milestone on dependencies!

Week 9

  • Sarthak showed us the working prototype

Links

  • Matrix room is GNSoC2023

For more info contact pjotr.public912 at thebird.nl
