Read the design-doc here:
This task is related to:
In one of the reviews, it was pointed out that we shouldn't store table IDs in RDF: they are confusing and not necessarily important.
Not much progress has been made on this. We use table IDs to reference entities and build relationships in GN, which has made it difficult to provide drop-in replacements in RDF without breaking a large part of GN2's functionality. Another obstacle is how we fetch data in GN: through deep inheritance. This deep inheritance introduces unnecessary coupling and should, over time, be untangled in favor of composition, but that is a task for another day.
This demo was unsatisfactory; nevertheless, I now have a good understanding of how federated queries work. My findings: federated queries can be slow (querying Wikidata took 5-30 seconds). As such, a better strategy would be to write scripts that enrich our dataset with entries from other data sources, expressed with the right ontology. Also, being exposed to many other RDF sources that use different ontologies was confusing.
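As a rough sketch of the enrichment strategy, the snippet below turns facts that a script has already fetched from an external source into N-Triples lines that could be loaded locally, so that no federated call is needed at query time. The subject ID, the predicate name `externalDescription`, and the sample value are hypothetical placeholders, not part of the ontology GN actually uses.

```python
from typing import Dict, List

GN = "http://genenetwork.org/"


def escape_literal(value: str) -> str:
    """Escape a string so it is safe inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject_id: str, facts: Dict[str, str]) -> List[str]:
    """Emit one N-Triples line per (predicate, value) pair.

    `subject_id` and the predicate names are appended to the gn: namespace;
    both are assumptions made for illustration only.
    """
    subject = f"<{GN}{subject_id}>"
    return [
        f'{subject} <{GN}{predicate}> "{escape_literal(value)}" .'
        for predicate, value in sorted(facts.items())
    ]


if __name__ == "__main__":
    # Made-up record standing in for data fetched from an external source.
    for line in to_ntriples("dataset_example", {"externalDescription": "Sample text"}):
        print(line)
```

Generating triples offline like this moves the 5-30 second cost of the remote lookup into a batch job instead of paying it on every page load.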
The submitted demo used this SPARQL query for the trait:
PREFIX gn: <http://genenetwork.org/>
SELECT ?name ?dataset ?dataset_group ?title ?summary ?aboutTissue ?aboutPlatform ?aboutProcessing
WHERE {
    ?dataset gn:accessionId "GN112" ;
             rdf:type gn:dataset .
    OPTIONAL { ?dataset gn:name ?name } .
    OPTIONAL { ?dataset gn:aboutTissue ?aboutTissue } .
    OPTIONAL { ?dataset gn:title ?title } .
    OPTIONAL { ?dataset gn:summary ?summary } .
    OPTIONAL { ?dataset gn:aboutPlatform ?aboutPlatform } .
    OPTIONAL { ?dataset gn:aboutProcessing ?aboutProcessing } .
    OPTIONAL { ?dataset gn:geoSeries ?geo_series } .
}
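For illustration, here is a minimal Python sketch of how a query like the one above could be parameterized by accession ID, sent as a SPARQL Protocol GET request, and its JSON results flattened into plain dictionaries. The endpoint URL, the function names, and the sample response are assumptions for the example, not GN's actual code. The sketch also declares the `rdf:` prefix explicitly, whereas the query above appears to rely on the endpoint predefining it.

```python
import json
import re
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the real SPARQL endpoint URL.
ENDPOINT = "http://localhost:8890/sparql"

# Optional properties taken from the query in the text.
FIELDS = ["name", "title", "summary", "aboutTissue", "aboutPlatform", "aboutProcessing"]


def build_query(accession_id: str) -> str:
    """Return the dataset metadata query for one accession ID."""
    # Reject anything that could break out of the quoted literal.
    if not re.fullmatch(r"[A-Za-z0-9_]+", accession_id):
        raise ValueError(f"unexpected accession ID: {accession_id!r}")
    select = " ".join(f"?{field}" for field in FIELDS)
    optionals = "\n".join(
        f"  OPTIONAL {{ ?dataset gn:{field} ?{field} }} ." for field in FIELDS
    )
    return (
        "PREFIX gn: <http://genenetwork.org/>\n"
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        f"SELECT ?dataset {select} WHERE {{\n"
        f'  ?dataset gn:accessionId "{accession_id}" ;\n'
        "           rdf:type gn:dataset .\n"
        f"{optionals}\n"
        "}"
    )


def build_url(endpoint: str, query: str) -> str:
    """Build a GET URL that passes the query in the `query` parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


def flatten_bindings(response_text: str) -> list:
    """Convert a SPARQL JSON results document into a list of {var: value} dicts."""
    data = json.loads(response_text)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    print(build_url(ENDPOINT, build_query("GN112")))
```

Because every property is wrapped in `OPTIONAL`, a row may omit some variables, so callers should read the flattened dictionaries with `.get()` rather than indexing directly.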
The particular trait in GN2 is:
The equivalent version in GN1 is:
Metadata about this dataset can be found in: