Google Summer of Code (GSoC '23) @ DBpedia Organisation

DBpedia Search API enhancement

This project took place over the summer of 2023 as part of Google Summer of Code, in collaboration with the research team at Leipzig University of Applied Sciences, a member of the DBpedia Association.

The project's objective was to improve the mulang branch of SANTé by containerizing each of its applications separately and implementing a GraphQL interface. For context, SANTé stands for Semantic Search Engine and is designed to simplify RDF data access and exploration. SANTé covers different aspects of search engines, such as indexing, ranking, and interaction. You can use SANTé via the command line or via the SANTé Web Interface (smile).

The project arose from usability constraints in the existing applications. Its goal was to make them more accessible to researchers and users exploring the Semantic Web.

Specification

Although not originally a focus of this project, my mentor suggested that I draft a specification for the GraphQL implementation.

The specification builds on two pre-existing approaches, GraphQL-LD by Comunica and the GraphQL support in Stardog, which share the following similarities:

  1. Use of GraphQL: Both approaches use GraphQL as the query language. GraphQL is a query language for APIs and a runtime for executing those queries against existing data.

  2. Translation to SPARQL: Both systems translate GraphQL queries into SPARQL, a query language for RDF. This allows them to leverage the power and flexibility of SPARQL while providing a simpler and more intuitive query interface through GraphQL.

  3. Schema-Based: Both approaches are schema-based. In GraphQL-LD, the schema is provided by the JSON-LD context, while in Stardog's GraphQL, it's provided by the database schema. In both cases, the schema provides the structure and types for the data, guiding the formation of the GraphQL queries.

  4. Data Retrieval: Both systems are designed to retrieve data from databases or data sources based on the GraphQL queries. They provide a structured and efficient way to fetch data.

  5. Targeted at Developers: Both GraphQL-LD and Stardog's GraphQL are designed with developers in mind, providing tools and interfaces that make it easier to query and manipulate data.

  6. Semantic Web Technologies: Both approaches leverage Semantic Web technologies. GraphQL-LD uses JSON-LD for linking data, while Stardog's GraphQL operates on data stored in a Stardog database, which supports RDF and other Semantic Web standards.

While they have different implementations and target different use cases, both GraphQL-LD and Stardog's GraphQL share a common foundation in GraphQL and Semantic Web technologies. They both aim to make it easier and more efficient to query and retrieve data.
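To make the translation step concrete, here is a minimal sketch of how a single GraphQL field could be mapped to a SPARQL triple pattern through a context entry (field name to predicate IRI), in the spirit of GraphQL-LD. The helper `graphql_field_to_sparql` is hypothetical and written only for illustration. It is not part of Comunica, Stardog, or SANTé, and the SPARQL it prints is not the exact output of either tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): turn one GraphQL field plus its
# context entry (a predicate IRI) into a one-pattern SPARQL SELECT query.
graphql_field_to_sparql() {
  local field="$1" predicate="$2"
  printf 'SELECT ?%s WHERE { ?s <%s> ?%s . }\n' "$field" "$predicate" "$field"
}

# GraphQL query:  { name }
# Context entry:  name -> http://xmlns.com/foaf/0.1/name
graphql_field_to_sparql name "http://xmlns.com/foaf/0.1/name"
```

A real translator would also have to handle nested fields, arguments, and result shaping. This sketch only shows why a schema or context is needed: it supplies the predicate IRI that the bare field name lacks.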

More details can be found at https://github.com/ronitblenz/sante/wiki/GraphQL-in-Sante-for-mulang.

Tasks

  1. Creation of Docker images for:

    • SANTé MAIN

    • SANTé SMILE

    • SANTé API

  2. Sample specification: https://github.com/ronitblenz/sante/wiki/GraphQL-in-Sante-for-mulang

  3. Documentation: https://aksw.github.io/sante-api-docs/

Workflow example

In this 5-minute tutorial, we will help you instantiate your first knowledge base search engine over the FOAF ontology using KBox (https://github.com/AKSW/KBox).

  1. Download KBox and instantiate the FOAF knowledge graph.
docker run --network=host aksw/kbox -server -kb "http://xmlns.com/foaf/0.1,https://www.w3.org/2000/01/rdf-schema,http://www.w3.org/2002/07/owl,http://www.w3.org/1999/02/22-rdf-syntax-ns,http://purl.org/dc/elements/1.1/,http://purl.org/dc/terms/,http://purl.org/dc/dcam/,http://purl.org/dc/dcmitype/" -install

Loading Model...
Publishing service on http://localhost:8080/kbox/query
Service up and running ;-) ...

You can now access and query your knowledge graph at http://localhost:8080. Notice that the example above also includes the RDFS, RDF, and OWL ontologies. That's because their information is needed to correctly instantiate the FOAF ontology. If the SPARQL endpoint does not contain all the necessary information, SANTé will not be able to retrieve or search for it and will display the resource as a URI.
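Before building the index, it can help to confirm that the endpoint answers queries. The sketch below assumes that the KBox service published at http://localhost:8080/kbox/query accepts standard SPARQL 1.1 Protocol requests (a `query` parameter over HTTP GET). The helper name `sparql_select` and the sample query are illustrative and not part of KBox or SANTé.

```shell
#!/usr/bin/env bash
# Endpoint URL as printed by the KBox server log in step 1.
ENDPOINT="http://localhost:8080/kbox/query"

# Sample query (assumption): list a few resources typed as owl:Class.
QUERY='SELECT ?class WHERE { ?class a <http://www.w3.org/2002/07/owl#Class> } LIMIT 5'

# Send a SPARQL SELECT via HTTP GET and ask for JSON results.
# --data-urlencode together with --get appends the encoded query to the URL.
sparql_select() {
  curl --silent --get "$1" \
    --data-urlencode "query=$2" \
    --header 'Accept: application/sparql-results+json'
}

# Usage (requires the KBox container from step 1 to be running):
#   sparql_select "$ENDPOINT" "$QUERY"
```

If the request returns no bindings, the knowledge bases were probably not loaded, and the index built in the later steps would be incomplete.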

  2. Create the volume.
docker volume create index

This volume stores the index directory, so that the index becomes accessible to all the Docker containers once it is set up.

  3. Create the index.

Assuming that you successfully performed steps (1) and (2), build the SANTé Main image:

docker build -t sante/main -f sante.main/Dockerfile .

Then run the image, passing the SPARQL endpoint URI, to generate foaf_kg:

docker run --network=host -v index:/sante/foaf_kg -e endpoint=http://localhost:8080/kbox/query sante/main

Running the SANTé Main image creates the foaf_kg folder and stores it in the mounted volume index.

  4. Instantiate smile.
docker build -t sante/smile -f sante.smile/Dockerfile .

To run the Docker image with the index volume mounted, use the following command:

docker run -p 7070:7070 -v index:/index -itd sante/smile


  ____    _    _   _ _____  __   __        _______ ____       _
 / ___|  / \  | \ | |_   _|/_/_  \ \      / / ____| __ )     / \   _ __  _ __
 \___ \ / _ \ |  \| | | || ____|  \ \ /\ / /|  _| |  _ \    / _ \ | '_ \| '_ \
  ___) / ___ \| |\  | | ||  _|_    \ V  V / | |___| |_) |  / ___ \| |_) | |_) |
 |____/_/   \_\_| \_| |_||_____|    \_/\_/  |_____|____/  /_/   \_\ .__/| .__/
                                                                  |_|   |_|

2022-09-13 09:58:15.842  INFO 21938 --- [           main] org.aksw.sante.SanteWebApp               : Starting SanteWebApp v2.5.3 using Java 11.0.10 on ... with PID 21938
...
2022-09-13 09:58:15.846  INFO 21938 --- [           main] org.aksw.sante.SanteWebApp               : No active profile set, falling back to default profiles: default

If you correctly executed all the steps above, you should now be able to access SANTé at http://localhost:7070.
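Because the smile container is started in detached mode (`-itd`), the web application may need a few seconds before it responds. The following is a small polling sketch, not part of SANTé. The helper `wait_for` and the retry values are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# wait_for TRIES DELAY CMD...
# Rerun CMD until it succeeds or TRIES attempts are used up,
# sleeping DELAY seconds after each failed attempt.
wait_for() {
  local tries="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$tries" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (assumes the smile container from step 4 is running):
#   wait_for 30 2 curl --silent --fail --output /dev/null http://localhost:7070 \
#     && echo "SANTé is up"
```

If the helper gives up, the container logs (`docker logs` with the container ID printed by `docker run`) are the first place to look.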

Future work

Although I did not complete all of the goals set out at the start of this project, what I have accomplished gives a good starting point for future work. The implementation of the GraphQL schema and its endpoints is still open and can be picked up by contributors at a later stage of the project.

List of Pull Requests (Merged and Unmerged)

https://github.com/AKSW/sante/pull/12

https://github.com/AKSW/sante/pull/20

https://github.com/AKSW/sante/pull/21

https://github.com/AKSW/sante/pull/22

https://github.com/AKSW/sante/pull/24

https://github.com/AKSW/sante/pull/25

https://github.com/AKSW/sante/pull/26

https://github.com/AKSW/sante-api-docs/pull/1

https://github.com/AKSW/sante-api-docs/pull/2

https://github.com/AKSW/sante-api-docs/pull/3

https://github.com/AKSW/sante-api-docs/pull/4