Laurent Wouters committed
# Linked Data Server Compliance

The [CubicWeb Linked Data Browser]() is a browser to navigate the web of data.
The point of this document is not to explain the web of data completely, but it is important to understand one of its underlying promises:
namely, that data or resources can be referred to by URIs and that by connecting to those URIs, interpreted as locations, the data associated with the URIs can be retrieved, potentially through content negotiation.
For example, an RDF resource (file) refers to entities identified by URIs.
The promise is then that the data(sets) about these entities can be fetched at the URIs.
Therefore, in the same way that a browser for the web of documents can navigate from page to page through links using URLs,
a browser for the web of (linked) data could navigate within and between datasets through the URIs of the entities referred to and find new datasets about them.

For this ideal scenario, the web servers that ultimately serve the content, in this case the data, should *play nice*, i.e. be compliant with the web of data.
In this context, this mainly means that when asked about a URI in its scope, a web server should answer with data about the requested URI, at least if explicitly asked to using HTTP headers related to content negotiation.
For example, a compliant linked data server handling resources for `example.com` should readily (or through content negotiation) answer with the data related to `http://example.com/some_resource`.
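
As a minimal sketch of what "explicitly asked" can look like from the client side, the following builds a request for the resource URI itself with an `Accept` header listing RDF media types. The helper name `build_rdf_request` and the chosen q-values are illustrative assumptions, not something prescribed by the browser.

```python
from urllib.request import Request

# Media types of common RDF syntaxes. The ordering and q-values are
# illustrative assumptions: prefer RDF, tolerate HTML as a last resort.
RDF_ACCEPT = ", ".join([
    "text/turtle",
    "application/rdf+xml;q=0.9",
    "application/n-triples;q=0.8",
    "text/html;q=0.1",
])


def build_rdf_request(uri):
    """Build a GET request that asks for RDF at the resource URI itself."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_rdf_request("http://example.com/some_resource")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing `request` to `urllib.request.urlopen` would actually send it. A compliant server would be expected to answer at that same URI with a `Content-Type` matching one of the requested RDF media types.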

Examples of compliance issues:

* The server always answers with HTML, or redirects to an HTML page, despite content negotiation requesting RDF data.
    * Possible fix: Rely on content negotiation to answer with RDF content. HTML can still be returned by default.
* Instead of replying with data, the server redirects to `http://example.com/some_resource.rdf`.
    * Possible fix: Do not redirect to different URIs depending on RDF syntaxes (.n3, .nt, .rdf, etc.) but directly return the content in the appropriate syntax, as requested through content negotiation.
* The datasets returned by the server use multiple spellings of the URI for the same logical entity, for example by using Unicode escape sequences `\uxxxx` in URIs.
    * Possible fix: Canonicalize the data so that each entity has a single URI spelling.
* Multiple datasets have been located for different spellings of the same URI, and they have different content.
* The provided datasets use language-specific URIs with different data for different languages.
    * Possible fix: Unify the datasets in a language-neutral naming scheme for the entities and use language tags on language-specific RDF literals.
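
The first two fixes amount to choosing the response syntax from the `Accept` header at the requested URI, rather than redirecting. The following is a minimal sketch of that choice on the server side. The `SUPPORTED` list and the function names are hypothetical, and the parsing is deliberately simplified (it only looks at the `q` parameter).

```python
# Serializations a hypothetical server can produce. HTML comes first so that
# ties (for example a bare "*/*") fall back to HTML, the default.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml", "application/n-triples"]


def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs; other parameters are ignored."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    return entries


def quality(media_type, entries):
    """Return the q-value for media_type; the most specific matching range wins."""
    main_type = media_type.split("/")[0]
    for candidate in (media_type, main_type + "/*", "*/*"):
        for media_range, q in entries:
            if media_range == candidate:
                return q
    return 0.0


def choose_media_type(accept_header, default="text/html"):
    """Pick the supported media type the client prefers most, or the default."""
    entries = parse_accept(accept_header or "")
    if not entries:
        return default
    best = max(SUPPORTED, key=lambda media_type: quality(media_type, entries))
    return best if quality(best, entries) > 0 else default


print(choose_media_type("text/turtle, text/html;q=0.1"))
print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

With this approach the server serializes the data for `http://example.com/some_resource` in the chosen syntax and returns it directly. In this sketch an unsatisfiable `Accept` header falls back to HTML, which is a design assumption; a stricter server could answer with `406 Not Acceptable` instead.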


The [CubicWeb Linked Data Browser]() still tries to cooperate with servers that are not yet compliant.
It does the following to try to detect datasets:

* Simple content negotiation by requesting RDF data at the URI for the resource (case for compliant servers).
* Follow redirections to URIs for datasets serialized in specific syntaxes.
* Try to detect datasets linked to from the HTTP `Link` header.
* When HTML was obtained, try to detect datasets linked to from HTML `<link>` elements.
* When HTML was obtained, try to detect links to corresponding datasets, i.e. links starting with the URI of the requested resource and ending with a file extension for a known RDF syntax (.n3, .nt, .rdf, etc.).
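
The two HTML-based fallbacks can be sketched as follows. This is an illustration under stated assumptions, not the browser's actual implementation: the class and function names are hypothetical, `<link>` elements are only considered when they declare `rel="alternate"` with an RDF media type, extension-based detection is limited to `<a>` elements, and `.ttl` is added to the extension list as an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed lists of known RDF file extensions and media types.
RDF_EXTENSIONS = (".n3", ".nt", ".rdf", ".ttl")
RDF_MEDIA_TYPES = {
    "text/n3",
    "text/turtle",
    "application/n-triples",
    "application/rdf+xml",
}


class DatasetLinkFinder(HTMLParser):
    """Collect candidate dataset URIs from an HTML page (hypothetical helper)."""

    def __init__(self, resource_uri):
        super().__init__()
        self.resource_uri = resource_uri
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # Resolve relative references against the requested resource URI.
        target = urljoin(self.resource_uri, href)
        if tag == "link":
            rel = (attributes.get("rel") or "").lower().split()
            media_type = (attributes.get("type") or "").lower()
            if "alternate" in rel and media_type in RDF_MEDIA_TYPES:
                self._add(target)
        elif tag == "a":
            # Links that start with the resource URI and end with an RDF extension.
            if target.startswith(self.resource_uri) and target.endswith(RDF_EXTENSIONS):
                self._add(target)

    def _add(self, target):
        if target not in self.candidates:
            self.candidates.append(target)


def find_dataset_links(resource_uri, html):
    finder = DatasetLinkFinder(resource_uri)
    finder.feed(html)
    return finder.candidates


sample_html = """<html><head>
<link rel="alternate" type="application/rdf+xml" href="/some_resource/data.xml">
<link rel="stylesheet" href="/style.css">
</head><body>
<a href="http://example.com/some_resource.rdf">RDF/XML</a>
<a href="/some_resource.nt">N-Triples</a>
<a href="/other.rdf">Another resource</a>
</body></html>"""

links = find_dataset_links("http://example.com/some_resource", sample_html)
for link in links:
    print(link)
```

The stylesheet link and the link to `/other.rdf` are ignored: the first is not an RDF alternate, and the second does not start with the URI of the requested resource.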