Triplify update vocabulary



When relational database (RDB) data is published on the Web, e.g. as Linked Data, it is important to keep track of database (and hence RDF) updates, so that crawlers know what has changed since the last crawl and should be re-retrieved from that endpoint.

A centralized registry (such as the one implemented by the PingTheSemanticWeb service) does not seem feasible once Linked Data becomes more popular: think of millions of Linked Data endpoints pinging such a registry each time a small change occurs.

The approach: Linked Data Update Logs

Each Linked Data endpoint provides information about updates performed in a certain timespan as a special/standardized Linked Data source.

Let's assume the company provides a Linked Data endpoint with information about their products, employees etc. The endpoint is reachable via

The LOD endpoint contains a special LOD space below which contains information about updates. for example will return the following RDF:   rdf:type   update:UpdateCollection .   rdf:type   update:UpdateCollection . could then return the following RDF:   rdf:type   update:UpdateCollection .   rdf:type   update:UpdateCollection .

This nesting could continue until we finally reach a URL which exposes all updates performed in a certain second. For very frequently updated LOD endpoints (e.g. Wikipedia) this interval of one second will be sufficiently small, so the related update information can still be retrieved easily. For rarely updated LOD endpoints (e.g. a personal Weblog) links should only point to non-empty Update Collections in order to prevent crawlers from performing unnecessary HTTP requests. then would for example contain RDF links (and additional metadata) to the Linked Data documents updated on Jan 1st, 2008 at 17:58:06, e.g. following triples:   update:updatedResource .   update:updatedAt         "2008-01-01T17:58:06"^^xsd:dateTime .   update:updatedBy .
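The nesting by time granularity can be sketched as follows. This is only an illustration: the path layout (one segment each for year, month, day, hour, minute and second) and the helper name collection_paths are assumptions, not something taken from the Triplify implementation, and the paths are relative to wherever the update space lives on the endpoint.

```python
from datetime import datetime

# Assumed layout: every path segment adds one level of time granularity.
LEVELS = ("%Y", "%m", "%d", "%H", "%M", "%S")


def collection_paths(moment: datetime) -> list[str]:
    """Return the nested update collection paths, from year down to second."""
    parts = [moment.strftime(fmt) for fmt in LEVELS]
    return ["/".join(parts[: depth + 1]) for depth in range(len(parts))]


# The update from the example above: Jan 1st, 2008 at 17:58:06.
paths = collection_paths(datetime(2008, 1, 1, 17, 58, 6))
for path in paths:
    print(path)
```

A crawler would follow such paths from the coarsest level downwards and only descend into collections that the endpoint actually links to.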

Individual updates are identified by a sequential identifier (i.e. “user123” in the example). Arbitrary metadata can be attached to these updates, such as the time of the update (probably redundant, since it can be inferred from the URL) or the person who performed the update.


Triplify automatically generates all the resources in the update URI space when the Triplify configuration $triplify['queries'] contains a query named update. This query has to return at least two columns: the first column contains the date when the update occurred, the second column contains the id of the updated resource. An example is given below:

SELECT p.changed AS id, p.id AS 'update:updatedResource->project' FROM project p
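To see what such a two-column result looks like, here is a minimal sketch using an in-memory SQLite database. The table layout (columns id, name, changed), the sample rows and the project/<id> naming are assumptions made for illustration; the sketch does not reproduce how Triplify itself processes the query, it only groups the updated resources by the second in which they changed.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, standing in for a real project table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT, changed TEXT)")
conn.executemany(
    "INSERT INTO project VALUES (?, ?, ?)",
    [
        (1, "alpha", "2008-01-01 17:58:06"),
        (2, "beta", "2008-01-01 17:58:06"),
        (3, "gamma", "2008-03-15 09:00:00"),
    ],
)

# First column: when the update occurred; second column: id of the resource.
rows = conn.execute(
    "SELECT p.changed, p.id FROM project p ORDER BY p.changed, p.id"
).fetchall()

# Group updated resources by timestamp, one bucket per second.
updates_by_second = defaultdict(list)
for changed, project_id in rows:
    updates_by_second[changed].append(f"project/{project_id}")

for changed, resources in updates_by_second.items():
    print(changed, resources)
```

Each bucket corresponds to one finest-grained update collection; seconds without any update produce no bucket, which matches the advice to link only to non-empty collections.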


The workings of the Triplify Linked Data Update Logs can be observed with Triplify's own datasource registry:



update:UpdateCollection

A collection of updates.


update:Update

An atomic update performed on an RDF resource.


SubClassOf: update:Update

Represents the deletion of an RDF resource.



update:updatedResource
Type: ObjectProperty
Domain: update:Update

Points to the resource which was updated.


update:updatedAt
Type: DatatypeProperty
Domain: update:Update
Range: xsd:dateTime

Refers to the date and time when a certain update occurred.
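Since the range is xsd:dateTime, the value should use the lexical form YYYY-MM-DDThh:mm:ss. A small sketch of producing and reading such a value follows; the helper name to_xsd_datetime is hypothetical, and the literal is written with the xsd: prefix rather than a full datatype URI.

```python
from datetime import datetime


def to_xsd_datetime(moment: datetime) -> str:
    """Format a datetime as a typed literal with the xsd:dateTime datatype."""
    lexical = moment.isoformat(timespec="seconds")
    return f'"{lexical}"^^xsd:dateTime'


literal = to_xsd_datetime(datetime(2008, 1, 1, 17, 58, 6))
print(literal)

# Reading the lexical form back, e.g. to compare it with the last crawl time.
parsed = datetime.fromisoformat("2008-01-01T17:58:06")
print(parsed > datetime(2007, 12, 31))
```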


update:updatedBy
Type: Property
Domain: update:Update

Points to the description of the user who performed the update.



Last Modification: 2008-08-01 13:48:21 by Elias Theodorou