Configuration


After you have downloaded and extracted the Triplify script into your Web application, you have to create a Triplify configuration suitable for it. The Triplify example configuration with detailed explanations is a good starting point.
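
As a rough orientation, a minimal config.inc.php could look like the following sketch. Only the keys 'queries', 'cachedir' and 'namespaces'/'vocabulary' appear elsewhere on this page; the database key ('db'), the 'base' namespace and all concrete values are assumptions that have to be adapted to your application.

$triplify['namespaces']=array(
  'vocabulary'=>'http://example.com/vocabulary/', // vocabulary namespace of your application (placeholder)
  'base'=>'http://example.com/triplify/'          // base URI of this Triplify instance (key and value assumed)
);

// Database access as a PDO object; the key name 'db' is an assumption.
$triplify['db']=new PDO('mysql:host=localhost;dbname=myapp','user','password');

// SQL queries selecting the data to expose; column aliases (e.g. "foaf:name")
// are used as RDF property names, as in the examples further down this page.
$triplify['queries']=array(
  'user'=>'SELECT id, name AS "foaf:name" FROM users' // placeholder query
);

// Cache directory, must be writable by the Web server.
$triplify['cachedir']='cache/';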


Once a Web application has been “triplified”, its configuration can be reused for all subsequent installations of the same Web application. This page contains community-contributed Triplify configurations. Feel free to add configurations you created, or just send them to us by email and we will add them here.


Application and link to config.inc.php:


Generation of Provenance Information in your metadata

Beginning with version 0.6, Triplify supports the generation of provenance information for your data. This is described in detail on the wiki page for the Triplify Metadata Extension created by Olaf Hartig.

Calling Triplify from the Command Line / Using Triplify as an ETL Tool


Triplify can be used as a tool for Extract-Transform-Load (ETL) cycles by calling it from the command line:


php5 index.php cli-config.php


Triplify automatically detects that it was called from the command line and overrides the default config file config.inc.php with the values set in the file given as a parameter (in the example above cli-config.php). Subsequently, it processes all queries in order to extract all possible data from the database. The resulting triples are written to stdout and can be post-processed with other tools, e.g.:


php5 index.php cli-config.php | bzip2 -c > export.nt.bz2


For example, the 3 billion LinkedGeoData triples have been generated using Triplify.
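
A rough sketch of such an override file follows; which configuration values you override is up to you, and the key names below follow the sketch above and are assumptions:

// cli-config.php – values defined here replace the corresponding
// entries of config.inc.php for the command-line run.

// Use a dedicated database account (or a replica) for the bulk export; key name 'db' assumed.
$triplify['db']=new PDO('mysql:host=localhost;dbname=myapp','exporter','secret');

// Export only the queries that should end up in the dump.
$triplify['queries']=array(
  'user'=>'SELECT id, name AS "foaf:name" FROM users',
  'product'=>'SELECT id, name FROM products'
);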

Using Regular Expressions for Mapping URLs to RDF


The default behavior of Triplify, mapping URIs of the pattern http://example.com/triplify/CLASS/ID to SQL queries, can easily be extended by using regular expressions to match request URLs:


$triplify['queries']=array(
  '/^product\/([0-9]+)$/'=>'SELECT id, name FROM products WHERE id="$1"'
);


In this example, a URL request of http://example.com/triplify/product/45 would match the regular expression '/^product\/([0-9]+)$/' given as the key in the queries configuration array. Triplify would then replace each back reference $n in the SQL query with the corresponding capture group matched in the requested URL, execute the query and transform the result into RDF.
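
For the request above, $1 would be bound to 45, so the query actually executed against the database would be:

SELECT id, name FROM products WHERE id="45"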


This approach makes it easy to use Triplify for creating RDF-returning REST services. A more complex example, which is used to realize the LinkedGeoData near REST service, is:


$triplify['queries']=array(
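    // Note: $latlon, $distance and $box are assumed to be PHP variables defined
    // earlier in the configuration, holding SQL fragments for the distance
    // computation and the bounding-box filter; $1, $2 and $3 are the latitude,
    // longitude and radius captured from the request URL.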
    '/^near\/(-?[0-9\.]+),(-?[0-9\.]+)\/([0-9]+)\/?$/'=>'SELECT CONCAT("base:",n.type,"/",n.id,"#id") AS id, CONCAT("vocabulary:",n.type) AS "rdf:type", '.$latlon.',
  rv.label AS "t:unc", REPLACE(rk.label,":","%25"), '.$distance.'
FROM  elements n
  INNER JOIN tags t USING(type,id)
  INNER JOIN resources rk ON(rk.id=t.k)
  INNER JOIN resources rv ON(rv.id=t.v)
WHERE '.$box.'
HAVING distance < $3 LIMIT 1000', 
);

Integrating Triplify into Web Applications


Triplify is most beneficial when it becomes a direct part of the mainstream distribution of standard Web applications such as CMSs, wikis or blogs. Since Triplify is very lightweight, it is extremely easy to integrate into these Web applications:

  • add the Triplify folder to the root folder of your Web application
  • configure SQL queries selecting the information to be exposed
  • make sure Triplify uses the database connection parameters of the Web application, which can usually be achieved by including the Web application's configuration file and creating the PDO object or MySQL connection string from the respective configuration values (see the sketch after this list)
  • modify the installation procedure of your Web application so that the triplify/cache/ folder is writable by the Web server or change the Triplify configuration variable $triplify['cachedir']
  • make sure that the installation procedure registers the Triplify instance with the Registry, so it can be found by other Semantic Web applications. This can be done simply by accessing http://triplify.org/register/?url='.urlencode($baseURI).'&type='.urlencode($triplify['namespaces']['vocabulary']) either by means of fopen() or by embedding a corresponding link (e.g. within an iframe) in the installer
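
A minimal sketch of such an integration, here assuming a WordPress-like host application; the file name wp-config.php, the DB_* constants, the base URI and the 'db' key are placeholders and assumptions that have to be adapted:

// At the end of triplify/config.inc.php: reuse the host application's database settings.
require_once(dirname(__FILE__).'/../wp-config.php'); // placeholder for the host application's config file
$triplify['db']=new PDO('mysql:host='.DB_HOST.';dbname='.DB_NAME, DB_USER, DB_PASSWORD); // key name 'db' assumed

// In the installer (not on every request): register the instance with the Triplify registry.
$baseURI='http://example.com/triplify/'; // placeholder base URI of this installation
@fopen('http://triplify.org/register/?url='.urlencode($baseURI)
  .'&type='.urlencode($triplify['namespaces']['vocabulary']), 'r');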

Such a triplification of your Web application has tremendous advantages:

  • Installations of the Web application are easier to find, and search engines can better evaluate their content.
  • Different installations of the Web application can easily syndicate arbitrary content without the need to adapt interfaces, content representations or protocols, even when the content structures change.
  • It is possible to create custom-tailored search engines targeted at a certain niche. Imagine a search engine for products, which can be queried for digital cameras with high resolution and large zoom.

Ultimately, a triplification will counteract the centralization we have faced through Google, YouTube and Facebook and lead to an increased democratization of the Web.

Publishing update logs as Linked Data


Triplify can be used for publishing a hierarchically structured update log as linked data itself. Details can be found here:


http://triplify.org/vocabulary/update

Using Triplify Data

Creating Mashups


Probably the largest benefit of using Triplify is that your Web application becomes easily mashable with other Web data sources.


  • Simile Potluck allows you to mix and mash any Triplify data sources without any programming
  • Yahoo! Pipes – Triplify instances can be used as data sources within Yahoo!'s pipe designer
  • Programmableweb – provides a database of APIs, which can be combined with Triplify data sources
  • JSON output: simply append ?t-output=json to any Triplify URI and you can easily access Triplify data, even without an RDF/N3 parser (see the sketch below)
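
A minimal sketch of consuming the JSON output in PHP; the Triplify URL is a placeholder, and the structure of the returned JSON depends on your configuration:

// Fetch and decode the JSON representation of a Triplify class.
$json=file_get_contents('http://example.com/triplify/product?t-output=json'); // placeholder URL
$data=json_decode($json, true);
print_r($data);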

Semantic Search


Search engines which exploit semantic representations are still rare or in early stages of development.


  • Sindice is a lookup index for Semantic Web documents
  • SWSE
  • Swoogle is probably the oldest Semantic Web search engine

Registered Triplify data sources will be submitted to these and any future semantic search engines.

RDF and Linked Data Browser


There are some RDF browsers available which allow you to browse and filter Triplify data:



Just enter the URL of your Triplify installation and start browsing your data.

Vocabularies and Links

Vocabularies


When creating Triplify configurations, try to reuse existing vocabularies as much as possible. Some central vocabularies are:



A comprehensive list of vocabularies can also be found at: http://schemaweb.info.

Links


W3C's Semantic Web initiative is an umbrella for many standards (such as RDF) and activities that aim at bringing more structure to the Web.


  • The Open Knowledge Foundation is devoted to promoting and protecting open knowledge.
  • DataPortability is devoted to raising awareness of how to prevent data lock-in in certain applications or services.
  • Linked Data is a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web by using URIs and RDF.
  • D2RQ – Treating Non-RDF Databases as Virtual RDF Graphs. D2RQ has basically the same aim as Triplify, although it is more difficult to deploy. It includes its own mapping language to map DB content to ontologies, whereas Triplify just uses SQL. In contrast to Triplify, D2RQ also contains SPARQL endpoint functionality.
  • OpenLink's Sponger allows you to access non-RDF data sources using SPARQL.
  • DBpedia “semantifies” Wikipedia by providing RDF datasets and a SPARQL endpoint for semantics extracted from Wikipedia.

 

Information

Last Modification: 2010-06-29 15:25:50 by Soeren Auer