Problem

Short Story

There ought to be something like a Web of data, of semantics or of knowledge.

Long Story


Despite significant research and development efforts, the vision of the Semantic Web has not yet become reality. The growth of semantic representations is probably still outpaced by the growth of traditional Web pages, and one might remain skeptical about whether the Semantic Web can succeed at all. But are there alternatives? From our point of view: not really! We think that the missing spark for starting the Semantic Web is to overcome the chicken-and-egg dilemma posed by the simultaneous lack of semantic representations and semantics-conscious search facilities on the Web.


Triplify tackles this dilemma by leveraging the relational representations behind existing Web applications. A large part of Web content is generated by database-driven Web applications. However, the structure and semantics encoded in relational database schemas are unfortunately inaccessible to Web search engines, mashups, etc.


Project          Area               Downloads
phpBB            discussion forum   235480
Gallery          photo gallery      166005
XOOPS            CMS                115807
Coppermine       photo gallery      113854
Typo3            CMS                63641
Liferay Portal   Portal             39615
eGroupWare       group ware         33865
Alfresco         CMS                31914
e107             CMS                19996
Lifetype         Blogging           16730
Plone            CMS                13993
Compiere         ERP + CRM          13718
WebCalendar      Calendar           12832
Nucleus          Blogging           12739
Tikiwiki         Wiki               6368


The table above lists the 15 most popular Web application projects hosted at SourceForge, ranked by download figures (as of May 2007).


Imagine the wealth of content available for semantic searches and mashups if the structured content of these Web applications were accessible on the Web. Within the Semantic Web initiative, a number of standards and techniques have been developed to support the encoding and exchange of structured information and knowledge on the Web. The core of the Triplify approach is to exploit the structured relational representations behind Web applications in order to create a critical mass of semantic representations on the Web.


Solution


Triplify is based on defining relational database queries for a specific Web application that retrieve valuable information, and on converting the results of these queries into RDF, JSON and Linked Data. Experience has shown that for most Web applications a relatively small number of queries (usually 3 to 7) is sufficient to extract the important information. After generating such database views, the Triplify software can be used to convert the views into an RDF, JSON or Linked Data representation, which can be shared and accessed on the (Semantic) Web.

Installation


Triplify works as follows:

  1. You download and extract a folder containing the Triplify project into your Web application directory.
  2. You download a Triplify configuration matching your Web application or create a new one by defining a number of SQL queries which select the information to be made publicly available.
  3. Your Web application is now part of the Semantic Web and is interoperable and mashable with other Web applications.

You don't have to struggle with ontologies, mapping languages, logics or other scary things ;-)

SQL Query Structure

If there is no existing Triplify configuration for your Web application, you have to create your own. The main part is the definition of a number of SQL queries selecting information that is meant for public use. In order for Triplify to be able to convert the results of your SQL queries into RDF, the query results are required to have a certain structure:

  • The first column must contain identifiers which can be used to generate instance URIs (typically the primary key of your database table).
  • Column names will be used to generate property URIs; by renaming the columns of your database table (e.g. SELECT id,name AS 'foaf:name' FROM users), you can reuse properties from existing vocabularies such as Dublin Core, FOAF, SIOC.
  • The individual cells of the query result contain data values or references to other instances and will eventually constitute the objects of resulting triples.
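
The three rules above can be illustrated with a small sketch. It is written in Python against an in-memory SQLite database purely so that it runs on its own; it is not Triplify's PHP code. The base URI, the users table, the "vocabulary/" path for unprefixed columns and the helper names property_uri and rows_to_triples are all assumptions made for this illustration. The column alias is written with double quotes here, the standard SQL form of the single-quoted alias shown above.

```python
import sqlite3

BASE_URI = "http://example.org/triplify/"          # hypothetical base URI
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}  # vocabularies to reuse


def property_uri(column):
    """Map a column name to a property URI; 'prefix:local' reuses a known vocabulary."""
    prefix, sep, local = column.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    # Assumption: unprefixed columns get a URI in a local vocabulary.
    return BASE_URI + "vocabulary/" + column


def rows_to_triples(cursor, class_name):
    """Yield (subject, predicate, object) for every non-NULL cell after the first column."""
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        # Rule 1: the first column identifies the instance.
        subject = BASE_URI + class_name + "/" + str(row[0])
        # Rules 2 and 3: column names become properties, cells become objects.
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                yield subject, property_uri(column), str(value)


def demo():
    """Build a toy users table and convert one query result into triples."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    db.executemany("INSERT INTO users VALUES (?, ?, ?)",
                   [(1, "Alice", "Leipzig"), (2, "Bob", None)])
    cursor = db.execute('SELECT id, name AS "foaf:name", city FROM users ORDER BY id')
    return list(rows_to_triples(cursor, "user"))


if __name__ == "__main__":
    # Simplified N-Triples-like output; literals are not escaped in this sketch.
    for s, p, o in demo():
        print(f'<{s}> <{p}> "{o}" .')
```

The sketch treats every cell as a plain literal; references to other instances, mentioned in the third rule, are left out to keep the example short.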


Details are explained in Triplify's configuration file. Configurations for popular Web applications are collected in Documentation.

Requirements


  • Currently, Triplify is implemented only in PHP, but we welcome volunteers who want to work on implementations in other Web application languages. A port should not take long, since the crucial parts comprise barely more than 200 lines of code.
  • Triplify needs direct access to the relational database by means of either a PDO object (which is standard in PHP) or the standard MySQL driver. However, your Web application may use any other database abstraction framework.
  • In order to expose RDF as Linked Data, Triplify relies on URL rewriting, as provided by Apache's mod_rewrite. Standard RDF and JSON export will, however, work without URL rewriting.

Issues

  • Triplify is still beta-grade software. There might be issues with exotic encodings, databases or PHP/Apache configurations. We would appreciate your feedback in order to solve these issues.
  • Performance: Triplify is currently aimed at small to medium Web applications (i.e. less than 100 MB of database content). However, Triplify supports caching of the triplification results and can hence also be used with large Web applications.
  • Privacy: Please be careful not to reveal any sensitive information. Email addresses should be SHA1 hashed; password hashes and any other information that is not meant to be publicly accessible should be omitted from the Triplify SQL queries. As a rule of thumb, you should only make information available through Triplify that is also publicly accessible on Web pages.
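
The privacy advice above can be sketched as follows. The sketch assumes the FOAF convention for foaf:mbox_sha1sum, which hashes the full mailto: URI rather than the bare address; the page itself only asks for SHA1 hashing. The column names email and password and the helper names mbox_sha1sum and publishable are hypothetical, and the code is a Python illustration, not part of Triplify.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI (FOAF mbox_sha1sum convention)."""
    # Stripping surrounding whitespace is a choice made for this sketch.
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


def publishable(row, hashed=("email",), dropped=("password",)):
    """Return a copy of a result row that is safe to publish.

    Columns in `dropped` are omitted entirely; columns in `hashed` are
    replaced by their SHA1 sum. Both column lists are hypothetical.
    """
    out = {}
    for key, value in row.items():
        if key in dropped:
            continue
        if key in hashed:
            if value:
                out["foaf:mbox_sha1sum"] = mbox_sha1sum(value)
            continue
        out[key] = value
    return out


if __name__ == "__main__":
    row = {"id": 7, "name": "Alice", "email": "alice@example.org", "password": "secret"}
    print(publishable(row))
```

With MySQL, the same effect can probably be achieved inside the query itself, for example with SHA1(CONCAT('mailto:', email)) AS 'foaf:mbox_sha1sum', while simply leaving the password column out of the SELECT list.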

License



Triplify is licensed under the terms of the GNU Lesser General Public License. You are free to copy, modify or redistribute it, even together with commercial software.


 

Information

Last Modification: 2010-03-04 12:30:07 by Sebastian Dietzold