Resource Description Framework (RDF)
Resource Description Framework (RDF) is a specification for expressing information about resources in a Data Graph.
With RDF, all data is modeled as Triples. A Triple is the atomic data entity in the RDF data model. It consists of, you guessed it, three fields:
- Subject: An IRI identifying an entity.
- Predicate: An IRI specifying how the Subject relates to the Object.
- Object: An IRI identifying another resource (which could be the Subject of another Triple), or a literal value such as “John”.
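As a rough illustration of the three fields, a Triple can be sketched as a small Python type. The class name Triple, the field name obj, and the example.org IRIs are assumptions made for this sketch; they are not part of RDF or of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement, with all three fields held as plain strings."""
    subject: str    # IRI of the entity being described
    predicate: str  # IRI naming the relationship
    obj: str        # IRI of another resource, or a literal value


# Hypothetical example values, not taken from any real vocabulary.
example = Triple(
    subject="http://example.org/person#1",
    predicate="http://example.org/vocab#name",
    obj="Alice",
)

# A Triple unpacks into exactly its three fields.
subject, predicate, obj = example
```

A real implementation would also need to tell IRIs and literals apart in the object position, which this sketch leaves out.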
RDF depends heavily on schemas, similar to how an XML element belongs to a specific XML namespace that is defined in an XSD.
So suppose there is a Person whose first name is John and whose last name is Doe, then this is more or less what it would look like in RDF:
- First we need a Subject. The Subject MUST be an IRI. So let’s assume we created our own schema that defines that there is a thing called a “Person”. Our John Doe also must have an ID, which MUST be part of the IRI. Then this could be a valid Subject: http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e.
- The next thing we need is a Predicate. Let’s start with the first name. We cannot just say that the Predicate is “firstname”, because the Predicate MUST be an IRI. We could either add predicates to our own schema, or we could use one of the standard ones. Let’s use “firstName” of the “foaf” namespace (Friend of a Friend). Then the Predicate would be http://xmlns.com/foaf/0.1/firstName.
- The last thing we need is the Object. In this case it’s the value “John”, which is a “Literal” in the rdfs namespace. So the Object would be: “John”. (“John” must still be of a specific type, such as a string.)
Then we need to repeat this process for the last name. FOAF defines several Predicates for it, such as “lastName” and “familyName”; this example uses “family_name”.
This is John Doe:
Subject | Predicate | Object |
---|---|---|
http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e | http://xmlns.com/foaf/0.1/firstName | John |
http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e | http://xmlns.com/foaf/0.1/family_name | Doe |
Now let’s say that John Doe is married to Jane Doe. We’d have to create another two Triples for Jane, and then we can create another Triple to marry the two. There is no existing IRI for “married to”, but there is one for “knows”. So here is John marrying Jane:
Subject | Predicate | Object |
---|---|---|
http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e | http://xmlns.com/foaf/0.1/firstName | John |
http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e | http://xmlns.com/foaf/0.1/family_name | Doe |
http://openecosystems.com/person#7b9c0cad-a13e-4e42-850d-d7d0d0c50160 | http://xmlns.com/foaf/0.1/firstName | Jane |
http://openecosystems.com/person#7b9c0cad-a13e-4e42-850d-d7d0d0c50160 | http://xmlns.com/foaf/0.1/family_name | Doe |
http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e | http://xmlns.com/foaf/0.1/knows | http://openecosystems.com/person#7b9c0cad-a13e-4e42-850d-d7d0d0c50160 |
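A minimal Python sketch of this small graph, assuming plain tuples are enough to represent Triples and with the knows Triple pointing from John to Jane. The helper names match and first_names_known_by are made up for illustration and do not come from any RDF library.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
PERSON = "http://openecosystems.com/person#"

JOHN = PERSON + "22761879-a876-4e38-a9a0-9e44fe498e3e"
JANE = PERSON + "7b9c0cad-a13e-4e42-850d-d7d0d0c50160"

# Each entry is one (subject, predicate, object) Triple.
graph = [
    (JOHN, FOAF + "firstName", "John"),
    (JOHN, FOAF + "family_name", "Doe"),
    (JANE, FOAF + "firstName", "Jane"),
    (JANE, FOAF + "family_name", "Doe"),
    (JOHN, FOAF + "knows", JANE),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the Triples that agree with every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def first_names_known_by(triples, person):
    """Follow foaf:knows from `person`, then look up each first name."""
    names = []
    for _, _, friend in match(triples, subject=person, predicate=FOAF + "knows"):
        for _, _, name in match(triples, subject=friend, predicate=FOAF + "firstName"):
            names.append(name)
    return names
```

Calling first_names_known_by(graph, JOHN) walks two Triples: the knows Triple leads to Jane's IRI, and her firstName Triple supplies the literal.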
XML Implications
- Validation: Just like one can unambiguously validate an XML document if one has all the XSDs, one can also unambiguously validate an RDF document if one has all the RDF schemas.
- Strongly typed: Though the data that is stored in RDF has no structure, all the Objects are strongly typed. Just like one could define an “Enumeration” in an XSD, one could also define one in an RDF Schema.
- Bloaty: The official RDF specification is even more bloated than XML itself.
- Prefixes: Because RDF can be serialized as XML (RDF/XML), one can create aliases for various namespaces. E.g.: by adding the XML attribute xmlns:foaf="http://xmlns.com/foaf/0.1/", “http://xmlns.com/foaf/0.1/family_name” can be shortened to “foaf:family_name”.
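The prefix mechanism amounts to a simple string substitution, which can be sketched in Python. The PREFIXES table and the functions expand and shorten are assumptions for this sketch; real RDF parsers handle prefixes with more rules than shown here.

```python
# Maps a prefix to the namespace IRI it stands for.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(name, prefixes=PREFIXES):
    """Expand 'prefix:local' to a full IRI; leave anything else untouched."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name


def shorten(iri, prefixes=PREFIXES):
    """Replace a known namespace at the start of an IRI with its prefix."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return prefix + ":" + iri[len(namespace):]
    return iri
```

With this table, expand("foaf:family_name") yields the full IRI and shorten reverses it, while names with an unknown prefix pass through unchanged.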
Performance
RDF has the same problem as XML: it is very heavy on computing resources. For that reason the official RDF specification is not practical at scale.
The concept of a Triple is still valid though. So RDF-based Graph Databases may not adhere to the full RDF standard, but only to the concept of Triples. They also apply much smarter ways of storing those Triples, so that data can be queried far more efficiently.
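One common way to make Triple lookups cheaper is to index them by their fields instead of scanning a flat list. The following is a toy in-memory sketch of that idea, not a description of how any particular Graph Database works; the class TripleIndex and its methods are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """Toy store that indexes Triples by subject and by predicate."""

    def __init__(self):
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)

    def add(self, subject, predicate, obj):
        """Store one Triple under both of its index keys."""
        triple = (subject, predicate, obj)
        self.by_subject[subject].add(triple)
        self.by_predicate[predicate].add(triple)

    def about(self, subject):
        """All Triples with this subject, found via one dictionary lookup."""
        return self.by_subject.get(subject, set())

    def using(self, predicate):
        """All Triples with this predicate, found via one dictionary lookup."""
        return self.by_predicate.get(predicate, set())


# Hypothetical data for illustration only.
store = TripleIndex()
store.add("http://example.org/a", "http://example.org/vocab#name", "A")
store.add("http://example.org/a", "http://example.org/vocab#knows", "http://example.org/b")
store.add("http://example.org/b", "http://example.org/vocab#name", "B")
```

The trade-off is that every Triple is stored once per index, so faster reads are paid for with more memory and slower writes.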
Advantages and Disadvantages
One advantage of RDF is that the Data Model is very simple: it only consists of Triples. This simplicity also comes with a disadvantage: the Data Model has no structure, so the Data Model gives no indication whether something is a property or an entity.
Serialization
There are several ways to serialize RDF data, such as RDF/XML, Turtle, N-Triples, and JSON-LD.
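N-Triples is the simplest of these formats: one Triple per line, IRIs in angle brackets, literals in double quotes, and a terminating dot. Below is a simplified Python sketch of such a serializer. The Literal marker class and the function names are assumptions of this sketch, and it skips parts of the format such as datatypes, language tags, and blank nodes.

```python
class Literal(str):
    """Marks an object as a literal value rather than an IRI."""


def term(value):
    """Render one term: literals in quotes, everything else as <IRI>."""
    if isinstance(value, Literal):
        escaped = (
            value.replace("\\", "\\\\")  # escape backslashes first
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for (s, p, o) in triples)


john = "http://openecosystems.com/person#22761879-a876-4e38-a9a0-9e44fe498e3e"
foaf = "http://xmlns.com/foaf/0.1/"

document = to_ntriples([
    (john, foaf + "firstName", Literal("John")),
    (john, foaf + "family_name", Literal("Doe")),
])
```

Each line of document then has the shape <subject> <predicate> "literal" . and can be read back without any surrounding context, which is what makes the format easy to stream and to diff.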
References
- https://www.w3.org/TR/rdf11-primer/
- https://www.w3.org/RDF/
- https://www.w3.org/1999/02/22-rdf-syntax-ns
- https://www.w3.org/2001/sw/RDFCore/Schema/20010913/
- http://xmlns.com/foaf/0.1/