Semantic Modelling


> reading: Chapter 4 "Incorporating Semantics", Semantic Web Programming, J. Hebeler et al.

Open World and Closed World Assumption

The open world assumption (OWA) means that information which is not present in a certain domain is not considered to be false. Just because it's not there does not mean it's wrong.
The closed world assumption (CWA) is the opposite. Anything that is not stated is assumed to be false.

For example: "John works with Paul".
With CWA, the only thing we know is that John and Paul work together. Whether they know each other is not stated, so we have to assume that they don't, and we would have to answer "no" to the question of whether they know each other.
With OWA, whether they know each other is simply not stated. So we cannot say that it's true that they know each other, but we cannot say it's false either.
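
To make the difference concrete for myself, here is a minimal Python sketch. The fact base and the two function names are made up for illustration; the point is only that the same lookup yields "false" under CWA and "unknown" under OWA.

```python
# Hypothetical fact base: only what was explicitly stated.
FACTS = {("John", "worksWith", "Paul")}


def ask_closed_world(subject, predicate, obj):
    """CWA: anything that is not stated is taken to be false."""
    return (subject, predicate, obj) in FACTS


def ask_open_world(subject, predicate, obj):
    """OWA: a stated fact is true, anything else is unknown (None)."""
    return True if (subject, predicate, obj) in FACTS else None


print(ask_closed_world("John", "knows", "Paul"))  # False
print(ask_open_world("John", "knows", "Paul"))    # None
```

Using `None` for "unknown" is just one way to model a third answer next to true and false.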

No Unique Names Assumption

On the WWW we cannot assume that a piece of information is always identified by one unique name. When two web sites refer to the same piece of information, we cannot assume that both use the same URI for it. Two different URIs might therefore point to the same thing, and we only know that for sure if the two explicitly state that they are talking about the same information (by adding additional attributes) ... wait, something is confusing here! I need to think about that again.
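
One way to read this, sketched in Python: two different URIs are only treated as the same thing when that is stated explicitly (in OWL such a statement is made with owl:sameAs), and otherwise the answer is "unknown" rather than "different". The URIs, the `SAME_AS` set and the function name are assumptions for illustration, and the sketch ignores transitivity.

```python
# Hypothetical URIs: two sites describing the same person under different names.
SITE_A_JOHN = "http://site-a.example/people#John"
SITE_B_JOHN = "http://site-b.example/staff/jsmith"
SITE_B_PAUL = "http://site-b.example/staff/paul"

# Explicit "these two are the same" statements (unordered pairs).
SAME_AS = {frozenset({SITE_A_JOHN, SITE_B_JOHN})}


def same_thing(uri_a, uri_b):
    """True if identical or declared the same, otherwise None (unknown)."""
    if uri_a == uri_b or frozenset({uri_a, uri_b}) in SAME_AS:
        return True
    return None  # different names do NOT imply different things


print(same_thing(SITE_A_JOHN, SITE_B_JOHN))  # True
print(same_thing(SITE_A_JOHN, SITE_B_PAUL))  # None
```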

Ontology Elements

  • Header
    • represents the ontology itself
    • contains comments, labels, version and other imported ontologies
  • Class
    • special kind of resource that represents a set of resources
  • Individual
    • member of a class
    • can be member directly or indirectly
  • Property
    • predicate to describe individuals
    • object properties link individuals to other individuals
    • data properties link individuals to literal values
  • Annotation
    • basically like a Property
    • has no associated semantics
    • commonly used for label or comment
  • Datatype
    • well, a data type: the type of a literal value
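
The "directly or indirectly" part of class membership can be sketched in a few lines of Python. The triples below are invented (`ex:` names are hypothetical, and the terms are kept as plain strings instead of real URIs); indirect membership is derived by following rdfs:subClassOf upwards from the directly stated classes.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical mini ontology plus one individual.
TRIPLES = {
    ("ex:Employee", SUBCLASS_OF, "ex:Person"),
    ("ex:Person", SUBCLASS_OF, "ex:Agent"),
    ("ex:John", RDF_TYPE, "ex:Employee"),     # direct class membership
    ("ex:John", "ex:worksWith", "ex:Paul"),   # object property
}


def classes_of(individual):
    """Return (direct classes, indirect classes) as sorted lists."""
    direct = {o for s, p, o in TRIPLES if s == individual and p == RDF_TYPE}
    found = set(direct)
    todo = list(direct)
    while todo:  # walk up the subclass hierarchy
        cls = todo.pop()
        for s, p, o in TRIPLES:
            if s == cls and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return sorted(direct), sorted(found - direct)


print(classes_of("ex:John"))
# (['ex:Employee'], ['ex:Agent', 'ex:Person'])
```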

Modelling RDF

> reading: Chapter 3 "Modelling Information", Semantic Web Programming, J. Hebeler et al.

Very briefly: there are three ways of writing information in RDF style. Most likely there are many more ways of writing, or rather representing, information with RDF, but the following three are the most popular ones. The first one is RDF/XML, which is an XML representation of RDF. The second is the Terse RDF Triple Language (aka Turtle ... why Turtle!?). And the third one is N-Triples.

The XML notation is obvious. It's specialised XML for RDF. The entry on the RDF Primer has some small examples. A snippet from a full RDF/XML document could look like this:

  <rdf:Description rdf:about="http://.../people#Paul">
     <ext:worksWith rdf:resource="http://.../people#John"/>
     <!-- More -->
  </rdf:Description>

Well, it's XML. Not pretty, but it's interpretable for software.

The second one, Turtle, is supposed to be more human-friendly:

@prefix rdf:    <> .
@prefix ext:    <> .
@prefix foaf:   <> .
@prefix people: <> .

people:Paul ext:worksWith people:John .
people:Matt foaf:knows people:John .
     foaf:knows people:Matt ;
     foaf:surname "Lopez" .

But I must say it is not that much easier to read. Maybe a bit, because it has fewer brackets, obviously. The biggest benefit, I think, is that it has much less overhead (well, the bracket stuff). So file sizes should be much smaller, which matters because with RDF you can produce incredibly huge files.

And third, N-Triples, which is a "simplified version of Turtle":

<urn:> <urn:> <urn:>

So in N-Triples every statement is written on a single line of its own. There is no such thing as @prefix as in Turtle, which is stupid, I think. Imagine: the full URI, namespace included, is written out for every subject, predicate and object. Of course this can be considered a "simplified version of Turtle", but I don't find it practical. If there were a way to define QNames for the namespaces, that would be great. It would be so compact.
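
To see how much a prefix table would save, here is a small Python sketch that shortens one N-Triples line into QName form (which is roughly what Turtle's @prefix gives you). The namespaces under example.org and the helper name `compact` are assumptions, and the sketch only handles lines whose three terms are all URIs, not literals.

```python
# Hypothetical namespaces; the real ones depend on the data set.
PREFIXES = {
    "people": "http://example.org/people#",
    "ext": "http://example.org/ext#",
}


def compact(uri):
    """Replace a known namespace with its prefix, else keep the <URI> form."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return "<" + uri + ">"


ntriples_line = (
    "<http://example.org/people#Paul> "
    "<http://example.org/ext#worksWith> "
    "<http://example.org/people#John> ."
)

# Drop the trailing " .", split into the three terms, remove the brackets.
terms = [term.strip("<>") for term in ntriples_line.rstrip(" .").split()]
short = " ".join(compact(term) for term in terms) + " ."

print(short)  # people:Paul ext:worksWith people:John .
```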


RDF Primer

> reading: the RDF Primer

This Primer is so long - I haven't made it to the end of the document yet. The more I read, the more complex this whole topic seems to become.

Up to section 2.4 it explains the basic structure of RDF. Data, or any information, is expressed as triples, each made up of a subject, a predicate and an object. Without any abbreviations a triple looks like this:

<> <> <>

The first part (the subject) defines, let's say, the "thing" we want to attach information to. The second (the predicate) defines what kind of information we are storing. And the third (the object) is the value. The value can also be a literal string like "value" instead of a reference to another resource.

Fortunately this can be made shorter. You can define abbreviations (prefixes) for long URIrefs (in this case the URLs). The shortened names are called QNames (qualified names): a prefix followed by a local name. So you could say that "ex:" from now on stands for "". Then you can write the triple like this:

ex:index.html info:data exdata:value

(where info = "" and exdata = "") When you use QNames you don't need the angle brackets < and > around the resources. Only when a resource is written out as a full URI do you have to wrap it in angle brackets.
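
The bracket rule can be sketched as a tiny resolver in Python: a term in angle brackets is taken as a full URI, anything else is treated as a QName and expanded through a prefix table. The three namespace URIs below are placeholders I made up for illustration, as is the function name `resolve`.

```python
# Hypothetical prefix table for the example triple.
PREFIXES = {
    "ex": "http://www.example.org/",
    "info": "http://www.example.org/info/",
    "exdata": "http://www.example.org/data/",
}


def resolve(term):
    """Turn one term of a triple into a full URI."""
    if term.startswith("<") and term.endswith(">"):
        return term[1:-1]  # already a full URI, just drop the brackets
    prefix, local_name = term.split(":", 1)
    return PREFIXES[prefix] + local_name  # QName: prefix + local name


triple = "ex:index.html info:data exdata:value".split()
for term in triple:
    print(term, "->", resolve(term))
```

A QName with an unknown prefix raises a `KeyError` here, which seems reasonable: without a definition the abbreviation has no meaning.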

Basically you could define your own dictionary (the "abbreviations") with your own meaning for what is what. RDF imposes no restrictions here. But if everybody defines their "own crap", the web would quickly get very messy and no one (especially no software) would understand what is meant. So it is considered best practice to use already well-defined dictionaries, like this one. This way everything stays cleaner, and if you find an RDF document referencing this dictionary you can be sure what it is about. Of course you could also define your own and reference that - but why reinvent the wheel 😉