What's in a name - Part 4

I promised you four solutions to the problem of dubbing with appropriate URIs. So, without further ado, let's go.

The first one you've seen already. It's using anonymous nodes. _person foaf:interest _security. http://dmoz.org/Computers/Security/ dc:subject _security.

But here we get the problem, that we can't reference _security from outside, thus loosing a lot of the possibilities inherent in the Semantic Web, because this way you can not say that someone else is interested in the same topic as _person above. Even if you say, in another RDF file, _person2 foaf:interest _security. http://dmoz.org/Computers/Security/ dc:subject _security. _security actually does not have to be the same as above. Who says, websites only have one subject? The coincidental equality of the variable name _security bears as much semantics as the equality of two variables x in a C and a Python-Program.

So this solution, although possible, bears too much short-comings. Let's move on.

The second solution is hardly available to the majority of us puny mortals. It's introducing a new URI schema. Let's return to our very first example, where we wanted to say that the Politeia was written by Plato.

urn:isbn:0192833707 dc:creator "Plato".

Great! No problems here. Sure, your web-browser can't (yet) resolve urn:isbn:0192833707, but no ambiguity here: we know exactly of what we speak.

Do we? Incidentally, urn:isbn:0465069347 also denotes the Politeia. No, not in another language (those would be another handful of ISBN numbers), just a different version (the text is public domain). Now, does the following statement hold?

urn:isbn:0192833707 owl:sameAs urn:isbn:0465069347.

Most definitively not. They have different translators. They have different publishers. These are different books. But it's the same - what? What is the same? It's not the same text. It's not the same book. They may have the same source text they are translated from. But how to express this correctly and still useful?

The urn:isbn: scheme is very useful for a very special kind of entities - published books, even the different versions of published books.

The problem with this solution that you would need tons of schemes. Imagine the number of committees! This would, no, this should never happen. We definitively need an easier solution, although this one certainly does work for very special domains.

Let's move on to the third solution: the magic word is fragment identifier. #. Instead of saying: http://semantic.nodix.net/Politeia dc:creator http://semantic.nodix.net/Plato. and thus getting 404s en masse, I just say: http://semantic.nodix.net/#Politeia dc:creator http://semantic.nodx.net/#Plato.

See? No 404. You get to the homepage of this blog by clicking there. And it's valid RDF as well. So, isn't it just perfect? Everything we wished for?

Not totally, I fear. If I click on http://semantic.nodx.net/#Plato, I actually expect to read something about Plato, and not to see a blog about the Semantic Web. So this somehow would disappoint me. Better than a 404, still...

The other point is my bandwidth. There can be RDF files with thousands of references. Following every single one will lead to considerable bandwidth abuse. For naught, as there is no further information about the subject on the other side. Maybe using http://semantic.nodix.net/person#Plato would solve both problems, with http://semantic.nodix.net/person being a website saying something like "This page is used to reserve conceptual space for persons. To understand this, you must understand the magic of URIs and the Semantic Web. Now, go back whereever you came from and have a nice day." Not too much webspace and bandwith will be used for this tiny HTML-page.

You should be careful though to not have a real fragment identifier "Plato" in the page, or you would actually dereference to this element. URI collision again. You don't want Plato to become half-philosopher / half-XML-element, do you?

We will return to fragment identifiers in the last part of this six part series again. And now let's take a quick look at the fourth solution - we will discuss it more thoroughly next time.

Use a fresh URI whenever you need an URI and don't care about it giving a 404.