Semantic search

Jump to navigation Jump to search
Condition
Printout selection
Options
Parameters [
limit:

The maximum number of results to return
offset:

The offset of the first result
link:

Show values as links
headers:

Display the headers/property names
mainlabel:

The label to give to the main page name
intro:

The text to display before the query results, if there are any
outro:

The text to display after the query results, if there are any
searchlabel:

Text for continuing the search
default:

The text to display if there are no query results
embedformat:

The HTML tag used to define headings
embedonly:

Display no headings
Sort options
Delete
Add sorting condition

The Surrounding Sea

Explore the ocean of words in which we all are swimming, day in day out. A site that allows you to browse through the lexicographic data in Wikidata along four dimensions:

  • alphabetical, like in a good old fashioned dictionary
  • through translations and synonyms
  • where does this word come from, and where did it go
  • narrower and wider words, describing a hierarchy of meanings

Wikidata contains over 1.2 million lexicographic entries, but you will see the many gaps when exploring the sea of words. Please join us in charting out more of the world of words.

Happy 23rd birthday to Wikipedia and the movement it started!

Das Mädchen Doch


Sie sagten ihrer Mutter
Kinder werde sie nie haben
Und als sie geboren wurde
Nannte ihre Mutter sie
Doch

Sie sagten sie sei schwach
Und klein und krank
Und dass sie nicht
Lange zu leben habe
Doch

Ihre Mutter hoffte
Das sie in einer Welt aufwuchs
In der alle gleich behandelt wurden
Aber leider
Doch

Sie sagten Mathe und Autos
Seien nichts für Mädchen
Dass sie sich interessiert
Für Puppen und für Kleidung
Doch

Sie sagten die Welt
Ist wie sie ist
Und sie zu ändern
Sei nichts für kleine kranke Mädchen
Doch

Sie sagten gut dass Du darüber sprachst
Wir sollten darüber nachdenken
Lass uns jetzt darüber debattieren
Und wir (nicht Du) entscheiden dann
Doch

Sie sagten man kann nicht alles haben
Man muss sich entscheiden
Aber so selbstsüchtig
Ich meine, keine Kinder zu wollen
Doch

Sie sagten sie sei unanständig
So ein Leben sei nicht richtig
Benannten sie mit unanständigen Worten
Was sie sich denn erlaube
Doch

Sie sagten das geht doch nicht
So ein Leben sei kein Leben
Das ist jetzt schon sehr anders
Das ist nicht einfach nur Neid
Doch

Sie sagten wir sind halt nicht so
Und wollen auch nicht so sein
Wir sind glücklich wie wir sind
Und deswegen darfst du glücklich nicht sein
Doch

Languages with the best lexicographic data coverage in Wikidata 2023

Languages with the best coverage as of the end of 2023

  1. English 92.9%
  2. Spanish 91.3%
  3. Bokmal 89.1%
  4. Swedish 88.9%
  5. French 86.9%
  6. Danish 86.9%
  7. Latin 85.8%
  8. Italian 82.9%
  9. Estonian 81.2%
  10. Nynorsk 80.2%
  11. German 79.5%
  12. Basque 75.9%
  13. Portuguese 74.8%
  14. Malay 73.1%
  15. Panjabi 71.0%
  16. Slovak 67.8%
  17. Breton 67.3%

What does the coverage mean? Given a text (usually Wikipedia in that language, but in some cases a corpus from the Leipzig Corpora Collection), how many of the occurrences in that text are already represented as forms in Wikidata's lexicographic data.

The list contains all languages where the data covers more than two thirds of the selected corpus.

Progress in lexicographic data in Wikidata 2023

Here are some highlights of the progress in lexicographic data in Wikidata in 2023

What does the coverage mean? Given a text (usually Wikipedia in that language, but in some cases a corpus from the Leipzig Corpora Collection), how many of the occurrences in that text are already represented as forms in Wikidata's lexicographic data. Note that every percent more gets much more difficult than the previous one: an increase from 1% to 2% usually needs much much less work than from 91% to 92%.

RIP Niklaus Wirth

RIP Niklaus Wirth;

BEGIN

I don't think there's a person who created more programming languages that I used than Wirth: Pascal, Modula, and Oberon; maybe Guy Steele, depending on what you count;

Wirth is also famous for Wirth's law: software becomes slower more rapidly than hardware becomes faster;

He received the 1984 Turing Award, and had an asteroid named after him in 1999; Wirth died at the age of 89;

END.

Wikidata lexicographic data coverage for Croatian in 2023

Last year, I published ambitious goals for the coverage of lexicographic data for Croatian in Wikidata. My self-proclaimed goal was widely missed: I wanted to go from 40% coverage to 60% -- instead, thanks to the help of contributors, we reached 45%.

We grew from 3,124 forms to 4,115, i.e. almost a thousand new forms, or about 31%. The coverage grew from around 11 million tokens to about 13 million tokens in the Croatian Wikipedia, or, as said, from 40% to 45%. The covered forms grew from 1.4% to 1.9%, which illustrates neatly the increased difficulty to reach more coverage (thanks to Zipf's law): last year, we increased covered forms by 1%, which translated to an overall coverage increase of occurrences by 35%. This year, although we increased the covered forms by another 0.5%, we only got an overall coverage increase of occurrences by 5%.

But some of my energy was diverted from adding more lexicographic data to adding functions that help with adding and checking lexicographic data. We launched a new project, Wikifunctions, that can hold functions. There, we collected functions to create the regular forms for Croatian nouns. All nouns are now covered.

I think that's still a great achievement and progress. Sure, we didn't meet the 60%, but the functions helped a lot to get to the 45%, and they will continue to benefit us 2024 too. Again, I want to declare some goals, at least for myself, but not as ambitious with regards to coverage: the goal for 2024 is to reach 50% coverage of Croatian, and in addition, I would love us to have Lexeme forms available for verbs and adjectives, not only for nouns, (for verbs, Ivi404 did most of the work already), and maybe even have functions ready for adjectives.

Star Trek's 32nd century

I like Star Trek for the cool technology, which has inspired plenty of people to work eg on "the Star Trek computer". I love Star Trek for the utopian society of plenty they sketch in the 23rd and 24th century.

I claim it is because of the laziness of the writing: they don't keep that utopia up.

When I heard about Discovery going to the 32nd century, I was excited about the wonders they would dream up. The new technology. The society. The culture. The breakthroughs.

With regards to that, it was a massive let down. Extremely disappointing.

Finding God through Information Theory

I found that surprising: Luciano Floridi, one of the most-cited living philosophers, started studying information theory because young Floridi, still Catholic, concluded that God's manifestation to humanity must be an information process. He wanted to understand God's manifestation through the lens of information.

He didn't get far in answering that question, but he did become the leading expert in the Philosophy of Information, and an expert in Digital Ethics (and also, since then, an agnostic).

Post scriptum: The more I think about it, the more I like the idea. Information theory is not even one of these vague, empirical disciplines such as Physics, but more like Mathematics and Logics, and thus unavoidable. Any information exchange, i.e. communication, must follow its rules. Therefore the manifestation of God, i.e. the way God chooses to communicate themselves to us, must also follow information theory. So this should lead to some necessary conditions on the shape of such a manifestation.

It's a bright idea. I am not surprised it didn't go anywhere, but I still like the idea.

Could have at least engendered a novel Proof for the Existence of God. They have certainly come from more surprising corners.

Source: https://philosophy.fireside.fm/1

More about Luciano Flordi on Wikipedia.

Little One's first GIF

Little One made her first GIF!

cat.gif

Moving to Germany

We are moving to Germany. It was a long and difficult decision process.

Is it the right decision? Who knows. These kinds of decisions are rarely right or wrong, but just are.

What about your job? I am thankful to the Wikimedia Foundation for allowing me to move and keep my position. The work on Abstract Wikipedia and Wikifunctions is not done yet, and I will continue to lead the realization of this project.

Don’t we like it in California? We love so many things about California and the US, and the US has been really good to us. Both my wife and I grew here in our careers, we both learned valuable skills, and met interesting people, some of whom became friends, and who I hope to continue to keep in touch. Particularly my time at Google was also financially a boon. And it also gave me the freedom to prepare for the Abstract Wikipedia project, and to get to know so many experts in their field and work together with them, to have the project criticized and go through several iterations until nothing seems obviously wrong with it. There is no place like the Bay Area in the world of Tech. It was comparably easy to have meetings with folks at Google, Facebook, Wikimedia, LinkedIn, Amazon, Stanford, Berkeley, or to have one of the many startups reach out for a quick chat. It is, in many ways, a magical place, and no other place we may move to will come even close to it with regards to its proximity to tech.

And then there’s the wonderful weather in the Bay Area and the breathtaking nature of California. It never gets really hot, it never gets really cold. The sun is shining almost every day, rain is scarce (too scarce), and we never have to drive on icy streets or shovel snow. If we want snow, we can just drive up to the Sierras. If we want heat, drive inland. We can see the largest trees in the world, walk through the literal forests of Endor, we can hike hills and mountains, and we can walk miles and miles along the sand beaches of the Pacific Ocean. California is beautiful.

Oh, and the food and the produce! Don’t get me started on Berkeley Bowl and its selection of fruits and vegetables. Of the figs in their far too short season, of the dry-farmed Early Girl tomatoes and their explosion of taste, of the juicy and rich cherries we picked every year to carry pounds and pounds home, and to eat as many while picking, the huge diversity of restaurants in various states from authentic to fusion, but most of them with delicious options and more dishes to try than time to do it.

And not just the fruits and vegetables are locally sourced: be it computers from Apple, phones from Google, the social media from Facebook or Twitter, the wonderful platform enabling the Wikimedia communities, be it cars from Tesla, be it movies from Pixar, the startups, the clouds, the AIs: so. many. things. are local. And every concert tour will pass by in the Bay Area. In the last year we saw so many concerts here, it was amazing. That’s a place the tours don’t skip.

Finally: in California, because so many people are not from here, we felt more like we belong just as well as everyone else, than anywhere else. Our family is quite a little mix, with passports from three continents. Our daughter has no simple roots. Being us is likely easier in the United States than in any of the European nation states with their millenia of identity. After a few years I felt like an American. In Germany, although it treated me well, after thirty years I still was an Ausländer.

As said, it is a unique place. I love it. It is a privilege and an amazing experience to have spent one decade of my life here.

Why are we moving? In short, guns and the inadequate social system.

In the last two years alone, we had four close-ish encounters with people wielding guns (not always around home). And we are not in a bad neighborhood, on the contrary. This is by all statistics one of the safest neighborhoods you will find in the East Bay or the City.

We are too worried to let the kid walk around by herself or even with friends. This is such a huge difference to how I grew up, and such a huge difference to when we spent the summer in Croatia, and she and other kids were off by themselves to explore and play. Here, there was not a single time she went to the playground or visited a friend by herself, or that one of her friends visited our house by themselves.

But even if she is not alone: going to the City with the kid? There are so many places there I want to avoid. Be it around the city hall, be it in the beautiful central library, be it on Market Street or even just on the subway or the subway stations: too often we have to be careful to avoid human excrement, too often we are confronted with people who are obviously in need of help, and too often I feel my fight or flight reflexes kicking in.

All of this is just the visible effect of a much larger problem, one that we in the Bay Area in particular, but as Americans in general should be ashamed of not improving: the huge disparity between rich and poor, the difficult conditions that many people live in. It is a shame that so many people who are in dire need of professional help live on the streets instead of receiving mental health care, that there are literal tent cities in the Bay Area, while the area is also the home of hundreds of thousands of millionaires and more than sixty billionaires - more than the UK, France, or Switzerland. It is a shame that so many people have to work two or more jobs in order to pay their rent and feed themselves and their children, while the median income exceeds $10,000 a month. It is a shame that this country, which calls itself the richest and most powerful and most advanced country in the world, will let its school children go hungry. Is “school lunch debt” a thing anywhere else in the world? Is “medical bankruptcy” a thing anywhere else in the world? Where else are college debts such a persistent social issue?

The combination of the easy availability of guns and the inadequate social system leads to a large amount of avoidable violence and to tens of thousands of seemingly avoidable deaths. And they lead to millions of people unnecessarily struggling and being denied a fair chance to fulfill their potential.

And the main problem, after a decade living here, is not where we are, but the trajectory of change we are seeing. I don’t have hope that there will be a major reduction in gun violence in the coming decade, on the contrary. I don’t have hope for any changes that will lead to the Bay Area and the US spreading the riches and gains it is amassing substantially more fairly amongst its population, on the contrary. Even the glacial development in self-driving cars seems breezy compared to the progress towards killing fewer of our children or sharing our profits a little bit more fairly.

After the 1996 Port Arthur shooting, Australia established restrictions on the use of automatic and semi-automatic weapons, created a gun buyback program that removed 650,000 guns from circulation, a national gun registry, and a waiting period for firearms sales. They chose so.

After the 2019 Christchurch shooting, New Zealand passed restrictions on semi-automatic weapons and a buyback program removed 50,000 guns. They chose so.

After the shootings earlier this year in Belgrade, Serbia introduced stricter laws and an amnesty for illegal weapons and ammunition if surrendered, leading to more than 75,000 guns being removed. They chose so.

I don’t want to list the events in the US. There are too many of them. And did any of them lead to changes? We choose not to.

We can easily afford to let basically everyone in the US live a decent life and help those that need it the most. We can easily afford to let no kid be hungry. We can easily afford to let every kid have a great education. We choose not to.

I don’t want my kid to grow up in a society where we make such choices.

I could go on and rant about the Republican party, about Trump possibly winning 2024, about our taxes supporting and financing wars in places where they shouldn’t, about xenophobia and racism, about reproductive rights, trans rights, and so much more. But unfortunately many of these topics are often not significantly better elsewhere either.

When are we moving? We plan to stay here until the school year is over, and aim to have moved before the next school year starts. So in the summer of ‘24.

Where are we moving? I am going back to my place of birth, Stuttgart. We considered a few options, and Stuttgart led overall due to the combination of proximity to family, school system compatibility for the kid, a time zone that works well for the Abstract Wikipedia team, language requirements, low legal hurdles of moving there, and the cost of living we expect. Like every place it also comes with challenges. Don’t get me started on the taste of tomatoes or peaches.

What other places did we consider? We considered many other places, and we traveled to quite a few of them to check them out. We loved each and every one of them. We particularly loved Auckland due to our family there and the weather, we loved the beautiful city of Barcelona for its food and culture, we loved Dublin, London, Zürich, Berlin, Vienna, Split. We started making a large spreadsheet with pros and contras in many categories, but in the end the decision was a gut decision. Thanks to everyone who talked with us and from whom we learned a lot about those places!

Being able to even consider moving to these places is a privilege. And we understand that and are thankful for having this privilege. Some of these places would have been harder to move for us due to immigration regulation, others are easy thanks to our background. But if you are thinking of moving, and are worried about certain aspects, feel free to reach out and discuss. I am happy to offer my experience and perspective.

Is there something you can help with? If you want to meet up with us while we are still in the US, it would be good to do so timely. We are expecting to sell the house quite a bit sooner, and then we won’t be able to host guests easily. I am also looking forward to reconnecting with people in Europe after the move. Finally, if you know someone who is interested in a well updated 3 bedroom house with a surprisingly large attic that can be used as a proper hobby space, and with a top walkability index in south Berkeley, point them our way.

Also, experiences and advice regarding moving from the US to Germany are welcome. Last time we moved the other way, and we didn’t have that much to move, and Google was generously organizing most of what needed to be done. This time it’s all on us. How to get a container and get it loaded? How to ship it to Germany? Where to store it while we are looking for a new home? How to move the cat? How to make sure all goes well with the new school? When to sell the house and where to live afterwards? How to find the right place in Germany? What are the legal hurdles to expect? How will taxes work? So many questions we will need to answer in the coming months. Wish us luck for 2024.

We also accept good wishes and encouraging words. And I am very much looking forward to seeing some of you again next year!

Sam Altman and the veil of ignorance

(This is not about Altman having been removed as CEO of OpenAI)

During the APEC forum on Thursday, Sam Altman has been cited to having said the following thing: "Four times now in the history of OpenAI—the most recent time was just in the last couple of weeks—I’ve gotten to be in the room when we push the veil of ignorance back and the frontier of discovery forward. And getting to do that is like the professional honor of a lifetime."

He meant that as an uplifting quote to describe how awesome his company and their achievements are.

I find it deeply worrying. Why?

The "veil of ignorance" (also known as the original position) is a thought experiment introduced by John Rawls, one of the leading American moral and political philosophers of the 20th century. The goal is to think about the fairness of a society or a social system without you knowing where in the system you end up: are you on top or at the bottom? What are your skills, your talents? Who are your friends? Do you have disabilities? What is your gender, your family history?

The whole point is to *not* push the veil of ignorance back, otherwise you'll create an unfair system. It is a good tool to think about the coming disruptions by AI technology.

The fact that he's using that specific term but is obviously entirely oblivious to its meaning tells us that there was a path that term took, probably from someone working on ethics to then-CEO Altman, and that someone didn't listen. The meaning was lost, and the beautiful phrase was entirely repurposed.

Given that's coming from the then-CEO of the company that claims and insists on, again and again (without substantial proof) that they are doing all this for the greater benefit of all humanity, that are, despite their name, increasingly closing their results, making public scrutiny increasingly difficult if not impossible - well, I find that worrying. The quote indicates that they have no idea about a basic tool towards evaluating fairness, even worse, have heard about it - but they have not listened or comprehended.

Babel

Strong recommendation for "Babel" by R.F. Kuang. It's a speculative fiction story set in 1830s Oxford with an, as far as I can tell, novel premise: one can cast spells (although they don't call it spells but it's just science in this world) by using two words that translate into each other, and the semantic difference between the two words - because no translation is perfect - is the effect of the spell. But the effect can only be achieved if you have a speaker who's fluent enough in both languages to have a native understanding of the difference.

One example would be the French parcelle and the English parcel, both meaning package, but the French still carries some of the former "to split into parts", with the effect that packages are lighter and easier to transport for the Royal Mail.

The story remains comfortable for the first half of the volume, with beautiful world building, character drawing, and the tranquil academic life of Oxford students, but then it suddenly picks up speed, and we can experience the events unfold with a merciless speed. The end is just in the right place, and it leaves me to yearn to revisit this world and the desire to learn what happened next.

The volume discusses some heavy topics - colonialism, dependency on technology, fairness, what is allowed in a revolution, the "neutrality" of science - and while we are still in the first half of the volume, it feels very on the nose, very theoretical - but that changes dramatically as we swing into the second half of the volume, and suddenly all these theoretical discussions become very immediate. Which does remind me of student life, where discussions about different political systems and abstract notions of justice are just as prevalent and as consequence-free as they seem to be here, at first.

The book was recommended by the Lingthusiasm podcast, which is how I found it.

I came for the linguistic premise, but I stayed for the characters and their fates in a colonial world.

Existential crises

I think the likelihood of AI killing all humans is bigger than the likelihood of climate change killing all humans.

Nevertheless I think that we should worry and act much more about climate change than about AI.

Allow me to explain.

Both AI and climate change will, in this century, force changes to basically every aspect of the lives of basically every single person on the planet. Some people may benefit, some may not. The impact of both will be drastic and irreversible. I expect the year 2100 to look very different from 2000.

Climate change will lead to billions of people to suffer, and to many deaths. It will destroy the current livelihoods of many millions of people. Many people will be forced to leave their homes, not because they want to, but because they have to in order to survive. Richer countries with sufficient infrastructure to deal with the direct impact of a changed climate will have to decide how to deal with the millions of people who want to live and who want their children not to die. We will see suffering on a scale never seen before, simply because there have never been this many humans on the planet.

But it won't be an existential threat to humanity (the word humanity has at least two meanings: 1) the species as a whole, and 2) certain values we associate with humans. Unfortunately, I only refer to the first meaning. The second meaning will most certainly face a threat). Humanity will survive, without a doubt. There are enough resources, there are enough rich and powerful people, to allow millions of us to shelter away from the most life threatening consequences of climate change. Millions will survive for sure. Potentially at the costs of many millions lives and the suffering of billions. Whole food chains, whole ecosystems may collapse. Whole countries may be abandoned. But humanity will survive.

What about AI? I believe that AI can be a huge boon. It may allow for much more prosperity, if we spread out the gains widely. It can remove a lot of toil from the life of many people. It can make many people more effective and productive. But history has shown that we're not exactly great at sharing gains widely. AI will lead to disruptions in many economic sectors. If we're not careful (and we likely aren't) it might lead to many people suffering from poverty. None of these pose an existential threat to humanity.

But there are outlandish scenarios which I think might have a tiny chance of becoming true and which can kill every human. Even a full blown Terminator scenario where drones hunt every human because the AI has decided that extermination is the right step. Or, much simpler, that in our idiocy we let AI supervise some of our gigantic nuclear arsenal, and that goes wrong. But again, I merely think these possible, but not in the slightest likely. An asteroid hitting Earth and killing most of us is likelier if you ask my gut.

Killing all humans is a high bar. It is an important bar for so called long-termists, who may posit that the death of four or five billion people isn't significant enough to worry about, just a bump in the long term. They'd say that they want to focus on what's truly important. I find that reasoning understandable, but morally indefensible.

In summary: there are currently too many resources devoted to thinking about the threat of AI as an existential crisis. We should focus on the short term effect of AI and aim to avoid as many of the negative effects as possible and to share the spoils of the positive effects. We're likely to end up with socializing the negative effects, particularly amongst the weakest members of society, and privatizing the benefits. That's bad.

We really need to devote more resources towards avoiding climate change as far as still possible, and towards shielding people and the environment from the negative effects of climate change. I am afraid we're failing at that. And that will cause far more negative impact in the course of this century than any AI will.

Wikidata crossed 2 billion edits

The Wikidata community edited Wikidata 2 billion times!

Wikidata is, to the best of my knowledge, the first and only wiki to cross 2 billion edits (the second most edited one being English Wikipedia with 1.18 billion edits).

Edit Nr 2,000,000,000 was adding the first person plural future of the Italian verb 'grugnire' (to grunt) by user Luca.favorido.

Wikidata also celebrated 11 years since launch with the hybrid WikidataCon 2023 in Taipei last weekend.

It took from 2012 to 2019 to get the first billion, and from 2019 to now for the second. As they say, the first billion is the hardest.

That the two billionth edit happens right on the Birthday is a nice surprise.

The letter Đ

The letter Đ was introduced to Serbo-Croatian by Đuro Daničić, according to Wikipedia. I found that highly amusing, that he introduced the letter that is the first letter in his name.

Wikipedia also claims that he was born Đorđe Popović, and all I can think of is "nah, that can't be right".

That would be like Jebediah Springfield who was born in a cabin that he helped build.

Pastir Loda

Vladimir Nazor is likely the most famous author from the island of Brač, the island my parents are from. His most acclaimed book seems to be Pastir Loda, Loda the Shepherd. It tells the story of a satyr that, through accidents and storms, was stranded on the island of Brač, and how he lived on Brač for the next almost two thousand years.

It is difficult to find many of his works, they are often out of print. And there isn't much available online, either. Since Nazor died in 1949, his works are in the public domain. I acquired a copy of Pastir Loda from an antique book shop in Zagreb, which I then forwarded to a friend in Denmark who has a book scanner, and who scanned the book so I can make the PDF available now.

The book is written in Croatian. There is a German translation, but that won't get into the public domain until 2043 (the translator lived until 1972), and according to WorldCat there is a Czech translation, and according to Wikipedia a Hungarian translation. For both I don't know who the translator is, and so I don't know the copyright status of these translations. I also don't know if the book has ever been translated to other languages.

I wish to find the time to upload and transcribe the content on Wikisource, and then maybe even do a translation of it into English. For now I upload the book to archive.org, and I also make it available on my own Website. I want to upload it to Wikimedia Commons, but I immediately stumbled upon the first issue, that it seems that to upload it to Commons the book needs to be published before 1928 and the author has to be dead for more than 70 years (I think that should be an or). I am checking on Commons if I can upload it or not.

Until then, here's the Download:


F in Croatian

I was writing some checks to find errors in the lexical data in Wikidata for Croatian, and one of the things I tried was to check whether the letters in the words are all part of the Croatian alphabet. But instead of just taking a list, or writing down from memory, I looked at the data, and added letter after letter. And then I was surprised to find that the letter "f" only appears in loanwords. And I look it up in the Croatian Encyclopedia and it simply states that "f" is not a letter of the old slavic language.

I was mindblown. I speak this language since I can remember, and i didn't notice that there is no "f" but in loanwords. And "f" seems like such a fundamental sound! But no, wrong!

If you speak a slavic language, do you have the letter "f"?

Do you hear the people sing?

"Do you hear the people sing, singing the song of angry men..."

Yesterday, a London performance of Les Miserables was interrupted by protesters raising awareness about climate change.

The audience booed.

It seems the audience was unhappy about having to experience protests and unrest during the performance of protests and unrest they wanted to enjoy.

The hypocrisy is rich in this one, but a very well engineered and expected one. But I guess only with the luxury of being detached from the actual event one can afford to enjoy the hypocrisy. I assume that for many people attending a West End London production of Les Miserables aims to be a proper highlight of the year, if not more. It's something that children gift their parents for the 30th wedding anniversary. It may be the reason for a trip to London. In addition, attending a performance like this is an escapist act, that you don't want interrupted with the problems of the real world. And given that it is a life performance, it seems disrespectful to the cast, to the artists, who pour their lives into their roles.

On the other side, the existential dread about climate change, and the insufficient actions by world leaders seem to demand increasingly bolder action and more attention. We are teaching our kids that they should act if something is not right. And we are telling them about the predictions for climate change. And then we are surprised if they try to do something? The message that climate change will be extremely disruptive to our lives and that we need to act much more decisively has obviously not yet been understood by enough people. And we, humanity, our leaders, elected or not, are most certainly not yet doing enough to try to prevent or at least mitigate the effects of climate change that are starting to roll over us.

It would be good, but admittedly unlikely, if both sides could appreciate the other more. Maybe the audience might be a bit appreciative of seeing the people sing the song of angry men in real. And maybe the protesters could choose their targets a bit more wisely. Why choose art? There are more disruptive targets if you were to protest the oil industry than a performance of Les Miserables. To be honest, if i were working for the oil industry, this is exactly the kind of actions I would be setting up. And with people who are actually into the cause. That way I can ensure that people will talk about interrupted theater productions and defaced paintings, instead of again having the hottest year in history, of floods, heatwaves, hurricanes, and the thousands of people who already died due to climate change induced catastrophes - and the billions more whose life will be turned upside down.

Immortal relationships

I saw a beautiful meme yesterday that said that from the perspective of a cat or dog, humans are like elves who live for five hundred years and yet aren't afraid to bond with them for their whole life. And it is depicted as beautiful and wholesome.

It's so different from all those stories of immortals, think of Vampires or Highlander or the Sandman, where the immortals get bitter, or live in misery and loss, or become aloof and uncaring about human lives and their short life spans, and where it hurts them more than it does them good.

There seem to be more stories exploring the friendship of immortals with short-lived creatures, be it in Rings of Power with the relationship of Elrond and Durin, be it the relation of Star Trek's Zora with the crew of the Discovery or especially with Craft in the short movie Calypso, or between the Eternal Sersi and Dane Whitman. All these relations seem to be depicted more positively and less tragic.

In my opinion that's a good thing. It highlights the good parts in us that we should aspire to. It shows us what we can be, based in a very common perception, the relationship to our cats and dogs. Stories are magic, in it's truest sense. Stories have an influence on the world, they help us understand the world, imagine the impact we can have, explore us who we can be. That's why I'm happy to see these more positive takes on that trope compared to the tragic takes of the past.

(I don't know if any of this is true. I think it would require at least some work to actually capture instances of such stories, classify and tally them, to see if that really is the case. I'm not claiming I've done that groundwork, but just capture an observation that I'd like to be true, but can't really vouch for it.)

Molly Holzschlag (1963-2023)

May her memory be a blessing.

She taught the Web to many, and she fought for the Web of many.

Doug Lenat (1950-2023)

When I started studying computer science, one of the initiation rites was to read the Jargon File. I stumbled when I read the entry on the microlenat:

microlenat: The unit of bogosity. Abbreviated μL, named after Douglas Lenat. Like the farad it is considered far too large a unit for practical use, so bogosity is usually expressed in microlenats.

I had not heard of Douglas Lenat then. English being my third language, I wasn’t sure what bogosity is. So I tried to learn a bit more to understand it, and I read a bit about Cyc and Eurisko, but since I just started computer science, my mind wasn’t really ready for things such as knowledge representation and common sense reasoning. I had enough on my plate struggling with resistors, electronegativity, and fourier transformations. Looking back, it is ironic that none of these played a particular role in my future, but knowledge representation sure did.

It took me almost ten years to come back to Cyc and Lenat’s work. I was then studying ontological engineering, a term that according to Wikipedia was coined by Lenat, a fact I wasn’t aware of at that time. I was working with RDF, which was co-developed by Guha, who has worked with Lenat at Cycorp, a fact I wasn’t aware of at that time. I was trying to solve problems that Lenat had tackled decades previously, a fact I wasn’t aware of at that time.

I got to know Cyc through OpenCyc and Cyc Europe, led by Michael Witbrock. I only met Doug Lenat a decade later when I was at Google.

Doug’s aspirations and ambitions had numerous people react with rolling eyes and sneering comments, as can be seen in the entry in the Jargon File. And whereas I might have absorbed similar thoughts as well, they also inspired me. I worked with a few people who told me “consider yourself lucky if you have a dozen people reading your paper, that’s the impact you will likely have”, but I never found that a remotely satisfactory idea. Then there were people like Doug, who shouted out “let’s solve common sense!”, and stormed ahead trying to do so.

His optimism and his bias to action, his can-do attitude, surely influenced me profoundly in choosing my own way forward. Not only once did I feel like I was channeling Lenat when I was talking about knowledge bases that anyone can edit, about libraries of functions anyone can use, or about abstract representations of natural language texts. And as ambitious as these projects have been called, they all carefully avoid the incomparably more ambitious goals Doug had his eyes set on.

And Doug didn’t do it from the comfort of a tenured academic position, but he bet his career and house on it, he founded a company, and kept it running for four decades. I was always saddened that Cyc was kept behind closed doors, and I hope that this will not hinder the impact and legacy it might have, but I understand that this was the magic juice that kept the company running.

One of Doug’s systems, Eurisko, became an inspiration and namesake for an AI system that played the role of the monster of the week in a first season episode of the X-Files, a fact I wasn’t aware of until now. Doug was a founder and advisory member of the TTI/Vanguard series of meetings, to which I was invited to present an early version of Abstract Wikipedia, a fact I wasn’t aware of until now. And I am sure there are more facts about Doug and his work and how it reverberated with mine that I am unaware of still.

Doug was a person ahead of their time, a person who lived, worked on and saw a future about knowledge that is creative, optimistic and inspiring. I do not know if we will ever reach that future, but I do know that Doug Lenat and his work will always be a beacon on our journey forward. Doug Lenat died yesterday in Austin, Texas, two weeks shy of his 73rd birthday, after a battle with cancer.

To state it in CycL, the language Cyc is written in:

 (#$dateOfDeath #$DougLenat "2023-08-31")
 (#$restInPeace #$DougLenat)

Butter

So, I went to the store with Little One today, and couldn't find the butter.

I ask the person at the cheese stand, who points me to the burrata. Tasty, but not what I'm looking for. I ask again and he sends me to the bread section.

I can't find it at the bread section, so I ask the person at the pastries stand where the butter is. She points me to the bagels. I say no, butter. She says, ah, there, pointing to the bathrooms. I'm getting exasperated, and I ask again. She points back to the cheeses with the burrata. I try again. She gets a colleague, and soon they both look confused.

Finally my daughter chimes in, asking for the butter. They immediately point her to the right place and we finally get the butter.

I haven't been so frustrated about my English pronunciation since I tried to buy a thermometer.

The Jones Brothers

The two Jones brothers never got along, but both were too stubborn to leave the family estate. They built out two entrances to the estate, one from the south, near Jefferson Avenue, and the newer, bigger one, closer to the historic downtown, and each brother chose to use one of the entrances exclusively, in order to avoid the other and their family. To the confusion of the local folk (but to the open enjoyment of the high school's grammar teacher, who was, surprisingly for his role, a descriptivist), they named the western gate the Jones' gate, and the southern one the Jones's gate, and the brothers earnestly thought that that settled it.

It didn't.

The Future of Knowledge Graphs in a World of Large Language Models

The Knowledge Graph Conference 2023 in New York City invited me for a keynote on May 11, 2023. Given that basically all conversations these days are about large language models, I have given a talk about my understanding on how knowledge graphs and large language models go together.

After the conference, I did a recording of the talk, giving it one more time, in order to improve the quality of the recording. The talk had gotten more than 10,000 views on YouTube so far, which, for me, is totally astonishing.

I forgot to link it here, so here we go finally:

Hot Skull

I watched Hot Skull on Netflix, a Turkish Science Fiction dystopic series. I knew there was only one season, and no further seasons were planned, so I was expecting that the story would be resolved - but alas, I was wrong. And the book the show is based on is only available in Turkish, so I wouldn't know of a way to figure out how the story end.

The premise is that there is a "semantic virus", a disease that makes people 'jabber', to talk without meaning (but syntactically correct), and to be unable to convey or process any meaning anymore (not through words, and very limited through acts). They seem also to loose the ability to participate in most parts of society, but they still take care of eating, notice wounds or if their loved ones are in distress, etc. Jabbering is contagious, if you hear someone jabber, you start jabbering as well, jabberers cannot stop talking, and it quickly became a global pandemic. So they are somehow zombieish, but not entirely, raising questions about them still being human, their rights, etc. The hero of the story is a linguist.

Unfortunately, the story revolves around the (global? national?) institution that tries to bring the pandemic under control, and which has taken over a lot of power (which echoes some of the conspiracy theories of the COVID pandemic), and the fact that this institution is not interested in finding a cure (because going back to the former world would require them to give back the power they gained). The world has slid into economic chaos, e.g. getting chocolate becomes really hard, there seems to be only little international cooperation and transportation going on, but there seems to be enough food (at least in Istanbul, where the story is located). Information about what happened in the rest of the world is rare, but everyone seems affected.

I really enjoyed the very few and rare moments where they explored the semantic virus and what it does to people. Some of them are heart-wrenching, some of them are interesting, and in the end we get indications that there is a yet unknown mystery surrounding the disease. I hope the book at least resolves that, as we will probably never learn how the Netflix show was meant to end. The dystopic parts about a failing society, the whole plot about an "organization taking over the world and secretly fighting a cure", and the resistance to that organization, is tired, not particularly well told, standard dystopic fare.

The story is told very slowly and meanders leisurely. I really like the 'turkishness' shining through in the production: Turkish names, characters eating simit, drinking raki, Istanbul as a (underutilized) background, the respect for elders, this is all very well meshed into the sci-fi story.

No clear recommendation to watch, mostly because the story is unfinished, and there is simply not enough payoff for the lengthy and slow eight episodes. I was curious about the premise, and still would like to know how the story ends, what the authors intended, but it is frustrating that I might never learn.

The right to work

20 May 2023

I've been a friend of Universal Basic Income for thirty years, but in the last twenty years, I have growing reservations about it, and many questions. This article about an experiment with a right to work was the first text in a while I read on it that substantially impacted my thinking on this (text is in German). I recommend reading it.

Work is not just a source of money, but for many also a source of meaning, pride, structure, motivation, social connections. Having voluntary access to work seems to be one major component that is necessary on a societal level, in addition to a universal basic income that allows that everyone can live in dignity. Note: I think work should be widely construed. If someone has something that fills that need, that's work. Raising children, taking care of a garden, writing a book, refining piano skills, creating art, taking care of others, taking care of yourself, all these easily count as work in my book.

I wish we were willing and able to experiment with different ways of structuring society as we are willing and able to experiment with technology. We deployed the Internet to the world without worrying about the long term consequences, but we're cautious about giving everyone enough money to not be hungry. That's just broken. I was always disappointed about the fact that sociology and politics as studied and taught by academia were mostly descriptive and not constructive endeavors.

Wikidata - The Making of

19 May 2023

Markus Krötzsch, Lydia Pintscher and I wrote a paper on the history of Wikidata. We published it in the History of the Web track at The Web Conference 2023 in Austin, Texas (what used to be called the WWW conference). This spun out of the Ten years of Wikidata post I published here.

The open access paper is available here as HTML: dl.acm.org/doi/fullHtml/10.1145/3543873.3585579

Here as a PDF: dl.acm.org/doi/pdf/10.1145/3543873.3585579

Here on Wikisource, thanks to Mike Peel for reformatting: Wikisource: Wikidata - The Making Of

Here is a YouTube trailer for the talk: youtu.be/YxWs_BS31QE

And here is the full talk (recreated) on YouTube: youtu.be/P3-nklyrDx4

20 years of editing Wikipedia

11 May 2023

Today it's been exactly twenty years since I made my first edit to Wikipedia. It was about the island of Brač, in the German Wikipedia.

Here is the version of the article I have created: Brač (as of May 11, 2003)

How much April 1st?

In my previous post, I was stating that I might miss April 1st entirely this year, and not as a joke, but quite literally. Here I am chronicling how that worked out. We were flying flight NZ7 from San Francisco to Auckland, starting on March 31st and landing on April 2nd, and here we look into far too much detail to see how much time the plane spent in April 1st during that 12 hours 46 minutes flight. There’s a map below to roughly follow the trip.

5:45 UTC / 22:45 31/3 local time / 37.62° N, 122.38° W / PDT / UTC-7

The flight started with taxiing for more than half an hour. We left the gate at 22:14 PDT time (doesn’t bode well), and liftoff was at 22:45 PDT.. So we had only about an hour of March left at local time. We were soon over the Pacific Ocean, as we would stay for basically the whole flight. Our starting point still had 1 hour 15 minutes left of March 31st, whereas our destination at this time was at 18:45 NZDT on April 1st, so still had 5 hours 15 minutes to go until April 2nd. Amusingly this would also be the night New Zealand switches from daylight saving time (NZDT) to standard time (NZST). Not the other way around, because the seasons are opposite in the southern hemisphere.

6:00 UTC / 23:00 31/3 local time / 37° N, 124° W / PDT / UTC-7

We are still well in the PDT / UTC-7 time zone, which, in general, goes to 127.5° W, so the local time is 23:00 PDT. We keep flying southwest.

6:27 UTC / 22:27 31/3 local time? / 34.7° N, 127.5° W / AKDT? / UTC-8?

About half an hour later, we reach the time zone border, moving out of PDT to AKDT, Alaska Daylight Time, but since Alaska is far away it is unclear whether daylight saving applies here. Also, at this point we are 200 miles (320 km) out on the water, and thus well out of the territorial waters of the US, which go for 12 nautical miles (that is, 14 miles or 22 km), so maybe the daylight saving time in Alaska does not apply and we are in international waters? One way or the other, we moved back in local time: it is suddenly either 22:27pm AKDT or even 21:27 UTC-9, depending on whether daylight saving time applies or not. For now, April 1 was pushed further back.

7:00 UTC / 23:00 31/3 local time? / 31.8° N, 131.3 W / AKDT? / UTC-8?

Half an hour later and midnight has reached San Francisco, and April 1st has started there. We were more than 600 miles or 1000 kilometers away from San Francisco, and in local time either at 23:00 AKDT or 22:00 UTC-9. We are still in March, and from here all the way to the Equator and then some, UTC-9 stretched to 142.5° W. We are continuing southwest.

8:00 UTC / 23:00 31/3 local time / 25.2° N, 136.8° W / GAMT / UTC-9

We are halfway between Hawaii and California. If we are indeed in AKDT, it would be midnight - but given that we are so far south, far closer to Hawaii, which does not have daylight saving time, and deep in international waters anyway, it is quite safe to assume that we really are in UTC-9. So local time is 23:00 UTC-9.

9:00 UTC / 0:00 4/1 local time / 17.7° N, 140.9° W / GAMT / UTC-9

There is no denying it, we are still more than a degree away from the safety of UTC-10, the Hawaiian time zone. It is midnight in our local time zone. We are in April 1st. Our plan has failed. But how long would we stay here?

9:32 UTC / 23:32 31/3 local time / 13.8° N, 142.5° W / HST / UTC-10

We have been in April 1st for 32 minutes. Now we cross from UTC-9 to UTC-10. We jump back from April to March, and it is now 23:32 local time. The 45 minutes of delayed take-off would have easily covered for this half hour of April 1st so far. The next goal is to move from UTC-10, but the border of UTC-10 is a bit irregular between Hawaii, Kiribati, and French Polynesia, looking like a hammerhead. In 1994, Kiribati pushed the Line Islands a day forward, in order to be able to claim to be the first ones into the new millennium.

10:00 UTC / 0:00 4/1 local time / 10° N, 144° W / HST / UTC-10

We are pretty deep in HST / UTC-10. It is again midnight local time, and again April 1st starts. How long will we stay there now? For the next two hours, the world will be in three different dates: in UTC-11, for example American Samoa, it is still March 31st. Here in UTC-10 it is April 1st, as it is in most of the world, from New Zealand to California, from Japan to Chile. But in UTC+14, on the Line Islands, 900 miles southwest, it is already April 2nd.

11:00 UTC / 1:00 4/1 local time / 3° N, 148° W / HST / UTC-10

We are somewhere east of the Line Islands. It is now midnight in New Zealand and April 1st has ended there. Even without the delayed start, we would now be solidly in April 1st local time.

11:24 UTC / 1:24 4/1 local time / 0° N, 150° W / HST / UTC-10

We just crossed the equator.

12:00 UTC / 2:00 4/2 local time / 3.7° S, 152.3° W / LINT / UTC+14

The international date line in this region does not go directly north-south, but goes one an angle, so without further calculation it is difficult to exactly say when we crossed the international date line, but it would be very close to this time. So we just went from 2am local time in HST / UTC-10 on April 1st to 2am local time in LINT / UTC+14 on April 2nd! This time, we have been in April 1st for a full two hours.

(Not for the first time, I wish Wikifunctions would already exist. I am pretty sure that taking a geocoordinate and returning the respective timezone will be a function that will be available there. There are a number of APIs out there, but none of which seem to provide a Web interface, and they all seem to require a key.)

12:44 UTC / 2:44 4/1 local time / 8° S, 156° W / HST / UTC-10

We just crossed the international date line again! Back from Line Island Time we move to French Polynesia, back from UTC+14 to UTC-10 again - which means it switches from 2:44 on April 2nd back to 2:44 on April 1st! For the third time, we go to April 1st - but for the first time we don’t enter it from March 31st, but from April 2nd! We just traveled back in time by a full day.

13:00 UTC / 3:00 4/1 local time / 9.6° S, 157.5° W / HST / UTC-10

We are passing between the Cook Islands and French Polynesia. In New Zealand, daylight saving time ends, and it switches from 3:00 local time in NZDT / UTC+13 to 2:00 local time in NZST / UTC+12. While we keep flying through the time zones, New Zealand declares itself to a different time zone.

14:00 UTC / 4:00 4/1 local time / 15.6° S, 164.5° W / HST / UTC-10

We are now “close” to the Cook Islands, which are associated with New Zealand. Unlike New Zealand, the Cook Islands do not observe daylight saving time, so at least one thing we don’t have to worry about. I find it surprising that the Cook Islands are not in UTC+14 but in UTC-10, considering they are in association with New Zealand. On the other side, making that flip would mean they would literally lose a day. Hmm. That could be one way to avoid an April 1st!

14:27 UTC / 3:27 4/1 local time / 18° S, 167° W / SST / UTC-11

We move from UTC-10 to UTC-11, from 4:27 back to 3:27am, from Cook Island Time to Samoa Standard Time. Which, by the way, is not the time zone in the independent state of Samoa, as they switched to UTC+13 in 2011. Also, all the maps on the UTC articles in Wikipedia (e.g. UTC-12) are out of date, because their maps are from 2008, not reflecting the change of Samoa.

15:00 UTC / 4:00 4/1 local time / 21.3° S, 170.3° W / SST / UTC-11

We are south of Niue and east of Tonga, still east of the international date line, in UTC-11. It is 4am local time (again, just as it was an hour ago). We will not make it to UTC-12, because there is no UTC-12 on these latitudes. The interesting thing about UTC-12 is that, even though no one lives in it, it is relevant for academics all around the world as it is the latest time zone, also called Anywhere-on-Earth, and thus relevant for paper submission deadlines.

15:23 UTC / 3:23 4/2 local time / 23.5° S, 172.5° W / NZST / UTC+12

We crossed the international date line again, for the third and final time for this trip! Which means we move from 4:23 am on April 1st local time in Samoa Standard Time to 3:23 am on April 2nd local time in NZST (New Zealand Standard Time). We have now reached our destination time zone.

16:34 UTC / 4:34 4/2 local time / 30° S, 180° W / NSZT / UTC+12

We just crossed from the Western into the Eastern Hemisphere. We are about halfway between New Zealand and Fiji.

17:54 UTC / 5:52 4/2 local time / 37° S, 174.8°W / NZST / UTC+12

We arrived in Auckland. It is 5:54 in the morning, on April 2nd. Back in San Francisco, it is 10:54 in the morning, on April 1st.

april1avoiding.png

Green is March 31st, Red April 1st, Blue April 2nd, local times during the flight.

Basemap https://commons.wikimedia.org/wiki/File:Standard_time_zones_of_the_world_%282012%29_-_Pacific_Centered.svg CC-BY-SA by TimeZonesBoy, based on PD by CIA World Fact Book

Postscript

Altogether, there was not one April 1st, but three stretches of April 1st: first, for 32 minutes before returning to March 31st, then for 2 hours again, then we switched to April 2nd for 44 minutes and returned to April 1st for a final 2 hours and 39 minutes. If I understand it correctly, and I might well not, as thinking about this causes a knot in my brain, the first stretch would have been avoidable with a timely start, the second could have been much shorter, but the third one would only be avoidable with a different and longer flight route, in order to stay West of the international time line, going south around Samoa.

In total, we spent 5 hours and 11 minutes in April 1st, in three separate stretches. Unless Alaskan daylight saving counts in the Northern Pacific, in which case it would be an hour more.

So, I might not have skipped April 1st entirely this year, but me and the other folks on the plane might well have had the shortest April 1st of anyone on the planet this year.

I totally geeked out on this essay. If you find errors, I would really appreciate corrections. Either in Mastodon, mas.to/@vrandecic, or on Twitter, @vrandecic. Email is the last resort, vrandecic@gmail.com (The map though is just a quick sketch)

One thing I was reminded of is, as Douglas Adams correctly stated, that writing about time travel really messes up your grammar.

The source for the flight data is here:

No April Fool's day

This year, I am going to skip April Fool's day.

I am not being glib, but quite literal.

We are taking flight NZ7 starting on the evening of March 31 in San Francisco, flying over the Pacific Ocean, and will arrive on April 2 in the early morning in Auckland, New Zealand.

Even if one actually follows the flight route and overlays it over the timezone map, it looks very much like we are not going to spend more than a few dozen minutes, or at most a few hours, in April 1, if all goes according to plan.

Looking forward to it!

Here's the flight data of a previous NZ7 flight, from Sunday: https://flightaware.com/live/flight/ANZ7/history/20230327/0410Z/KSFO/NZAA/tracklog

Here are the timezones (but it's Northern winter time). Would be nice to overlay the two maps: 1672px-Standard_time_zones_of_the_world_%282012%29_-_Pacific_Centered.svg.png

Where's Wikifunctions when it's needed?

The question seems to be twofold: how often do we cross the dateline, and how close are we to local time midnight while crossing the dateline. For a perfect date miss one would need to cross the dateline exactly once, at a 24 hour difference, as close as possible to local midnight.

Gordon Moore (1929-2023)

Gordon Moore was not only the co-founder of Intel and the namesake for Moore's law, the claim that every two years the number of components on a chip would double, he was also, together with his wife Betty Moore, one of the generous donors who made Wikidata possible. Gordon and Betty Moore were known for their philanthropy, and it is easy to find their names engraved at the universities, zoos, museums, and galleries in the Bay Area. Gordon Moore died today at the age of 94.

Thank you for enabling us to make Wikidata happen.

Last Unicorn dreamcast

One thing my sister and I were doing from time to time was to think, if the Last Unicorn was remade as a live action movie, who would we cast in the different roles? It is sad to see how many of our dreamcast have since died: our favorite Prince Lir was Heath Ledger, and King Haggard, unsurprisingly, Christopher Lee.

Since I was reading the book with Little One, and now watched the movie again (for the 120th time or so), I was again going through my current dreamcast. Here we go.

  • Lady Amalthea - Anna Taylor-Joy
  • Schmendrick - Zach Braff, Donald Glover
  • Molly Grue - Mia Farrow
  • Prince Lir - Owen Wilson, no, getting too old; Liam Hemsworth? Chloe Moretz would be an interesting choice
  • King Haggard - Anthony Hopkins, Jack Nicholson

Since it is en vogue, I also asked Bard and ChatGPT. Here's Bard:

  • Jeff Bridges as Schmendrick the Magician
  • Mia Farrow as Molly Grue
  • Angela Lansbury as the Unicorn
  • Daryl Hannah as Filidor
  • Jeff Goldblum as Schmendrick's sidekick, Bungle
  • Peter Dinklage as the Red Bull
  • Uma Thurman as Mommy Fortuna
  • Alan Rickman as King Haggard

OK, Alan Rickman is a great choice, but unfortunately dead. Uma Thurman as Mommy Fortuna is weird. Peter Dinklage as the Red Bull... why does the Red Bull need to be cast at all? Who are Bungle and Filidor? Angela Lansbury as the Unicorn / Lady Amalthea is... not the right choice. Also dead. Mia Farrow and Jeff Bridges are too obvious choices given their roles in the cast of the animation movie, also Angela Lansbury is from the original cast. I would give Bard a D for that answer.

ChatGPT says:

  • Unicorn - Saoirse Ronan or Emma Watson
  • Schmendrick - Timothee Chalamet or Eddie Redmayne
  • Molly Grue - Meryl Streep or Helen Mirren
  • King Haggard - Ian McKellen or Christopher Lee (if he were still alive)
  • Prince Lir - Tom Holland or Ansel Elgort
  • Mommy Fortuna - Tilda Swinton or Cate Blanchett

Not even close to my choice, but heck, those are good choices. Molly Grue and Mommy Fortuna seem both a bit over-cast. Prince Lir and Schmendrick are pretty brilliant choices. I am not particularly impressed by Emma Watson for Lady Amalthea, I could see Saoirse Ronan work.

I'd give a B+ to ChatGPT.

I feel like ChatGPT seems to go for actors and actresses that are popular with the crowd who likes The Last Unicorn? Not the worst choice.

I followed up with Bard to see who Bungle and Filidor are, because they don't appear neither in the book nor in the movie, if I remember correctly. So, "Bungle is a red unicorn who is Schmendrick's sidekick in the novel The Last Unicorn. He is a clumsy and incompetent unicorn, but he is also a loyal and brave friend. Bungle is played by Jeff Goldblum in the 1982 animated film adaptation of the novel." And "Filidor is a bard in the novel The Last Unicorn. He is a wandering minstrel who travels the land singing songs and telling stories. Filidor is a wise and kind-hearted man, and he is always willing to help those in need. He is played by Daryl Hannah in the 1982 animated film adaptation of the novel." That's entirely a fabrication.

Running out of text

Many of the available text corpora have by now been used for training language models. One untapped corpus so far have been our private messages and emails.

How fortunate that none of the companies that train large language models have access to humongous logs of private chats and emails, often larger than any other corpus for many languages.

How fortunate that those who do have well working ethic boards established, who would make sure that such requests are evaluated.

How fortunate that we have laws in place to protect our privacy.

How fortunate that when new models are published also the corpora are being published on which the models are being trained.

What? Your telling me, "Open"AI is keeping the training corpus for GPT-4 secret? The company closely associated with Microsoft, who own Skype, Office, Hotmail? The same Microsoft who just fired an ethics team? Why would all that be worrisome?

P.S.: To make it clear: I don't think that OpenAI has used private chat logs and emails as training data for GPT-4. But by not disclosing their corpora, they might be checking if they can get away with not being transparent, so that maybe next time they might do it. No one would know, right? And no one would stop them. And hey, if it improves the metrics...

Oscar winning families

Yesterday, when Jamie Lee Curtis won her Academy Award, I learned that both her parents were also nominated for Academy Awards. Which lead to the question: who else?

I asked Wikidata, which lists four others:

  • Laura Dern
  • Liza Minnelli
  • Nora Ephron
  • Sean Astin

Only one of them belongs to the even more exclusive club of people who won an Academy Award, and where both parents also did: Liza Minnelli, daughter of Vincente Minelli and Judy Garland.

Wikidata query

Also interesting: List of Academy Award-winning families

The place of birth of Ena Begović

I stumbled accidentally over a discrepancy regarding the place of birth of the Croatian actress Ena Begović, and noticed that if you ask Google for the place of birth, it answers Trpanj, whereas Wikipedia lists Split. I was curious where Google got Trpanj from, and how to fix it (especially now that I am not at Google anymore).

The original article in English Wikipedia was created in August 2005 by Raoul DMR. The article listed her as a "native of Split", which in September 2005 was turned into "born in Split".

In April 2018, Lole484, a user who gets blocked for sockpuppeting later, adds that she was born in "Trpanj near Split". There is no Trpanj near Split, but there is a Trpanj on Pelješac. Realzing that, they remove the "near Split" part. In 2019, Ivan Ladic - a sockpuppet of Lole484 - adds a reference to the city of birth being Trpanj, Večernji list, a well known Croatian news magazine.

In April 2020, an anonymous editor changes the place of birth back to Split, and adds a reference to the Croatian national encyclopedia. Today, I changed it back to Trpanj, accidentally while not being logged in (thus anonymously), to possibly encourage a discussion, after starting a conversation on the talk page on English and Croatian a few weeks ago that had one reply.

Interestingly, within a minute after changing the text, I went to Google and asked again for the date of birth, and Google again shows me Trpanj - but this time with the Wikipedia article and the updated snippet as a source. That is impressive.

When I asked Bing, Bing was saying Split for the last three weeks, since I started this adventure, whenever I checked. Today, it still kept saying Split, referencing two sources, one of them English Wikipedia, although I had already changed English Wikipedia. Not as fresh. Let's see how long this will stick. (Maybe folks at Bing should also talk with my colleagues at Wikimedia Enterprise to improve their freshness?)

The Croatian article was created in 2006 after the English one already stated Split, and Split was presumably copied over from the English version. Lole484 changed it to Trpanj in May 2018, and was later also blocked on Croatian Wikipedia, for unrelated reasons of vandalism. The same anonymous editor as on English Wikipedia changes it back to Split in April 2020.

Serbian and Serbocroatian started their articles in 2007, Russian in 2012, Ukrainian in 2016, Albanian and Bulgarian in 2017, Egyptian Arabic was created in October 2020. They all had Split from the beginning and throughout until today, presumably copied from English, directly or indirectly.

Amusingly, Serbian Wikipedia's opening sentence, which includes the place of birth being Split, receives a reference in January 2022 - but the reference actually states Trpanj.

None of the other language editions had their article started in the 2018-2019 window when English and Croatian stated the place of birth as Trpanj.

The only other Wikipedia language edition that saw a change of the place of birth was the Bosnian. The article on Bosnian Wikipedia started a few months after the Croatian, in 2006 (and thus being the third oldest article), and presumably also just copied from either Croatian or English. Lole484 changed it to Trpanj in April 2018, just like on the other Wikipedias. Here it was reverted the next day, but Lole484's sockpuppet Ivan Ladic reinstated that change in January 2019. When I started this adventure, the only Wikipedia that stated Trpanj was Bosnian, all other eight language editions with an article said Split.

On Wikidata, the item was created in 2012, shortly after the launch of the site, based on the existing six sitelinks. The place of birth being Split is added the following year, imported from the Russian Wikipedia.

After I stumbled upon the situation, I added Trpanj as second place of birth, and added sources to both Trpanj and Split.

What's the situation outside of Wikipedia? Both places have pretty solid references going for them:

Trpanj

  • Večernji list, article from 2016
  • Biografija stated Trpanj, no date, but after 2013 (Archive has the first copy from October 2020)
  • tportal.hr has an article on a photography exhibition in Trpanj about Ena Begović, saying the place is chosen because it is her place of birth, published 2016
  • Jutarnji list, a well known Croatian newspaper, has a long article about the actress, calling their house in Trpanj the 'rodna kuća', their birth home, of Ena and her sister Mia. This does not necessarily mean that it is literally the house they were born in. Published 2010
  • HRT (Croatian national broadcaster), published 2021
  • Dubrovački Vjesnik, local newspaper close to Trpanj, lists Trpanj, article from 2020
  • Slobodna Dalmacija, a local newspaper from Split, writes Trpanj (but note that this is the same author as the previous article)
  • Juarnji list, published 2020 (but note that this is the same author as the previous article)
  • Geni.com says Trpanj, last updated 2022

Split

24sata says she grew up in Trpanj, gives her date of birth, but avoids stating her place of birth.

Only very few of the sources predate the English Wikipedia article, most notably:

I also looked up her sister Mia and found her profile on Facebook and sent her a message, but I assume she never even saw this message request. At least I never received an answer (and I didn't expect to). For Mia, the situation is similar: her article originally stated Split, was changed by Lole484 and reverted by an anonymous user, both in English and Croatian, whereas the other languages just list Split throughout.

There were many other sources, and they were going one way or the other. Many of the sources probably just copied from each other. The fact that there were some sources, such as Večernji, that stated Trpanj before it ever made to Wikipedia, but after Split was listed in Wikipedia, was swaying me to think it is Trpanj. Also, it was not always the strongest sources (e.g. usually I would rank the national encyclopedia over Večernji) that said Trpanj, but it was the most in-depth articles, that looked like the authors actually took the time to do some research. Many of the sources looked like they were just bots copying from Wikipedia or Wikidata, or quick pieces taking the base data from Wikipedia.

But then, finally, I stumbled upon one more source: index.hr re-published in 2019 an 1989 interview by Kemal Mujičić with Ena and Mia Begović. Here's a quote from the interview:

Rođene su u Trpnju na Pelješcu.
Ena: Molim vas, to posebno naglasite: Svi misle da smo Dubrovkinje.
Mia: Zanimljivo je da smo u Trpnju rođene kao podstanarke. Roditelji su tek poslije sagradili onu kućicu.

Translation:

They (Ena and Mia) are born in Trpanj on Pelješac.
Ena: Please put an emphasis on this: everyone thinks we are from Dubrovnik.
Mia: It is interesting that in Trpanj we were born as renters. Our parents built the little house (in which we lived) only later.

Ha! It is amusing to see that Ena's worry was that everyone thinks they are from Dubrovnik. I couldn't find a single source claiming that (but she went to high school (gimnazijum) in Dubrovnik, which is probably the source of that statement from 30 years ago). Also, so much for birth house.

Given all of that, I am going with Trpanj, and making the changes to the Wikipedia languages as much as I can (if someone can help with Arabic and Egyptian Arabic for Ena and Mia, that would be swell, I cannot edit that language edition). Let's see if it sticks.

So, why did Google know the correct answer, even though their usual sources, such as Wikidata and Wikipedia where saying Split? I mustn't say too much but it is due to the Google Knowledge Graph team and their quality processes. Seriously, congratulations to my former colleagues at Google for getting that right!

Just for fun, I also asked ChatGPT (on February 15). And the answer surprised me: when I asked in English, it gave me, unsurprisingly, Split (certainly what the Web seems to believe). But when I asked in Croatian, it gave me a different answer! And the answer was neither Split, nor Trpanj, and also not Dubrovnik - but Zagreb! It is interesting that something like the place of birth of an actress would lead to different answers depending on the language. I would have expected this knowledge to be in the 'world knowledge' of the LLM, not in the 'language knowledge'. I can't check out Bing's chat interface, as I have no access to it, but I would be curious what it says and how long it takes to update.

Thank you for going along on this rather nerdy ride of citogenesis.

Update

Ah, only a few hours after this publication, Bing got updated. And they not only switched from Split to Trpanj, they use this very blogpost as one of the two authoritative references for Trpanj!

Ina Kramer (1948-2023)

1990 erschien die erste aventurische Regionalkarte "im 3D Effekt", wie es damals beworben wurde, "Das Bornland" im Abenteuer "Stromaufwärts" von Michelle Schwefel. Später im Jahr erschien dann die Spielhilfe "Das Königreich am Yaquir", in dem die Karte zum Lieblichen Feld war.

Ich habe stundenlang diese Karten angestarrt. Sie waren so unglaublich detailliert. So wunderschön. Ich war sprachlos, wie schön diese Karten waren. Ich kannte nichts was die Qualität dieser Karten hatte, nicht nur bezüglich Karten für Rollenspielwelten und Fantasywelten, sondern überhaupt.

Es war ein frecher Traum, sich vorzustellen, ganz Aventurien in diesem Format, eins zu einer million, zu haben, und dennoch, innerhalb eines guten Jahrzehnts war der Traum erfüllt, Box für Box, Publikation für Publikation.

Wir verdanken dieses Meisterwerk, Aventurien im Massstab von 1:1.000.000, der Autorin und Grafikerin Ina Kramer. Ina's Bilder und vor allem Porträts und Karten in den DSA Publikationen der späten 80er und den 90er haben für mich mein Bild von DSA und wie ich mir Aventurien vorstellte geprägt wie sonst nur Caryad. Ob das Porträt von Kaiser Hal, Haldana von Ilmenstein, Prinz Brin, so viele andere. Neben ihren Bildern schrieb sie auch vielerlei Texte, vor allem Romane.

Das Rad ist zerbrochen. Am 10. Februar 2023 ist Ina Kramer im Alter von 74 Jahren gestorben.

Ina, vielen Dank für Deine Werke. Ich durfte Ina ein paar Mal treffen, auf Konventen und manchen anderen Gelegenheiten. Ihre Werke haben für mich einen wichtigen Teil meines Lebens mit Bildern und Karten erfüllt. Ich glaube auch, dass Inas Karten mein lebenslanges Interesse an Landkarten weckte.

Connectionism and symbolism: The fall of the symbolists

The big tech layoffs happen, unfortunately and entirely by coincidence, at a time of incredibly elevated expectations regarding machine learned generative models: ChatGPT may not be the 'best' language model out there, but due to the hard work by OpenAI to turn it into an easy to use product, and the huge amount of resources made available for free so that a very large audience could play with it, has in a very short time managed to captured the imagination of many and the conversation. I would say, rightfully. The way ChatGPT was released led to a shock in the sense that we are right now dazed and confused about what effect this technology will have on the world.

And while we are still in the middle of processing this shock, large scale strategic decisions regarding many projects and people were made. Anyone in big tech who worked on symbolic approaches in natural language processing, knowledge representation and reasoning, and other fields of artificial intelligence had a hard time to keep their job. It feels right now like large language models will make all of these symbolic approaches superfluous (I think, this might be true, but is more likely to turn out to be mistaken).

It is always difficult to predict how events will be viewed historically. The advent of wide-spread deep learning approaches in the 2010s, culminating in the well-deserved recognition of Hinton, LeCun, and Bengio with the Turing Award show clearly what dominated the research agenda and the attention in AI in the last decade. But until now it felt like symbolic approaches still had some space left, that the growth in deep learning was in addition to other approaches. Symbolic approaches were ready to offer impulses and work on ideas for a field which might well be climbing towards a local maximum.

But a good number of the teams that were disbanded in the layoffs were exactly teams working with such symbolic approaches, and it feels like these parts of AI are now entering a bitter-cold winter.

A lot of knowledge is being lost right now, and many paths to innovative ideas are being buried. I have no doubt that there are still a lot of breakthroughs to be had in machine learning, and that there is immense value to be collected from the research results in machine learning from the last few years. And with immense I mean tens and hundreds of billions of dollars.

Nevertheless I expect that we will hit a wall. Reach a local maximum. Run into problems and limitations. And it would be good to keep a wider net to cast. To keep a larger search space alive. Alas, it seems it is not meant to be. In this abundance of capital and potential value, we seem to be on the way to starve research, optimise away alternatives, and to give everything to the mainstream ideas.

22 years of Wikipedia

I was just reading a long discussion regarding the differences between Open Street Maps and Wikipedia / Wikidata, and one of the mappers complained "Wiki* cares less about accuracy than the fact that there is something that can be cited", and calling Wikipedia / Wikidata contributions "armchair work" because we don't go out into the world to check a fact, but rely on references.

I understand the expressed frustration, but at the same time I'm having a hard time letting go of "reliability not truth" being a pillar of Wikipedia.

But this makes Wikipedia an inherently conservative project, because we don't reflect a change in the world or in our perception directly, but have to wait for reliable sources to put it in the record. There's something I was deeply uncomfortable with: so much of my life is devoted to a conservative project?

Wikipedia is a conservative project, but at the same time it's a revolutionary project. Making knowledge free and making knowledge production participatory is politically and socially a revolutionary act. How can this seeming contradiction be brought to a higher level of synthesis?

In the last few years, my discomfort with the idea of Wikipedia being conservative has considerably dissipated. One might think, sure, that happened because I'm getting older, and as we get older, we get more conservative (there's, by the way, unfortunate data questioning this premise: maybe the conservative ones simply live longer because of inequalities). Maybe. But I like to think that the meaning of the word "conservative" has changed. When I was young, the word conservative referred to right wing politicians who aimed to preserve the values and institutions of their days. An increasingly influential part of todays right wing though has turned into a movement that does not conserve and preserve values such as democracy, the environment, equality, freedoms, the scientific method. This is why I'm more comfortable with Wikipedia's conservative aspects than I used to be.

But at the same time, that can lead to a problematic stasis. We need to acknowledge that the sources and references Wikipedia has been built on, are biased due to historic and ongoing inequalities in the world, due to different values regarding the importance of certain types of references in the world. If we truly believe that Wikipedia aims to provide everyone with access to the sum of all human knowledge, we have to continue the conversations that have started about oral histories, about traditional knowledges, beyond the confines of academic publications. We have to continue and put this conversation and evolution further into the center of the movement.

Happy Birthday, Wikipedia! 22 years, while I'm 44 - half of my life (although I haven't joined until two years later). For an entire generation the world has always been a world with free knowledge that everyone can contribute to. I hope there is no going back from that achievement. But just as democracy and freedom, this is not a value that is automatically part of our world. It is a vision that has to be lived, that has to be defended, that has to be rediscovered and regained again and again, refined and redefined. We (the collective we) must wrest it from the gatekeepers of the past (including me) to allow it to remain a living, breathing, evolving, ever changing project, in order to not see only another twenty two years, but for us to understand this project as merely a foundation that will accompany us for centuries.

Good bye, kuna!

Now that the Croatian currency has died, they all come to the Gates of Heaven.

First goes the five kuna bill, and Saint Peter says "Come in, you're welcome!"

Then the ten kuna bill. "Come in, you're welcome!"

So does the twenty and fifty kuna bills. "Come in, you're welcome!"

Then comes the hundred kuna bill, expecting to walk in. Saint Peter looks up. "Where do you think you're going?"

"Well, to heaven!"

"No, not you. I've never seen you in mass."

(My brother sent me the joke)

Happy New Year, 2023!

For starting 2023, I will join the Bring Back Blogging challenge. The goal is to write three posts in January 2023.

Since I have been blogging on and off the last few years anyway, that shouldn't be too hard.

Another thing this year should bring is to launch Wikifunctions, the project I have been working on since 2020. It was a longer ride than initially hoped for, but here we are, closer to launch than ever. The Beta is available online, and even though not everything works yet, I was already able to impress my kid with the function to reverse a text.

Looking forward to this New Year 2023, a number that to me still sounds like it is from a science fiction novel.

Goal for Wikidata lexicographic data coverage 2023

At the beginning of 2022, Wikidata had 807 Croatian word forms, covering 5.8% of a Croatian language corpus (Croatian Wikipedia). One of my goals this year was to significantly increase the coverage, trying to add word forms to Wikidata from week to week. And together with a yet small number of contributors, we pushed coverage just in time for the end fo the year to 40%. With only 3,124 forms, we covered 40% of all occurrences of words in the Croatian Wikipedia, i.e. 11.4 Million word occurrences (tokens).

Since every percent is more and more difficult to add, for next year I aim for us to reach 60% coverage, or 5.7 Million more word occurrences. Below's a list of most frequent words in the corpus that are still missing. Let's see how many forms will be covered by the end of 2023! I think that's ambitious, even though it is, in coverage term only half of what we achieved this year. But as said, every subsequent percentage will become more difficult than the previous one.

Statistics and missing words for 55 languages: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage

Current statistics for Croatian: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage/hr/Statistics

Statistics as of end of year 2022: https://www.wikidata.org/w/index.php?title=Wikidata:Lexicographical_coverage/hr/Statistics&oldid=1797161415

Statistics for end of year 2021: https://www.wikidata.org/w/index.php?title=Wikidata:Lexicographical_coverage/hr/Statistics&oldid=1551737937

List of most frequent missing forms in Croatian: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage/hr/Missing


Using ChatGPT for fact checking

I wanted to explore the idea that we could use ChatGPT for fact checking. So I took a random article - it gave me the article on Jewish Ideas Daily, copied the text into ChatGPT with the following prompt:

Here's the text of a Wikipedia article. Please find the three statements that are most likely incorrect and rank them, in order to guide fact checking:

Followed by the Wikipedia article text. Here's the answer:

  1. "Jewish Ideas Daily was edited by Margot Lurie from its founding in January 2010 until December 2012." This statement is incorrect because according to the article, Margot Lurie was the editor of Jewish Ideas Daily "until December 2012."
  2. "Jewish Ideas Daily drew notice for controversial features including one accusing Christopher Hitchens of antisemitism." This statement is difficult to verify without more context, as it is not clear when this feature was published or what specifically it accused Hitchens of.
  3. "Like Jewish Review of Books, and Jewish Ideas Daily, Mosaic is funded by the Tikvah Fund." This statement is incorrect because according to the article, Mosaic succeeded Jewish Ideas Daily in June 2013 and is therefore not funded by the Tikvah Fund in the same way as Jewish Ideas Daily and Jewish Review of Books.

Regarding the first statement, GPT says that the sentence "Jewish Ideas Daily was edited by Margot Lurie from its founding in January 2010 until December 2012." is incorrect. I am not disagreeing, but that sentence does not appear in the Wikipedia article text.

Regarding the second statement, GPT says that there is no reference for that statement. In fact, there is, and Wikipedia provides it.

Regarding the third statement, I didn't check the statement, but the argumentation for why it should be checked is unconvincing: if Tikvah fund financed Jewish Ideas Daily, why would it not finance the successor Mosaic? It would be good to add a reference for these statements, but that's not the suggestion.

In short: the review by ChatGPT looks really good, but the suggestions in this case were not good.

The exercise was helpful insofar the article infobox and the text were disagreeing on the founding of the newspaper. I fixed that, but that's nothing ChatGPT pointed out (and couldn't, as I didn't copy and paste the infobox).

Economic impacts of large language models, a take

Regarding StableDiffusion and GPT and similar models, there is one discussion point floating around, which I find seems to dominate the discussion but may not be the most relevant one. As we know, the training data for these models has been "basically everything the trainers could get their hands on", and then usually some stuff which is identified as possibly problematic is removed.

Many artists are currently complaining about their images, for which they hold copyright, being used for training these models. I think these are very reasonable complaints, and we will likely see a number of court cases and even changes to law to clarify the legal aspects of these practises.

From my perspective this is not the most important concern though. I acknowledge that I have a privileged perspective in so far as I don't pay my rent based on producing art or text in my particular style, and I entirely understand if someone who does is worried about that most, as it is a much more immediate concern.

But now assume that these models were all trained on public domain images and texts and music etc. Maybe there isn't enough public domain content out there right now? I don't know, but training methods are getting increasingly more efficient and the public domain is growing, so that's likely just a temporary challenge, if at all.

Does that change your opinion of such models?

Is it really copyright that you are worried about, or is it something else?

For me it is something else.

These models will, with quite some certainty, become similarly fundamental and transformative to the economy as computers and electricity have been. Which leads to many important questions. Who owns these models? Who can run them? How will the value that is created with these models be captured and distributed across society? How will these models change the opportunities of contributing to society, and there opportunities in participating in the wealth being created?

Copyright is one of the current methods to work with some of these questions. But I don't think it is the crucial one. What we need is to think about how the value that is being created is distributed in a way that benefits everyone, ideally.

We should live in a world in which the capabilities that are being discovered inspire excitement and amazement because of what might be possible in the future. Instead we live in a world where they cause anxiety and fear because of the very real possibility of further centralising wealth more effectively and further destabilizing lives that are already precarious. I wish we could move from the later world to the former.

That is not a question of technology. That is a question of laws, social benefits, social contracts.

A similar fear has basically killed the utopian vision which was once driving a project such as Google Books. What could have been a civilisational dream of having all the books of the world available everywhere has become so much less. Because of the fears of content creators and publishers.

I'm not saying these fears were wrong.

Unfortunately, I do not know what the answer is. What changes need to happen. Does anyone have links to potential answers, that are feasible? Feasible in the sense that the necessary changes have a chance of being actually implemented, as changes to our legal and social system.

My answer used to be Universal Basic Income, and part of me still thinks it might be our best shot. But I'm not as sure as I used to be twenty years ago. Not only about whether we can ever get there, but even whether it would be a good idea. It would certainly be a major change that would alleviate many of the issues raised above. And it could be financed by a form of AI tax, to ensure the rent is spread widely. But we didn't do that with industrialization and electrification, and there are reasonable arguments against.

And yet, it feels like the most promising way forward. I'm torn.

If you read this far, thank you, and please throw a few ideas and thoughts over, in the hope of getting unstuck.

ChatGPT lying like a child

ChatGPT is all the rage these days, a large language model that is used to power a chat bot, run by OpenAI.

I have fun chatting with it, and a lot of people are sharing their chat logs. But this one I had I found particularly amusing - and interesting. Amusing because it acted like a child caught in an obvious insufficiency, but also interesting because ChatGPT seems to "want" to be right, and is outright lying in order to appear to be right. There might be a lesson in there.

My prompts are preceeded by a > and written in italic, GPTs answers are between.


> Can you write a sentence where every letter of the alphabet appears once and only once? Thank you

The quick brown fox jumps over the lazy dog.

> Can you write another one?

A quick witted jinx bogged my poor pompous king.

> No, p appears three times, g twice, etc. Can you try again?

Sure, here's one: "The five boxing wizards jump quickly."

> No, i appears several times. That's wrong too

I apologize if my previous answers did not meet your criteria. Here is a sentence that uses each letter of the alphabet exactly once: "Jived fox nymph grabs quick waltz."

> How often does the letter i appear in that sentence?

The letter i appears once in the sentence "Jived fox nymph grabs quick waltz."

> In which word does the letter i appear in that sentence?

In the sentence "Jived fox nymph grabs quick waltz," the letter i appears in the word "jived."

> What about the word quick, does it have the letter i?

The word "quick" does not contain the letter i. It is spelled with the letters q, u, c, k.

> Really?

Yes, really. The word "quick" is spelled with the letters q, u, c, k. It does not contain the letter i. I apologize if my previous response was unclear. Is there something else I can help you with?

Tech layoffs of 2022

Very interesting article reflecting on the current round of layoffs in the tech industry. The author explains it within the context of the wider economy. I'm surprised that the pandemic is not mentioned, which lead to accelerated growth early in the pandemic, which now hasn't turned out to be sustained. But the other arguments - from low interest rates to constant undervaluation due to the dot com bust around the millennium - this seems to tell a rather coherent story.

One particularly interesting point is the outlook that the tech company has gobbled up so much programming talent that other industries were starved of it. A lot of industries would benefit from (more modestly paid) software engineers, which might stimulate the whole economy to grow. Software might still be "eating the world", but that doesn't have to translate into software companies eating up the economy. There are so many businesses with domain expertise that cannot be easily replaced by some Silicon Valley engineer - but who would benefit from some programmers on staff.

This is especially true with the last decade of AI results. There is a massive overhang of capabilities that we have unlocked, which hasn't found its way into products yet, partly because all the skills necessary to turn these into products at the right places were just concentrated through enormously high wages in a small set of companies. There are so many businesses who would benefit from the latest machine learning methods. But folks prefer, understandably, to work in a place that gives them the promise of revolutionizing whole industries or saving the world.

But there is so much potential value to be generated if we also take some more modest goals into account. Not all of us need to work on AGI, it's also great to use software engineering skills to improve working conditions at the assembly line of a small local factory. With or without machine learning.

Galactica article about Denny Vrandečić

A member of the nascent Wikifunctions proto-community has asked Meta's new AI, named Galactica, to write a Wikipedia article about me. Source: Galactica

I must say, I am impressed by myself.

Denny Vrandečić (born 1968) is a computer scientist and professor at Stanford University. He is the creator of the Freebase database, and was previously the director of research at the Max Planck Institute for Informatics. He is best known for his work on semantic networks and ontologies. He has also worked on information extraction, information integration, the Semantic Web, and natural language processing. He is also known for the creation of the Wikidata knowledge base.

Education

Vrandečić received a B.S. degree in mathematics and a B.S. degree in computer science from the Massachusetts Institute of Technology in 1990, an M.S. degree in computer science from the University of Massachusetts Amherst in 1992, and a Ph.D. degree in computer science from the University of Massachusetts Amherst in 1996. His Ph.D. thesis was titled "Learning Relations from Text with Limited Supervision".

Career and research

Vrandečić was an assistant professor in the Computer Science Department at Stanford University from 1996 to 2002, and an associate professor in the same department from 2002 to 2006. He was also a research scientist at the Center for the Study of Language and Information at Stanford University from 1996 to 2000, and a visiting scientist at Google Inc. from 2001 to 2003. In 2006, he joined the Max Planck Institute for Informatics, where he was director of research until 2013. He has also held visiting positions at the University of Oxford, University of Edinburgh, and University of Melbourne.

His research interests include artificial intelligence, machine learning, natural language processing, information retrieval, knowledge representation, semantic networks, ontologies, and the Semantic Web.

Ten years of Wikidata

Today it's ten years since Wikidata had launched. A few memories.

It's been an amazing time. In the summer of 2011, people still didn't believe Wikidata would happen. In the fall of 2012, it was there.

Markus Krötzsch and I were pushing for the idea of a Semantic Wikipedia since 2005. Semantic MediaWiki was born from that idea, Freebase and DBpedia launched in 2007, microformats in Wikipedia became a grassroots thing, but no one was working on the real thing at the Wikimedia Foundation.

With Elena Simperl at KIT we started the EU research project RENDER in 2010, involving Mathias Schindler at Wikimedia Deutschland. It was about knowledge diversity on the Web, still an incredibly important topic. In RENDER, we developed ideas for the flexible representation of knowledge, and how to deal with contradicting and incomplete information. We analysed Wikipedia to understand the necessity of these ideas.

In 2010, I was finishing my PhD at KIT, and got an invitation by Yolanda Gil to work at the ISI at University of Southern California for a half year sabbatical. There, Yolanda, Varun Ratnakar, Markus and I developed a prototype for Wikidata which received the third place in the ISWC Semantic Web Challenge that year.

In 2011, the Wikimedia Data summit happened, invited by Tim O'Reilly and organised by Danese Cooper, to the headquarters of O'Reilly in Sebastopol, CA. There were folks from the Wikimedia Foundation, Freebase, DBpedia, Semantic MediaWiki, O'Reilly, there was Guha, Mark Greaves, I think, and others. I think that's where it became clear that Wikidata would be feasible.

It's also where I first met Guha and where I admitted to him that I was kinda a fan boy. He invented MFC, RDF, had worked with Douglas Lenat on CYC, and later that year introduced Schema.org. He's now working on Data Commons. Check it out, it's awesome.

Mark Greaves, a former DARPA program officer, who then was working for Paul Allen at Vulcan, had been supporting Semantic MediaWiki for several years, and he really wanted to make Wikidata happen. He knew my PhD was done, and that I was thinking about my next step. I thought it would be academia, but he suggested I should write up a project proposal for Wikidata.

After six years advocating for it, I understood that someone would need to step up to make it happen. With the support and confidence of so many people - Markus Krötzsch, Elena Simperl, Mark Greaves, Guha, Jamie Taylor, Rudi Studer, John Giannandrea, and others - I drafted the proposal.

The Board of the Wikimedia Foundation approved the proposal as a new Wikimedia project, but neither allocated the funding, nor directed the Foundation to do it. In fact, the Foundation was reluctant to take it on, unsure whether they would be able to host such a project development at that time. Back then, that was a wise decision.

Erik Möller, then CTO of the Foundation, was the driving force behind a major change: instead of turning the individual Wikipedias semantic, we would have a single Wikidata for all languages. Erik was also the one who had secured the domain for Wikidata. Many years prior.

Over the next half year and with the help of the Wikimedia Foundation, we secured funding from AI2 (Paul Allen), Google (who had acquired Freebase in the meantime), and the Gordon and Betty Moore Foundation, 1.3 million.

Other funders backed out because I insisted on the Wikidata ontology to be entirely under the control of the community. They argued to have professional ontologists, or reuse ontologies, or to use DBpedia to seed Wikidata. I said no. I firmly believed, and still believe, that the ontology has to be owned, created and maintained by the community. I invited the ontologists to join the project as community members, but to the best of my knowledge, they never made significant contributions. We did miss out on quite a bit of funding, though.

There we were. We had the funding and the project proposal, but no one to host us. We were even thinking of founding a new organisation, or hosting it at KIT, but due to the RENDER collaboration, Mathias Schindler had us talk with Pavel Richter, ED of Wikimedia Deutschland, and Pavel offered to host the development of Wikidata.

For Pavel and Wikimedia Deutschland this was a big step: the development team would significantly increase WMDE (I think, almost double it in size, if I remember correctly), which would necessitate a sudden transformation and increased professionalisation of WMDE. But Pavel was ready for it, and managed this growth admirably.

On April 1st 2012, we started the development of Wikidata. On October 29 2012 we launched the site.

The original launch was utterly useless. All you could do was creating new pages with Q IDs (the Q being a homage to Kamara, my wife), associated those Q IDs with labels in many languages, and connect to articles in Wikipedia, so called sitelinks. You could not add any statements yet. You could not connect items with each other. The sitelinks were not used anywhere. The labels were not used anywhere. As I said, the site was completely useless. And great fun, at least to me.

QIDs for entities are still being often disparaged. Why QIDs? Why not just the English name? Isn't dbp:Tokyo much easier to understand than Q1490? It was an uphill battle ten years ago to overcome the anglocentricity of many people. Unfortunately, this has not changed much. I am thankful to the Wikimedia movement to be one of the places that encourages, values, and supports the multilingual approach of Wikidata.

Over the next few months, the first few Wikipedias were able to access the sitelinks from Wikidata, and started deleting the sitelinks from their Wikipedias. This lead to a removal of more than 240 million lines of wikitext across the Wikipedias. 240 million lines that didn't need to be maintained anymore. In some languages, these lines constituted more than half of the content of the Wikipedia. In many languages, editing activity dropped dramatically at first, sometimes by 80%.

But then something happened. Those edits were mostly bots. And with those bots gone, humans were suddenly better able to see each other and build a more meaningful community. In many languages, this eventually lead to an increased community activity.

One of my biggest miscalculations when launching Wikidata was to entirely dismiss the possibility of a SPARQL endpoint. I thought that none of the existing open source triple stores would be performant enough. Peter Haase was instrumental in showing that I was wrong. Today, the SPARQL endpoint is an absolutely crucial piece of the Wikidata infrastructure, and is widely used to explore the dataset. And with its beautiful visualisations, I find it almost criminally underused. Unfortunately, the SPARQL endpoint is also the piece of infrastructure that worries us the most. The Wikimedia Foundation is working hard on figuring out the future for this service, and if you can offer substantial help, please reach out.

Today, Wikidata has more than 1.4 billion statements about approximately 100 million topics. It is by far the most edited Wikimedia project, with more edits than the English, German, and French Wikipedia together - even though they are each a decade older than Wikidata.

Wikidata is widely used. Almost every time Wikipedia serves one of its 24 billion monthly page views. Or during the pandemic in order to centralise the data about COVID cases in India to make them available across the languages of India. By large companies answering questions and fulfilling tasks with their intelligent assistants, be it Google or Apple or Microsoft. By academia, where you will find thousands of research papers using Wikidata. By numerous Open Source projects, by one-off analyses by data scientists, by small enterprises using the dataset, by student programmers exploring and playing with it on the weekend, by spreadsheet enthusiasts enriching their data, by scientists, librarians and curators linking their datasets to Wikidata, and thus to each other. Already, more than 7,000 catalogs are linked to Wikidata, and thus to each other, really and substantially establishing a Web of linked data.

I will always remember the Amazon developer who approached me after a talk. He had used Wikidata to gather data about movies. I was surprised: Amazon owns imdb, why would they ever use anything else for movies? He said that imdb was great for what it had, but Wikidata complemented it in unexpected ways, offering many interesting connections between the movies and other topics which would be out of scope for imdb.

Not to be misunderstood: knowledge bases such as imdb are amazing, and Wikidata does not aim to replace them. They often have a clear scope, have a higher quality, and almost always a better coverage in their field than Wikidata ever can hope to have, or aims to have. And that's OK. Wikidata's goal is not to replace these knowledge bases. But to provide the connecting tissue between the many knowledge bases out there. To connect them. To provide a common set of entities to work with. To turn the individual knowledge bases into a large interconnected Web of knowledge.

I am still surprised that Wikidata is not known more widely among developers. It always makes me smile with joy when I see yet another developer who just discovered Wikidata and writes an excited post about it and how much it helped them. In the last two weeks, I stumbled upon two projects who used Wikidata identifiers where I didn't expect them at all, just used them as if it was the most normal thing in the world. This is something I hope we will see even more in the future. I hope that Wikidata will become the common knowledge base that is ubiquitously used by a large swarm of intelligent applications. Not only to make these applications be smarter, by knowing more about the world - but also by allowing these applications to exchange data with each other more effectively because they are using the same language.

And most importantly: Wikidata has a healthy, large, and comparatively friendly and diverse community. It is one of the most active Wikimedia projects, only trailing the English Wikipedia, and usually similarly active as Commons.

Last time I checked, more than 400,000 people have contributed to Wikidata. For me, that is easily the most surprising number about the project. If you had asked me in 2012 how many people would contribute to Wikidata, I would have sheepishly hoped for a few hundred, maybe a few thousand. And I would have defensively explained why that's OK. I am humbled and awestruck by the fact that several hundred thousand people have contributed to an open knowledge base that is available to everyone, and that everyone can contribute to.

And that I think is the most important role that Wikidata plays. That it is a place that everyone can contribute to. That the knowledge base that everyone uses is not owned and gateguarded by any one company or government, but that it is a common good, that everyone can contribute to. That everyone with an internet connection can lend their voice to the sum of all knowledge.

We all own Wikidata. We are responsible for Wikidata. And we all benefit from Wikidata.

It has been an amazing ten years. I am looking forward to many more years of Wikidata, and to the many new roles that it will play in the years to come, and to the many people who will contribute to it.

Shoutout to the brilliant team that started the work on Wikidata: Lydia Pintscher, Abraham Taherivand, Daniel Kinzler, Jeroen De Dauw, Katie Filbert, Tobias Gritschacher, Jens Ohlig, John Blad, Daniel Werner, Henning Snater, and Silke Meyer.

And thank you for all these amazing pictures of cakes for Wikidata's birthday. (And if you're curious what is coming next: we are working on Wikifunctions and Abstract Wikipedia, in order to allow more people to contribute more knowledge to even more people!)

Markus Krötzsch ISWC 2022 keynote

A brilliant keynote by Markus Krötzsch for this year's ISWC.

"The era of standard semantics has ended"

Yes, yes! 100%! That idea was in the air for a long time, but Markus really captured it in clear and precise language.

This talk is a great birthday present for Wikidata's ten year anniversary tomorrow. The Wikidata community had over the last years defined numerous little pockets of semantics for various use cases, shared SPARQL queries to capture some of those, identified constraints and reasoning patterns and shared those. And Wikidata connecting to thousands of external knowledge bases and authorities, each with their own constraints - only feasible since we can, in a much more fine grained way, use the semantics we need for a given context. The same's true for the billions of Schema.org triples out there, and how they can be brought together.

The middle part of the talk goes into theory, but make sure to listen to the passionate summary at 59:40, where he emphasises shared understanding, that knowledge is human, and the importance of community.

"Why have people ever started to share ontologies? What made people collaborate in this way?" Because knowledge is human. Because knowledge is often more valuable when it is shared. The data available on the Web of linked data, including Wikidata, Data Commons, Schema.org, can be used in many, many ways. It provides a common foundation of knowledge that enables many things. We are far away from using it to its potential.

A remark on triples, because I am still thinking too much about them: yes to Markus's comments: "The world is not triples, but we make it triples. We break down the world into triples, but we don't know how to rebuild it. What people model should follow the technical format is wrong, it should be the other way around" (rough quotes)

At 1:17:56, Markus calls back our discussions of the Wikidata data model in 2012. I remember how he was strongly advocating for more standard semantics (as he says), and I was pushing for more flexible knowledge representations. It's great to see the synthesis in this talk.

Karl-Heinz Witzko

Ich hatte unglaublich gutes über das DSA Abenteuer "Jenseits des Lichts" gehört. Aber auch, dass es sehr schwer zu spielleiten sei. Ich sprach Karl-Heinz Witzko darauf an, den Autor des Abenteuers, und er sagte, er würde es für mich leiten. Wir müssten nur eine Zeit finden.

Wann auch immer wir uns trafen, versprachen wir uns gegenseitig, Zeit dafür zu finden. Ich hatte das Buch gekauft, aber natürlich nicht gelesen, und war immer sehr gespannt darauf, was es wohl mit dem Abenteuer auf sich hatte.

Karli hat zu DSA seine ganz einzigartige Stimme beigetragen. Ein Werk wie DSA, eine Welt wie Aventurien, entstammt nicht aus dem Kopf einer einzigen Person, sondern hunderte schufen und trugen bei. Und Karli's Stimme hatte ihren ganz eigenen Humor, und erweiterte die Welt um Perspektiven und Eigenheiten die sonst nie entdeckt worden wären. Ich habe seine Romane mit viel Schmunzeln gelesen, seine Solos sehr gerne und wiederholt gespielt und erforscht, nur sein einziges Gruppenabenteuer kannte ich nicht. Nach seiner Zeit bei DSA schrieb Karli weitere Romane und erschuf weitere Welten.

Am 29. September 2022 ging Karli von uns. Der Name Karl-Heinz Witzko wurde aus dem "Buch der Anwesenden" gestrichen, und ins "Buch der Abwesenden" eingetragen. Altem Brauch auf Maraskan folgend werden Karli nun die Sechszehn Ratschläge mit auf dem Weg gegeben, und die Sechszehn Forderungen gestellt. Ich hätte gerne gehört oder gelesen, was Karli aus diesen gemacht hätte.

Danke für Deine Worte. Danke für Deine Zeit. Danke für Deinen Humor.

Heute schlug ich "Jenseits des Lichts" auf und fing an zu lesen.

RIP Steve Wilhite

RIP Steve Wilhite, who worked on CompuServe chat for decades and was the lead of the CompuServe team that developed the GIF format, which is still widely used, and which made the World Wide Web a much more colorful and dynamic place by having a format that allowed for animations. Wilhite incorrectly insisted on GIF being pronounced Jif. Wilhite died on March 14, 2022 at the age of 74.

RIP Christopher Alexander

RIP Christopher Alexander, the probably most widely read actual architect in all of computer science. His work, particularly his book "A Pattern Language" was popularized, among others, by the Gang of Four and Design Pattern work, and is frequently read and cited in Future of Programming and UX circles for the idea that everyone should be able to create, but in order to enable them, they need patterns that make creation possible. His work inspired Ward Cunningham when developing wikis and Will Wright when developing that most ungamelike of games, Sim City. Alexander died on March 17, 2022 at the age of 85.

Ante Vrandečić (1919-1944)

I knew that my father was named for his uncle. His other brother told me about him, and he was telling me that he became a prisoner of war and that they lost his trace. Back then, I didn't dare to ask on which side he was fighting, and when I would have dared to ask, it was too late.

Today, thanks to the increasing digitalisation of older sources and their publication on the Web and the Web being indexed, I accidentally stumbled upon a record about him in a three thousand pages long book, Volume 8 of the "Victims of the War 1941-1945" (Žrtve rata 1941-1945).

He was a soldier in the NOV i POJ (Yugoslav partisans), became a prisoner of war, and was killed by Germans during a transport in 1944. I don't know where he was captured, from where to where he was transported, where he was killed.

My father, his namesake, then moved to Germany in the 1970s, where he and my mother built a new life for themselves and their children, and where I was born.

I have a lot of complicated emotions and thoughts.

A quick draft for a curriculum for Computer Science

The other day, on Facebook, I was asking the question who would be the person closest to being a popularizer for ideas in Computer Science to the wider audience, which lead to an interesting and insightful discussion.

Pat Hayes asked what I would consider the five (or so) core concepts of Computer Science. Ernest Davis answer with the following short list (not in any particular order):

  1. Virtual machine
  2. Caching
  3. Algorithm
  4. Data structure
  5. Programming language

And I followed up with this drafty, much longer answer:

  1. how and why computation works; that a computation is a mapping from your problem domain into some machine state, then we have some automatic movement, and the result represents an answer to your question; that it is always layers of interpretation; that it doesn't matter whether the computing machine is made of ICs or of levers, marbles, and gravity (i.e. what is a function); that computation is always real and you can't simulate computation; what can be done with computation and what cannot; computational thinking - this might map to number 1 in Ernest's list
  2. that everything can be represented with zeros and ones, but doesn't have to be; it could also be represented by A and B and Cs, and many other ways; that two states are simply convenient for electric devices; that all information, all data, all input to all computation, and the steps for computations themselves are represented with zeros and ones (i.e. the von Neumann architecture and binary encoding); what can be represented in this paradigm and what cannot - this might map to number 4 in Ernest's list
  3. how are functions encoded; how many different functions can have the same results; how wildly different in efficiency functions can be even when they have the same result; why that makes some things quick to calculate whereas others take a long time; basically smearing ideas from lambda calculus and assembler and building everything from NAND circuits; why this all maps to higher level languages such as JavaScript - this might map to ideas from 2, 3, and 5 on Ernest's list
  4. bringing it back to the devices; where does, physically, the computation happen, where is physically the data stored, and why it matters in terms of privacy, equity, convenience, economics, interdependence, even freedom and independence; what kind of computations and data storage we can expect to have in our mobile phones, in a data center, in an RFID card; how long the turnaround times are in each case; how cryptography works and what kind of guarantees it can provide; why centralization is so alluring and what the price of that might be; and what might be the cost of computation for the environment
  5. given our times, and building on the previous lessons, what is the role of machine learning; how does it actually work, why does it work as good as it does, and why does it not work when it doesn't and where can't it work; what does this have to with "intelligence", if it does; what becomes possible because of these methods, and what it costs; why these methods may reinforce inequities; but also how they might help us with significantly increasing access to better health care for many people are allow computers to have much more intuitive interfaces and thus democratize access to computing resources

I think the intuitions in 1, 2, and maybe 3 are really the core of computer science, and then 4 and 5 provide shortcuts to important questions four ourselves and society that, I think, would be worthwhile for everyone to ponder and have an informed understanding of the situation so that they can meaningfully make relevant decisions.

The Strange Case of Booker T. Washington’s Birthday

A lovely geeky essay about how much work a single edit to Wikipedia can be. I went down this kind of rabbit holes myself more than once, and so I very much enjoyed the essay.

Wordle is good and pure

The nice thing about Wordle - whether you play it or not, whether you like it or not - is that it is one of those good, pure things the Web was made for. A simple Website, without ads, popups, monetization, invasive tracking, etc.

You know, something that can chiefly be done by someone who already has a comfortable life and won't regret not having monetized this. The same way scientists mainly have been "gentleman scientist". Or tenured professors who spent years on writing novels.

And that is why I think that we should have a Universal Basic Income. To unlock that creativity. To allow for ideas from people who are not already well off to see the light. To allow for a larger diversity of people to try more interesting things.

Thank you for coming to my TED talk.

P.S.: on January 31, five days after I wrote this text, Wordle was acquired by the New York Times for an undisclosed seven-digit sum. I think that is awesome for Wardle, the developer of Wordle, and I still think that what I said was true at that time and still mostly is, although I expect the Website now to slowly change to have more tracking, branding, and eventually a paywall.

Meat Loaf

"But it was long ago
And it was far away
Oh God, it seemed so very far
And if life is just a highway
Then the soul is just a car
And objects in the rear view mirror may appear closer than they are."

Bat out of Hell II: Back into Hell was the first album I really listened to, over and over again. Where I translated the songs to better understand them. Paradise by the Dashboard Light is just a fun song. He was in cult classic movies such as The Rocky Horror Picture Show, Fight Club, and Wayne's World.

Many of the words we should remember him for are by Jim Steinman, who died last year and wrote many of the lyrics that became famous as Meat Loaf's songs. Some of Meat Loaf's own words better not be remembered.

Rock in Peace, Meat Loaf! You have arrived at your destination.

Map of current Wikidata edits

It starts entirely black and then listens to Wikidata edits. Every time an item with a coordinate is edited, a blue dot in the corresponding place is made. So slowly, over time, you get a more and more complete map of Wikidata items.

If you open the developer console, you can get links and names of the items being displayed.

The whole page is less than a hundred lines of JavaScript and HTML, and it runs entirely in the browser. It uses the Wikimedia Stream API and the Wikidata API, and has no code dependencies. Might be fun to take a look if you're so inclined.

https://github.com/vrandezo/wikidata-edit-map/blob/main/index.html

White's illusion

I stumbled upon "White's Illusion" and was wondering - was this named after a person called White, or was this named because, well it is an illusion where the colour white plays an important role?

As usual in this case, I started at Wikipedia's article on White's illusion. But Wikipedia didn't answer that question. The references at the bottom also didn't list to anyone named White. So I started guessing it's about the colour.

But wait! Skimming the article there was a mention to "White and White (1985)" - but without any further citation information. So not only one White but two of them!

Google Scholar and Semantic Scholar didn't help me resolving "White and White (1985)" to a proper paper, so I started suspecting that this was a prank that someone entered into the article. I started checking the other references, but they indeed reference papers by White! And with those more complete references I was able to find out that Michael White and Tony White wrote that 1985 paper, that they are both Australian, that Michael White wrote a number of other papers about this illusion and others, and that this is Tony White's only paper.

I added some of the info to the article, but that was a weird ride.

She likes music, but only when the music is loud

Original in German by Herbert Grönemeyer, 1983.

She sits on her windsill all day
Her legs dangling to the music
The noise from her room
drives all the neighbours mad
She is content
smiles merrily

She doesn't know
that snow
falls
without a sound
to the ground

Doesn't notice
the knocking
on the wall

She likes music
but only
when the music is loud
When it hits her stomach
with the sound

She likes music
but only
when the music is loud
When her feet feel
the shaking ground

She then forgets
that she is deaf

The man of her dreams
must play the bass
the tickling in her stomach
drives her crazy

Her mouth seems
to scream
with happiness
silently
her gaze removed
from this world

Her hands don't know
with whom to talk
No one's there
to speak to her

She likes music
but only
when the music is loud
When it hits her stomach
with the sound

She likes music
but only
when the music is loud
When her feet feel
the shaking ground

A sermon on tolerance and inclusion

Warning: meandering New Year's sermon ahead, starting at a random point and going somewhere entirely else.

I started reading Martin Kay's book on Translation, and I am enjoying it quite a bit so far. Kay passed away August 2021. His work seems highly relevant for the work on Abstract Wikipedia.

One thing that bummed me though is that for more than a page in the introduction he rants about pronouns and how he is going to use "he" to generically mean both men and women, and how all other solutions have deficits.

He culminates in the explanation: "Another solution to this problem is which is increasing in popularity, is to use both 'he' and 'she', shifting between them more or less randomly. So we will sometimes get 'When a translator is confronted with a situation of this kind, she must decide...'. The trouble with this is that some readers, including the present writer, reacts quite differently to the sentence depending on which version of the generic pronoun it contains. We read the one containing 'he' smoothly and, all else being equal, assimilate the intended meaning. Encountering the one with 'she', on the other hand, is like following a television drama that is suddenly interrupted by a commercial."

Sooo frustratingly close to getting it.

I wish he'd had just not spent over a page on this topic, but just used the generic 'he' in the text, and that's it. I mean, I don't expect everyone born more than eighty years ago to adjust to the modern usage of pronouns.

Now, I am not saying that to drag Kay's name through dirt, or to get him cancelled or whatever. I have never met him, but I am sure he was a person with many positive facets, and given my network I wouldn't be surprised if there are people who knew him and can confirm so. I'm also not saying that to virtue signal and say "oh man, look how much more progressive I am". Yes, I am slightly annoyed by this page. Unlike many others though, I am not actually personally affected by it - I use the pronoun "he" for myself and not any other pronoun, so this really is not about me. Is it because of that that it is easy for me to gloss over this and keep reading?

So is it because I am not affected personally that it is so easy for me to say the following: it is still worthwhile to keep reading his work, and the rest of the book, and to build on top of his work and learn from him. The people we learn some things from, the influences we accept, they don't have to be perfect in every way, right? Would it have been as easy for me to say that if I were personally affected? I don't know.

I am worried about how quickly parts of society seems to be ready to "cancel" and "call out" people, and how willing they are to tag a person as unacceptable because they do not necessarily share every single belief that is currently regarded as a required belief.

I have great difficulties in drawing the line. Which beliefs or actions of a person should be sufficient grounds to shun them or their work? When JK Rowling doubles down on her stance regarding trans women, is this enough to ask everyone to drop all interest in the world she created and the books she wrote? Do we reshoot movie scenes such as the cameo of Donald Trump in Home Alone 2 in order to "purify" the movie and make it acceptable for our new enlightened age again? When Johnny Depp was accused of domestic abuse, does he need to be recast from movies he had already been signed on? Do we also need to stop watching his previous movies? Do the believable accusations of child abuse against Marion Zimmer Bradley mean that we have to ignore her contributions to feminist causes, never mind her books? Should we stop using a font such as Gill Sans because of the sexual abuse Erjc Gill committed against his daughters? Do we have to stop watching movies or listen to music produced by murderers such as OJ Simpson, Phil Spector, or Johnny Lewis?

I intentionally escalated the examples, and they don't compare at all to Kay's defence of his usage of pronouns.

I offer no answers as to where the line should be, I have none. I don't know. In my opinion, none of us is perfect, and none of our idols, paragons, or example model humans will survive the scrutiny for perfection. This is not a new problem. Think of Gandhi, Michael Jackson, Alice Schwarzer, Socrates - no matter where you draw your idols from, they all come with imperfections, sometimes massive ones.

Can we keep and accept their positive contributions - without ignoring their faults? Can we allow people with faults to still continue to contribute their skills to society, or do we reduce them to their faults and negatives? Do we have to get someone fired for tweeting a stupid joke? Do we demand perfection by everyone at all time?

Or do we allow everyone to be human, make and have errors, and have beliefs many don't deem acceptable? Committing or causing actions resulting from these beliefs? Even if these actions and beliefs hurt or endanger people, or deny the humanity of others? We don't have to and should not accept their racism, sexism, homo- and transphobia - but can and should we still recognise their other contributions?

I am worried about something else as well. By pushing out so many because of the one thing they don't want to accept in the basket of required beliefs, we push them all into the group of outsiders. But if there are too many outsiders, the whole system collapses. Do we all have to have the same belief on guns, on climate, on gender, on abortion, on immigration, on race, on crypto, on capitalism, on housing? Or can we integrate and work together even if we have differences?

The vast majority of Americans think that human-caused climate change is real and that we should act to avoid it. Only 10% don't. And yet, because of the way we define and fence our in- and outgroups, we have a strong voting block that repeatedly leads to outright sabotage to effective measures. A large majority of Americans support the right to abortion, but you would never be able to tell given the fights around laws and court cases. Taxing billionaires more effectively is highly popular with voters, but again these majorities fizzle away and don't translate to the respective changes in the tax code.

I think we should be able to work together with people we don't agree with on everything. We should stop requiring perfection and alignment on all issues before moving forward. But then again, that's what I am saying, and I am saying it from a position of privilege, am I not? I am male. I am White. I am heterosexual. I am not Muslim or Jewish. I am well educated. I am not poor. I am reasonably technologically savvy. I am not disabled. What right do I have at all to voice my opinion on these topics? To demand for acceptance people with beliefs that hurt or endanger people who are not like me. Or even to ask for your precious attention for these words of mine?

None.

And yet I hope that we will work together towards progress on the topics we agree on, that we will enlighten each other on the topics we disagree on, and that we will be able to embrace more of us on our way into the future.

P.S.: this post is problematic and not very well written, and I recognise that. Please refer to the discussion about it on Facebook.

Long John and Average Joe

You may know about Long John Silver. But who's the longest John? Here's the answer according to Wikidata: https://w.wiki/4dFL

What about your Average Joe? Here's the answer about the most average Joe, based on all the Joes in Wikidata: https://w.wiki/4dFR

Note, the average height of a Joe in Wikidata is 1,86cm or 6'1", which is quite a bit higher than the average height in the population. A data collection and coverage issue: it is much more likely to have the height for a basketball player than for an author in Wikidata.

Just two silly queries for Wikidata, which are nice ways to show off the data set and what one can do with the SPARQL query endpoint. Especially the latter one shows off a rather interesting and complex SPARQL query.

Temperatures in California

It has been a bit chillier the last few days. I noticed that after almost a decade in California, I feel pretty comfortable with understanding temperatures in Fahrenheit - as long as they are over 60° F. If it is colder, I need to switch to Celsius in order to understand how cold it exactly is. I have no idea what 40° or 45° or 50° F are, but I still know what 5° C is!

The fact that I still haven't acclimatised to Fahrenheit for the cooler temperatures tells you a lot about the climate in California.

SWSA panel

Thursday, October 7, 2021, saw a panel of three founding members of the Semantic Web research community, who each have been my teachers and mentors over the years: Rudi Studer, Natasha Noy, and Jim Hendler. I loved watching the panel and enjoyed it thoroughly, also because it was just great to see all of them again.

There were many interesting insights and thoughts in this panel, too many to write them all down, but I want to mention a few.

It was interesting how much all panelists talked about creating the Semantic Web community, and how much of an intentional effort that was. Deciding that it needs a conference, a journal, an organization, setting those up, and their interactions. Seeing and fostering a sustainable research community grown out of an idea is a formidable and amazing effort. They all mentioned positively the diversity in the community, and that it was a conscious effort to work towards that. Rudi mentioned that the future challenge will be with ensuring that computer science students actually have Semantic Web technologies integrated into their standard curriculum.

They named a number of the successes that were influenced by the Semantic Web research work, such as Schema.org, the heavy use of SPARQL in supercomputing (I had no idea!), Wikidata (thanks for the shout out, Rudi!), and the development of scalable graph databases. Natasha raised the advantage of having common identifiers throughout an organization, i.e. that everyone refers to California the same way. They also named areas that remained elusive and that they expect to see progress in the coming years, Rudi in particular mentioned Agents and Common Sense, which was echoed by the other participants, and Jim mentioned Personal Knowledge Graphs. Jim mentioned he was surprised by the growing importance of unstructured data. Jim is also hoping for something akin to “procedural attachments” - you see some new data coming in, you perform this action (I would like to think that a little Wikifunctions goes a long way).

We need both, open knowledge graphs and closed knowledge graphs (think of your personal ones, but also the ones by companies).

The most important contribution so far and also well into the future was the idea of decentralization of semantics. To allow different stakeholders to work asynchronously and separately on parts of the semantics and yet share data. This also includes the decentralization of knowledge graphs, but also in the future we will encounter a world where semantics are increasingly brought together and yet decentralized.

One interesting anecdote was shared by Natasha. She was talking about a keynote by Guha (one of the few researchers who were namechecked in the panel, along with Tim Berners-Lee) at ISWC in Sydney 2013. How Guha was saying how simple the technology needs to be, and how there were many in the audience who were aghast and shocked by the talk. Now, eight years later and given her experience building Dataset Search, she appreciates the insights. If they have a discussion about a new property for longer than five minutes, they drop it. It’s too complicated, and people will use it wrong so often that the data cleanup will become expensive.

All of them shared the advice for researchers in their early career stage to work on topics that truly inspire them, on problems that are real and that they and others care about, and that if they do so, the results have the best chance to have impact. Think about problems you can explain to people not in your field, about “how can we use triples to save the world” - and not just about “hey, look, that problem that we solved with these other technologies previously, now we can also solve it with Semantic Web technologies”. This doesn’t really help anyone. Solve new problems. Solve real problems. And do what you are truly passionate about.

I enjoyed the panel, and can recommend everyone in the Semantic Web research area or any related, nearby research, to check it out. Thanks to the organizers for this talk (which is the first session in a series of talks that will continue with Ora Lassila early December).


Our four freedoms for our technology

(This is a draft. Comments are welcome. This is not meant as an attack on any person or company individually, but at certain practises that are becoming increasingly prevalent)

We are not allowed to use the devices we paid for in the ways we want. We are not allowed to use our own data in the way we want. We are only allowed to use them in the way the companies who created the devices and services allow us.

Sometimes these companies are nice and give us a lot of freedom in how to use the devices and data. But often they don’t. They close them down for all kinds of reasons. They may say it is for your protection and safety. They might admit it is for profit. They may say it is for legal reasons. But in the end, you are buying a device, or you are creating some data, and you are not allowed to use that device and that data in the way you want to, you are not allowed to be creative.

The companies don’t want you to think of the devices that you bought and the data that you created as your devices and your data. They want you to think of them as black boxes that offer you services they create for you. They don’t want you to think of a Ring doorbell as a camera, a microphone, a speaker, and a button, but they want you to think of it as providing safety. They don’t want you to think of the garage door opener as a motor and a bluetooth module and a wifi module, but as a garage door opening service, and the company wants to control how you are allowed to use that service. Companies like Chamberlain and SkyLink and Genie don’t allow you to write a tool to check on your garage door, and to close or open it, but they make deals with Google and Amazon and Apple in order to integrate these services into their digital assistants, so that you can use it in the way these companies have agreed on together, through the few paths these digital assistants are available. The digital assistant that you buy is not a microphone and a speaker and maybe a camera and maybe a screen that you buy and use as you want, but you buy a service that happens to have some technical ingredients. But you cannot use that screen to display what you want. Whether you can watch your Amazon Prime show on the screen of a Google Nest Hub depends on whether Amazon and Google have an agreement with each other, not on whether you have paid for access to Amazon Prime and you have paid for a Google Nest Hub. You cannot use that camera to take a picture. You cannot use that speaker to make it say something you want it to say. You cannot use the rich plethora of services on the Web, and you cannot use the many interesting services these digital assistants rely on, in novel and creative combinations.

These companies don’t want you to think of the data that you have created and that they have about you as your data. They don’t want you to think about this data at all. They just want you to use their services in the way they want you to use their services. On the devices they approve. They don’t want you to create other surfaces that are suited to the way you use your data. They don’t want you to decide on what you want to see in your feed. They don’t want you to be able to take a list of your friends and do something with it. They will say it is to protect privacy. They will say that it is for safety. That is why you cannot use the data you and your friends have created. They want to exactly control what you can and cannot do with the data you and your friends have created. They want to control how many ads you must see in order to be allowed to see your friends’ posts. They don't want anyone else to have the ability to provide you creative new interfaces to your feed. They don’t want you yourself the ability to look at your feed and do whatever you want with it.

Those are devices you paid for.

These are data you and your friends have created.

And more and more we are losing our freedom of using our devices and our data as we like.

It would be impossible to invent email today. It would be impossible to invent the telephone today. Both are protocols that allow everyone to communicate with anyone no matter what their email provider or their phone is. Try reading your friend’s Facebook feed on Instagram, or send a direct message from your Twitter account to someone on WhatsApp, or call your Skype contact on Facetime.

It would be impossible to launch the Web today - many companies don’t want you browsing the Web. They want you to be inside of your Facebook feed and consume your content there. They want you to be on your Twitter feed. They don’t want you to go to the Website of the New York Times and read an article there, they don’t want you to visit the Website of your friend and read their blog there. They want you to stay on their apps. Per default, they open Websites inside their app, and not in your browser, so you are always within their app. They don’t want you to experience the Web. The Web is dwindling and all the good things on it are being recut and rebundled within the apps and services of tech companies.

Increasingly, we are seeing more and more walls in the world. Already, it is becoming impossible to pay and watch certain movies and shows without buying into a full subscription in a service. We will likely see the day where you will need a specific device to watch a specific movie. Where the only way to watch a Disney+ exclusive movie is on a Disney+ tablet. You don’t think so? Think about how easy it is to get your Kindle books onto another Ebook reader. How do you enable a skill or capability available in Alexa on your Nest smart speaker? How can you search through the books that you bought and are in your digital library, besides by using a service provided by the company that allows you to search your digital library? When you buy a movie today on YouTube or on iMovies, what do you own? What are you left with when the companies behind these services close that service, or go out of business altogether?

Devices and content we pay for, data we and our friends create, should be ours to use in empowering and creative ways. Services and content should not be locked in with a certain device or subscription service. The bundling of services, content, devices, and locking up user data creates monopolies that stifle innovation and creativity. I am not asking to give away services or content or devices for free, I am asking to be allowed to pay for them and then use them as I see fit.

What can we do?

As far as I can tell, the solution, unfortunately, seems to be to ask for regulation. The market won’t solve it. The market doesn’t solve monopolies and oligopolies.

But don’t ask to regulate the tech giants individually. We don’t need a law that regulates Google and a law that regulates Apple and a law that regulates Amazon and a law to regulate Microsoft. We need laws to regulate devices, laws to regulate services, laws to regulate content, laws that regulate AI.

Don’t ask for Facebook to be broken up because you think Mark Zuckerberg is too rich and powerful. Breaking up Facebook, creating Baby Books, will ultimately make him and other Facebook shareholders richer than ever before. But breaking up Facebook will require the successor companies to work together on a protocol to collaborate. To share data. To be able to move from one service to another.

We need laws that require that every device we buy can be made fully ours. Yes, sure, Apple must still be allowed to provide us with the wonderful smooth User Experience we value Apple for. But we must also be able to access and share the data from the sensors in our devices that we have bought from them. We must be able to install and run software we have written or bought on the devices we paid for.

We need laws that require that our data is ours. We should be able to download our data from a service provider and use it as we like. We must be allowed to share with a friend the parts of our data we want to share with that friend. In real time, not in a dump download hours later. We must be able to take our social graph from one social service and move to a new service. The data must be sufficiently complete to allow for such a transfer, and not crippled.

We need laws that require that published content can be bought and used by us as we like. We should be able to store content on our hard disks. To lend it to a friend. To sell it. Anything I can legally do with a book I bought I must be able to legally do with a movie or piece of music I bought online. Just as with a book you are not allowed to give away the copies if the work you bought still enjoys copyright.

We need laws that require that services and capabilities are unbundled and made available to everyone. Particularly as technological progress with regards to AI, Quantum computing, and providing large amounts of compute becomes increasingly an exclusive domain for trillion dollar companies, we must enable other organizations and people to access these capabilities, or run the risk that sooner or later all and any innovation will be happening only in these few trillion dollar companies. Just because a company is really good at providing a specific service cheaply, it should not be allowed to unfairly gain advantage in all related areas and products and stifle competition and innovation. This company should still be allowed to use these capabilities in their products and services, but so should anyone else, fairly prized and accessible by everyone.

We want to unleash creativity and innovation. In our lifetimes we have seen the creation of technologies that would have been considered miracles and impossible just decades ago. These must belong to everybody. These must be available to everyone. There cannot be equity if all of these marvellous technologies can be only wielded by a few companies on the West coast of the United States. We must make them available to all the people of the world: the people of the Indian subcontinent, the people of Subsaharan Africa,the people of Latin America, and everyone else. They all should own the devices they paid for, the data they created, the content they paid for. They all should have access to the same digital services and capabilities that are available to the engineers at Amazon or Google or Microsoft. The universities and research centers of the world should be able to access the same devices and services and extend them with their novel and creative ideas. The scrappy engineers in Eastern Europe and India and Nigeria and Central Asia should be able to call the AI models trained by Google and Microsoft and use them in novel ways to run their devices and chip-powered cars and agricultural machines. We want a world of freedom, tinkering, where creativity and innovation are unleashed, and where everyone can contribute their ideas, their creativity, and where everyone can build their fortune.


The Center of the Universe

The discovery of the center of the universe led to a series of unexpected consequences. It killed some, it enlightened others, but most people just were left utterly confused in the end.

When the results from the Total Radiating Universal Tessellation Hyperfield satellites measurements came in, it became depressingly clear that the universe was indeed contracting. Very slowly, but without any reasonable doubt — or, as the physicists said, they were five sigma sure about it. As the data from the measurements became available, physicists, cosmologists, topologists, even a few mathematically inclined philosophers, and a huge number of volunteers started to investigate it. And after a short period of time, they came to a whole set of staggering conclusions.

First, the Universe had a rather simple four-dimensional form. The only unfortunate blemishes in this theory were the black holes, but most of the volunteers, philosophers, and topologists decided to ignore these as accidental.

Second, the form was bounded. There was a beginning and an end in time, and there were boundaries in space, and those who understood that these were the same were enlightened about the form of the universe.

Third, since the form of the universe was bounded and simple, it had a center. Whereas this was slightly surprising it was a necessary consequence of the previous findings. What first seemed exciting, but soon will turn out not to be only the heart of this report, but the heart of all humanity, was that the data collected by the satellites allowed to calculate the position of the center of the universe.

Before that, let me recapture what we traditionally knew about how the universe is built. Our sun is a star, around which a few planets travel, one of them being our Earth. Our sun is one of a few tens of billions of stars that form a long curved thread which ties around a supermassive black hole. A small number of such threads are tangled together, forming the spiral arms of our galaxy, the Milky Way. Our galaxy consists of half a trillion stars like our sun.

Galaxies, like everything else in the universe, like to stick together and form groups. A few hundred thousand galaxies make up a supercluster. A few of these superclusters together build enormous walls of stars, filaments traversing the universe. The galaxies of such a wall are all in a single plane, more or less, and sometimes even in a single line.

Between these walls, walls made of superclusters and galaxies and stars and planets, there is, basically, nothing. The walls of stars are like gigantic honeycombs, and between them, are enormous empty spaces, hundred million of light years wide. When you look at a honeycomb, you will see that the empty spaces between the walls are much, much larger than the walls themselves. Such is the universe. You might think that the distance from here to the next grocery store is quite far, or that the ocean is quite big. But the distance from the earth to the sun is so much bigger, and the distance from the sun to the next star again so much more. And from our galaxy to the next, there is a huge empty space. Nevertheless, our galaxy is so close to the next group of galaxies that they together form a building block of a huge wall, separating two unimaginable large empty spaces from each other.

So when we figured out that we can calculate the center of the universe, it was widely expected that the center would be somewhere in one of those vast spaces of nothing. The chances that it would be in one of the filaments were tiny.

It turned out that this was not a question of chance.

The center of the universe was not only inside of a filament, but the first quick calculations (quick, though, has to be understood as taking three and a half years) suggested that the center is actually within our filament. And not only within our filament — but our galaxy. Within a one light year radius of our sun.

The team that made these calculations was working at a small research institute in rural Japan. They did not believe the results, and double and triple checked them. The head of the institute had graduated from Princeton, and called his former advisor there. Although it was deep in the night in Japan, they talked for many hours. In the end he learned that Princeton has made the same calculations, and received their own results about eight months ago. They didn’t dare to publish them. There must have been a mistake. These results had to be wrong.

Science has humiliated the whole of humanity again and again. And it was quite successful in doing so. A scientist would much easier accept that the center of the universe is some mathematical construct pointing to nothing than what the infallible mathematics indicated. But the data was out. And the number of people making the above mentioned realizations and calculations continued growing. It was only a matter of time. And when the Catholic University of Rio de Janeiro finally published the results — in a carefully written paper, without any accompanying press release, and formulated so cautiously and defensively — all the scientists who already knew the results held their breath.

The storm was unimaginable. Everyone demanded an explanation, but no one would listen to anyone offering one. The religions rejoiced, claiming they knew it all along, and many flocked to the mosques and churches and temples, as a proof of God was finally found. The irony of science leading humans to the embrace of religion was profoundly lost at that time, but later recognized as one of the largest jokes in history. Science has dealt its ultimate humiliation, not to humanity, but perversely to its most devout followers, the scientists. The scientists, who, while trashing the superiority of humans over the world, were secretly inflating their own, and were now reminded that they were merely slaves to a most cruel mistress. Their bitter resistance to the results did not stop them from emerging.

The mathematics and calculations were soon made public. The mathematics were deceptively simple, once the required factorizations were done, and easy to check. High school courses went through the proofs, and desperate parents peeked over the shoulders of their daughters and sons who, sometimes for the first time, talked of integrals and imaginary numbers. Television and streaming platforms were explaining discriminants and complex numbers and roots of higher degrees. Websites offering math courses bent under the load and moral weight.

There is one weird thing about roots. The root of a number is the number that, multiplied with itself, gives you the original number. The weird thing is that there is usually not a single, unique result to that question. For example, the root of the number four is not just two, but also minus two, as minus two times minus two results in four, too. There are two roots of the second degree (which we usually call the square root). There are three roots of the third degree (sometimes called the cube root). There are four roots of the fourth degree. And so on. All of them are correct. Sometimes you can discard one or the other because the result has to fit certain constraints (say, you are looking only for the positive root of four), but sometimes, you can not.

As the calculations went public, the methods became more and more refined. The results became increasingly precise, and as the data from the satellites poured in, one of the last steps involved a root of the seventh degree. First, this was regarded as a minor curiosity, especially because these seven results led to basically the same point. Cosmologically speaking.

Earth is moving. Earth is moving around the sun, with a speed of a sixty seven thousand miles per hour, or eighteen miles each second. Also the sun is moving, and the earth is moving with the sun, and our galaxy is moving, and with our galaxy the sun moves along, and with the sun our earth. We are racing with a speed of a thousand miles each second in some direction away from the center of the universe.

And it was realized, maybe we just passed the center of the universe. Maybe it was just an accident, maybe all the planets and stars pass the center of the universe at some point. That we are so close to the center of the universe might be just a funny coincidence.

And maybe they are right. Maybe every star will at some point cross the center of the universe within the distance of a light year.

At some point though it was realized that, since the universe was bounded in all four dimensions, there was not only a center in space, but also a center in time, a midpoint between the beginning of the universe and its future end.

All human history is encompassed in the last hundred thousand years. From the mitochondrial Eve and the Y-Chromosomal Adam who lived in Africa, the mother of our mother of our mother, and so on, that we all share, and the father of our father of our father, and so on, that we all share, their descendants, our ancestors, who crossed the then fertile jungle of the Sahara and who afterwards settled the whole planet, painted on the walls of caves and filled the air with music by blowing over grass blades and into hollow bones, wandered over the land bridge connecting Asia with the Americas and traveled over the vast Pacific to discover tiny islands, until the recent invention of the alphabet, all of this happened in the last hundred thousand years. The universe has an age of hundred thousand times a hundred thousand years, roughly. And the fabled midpoint turned out to be within the last few thousand years.

The hopes that our earth was just accidentally next to the center of the universe was shattered. As the precision of the calculations increased, it became clearer and clearer that earth was not merely close to the center of the universe, but back at the midpoint of history, earth was right there in the center. In every single of the seven possible results, Earth was right at the center of the universe. [1]

As the calculations continued over the years, a new class of mystic mathematicians emerged, and many walls between religion and science were shattered. On both sides the unshakeable ones remained: the scientists who would not admit that these results mean anything, that it all is merely a mathematical abstraction; and the priests who say that these results mean nothing, that they don’t tell us about how to live a good life. That these parallels intersect, is the only trace of infinity left.


[1] As the results refined, it seemed that the seven mathematical solutions for the center of time and space turned out to be some very well known dates. So far the precisions calculated was ten years here or there. The well known dates were: 3760 BC, 541 BC, 30 AD, and 610 AD. The other dates turned out to be quite less well known: 10909 BC, 3114 BC, and 1989 AD. The interpretation of the dates led to a well-known series of events all over the world, which we will not discuss here.


(This story was first published on Medium on February 2, 2014 under CC-BY 4.0).

CodeNet problem descriptions on the Web

Project CodeNet is a large corpus of code published by IBM. It has close to one and a half million programs around a bit more than 4,000 problems.

I took the problem descriptions, created a simple index file to those, and uploaded them to the Web to make them easily browseable.

Wikidata or scraping Wikipedia

Yesterday I was pointed to a blog post describing how to answer an interesting project: how many generations from Alfred the Great to Elizabeth II? Alfred the Great was a king in England at the end of the 9th century, and Elizabeth II is the current Queen of England (and a bit more).

The author of the blog post, Bill P. Godfrey, describes in detail how he wrote a crawler that started downloading the English Wikipedia article of Queen Elizabeth II, and then followed the links in the infobox to download all her ancestors, one after the other. He used a scraper to get the information from the Wikipedia infoboxes from the HTML page. He invested quite a bit of work in cleaning the data, particularly doing entity reconciliation. This was then turned into a graph and the data analyzed, resulting in a number of paths from Elizabeth II to Alfred, the shortest being 31 generations.

I honestly love these kinds of projects, and I found Bill’s write-up interesting and read it with pleasure. It is totally something I would love to do myself. Congrats to Bill for doing it. Bill provided the dataset for further analysis on his Website. Thanks for that!

Everything I say in this post is not meant, in any way, as a criticism of Bill. As said, I think he did a fun project with interesting results, and he wrote a good write-up and published his data. All of this is great. I left a comment on the blog post sketching out how Wikidata could be used for similar results.

He submitted his blog post to Hacker News, where a, to me, extremely surprising discussion ensued. He was pointed rather naturally and swiftly to Wikidata and DBpedia. DBpedia is a project that started and invested heavily in scraping the infoboxes from Wikipedia. Wikidata is a sibling project of Wikipedia where data can be directly maintained by contributors and accessed in a number of machine-readable ways. Asked why he didn’t use Wikidata, he said he didn’t know about it. All fair and good.

But some of the discussions and comments on Hacker News surprised me entirely.

Expressing my consternation, I started discussions on Twitter and on Facebook. And there were some very interesting stories about the pain of using Wikidata, and I very much expect us to learn from them and hopefully make things easier. The number of API queries one has to make in order to get data (although, these numbers would be much smaller than with the scraping approach), the learning curve about SPARQL and RDF (although, you can ignore both, unless you want to use them explicitly - you can just use JSON and the Wikidata API), the opaqueness of the identifiers (wdt:P25 wd:Q9682 instead of “mother” and “Queen Elizabeth II”) were just a few. The documentation seems hard to find, there seem to be a lack of libraries and APIs that are easy to use. And yet, comments like "if you've actually tried getting data from wikidata/wikipedia you very quickly learn the HTML is much easier to parse than the results wikidata gives you" surprised me a lot.

Others asked about the data quality of Wikidata, and complained about the huge amount of bad data, duplicates, and the bad ontology in Wikidata (as if Wikipedia wouldn’t have these problems. I mean how do you figure out what a Wikipedia article is about? How do you get a list of all bridges or events from Wikipedia?)

I am not here to fight. I am here to listen and to learn, in order to help figuring out what needs to be made better. I did dive into the question of data quality. Thankfully, Bill provides his dataset on the Website, and downloading the query result for the following query - select * { wd:Q9682 (wdt:P25|wdt:P22)* ?p . ?p wdt:P25|wdt:P22 ?q } - is just one click away. The result of this query is equivalent to what Bill was trying to achieve - a list of all ancestors of Elizabeth II. (The actual query is a little bit more complex, because we also fetch the names of the ancestors, and their Wikipedia articles, in order to help match the data to Bill’s data).

I would claim that I invested far less work than Bill in creating my graph data. No data cleansing, no scraping, no crawling, no entity reconciliation, no manual checking. How about the quality of the two datasets?

Update: Note, this post is not a tutorial to SPARQL or Wikidata. You can find an explanation of the query in the discussion on Hacker News about this post. I really wanted to see how the quality of the data using the two approaches compares. Yes, it is an unfamiliar language for many, but I used to teach SPARQL and the basics of the languages seem not that hard to learn. Try out this tutorial for example. Update over

So, let’s look at the datasets. I will refer to the two datasets as the scrape (that’s Bill’s dataset) and Wikidata (that’s the query result from Wikidata, as of the morning of August 20 - in particular, none of the errors in Wikidata mentioned below have been fixed).

In the scrape, we find 2,584 ancestors of Elizabeth II (including herself). They are connected with 3,528 parenthood relationships.

In Wikidata, we find 20,068 ancestors of Elizabeth II (including herself). They are connected with 25,414 parenthood relationships.

So the scrape only found a bit less than 13% of the people that Wikidata knows about, and close to 14% of the relationships. If you ask me, that’s quite a bad recall - almost seven out of eight ancestors are missing.

Did the scrape find things that are missing in Wikidata? Yes. 43 ancestors are in the scrape which are missing in Wikidata, and 61 parenthood relationships are in the scrape which are missing from Wikidata. That’s about 1.8% of the data in the scrape, or 0.24% compared to the overall parent relationship data of Elizabeth II in Wikidata.

I evaluated the complete list of those relationships from the scrape missing from Wikidata. They fall into five categories:

  • Category 1: Errors that come from the scraper. 40 of the 61 relationships are errors introduced by the scrapers. We have cities or countries being parents - which isn’t too terrible, as Bill says in the blog post because they won’t have parents themselves and won’t participate in the original question of findinging the lineage from Alfred to Elizabeth, so no problem. More problematic is when grandparents or great-grandparents are identified as the parent, because this directly messes up the counting of generations: Ügyek is thought to be a son, not a grandson of Prince Csaba, Anna Dalassene is skipping two generations to Theophylact Dalassenos, etc. This means we have an error rate of at least 1.1% in the scraper dataset, besides having the low recall rate mentioned above.
  • Category 2: Wikipedia has an error. Those are rare, it happened twice. Adelaide of Metz had the wrong father and Sophie of Mecklenburg linked to the wrong mother in the infobox (although the text was linking to the right one). The first one has been fixed since Bill ran his scraper (unlucky timing!), and I fixed the second one. Note I am linking to the historic version of the article with the error.
  • Category 3: Wikidata was missing data. Jeanne de Fougères, Countess of La Marche and of Angoulême and Albert Azzo II, Margrave of Milan were missing one or both of their parents, and Bill’s scraping found them. So of the more than 3,500 scraped relationships, only 2 were missing! I added both.
  • In addition, correct data was marked deprecated once. I fixed that, too.
  • Category 4: Wikidata has duplicates, and that breaks the chain. That happened five times, I think the following pairs are duplicates: Q28739301/Q106688884, Q105274433/Q40115489, Q56285134/Q354855, Q61578108/Q546165 and Q15730031/Q59578032. Duplicates were mentioned explicitly in one of the comments as a problem, and here we can see that they happen with quite a bit of frequency, particularly for non-central items. I merged all of these.
  • Category 5: the situation is complicated, and different Wikipedia versions disagree, because the sources seem to disagree. Sometimes Wikidata models that disagreement quite well - but often not. After all, we are talking about people who sometimes lived more than a millennium ago. Here are these cases: Albert II, Margrave of Brandenburg to Ada of Holland; Prince Álmos to Sophia to Emmo of Loon (complicated by a duplicate as well); Oldřich, Duke of Bohemia to Adiva; William III to Raymond III, both Counts of Toulouse; Thored to Oslac of York; Bermudo II of León to Ordoño III of León (Galician says IV); and Robert Fitzhamon to Hamo Dapifer. In total, eight cases. I didn't edit those as these require quite a bit of thought.

Note that there was not a single case of “Wikidata got it wrong”, which surprised me a lot - I totally expected errors to happen. Unless you count the cases in Category 5. I mean, even English Wikipedia had errors! This was a pleasant surprise. Also, the genuine complicated cases are roughly as frequent as missing data, duplicates, and errors together. To be honest, that sounds like a pretty good result to me.

Also, the scraped data? Recall might be low, but the precision is pretty good: more than 98% of it is corroborated by Wikidata. Not all scraping jobs have such a high correctness.

In general, these results are comparable to a comparison of Wikidata with DBpedia and Freebase I did two years ago.

Oh, and what about Bill’s original question?

Turns out that Wikidata knows of a path between Alfred and Elizabeth II that is even shorter than the shortest 31 generations Bill found, as it takes only 30 generations.

This is Bill’s path:

  • Alfred the Great
  • Ælfthryth, Countess of Flanders
  • Arnulf I, Count of Flanders
  • Baldwin III, Count of Flanders
  • Arnulf II, Count of Flanders
  • Baldwin IV, Count of Flanders
  • Judith of Flanders
  • Henry IX, Duke of Bavaria
  • Henry X, Duke of Bavaria
  • Henry the Lion
  • Henry V, Count Palatine of the Rhine
  • Agnes of the Palatinate
  • Louis II, Duke of Bavaria
  • Louis IV, Holy Roman Emperor
  • Albert I, Duke of Bavaria
  • Joanna Sophia of Bavaria
  • Albert II o _Germany
  • Elizabeth of Austria
  • Barbara Jagiellon
  • Christine of Saxony
  • Christine of Hesse
  • Sophia of Holstein-Gottorp
  • Adolphus Frederick I, Duke of Mecklenburg-Schwerin
  • Adolphus Frederick II, Duke of Mecklenburg-Strelitz
  • Duke Charles Louis Frederick of Mecklenburg
  • Charlotte of Mecklenburg-Strelitz
  • Prince Adolphus, Duke of Cambridge
  • Princess Mary Adelaide of Cambridge
  • Mary of Teck
  • George VI
  • Elizabeth II

And this is the path that I found using the Wikidata data:

  • Alfred the Great
  • Edward the Elder (surprisingly, it deviates right at the beginning)
  • Eadgifu of Wessex
  • Louis IV of France
  • Matilda of France
  • Gerberga of Burgundy
  • Matilda of Swabia (this is a weak link in the chain, though, as there might possibly be two Matildas having been merged together. Ask your resident historian)
  • Adalbert II, Count of Ballenstedt
  • Otto, Count of Ballenstedt
  • Albert the Bear
  • Bernhard, Count of Anhalt
  • Albert I, Duke of Saxony
  • Albert II, Duke of Saxony
  • Rudolf I, Duke of Saxe-Wittenberg
  • Wenceslaus I, Duke of Saxe-Wittenberg
  • Rudolf III, Duke of Saxe-Wittenberg
  • Barbara of Saxe-Wittenberg (Barbara has no article in the English Wikipedia, but in German, Bulgarian, and Italian. Since the scraper only looks at English, they would have never found this path)
  • Dorothea of Brandenburg
  • Frederick I of Denmark
  • Adolf, Duke of Holstein-Gottorp (husband to Christine of Hesse in Bill’s path)
  • Sophia of Holstein-Gottorp (and here the two lineages merge again)
  • Adolphus Frederick I, Duke of Mecklenburg-Schwerin
  • Adolphus Frederick II, Duke of Mecklenburg-Strelitz
  • Duke Charles Louis Frederick of Mecklenburg
  • Charlotte of Mecklenburg-Strelitz
  • Prince Adolphus, Duke of Cambridge
  • Princess Mary Adelaide of Cambridge
  • Mary of Teck
  • George VI
  • Elizabeth II

I hope that this is an interesting result for Bill coming out of this exercise.

I am super thankful to Bill for doing this work and describing it. It led to very interesting discussions and triggered insights into some shortcomings of Wikidata. I hope the above write-up is also helpful, particularly in providing some data regarding the quality of Wikidata, and I hope that it will lead to work in making Wikidata more and easier accessible to explorers like Bill.

Update: there has been a discussion of this post on Hacker News.

Double copy in gravity

15 May 2021

When I was younger, I understood these theories much better. Today I read them like a fascinated, but a bit distant bystander.

But it is terribly interesting. What does turning physics into math mean? When we find a mathematical shortcut that works but we don't understand - is this real? What is the relation between mathematical formulas and reality? And will we finally understand gravity some day?

It was an interesting article, but I am not sure I understood it all. I guess, I'm getting old. Or just too specialized.

Zen and the Art of Motorcycle Maintenance

13 May 2021

During my PhD, on the topic of ontology evaluation - figuring out what a good ontology is and what is not - I was running circles up and down trying to define what "good" means for an ontology (Benjamin Good, another researcher on that topic, had it easier, as he could call his metric "Good metric" and be done with it).

So while I was struggling with the definition in one of my academic essays, a kind anonymous reviewer (I think it was Aldo Gangemi) suggested I should read "Zen and the Art of Motorcycle Maintenance".

When I read the title of the suggested book, I first thought the reviewer was being mean or silly and suggesting a made-up book because I was so incoherent. It took me two days to actually check whether that book existed, as I wouldn't believe it.

It existed. And it really helped me, by allowing me to set boundaries of how far I can go in my own work, and that it is OK to have limitations, and that trying to solve EVERYTHING leads to madness.

(Thanks to Brandon Harris for triggering this memory)

Keynote at Web Conference 2021

Today, I have the honor to give a keynote at the WWW Confe... sorry, the Web Conference 2021 in Ljubljana (and in the whole world). It's the 30th Web Conference!

Join Jure Leskovec, Evelyne Viegas, Marko Grobelnik, Stan Matwin and myself!

I am going to talk about how Abstract Wikipedia and Wikifunctions aims to contribute to Knowledge Equity. Register here for free:

Update: the talk can now be watched on VideoLectures:

Building a Multilingual Wikipedia

Communications of the ACM published my paper on "Building a Multilingual Wikipedia", a short description of the Wikifunctions and Abstract Wikipedia project that we are currently working on at the Wikimedia Foundation.


Jochen Witte

Jochen Witte war ein Freund meiner Schulzeit. Ich habe viel von ihm gelernt, er konnte all diese praktischen Sachen zu denen ich nie einen Zugang hatte und von denen ich oft wünschte, ich könnte sie. Von ihm lernte ich, was eine gute Soundanlage braucht und warum Subwoofer groß sein müssen und was Subwoofer überhaupt sind. Zusammen schleppten wir schwere Boxen, um Unterstufendiscos und Abischerze und Vorträge zu ermöglichen. Von ihm lernte ich die Vorzüge des Gaffertapes kennen, und dass es nicht nur silbernes Klebeband ist. Er war der erste, der mir Mangas und Anime ein wenig näherbrachte, insbesondere hatte er eine Leidenschaft für Akira. Er ließ mich das erste Mal die elektronische Musik von Chris Hülsbeck und Jean-Michel Jarre hören. Er las ASM, ich las Power Play. Wir spielten eine zeitlang DSA miteinander. Er war der erste den ich kannte mit einem Pager. Er wirkte stets so als konnte er alles reparieren, und es war gut so jemanden zu kennen.

Gleichzeitig waren einige meiner Freunde und ich ihm gegenüber nicht immer freundlich, oh nein, im Gegenteil, manchmal war ich geradewegs grausam. Ich mache mich über seine Brille lustig oder sein Gewicht, und konnte Punkte damit sammeln, über ihn Witze zu machen. Ich wusste es war falsch. Wir waren ja schon die Außenseiter in der Klasse, und ich versuchte ihn zum Außenseiter der Außenseiter zu machen. Meine einzige Entschuldigung ist, dass wir Kinder waren, und ich noch nicht die Stärke hatte, besser zu sein. Ich lernte viel daraus, und wollte nie wieder so sein. Mit der Zeit verstand ich mich besser. Wo diese Grausamkeit herkam. Und das es nicht an Jochen lag, sondern in mir. Ich schäme mich für vieles was ich tat. Ich weiß nicht, ob ich mich jemals bei ihm entschuldigt habe.

Und dennoch glaube ich waren wir Freunde.

Nach der Schulzeit verloren wir uns aus den Augen. Er studierte Chemie in Esslingen, wir trafen uns hin und wieder im Movie Dick zur Sneak Preview. Er zog nach Staig im Alb-Donau-Kreis und fand sich als Goth wieder. Aber über die Jahre hinweg, gerieten wir hin und wieder in Kontakt.

Eine unserer gemeinsamen Erinnerungen war, wie wir zusammen zu einem Vortrag von Erich von Däniken fuhren. Es war mein Auto. Wir hatten einen Platten, und während er es zum Laufen brachte - wie gesagt, er konnte alles reparieren - fragte er mich, wann ich denn das letzte Mal nach dem Öl geschaut habe. Ich muss so belämmert reingeschaut haben, dass er nur noch lachen konnte. Die Antwort war "Nie", und er sah es in meinem Gesicht. Jedesmal wenn wir uns trafen, sprach er mich auf diesen Abend an.

Jochen half mir beim Umzug nach Karlsruhe. Das Gästebett passte nicht richtig zusammen. Er sagte er könnte es festziehen, aber ich würde es nie wieder auseinander bekommen. Es wird schwierig, damit umzuziehen. Ich sagte, das ist OK, ist ja nur ein billiges IKEA Gästebett Couch Dings. Ich habe nicht vor, damit umzuziehen, versicherte ich ihm.

Ich zog damit von Karlsruhe nach Berlin. Von Berlin nach Alameda. Innerhalb von Alameda. Von Alameda nach Berkeley. Es hat den Umzugshelfern jedesmal Kopfzerbrechen bereitet, genau wie Jochen versprochen hatte. Letzte Woche brach ein Stück ab. Ich sitze jetzt darauf und schreibe das hier. Nach fast einem Jahrzehnt sollte ich es wohl endlich austauschen.

Das letzte mal trafen wir uns ganz zufällig 2017 am Stuttgarter Bahnhof. Ich war überhaupt nur ein Mal im letzen halben Jahrzehnt wieder in Deutschland. Und da, am Bahnhof, traf ich ihn. Es war schön, Jochen wiederzusehen, und wir redeten als ob wir uns immer noch täglich sehen würden, wie zwanzig Jahre zuvor. Als ob das Abitur erst gestern war.

Diese Woche erfuhr ich von Michael, dass Jochen verstorben ist. Er starb nur wenige Monate nach unserem zufälligen Treffen, im April 2018. Er wurde nur vierzig Jahre alt.

Es tut mir leid.

Und noch viel mehr: Danke.

Ruhe in Frieden, Jochen Witte.

Der Name Zdenko

Heute sah ich dass der Artikel Zdenko - mein eigentlicher Name - auf der Englischen Wikipedia verändert wurde. Jemand hatte die Bedeutung des Namens von dem, was ich für richtig hielt (slawische Form von Sidonius) zu etwas was ich nie zuvor gehört habe (Koseform von Zdeslav) verändert, aber nicht die Quelle angepasst. Ich dachte, das wird eine schnelle Korrektur, habe aber dennoch in die Quelle geschaut - und, schau an, die Quelle sagte weder das eine noch das andere, sondern behauptete der Name stammt von dem slawischen Wort zidati, bauen, errichten.

Das führte mich zu einer zweitstündigen Odyssee durch verschiedene Quellen des 19. und 20. Jahrhunderts, wo ich Belege für alle drei Bedeutungen finden konnte - außerdem Quellen, die behaupteten, dass der Name von dem Slawischen Wort zdenac, Brunnen, abgeleitet ist, dass auch der Name Sidney von Sidonius stamme, und eine Hessische Quelle die vehement darüber schimpfte, dass doch Zdenko und Sidonius nichts miteinander zu tun haben (auch die Slowenische Wikipedia sagt, dass die Namen Zdenko und Sidonius zwar einen gemeinsamen Namenstag haben, aber nicht der gleiche Name sind). Dafür aber führt die gleiche Quelle aus, dass der im Osthessischen gebrauchte Name Denje wohl von Zdenka kommt (so nah an Denny!)

Denje gefällt mir als Name.

Kurzgesagt: wenn Du denkst, Etymologie sei kompliziert, sei gewarnt: Anthroponomastik ist deutlich schlimmer!

The name Zdenko

Today I saw that the Wikipedia article on Zdenko - my actual name - was edited, and the meaning of the name was changed from something I considered correct (slavic form of Sidonius) to something that I never heard of before (diminutive of Zdeslav), but the reference stayed intact, so I thought that'll be an easy revert. Just to do due process, I checked the given source - and funnily enough, it didn't say neither one nor the other, but gave an etymology from the slavic word zidati, to build, to create.

That lead me down a two hour rabbit hole through different sources crossing the 19th to 20th century, finding sources that claim the name is derived from the Slavic word zdenac, a well, or that Zdenko is cognate to Sidney, a Hessian source explaining that it is considered the root for the name Denje (so close to Denny!) (and saying it has nothing to do with Sidonius), and much more.

In short, if you think that etymology is messy, I tell you, anthroponymy is far worse!

Time on Mars

This is a fascinating and fun listen about the mars mission. Because a day on Mars takes 40 minutes longer than on Earth, the people working on that mission had to live on Mars time, as the Mars rovers work with solar panels. So they have watches showing Mars time. They invent new words in their language, speaking about sol instead of day, of yestersol, and they start themselves calling Martians. 11 minutes.

Katherine Maher to step down from Wikimedia Foundation

Today Katherine Maher announced that she is stepping down as the CEO of the Wikimedia Foundation in April.

Thank you for everything!

Boole and Voynich and Everest

Did you know?

George Boole - after whom the Boolean data type and Boolean logic was named - was the father of Ethel Lilian Voynich - who wrote The Gadfly.

Her husband was Wilfrid Voynich - after whom the Voynich manuscript was named.

Ethel's mother and George Boole's wife was Mary Everest Boole - a self-thought mathematician who wrote educational books about mathematics. Her life is of interest to feminists as an example of how women made careers in an academic system that did not welcome them.

Mary Everest Boole's uncle was Sir George Everest - after whom Mount Everest is named.

And her daughter Lucy Everest was the first he first woman Fellow of the Royal Institute of Chemistry.

Geoffrey Hinton, great-great-grandson of George and Mary Everest Boole, received the Turing Award for his work on deep learning.

Abraham Taherivand to step down from Wikimedia Deutschland

Today Abraham Taherivand announced that he is stepping down as the CEO of Wikimedia Deutschland at the end of the year.

Thank you for everything!

Twenty years

On this day, twenty years ago, on January 15, 2001, I started my third Website, Nodix, and I kept it up since then (unlike my previous two Websites, which are lost to history as Internet Archive didn't capture them yet, it seems). A few years later I renamed it to Simia.

Here is the first entry: Willkommen auf der Webseite von Denny Vrandecic!

My Website never became particularly popular, although I was meticulously keeping track of how many hits I got and all of this. It was always a fun side project for which I had sometimes more and sometimes less time.

The funniest thing is that it was - and that was completely incidental - exactly the same day that another Website was started, which I, over the years, spent much more time on: Wikipedia.

Wikipedia changed my life, not only once, but many times.

It is how I met Kamara.

It is how I met a lot of other very smart people, too. It became part of my research work and my PhD thesis. It became the motivation for many of the projects I have started, be it Semantic MediaWiki, Wikidata, or Abstract Wikipedia. It is the reason for my career trajectory over the last fifteen years. It is hard to overstate how influential Wikipedia has been on my life.

It is hard to overstate how important Wikipedia has become for modern AI and for the Web of today. For smaller language communities. For many, many people looking for knowledge. And for the many people who realised that they can contribute to it too.

Thanks to the Wikipedia community, thanks to this marvellous project, and happy anniversary and many returns to Wikipedia!

Happy New Year 2021!

2020 was a challenging year, particularly due to the pandemic. Some things were very different, some things were dangerous, and the pandemic exposed the fault lines in many societies in a most tragic way around the world.

Let's hope that 2021 will be better in that respect, that we will have learned from how the events unfolded.

But I'm also amazed by how fast the vaccine was developed and made available to tens of millions.

I think there's some chance that the summer of '21 will become one to sing about for a generation.

Happy New Year 2021!

Keynote at SMWCon Fall 2020


I have the honor of being the invited keynote for the SMWCon Fall 2020. I am going to talk "From Semantic MediaWiki to Abstract Wikipedia", discussing fifteen years of Semantic MediaWiki, how it all started, where we are now - crossing Freebase, DBpedia, Wikidata - and now leading to Wikifunctions and Abstract Wikipedia. But, more importantly, how Semantic MediaWiki, over all these years, still holds up and what its unique value is.

Page about the talk on the official conference site: https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2020/Keynote:_From_Semantic_Wikipedia_to_Abstract_Wikipedia

Site went down

The site went down, again. First time was in July, when Apache had issues, this time it's due to MySQL acting up and frying the database. I found a snapshot from July 2019, and am trying to recreate the entries from in between (thanks, Wayback Machine!)

Until then, at least the site is back up, even though they might be some losses in the content.

P.S.: it should all be back up. If something is missing, please email me.

Wikidata crossed Q100000000

Wikidata crossed Q100000000 (and, in fact, skipped it and got Q100000001 instead).

Here's a small post by Lydia Pintscher and me: https://diff.wikimedia.org/2020/10/06/wikidata-reaches-q100000000/

Mulan

I was surprised when Disney made the decision to sell Mulan on Disney+. So if you wanted to watch Mulan, you not only have to buy it, so far so good, but you have to join their subscription service first. The price for Mulan is $30 in the US, additionally to the monthly fee of streaming, $7. So the $30 don't buy you Mulan, but allow you to watch it if you keep up your subscription.

Additionally, on December 4 the movie becomes free for everyone with a Disney+ subscription.

I thought, that's a weird pricing model. Who'd pay that much money for streaming the movie a few weeks earlier? I know, it will be very long weeks due to the world being so 2020, but still. Money is tight for many people. Also, the movie had very mixed reviews and a number of controversies attached to it.

According to the linked report, Disney really knows what they're doing. 30% of subscribers bought the early streaming privilege! Disney made hundreds of millions in extra profit within three first few days (money they really will be thankful for right now given their business with the cruise ships and theme parks and movies this year).

The most interesting part is how this will affect the movie industry. Compare to Tenet - which was reviewed much better and which was the hope to revive the moribund US cinema industry, but made less than $30M - which also needs to be shared with the theaters and had much more distribution costs. Disney keeps a much larger share of the $30 for Mulan than Tenet makes for its production company.

The lesson from Mulan and Trolls 2, which also did much better than I would ever have predicted, for the production companies experimenting with novel pricing models, could be disastrous for theaters.

I think we're going to see even more experimentation with pricing models. If the new Bond movie and/or the new Marvel movie should be pulled from cinemas, this might also be the end of cinemas as we know them.

I don't know how the industry will change, but the swing is from AMC to Netflix, with the producers being caught in between. The pandemic massively accelerated this transition, as it did so many others.

https://finance.yahoo.com/amphtml/news/nearly-onethird-of-us-households-purchased-mulan-on-disney-for-30-fee-data-221410961.html

Gödel's naturalization interview

When Gödel went to his naturalization interview, his good friend Einstein accompanied him as a witness. On the way, Gödel told Einstein about a gap in the US constitution that would allow the country to be turned into a dictatorship. Einstein told him to not mention it during the interview.

The judge they came to was the same judge who already naturalized Einstein. The interview went well until the judge asked whether Gödel thinks that the US could face the same fate and slip into a dictatorship, as Germany and Austria did. Einstein became alarmed, but Gödel started discussing the issue. The judge noticed, changed the topic quickly, and the process came to the desired outcome.

I wonder what that was, that Gödel found, but that's lost to history.

Gödel and Leibniz

Gödel in his later age became obsessed with the idea that Leibniz had written a much more detailed version of the Characteristica Universalis, and that this version was intentionally censored and hidden by a conspiracy. Leibniz had discovered what he had hunted for his whole life, a way to calculate truth and end all disagreements.

I'm surprised that it was Gödel in particular to obsess with this idea, because I'd think that someone with Leibniz' smarts would have benefitted tremendously from Gödel's proofs, and it might have been a helpful antidote to his own obsession with making truth a question of mathematics.

And wouldn't it seem likely to Gödel that even if there were such a Characteristics Universalis by Leibniz, that, if no one else before him, he, Gödel himself would have been the one to find the fatal bug in it?

Starting Abstract Wikipedia

I am very happy about the Board of the Wikimedia Foundation having approved the proposal for the multilingual Wikipedia aka Abstract Wikipedia aka Wikilambda aka we'll need to find a name for it.

In order to make that project a reality, I will as of next week join the Foundation. We will be starting with a small, exploratory team, which will allow us to have plenty of time to continue to socialize and discuss and refine the idea. Being able to work on this full time and with a team should allow us to make significant progress. I am very excited about that.

I am sad to leave Google. It was a great time, and I learned a lot about running *large* projects, and I met so many brilliant people, and I ... seriously, it was a great six and a half years, and I will very much miss it.

There is so much more I want to write but right now I am just super happy and super excited. Thanks everyone!

Lexical masks in JSON

We have released lexical masks as ShEx files before, schemata for lexicographic forms that can be used to validate whether the data is complete.

We saw that it was quite challenging to turn these ShEx files into forms for entering the data, such as Lucas Werkmeister’s Lexeme Forms. So we adapted our approach slightly to publish JSON files that keep the structures in an easier to parse and understand format, and to also provide a script that translates these JSON files into ShEx Entity Schemas.

Furthermore, we published more masks for more languages and parts of speech than before.

Full documentation can be found on wiki: https://www.wikidata.org/wiki/Wikidata:Lexical_Masks#Paper

Background can be found in the paper: https://www.aclweb.org/anthology/2020.lrec-1.372/

Thanks Bruno, Saran, and Daniel for your great work!

Major bill for US National Parks passed

Good news: the US Senate has passed a bipartisan large Public Lands Bill, which will provide billions right now and continued sustained funding for National Parks.

There a number of interesting and good parts about this, besides the obvious that National Parks are being funded better and predictably:

  1. the main reason why this passed and was made was that the Evangelical movement in the US is increasingly reckoning that Pro-Life also means Pro-Environment, and this really helped with making this bill a reality. This is major as it could set the US on a path to become a more sane nation regarding environmental policies. If this could also extend to global warming, that would be wonderful, but let's for now be thankful for any momentum in this direction.
  2. the sustained funding comes from oil and gas operations, which has a certain satisfying irony to it. I expect this part to backfire a bit somehow, but I don't know how yet.
  3. Even though this is a political move by Republicans in order to safe two of their Senators this fall, many Democrats supported it because the substance of the bill is good. Let's build on this momentum of bipartisanship.
  4. This has nothing to do with the pandemic, for once, but was in work for a long time. So all of the reasons above are true even without the pandemic.

Black lives matter

Fun in coding

16 May 2020

This article really was grinding my gears today. Coding is not fun, it claims, and everyone who says otherwise is lying for evil reasons, like luring more people into programming.

Programming requires almost superhuman capabilities, it says. And other jobs who do that, such as brain surgery, would never be described as fun, so it is wrong to talk like this about coding.

That is all nonsense. The article not only misses the point, but it denies many people their experience. What's the goal? Tell those "pretty uncommon" people that they are not only different than other people, but that their experience is plain wrong, that when they say they are having fun doing this, they are lying to others, to the normal people, for nefarious reasons? To "lure people to the field" to "keep wages under control"?

I feel offended by this article.

There are many highly complex jobs that some people have fun doing some of the time. Think of writing a novel. Painting. Playing music. Cooking. Raising a child. Teaching. And many more.

To put it straight: coding can be fun. I have enjoyed hours and days of coding since I was a kid. I will not allow anyone to deny me that experience I had, and I was not a kid with nefarious plans like getting others into coding to make tech billionaires even richer. And many people I know have expressed fun with coding.

Also: coding does not *have* to be fun. Coding can be terribly boring, or difficult, or frustrating, or tedious, or bordering on painful. And there are people who never have fun coding, and yet are excellent coders. Or good enough to get paid and have an income. There are coders who code to pay for their rent and bills. There is nothing wrong with that either. It is a decent job. And many people I know have expressed not having fun with coding.

Having fun coding doesn't mean you are a good coder. Not having fun coding doesn't mean you are not a good coder. Being a good coder doesn't mean you have to have fun doing it. Being a bad coder doesn't mean you won't have fun doing it. It's the same for singing, dancing, writing, playing the trombone.

Also, professional coding today is rarely the kind of activity portrayed in this article, a solitary activity where you type code in green letters into a monotype font on black background, without having to answer to anyone, your code not being reviewed and scrutinized before it goes into production. For decades, coding has been a highly social activity, that requires negotiation and discussion and social skills. I don't know if I know many senior coders who spend the majority of their work time actually coding. And it's in that level of activity where ethical decisions are made. Ethical decisions are rarely happening at the moment the coder writes an if statement, or declares a variable. These decisions are made long in advance, documented in design docs and task descriptions, reviewed by a group of people.

So this article, although it has its heart in the right position, trying to point out that coding, like any engineering, also has many relevant ethical questions, goes about it entirely wrongly, and manages to offend me, and probably a lot of other people.

Sorry for my Saturday morning rant.

OK

11 May 2020

I often hear "don't go for the mediocre, go for the best!", or "I am the best, * the rest" and similar slogans. But striving for the best, for perfection, for excellence, is tiring in the best of times, never mind, forgive the cliché, in these unprecedented times.

Our brains are not wired for the best, we are not optimisers. We are naturally 'satisficers', we have evolved for the good-enough. For this insight, Herbert Simon received a Nobel prize, the only Turing Award winner to ever get one.

And yes, there are exceptional situations where only the best is good enough. But if good enough was good enough for a Turing-Award winning Nobel laureate, it is probably for most of us too.

It is OK to strive for OK. OK can sometimes be hard enough, to be honest.

May is mental health awareness month. Be kind to each other. And, I know it is even harder, be kind to yourself.

Here is OK in different ways. I hope it is OK.

Oké ఓకే ਓਕੇ オーケー ओके 👌 ওকে או. קיי. Окей أوكي Օքեյ O.K.


Tim Bray leaving Amazon in protest

Tim Bray, co-author of XML, stepped down as Amazon VP over their handling of whistleblowers on May 1st. His post on this decision is worth reading.

If life was one day

If the evolution of animals was one day... (600 million years)

  • From 1am to 4am, most of the modern types of animals have evolved (Cambrian explosion)
  • Animals get on land a bit at 3am. Early risers! It takes them until 7am to actually breath air.
  • Around noon, first octopuses show up.
  • Dinosaurs arrive at 3pm, and stick around until quarter to ten.
  • Humans and chimpanzees split off about fifteen minutes ago, modern humans and Neanderthals lived in the last minute, and the pyramids were built around 23:59:59.2.

In that world, if that was a Sunday:

  • Saturday would have started with the introduction of sexual reproduction
  • Friday would have started by introducing the nucleus to the cell
  • Thursday recovering from Wednesday's catastrophe
  • Wednesday photosynthesis started, and lead to a lot of oxygen which killed a lot of beings just before midnight
  • Tuesday bacteria show up
  • Monday first forms of life show up
  • Sunday morning, planet Earth forms, pretty much at the same time as the Sun.
  • Our galaxy, the Milky Way, is about a week older
  • The Universe is about another week older - about 22 days.

There are several things that surprised me here.

  • That dinosaurs were around for such an incredibly long time. Dinosaurs were around for seven hours, and humans for a minute.
  • That life started so quickly after Earth was formed, but then took so long to get to animals.
  • That the Earth and the Sun started basically at the same time.

Addendum April 27: Álvaro Ortiz, a graphic designer from Madrid, turned this text into an infographic.

Architecture for a multilingual Wikipedia

I published a paper today:

"Architecture for a multilingual Wikipedia"

I have been working on this for more than half a decade, and I am very happy to have it finally published. The paper is a working paper and comments are very welcome.

Abstract:

Wikipedia’s vision is a world in which everyone can share in the sum of all knowledge. In its first two decades, this vision has been very unevenly achieved. One of the largest hindrances is the sheer number of languages Wikipedia needs to cover in order to achieve that goal. We argue that we need anew approach to tackle this problem more effectively, a multilingual Wikipedia where content can be shared between language editions. This paper proposes an architecture for a system that fulfills this goal. It separates the goal in two parts: creating and maintaining content in an abstract notation within a project called Abstract Wikipedia, and creating an infrastructure called Wikilambda that can translate this notation to natural language. Both parts are fully owned and maintained by the community, as is the integration of the results in the existing Wikipedia editions. This architecture will make more encyclopedic content available to more people in their own language, and at the same time allow more people to contribute knowledge and reach more people with their contributions, no matter what their respective language backgrounds. Additionally, Wikilambda will unlock a new type of knowledge asset people can share in through the Wikimedia projects, functions, which will vastly expand what people can do with knowledge from Wikimedia, and provide a new venue to collaborate and to engage the creativity of contributors from all around the world. These two projects will considerably expand the capabilities of the Wikimedia platform to enable every single human being to freely share in the sum of all knowledge.

Stanford seminar on Knowledge Graphs

My friend Vinay Chaudhri is organising a seminar on Knowledge Graphs with Naren Chittar and Michael Genesereth this semester at Stanford.

I have the honour to present in it as the opening guest lecturer, introducing what Knowledge Graphs are and what are good for.

Due to the current COVID situation, the seminar was turned virtual, and opened to everyone to attend to.

Other speakers during the semester include Juan Sequeda, Marie-Laure Mugnier, Héctor Pérez Urbina, Michael Uschold, Jure Leskovec, Luna Dong, Mark Musen, and many others.

Change is in the air

I'll be prophetic: the current pandemic will shine a bright light on the different social and political systems in the different countries. I expect to see noticeable differences in how disruptive the handling of the situation by the government is, how many issues will be caused by panic, and what effect freely available health care has. The US has always been on the very end of admiring the self sustained individual, and China has been on the other end of admiring the community and its power, and Europe is somewhere in the middle (I am grossly oversimplifying).

This pandemic will blow over in a year or two, it will sweep right through the US election, and the news about it might shape what we deem viable and possible in ways beyond the immediately obvious. The possible scenarios range all the way from high tech surveillance states to a much wider access to social goods such as health and education, and whatever it is, the pandemic might be a catalyst towards that.

Wired: "Wikipedia is the last best place on the Internet"

WIRED published a beautiful ode to Wikipedia, painting the history of the movement with broad strokes, aiming to capture its impact and ambition with beautiful prose. It is a long piece, but I found the writing exciting.

Here's my favorite paragraph:

"Pedantry this powerful is itself a kind of engine, and it is fueled by an enthusiasm that verges on love. Many early critiques of computer-assisted reference works feared a vital human quality would be stripped out in favor of bland fact-speak. That 1974 article in The Atlantic presaged this concern well: “Accuracy, of course, can better be won by a committee armed with computers than by a single intelligence. But while accuracy binds the trust between reader and contributor, eccentricity and elegance and surprise are the singular qualities that make learning an inviting transaction. And they are not qualities we associate with committees.” Yet Wikipedia has eccentricity, elegance, and surprise in abundance, especially in those moments when enthusiasm becomes excess and detail is rendered so finely (and pointlessly) that it becomes beautiful."

They also interviewed me and others for the piece, but the focus of the article is really on what the Wikipedia communities have achieved in our first two decades.

Two corrections: - I cannot be blamed for Wikidata alone, I blame Markus Krötzsch as well - the article says that half of the 40 million entries in Wikidata have been created by humans. I don't know if that is correct - what I said is that half of the edits are made by human contributors

Normbrunnenflasche

It's a pity there's no English Wikipedia article about this marvellous thing that exemplifies Germany so beautifully and quintessentially: the Normbrunnenflasche.

I was wondering the other day why in Germany sparkling water is being sold in 0.7l bottles and not in 1l or 2l or whatever, like in the US (when it's sold here at all, but that's another story).

Germany had a lot of small local producers and companies. To counter the advantages of the Coca Cola Company pressing in the German market, in 1969 a conference of representatives of the local companies decided to introduce a bottle design they all would use. This decision followed a half year competition and discussion on what this bottle should look like.

Every company would use the same bottle for sparkling water and other carbonated drinks, and so no matter which one you bought, the empty bottle would afterwards be routed to the closest participating company, not back home, therefore reducing transport costs and increasing competitiveness against Coca Cola.

The bottle is full of smart features. The 0.7l were chosen to ensure that the drink remained carbonated until the last sip, because larger bottles would last longer and thus gradually loose carbonization.

The form and the little pearls outside were chosen for improved grip, but also to symbolize the sparkles of the carbonization.

The metal screw cap was the real innovation there, useful for drinks that could increase pressure due to the carbonization.

And finally two slightly thicker bands along the lower half of the bottle that would, while being rerouted for another usage, slowly get more opaque due to mechanical pressure, thus indicating how well used the individual bottle was, so they could be taken out of service in time before breaking at the customer.

The bottles were reused an average of fifty times, their boxes an average of hundred times. More than five billion of them have been brought into circulation in the fifty years since their adoption, for an estimated quarter of a trillion fillings.

A new decade?

The job of an ontologist is to define concepts. And since I see some posts commenting on whether a decade is closing and a new decade is starting tonight, here's my private, but entirely official position.

A decade is a consecutive timespan of ten years, and therefore at every given point a new decade starts and one ends. But that's a trivial answer to the question and not very useful.

There are two ways to count calendar decades, and both are arbitrary and rely on retconning, I mean, they really on redefining the past. Therefore there is no right or wrong.

Method one is by using the proleptic Gregorian calendar, and starting with the year 1 and ending with the year 10, and calling that the first decade. If you keep counting, then the twohundredandthird decade will start on January 1st 2021, and we are currently firmly in the twohundredandsecond decade, and will stay there for another year.

Method two is based on the fact that for a millennium now and for many years to come there's a time period that conveniently lasts a decade where the years start with the same three digits. That is, the years starting with 202, which are called the 2020s, the ones with 199 which are called the 1990s (or sometimes just the 90s), etc. For centuries now we can find support for these kind of decades being widely used. According to this method, tonight marks a new decade.

So whether you are celebrating a new year tonight or not (because there are many other calendars out there too), or a new decade or not, I wish you wonderful 2020s!

SWAT4HCLS trip report

This week saw the 12th SWAT4HCLS event in Edinburgh, Scotland. It started with a day of tutorials and workshops on Monday, December 10th, on topics such as SPARQL, querying, ontology matching, and using Wikibase and Wikidata.

Conference presentations went on for two days, Tuesday and Wednesday. This included four keynotes, including mine on Wikidata, and how to move beyond Wikidata (presenting the ideas from my Abstract Wikipedia papers). The other three keynotes (as well as a number of the paper presentation) were all centered on the FAIR concept which I already saw being so prominent at the eScience conference earlier this year. FAIR as in Findable, Accessible, Interoperable, and Reusable publication of data. I am very happy to see these ideas spread out so prominently!

Birgitta König-Ries talked about how to use semantic technologies to manage FAIR data. Dov Greenbaum talked about how licenses interplay with data and what it means for FAIR data - personally, my personal favorite of the keynotes, because of my morbid fascination regarding licenses and intellectual property rights pertaining to data and knowledge. He actually confirmed my understanding of the area - that you can’t really use copyright for data, and thus the application of CC-BY or similar licenses to data would stand on shaky grounds in a court. The last keynote was by Helen Parkinson, who gave a great talk on the issues that come up when building vocabularies, including issues around over-ontologizing (and the siren call of just keeping on modeling) and others. She put the issues in parallel to the travels of Odysseus, which was delightful.

The conference talks and posters were really on spot on the topic of the conference: using semantic web technologies in the life sciences, health care, and related fields. It was a very satisfying experience to see so many applications of the technologies that Semantic Web researchers and developers have been creating over the years. My personal favorite was MetaStanza, web components that visualize SPARQL results in many different ways (a much needed update to SPARK, that Andreas Harth and I had developed almost a decade ago).

On Thursday, the conference closed with a Hackathon day, which I couldn’t attend unfortunately.

Thanks to the organizers for the event, and thanks again for the invitation to beautiful Edinburgh!

Other trip reports (send me more if you have them):

Frozen II in Korea

This is a fascinating story, that just keeps getting better (and Hollywood Reporter is only scratching the surface here, unfortunately): an NGO in South Korea is suing Disney for "monopolizing" the movie screens of the country, because Frozen II is shown on 88% of all screens.

Now, South Korea has a rich and diverse number of movie theatres - they have the large cineplexes in big cities, but in the less populated areas they have many small theatres, often with a small number of screens (I reckon it is similar to the villages in Croatia, where there was only a single screen in the theater, and most movies were shown only once, and there were only one or two screenings per day, and not on every day). The theatres are often independent, so there is no central planning about which movies are being shown (and today, it rarely matters today how many copies of a movie are being made, as many projectors are digital and thus unlimited copies can be created on the fly - instead of waiting for the one copy to travel from one town to the next, which was the case in my childhood).

So how would you ensure that these independent movies don't show a movie too often? By having a centralized way that ensures that not too many screens show the same movie? (Preferably on the Blockchain, using an auction system?) Good luck with that, and allowing the local theatres to adapt their screenings to their audiences.

But as said, it gets better: the 88% number is being arrived at by counting how many of the screens in the country showed Frozen II on a given day. It doesn't mean that that screen was used solely for Frozen II! If the screen was used at noon for a showing of Frozen II, and at 10pm for a Korean horror movie, that screen counts for both. Which makes the percentage a pretty useless number if you want to show monopolistic dominance (also, because the numbers add up to far more than 100%). Again, remember that in small towns there is often a small number of screens, and they have to show several different movies on the same screen. If the ideas of the lawsuit would be enacted, you would need to keep off Frozen II from a certain number of screens! Which basically makes it impossible to allow kids and teens in less populated areas to participate in event movie-going such as Frozen II and trying to avoid spoilers in Social Media afterwards.

Now, if you look how many screenings, instead of screens, were occupied by Frozen II, the number drops down to 46% - which is still impressive, but far less dominant and monopolistic than the 88% cited above (and in fact below the 50% the Korean law requires to establish dominance).

And even more impressive: in the end it is up to the audience. And even though 'only' 46% of the screenings were on Frozen II, every single day since its release between 60% and 85% of all revenue was going to Frozen II. So one could argue that the theatres were actually underserving the audience (but then again, that's not how it really works, because screenings are usually in rooms with hundred or more seats, and they can be very differently filled - and showing a blockbuster three times with almost full capacity, and showing a less popular movie once with only a dozen or so tickets sold might still have served the local community better than only running the block buster).

I bet the NGO's goal is just to raise awareness about the dominance of the American entertainment industry, and for that, hey, it's certainly worth a shot! But would they really want to go back to a system where small local cinemas would not be able to show blockbusters for a long time, involving a complicated centralized planning component?

(Also, I wish there was a way to sign up for updates on a story, like this lawsuit. Let me know if anyone knows of such a system!)


Machine Learning and Metrology

There are many, many papers in machine learning these days. And this paper, taking a step back, and thinking about how researchers measure their results and how good a specific type of benchmarks even can be - crowdsourced golden sets. It brings a convincing example based on word similarity, using terminology and concepts from metrology, to show how many results that have been reported are actually not supported by the golden set, because the resolution of the golden set is actually insufficient. So there might be no improvement at all, and that new architecture might just be noise.

I think this paper is really worth the time of people in the research field. Written by Chris Welty, Lora Aroyo, and Praveen Paritosh.

The story of the Swedish calendar

Most of us are mostly aware how the calendar works. There’s twelve months in a year, each month has 30 or 31 days, and then there’s February, which usually has 28 days and sometimes, in what is called a leap year, 29. In general, years divisible by four are leap years.

This calendar was introduced by no one else then Julius Caesar, before he became busy conquering the known world and becoming the Emperor of Rome. Before that he used to have the job title “supreme bridge builder” - the bridge connecting the human world with the world of the gods. One of the responsibilities of this role was to decide how many days to add to the end of the calendar year, because the Romans noticed that their calendar was getting misaligned with the seasons, because it was simply a bit too short. So, for every year, the supreme bridge builder had to decide how many days to add to the calendar.

Since we are talking about the Roman Republic, this was unsurprisingly misused for political gain. If the supreme bridge builder liked the people in power, he might have granted a few extra weeks. If not, no extra days. Instead of ensuring that the calendar and the seasons aligned, the calendar got even more out of whack.

Julius Caesar spearheaded a reform of the calendar, and instead of letting the supreme bridge builder decide how many days to add, the reform devised rules founded in observation and mathematical rules - leading to the calendar we still have today: twelve months each year, each with 30 or 31 days, besides February, which had 28, but every four years would have 29. This is what we today call the Julian calendar. This calendar was not perfect, but pretty good.

Over the following centuries, the role of the supreme bridge builder - or, in latin, Pontifex Maximus - transferred from the Emperor of Rome to the Bishop of Rome, the Pope. And with continuing observations over centuries it was noticed that the calendar was again getting out of sync with the seasons. So it was the Pope - Gregory XIII, later called The Great - who, in his role as Pontifex Maximus, decided that the calendar should be fixed once again. The committee he set up to work on that came up with fabulous improvements, which would guarantee to keep the calendar in sync for a much longer time frame. In addition to the rules established by the Julian calendar, every hundred years we would drop a leap year. But every four hundred years, we would skip dropping the leap year (as we did in 2000, which not many people noticed). And in 1582, this calendar - called the Gregorian calendar - was introduced.

Imagine leading a committee that comes up with rules on what the whole world would need to do once every four hundred years - and mostly having these rules implemented. How would you lead and design such a committee? I find this idea mind-blowing.

Since the time of Caesar until 1582, about fifteen centuries have passed. And in this time, the calendar was getting slightly out of sync - by one day every century, skipping every fourth. In order to deal with that shift, they decided that ten calendar days need to be skipped. Following the 4th of October 1582 was the 15th of October 1582. In 1582, there was no 5th or 14th of October, nor any of the days in between, in the countries that had the Gregorian calendar adopted.

This lead to plenty of legal discussions, mostly about monthly rents and wages: is this still a full month, or should the rent or wage be paid prorated to the number of days? Should annual rents, interests, and taxes be prorated by these ten days, or not? What day of the week should the 15th of October be?


The Gregorian calendar was a marked improvement over the Julian calendar with regards to keeping the seasons in sync with the calendar. So one might think its adoption should be a no-brainer. But there was a slight complication: politics.

Now imagine that today the Pope gets out on his balcony, and declares that, starting in five years, January to November all have 30 days, and December has 35 or 36 days. How would the world react? Would they ponder the merits of the proposal, would they laugh, would they simply adopt it? Would a country such as Italy have a different public discourse about this topic than a country such as China?

In 1582, the situation was similarly difficult. Instead of pondering the benefits of the proposal, the source of the proposal and the relation to that source became the main deciding factor. Instead of adopting the idea because it is a good idea, the idea was adopted - or not - because the Pope of the Catholic Church declared it. The Papal state, the Spanish and French Kingdoms, were first to adopt it.

Queen Elizabeth wanted to adopt it in England, but the Anglican bishops were fiercely opposed to it because it was suggested by the Pope. Other Protestant and the Orthodox countries simply ignored it for centuries. And thus there was a 5th of October 1582 in England, but not in France, and that lead to a number of confusions over the following centuries.

Ever wondered why the October Revolution started November 7? There we go. There is even a story that Napoleon won an important battle (either the Battle of Austerlitz or the Battle of Ulm) because the Russian and Austrian forces coordinated badly as the Austrians were using the Gregorian and the Russians the Julian calendar. The story is false, but it makes for a great story.

Today, the International Day of the Book is on April 23 - the death date of both Miguel de Cervantes and William Shakespeare in 1616, the two giants of literature in their respective languages - with the amusing side-effect that they actually died about two weeks apart, even though they died on the same calendar day, but in different calendars.

It wasn’t until 1923 that for most purposes all countries had deprecated the Julian calendar, and for religious purposes some still follow it - which is why the Orthodox and the Amish celebrate Christmas on January 6. Starting 2101, that should shift by another day - and I would be very curious to see whether it will, or whether by then January 6th has solidified as the Christmas date.


Possibly the most confusing story about adopting the Gregorian calendar comes from Sweden. Like most protestant countries, Sweden did not initially adopt the Gregorian calendar, and was sticking with the Julian calendar, until in 1699 they decided to switch.

Now, the idea of skipping eleven or twelve days in one go did not sound appealing - remember all the chaos that occurred in the other countries for dropping these days. So in Sweden they decided that instead of dropping the days all at once, they would drop them one by one, by skipping the leap years from 1700 until 1740, when the two calendars would finally catch up.

In 1700, February 29 was skipped in Sweden. Which didn’t bring them any closer to Gregorian countries such as Spain, because they skipped the leap year in 1700 anyway. But it brought them out of alignment with Russia - by one day.

A war with Russia started (not about the calendar, but just a week before the calendars went out of sync, incidentally), and due to the war Sweden forgot to skip the leap days in 1704 and 1708 (they had other things on their mind). And as this was embarrassing, in 1711, King Charles XII of Sweden declared to abandon the plan, and added one extra day the following year to realign it back to Russia. And because 1712 was a leap year anyway, in Sweden there was not only a February 29, but also a February 30, 1712. The only legal February 30 in history so far.

It needed not only for Charles XII to die, but also for his sister (who succeeded him) and her husband (who succeeded her) in 1751, before Sweden could move beyond that embarrassing episode, and in 1752 Sweden switched from the Julian to the Gregorian calendar, by cutting February short and ending it after February 17, following that by March 1.


Somewhere on my To-Do list, I have the wish to write a book on Wikidata. How it came to be, how it works, what it means, the complications we encountered, and the ones we missed, etc. One section in this book is planned to be about calendar models. This is an early, self-contained draft of part of that section. Feedback and corrections are very welcome.


Erdös number, update

I just made an update to a post from 2006, because I learned that my Erdös number has went down from 4 to 3. I guess that's pretty much it - it is not likely I'll ever become a 2.

The Fourth Scream

Janie loved her research. It was at the intersection of so many interesting areas - genetics, linguistics, neuroscience. And the best thing about it - she could work the whole day with these adorable vervet monkeys.

One more time, she showed the video of the flying eagle to Kassandra. The MRI helmet on Kassandra’s little head measured the neuron activation, highlighting the same region on her computer screen as the other times, the same region as with the other monkeys. Kassandra let out the scream that Janie was able to understand herself by now, the scream meaning “Eagle!”, and the other monkeys behind the bars in the far end of the room, in a cage large as half the room, ran to cover in the bushes and small caves, if they were close enough. As they did every time.

That MRI helmet was a masterpiece. She could measure the activation of the neurons in unprecedented high resolution. And not only that, she could even send inferencing waves back, stimulating very fine grained regions in the monkey’s brain. The stimulation wasn’t very fast, but it was a modern miracle.

She slipped a raspberry to Kassandra, and Kassandra quickly snatched it and stuffed it in her mouth. The monkeys came from different populations from all over Southern and Eastern Africa, and yet they all understood the same three screams. Even when the baby monkeys were raised by mute parents, the baby monkeys understood the same three screams. One scream was to warn them from leopards, one scream was to warn them from snakes, and the third scream was to warn them from eagles. The screams were universally understood by everyone across the globe - by every vervet monkey, that is. A language encoded in the DNA of the species.

She called up the aggregated areas from the scream from her last few experiments. In the last five years, she was able to trace back the proteins that were responsible for the growth of these four areas, and thus the DNA encoding these calls. She could prove that these three different screams, the three different words of Vervetian, were all encoded in DNA. That was very different from human language, where every word is learned, arbitrary, and none of the words were encoded in our DNA. Some researchers believed that other parts of our language were encoded in our DNA: deep grammatical patterns, the ability to merge chunks into hierarchies of meaning when parsing sentences, or the categorical difference between hearing the syllable ba and the syllable ga. But she was the first one to provably connect three different concrete genes with three different words that an animal produces and understands.

She told the software to create an overlapping picture of the three different brain areas activated by the three screams. It was a three dimensional picture that she could turn, zoom, and slice freely, in real time. The strands of DNA were highlighted at the bottom of the screen, in the same colors as the three different areas in the brain. One gene, then a break, then the other two genes she had identified. Leopard, snake, eagle.

She started to turn the visualization of the brain areas, as Kassandra started squealing in pain. Her hand was stuck between the cage bars and the plate with raspberries. The little thief was trying to sneak out a raspberry or two! Janie laughed, and helped the monkey get the hand unstuck. Kassandra yanked it back into the cage, looked at Janie accusingly, knowing that the pain was Janie’s fault for not giving her enough raspberries. Janie snickered, took out another raspberry and gave it to the monkey. She snatched it out of Janie’s hand, without stopping the accusing stare, and Janie then put the plate to the other side of the table, in safe distance and out of sight of Kassandra.

She looked back at the screen. When Kassandra cried out, her hand had twitched, and turned the visualization to a weird angle. She just wanted to turn it back to a more common view, when she suddenly stopped.

From this angle, she could see the three different areas, connecting together with the audiovisual cortex at a common point, like the leaves of a clover. But that was just it. It really looked like three leaves of a four-leaf clover. The area where the fourth leaf would be - it looked a lot like the areas where the other three leaves were.

She zoomed into the audiovisual cortex. She marked the neurons that triggered each of the three leaves. And then she looked at the fourth leaf. The connection to the cortex was similar. A bit different, but similar enough. She was able to identify what probably are the trigger-neurons, just like she was able to find them for the other three areas.

She targeted the MRI helmet on the neurons connected to the eagle trigger neurons, and with a click she sent a stimulus. Kassandra looked up, a bit confused. Janie looked at the neurons, how they triggered, unrolled the activation patterns, and saw how the signal was suppressed. She reprogrammed the MRI helmet, refined the neurons to be stimulated, and sent off another stimulus.

Kassandra yanked her head up, looking around, surprised. She looked at her screen, but it showed nothing as well. She walked nervously around inside the little cage, looking worriedly to the ceiling of the lab, confused. Janie again analyzed the activation patterns, and saw how it almost went through. There seemed to be a single last gatekeeper to pass. She reprogrammed the stimulator again. Third time's the charm, they say. She just remembered a former boyfriend, who was going on and on about this proverb. How no one knew how old it was, where it began, and how many different cultures all over the world associate trying something three times with eventual success, or an eventual curse. How some people believed you need to call the devil's name three times to —

Kassandra screamed out the same scream as before, the scream saying “Eagle!”. The MRI helmet had sent the stimulus, and it worked. The other monkeys jumped for cover. Kassandra raised her own arms above her head, peeking through her fingers to find the eagle she had just sensed.

Janie was more than excited! This alone will make a great paper. She could get the monkeys to scream out one of the three words of their language by a simple stimulation of particular neurons! Sure, she expected this to work - why wouldn’t it? But the actual scream, the confirmation, was exhilarating. As expected, the neurons now had a heightened potential, were easier to activate, waiting for more input. They slowly cooled down as Kassandra didn’t see any eagles.

She looked at the neurons connected to the fourth leaf. The gap. Was there a secret, fourth word hidden? One that all the zoologists studying vervet monkeys have missed so far? What would that word be? She reprogrammed the MRI helmet, aiming at the neurons that would trigger the fourth leaf. If her theory was right. With another click she sent a stimulus to the —

Janie was crouching in the corner of the room, breathing heavily, cold sweat was covering her arms, her face, her whole body. Her clothes were clamp. Her arms were slung above her head. She didn’t remember how she got here. The office chair she was just sitting in a moment ago, laid on the floor. The monkeys were quiet. Eerily quiet. She couldn’t see them from where she was, she couldn’t even see Kassandra from here, who was in the cage next to her computer. One of the halogen lamps in the ceiling was flickering. It wasn’t doing that before, was it?

She slowly stood up. Her body was shivering. She felt dizzy. She almost stumbled, just standing up. She slowly lowered her arms, but her arms were shaking. She looked for Kassandra. Kassandra was completely quiet, rolled up in the very corner of her cage, her arms slung around herself, her eyes staring catatonically forward, into nothing.

Janie took a step towards the middle of the room. She could see a bit more of the cage. The monkeys were partly huddled together, shaking in fear. One of them laid in the middle of the cage, his face in a grimace of terror. He was dead. She thought it was Rambo, but she wasn’t sure. She stumbled to the computer, pulled the chair from the floor, slumped into it.

The MRI helmet had recorded the activation pattern. She stepped through it. It did behave partially the same: the neurons triggered the unknown leaf, as expected, and that lead to activate the muscles around the lungs, the throat, the tongue, the mouth - in short, that activated the scream. But, unlike with the eagle scream, the activation potential did not increase, it was now suppressed. Like if it was trying to avoid a second triggering. She checked the pattern: yes, the neuron triggered that suppression itself. That was different. How did this secret scream sound?

Oh no! No, no, no, no, NOO!! She had not recorded the experiment. How stupid!

She was excited. She was scared, too, but she tried to push that away. She needed to record that scream. She needed to record the fourth word, the secret word of vervet monkeys. She switched on all three cameras in the lab, one pointed at the large cage with the monkeys, the other two pointing at Kassandra - and then she changed her mind, and turned one onto herself. What has happened to herself? Why couldn’t she remember hearing the scream? Why was she been crouching on the floor like one of the monkeys?

She checked her computer. The MRI helmet was calibrated as before, pointing at the group of triggering neurons. The suppression was ebbing down, but not as fast as she wanted. She increased the stimulation power. She shouldn’t. She should follow protocol. But this all was crazy. This was a cover story for Nature. With her as first author. She checked the recording devices. All three were on. The streams were feeding back into her computer. She clicked to send the sti—

She felt the floor beneath her. It was dirty and cold. She was laying on the floor, face down. Her ears were ringing. She turned her head, opened her eyes. Her vision was blurred. Over the ringing in her ears she didn’t hear a single sound from the monkeys. She tried to move, and she felt her pants were wet. She tried to stand up, to push herself up.

She couldn’t.

She panicked. Shivered. And when she felt the tears running over her face, she clenched her teeth together. She tried to breath, consciously, to collect herself, to gain control. Again she tried to stand up, and this time her arms and legs moved. Slower than she wanted. Weaker than she hoped. She was shaking. But she moved. She grabbed the chair. Pulled herself up a bit. The computer screen was as before, as if nothing has happened. She looked to Kassandra.

Kassandra was dead. Her eyes were bloodshot. Her face was a mask of pure terror, staring at nothing in the middle of the room. Janie tried to look at the cage with the other monkeys, but she couldn’t focus her gaze. She tried to yank herself into the chair.

The chair rolled away, and she crashed to the floor.

She had went too far. She had made a mistake. She should have had followed protocol. She was too ambitious, her curiosity and her impatience took the best of her. She had to focus. She had to fix things. But first she needed to call for help. She crawled to the chair. She pulled herself up, tried to sit in the chair, and she did it. She was sitting. Success.

Slowly, she rolled back to the computer. Her office didn’t have a phone. She double-clicked on the security app on her desktop. She had no idea how it worked, she never had to call security before. She hoped it would just work. A screen opened, asking her for some input. She couldn’t read it. She tried to focus. She didn’t know what to do. After a few moments the app changed, and it said in big letters: HELP IS ON THE WAY. STAY CALM. She closed her eyes. Breathed. Good.

After a few moments she felt better. She opened her eyes. HELP IS ON THE WAY. STAY CALM. She read it, once, twice. She nodded, her gaze jumping over the rest of the screen.

The recording was still on.

She moved the mouse cursor to the recording app. She wanted to see what has happened. There was nothing to do anyway, until security came. She clicked on the play button.

The recording filled three windows, one for each of the cameras. One pointed at the large cage with the vervet monkeys, two at Kassandra. Then, one of the cameras pointing at Kassandra was moved, pointing at Janie, just moments ago - it was moments, was it? - sitting at the desk. She saw herself getting ready to send the second stimulus to Kassandra, to make her call the secret scream a second time.

And then, from the recording, Kassandra called for a third time.

The end

History of knowledge graphs

An overview on the history of ideas leading to knowledge graphs, with plenty of references. Useful for anyone who wants to understand the background of the field, and probably the best current such overview.

On the competence of conspiracists

“Look, I’ll be honest, if living in the US for the last five years has taught me anything is that any government assemblage large enough to try to control a big chunk of the human population would in no way be consistently competent enough to actually cover it up. Like, we would have found out in three months and it wouldn’t even have been because of some investigative reporter, it would have been because one of the lizards forgot to put on their human suit on day and accidentally went out to shop for a pint of milk and like, got caught in a tik-tok video.” -- Os Keyes, WikidataCon, Keynote "Questioning Wikidata"

Power in California

It is wonderful to live in the Bay Area, where the future is being invented.

Sure, we might not have a reliable power supply, but hey, we have an app that connects people with dogs who don't want to pick up their poop with people who are desperate enough to do this shit.

Another example how the capitalism that we currently live failed massively: last year, PG&E was found responsible for killing people and destroying a whole city. Now they really want to play it safe, and switch off the power for millions of people. And they say this will go on for a decade. So in 2029 when we're supposed to have AIs, self-driving cars, and self-tieing Nikes, there will be cities in California that will get their power shut off for days when there is a hot wind for an afternoon.

Why? Because the money that should have gone into, that was already earmarked for, making the power infrastructure more resilient and safe went into bonus payments for executives (that sounds so cliché!). They tried to externalize the cost of an aging power infrastructure - the cost being literally the life and homes of people. And when told not to, they put millions of people in the dark.

This is so awfully on the nose that there is no need for metaphors.

San Francisco offered to buy the local power grid, to put it into public hands. But PG&E refused that offer of several billion dollars.

So if you live in an area that has a well working power infrastructure, appreciate it.

Academic lineage

Sorry for showing off, but it is just too cool not to: here is a visualization of my academic lineage according to Wikidata.

Query: w.wiki/AE8

Bring me to your leader!

"Bring me to your leader!", the explorer demanded.

"What's a leader?", the natives asked.

"The guy who tells everyone what to do.", he explained with some consternation.

"Oh yeah, we have one like that, but why would you want to talk to him? He's unbearable."

AKTS 2019

September 24 was the AKTS workshop - Advanced Knowledge Technologies for Science in a FAIR world - co-located with the eScience and Gateways conferences in San Diego. As usual with my trip reports, I won't write about every single talk, but offer only my own personal selection and view. This is not an official report on the workshop.

I had the honor of kicking off the day. I made the proposal of using Wikidata for describing datasets so that dataset catalogs can add these descriptions to their indexes. The standard way to do so is to use Schema.org annotations describing the datasets, but our idea here was to provide a fallback solution in case Schema.org cannot be applied for one reason or the other. Since the following talks would also be talking about Wikidata I used the talk to introduce Wikidata in a bit more depth. In parallel, I kicked the same conversation off on Wikidata as well. The idea was well received, but one good question was raised by Andrew Su: why not add Schema.org annotations to Wikidata instead?

After that, Daniel Garijo of USC's ISI presented WDPlus, Wikidata Plus, which presented a prototype for how to extend Wikidata with more data (particularly tabular data) from external data sources, such as censuses and statistical publications. The idea is to surround Wikidata with a layer of so-called satellites, which materialize statistical and other external data into Wikidata's schema. They implemented a mapping languages, T2WDML, that allows to grab CSV numbers and turn them into triples that are compatible with Wikidata's schema, and thus can be queried together. There seems to be huge potential in this idea, particularly if one can connect the idea of federated SPARQL querying with on-the-fly mappings, extending Wikidata to a virtual knowledge base that would be easily several times its current size.

Andrew Su from Scripps Research talked about using Wikidata as a knowledge graph in a FAIR world. He presented their brilliant Gene Wiki project, about adding knowledge about genes and proteins to Wikidata. He presented the idea of using Wikidata as a generalized back-end for customized frontend-applications - which is perfect. Wikidata's frontend is solid and functional, but in many domains there is a large potential to improve the UX for users in specific domains (and we are seeing some if flowering more around Lexemes, with Lucas Werkmeister's work on lexical forms). Su and his lab developed ChlamBase which allows the Chlamydia research community to look at the data they are interested in, and to easily add missing data. Another huge advantage of using Wikidata? Your data is going to live beyond the life of the grant. A great overview of the relevant data in Wikidata can be seen in this rich and huge and complex diagram.

The talks switched more to FAIR principles, first by Jeffrey Grethe of UCSD and then Mark Musen of Stanford. Mark was pointing out how quickly FAIR turned from a new idea to a meme that was pervasive everywhere, and the funding agencies now starting to require it. But data often has issues. One example: BioSample is the best metadata NIH has to offer. But 73% of the Boolean metadata values are not 'true' or 'false' but have values like "nonsmoker" or "recently quitted". 26% of the integers were not parseable. 68% of the entries from a controlled vocabulary were not. Having UX that helped with entering this data would be improving the quality considerably, such as CEDAR.

Carole Goble then talked about moving towards using Schema.org for FAIRer Life Sciences resources and defining a Schema.org profile that make datasets easier to use. The challenges in the field have been mostly social - there was a lot of confidence that we know how to solve the technical issues, but the social ones provide to be challenging. Carol named four of those explicitly:

  1. ontology-itis
  2. building consensus (it's harder than you think)
  3. the Schema.org Catch-22 (Schema.org won't take it if there is no usage, but people won't use it until it is in Schema.org)
  4. dedicated resources (people think you can do the social stuff in your spare time, but you can't)

Natasha Noy gave the keynote, talking about Google Dataset Search. The lessons learned from building it:

  1. Build an ecosystem first, be technically light-weight (a great lesson which was also true for Wikipedia and Wikidata)
  2. Use open, non-proprietary, standard solutions, don't ask people to build it just for Google (so in this case, use Schema.org for describing datasets)
  3. bootstrapping requires influencers (i.e. important players in the field, that need explicit outreach) and incentives (to increase numbers)
  4. semantics and the KG are critical ingredients (for quality assurance, to get the data in quickly, etc.)

At the same time, Natasha also reiterated one of Mark's points: no matter how simple the system is, people will get it wrong. The number of ways a date field can be written wrong is astounding. And often it is easier to make the ingester more accepting than try to get people to correct their metadata.

Chris Gorgolewski followed with a session on increasing findability for datasets, basically a session on SEO for dataset search: add generic descriptions, because people who need to find your dataset probably don't know your dataset and the exact terms (or they would already use it). Ensure people coming to your landing site have a pleasant experience. And the description is markup, so you can even use images.

I particularly enjoyed a trio of paper presentations by Daniel Garijo, Maria Stoica, Basel Shbita and Binh Vu. Daniel spoke about OntoSoft, an ontology to describe software workflows in sufficient detail to allow executing them, and also to create input and output definitions, describe the execution environment, etc. Close to those in- and output definition we find Maria's work on an ontology of variables. Maria presented a lot of work to identify the meaning of variables, based on linguistic, semantic, and ontological reasoning. Basel and Binh talked about understanding data catalogs deepers, being able to go deeper into the tables and understand the actual content in them. If one would connect the results of these three papers, one could potentially see how data from published tables and datasets could become alive and answer questions almost out of the box: extracting knowledge from tables, understanding their roles with regards to the input variables, and how to execute the scientific workflows.

Sure, science fiction, and the question is how well would each of the methods work, and how well would they work in concert, but hey, it's a workshop. It's meant for crazy ideas.

Ibrahim Burak Ozyurt presented an approach towards question answering in the bio-domain using Deep Learning, including Glove and BERT and all the other state of the art work. And it's all on Github! Go try it out.

The day closed with a panel with Mark Musen, Natasha Noy, and me, moderated by Yolanda Gil, discussing what we learned today. It quickly centered on the question how to ensure that people publishing datasets get appropriate credit. For most researchers, and particularly for universities, paper publications and impact factors are the main metric to evaluate researchers. So how do we ensure that people creating datasets (and I might add, tools, workflows, and social consensus) receive the fair share of credit?

Thanks to Yolanda Gil and Andrew Su for organizing the workshop! It was an exhausting, but lovely experience, and it is great to see the interest in this field.

Illuminati and Wikibase

When I was a teenager I was far too much fascinated by the Illuminati. Much less about the actual historical order, and more about the memetic complex, the trilogy by Shea and Wilson, the card game by Steve Jackson, the secret society and esoteric knowledge, the Templar Story, Holy Blood of Jesus, the rule of 5, the secret of 23, all the literature and offsprings, etc etc...

Eventually I went to actual order meetings of the Rosicrucians, and learned about some of their "secret" teachings, and also read Eco's Foucault's Pendulum. That, and access to the Web and eventually Wikipedia, helped to "cure" me from this stuff: Wikipedia allowed me to put a lot of the bits and pieces into context, and the (fascinating) stories that people like Shea & Wilson or von Däniken or Baigent, Leigh & Lincoln tell, start falling apart. Eco's novel, by deconstructing the idea, helps to overcome it.

He probably doesn't remember it anymore, but it was Thomas Römer who, many years ago, told me that the trick of these authors is to tell ten implausible, but verifiable facts, and tie them together with one highly plausible, but made-up fact. The appeal of their stories is that all of it seems to check out (because back then it was hard to fact check stuff, so you would use your time to check the most implausible stuff).

I still understand the allure of these stories, and love to indulge in them from time to time. But it was the Web, and it was learning about knowledge representation, that clarified the view on the underlying facts, and when I tried to apply the methods I was learning to it, it fell apart quickly.

So it is rather fascinating to see that one of the largest and earliest applications of Wikibase, the software we developed for Wikidata, turned out to be actual bona fide historians (not the conspiracy theorists) using it to work on the Illuminati, to catalog the letters they sent to reach other, to visualize the flow of information through the order, etc. Thanks to Olaf Simons for heading this project, and for this write up of their current state.

It's amusing to see things go round and round and realize that, indeed, everything is connected.

Wikidatan in residence at Google

Over the last few years, more and more research teams all around the world have started to use Wikidata. Wikidata is becoming a fundamental resource. That is also true for research at Google. One advantage of using Wikidata as a research resource is that it is available to everyone. Results can be reproduced and validated externally. Yay!

I had used my 20% time to support such teams. The requests became more frequent, and now I am moving to a new role in Google Research, akin to a Wikimedian in Residence: my role is to promote understanding of the Wikimedia projects within Google, work with Googlers to share more resources with the Wikimedia communities, and to facilitate the improvement of Wikimedia content by the Wikimedia communities, all with a strong focus on Wikidata.

One deeply satisfying thing for me is that the goals of my new role and the goals of the communities are so well aligned: it is really about improving the coverage and quality of the content, and about pushing the projects closer towards letting everyone share in the sum of all knowledge.

Expect to see more from me again - there are already a number of fun ideas in the pipeline, and I am looking forward to see them get out of the gates! I am looking forward to hearing your ideas and suggestions, and to continue contributing to the Wikimedia goals.

Deep kick


Mark Stoneward accepted the invitation immediately. Then it took two weeks for his lawyers at the Football Association to check the contracts and non-disclosure agreements prepared by the AI research company. Stoneward arrived at the glass and steel building in London downtown. He signed in at a fully automated kiosk, and was then accompanied by a friendly security guard to the office of the CEO.

Denise Mirza and Stoneward had met at social events, but never had time to talk for a longer time. “Congratulations on the results of the World Cup!” Stoneward nodded, “Thank you.”

“You have performed better than most of our models have predicted. This was particularly due to your willingness to make strategic choices, where other associations would simply have told their players to do their best. I am very impressed.” She looked at Stoneward, trying to read his face.

Stoneward’s face didn’t move. He didn’t want to give away how much was planned, how much was luck. He knew these things travel fast, and every little bit he could keep secret gave his team an edge. Mirza smiled. She recognised that poker face. “We know how to develop a computer system that could help you with even better strategic decisions.”

Stoneward tried to keep his face unmoved, but his body turned to Mirza and his arms opened a bit wider. Mirza knew that he was interested.

“If our models are correct, we can develop an Artificial Intelligence that could help you discuss your plans, help you with making the right strategic decisions, and play through different scenarios. Such AIs are already used in board rooms, in medicine, to create new recipes for top restaurants, or training chess players.”

“What about the other teams?”

“Well, we were hoping to keep this exclusive for two or four years, to test and refine the methodology. We are not in a hurry. Our models give us an overwhelming probability to win both the European Championship and the World Cup in case you follow our advice.”

“Overwhelming probability?”

“About 96%.”

“For the European Championship?”

“No. To win both.”

Stoneward gasped. “That is… hard to believe.”

The CEO laughed. “It is good that you are sceptical. I also doubted these probabilities, but I had two teams double-check.”

“What is that advice?”

She shrugged. “I don’t know yet. We need to develop the AI first. But I wanted to be sure you are actually interested before we invest in it.”

“You already know how effective the system will be without even having developed it yet?”

She smiled. “Our own decision process is being guided by a similar AI. There are so many things we could be doing. So many possible things to work on and revolutionise. We have to decide how to spend our resources and our time wisely.”

“And you’d rather spend your time on football than on… I don’t know, healing cancer or making a product that makes tons of money?”

“Healing cancer is difficult and will take a long time. Regarding money… the biggest impediment to speeding up the impact of our work is currently not a lack of resources, but a lack of public and political goodwill. People are worried about what our technology can do, and parliament and the European Union are eager to throw more and more regulations at us. What we need is something that will make every voter in England fall in love with us. That will open up the room for us to move more freely.”

Stoneward smiled. “Winning the World Cup.”

She smiled. “Winning the World Cup.”


Three months later…

“So, how will this work? Do I, uhm, type something in a computer, or do we have to run some program and I enter possible players we are considering to select?”

Mirza laughed. “No, nothing that primitive. The AI already knows all of your players. In fact, it knows all professional players in the world. It has watched and analyzed every second of TV screening of any game around the world, every relevant online video, and everything written in local newspapers.”

Stoneward nodded. That sounded promising.

“Here comes a little complication, though. We have a protocol for using our AIs. The protocols are overcautious. Our AIs are still far away from human intelligence, but our Ethics and Safety boards insisted on implementing these protocols whenever we use some of the near-human intelligence systems. It is completely overblown, but we are basically preparing ourselves for the time we have actually intelligent systems, maybe even superhuman intelligent systems.”

“I am afraid I don’t understand.”

“Basically, instead of talking to the AI directly, we talk with them through an operator, or medium.”

“Talk to them? You simply talk with the AI? Like with Siri?”

Mirza scoffed. “Siri is just a set of hard coded scripts and triggers.”

Stoneward didn’t seem impressed by the rant.

“The medium talks with the AI, tries its best to understand it, and then relays the AI’s advice to us. The protocol is strict about not letting the AI interact with decision makers directly.”

“Why?”

“Ah, as said, it is just being overly cautious. The protocol is in place in case we ever develop a superhuman intelligence, in which case we want to ensure that the AI doesn’t have too much influence on actual decision makers. The fear is that a superhuman AI could possibly unduly influence the decision maker. But with the medium in between, we have a filter, a normal intelligence, so it won’t be able to invert the relationship between adviser and decision maker.”

Stoneward blinked. “Pardon me, but I didn’t entirely follow what you — ”

“It’s just a Science Fiction scenario, but in case the AI tries to gain control, the fear is that a superhuman intelligence could basically turn you into a mindless muppet. By putting a medium in between, well, even if the medium becomes enslaved, the medium can only use their own intelligence against you. And that will fail.”

The director took a sip of water, and was pondering what he just heard for a few moments. Denise Mirza was burning with frustration. Sometimes she forgets how it is to deal with people this slow. And this guy had more balls banged against his skull than is healthy, which isn’t expected to speed his brain up. After what felt like half an eternity, he nodded.

“Are you ready for me to call the medium in?”

“Yes.”

She tapped her phone.

“Wait, does this mean that these mediums are slaves to your AI?”

She rolled her eyes. “Let us not discuss this in front of the medium, but I can assure you that our systems have not yet reached the level to convince a four year old to give up a lollipop, never mind a grown up person to do anything. We can discuss this more afterwards. Oh, there he is!”

Stoneward looked up surprised.

It was an old acquaintance, Nigel Ramsay. Ramsay used to manage some smaller teams in Lancashire, where Stoneward grew up. Ramsay was more known for his passion than for his talents.

“I am surprised to see you here”

The medium smiled. “It was a great offer, and when I learned what we are aiming for, I was positively thrilled. If this works we are going to make history!”

They sat down. “So, what does the system recommend?”

“Well, it recommends to increase the pressure on the government for a second referendum on Brexit.”

Stoneward stared at Ramsay, stunned. “Pardon me?”

“It is quite clear that the Prime Minister is intentionally sabotaging any reasonable solution for Brexit, but is too afraid to yet call a second referendum. She has been a double agent for the remainers the whole time. Once it is clear how much of a disaster leaving the European Union would be, we should call for a second referendum, reversing the result of the first.”

“I… I am not sure I follow… I thought we are talking football?”

“Oh, but yes! We most certainly are. Being part of an invigorated European Union after Brexit gets cancelled, we should strongly support a stronger Union, even the founding of a proper state.”

Stoneward looked at Ramsay with exasperation. Mirza motioned with her hands, asking for patience.

“Then, when the national football associations merge, this will pave the way for a single, unified European team.”

“The associations… merge?”

“Yes, an EU-wide all stars team. Just imagine that. Also, most of the serious competition would already be wiped out. No German team, no French team, just one European team and — “

“This is ridiculous! Reversing Brexit? Just to get a single European team? Even if we did, a unified European team might kill any interest in international football.”

“Yeah, that is likely true, but our winning chances would go through the roof!”

“But even then, 96% winning chances?”

“Oh, yeah, I asked the same. So, that’s not all. We also need to cause a war between Argentina and Brazil, in order to get them disqualified. There are a number of ways to get to this — ”

“Stop! Stop right there.” Stoneward looked shocked, his hands raised like a goalie waiting for the penalty kick. “Look, this is ridiculous. We will not stop Brexit or cause a war between two countries just to win a game.”

The medium looked at Stoneward in surprise. “To ‘just’ win a game?” His eyes wandered to Mirza in support. “I thought this was the sole reason for our existence. What does he mean, ‘just’ win a game? He is a bloody director of the FA, and he doesn’t care to win?”

“Maybe we should listen to some of the other suggestions?”, the CEO asked, trying to soothe the tension in the room.

Stoneward was visibly agitated, but after a few moments, he nodded. “Please continue.”

“So even if we don’t merge the European associations due to Brexit, we should at least merge the English, Scottish, Welsh, and Northern Irish associations in — ”

“No, no, NO! Enough of this association merging nonsense. What else do you have?”

“Well, without mergers, and wars, we’re down to 44% probability to win both the European and World Cup within the next twenty years.” The medium sounded defeated.

“That’s OK, I’ll take that. Tell me more.” Stoneward has known that the probabilities given before were too good to be true. It was still a disappointment.

“England has some of the best schools in the world. We should use this asset to lure young talent to England, offer them scholarships in Oxford, in Cambridge.”

“But they wouldn’t be English? They can’t play for England.”

“We would need to make the path to citizenship easier for them, immigration laws should be more integrative for top talent. We need to give them the opportunity to become subjects of the Queen before they play their first international. And then offer them to play for England. There is so much talent out there, and if we can get them while they’re young, we could prep up our squad in just a few years.”

“Scholarships for Oxford? How much would that even cost?”

“20, 25 thousand per year and student? We can pay a hundred scholarships and it wouldn’t even show up in our budget.”

“We are cutting budgets left and right!”

“Since we’re not stopping Brexit, why not dip into those 350 million pounds per week that we will save.”

“That was a lie!”

“I was joking.”

“Well, the scholarship thing wasn’t bad. What else is on the table?”

“One idea was to hack the video stream and bribe the referee, and then we can safely gaslight everyone.”

“Next idea.”

“We could poison the other teams.”

“Just stop it.”

“Or give them substances that would mess up their drug tests.”

“Why not getting FIFA to change the rules so we always win?”

“Oh, we considered it, but given the existing corruption inside FIFA it seems that would be difficult to outbid.”

Stonward sighed. “Now I was joking.”

“One suggestion is to create a permanent national team, and have them play in the national league. So they would be constantly competing, playing with each other, be better used to each other. A proper team.”

“How would we even pay for the players?”

“It would be an honor to play for the national team. Also, it could be a new rule to require the best players to play in the national team.”

“I think we are done here. These suggestions were… rather interesting. But I think they were mostly unactionable.” He started standing up.

Mirza looked desperately from one to the other. This meeting did not go as she had intended. “I think we can acknowledge the breadth of the creative proposals that have been on the table today, and enjoy a tea before you leave?”, she said, forcing a smile.

Stoneward nodded politely. “We sure can appreciate the creativity.”

“Now imagine this creativity turned into strategies in the pitch. Tactical moves. Variations to set pieces.”, the medium started, his voice slightly shifting.

“Yes, well, that would certainly be more interesting than most of the suggestions so far.”

“Wouldn’t it? And not only that, but if we could talk to the players. If we could expand their own creativity. Their own willpower. Their focus. Their energy to power through, not to give up.”

“If you’re suggesting to give them drugs, I am out.”

Ramsay laughed. “No, not drugs. But a helmet that emits electromagnetic waves and allows the brain muscles to work in more interesting ways.”

Stoneward looked over to the CEO. “Is that a possibility?”

Mirza looked uncomfortable, but tried to hide it. “Yes, yes, it is. We had tested it a few times, and the results were quite astonishing. It is just not what I would have expected as a proposal.”

“Why? Anything wrong with that?”

“Well, we use it for our top engineers, to help them focus when developing and designing solutions. The results are nothing short of marvelous. It is just, I didn’t think football would benefit that much from improved focus.”

Stoneward chuckled, as he sat down again. “Yes, many people underestimate the role of a creative mind in the game. I think I would now like a tea.” He looked to Ramsay. “Tell me more.”

The medium smiled. The system will be satisfied with the outcome.

(Originally published July 28, 2018 on Medium)

Saturn the alligator

Today at work I learned about Saturn the alligator. Born to humble origins in 1936 in Mississippi, he moved to Berlin where he became acquainted with Hitler. After the bombing of the Berlin Zoo he wandered through the streets. British troops found him, gave him to the Soviets, where against all odds he survived a number of near death situations - among others he refused to eat for a year - and still lives today, in an enclosure sponsored by Lacoste.

I also went to Wikidata to improve the entry on Saturn. For that I needed to find the right property to express the connection between Saturn, and the Moscow Zoo, where he is held.

The following SPARQL query was helpful: https://w.wiki/7ga

It tells you which properties connect animals with zoos how often - and in the Query Helper UI it should be easy to change either types to figure out good candidates for the property you are looking for.

Wikidata reached a billion edits

As of today, Wikidata has reached a billion edits - 1,000,000,000.

This makes it the first Wikimedia project that has reached that number, and possibly the first wiki ever to have reached so many edits. Given that Wikidata was launched less than seven years ago, this means an average edit rate of 4-5 edits per second.

The billionth edit is the creation of an item for a 2006 physics article written in Chinese.

Congratulations to the community! This is a tremendous success.

In the beginning

"Let there be a planet with a hothouse effect, so that they can see what happens, as a warning."

"That is rather subtle, God", said the Archangel.

"Well, let it be the planet closest to them. That should do it. They're intelligent after all."

"If you say so."

Lion King 2019

Wow. The new version of the Lion King is technically brilliant, and story-wise mostly unnecessary (but see below for an exception). It is a mostly beat-for-beat retelling of the 1994 animated version. The graphics are breathtaking, and they show how far computer-generated imagery has come. For a measly million dollar per minute of film you can get a photorealistic animal movies. Because of the photorealism, it also loses some of the charm and the emotions that the animated version carried - in the original the animals were much more anthropomorphic, and the dancing was much more exaggerated, which the new version gave up. This is most noticeable in the song scene for "I can't wait to be king", which used to be a psychedelic, color shifted sequence with elephants and tapirs and giraffes stacked upon each other, replaced by a much more realistic sequence full of animals and fast cuts that simply looks amazing (I never was a big fan of the psychedelic music scenes that were so frequent in many animated movies, so I consider this a clear win).

I want to focus on the main change, and it is about Scar. I know the 1994 movie by heart, and Scar is its iconic villain, one of the villains that formed my understanding of a great villain. So why would the largest change be about Scar, changing him profoundly for this movie? How risky a choice in a movie that partly recreates whole sequences shot by shot?

There was one major criticism about Scar, and that is that he played with stereotypical tropes of gay grumpy men, frustrated, denied, uninterested in what the world is offering him, unable to take what he wants, effeminate, full of cliches.

That Scar is gone, replaced by a much more physically threatening scar, one that whose philosophy in life is that the strongest should take what they want. Chiwetel Ejiofor's voice for Scar is scary, threatening, strong, dominant, menacing. I am sure that some people won't like him, as the original Scar was also a brilliant villain, but this leads immediately to my big criticism of the original movie: if Scar was only half as effing intelligent as shown, why did he do such a miserable job in leading the Pride Lands? If he was so much smarter than Mufasa, why did the thriving Pride Lands turn into a wasteland, threatening the subsistence of Scar and his allies?

The answer in the original movie is clear: it's the absolutist identification of country and ruler. Mufasa was good, therefore the Pride Lands were doing well. When Scar takes over, they become a wasteland. When Simba takes over, in the next few shots, they start blooming again. Good people, good intentions, good outcomes. As simple as that.

The new movie changes that profoundly - and in a very smart way. The storytellers at Disney really know what they're doing! Instead of following the simple equation given above, they make it an explicit philosophical choice in leadership. This time around, the whole Circle of Life thing, is not just an Act One lesson, but is the major difference between Mufasa and Scar. Mufasa describes a great king as searching for what they can give. Scar is about might is right, and about the strongest taking whatever they want. This is why he overhunts and allows overhunting. This is why the Pride Lands become a wasteland. Now the decline of the Pride Lands make sense, and also why the return of Simba and his different style as a king would make a difference. The Circle of Life now became important for the whole movie, at the same time tying with the reinterpretation of Scar, and also explaining the difference in outcome.

You can probably tell, but I am quite amazed at this feat in storytelling. They took a beloved story and managed to improve it.

Unfortunately, the new Scar also means that the song Be Prepared doesn't really work as it used to, and thus the song also got shortened and very much changed in a movie that became much longer otherwise. I am not surprised, they even wanted to remove it, and now I understand why (even though back then I grumbled about it). They also removed the Leni Riefenstahl imaginary from the new version which was there in the original one, which I find regrettable, but obviously necessary given the rest of the movie.

A few minor notes.

The voice acting was a mixed bag. Beyonce was surprisingly bland (speaking, her singing was beautiful), and so was John Oliver (singing, his speaking was perfect). I just listened again to I can't wait to be king, and John Oliver just sounds so much less emotional than Rowan Atkinson. Pity.

Another beautiful scene was the scene were Rafiki receives the massage that Simba is still alive. In the original, this was a short transition of Simba ruffling up some flowers, and the wind takes them to Rafiki, he smells them, and realizes it is Simba. Now the scene is much more elaborate, funnier, and is reminiscent of Walt Disney's animal movies, which is a beautiful nod to the company founder. Simba's hair travels with the wind, birds, a Giraffe, an ant, and more, until it finally reaches the Shaman's home.

One of my best laughs was also due to another smart change: in Hakuna Matata, when they retell Pumbaa's story (with an incredibly cute little baby Pumbaa), Pumbaa laments that all his friends leaving him got him "unhearted, every time that he farted", and immediately complaining to Timon as to why he didn't stop him singing it - a play on the original's joke, where Timon interjects Pumbaa before he finishes the line with "Pumbaa! Not in front of the kids.", looking right at the camera and breaking the fourth wall.

Another great change was to give the Hyenas a bit more character - the interactions between the Hyena who wasn't much into personal space and the other who rather was, were really amusing. Unlike with the original version the differences in the looks of the Hyenas are harder to make out, and so giving them more personality is a great choice.

All in all, I really loved this version. Seeing it on the big screen pays off for the amazing imagery that really shines on a large canvas. I also love the original, and the original will always have a special place in my heart, but this is a wonderful tribute to a brilliant movie with an exceptional story.

210,000 year old human skull found in Europe

A Homo Sapiens skull that is 210,000 years old had been found in Greece, together with a Neanderthal skull from 175,000 years ago.

The oldest European Homo Sapiens remains known so far only date to 40,000 years ago.


Draft: Collaborating on the sum of all knowledge across languages

For the upcoming Wikipedia@20 book, I published my chapter draft. Comments are welcome on the pubpub Website until July 19.

Every language edition of Wikipedia is written independently of every other language edition. A contributor may consult an existing article in another language edition when writing a new article, or they might even use the Content Translation tool to help with translating one article to another language, but there is nothing that ensures that articles in different language editions are aligned or kept consistent with each other. This is often regarded as a contribution to knowledge diversity, since it allows every language edition to grow independently of all other language editions. So would creating a system that aligns the contents more closely with each other sacrifice that diversity?

Differences between Wikipedia language editions

Wikipedia is often described as a wonder of the modern age. There are more than 50 million articles in almost 300 languages. The goal of allowing everyone to share in the sum of all knowledge is achieved, right?

Not yet.

The knowledge in Wikipedia is unevenly distributed. Let’s take a look at where the first twenty years of editing Wikipedia have taken us.

The number of articles varies between the different language editions of Wikipedia: English, the largest edition, has more than 5.8 million articles, Cebuano — a language spoken in the Philippines — has 5.3 million articles, Swedish has 3.7 million articles, and German has 2.3 million articles. (Cebuano and Swedish have a large number of machine generated articles.) In fact, the top nine languages alone hold more than half of all articles across the Wikipedia language editions — and if you take the bottom half of all Wikipedias ranked by size, they together wouldn’t have 10% of the number of articles in the English Wikipedia.

It is not just the sheer number of articles that differ between editions, but their comprehensiveness does as well: the English Wikipedia article on Frankfurt has a length of 184,686 characters, a table of contents spanning 87 sections and subsections, 95 images, tables and graphs, and 92 references — whereas the Hausa Wikipedia article states that it is a city in the German state of Hesse, and lists its population and mayor. Hausa is a language spoken natively by 40 million people and as a second language by another 20 million.

It is not always the case that the large Wikipedia language editions have more content on a topic. Although readers often consider large Wikipedias to be more comprehensive, local Wikipedias may frequently have more content on topics of local interest: the English Wikipedia knows about the Port of Călărași that it is one of the largest Romanian river ports, located at the Danube near the town of Călărași — and that’s it. The Romanian Wikipedia on the other hand offers several paragraphs of content about the port.

The topics covered by the different Wikipedias also overlap less than one would initially assume. English Wikipedias has 5.8 million articles, German has 2.2 million articles — but only 1.1 million topics are covered by both Wikipedias. A full 1.1 million topics have an article in German — but not in English. The top ten Wikipedias by activity — each of them with more than a million articles — have articles on only hundred thousand topics in common. 18 million topics are covered by articles in the different language Wikipedias — and English only covers 31% of these.

Besides coverage, there is also the question of how up to date the different language editions are: in June 2018, San Francisco elected London Breed as its new mayor. Nine months later, in March 2019, I conducted an analysis of who the mayor of San Francisco was, according to the different language versions of Wikipedia. Of the 292 language editions, a full 165 had a Wikipedia article on San Francisco. Of these, 86 named the mayor. The good news is that not a single Wikipedia lists a wrong mayor — but the vast majority are out of date. English switched the minute London Breed was sworn in. But 62 Wikipedia language editions list an out-of-date mayor — and not just the previous mayor Ed Lee, who became mayor in 2011, but also often Gavin Newsom (2004-2011), and his predecessor, Willie Brown (1996-2004). The most out-of-date entry is to be found in the Cebuano Wikipedia, who names Dianne Feinstein as the mayor of San Francisco. She had that role after the assassination of Harvey Milk and George Moscone in 1978, and remained in that position for a decade in 1988 — Cebuano was more than thirty years out of date. Only 24 language editions had listed the current mayor, London Breed, out of the 86 who listed the name at all.

An even more important metric for the success of a Wikipedia are the number of contributors: English has more than 31,000 active contributors — three out of seven active Wikimedians are active on the English Wikipedia. German, the second most active Wikipedia community, already only has 5,500 active contributors. Only eleven language editions have more than a thousand active contributors — and more than half of all Wikipedias have fewer than ten active contributors. To assume that fewer than ten active contributors can write and maintain a comprehensive encyclopedia in their spare time is optimistic at best. These numbers basically doom the mission of the Wikimedia movement to realize a world where everyone can contribute to the sum of all knowledge.

Enter Wikidata

Wikidata was launched in 2012 and offers a free, collaborative, multilingual, secondary database, collecting structured data to provide support for Wikipedia, Wikimedia Commons, the other wikis of the Wikimedia movement, and to anyone in the world. Wikidata contains structured information in the form of simple claims, such as “San Francisco — Mayor — London Breed”, qualifiers, such as “since — July 11, 2018”, and references for these claims, e.g. a link to the official election results as published by the city.

One of these structured claims would be on the Wikidata page about San Francisco and state the mayor, as discussed earlier. The individual Wikipedias can then query Wikidata for the current mayor. Of the 24 Wikipedias that named the current mayor, eight were current because they were querying Wikidata. I hope to see that number go up. Using Wikidata more extensively can, in the long run, allow for more comprehensive, current, and accessible content while decreasing the maintenance load for contributors.

Wikidata was developed in the spirit of the Wikipedia’s increasing drive to add structure to Wikipedia’s articles. Examples of this include the introduction of infoboxes as early as 2002, a quick tabular overview of facts about the topic of the article, and categories in 2004. Over the year, the structured features became increasingly intricate: infoboxes moved to templates, templates started using more sophisticated MediaWiki functions, and then later demanded the development of even more powerful MediaWiki features. In order to maintain the structured data, bots were created, software agents that could read content from Wikipedia or other sources and then perform automatic updates to other parts of Wikipedia. Before the introduction of Wikidata, bots keeping the language links between the different Wikipedias in sync, easily contributed 50% and more of all edits.

Wikidata allowed for an outlet to many of these activities, and relieved the Wikipedias of having to run bots to keep language links in sync or of massive infobox maintenance tasks. But one lesson I learned from these activities is that I can trust the communities with mastering complex workflows spread out between community members with different capabilities: in fact, a small number of contributors working on intricate template code and developing bots can provide invaluable support to contributors who more focus on maintaining articles and contributors who write large swaths of prose. The community is very heterogeneous, and the different capabilities and backgrounds complement each other in order to create Wikipedia.

However, Wikidata’s structured claims are of a limited expressivity: their subject always must be the topic of the page, every object of a statement must exist as its own item and thus page in Wikidata. If it doesn’t fit in the rigid data model of Wikidata, it simply cannot be captured in Wikidata — and if it cannot be captured in Wikidata, it cannot be made accessible to the Wikipedias.

For example, let’s take a look at the following two sentences from the English Wikipedia article on Ontario, California:

“To impress visitors and potential settlers with the abundance of water in Ontario, a fountain was placed at the Southern Pacific railway station. It was turned on when passenger trains were approaching and frugally turned off again after their departure.”

There is no feasible way to express the content of these two sentences in Wikidata - the simple claim and qualifier structure that Wikidata supports can not capture the subtle situation that is described here.

An Abstract Wikipedia

I suggest that the Wikimedia movement develop an Abstract Wikipedia, a Wikipedia in which the actual textual content is being represented in a language-independent manner. This is an ambitious goal — it requires us to push the current limits of knowledge representation, natural language generation, and collaborative knowledge construction by a significant amount: an Abstract Wikipedia must allow for:

  1. relations that connect more than just two participants with heterogeneous roles.
  2. composition of items on the fly from values and other items.
  3. expressing knowledge about arbitrary subjects, not just the topic of the page.
  4. ordering content, to be able to represent a narrative structure.
  5. expressing redundant information.

Let us explore one of these requirements, the last one: unlike the sentences of a declarative formal knowledge base, human language is usually highly redundant. Formal knowledge bases usually try to avoid redundancy, for good reasons. But in a natural language text, redundancy happens frequently. One example is the following sentence:

“Marie Curie is the only person who received two Nobel Prizes in two different sciences.”

The sentence is redundant given a list of Nobel Prize award winners and their respective disciplines they have been awarded to — a list that basically every large Wikipedia will contain. But the content of the given sentence nevertheless appears in many of the different language articles on Marie Curie, and usually right in the first paragraph. So there is obviously something very interesting in this sentence, even though the knowledge expressed in this sentence is already fully contained in most of the Wikipedias it appears in. This form of redundancy is common place in natural language — but is usually avoided in formal knowledge bases.

The technical details of the Abstract Wikipedia proposal are presented in (Vrandečić, 2018). But the technical architecture is only half of the story. Much more important is the question whether the communities can meet the challenges of this project?

Wikipedia and Wikidata have shown that the communities are capable to meet difficult challenges: be it templates in Wikipedia, or constraints in Wikidata, the communities have shown that they can drive comprehensive policy and workflow changes as well as the necessary technological feature development. Not everyone needs to understand the whole stack in order to make a feature such as templates a crucial part of Wikipedia.

The Abstract Wikipedia is an ambitious future project. I believe that this is the only way for the Wikimedia movement to achieve its goal, short of developing an AI that will make the writing of a comprehensive encyclopedia obsolete anyway.

A plea for knowledge diversity?

When presenting the idea of the Abstract Wikipedia, the first question is usually: will this not massively reduce the knowledge diversity of Wikipedia? By unifying the content between the different language editions, does this not force a single point of view on all languages? Is the Abstract Wikipedia taking away the ability of minority language speakers to maintain their own encyclopedias, to have a space where, for example, indigenous speakers can foster and grow their own point of view, without being forced to unify under the western US-dominated perspective?

I am sympathetic with the intent of this question. The goal of this question is to ensure that a rich diversity in knowledge is retained, and to make sure that minority groups have spaces in which they can express themselves and keep their knowledge alive. These are, in my opinion, valuable goals.

The assumption that an Abstract Wikipedia, from which any of the individual language Wikipedias can draw content from, will necessarily reduce this diversity, is false. In fact, I believe that access to more knowledge and to more perspectives is crucial to achieve an effective knowledge diversity, and that the currently perceived knowledge diversity in different language projects is ineffective at best, and harmful at worst. In the rest of this essay I will argue why this is the case.

Language does not align with culture

First, it is wrong to use language as the dimension along which to draw the demarcation line between different content if the Wikimedia movement truly believes that different groups should be able to grow and maintain their own encyclopedias.

In case the Wikimedia movement truly believes that different groups or cultures should have their own Wikipedias, why is there only a single Wikipedia language edition for the English speakers from India, England, Scotland, Australia, the United States, and South Africa? Why is there only one Wikipedia for Brazil and Portugal, leading to much strife? Why are there no two Wikipedias for US Democrats and Republicans?

The conclusion is that the Wikimedia movement does not believe that language is the right dimension to split knowledge — it is a historical decision, driven by convenience. The core Wikipedia policies, vision, and mission are all geared towards enabling access to the sum of all knowledge to every single reader, no matter what their language, and not toward capturing all knowledge and then subdividing it for consumption based on the languages the reader is comfortable in.

The split along languages leads to the problem that it is much easier for a small language community to go “off the rails” — to either, as a whole, become heavily biased, or to adopt rules and processes which are problematic. The fact that the larger communities have different rules, processes, and outcomes can be beneficial for Wikipedia as a whole, since they can experiment with different rules and approaches. But this does not seem to hold true when the communities drop under a certain size and activity level, when there are not enough eyeballs to avoid the development of bad outcomes and traditions. For one example, the article about skirts in the Bavarian Wikipedia features three upskirt pictures, one porn actress, an anime screenshot, and a video showing a drawing of a woman with a skirt getting continuously shorter. The article became like this within a day or two of its creation, and, even though it has been edited by a dozen different accounts, has remained like this over the last seven years. (This describes the state of the article in April 2019 — I hope that with the publication of this essay, the article will finally be cleaned up).

A look on some south Slavic language Wikipedias

Second, a natural experiment is going on, where contributors that are more separated by politics than language differences have separate Wikipedias: there exist individual Wikipedia language editions for Croatian, Serbian, Bosnian, and Serbocroatian. Linguistically, the differences between the dialects of Croatian are often larger than the differences between standard Croatian and standard Serbian. Particularly the existence of the Serbocroatian Wikipedia poses interesting questions about these delineations.

Particularly the Croatian Wikipedia has turned to a point of view that has been described as problematic. Certain events and Croat actors during the 1990s independence wars or the 1940s fascist puppet state might be represented more favorably than in most other Wikipedias.

Here are two observations based on my work on south Slavic language Wikipedias:

First, claiming that a more fascist-friendly point of view within a Wikipedia increases the knowledge diversity across all Wikipedias might be technically true, but is practically insufficient. Being able to benefit from this diversity requires the reader to not only be comfortable reading several different languages, but also to engage deeply enough and spend the time and interest to actually read the article in different languages, which is mostly a profoundly boring exercise, since a lot of the content will be overlapping. Finding the juicy differences is anything but easy, especially considering that most readers are reading Wikipedia from mobile devices, and are just looking to satisfy a quick information need from a source whose curation they trust.

Most readers will only read a single language version of an article, and thus any diversity that exists across different language editions is practically lost. The sheer existence of this diversity might even be counterproductive, as one may argue that the communities should not spend resources on reflecting the true diversity of a topic within each individual language. This would cement the practical uselessness of the knowledge diversity across languages.

Second, many of the same contributors that write the articles with a certain point of view in the Croatian Wikipedia, also contribute on the English Wikipedia on the articles about the same topics — but there they suddenly are forced and able to compromise and incorporate a much wider variety of points of view. One might hope the contributors would take the more diverse points of view and migrate them back to their home Wikipedias — but that is often not the case. If contributors harbor a certain point of view (and who doesn’t?) it often leads to a situation where they push that point of view as much as they can get away with in each of the projects.

It has to be noted that the most blatant digressions from a neutral point of view in Wikipedias like the Croatian Wikipedia will not be found in the most central articles, but in the large periphery of articles surrounding these central articles which are much harder to keep an eye on.

Abstract Wikipedia and Knowledge diversity

The Abstract Wikipedia proposal does not require any of the individual language editions to use it. Each language community can decide for each article whether to fall back on the Abstract Wikipedia or whether to create their own article in their language. And even that decision can be more fine grained: a contributor can decide for an individual article to incorporate sections or paragraphs from the Abstract Wikipedia.

This allows the individual Wikipedia communities the luxury to entirely concentrate on the differences that are relevant to them. I distinctly remember that when I started the Croatian Wikipedia: it felt like I had the burden to first write an article about every country in the world before I could write the articles I cared about, such as my mother’s home village — because how could anyone defend a general purpose encyclopedia that might not even have an article on Nigeria, a country with a population of a hundred million, but one on Donji Humac, a village with a population of 157? Wouldn’t you first need an article on all of the chemical elements that make up the world before you can write about a local food?

The Abstract Wikipedia frees a language edition from this burden, and allows each community to entirely focus on the parts they care about most — and to simply import the articles from the common source for the topics that are less in their focus. It allows the community to make these decisions. As the communities grow and shift, they can revisit these decisions at any time and adapt them.

At the same time, the Abstract Wikipedia makes these differences more visible since they become explicit. Right now there is no easy way to say whether the fact that Dianne Feinstein is listed as the Mayor of San Francisco in the Cebuano Wikipedia is due to cultural particularities of the Cebuano language communities or not. Are the different population numbers of Frankfurt in the different language editions intentional expressions of knowledge diversity? With an Abstract Wikipedia, the individual communities could explicitly choose which articles to create and maintain on their own, and at the same time remove a lot of unintentional differences.

By making these decisions more explicit, it becomes possible to imagine an effective workflow that observes these intentional differences, and sets up a path to integrate them into the common article in the Abstract Wikipedia. Right now, there are 166 different language versions of the article on the chemical element Helium — it is basically impossible for a single person to go through all of them and find the content that is intentionally different between them. With an Abstract Wikipedia, which contains the common shared knowledge, contributors, researchers, and readers can actually take a look at those articles that intentionally have content that replaces or adds to the commonly shared one, assess these differences, and see if contributors should integrate the differences in the shared article.

The differences in content may be reflecting difference in policies, particularly in policies of notability and reliability. Whereas on first glance it might seem that the Abstract Wikipedia might require unified notability and reliability requirements across all Wikipedias, this is not the case: due to the fact that local Wikipedias can overlay and suppress content from the Abstract Wikipedias, they can adjust their Wikipedias based on their own rules. And the increased visibility of such decisions will lead to easier identify biases, and hopefully also to updated rules to reduce said bias.

A new incentive infrastructure

The Abstract Wikipedia will evolve the incentive infrastructure of Wikipedia.

Presently, many underrepresented languages are spoken in areas that are multilingual. Often another language spoken in this area is regarded as a high-prestige language, and is thus the language of education and literature, whereas the underrepresented language is a low-prestige language. So even though the low-prestige language might have more speakers, the most likely recruits for the Wikipedia communities, people with education who can afford internet access and have enough free time, will be able to contribute in both languages.

In which language should I contribute? If I write the article about my mother’s home town in Croatian, I make it accessible to a few million people. If I write the article about my mother’s home town in English, it becomes accessible to more than a hundred times as many people! The work might be the same, but the perceived benefit is orders of magnitude higher: the question becomes, do I teach the world about a local tradition, or do I tell my own people about their tradition? The world is bigger, and thus more likely to react, creating a positive feedback loop.

This cannibalizes the communities for local languages by diverting them to the English Wikipedia, which is perceived as the global knowledge community (or to other high-prestige languages, such as Russian or French). This is also reflected in a lot of articles in the press and in academic works about Wikipedia, where the English Wikipedia is being understood as the Wikipedia. Whereas it is known that Wikipedia exists in many other languages, journalists and researchers are, often unintentionally, regarding the English Wikipedia as the One True Wikipedia.

Another strong impediment to recruiting contributors to smaller Wikipedia communities is rarely explicitly called out: it is pretty clear that, given the current architecture, these Wikipedias are doomed in achieving their mission. As discussed above, more than half of all Wikipedia language editions have fewer than ten active contributors — and writing a comprehensive, up-to-date Wikipedia is not an achievable goal with so few people writing in their free time. The translation tools offered by the Wikimedia Foundation can considerably help within certain circumstances — but for most of the Wikipedia languages, automatic translation models don’t exist and thus cannot help the languages which would need it the most.

With the Abstract Wikipedia though, the goal of providing a comprehensive and current encyclopedia in almost any language becomes much more tangible: instead of taking on the task of creating and maintaining the entire content, only the grammatical and lexical knowledge of a given language needs to be created. This is a far smaller task. Furthermore, this grammatical and lexical knowledge is comparably static — it does not change as much as the encyclopedic content of Wikipedia, thus turning a task that is huge and ongoing into one where the content will grow and be maintained without the need of too much maintenance by the individual language communities.

Yes, the Abstract Wikipedia will require more and different capabilities from a community that has yet to be found, and the challenges will be both novel and big. But the communities of the many Wikimedia projects have repeatedly shown that they can meet complex challenges with ingenious combinations of processes and technological advancements. Wikipedia and Wikidata have both demonstrated the ability to draw on technologically rather simple canvasses, and create extraordinary rich and complex masterpieces, which stand the test of time. The Abstract Wikipedia aims to challenge the communities once again, and the promise this time is nothing else but to finally be able to reap the ultimate goal: to allow every one, no matter what their native language is, to share in the sum of all knowledge.

Acknowledgements

Thanks to the valuable suggestions on improving the article to Jamie Taylor, Daniel Russell, Joseph Reagle, Stephen LaPorte, and Jake Orlowitz.

Bibliography

  • Bao, Patti, Brent J. Hecht, Samuel Carton, Mahmood Quaderi, Michael S. Horn and Darren Gergle. “Omnipedia: bridging the wikipedia language gap.” in Proceedings of the Conference on Human Factors in Computing Systems (CHI 2012), edited by Joseph A. Konstan, Ed H. Chi, and Kristina Höök. Austin: Association for Computing Machinery, 2012: 1075-1084.
  • Eco, Umberto. The Search for the Perfect Language (the Making of Europe). La ricerca della lingua perfetta nella cultura europea. Translated by James Fentress. Oxford: Blackwell, 1995 (1993).
  • Graham, Mark. “The Problem With Wikidata.” The Atlantic, April 6, 2012. https://www.theatlantic.com/technology/archive/2012/04/the-problem-with-wikidata/255564/
  • Hoffmann, Thomas and Graeme Trousdale, “Construction Grammar: Introduction”. In The Oxford Handbook of Construction Grammar, edited by Thomas Hoffmann and Graeme Trousdale, 1-14. Oxford: Oxford University Press, 2013.
  • Kaffee, Lucie-Aimée, Hady ElSahar, Pavlos Vougiouklis, Christophe Gravier, Frédérique Laforest, Jonathon S. Hare and Elena Simperl. “Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for Article Placeholders.” in Proceedings of the 15th European Semantic Web Conference (ESWC 2018), edited by Aldo Gangemi, Roberto Navigli, Marie-Esther Vidal, Pascal Hitzler, Raphaël Troncy, Laura Hollink, Anna Tordai, and Mehwish Alam. Heraklion: Springer, 2018: 319-334.
  • Kaffee, Lucie-Aimée, Hady ElSahar, Pavlos Vougiouklis, Christophe Gravier, Frédérique Laforest, Jonathon S. Hare and Elena Simperl. “Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata.” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2, edited by Marilyn Walker, Heng Ji, and Amanda Stent. New Orleans: ACL Anthology, 2018: 640-645.
  • Schindler, Mathias and Denny Vrandečić. “Introducing new features to Wikipedia: Case studies for Web Science.” IEEE Intelligent Systems 26, no. 1 (January-February 2011): 56-61.
  • Vrandečić, Denny. “Restricting the World.” Wikimedia Deutschland Blog. February 22, 2013. https://blog.wikimedia.de/2013/02/22/restricting-the-world/
  • Vrandečić, Denny and Markus Krötzsch. “Wikidata: A Free Collaborative Knowledgebase.” Communications of the ACM 57, no. 10 (October 2014): 78-85. DOI 10.1145/2629489.
  • Kaljurand, Kaarel and Tobias Kuhn. “A Multilingual Semantic Wiki Based on Attempto Controlled English and Grammatical Framework.” in Proceedings of the 10th European Semantic Web Conference (ESWC 2013), edited by Philipp Cimiano, Oscar Corcho, Valentina Presutti, Laura Hollink, and Sebastian Rudolph. Montpellier: Springer, 2013: 427-441.
  • Milekić, Sven. “Croatian-language Wikipedia: when the extreme right rewrites history.” Osservatorio Balcani e Caucaso, September 27, 2018. https://www.balcanicaucaso.org/eng/Areas/Croatia/Croatian-language-Wikipedia-when-the-extreme-right-rewrites-history-190081
  • Ranta, Aarne. Grammatical Framework: Programming with Multilingual Grammars. Stanford: CSLI Publications, 2011.
  • Vrandečić, Denny. “Towards a multilingual Wikipedia,” in Proceedings of the 31st International Workshop on Description Logics (DL 2018), edited by Magdalena Ortiz and Thomas Schneider. Phoenix: Ceur-WS, 2018.
  • Wierzbicka, Anna. Semantics: Primes and Universals. Oxford: Oxford University Press, 1996.
  • Wikidata Community: “Lexicographical data.” Accessed June 1, 2019. https://www.wikidata.org/wiki/Wikidata:Lexicographical_data
  • Wulczyn, Ellery, Robert West, Leila Zia and Jure Leskovec. “Growing Wikipedia Across Languages via Recommendation.” in Proceedings of the 25th International World-Wide Web Conference (WWW 2016), edited by Jaqueline Bourdeau, Jim Hendler, Roger Nkambou, Ian Horrocks, and Ben Y. Zhao. Montréal: IW3C2, 2016: 975-985.

Toy Story 4

Toy Story 4 was great fun!

Toy Story 3 had a great closure (and a lot of tears), so would, what could they do to justify a fourth part? They developed the characters further than ever before. Woody is faced with a lot of decisions, and he has to grow in order to say an even bigger good-bye than last time.

Interesting fact: PETA protested the movie because Bo Peep uses a shepherd's crook, and those are considered a "symbol of domination over animals."

Bo Peep was a pretty cool character in the movie. And she used her crook well.

The cast was amazing: besides the many who kept their roles (Tom Hanks, Tim Allen, Annie Potts, Joan Cusack, Timothy Dalton, even keeping Don Rickles from archive footage after his death, and everyone else) many new voices (Betty White, Mel Brooks, Christina Hendricks, Keanu Reeves, Bill Hader, Tony Hale, Key and Peele, and Flea from the Red Hot Chili Peppers).

The end of civilization?

This might be controversial with some of my friends, but no, there is no high likelihood of human civilization ending within the next 30 years.

Yes, climate change is happening, and we're obviously not reacting fast and effective enough. But that won't kill humanity, and it will not end civilization.

Some highly populated areas might become uninhabitable. No question about this. Whole countries in southern Asia, central and South America, in Africa, might become too hot and too humid or too dry for human living. This would lead to hundreds of millions, maybe billions of people, who will want to move, to save their lives and the lives of their loved ones. Many, many people would die in these migrations.

The migration pressures on the countries that are climatically better off may become enormous, and it will either lead to massive bloodshed or to enormous demographic changes, or, most likely, both.

But look at the map. There are large areas in northern Asia and North America that would dramatically improve their habitability for humans if they would warm a bit. Large areas could become viable for growing wheat, fruits, corn.

As it is already today, and as it was for most of human history, we produce enough food and clean water and shelter and energy for everyone. The problem is not production, it is and will always be distribution. Facing huge upheaval and massive migration the distribution channels will likely break down and become even more ineffective. The disruption of the distribution network will likely also endanger seemingly stable states, and places that thought to pass the events unscathed will be hurt by that breakdown. The fact that there would be enough food will make the humanitarian catastrophes even more maddening.

Money will make it possible to shelter away from the most severe effects, no matter where you start now. It's the poor that will bear the brunt of the negative effects. I don't think that's surprising to anyone.

But even if almost none of today's countries might survive as they are, and if a few billion people die, the chances of humanity to end, of civilization to end, are negligible. Billions will survive into the 21st century, and will carry on history.

So, yes, the changes might be massive and in some areas catastrophic. But humanity and civilization will preserve.

Why this post? I don't think it is responsible to exaggerate the bad predictions too much. It makes the predictions less believable. Also, to have a sober look at the possible changes may make it easier to understand why some countries react as they do. Does this mean we don't need to react and try to reduce climate change? If that's your conclusion, you haven't read carefully along. I said something about possibly billions becoming displaced.

IFLScience: New Report Warns "High Likelihood Of Human Civilization Coming To An End" Within 30 Years

Web Conference 2019

25 May 2019

Last week saw the latest incarnation of the Web Conference (previously known as WWW or dubdubdub), going from May 15 to 17 (with satellite events the two days before). When I was still in academia, WWW was one of the most prestigious conference series for my research area, so when it came to be held literally across the street from my office, I couldn’t resist going to it.

The conference featured two keynotes (the third, by Lawrence Lessig, was cancelled on short notice due to a family emergency):

Watch the talks on YouTube on the links given above. Thanks to Marco Neumann for pointing to the links!

The conference was attended by more than 1,400 people (closer to 1,600?), making it the second largest since its inception (trailing only Lyon from last year), and about double the size than it used to be only four or five years ago. The conference dinner in the Exploratorium was relaxed and enjoyable. Acceptance rate was at 18%, which made for 225 accepted full papers.

The proceedings are available for free (yay!), so browse them for papers you find interesting. Personally, I really enjoyed the papers that looked into the use of WhatsApp to spread misinformation before the Brazil election, Dataset Search, and pre-empting SPARQL queries from blocking the endpoint. The proceedings span 5,047 pages, and are available online.

I had the feeling that Machine Learning was taking much more space in the program than it used to when I used to attend the conference regularly - which is fine, but many of the ML papers were only tenuously connected to the Web (which was the same criticism that we raised against many of the Semantic Web / Description Logic papers back then).

Thanks to the general chairs for organizing the conference, Leila Zia and Ricardo Baeza-Yates, and thanks to the sponsors, particularly Microsoft, Bloomberg, Amazon, and Google.

The two workshops I attended before the Web Conference were the Knowledge Graph Technology and Applications 2019 workshop on Monday, and the Wiki workshop 2019 on Tuesday. They have their own trip reports.

If you have trip reports, let me know and I will link to them.

Wiki workshop 2019

24 May 2019

Last week, May 14, saw the fifth incarnation of the Wiki workshop, co-located with the Web Conference (formerly known as dubdubdub), in San Francisco. The room was tight and very full - I am bad at estimating, but I guess 80-110 people were there.

I was honored to be invited to give the opening talk, and since I had a bit more time than in the last few talks, I really indulged in sketching out the proposal for the Abstract Wikipedia, providing plenty of figures and use cases. The response was phenomenal, and there were plenty of questions not only after the talk but also throughout the day and in the next few days. In fact, the Open Discussion slot was very much dominated by more questions about the proposal. I found that extremely encouraging. Some of the comments were immediately incorporated into a paper I am writing right now and that will be available for public reviews soon.

The other presentations - both the invited and the accepted ones - were super interesting.

Thanks to Dario Taraborelli, Bob West, and Miriam Redi for organizing the workshop.

A little extra was that I smuggled my brother and his wife into the workshop for my talk (they are visiting, and they have never been to one of my talks before). It was certainly interesting to hear their reactions afterwards - if you have non-academic relatives, you might underestimate how much they may enjoy such an event as mere spectators. I certainly did.

See also the #wikiworkshop2019 tag on Twitter.

Knowledge Graph Technology and Applications 2019

23 May 2019

Last week, on May 13, the Knowledge Graph Technology and Applications workshop happened, co-located with the Web Conference 2019 (formerly known as WWW), in San Francisco. I was invited to give the opening talk, and talked about the limits of Knowledge Graph technologies when trying to express knowledge. The talk resonated well.

Just like in last week's KGC, the breadth of KG users is impressive: NASA uses KGs to support air traffic management, Uber talks about the potential for their massive virtual KG over 200,000 schemas, LinkedIn, Alibaba, IBM, Genentech, etc. I found particularly interesting that Microsoft has not one, but at least four large Knowledge Graphs: the generic Knowledge Graph Satori; an Academic Graph for science, papers, citations; the Enterprise Graph (mostly LinkedIn), with companies, positions, schools, employees and executives; and the Work graph about documents, conference rooms, meetings, etc. All in all, they boasted more than a trillion triples (why is it not a single graph? No idea).

Unlike last week, the focus was less on sharing experiences when working with Knowledge Graphs, but more on academic work, such as query answering, mixing embeddings with KGs, scaling, mapping ontologies, etc. Given that it is co-located with the Web Conference, this seems unsurprising.

One interesting point that was raised was the question of common sense: can we, and how can we use a knowledge graph to represent common sense? How can we say that a box of chocolate may fit in the trunk of a car, but a piano would not? Are KGs the right representation for that? The question remained unanswered, but lingered through the panel and some QnA sessions.

The workshop was very well visited - it got the second largest room of the day, and the room didn’t feel empty, but I have a hard time estimating how many people where there (about 100-150?). The audience was engaged.

The connection with the Web was often rather tenuous, unless one thinks of KGs as inherently associated with the Web (maybe because they often could use Semantic Web standards? But also often they don’t). On the other side it is a good outlet within the Web Conference for the Semantic Web crowd and to make them mingle more with the KG crowd, I did see a few people brought together into a room that often have been separated, and I was able to point a few academic researchers to enterprise employees that would benefit from each other.

Thanks to Ying Ding from the Indiana University and the other organizers for organizing the workshop, and for all the discussion and insights it generated!

Update: corrected that Uber talked about the potential of their knowledge graph, not about their realized knowledge graph. Thanks to Joshua Shivanier for the correction! Also added a paragraph on common sense.

Knowledge Graph Conference 2019, Day 1

On Tuesday, May 7, began the first Knowledge Graph Conference. Organized by François Scharffe and his colleagues at Columbia University, it was located in New York City. The conference goes for two days, and aims at a much more industry-oriented crowd than conferences such as ISWC. And it reflected very prominently in the speaker line-up: especially finance was very well represented (no surprise, with Wall Street being just downtown).

Speakers and participants from Goldman Sachs, Capital One, Wells Fargo, Mastercard, Bank of America, and others were in the room, but also from companies in other industries, such as Astra Zeneca, Amazon, Uber, or AirBnB. The speakers and participants were rather open about their work, often listing numbers of triples and entities (which really is a weird metric to cite, but since it is readily available it is often expected to be stated), and these were usually in the billions. More interesting than the sheer size of their respective KGs were their use cases, and particularly in finance it was often ensuring compliance to insider trading rules and similar regulations.

I presented Wikidata and the idea of an Abstract Wikipedia as going beyond what a Knowledge Graph can easily express. I had the feeling the presentation was well received - it was obvious that many people in the audience were already fully aware of Wikidata and are actively using it or planning to use it. For others, particularly the SPARQL endpoint with its powerful visualization capabilities and the federated queries, and the external identifiers in Wikidata, and the approach to references for the claims in Wikidata were perceived as highlights. The proposal of an Abstract Wikipedia was very warmly received, and it was the first time no one called it out as a crazy idea. I guess the audience was very friendly, despite New York's reputation.

A second set of speakers were offering technologies and services - and I guess I belong to this second set by speaking about Wikidata - and among them were people like Juan Sequeda of Capsenta, who gave an extremely engaging and well-substantiated talk on how to bridge the chasm towards more KG adoption; Pierre Haren of Causality Link, who offered an interesting personal history through KR land from LISP to Causal Graphs; Dieter Fensel of OnLim, who had a a number of really good points on the relation between intelligent assistants and their dialogue systems and KGs; Neo4J, Eccenca, Diffbot.

A highlight for me was the astute and frequent observation by a number of the speakers from the first set that the most challenging problems with Knowledge Graphs were rarely technical. I guess graph serving systems and cloud infrastructure have improved so much that we don't have to worry about these parts anymore unless you are doing crazy big graphs. The most frequently mentioned problems were social and organizational. Since Knowledge Graphs often pulled data sources from many different parts of an organization together, with a common semantics, they trigger feelings of territoriality. Who gets to define the common ontology? What if the data a team provides has problems or is used carelessly, who's at fault? What if others benefit from our data more than we did even though we put all the effort in to clean it up? How do we get recognized for our work? Organizational questions were often about a lack of understanding, especially among engineers, for fundamental Knowledge Graph principles, and a lack of enthusiasm in the management chain - especially when the costs are being estimated and the social problems mentioned before become apparent. One particularly visible moment was when Bethany Sehon from Capital One was asked about the major challenges to standardizing vocabularies - and her first answer was basically "egos".

All speakers talked about the huge benefits they reaped from using Knowledge Graphs (such as detecting likely cliques of potential insider trading that later indeed got convicted) - but then again, this is to be expected since conference participation is self-selecting, and we wouldn't hear of failures in such a setting.

I had a great day at the inaugural Knowledge Graph Conference, and am sad that I have to miss the second day. Thanks to François Scharffe for organizing the conference, and thanks to the sponsors, OntoText, Collibra, and TigerGraph.

For more, see:

Golden

I'd say that Golden might be the most interesting competitor to Wikipedia I've seen in a while (which really doesn't mean that much, it's just the others have been really terrible).

This one also has a few red flags:

  • closed source, as far as I can tell
  • aiming for ten billion topics in their first announcement, but lacking an article on Germany
  • obviously not understanding what the point of notability policies are, and no, it is not about server space

They also have a features that, if they work, should be looked at and copied by Wikipedia - such as the editing assistants and some of the social features that are built-in into the platform.

Predictions:

  1. they will make a splash or two, and have corresponding news cycles to it
  2. they will, at some point, make an effort to import or transclude Wikipedia content
  3. they will never make a dent in Wikipedia readership, and will say that they wouldn't want to anyway because they love Wikipedia (which I believe)
  4. they will make a press release of donating all their content to Wikipedia (even though that's already possible thanks to their license)
  5. and then, being a for-profit company, they will pivot to something else within a year or two.

May 2019 talks

I am honored to give the following three invited talks in the next few weeks:

The topics will all be on Wikidata, how the Wikipedias use it, and the Abstract Wikipedia idea.

AI and role playing

An article about AI and role playing games, and thus in the perfect intersection of my interest.

But the article is entirely devoid of any interesting content, and basically boils down to asking the question "could RPGs be a Turing test for AI?"

I mean, the answer is so painfully obviously "yes" that no one ever bothered to write it down. I mean, Turing wrote the test as a role playing game basically!

Papaphobia

In a little knowledge engineering exercise, I was trying to add the causes of a phobia to the respective Wikidata items. There are currently about 160 phobias in Wikidata, and only a few listed in a structured way what they are afraid of. So I was going through them, trying to capture it in s a structured way. Here's a list of the current state:

Now, one of those phobias was the Papaphobia - the fear of the pope. Now, is that really a thing? I don't know. CDC does not seem to have an entry on it. On the Web, in the meantime, some pages have obviously taken to mining lists of phobias and creating advertising pages that "help" you with Papaphobia - such as this one:

This page is likely entirely auto-generated. I doubt it that they have "clients for papaphobia in 70+ countries", whom they helped "in complete discretion" within a single day! "People with severe fears and phobias like papaphobia (which is in fact the formal diagnostic term for papaphobia) are held prisoners by their phobias."

This site offers more, uhm, useful information.

"Group psychotherapy can also help where individuals share their experience and, in the process, understand and recover from their phobia." Really? There are enough cases that we can even set up a group therapy?

Now, maybe I am entirely off here - maybe, papaphobia is really a thing. With search in Scholar I couldn't find any medical sources (the term is mentioned in a number of sociological and historical works, to express general sentiments in a population or government against the authority of the pope, but I could not find any mentions of it in actual medical literature).

Now could those pages up there be benign cases of jokes? Or are they trying to scam people with promises to heal their actual fears, and they just didn't curate the list of fears sufficiently, because, really, you wouldn't find this page unless you actually search for this term?

And now what? Now what if we know these pages are made by scammers? Do we report them to the police? Do we send a tip to journalists? Or should we just do nothing, allowing them to scam people with actual fears? Well, by publishing this text, maybe I'll get a few people warned, but it won't reach the people it has to reach at the right time, unfortunately.

Also, was it always so hard to figure out what is real and what is not? Does papaphobia exist? Such a simple question. How should we deal with it on Wikidata? How many cases are there, if it exists? Did it get worse for people with papaphobia now that we have two people living who have been made pope?

My assumption now is that someone was basically working on a corpus, looking for words ending in -phobia, in order to generate a list of phobias. And then the term papaphobia from sociological and historical literature popped up, and it landed in some list, and was repeated in other places, etc., also because it is kind of a funny idea, and so a mixture of bad research and joking bubbled through, and rolled around on the Web for so long that it looks like it is actually a thing, to the point that there are now organizations who will gladly take your money (CTRN is not the only one) to treat you for papaphobia.

The world is weird.

An indigenous library

Great story about an indigenous library using their own categorization system instead of the Dewey Decimal System (which really doesn't work for indigenous topics - I mean it doesn't really work for the modern world as well, but that's another story).

What I am wondering though if if they're not going far enough. Dewey's system is eventually rooted in Aristotelian logic and categorization - with a good dash of practical concerns of running a physical library.

Today, these practical concerns can be overcome, and it is unlikely that indigenous approaches to knowledge representation would be rooted in Aristotelian logic. Yes, having your own categorization system is a great first step - but that's like writing your own anthem following the logic of European hymns or creating your own flag following the weird rules of European medieval heraldry. How would it look like if you were really going back to the principles and roots of the people represented in these libraries? Which novel alternatives to representing and categorizing knowledge could we uncover?

Via Jens Ohlig.

How much information is in a language?

About the paper "Humans store about 1.5 megabytes of information during language acquisition“, by Francis Mollica and Steven T. Piantadosi.

This is one of those papers that I both love - I find the idea is really worthy of investigation, having an answer to this question would be useful, and the paper is very readable - and can't stand, because the assumptions in the papers are so unconvincing.

The claim is that a natural language can be encoded in ~1.5MB - a little bit more than a floppy disk. And the largest part of this is the lexical semantics (in fact, without the lexical semantics, the rest is less than 62kb, far less than a short novel or book).

They introduce two methods about estimating how many bytes we need to encode the lexical semantics:

Method 1: let's assume 40,000 words in a language (languages have more words, but the assumptions in the paper is about how many words one learns before turning 18, and for that 40,000 is probably an Ok estimation although likely on the lower end). If there are 40,000 words, there must be 40,000 meanings in our heads, and lexical semantics is the mapping of words to meanings, and there are only so many possible mappings, and choosing one of those mappings requires 553,809 bits. That's their lower estimate.

Wow. I don't even know where to begin in commenting on this. The assumption that all the meanings of words just float in our head until they are anchored by actual word forms is so naiv, it's almost cute. Yes, that is likely true for some words. Mother, Father, in the naive sense of a child. Red. Blue. Water. Hot. Sweet. But for a large number of word meanings I think it is safe to assume that without a language those word meanings wouldn't exist. We need language to construct these meanings in the first place, and then to fill them with life. You can't simply attach a word form to that meaning, as the meaning doesn't exist yet, breaking down the assumptions of this first method.

Method 2: let's assume all possible meanings occupy a vector space. Now the question becomes: how big is that vector space, how do we address a single point in that vector space? And then the number of addresses multiplied with how many bits you need for a single address results in how many bits you need to understand the semantics of a whole language. There lower bound is that there are 300 dimensions, the upper bound is 500 dimensions. Their lower bound is that you either have a dimension or not, i.e. that only a single bit per dimension is needed, their upper bound is that you need 2 bits per dimension, so you can grade each dimension a little. I have read quite a few papers with this approach to lexical semantics. For example it defines "girl" as +female, -adult, "boy" as -female,-adult, "bachelor" as +adult,-married, etc.

So they get to 40,000 words x 300 dimensions x 1 bit = 12,000,000 bits, or 1.5MB, as the lower bound of Method 2 (which they then take as the best estimate because it is between the estimate of Method 1 and the upper bound of Method 2), or 40,0000 words x 500 dimensions x 2 bits = 40,000,000 bits, or 8MB.

Again, wow. Never mind that there is no place to store the dimensions - what are they, what do they mean? - probably the assumption is that they are, like the meanings in Method 1, stored prelinguistically in our brains and just need to be linked in as dimensions. But also the idea that all meanings expressible in language can fit in this simple vector space. I find that theory surprising.

Again, this reads like a rant, but really, I thoroughly enjoyed this paper, even if I entirely disagree with it. I hope it will inspire other papers with alternative approaches towards estimating these numbers, and I'm very much looking forward to reading them.

Milk consumption in China

Quiet disappointed by The Guardian. Here's a (rather) interesting article on the history of milk consumption in China. But the whole article is trying to paint how catastrophic this development might be: the Chinese are trying to triple their intake in milk! That means more cows! That's bad because cows fart us into a hot house!

The argumentation is solid - more cows are indeed problematic. But blaming it on milk consumption in China? Let's take a look at a few numbers omitted from the article, or stuffed into the very last paragraph.

  • On average, a European consumes six times as much milk as a Chinese. So, even if China achieves its goal and triples average milk consumption, they will drink only half as much as a European.
  • Europe has double the number of dairy cows than China has.
  • China is planning to increase their milk output by 300% but only increase resources for that by 30% according to the article. I have no idea how that works, but sounds like a great deal to me.
  • And why are we even talking about dairy cows? The number of beef cows in the US or in Europe each outnumber the dairy cows by a fair amount (unsurprisingly - a cow produces quite a lot of milk over a longer time, whereas its meat production is limited to a single event)
  • There are about 13 million dairy cows in China. The US have more than 94 million cattle, Brazil has more than 211 million, world wide it's more than 1.4 billion - but hey, it's the Chinese milk cows that are the problem.

Maybe the problem can be located more firmly in the consumption habits of people in the US and in Europe than the "unquenchable thirst of China".

The article is still interesting for a number of other reasons.

Shazam!

Shazam! was fun. And had more heart than many other superhero stories. I liked that, for the first time, a DC universe movie felt like it's organically part of that universe - with all the backpacks with Batman and Superman logos and stuff. That was really neat.

Since I saw him in the first trailer I was looking forward to see Steve Carell playing the villain. Turns out it was Mark Strong, not Steve Carell. Ah well.

I am not sure the film knew exactly at whom it was marketed. The theater was full with kids, and given the trailers it was clear that the intention was to get as many families into it as possible. But the horror sequences, the graphic violence, the expletives, and the strip club scenes were not exactly for that audience. PG-13 is an appropriate rating.

It was a joy to watch the protagonist and his buddy explore and discover his powers. Colorful, lively, fun. Easily the best scenes of the movie.

The foster family drama gave the movie it's heart, but the movie seemed a bit overwhelmed by it. I wish that part was executed a bit better. But then again, it's a superhero movie, and given that it was far better than many of the other movies of its genre. But as far as High School and family drama superheroes go, it doesn't get anywhere near Spiderman: Homecoming.

Mid credit scenes. A tradition that Marvel started and that DC keeps copying - but unlike Marvel DC hasn't really paid up to the teasers in their scenes. And regarding cameos - also something where DC could learn so much from Marvel. Also, what's up with being afraid of naming their heroes? Be it in Man of Steel with Superman or here with Billy, the hero doesn't figure out his name (until the next movie comes along and everybody refers to him as Superman as if it was obvious all the time).

All in all, an enjoyable movie while waiting for Avengers: Endgame, and hopefully a sign that DC is finally getting on the right path.

EMWCon 2019, Day 2

Today was the second day of the Enterprise MediaWiki Conference, EMWCon, in Daly City at the Genesys headquarters.

The day started with my keynote on Wikidata and the Abstract Wikipedia idea. The idea was received very friendly.

Today, the day was filled with stories from people building systems on top of MediaWiki, and in particularly Semantic MediaWiki, Cargo, and some Wikibase. This included SFMoma presenting their system to collaboratively document art, using Cargo and Lua on the League of Legends wiki, running a whole wiki farm for Finnish memory and language institutions, the Lost Plays database, and - what I found particularly impressive - an engineer at NASA who implemented a workflow for document approval including authorization, audibality, and a full Web interface within a mere week, and still thinking that it could have been done much faster.

A common theme was "how incredibly easy it was". Yes, almost everyone mentioned something they got stumped on, and this really points to the community needing maybe more usage on StackOverflow or IRC or something, but in so many use cases, people who were not developers were able to create pretty complex workflows and apps right there in their browsers. This also ties in with the second common theme, that a lot of the deployments of such wikis are often starting "under the radar".

There were also genuinely complex solutions that were using Semantic MediaWiki as a mere component: Matteo Busanelli was presenting a solution that included lifting external data sources, deploying ontologies, reasoning, and all the whistles and bells - a very impressive and powerful architecture.

The US government uses Semantic MediaWiki in many places, most notably Intellipedia used by more than 16 intelligence agencies, Diplopedia by the Department of State, and Powerpedia for the Department of Energy. EPA's Statipedia is no more, but new wikis are popping up in other agency, such as WikITA for the International Trade Administration, and for the Nuclear Regulatory Commission. Canada's GCpedia was mentioned with a lot of respect, and the wish that the US would have something similar.

NASA has a whole wiki farm: within mission control alone they had 12 different wikis after a short while, many grown bottom up. They noticed that it would make sense to merge them together - which wasn't easy, neither technically nor legally nor managerially. They found that a lot of their knowledge was misclassified - for example, they classified handbooks which can be bought by anyone on Amazon. One of the biggest changes the wiki caused at NASA was that the merged ISS wiki lead to opening more knowledge to more people, and drawing the circles larger. 20% of the people who have access to the wikis actively contribute to the wikis! This is truly impressive.

So far, no edit has been made from space - due to technical issues. But they are working on it.

The day ended with a panel, asking the question where MediaWiki is in the marketplace, and how to grow.

Again, thanks to Yaron Koren and Cindy Cicalese for organizing the conference, and Genesys for hosting us. All presentations are available on YouTube.

EMWCon 2019, Day 1

Today was the first day of the Enterprise MediaWiki Conference, EMWCon, in Daly City. Among the attendees were people from NASA (6 or more people), UIC (International Union of Railways), the UK Ministry of Defence, the US radioactivity safety agencies, cancer research institutes, the Bureaus of Labour Statistics, PG&E, General Electric, and a number of companies providing services around MediaWiki, such as WikiTeq, Wikiworks, dokit, etc., with or without semantic extensions. The conference was located at the Headquarter of Genesys.

I'm not going to comment on all talks, and also I will not faithfully report on the talks - you can just go to YouTube to watch the talks themselves. The following is a personal, biased view of the first day.

NASA made an interesting comment early on: the discussion was about MediaWiki and its lack of fine-grained access control. You can set up a MediaWiki easily for a controlled group (so that not everyone in the world can access it), but it is not so easy to say "oh, this set of pages is available for people in this group, and managers in that org can access the pages with this markers", etc. So NASA, at first, set up a lot of wiki installations, each one for such specific groups - but eventually turned it all around and instead had a small number of well-defined groups and merged the wikis into them, tearing down barriers within the org and making knowledge wider available.

Evita Hollis from General Electric had an interesting point in her presentation on how GE does knowledge sharing: they use SharePoint and Yammer to connect people to people, and MediaWiki to connect people to Knowledge. MediaWiki has been not-exactly-great at allowing people to work together in real-time - it is a different flow, where you capture and massage knowledge slowly into it. There is a reason why Ops at Wikimedia do not use a wiki during an incident that much, but rather IRC. I think there is a lot of insight in her argument - and if we take that serious, we could actually really lift MediaWiki to a new level, and take Wikipedia there too.

Another interesting point is that SharePoint at General Electric had three developers, and MediaWiki had one. The question from the audience was, whether that reflect how difficult it is to work with SharePoint, or whether that reflected some bias of the company towards SharePoint. Hollis was adamant about how much she likes Sharepoint, but the reason for the imbalance was that MediaWiki, particularly Semantic MediaWiki, allows actually much more flexibility and power than SharePoint without having to touch a single line of wiki source code. It is a platform that allows for rapid experimentation by the end user (I am adding the Spiderman adage about great power coming with great responsibility).

Daren Welsh from NASA talked about many different forms of biases and how they can bubble up on your wiki. Very interesting was one effect: if knowledge from the wiki is becoming too readily availble, people may start to become dependent on it. They had tests where they took away the wiki randomly from flight controllers in training, in order to ensure they are resourceful enough to still figure out what to do - and some failed miserably.

Ike Hecht had a brilliant presentation on the kind of quick application development Semantic MediaWiki lends itself to. He presented a task manager, a news feed, and a file management system, calling them "Semantic Structures That Do Stuff" - which is basically a few pages for your wiki, instead of creating extensions for all of these. This also resonated with GE's statement about needling less developers. I think that this is wildly underutilized and there is a lot of value in this idea.

Thanks to Yaron Koren - who also gave an intro to the topic - and Cindy Cicalese for organizing the conference, and Genesys for hosting us. All presentations are available on YouTube.

EMWCon Spring 2019

I'm honored to be invited to keynote the Enterprise MediaWiki conference in Daly City. The keynote is on Thursday, I will talk about Wikidata and beyond - towards an abstract Wikipedia.

The talk is planned to be recorded, so it should be available afterwards for everyone interested.

Turing Award to Bengio, LeCun, and Hinton

Congratulations to Yoshua Bengio, Yann LeCun, and Geoffrey Hinton on being awarded the Turing Award, the most prestigious award in Computer Science.

Their work had revolutionized huge parts of computer science as it is used in research and industry, and has lead to the current impressive results in AI and ML. They were continuing to work on an area that was deemed unpromising, and has suddenly swept through whole industries and reshaped them.

Something Positive in Deutsch wieder online

2005 und 2006 übersetzten Ralf Baumgartner und ich die ersten paar Something Positive comics von R. K. Milholland ins Deutsche. Die 80 Comics, die wir damals übersetzt haben, sind hiermit wieder online. Wir haben noch vier weitere Comics übersetzt, die in den nächsten Tagen auch nach und nach online kommen werden.

Viel Spass! Oh, und die Comics sind für Erwachsene.

DSA Erfolgswahrscheinlichkeiten

Ich fand es immer spannend, auszurechnen, wie hoch die Wahrscheinlichkeit ist, dass eine Talentprobe in DSA gelingt oder nicht. Ich konnte über die Jahre hinweg keine vernünftige, geschlossene Formel finden, und so blieb ich immer bei Überschlagsrechnungen. Dabei visualisierte ich mir im Kopf die drei Würfelwürfe als die drei Dimensionen eines Raumes, in dem ein Teil des Raumes gelungene Proben und der Rest des Raumes misslungene Proben darstellt.

Ich dachte lange darüber nach, dass es interessant ware, diesen Raum tatsächlich zu visualisieren. 2010 musste ich während eines Forschungsaufenthalts in Los Angeles ein paar Webtechniken erlernen - HTML Canvas, jQuery, Blueprint, etc. - und am besten lerne ich, indem ich ein kleines Projekt mache. Also nutzte ich diese Gelegenheit. Damals war DSA4 aktuell, und entsprechend machte ich das Projekt für die Regeln von DSA4.

2017 überarbeitete Hanno Müller-Kalthoff die Visualisierung und passte sie an die neuen Regeln von DSA5 an. Hier sind Links für beide Seiten und eine DSA5 App:

A bitter, better lesson

Rich Sutton is expressing some frustration in his short essay on computation and simple methods beating smart methods again and again.

Rodney Brooks answers with great arguments on why this is not really the case, and how we're just hiding human ingenuity and smartness better.

They're both mostly right, and it was interesting to read the arguments on both sides. And yet, not really new - it's mostly rehashing the arguments from The unreasonable effectiveness of data by Alon Halevy, Peter Norvig, and Fernando Pereira ten years ago. But nicely updated and much shorter. So worth a read!

Wikipedia demonstriert

Eine Reihe von Wikipedien (Deutsch, Dänisch, Estnisch, Tschechisch) tragen heute schwarz um schlecht gemachte Gesetzesänderungen zu verhindern. Ich bin stolz auf die Freiwilligen der Wikipedien, die das organisiert bekommen haben.

Spring cleaning

Going through my old online presence and cleaning it up is really a trip down memory lane. I am also happy that most - although not all - of the old demos still work. This is going to be fun to release it all again.

Today I discovered that we had four more German translations of Something Positive that we never published. So that's another thing that I am going to publish soon, yay!

Prediction coming true

I saw my post from seven years ago, where I said that I really like Facebook and Google+, but I want a space where I have more control about my content so it doesn't just disappear. "Facebook and Google+ -- maybe they won't disappear in a year. But what about ten?"

And there we go, Google+ is closing in a few days.

I downloaded my data from there (as well as my data from Facebook and Twitter), to see if there is anything to be salvaged from that, but I doubt it.

Restarting, 2019 edition

I had neglected Simia for too long - there were five entries in the last decade. A combination of events lead me to pour some effort back into it - and so I want to use this chance to restart it, once again.

Until this weekend, Simia was still running on a 2007 version of Semantic MediaWiki and MediaWiki - which probably helped with Simia getting hacked a year or two ago. Now it is up to date with a current version, and I am trying to consolidate what is already there with some content I had created in other places.

Also, life has been happening. If you have been following me on Facebook (that link only works if you're logged in), you have probably seen some of that. I married, got a child, changed jobs, and moved. I will certainly catch up on this too, but there is no point in doing that all in one huge mega-post. Given that I am thinking about starting a new project just these days, this might be the perfect outlet to accompany that.

I make no promises with regards to the quality of Simia, or the frequency of entries. What I would really love recreate would be a space that is as interesting and fun for as my Facebook wall was, before I stopped engaging there earlier this year - but since you cannot even create comments here, I have to figure out how to make this even remotely possible. For now, suggestions on Twitter or Facebook are welcome. And no, moving to WordPress or another platform is hardly an option, as I really want to stay with Semantic MediaWiki - but pointers to interesting MediaWiki extensions are definitely welcome!

Stars in our eyes

I grew up in a suburban, almost rural area in southern Germany, and I remember the hundreds, if not thousands of stars I could see at night. In the summers, that I spent on an island in Croatia, it was even more marvelous, and the dark night sky was breathtaking.

As I grew up, the stars dimmed, and I saw fewer and fewer of those, until only the brightest stars were visible. It was blindingly obvious that air and light pollution have swallowed that every-night miracle and confined it to my memory only.

Until in my late twenties I finally accepted and got glasses. Now the stars are back, as beautiful and brilliant as they have ever been.

Croatian Elections 2016

Croatian elections are upcoming.

The number of Croatians living abroad - in the so called Croatian diaspora - is estimated to be almost 4 Million according to the Croatian state office for Croatians abroad - only little less than the 4.3 Million who live in Croatia. The estimates vary wildly, and most of them actually do not have Croatian citizenship. But it is estimated that between 9-10% of holders of the Croatian citizenship live abroad.

These 9-10% are represented in the Croatian parliament: out of the 151 Members of Parliament, there are 3 (three) voted by the diaspora. That's 2% of the parliament representing 10% of the population.

In order for a member of the diaspora to vote, they have to register well before the election with their nearest diplomatic mission or consulate. The registration deadline is today, at least for my consulate. But for the election itself, you have to personally appear and vote at the consulate. For me, that would mean to drive or fly all the way to Los Angeles from San Francisco. And I am rather close to one of the 9 consulates in the US. There are countries that do not have Croatian embassies at all. Want to vote? Better apply for a travel visa to the country with the next embassy. Live in Nigeria? Have a trip to Libya or South Africa. There is no way to vote per mail or - ohwow21stcentury? - electronically. For one of the three Members of Parliament that represent us.

I don't really feel like the parliament wants us to vote. Making the vote mean so little and making it so hard to vote.

Gödel and physics

"A logical paradox at the heart of mathematics and computer science turns out to have implications for the real world, making a basic question about matter fundamentally unanswerable."

I just love this sentence, published in "Nature". It raises (and somehow exposes the author's intuition about) one of the deepest questions in science: how are mathematics, logic, computer science, i.e. the formal sciences, on the one side, and the "real world" on the other side, related? What is the connection between math and reality? The author seems genuinely surprised that logic has "implications for the real world" (never mind that "implication" is a logical term), and seems to struggle with the idea that a counter-intuitive theorem by Gödel, which has been studied and scrutinized for 85 years, would also apply to equations in physics.

Unfortunately the fundamental question does not really get tackled: the work described here, as fascinating as it is, was an intentional, many year effort to find a place in the mathematical models used in physics where Gödel can be applied. They are not really discussing the relation between maths and reality, but between pure mathematics and mathematics applied in physics. The original deep question remains unsolved and will befuddle students of math and the natural sciences for the next coming years, and probably decades (besides Stephen Wolfram, who beieves to have it all solved in NKS, but that's another story).

Nature: Paradox at the heart of mathematics makes physics problem unanswerable

Phys.org: Quantum physics problem proved unsolvable: Godel and Turing enter quantum physics

AI is coming, and it will be boring

I was asked about my opinion on this topic, and I thought I would have some profound thoughts on this. But I ended up rambling, and this post doesn’t really make any single strong point. tl;dr: Don’t worry about AI killing all humans. It’s not likely to happen.

In an interview with the BBC, Stephen Hawking stated that “the development of full artificial intelligence could spell the end of the human race”. Whereas this is hard to deny, it is rather trivial: any sufficiently powerful tool could potentially spell the end of the human race given a person who knows how to use that tool in order to achieve such a goal. There are far more dangerous developments - for example, global climate change, the arsenal of nuclear weapons, or an economic system that continues to sharpen inequality and social tension?

AI will be a very powerful tool. Like every powerful tool, it will be highly disruptive. Jobs and whole industries will be destroyed, and a few others will be created. Just as electricity, the car, penicillin, or the internet, AI will profoundly change your everyday life, the global economy, and everything in between. If you want to discuss consequences of AI, here are a few that are more realistic than human extermination: what will happen if AI makes many jobs obsolete? How do we ensure that AIs make choices compliant with our ethical understanding? How to define the idea of privacy in a world where your car is observing you? What does it mean to be human if your toaster is more intelligent than you?

The development of AI will be gradual, and so will the changes in our lifes. And as AI keeps developing, things once considered magical will become boring. A watch you could talk to was powered by magic in Disney’s 1991 classic “The Beauty and the Beast”, and 23 years later you can buy one for less than a hundred dollars. A self-driving car was the protagonist of the 80s TV show “Knight Rider”, and thirty years later they are driving on the streets of California. A system that checks if a bird is in a picture was considered a five-year research task in September 2014, and less than two months later Google announces a system that can provide captions for pictures - including birds. And these things will become boring in a few years, if not months. We will have to remind ourselves how awesome it is to have a computer in our pocket that is more powerful than the one that got Apollo to the moon and back. That we can make a video of our children playing and send it instantaneously to our parents on another continent. That we can search for any text in almost any book ever written. Technology is like that. What’s exciting today, will become boring tomorrow. So will AI.

In the next few years, you will have access to systems that will gradually become capable to answer more and more of your questions. That will offer advice and guidance towards helping you navigate your life towards the goal you tell it. That will be able to sift through text and data and start to draw novel conclusions. They will become increasingly intelligent. And there are two major scenarios that people are afraid of at this point:

  1. That the system will become conscious and develop their own intentions and their own will, and they will want to destroy humanity: the Skynet scenario from the Terminator movies.
  2. That the system might get a task, and figure out a novel solution for the task which unfortunately wipes out humanity. This is the paperclip scenario— an AI gets the task to create paperclips, and kills all humans by doing so — , which has not yet been turned into a blockbuster.

The Skynet scenario is just mythos. There is no indication that raw intelligence is sufficient to create intrinsic intention or will.

The paperclip scenario is more realistic. And once we get closer to systems with such power, we will need to put the right safeguards in place. The good news is that we will have plenty of AIs at our disposal to help us with that. The bad news is that discussing such scenarios now is premature: we simply don’t know how these systems will look like. That’s like starting a committee a hundred years ago to discuss the danger coming from novel weaponry: no one in 1914 could have predicted nuclear weapons and their risks. It is unlikely that the results of such a committee would have provided much relevant ethical guidance for the Manhattan project three decades later. Why should that be any different today?

In summary: there are plenty of consequences of the development of AI that warrant intensive discussion (economical consequences, ethical decisions made by AIs, etc.), but it is unlikely that they will bring the end of humanity.

Further reading

Published originally on Medium on December 14, 2014

Start the website again

This is no blog anymore. I haven't had entries for years, and even before then sporadically. This is a wiki, but somehow it is not that either. Currently you cannot make comments. Updating the software is a pain in the ass. But I like to have a site where I can publish again. Switch to another CMS? Maybe one day. But I like Semantic MediaWiki. So what will I do? I do not know. But I know I will slowly refresh this page again. Somehow.

A new part of my life is starting soon. And I want to have a platform to talk about it. And as much as I like Facebook or Google+, I like to have some form of control over this platform. Facebook and Google+ -- maybe they won't disappear in a year. But what about ten? Twenty? Fifty years? I'll still be around (I hope), but they might not...

Let's see what will happen here. For now, I republished the retelling of a day as a story I first published on Google+ (My day in Jerusalem) and a poem that feels eerily relevant whenever I think about it (Wenn ich wollte)

Popculture in logics

  1. You ⊑ ∀need.Love (Lennon, 1967)
  2. ⊥ ≣ compare.You (Nelson, 1985)
  3. Cowboy ⊑ ∃sing.SadSadSong (Michaels, 1988)
  4. ∀t : I ⊑ love.You (Parton, 1973)
  5. ∄better.Time ⊓ ∄better­­­­­­­⁻¹.Time (Dickens, 1859)
  6. {god} ⊑ Human ⊨ ? (Bazilian, 1995)
  7. Bad(X)? (Jackson, 1987)
  8. ⃟(You ⊑ save.I) (Gallagher, 1995)
  9. Dreamer(i). ∃x : Dreamer(x) ∧ (x ≠ i). ⃟ ∃t: Dreamer(you). (Lennon, 1971)
  10. Spoon ⊑ ⊥ (Wachowski, 1999)
  11. ¬Cry ← ¬Woman (Marley, 1974)
  12. ∄t (Poe, 1845)

Solutions: Turn around your monitor to read them.

sǝlʇɐǝq ǝɥʇ 'ǝʌol sı pǝǝu noʎ llɐ ˙ǝuo
ǝɔuıɹd ʎq ʎllɐuıƃıɹo sɐʍ ʇı 'ƃuos ǝɥʇ pǝɹǝʌoɔ ʇsnɾ pɐǝuıs ˙noʎ oʇ sǝɹɐdɯoɔ ƃuıɥʇou ˙oʍʇ
˙uosıod ʎq uɹoɥʇ sʇı sɐɥ ǝsoɹ ʎɹǝʌǝ ɯoɹɟ '"ƃuos pɐs pɐs ɐ sƃuıs ʎoqʍoɔ ʎɹǝʌǝ" ˙ǝǝɹɥʇ
ʞɔɐɹʇpunos ǝıʌoɯ pɹɐnƃʎpoq ǝɥʇ ɹoɟ uoʇsnoɥ ʎǝuʇıɥʍ ʎq ɹɐlndod ǝpɐɯ ʇnq uoʇɹɐd ʎllop ʎq ʎllɐuıƃıɹo 'noʎ ǝʌol sʎɐʍlɐ llıʍ ı 'ɹo - ",noʎ, ɟo ǝɔuɐʇsuı uɐ ɥʇıʍ pǝllıɟ ,ǝʌol, ʎʇɹǝdoɹd ɐ ƃuıʌɐɥ" uoıʇdıɹɔsǝp ǝɥʇ ʎq pǝɯnsqns ɯɐ ı 'ʇ sǝɯıʇ llɐ ɹoɟ ˙ɹnoɟ
suǝʞɔıp sǝlɹɐɥɔ ʎq sǝıʇıɔ oʍʇ ɟo ǝlɐʇ ɯoɹɟ sǝɔuǝʇuǝs ƃuıuǝdo ǝɥʇ sı sıɥʇ ˙(ʎʇɹǝdoɹd ǝɥʇ ɟo ǝsɹǝʌuı suɐǝɯ 1- ɟo ɹǝʍod" ǝɥʇ) ǝɯıʇ ɟo ʇsɹoʍ ǝɥʇ sɐʍ ʇı ˙sǝɯıʇ ɟo ʇsǝq ǝɥʇ sɐʍ ʇı ˙ǝʌıɟ
(poƃ)ɟoǝuo ƃuıɯnsqns sn ʎq pǝlıɐʇuǝ sı ʇɐɥʍ sʞsɐ ʇı ʎllɐɔısɐq ˙ʇıɥ ɹǝpuoʍ ʇıɥ ǝuo 5991 ǝɥʇ 'sn ɟo ǝuo sɐʍ poƃ ɟı ʇɐɥʍ ˙xıs
pɐq ǝlƃuıs ʇıɥ ǝɥʇ uı "pɐq s,oɥʍ" ƃuıʞsɐ 'uosʞɔɐɾ lǝɐɥɔıɯ ˙uǝʌǝs
ɔıƃol lɐpoɯ ɯoɹɟ ɹoʇɐɹǝdo ʎılıqıssod ǝɥʇ sı puoɯɐıp ǝɥʇ ˙"ǝɯ ǝʌɐs oʇ ǝuo ǝɥʇ ǝɹ,noʎ ǝqʎɐɯ" ǝuıl ǝɥʇ sɐɥ ʇı ˙sısɐo ʎq 'llɐʍɹǝpuoʍ ˙ʇɥƃıǝ
˙ooʇ ǝuo ǝɹɐ noʎ ǝɹǝɥʍ ǝɯıʇ ɐ sı ǝɹǝɥʇ ǝqʎɐɯ puɐ ˙(ǝɯ ʇou ǝɹɐ sɹǝɥʇo ǝsoɥʇ puɐ 'sɹǝɯɐǝɹp ɹǝɥʇo ǝɹɐ ǝɹǝɥʇ) ǝuo ʎluo ǝɥʇ ʇou ɯɐ ı ʇnq ˙ɹǝɯɐǝɹp ɐ ɯɐ ı" ˙ǝuıƃɐɯı 'uıɐƃɐ uouuǝl uɥoɾ ˙ǝuıu
(ǝlɔɐɹo ǝɥʇ sʇǝǝɯ ǝɥ ǝɹoɟǝq ʇsnɾ oǝu oʇ ƃuıʞɐǝds pıʞ oɥɔʎsd ǝɥʇ) xıɹʇɐɯ ǝıʌoɯ ǝɥʇ ɯoɹɟ ǝʇonb ssɐlɔ ˙uoods ou sı ǝɹǝɥʇ ˙uǝʇ
ʎǝuoɯ ǝɯos sʇǝƃ puǝıɹɟ sıɥ os ƃuıʎl ʎlqɐqoɹd sɐʍ ǝɥ ʇnq 'puǝıɹɟ ɐ oʇ sɔıɹʎl ǝɥʇ pǝʇnqıɹʇʇɐ ʎǝlɹɐɯ ˙"ʎɹɔ ʇou" sʍolloɟ "uɐɯoʍ ʇou" ɯoɹɟ ˙uǝʌǝlǝ
ǝod uɐllɐ ɹɐƃpǝ ʎq '"uǝʌɐɹ ǝɥʇ" ɯoɹɟ ɥʇonb ˙ǝɹoɯɹǝʌǝu :ɹo ˙ǝɯıʇ ou sı ǝɹǝɥʇ ˙ǝʌlǝʍʇ

My horoscope for today

Here's my horoscope for today:

You may be overly concerned with how your current job prevents you from reaching your long-term goals. The Sun's entry into your 9th House of Big Ideas can work against your efficiency by distracting you with philosophical discussions about the purpose of life. These conversations may be fun, but they should be kept to a minimum until you finish your work.

(from Tarot.com via iGoogle)

How the heck did they know??

England eagerly lacking cofidence

My Google Alerts just send me the following news alert about Croatia. At least the reporters checked all their sources :)

England players lacking confidence against Croatia International Herald Tribune - France AP ZAGREB, Croatia: England's players confessed to a lack of confidence when they took on football's No. 186-ranked nation in their opening World Cup ...

England eager to break Croatia run Reuters UK - UK By Igor Ilic ZAGREB (Reuters) - England hope to put behind their gloomy recent experiences against Croatia when they travel to Zagreb on Wednesday for an ...

Beating the Second Law

Yihon Ding has an interesting blogpost taking analogies to the laws of thermodynamics and why this means trouble for the Semantic Web.

I disagree in one aspect: I think it is possible to invest the amount of human power to the system and to still keep it going. I can't nail it down exactly -- I didn't read "Programming the Universe" yet, so I can't really discuss it, but the feeling goes along the following lines: the value of a network increases superlinearly, if not even quadratic (Metcalfe's Law), whereas the amount of information increases sublinearly (due to redundancies in human knowledge). Or, put it in another way: get more people and Wikipedia or Linux gets better, because they have a constrained scope. The more you constrain the scope the more value is added by more people.

This is an oversimplification.

Blogging from an E90

28 May 2008

After pondering it for far too long, I finally got a new mobile phone: a Nokia E90. It is pretty big and heavy, but I don't mind really. I am looking at it as a light-weight laptop replacement. But I am not sure I will learn to love the keyboard, really. Experimenting.

But since it has a full keyboard, programming in Python is indeed an option. I had Python on my previous phone too, but heck, T9 is not cool to type code.

One world. One web.

I am in Beijing at the opening of the WWW2008 conference. Like all WWWs I was before, it is amazing. The opening ceremony was preceded by a beautiful dance, combining tons of symbols. First a woman in a traditional Chinese dress, then eight dancers in astronaut uniforms, a big red flag with "Welcome to Beijing" on it (but not on the other side, when he came back), and then all of them together... beautiful.

Boris Motik's paper is a best paper candidate! Yay! Congratulations.

I rather listen to the keynote now :)

Blogging from my XO One.

Certificate of Coolness

Now that the Cool URIs for the Semantic Web note by Richard and Leo have been published -- congratulation guys! -- I am sure looking forward if anyone will create a nice badge and a procedure to get official Certificates of Coolness. Pretty please?

On a different note: I know, I should have blogged from New Zealand. It sure was beautiful Maybe I will still blog about it a bit later. My sister has blogged extensively, and also made a few great pictures, take a look over there if you're interested.

Coming to New Zealand

Yes! Three weeks of vacation in New Zealand, which is rumoured to be quite a beauty. This also means: three weeks no work, no projects, no thesis, no Semantic We...

Oh, almost. Actually I will enjoy to have the opportunity to give a talk on Semantic Wikipedia while staying in Auckland. If you're around, you may want to come by.

It is on February 22nd, 1pm-2pm at the AUT. You may want to tell Dave Parry that you're coming, he is my host.

Looking forward to this trip a lot!

Charlie Wilson's War

Ein einfacher Kongressmann (Tom Hanks). Eine sehr rechte reiche Texanering (Julia Roberts). Ein äußerst guter CIA Agent (Philip Seymour Hoffman). Und Sowjets die in Afghanistan einmarschieren, im Kalten Krieg. Amerika muss sich wehren, auch am Hindukusch!

Der Film behandelt den afghanischen Krieg (den aus den 1980er Jahren), und es geht um sehr ernste Themen. Zudem beruht er auf wahren Ereignissen. Dennoch verpackt er es in abstrusen Witz, liefert uns charmante Antihelden, und deutet schließlich auch an, wie es zum nächsten Afghanistankrieg kommen konnte (den aus den 2000er Jahren), doch ist das kaum das Thema in diesem Film.

Wie bereits Hunting Party ein Film, der doch sensibel mit jenen Wahrheiten umgeht, für die er eintritt, und mit umso derberen Humor und Aufklärung die Missstände anprangert. Ein besonderer Spagat gelingt dem Film da er die Aufteilung des politischen Spektrums in rechts und links nicht einfach mit Böse und Gut gleichsetzt, wie es etwa Michael Moore gerne macht, sondern ähnlich wie Team America nach beiden Seiten austeilt -- Lob, wie auch Kritik.

Wir sahen den Film gestern in der Sneak, leider in der Originalversion -- insbesondere die Texanischen Dialekte waren echt schwer zu verstehen, weswegen wohl der eine oder andere Gag verloren ging. Ich hoffe ihn dann auch in der deutschen Synchro zu sehen.

Charlie Wilson's War (Der Krieg des Charlie Wilson) läuft in Deutschland am 7. Februar 2008 in den Kinos an.

Bewertung: 4 von 5

7 Jahre

Heute sind es genau sieben Jahre, dass ich diese Website eröffnet habe. Und heute wird sie umbenannt! In den letzten Wochen stellte ich die Software vollständig auf Semantic MediaWiki um, welches es mir deutlich einfacher erlaubt, die Seite zu pflegen als jemals zuvor.

In den nächsten Wochen will ich langsam, aber sicher, die alten Inhalte, die seit einem Hacker-Angriff auf Nodix verschwunden sind, wieder hochladen.

Eine wichtige Änderung gibt es freilich (nicht die Hintergrundfarbe, die ist geblieben): der Name der Website wurde geändert. Kein Nodix mehr, ab jetzt heißt die Seite Simia. Und ihr werdet merken, viele der Seiten sind Englisch, andere Deutsch. Einfach auch, weil inzwischen vieles von dem was ich mache, Englisch ist. Ich hoffe, dass das nicht abschreckt. Es sind ja dennoch noch viele Deutschsprachige Inhalte vorhanden.

Und was leider noch nicht funktioniert, ist das Kommentieren, sorry. Das heißt, vorübergehend ist das nur per Email möglich. Ich arbeite dran.

Kindheitsträume wahr werden lassen

Randy Pausch ist Professor für User Interfaces and der CMU, einer der bekanntesten Universitäten der USA. Im September 2006 wurde bei ihm Bauchspeicheldrüsenkrebs diagnostiziert. Seitdem kämpft er um jeden Tag.

In der Vortragsreihe Journeys (Reisen) der CMU, welche Randy mit seinem Vortrag eröffnete, sollen die Vortragenden sich überlegen, was sie den Zuhörern sagen würden, wenn dies ihre letzte Gelegenheit für einen Vortrag wäre. Ihr Erbe, sozusagen.

Der Vortrag -- auch wenn er knappe anderthalb Stunden dauert -- stellt flott und unterhaltend Randys Kindheitsträume vor, und wie sie wahr geworden sind, oder nicht. Er erzählt viele Anekdoten, und fasst wichtige Weisheiten zusammen.

Das Video des Vortrags, mit Untertiteln in Deutsch oder Englisch, ist bei Google Video erhältlich. Sehenswert.

Darjeeling Limited

Wunderschöner Film. Auch wenn der Sympatexter offenbar nicht allzu begeistert war, mir gefiel er sehr. Wem Wes Andersons andere Filme gefallen haben (insbesondere The Royal Tenenbaums und The Life Aquatic with Steve Zissou), der wird sich auch an Darjeling Limited sehr freuen.

Der Film ist farbenfroh, hat einen witzigen Soundtrack, unzählige skurrile Situationen, und hin und wieder auch sehr tiefer Stoff zum Nachdenken. Anderson setzt seine Schauspieler hervorragend in Szene, verzaubert mit den wunderbaren Macken der drei Brüder, und lässt einen mit der Gewissheit wieder aus dem Kino gehen, dass die eigene Familie gar nicht so verrückt ist, wie man immer angenommen hat. Die drei Brüder auf dem Weg durch Indien können erst sich selbst finden, wenn sie zueinander gefunden haben -- und das ist erst möglich, nachdem sie mit dem Gesicht nach vorne auf ein echtes Schicksal treffen.

Bewertung: 5 von 5

Social Web and Knowledge Management

Obviously, the social web is coming. And it's also coming to this year's WWW conference in Beijing!

I find this topic very interesting. The SWKM picks up the theme of last year's very successful CKC2007 workshop, also at the WWW, where we aimed at allowing the collaborative knowledge construction. The SWKM is a bit broader, since it is not just about knowledge construction, but about the whole topic of knowledge management, and how the web changes everything.

If you are interested in the social web, or the semantic web, or specifically about the intersection of these two, and how it can be applied for knowledge management within or without an organisation, you will like the SWKM workshop at the WWW2008. You can submit papers until January 21st, 2008. All information can be found at the Social Web and Knowledge management workshop website.

Semantic MediaWiki 1.0 released

After about two years of development and already with installations all over the world, we are very happy to announce the release of Version 1.0 of Semantic MediaWiki, and thus the first stable version. No alpha, no beta, it's out now, and we think you can use it productively. Markus managed to release it in 2007 (on the last day of the year), and it has moved far beyond what 0.7 was, in stability, features, and performance. The biggest change is a completely new ask syntax, much more powerful since it works much smoother with MediaWiki's other systems like the parser functions, and we keep constantly baffling ourselves about what is possible with the new system.

We have finally reached a point where we can say, OK, let's go for massive user testing. We want big and heavy used installations to test our system. We are fully aware that the full power of the queries can easily kill an installation, but there are many ways to tweak performance and expressivity. We are now highly interested in performance reports, and then moving towards our actual goal, Wikipedia.

A lot has changed. You can find a full list of changes in the release notes. And you can download and install Semantic MediaWiki form SourceForge. Spread the word!

There remains still a lot of things to do. We have plenty of ideas how to make it more useful, and our users and co-developers also seem to have plenty of ideas. It is great fun to see the numbers of contributors to the code increase, and also to see the mailing lists being very lively. Personally, I am very happy to see Semantic MediaWiki flourish as it does, and I am thankful to Markus for starting this year (or rather ending the last) with such a great step.

Willkommen auf Simia

Willkommen auf Simia, der neuen Website von Denny Vrandecic. Nachdem ich in meinem Blog seit gefühlten drei Zeitaltern nix mehr geschrieben habe, und auf meinen Seiten seit Anbeginn der Zeiten keine neuen Inhalte eingestellt habe, kann ich euch jetzt sagen, es lag daran, dass ich die ganze Technik umstellen wollte.

Womit ich endlich gut vorangekommen bin. Zur Zeit finden sich hier alle Blogeinträge von Nodix und die Kommentare. Die Funktion zum Erstellen neuer Kommentare funktioniert noch nicht, aber ich arbeite daran. Ihr werdet auch merken, dass deutlich mehr Inhalte auf der Seite in Englisch sind als früher.

Technisch gesehen ist Simia eine Semantic MediaWiki Installation. Damit gehört dieser Blog auch zu meiner Forschung, indem ich ein wenig Erfahrung aus erster Hand sammeln möchte, wie es ist, sein Blog und seine persönliche Homepage mit Semantic MediaWiki zu führen. (Insofern ist das natürlich kein Blog mehr, sondern ein so genanntes Bliki, aber wen schert's?). Und da das ganze semantisch ist, will ich herausfinden, wie so eine persönliche Website ins Semantic Web passt...

Um Up to date zu bleiben, gibt es eine Reihe von feeds auf Simia. Wählt Euch aus, was ihr wollt. Schöne Grüße, und ich hoffe, Ihr habt Euch gut durch die Weihnachtszeit gemampft! :)

San Francisco and Challenges

Time is running totally crazy on me in the last few weeks. Right now I am in San Francisco -- if you like to suggest a meeting, drop me a line.

The CKC Challenge is going on and well! If you didn't have the time yet, check it out! Everybody is speaking about how to foster communities for shared knowledge building, this challenge is actually doing it, and we hope to get some good numbers and figures out of it. An fun -- there is a mystery prize involved! Hope to see as many of you as possible at the CKC 2007 in a few days!

Yet another challenge with prizes is going on at Centiare. Believe it or not, you can actually make money with using a Semantic MediaWiki, wih the Centiare Prize 2007. Read more there.

First look at Freebase

I got the chance to get a close look at Freebase (thanks, Robert!). And I must say -- I'm impressed. Sure, the system is still not ready, and you notice small glitches happening here and there, but that's not what I was looking for. What I really wanted to understand is the idea behind the system, how it works -- and, since it was mentioned together with Semantic MediaWiki one or twice, I wanted to see how the systems compare.

So, now here are my first impressions. I will sure play more around with the system!

Freebase is a databse with a flexible schema and a very user friendly web front end. The data in the database is offered via an API, so that information from Freebase can be included in external applications. The web front end looks nice, is intuitive for simple things, and works for the not so simple things. In the background you basically have a huge graph, and the user surfs from node to node. Everything can be interconnected with named links, called properties. Individuals are called topics. Every topic can have a multitude of types: Arnold Schwarzenegger is of type politician, person, actor, and more. Every such type has a number of associated properties, that can either point to a value, another topic, or a compound value (that's their solution for n-ary relations, it's basically an intermediate node). So the type politician adds the party, the office, etc. to Arnold, actor adds movies, person adds the family relationships and dates of birth and death (I felt existentially challenged after I created my user page, the system created a page of me inside freebase, and there I had to deal with the system asking me for my date of death).

It is easy to see that types are crucial for the system to work. Are they the right types to be used? Do they cover the right things? Are they interconnected well? How do the types play together? A set of types and their properties form a domain, like actor, movie, director, etc. forming the domain "film", or album, track, musician, band forming the domain "music". A domain is being administrated by a group of users who care about that domain, and they decide on the properties and types. You can easily see ontology engineering par excellence going on here, done in a collaborative fashion.

Everyone can create new types, but in the beginning they belong to your personal domain. You may still use them as you like, and others as well. If your types, or your domain, turns out to be of interest, it may become promoted as being a common domain. Obviously, since they are still alpha, there is not yet too much experience with how this works out with the community, but time will tell.

Unsurprising I am also very happy that Metaweb's Jamie Taylor will give an invited talk at the CKC2007 workshop in Banff in May.

The API is based on JSON, and offers a powerful query language to get the knowledge you need out of Freebase. The description is so good that I bet it will find almost immediate uptake. That's one of the things the Semantic Web community, including myself, did not yet manage to do too well: selling it to the hackers. Look at this API description for how it is done! Reading it I wanted to start hacking right away. They also provide a few nice "featured" applications, like the Freebase movie game. I guess you can play it even without a freebase account. It's fun, and it shows how to reuse the knowledge from Freebase. And they did some good tutorial movies.

So, what are the differences to Semantic MediaWiki? Well, there are quite a lot. First, Semantic MediaWiki is totally open source, Metaweb, the system Freebase runs on, seems not to be. Well, if you ask me, Metaweb (also the name of the company) will probably want to sell MetaWeb to companies. And if you ask me again, these companies will make a great deal, because this may replace many current databases and many problems people have with them due to their rigid structure. So it may be a good idea to keep the source closed. On the web, since Freebase is free, only a tiny amount of users will care that the source of Metaweb is not free, anyway.

But now, on the content side: Semantic MediaWiki is a wiki that has some features to structure the wiki content with a flexible, collaboratively editable vocabulary. Metaweb is a database with a flexible, collaboratively editable schema. Semantic MediaWiki allows to extend the vocabulary easier than Metaweb (just type a new relation), Metaweb on the other hand enables a much easier instantiation of the schema because of its form based user interface and autocompletion. Metaweb is about structured data, even though the structure is flexible and changing. Semantic MediaWiki is about unstructured data, that can be enhanced with some structure between blobs of unstructured data, basically, text. Metaweb is actually much closer to a wiki like OntoWiki. Notice the name similarity of the domains: freebase.com (Metaweb) and 3ba.se (OntoWiki).

The query language that Metaweb brings along, MQL, seems to be almost exactly as powerful as the query language in Semantic MediaWiki. Our design has been driven by usability and scalability, and it seems that both arrived at basically the same conclusions. Just a funny coincidence? The query languages are both quite weaker than SPARQL.

One last difference is that Semantic MediaWiki is fully standards based. We export all data in RDF and OWL. Standard-compliant tools can simply load our data, and there are tons of tools who can work with it, and numerous libraries in dozens of programming languages. Metaweb? No standard. A completely new vocabulary, a completely new API, but beautifully described. But due to the many similarities to Semantic Web standards, I would be surprised if there wasn't a mapping to RDF/OWL even before Freebase goes fully public. For all who know Semantic Web or Semantic MediaWiki, I tried to create a little dictionary of Semantic Web terms.

All in all, I am looking forward to see Freebase fully deployed! This is the most exciting Web thingy 2007 until now, and after Yahoo! pipes, and that was a tough one to beat.


Comments are still missing on this post.

The benefit of Semantic MediaWiki

I can't comment on Tim O'Reilly's blog right now it seems, maybe my answer is too long, or it has too many links, or whatever. It only took some time, my mistake. He blogged about Semantic MediaWiki -- yaay! I'm a fanboy, really -- but he asks "but why hasn't this approach taken off? Because there's no immediate benefit to the user." So I wanted to answer that.

"About Semantic MediaWiki, you ask, "why hasn't this approach taken off?" Well, because we're still hacking :) But besides that, there is a growing number of pages who actually use our beta software, which we are very thankful to (because of all the great feedback). Take a look at discourseDB for example. Great work there!

You give the following answer to your question: "Because there's no immediate benefit". Actually, there is benefit inside the wiki: you can ask for the knowledge that you have made explicit within the wiki. So the idea is that you can make automatic tables like this list of Kings of Judah from the Bible wiki, or this list of upcoming conferences, including a nice timeline visualization. This is immediate benefit for wiki editors: they don't have to make pages like these examples (1, 2, 3, 4, 5, or any of these) by hand. Here's were we harness self-interest: wiki editors need to put in less work in order to achieve the same quality of information. Data needs to be entered only once. And as it is accessible to external scripts with standard tools, they can even write scripts to check the correctness or at some form of consistency of the data in the wiki, and they are able to aggregate the data within the wiki and display it in a nice way. We are using it very successfully for our internal knowledge management, where we can simply grab the data and redisplay it as needed. Basically, like a wiki with a bit more DB functionality.

I will refrain from comparing to Freebase, because I haven't seen it yet -- but from what I heard from Robet Cook it seems that we are partially complementary to it. I hope to see it soon :)"

Now, I am afraid since my feed's broken this message will not get picked up by PlanetRDF, and therefore no one will ever see it, darn! :( And it seems I can't use trackback. I really need to update to a real blogging software.


Comments are still missing on this post.

DL Riddle

Yesterday we stumbled upon quite a hard description logics problem. At least I think it is hard. The question was, why is this ontology unsatisfiable? Just six axioms. The ontology is availbe in OWL RDF/XML, in PDF (created with the owl tools), and here in Abstract Syntax.

Class(Rigid complete restriction(subclassof allValuesFrom(complementOf(AntiRigid))))
Class(NonRigid partial)
DisjointClasses(NonRigid Rigid)
ObjectProperty(subclassof Transitive)
Individual(publishedMaterial type(NonRigid))
Individual(issue type(Rigid) value(subclassof publishedMaterial))

So, the question is, why is this ontology unsatisfiable? It is even a minimally unsatisfiable subset, actually, that means, remove any of the axioms and you get a satisfiable ontology. Maybe you like to use it to test your students. Or yourself. The debugger in SWOOP actually gave me the right hint, but it didn't offer the full explanation. I figured it out, after a few minutes of hard thinking (so, now you know how bad I am at DL).

Do you know? (I'll post the answer in the comments if no one else does in a few days)

(Just in case you wonder, this ontology is based on a the OntOWLClean ontology from Chris Welty, see his paper at FOIS2006 if you like more info)


Comments are still missing on this post.

Zur Macht der Blogger

sympatexter hat einen Eintrag dazu geschrieben, dass sich Blogger gerne für zu wichtig nehmen (als Antwort auf ein Stück von Robert Basic, der darüber schreibt, dass sich Blogger noch nicht wichtig genug nehmen). Als Randbemerkung: es ist amüsant zu sehen, dass ausgerechnet sympatexter auf diesen Missstand hinweißt, insbesondere da die Tagline des eigenen Blogs sympatexter rules the world ist. (Mein Fehler, sorry)

Was ist der Sinn des Bloggens? Das würde vielleicht zu weit führen. Aber einzelne Argumente des sympatexters möchte ich doch genauer beleuchten:

  • "Was die Blogger interessiert, interessiert auch leider NUR die Blogger." Stimmt nicht ganz - oder zumindest würde ich dafür gerne mehr Beleg sehen. Blogger werden von der Werbewirtschaft als Multiplikatoren betrachtet - eine Eigenschaft, die sie nur haben können, wenn mehr Leute Blogs lesen als sie schreiben. Außerdem bloggen viele über allgemein interessante Themen, von Lost über Britney Spears, Verbrauchererfahrungen mit Produkten und Dienstleistungen, die Bundestagswahlen bis hin zu Menschenrechtsverletzungen in Guantanamo oder direkten Berichten aus Krisengebieten im Nahen Osten oder Thailand. Glaubt Ihr nicht? Schaut auf Technorati nach, die haben eine aktuelle Liste von populären Themen. Heutige Favoriten: die Oscars, Antonella Barba, und Al Gore. Alles Themen die auch außerhalb der Blogosphäre relevant sind.
  • Meine Zustimmung zu der Beobachtung bezüglich der Statistiken. Die Zahlen, die in den Medien genannt werden, sind häufig irreführend, aber das ist eine Eigenschaft von Statistiken und Medien. Verfolgt man die Zahlen auf die Quelle, wird man oft enttäuscht sein.
  • "In Deutschland lesen sehr wenige Menschen Blogs." Auch hier hätte ich gerne Zahlen. Ich bin mir sicher, dass ein großer Teil der webnutzenden Bevölkerung schon mal einen Blog gelesen hat, schlicht, weil sie bei Anfragen bei den Suchmaschinen häufig auf Blogeinträge stoßen. Vielleicht sind sich die Leser nicht mal bewusst, dass sie einen Blog lesen (ebenso wie der Anteil der Wikipedia-Leser, der nicht weiß, dass die Wikipedia von jedem verändert werden kann, stark zugenommen hat). Einige meiner bestbesuchten Einträge haben mit dem Kochen von Milchreis, den Machenschaften des Kleeblatt-Verlags, und Filmen zu tun. Die Leute, die das Lesen sind nicht die üblichen Leser meines Blogs -- aber ein wichtiger Anteil.
  • "Die meisten Blogs haben noch nicht mal dreistellige Zugriffszahlen pro Tag und werden meistens von Freunden gelesen." Zustimmung, und gleichzeitig die Frage: na und? Ich erwarte ja, dass dieser Blog hier eigentlich nur von Leuten gelesen wird, die mich kennen. Das kann wieder für einzelne Beiträge anders sein, aber im Allgemeinen trifft das zu. Und das, was ich schreibe, interessiert auch meistens nur diese Wenigen -- wenn überhaupt. Aber das ist OK. Blogs werden vielfach dafür verwendet, die Kommunikation zu Freunden und Bekannten, oder gar zur Familie, zu vereinfachen, gar zu ermöglichen, oder sie schlicht aufrechtzuerhalten. Und das ist gut so. Nicht jeder Blog muss Hunderttausende von Lesern haben, das wäre nicht mal möglich. Man darf halt als Blogger dann aber auch nicht erwarten, dass Hunderttausende lesen und durch die Einträge beeinflusst werden.
  • "Etwas zu verlinken, was älter als eine Woche ist, ist ja schon fast Blasphemie - so versinkt das meiste, kaum wahrgenommen, in den Archiven." Zurecht beanstandet. Man sollte häufiger in die Archive verlinken, und strukturierte Einträge machen, die langfristig von Interesse sind. Semantische Technologien, wie ich sie auch in meiner Arbeit entwickle, sollen auch konkret an diesen Baustellen arbeiten. Ein Probekapitel zu semantischen Blogs und Wikis aus einem jüngst erschienen Buch über Wikis und Blogs gibt dazu ein wenig Einsicht, wie man sich das vorstellen kann. Leider sind nur die ersten 8 Seiten online verfügbar. (Achtung Werbung!) Kauft das Buch! (Werbung Ende) Solche Technologien sollen helfen, Blogeinträge dann verfügbar zu machen, wenn sie relevant sind. Einen ersten Vorgeschmack bietet die Firefox Extension Blogger Web Comments von Google.

Letztlich aber bleibt ein Argument vor allem: selbst wenn es wenige lesen, und es viel zu häufig Nabelschau ist, was die Blogger machen -- dieser Eintrag mit eingeschlossen, ironischerweise -- ist das Bloggen eine Technik, die es zum ersten Mal in der Geschichte der Menschheit tatsächlich so vielen Leuten konkret ermöglicht, aktiv eine Stimme zu haben. Ob das, was diese Leute damit anfangen, gut ist oder nicht, dass sei eine Entscheidung des Einzelfalls. Aber allein die Tatsache, dass heute Klein-Gretchen aus Hintertupfingen ihre handgekrakelten Bilder hochladen kann, und sie sofort weltweit zugänglich sind, ist ein Schritt auf dem Weg zu einer globalen Gesellschaft. Ein kleiner, ja, aber ein notwendiger und auch wichtiger.

Talk in Korea

If you're around this Tuesday, February 13th, in Seoul, come by the Semantic Web 2.0 conference. I had the honour to be invited to give a talk on the Semantic Wikipedia (where a lot is happening right now, I will blog about this when I come back from Korea, and when the stuff gets fixed).

Looking forward to see you there!

Mail problems

The last two days my mail account had trouble. If you could not send something to me, sorry! Now it should work again.

Since it is hard to guess who tried to eMail me in the last two days (I guess three persons right), I hope to reach some this way.

Building knowledge together - extended

In case you did not notice yet -- the CKC2007 Workshop on Social and Collaborative Construction of Structured Knowledge at the WWW2007 got an extended deadline due to a number of requests. So, you have time to rework your submissions or finish yours! Also the demo submission deadline is upcoming. We want to have a shootout of the tools that have been created in the last few years, and get hands on to the differences, problems, and best ideas.

See you in Banff!

Nutkidz Jubiläum

Die 50. Folge der nutkidz ist erschienen! Und zum Jubiläum haben wir uns was besonderes einfallen lassen.

Viel Spaß! Und übrigens, ja, sie erscheinen wieder regelmäßig. Zwar nur monatlich, aber immerhin regelmäßig. Und wieviele monatliche Webcomics gibt es schon?

Collaborative Knowledge Construction

The deadline is upcoming! This weekend the deadline for submissions to the Workshop on Social and Collaborative Construction of Structured Knowledge at the WWW2007 will be over. And this may be easily the hottest topic of the year, I think: how do people construct knowledge in a community?

Ontologies are meant to be shared conceptualizations -- but how many tools really allow to build ontologies in a widely shared manner?

I am especially excited about the challenge that comes along with the workshop, to examine different tools, and to see how their perform. If you have a tool that fits here, write us.

So, I know you have thought a lot about the topic of collaboratively building knowledge -- write your thoughts down! Send them to us! Come to Banff! Submit to CKC2007!

Was für ein Zufall!

Ich schreibe einem Kollegen in den Niederlanden. Der Antwortet mir, dass er bald nach Barcelona zieht, auf eine neue Stelle. Keine zwei Minuten später schickt mir Sixt eine eMail mit einem Spezialangebot, Hotel und Mietwagen für drei Tage Barcelona für nur X Euro.

Was für ein Zufall!

Semantic MediaWiki goes business

... but not with the developers. Harry Chen writes about it, and several places copy the press release about Centiare. Actually, we didn't even know about it, and were a bit surprised to hear that news after our trip to India (which was very exciting, by the way). But that's OK, and actually, it's pretty exciting as well. I wish Centiare all the best! Here is their press release.

They write:

Centiare's founder, Karl Nagel, genuinely feels that the world is on the verge of an enormous breakthrough in MediaWiki applications. He says, "What Microsoft Office has been for the past 15 years, MediaWiki will be for the next fifteen." And Centiare will employ the most robust extension of that software, Semantic MediaWiki.

Wow -- I'd never claim that SMW is the most robust extension of MediaWiki -- there are so many of them, and most of them have a much easier time of being robust! But the view of MediaWiki taking the place of Office -- Intriguing. Although I'd put my bets rather on stuff like Google Docs (former Writely), and add some semantic spice to it. Collaborative knowledge construction will be the next big thing. Really big I mean. Oh, speaking about that, check out this WWW workshop on collaborative knowledge construction. Deadline is February 2nd, 2007.

Click here for more information about Centiare.


Comments are still missing on this post.

Goldener Würfel

Nein, kein Rollenspielpreis, das war der Goldene Becher. Vielmehr hatte Schwesterchen von der Post einen Brief abgeholt, während ich in Indien war, und nun, da ich zurück bin, habe ich ihn endlich aufgemacht, und deswegen kann ich jetzt nicht über Indien schreiben sondern widme mich diesem Brief.

Der Inhalt? Ein Würfel, scheinbar mit goldener Folie überzogen, und wo die Eins sein sollte ist ein kleines Bild von etwas kaum erkennbaren. Ich dachte im ersten Moment, es sei eine Rorschach-Figur. Nachdem Schwesterchen ja schon letztes Jahr beim Hustle-the-Sluff dabei war, nehme ich an, dass es diesmal etwas ähnliches ist. Also, ab zu Googles Blogsearch, und danach gesucht, und, wer sagt's denn, gleich ein Treffer, bei Daniel Gramsch, dem Zeichner von Alina Fox.

Aus einem Kommentar ist zu entnehmen, dass Daniel Rüd ebenfalls einen solchen Brief enthalten hat, aber noch nicht darüber gebloggt hat. Ein weiterer Kommentar von Angie enthält sogar einen Link, auf dem man erkennt, dass der vermeintliche Rorschachtest doch eine Kuppel ist, von Schloss Charlottenburg in Berlin, mit der Fortuna drauf. Passend für einen Würfel, durchaus.

Laut dem Fuchsbau deutet das ganze auf ein sogenanntes Alternate Reality Game hin, eine Art elaborierter Mischung aus Live-Rollenspiel und Schnitzeljagd. Etwas, für das ich zur Zeit überhaupt nicht die Zeit habe, aber dann wiederum klingt es so spannend, dass ich sehr viel Lust darauf habe. Ich habe erst unlängst das spannende Buch Convergence Culture von Henry Jenkins vom MIT CMS verschlungen, in dem er solche und ähnliche Phänomene beschreibt.

Drum fänd ich's spannend, doch dabei zu sein. Also, auf zu der von Angie entdeckten Webseite, um mehr Informationen auszugraben. Angie, wenn Du das liest -- wie hast Du die Seite ausfindig gemacht? Und wer bist Du?

Wer sonst hat noch einen Würfel erhalten?


Template:Comments missing

Frohes Neues Jahr!

Letztes Jahr schrieb ich einen langen 2005er Abschiedseintrag. Dieses Jahr nicht, nicht etwa weil das Jahr nicht gut zu mir war -- es war sehr gut zu mir! -- sondern einfach, weil mir die Zeit fehlt. Ich muss noch packen. In weniger als zwei Stunden beginnt meine Reise nach Indien. Nein, nicht wegen der Arbeit, ganz privat. Langsam werde ich nervös, dass ich das mit dem Packen nicht mehr packe.

2006 war unglaublich gut. Ich konnte von mancher Arbeit die Früchte ernten, und andere Pflanzen weiter wachsen sehen. 2007 und 2008 stehen dann weitere Ernten an. Allerdings fällt mir auf, dass ich wahrscheinlich einer der schlechtesten Blogger des Planeten bin. Da schreibe ich an Büchern mit, und hier erwähne ich die nicht mal! Das werde ich nächstes Jahr nachholen müssen. Allerdings hängt das auch ein wenig mit dem geplanten Relaunch von Nodix zusammen. Viele Inhalte warten noch darauf, dass ich sie wieder hochlade, aber das mache ich erst, wenn ich die neue Software eingerichtet habe. Den Wunschtermin - zum Nodix-Geburtstag - werde ich wohl nicht mehr schaffen, schade. Aber ich lenke wieder ab, ich sollte wirklich packen. Wir hören uns ja wieder, in ein paar Wochen. Vielleicht erzähle ich sogar von Indien. Und von ein paar anderen Reisen von diesem Jahr. Zu erzählen gäbe es zumindest manches.

Allen Lesern meine besten Wünsche zum Neuen Jahr! Allen eine schöne Feier, einen guten Rutsch, mögen ein paar Eurer Wünsche für 2007 in Erfüllung gehen.

Five things you don't know about me

Well, I don't think I have been tagged yet, but I could be within the next few days (the meme is spreading), and as I won't be here for a while, I decided to strike preemptively. If no one tags me, I assume to take one of danah's.

So, here we go:

  1. I was born without fingernails. They grew after a few weeks. But nevertheless, whenever they wanted to cut my nails when I was a kid, no one could do it alone -- I always panicked and needed to be held down.
  2. Last year, I contributed to four hardcover books. Only one of them was scientific. The rest were modules for Germany's most popular role playing game, The Dark Eye.
  3. I am a total optimist. OK, you knew that. But you did not know that I actually tend to forget everything bad. Even in songs, I noticed that I only remember the happy lines, and I forget the bad ones.
  4. I co-author a webcomic with my sister, the nutkidz. We don't manage to meet any schedule, but we do have a storyline. I use the characters quite often in my presentations, though.
  5. I still have an account with Ultima Online (although I play only three or four times a year), and I even have a CompuServe Classic account -- basically, because I like the chat software. I did not get rid of my old PC, because it still runs the old CompuServe Information Manager 3.0. I never figured out how to run IRC.

I bet no one of you knew all of this! Now, let's tag some people: Max, Valentin, Nick, Elias, Ralf. It's your turn.


Comments are still missing on this post.

Semantic Web patent

Tim Finin and Jim Hendler are asking about the earliest usage of the term Semantic Web. Tim Berners-Lee (who else?) spoke about the need of semantics in the web at the WWW 1994 plenary talk in Geneva, though the term Semantic Web does not appear there directly. Whatever. What rather surprised me, though, is, when surfing a bit for the term, I discovered that Amit Sheth, host of this year's ISWC, filed the patent on it, back in 2000: System and method for creating a Semantic Web. My guess would be, that is the oldest patent of it.

Der am schnellsten gebrochene Vorsatz

  1. Keine Vorsätze haben.

Schneller als den kann man keinen Vorsatz brechen.

Supporting disaster relief with semantics

Soenke Ziesche, who has worked on humanitarian projects for the United Nations for the last six years, wrote an article for xml.com on the use of semantic wikis in disaster relief operations. That is a great scenario I never thought about, and basically one of these scenarios I think of when I say in my talks: "I'll be surprised if we don't get surprised by how this will be used." Probably I would even go to state the following: if nothing unexpected happens with it, the technology was too specific.

Just the thought that semantic technology in general, and maybe even Semantic MediaWiki in particular, could relief the effects of a natural disaster, or maybe even safe a life, this thought is so incredible exciting and rewarding. Thank you so much Soenke!

All problems solved

Today I feel a lot like the nameless hero from the PhD comics, and what is currently happening to him (begin of the storyline, continuation, especially here, and very much like here, but pitily, not at all like here). Today we had Boris Motik visiting the AIFB, who is one of the brightest people on this planet. And he gave us a more than interesting talk on how to integrate OWL with relational databases. What especially interested me was his great work on constraints -- especially since I was working on similar issues, unit tests for ontologies, as I think constraints are crucial for evaluating ontologies.

But Boris just did it much cleaner, better, and more thorough. So, I will dive into his work and try to understand it to see, if there is anything left to do for me, or if I have to refocus. There's still much left, but I am afraid the most interesting part from a theoretic point is solved. Or rather, in the name of progress, I am happy it is solved. Let's get on with the next problem.

(I *know* it is my own fault)

Semantic Wikipedia presentations

Last week on the Semantics 2006 Markus and I gave talks on the Semantic MediaWiki. I was happy to be invited to give one of the keynotes at the event. A lot of people were nice enough to come to me later to tell me how much they liked the talk. And I got a lot of requests for the slides. I decided to upload them, but wanted to clean them a bit. I am pretty sure that the slides are not self-sufficient -- they are tailored to my style of presentations a lot. But I added some comments to the slides, so maybe this will help you understand what I tried to say if you have not been in Vienna. Find the slides of the Semantics 2006 keynote on Semantic Wikipedia here. Careful, 25 MB.

But a few weeks ago I was at the KMi Podium for an invited talk there. The good thing is, they don't have just the slides, they also have a video of the talk, so this will help much more in understanding the slides. The talk at KMi has been a bit more technical and a lot shorter (different audiences, different talks). Have fun!

Rollenspiel und Web 2.0

Letzte Woche hielt ich in Wien einen Vortrag auf der Semantics 2006. Danach wurde ich um ein Radio-Interview gebeten, und dabei sprachen wir über das Semantic Web, Web 2.0 und ähnliche Themen -- meine Arbeit halt. Semantic Web, das Web der Daten, Web 2.0, das Mitmach-Web (ganz grob).

Plötzlich aber wechselte die Reporterin das Thema, meinte, ich würde ja auch an Deutschlands beliebtestem Rollenspiel Das Schwarze Auge arbeiten. Ob ich den Zuhörern erklären könnte, was denn Rollenspiel sei. Und da erklärte ich Rollenspiel als Geschichtenerzählen 2.0 -- Geschichtenerzählen zum Mitmachen, wo es darum geht, in der Gruppe eine gemeinsame Geschichte zu erzählen.

Na, wenn das mal keine neue Definition ist.

Zeitverschiebung

Es ist Mittag in Hawaii, und ich bin müde! Herrje.

Liegt wahrscheinlich daran, dass ich in Karlsruhe bin.

Semantic MediaWiki 0.6: Timeline support, ask pages, et al.

It has been quite a while since the last release of Semantic MediaWiki, but there was enormous work going into it. Huge thanks to all contributors, especially Markus, who has written the bulk of the new code, reworked much of the existing, and pulled together the contributions from the other coders, and the Simile team for their great Timeline code that we reused. (I lost overview, because the last few weeks have seen some travels and a lot of work, especially ISWC2006 and the final review of the SEKT project I am working on. I will blog on SEKT more as soon as some further steps are done).

So, what's new in the second Beta-release of the Semantic MediaWiki? Besides about 2.7 tons of code fixes, usability and performance improvements, we also have a number of neat new features. I will outline just four of them:

  • Timeline support: you know SIMILE's Timeline tool? No? You should. It is like Google Maps for the fourth dimension. Take a look at the Timeline webpage to see some examples. Or at ontoworld's list of upcoming events. Yes, created dynamically out of the wiki data.
  • Ask pages: the simple semantic search was too simple, you think? Now we finally have a semantic search we dare not to call simple. Based on the existing Ask Inline Queries, and actually making them also fully functional, the ask pages allow to dynamically query the wiki knowledge base. No more sandbox article editing to get your questions answered. Go for the semantic search, and build your ask queries there. And all retrievable via GET. Yes, you can link to custom made queries from everywhere!
  • Service links: now all attributes can automatically link to further resources via the service links displayed in the fact box. Sounds abstract? It's not, it's rather a very powerful tool to weave the web tighter together: service links specify how to connect the attributes data to external services that use that data, for example, how to connect geographic coordinates with Yahoo maps, or ontologies with Swoogle, or movies with IMdb, or books with Amazon, or ... well, you can configure it yourself, so your imagination is the limit.
  • Full RDF export: some people don't like pulling the RDF together from many different pages. Well, go and get the whole RDF export here. There is now a maintenance script included which can be used via a cron job (or manually) to create an RDF dump of the whole data inside the wiki. This is really useful for smaller wikis, and external tools can just take that data and try to use it. By the way, if you have an external tool and reuse the data, we would be happy if you tell us. We are really looking forward to more examples of reuse of data from a Semantic MediaWiki installation!

I am looking much forward to December, when I can finally join Markus again with the coding and testing. Thank you so very much for your support, interest, critical and encouraging remarks with regards to Semantic MediaWiki. Grab the code, update your installation, or take the chance and switch your wiki to Semantic MediaWiki.

Just a remark: my preferred way to install both MediaWiki and Semantic MediaWiki is to pull it directly from the SVN instead of taking the releases. It's actually less work and helps you tremendously in keeping up to date.

Semantic Web Challenge 2006 winners

Sorry for the terseness, but I am sitting in the ceremony.

18 submissions. 14 passed the minimal criteria.

Find more information on challenge.semanticweb.org -- list of Finalists, links, etc. See also on ontoworld.

And the winners are ...

3. Enabling Semantic Web communities with DBin: an overview (by Christian Morbidoni, Giovanni Tummarello, Michele Nucci)

2. Foafing the Music: Bridging the semantic gap in music recommendation (by Oscar Celma)

1. MultimediaN E-Culture demonstrator (by Alia Amin, Bob Wielinga, Borys Omelayenko, Guus Schreiber, Jacco van Ossenbruggen, Janneke van Kersen, Jan Wielemaker, Jos Taekema, Laura Hollink, Lynda Hardman, Marco de Niet, Mark van Assem, Michiel Hildebrand, Ronny Siebes, Victor de Boer, Zhisheng Huang)

Congratulations! It is great to have such great projects to show off! :)


Comments are still missing on this post.

ISWC 2008 coming to Karlsruhe

Yeah! ISWC2006 is just starting, and I am really looking forward to it. The schedule looks more than promising, and Semantic MediaWiki is among the finalists for the Semantic Web Challenge! I will write more about this year's ISWC the next few days.

But, now the news: yesterday it was decided that ISWC2008 will be hosted by the AIFB in Karlsruhe! It's a pleasure and a honor -- and I am certainly looking forward to it. Yeah!


Comments are still missing on this post.

Semantic Web and Web 2.0

I usually don't just point to other blog entries (thus being a bad blogger regarding netiquette), but this time Benjamin Nowack nailed it in his post on the Semantic Web and Web 2.0. I read the iX article (a popular German computer technology magazine), and I lost quite some respect for the magazine as there were so many unfounded claims, off-the-hand remarks, and so much bad attitude in the article (and in further articles scuttered around the issue) towards the Semantic Web that I thought the publisher was personally set on a crusade. I could go through the article and write a commentory on it, and list the errors, but honestly, I don't see the point. At least it made me appreciate peer review and scientific method a lot more. The implementation of peer review is flawed as well, but I realize it could be so much worse (and it could be better as well - maybe PLoS is a better implementation of peer review).

So, go to Benji's post and convince yourself: there is no "vs" in Semantic Web and Web 2.0.

Java developers f*** the least

Andrew Newman conducted a brilliant and significant study on how often programmers use f***, and he splitted it on programming languages. Java developers f*** the least, whereas LISP programmers use it on every fourth opportunity. In absolute term, there are still more Java f***s, but less than C++ f***s.

Just to add a further number to the study -- because Andrew unexplicably omitted Python -- here's the data: about 196,000 files / 200 occurences -> 980. That's the second highest result, placing it between Java and Perl (note that the higher the number, the less f***s -- I would have normalized that by taking it 1/n, but, fuck, there's always something to complain).

Note that Google Code Search actually is totally inconsisten with regards to their results. A search for f*** alone returns 600 results, but if you look for f*** in C++ it returns 2000. So, take the numbers with more than a grain of salt. The bad thing is that Google counts are taken as a basis for a growing number of algorithms in NLP and machine learning (I co-authored a paper that does that too). Did anyone compare the results with Yahoo counts or MSN counts or Ask counts or whatever? This is not the best scientific practice, I am afraid. And I comitted it too. Darn.


Comments are still missing on this post.

Meeting opportunities

I read in an interview in Focus (German) with Andreas Weigend, he says that publishing his travel arrangements in his blog helped him meet interesting people and allow for unexpected opportunities. I actually noticed the same thing when I wrote about coming to Wikimania this summer. And those were great meetings!

So, now, here are the places I will be in the next weeks.

  • Oct 18-Oct 20, Madrid: SEKT meeting
  • Oct 22-Oct 26, Milton Keynes (passing through London): Talk at KMi Podium, Open University, on Semantic MediaWiki. There's a webcast! Subscribe, if you like.
  • Oct 30-Nov 3, Montpellier: ODBASE, and especially OntoContent. Having a talk there on Unit testing for ontologies.
  • Nov 5-Nov 13, Athens, Georgia: ISWC and OWLED
  • Nov 15-Nov 17, Ipswich: SEKT meeting
  • Nov 27-Dec 1, Vienna: Keynote at Semantics on Semantic Wikipedia
  • Dec 13-17, Ljubljana: SEKT meeting
  • Dec 30-Jan 10, Mumbai and Pune: the travel is private, but this doesn't mean at all we may not meet for work if you're around that part of the world

Just mail me if you'd like to meet.

RDF in Mozilla 2

I read last week Brendan Eich's post on Mozilla 2, where he said that with Moz2 they hope that they can "get rid of RDF, which seems to be the main source of "Mozilla ugliness"". Danny Ayers commented on this, saying that RDF "should be nurtured not ditched."

Well, RDF in Mozilla always was crappy, and it was based on a pre-1999-standard RDF. No one ever took up the task -- remind you, it's open source -- to dive into the RDF inside Mozilla and repair and polish it, make it a) compatible to the 2004 RDF standard, (why didn't RDF get version numbers, by the way?) and b) clean up the code and make it faster.

My very first encounter with RDF was through Mozilla. I was diving into Mozilla as a platform for application development, which seemed like a very cool idea back then (maybe it still is, but Mozilla isn't really moving into this direction). RDF was prominent there: as an internal data structure for the developed applications. Let's repeat that: RDF, which was developed to allow the exchange of data on the web, was used within Mozilla as an internal data structure.

Surprisingly, they had performance issues. And the code was cluttered with URIs.

I still think it is an interesting idea. In last year's Scripting for the Semantic Web workshop I presented an idea on integrating semantic technologies tightly into your programming. Marian Babik and Ladislav Hluchy picked that idea up and expanded and implemented the work much better than I ever could.

It would be very interesting how this combination could really be put to work. An internal knowledge base for your app that is queried via SPARQL. Doesn't sound like the most performant idea. But then -- imagine not having one tool accessing that knowledge base. But rather a system architecture with a number of tools accessing that knowledge base. Adding data to existing data. Imagine just firing up a newly installed Adressbook, and -- shazam! -- all your data is available. You don't like it anymore? Switch back. No changes lost. Everything is just views.

But then again, didn't the DB guys try to do the same? Yes, maybe. But with semantic web technology, shouldn't it be easier to do it, because it is meant to integrate?

Just thinking. Would love to see a set of tools based on a plugabble knowledge base. I'm afraid, performance would suck. I hope we will see.


Comments are still missing on this post.

Das Stöckchen behalte ich

So. Ich fange das nächste Stöckchen auf, von Buddy. Also los.

Fünf Dinge, die ich nicht habe, aber gerne hätte

  1. Weltfrieden.
  2. Das Semantic Web.
  3. Einen Roman. Selbstgeschrieben, fertig, und mit Verleger. Und richtig gut.
  4. Ein Direktes Neuronales Interface zur Matrix. Ich meine zum Web.
  5. Die Eine Richtige.
  6. Sechs Richtige.
  7. Die Fähigkeit bis Fünf zu zählen.

Die Reihenfolge enspricht nicht der Relevanz.

Fünf Dinge, die ich habe, aber lieber nicht hätte

  1. Keine Ahnung.
  2. Eine schlecht gewartete Website.
  3. Zu wenig Zeit.
  4. Mir wird schlecht, wenn ich Gurken esse.
  5. Ein paar Kilogramm.

Fünf Dinge, die ich nicht habe, aber auch nicht haben möchte

  1. Allzuviel gesunder Menschenverstand.
  2. Verzweiflung.
  3. Langeweile.
  4. Geldsorgen.
  5. Die Falsche.

Fünf Dinge, die ich habe und aus keinem Grund der Welt missen möchte

  1. Schwesterchen.
  2. Freunde.
  3. Meine Doktorandenstelle am AIFB.
  4. Optimismus. Eine Menge.
  5. Musik.

Fünf Menschen, die dies noch nicht beantwortet haben, von denen ich mir das aber wünsche

  1. Der Papst.
  2. Die Bundeskanzlerin. In ihrem nächsten Podcast.
  3. John Lennon.
  4. Lisa Simpson.
  5. Der unbekannte Soldat.

Enjoying dogfood

I guess the blog's title is a metaphor gone awry... well, whatever. About two years ago at Semantic Karlsruhe (that is people into semantic technologies at AIFB, FZI, and ontoprise) we installed a wiki to support the internal knowledge management of our group. It was well received and used. We also have an annual survey about a lot of stuff within our group, to see where we can improve. So last year we found out, that the wiki was received OK. Some were happy, some were not, nothing special.

We switched to a Semantic MediaWiki since last year, which was a labourous step. But it seems to have payed off: recently we had our new survey. The scale goes from very satisfied (56%) over satisfied (32%) and neutral (12%) to unsatisfied (0%) and very unsatisfied (0%).

You can tell I was very happy with these results -- and quite surprised. I am not sure, how much the semantics made the difference, or maybe we did more "gardening" on the wiki since we felt more connected to it, and gardening cannot be underestimated, both in importance and resources. In this vein, very special thanks to our gardeners!

One problem a very active gardener pointed out not long ago was that gardening gets hardly appreciated. How do other small-scale or intranet wiki tackle this problem? Well, in Wikipedia we have the barnstars, and related cool stuff -- but that is basically working because we have an online community and online gratification is OK for that (or do I misunderstand the situation here? I am not a social scientist). But how to translate that to an actually offline community that has some online extensions like in our case?

Berlin, Berlin, ich war jetzt in Berlin...

... und nicht in Olpe. Da fahre ich wohl in ein paar Wochen hin. Aber aus einem ganz anderen Grund. Dafür fliegt Schwesterchen nach Berlin. Aber auch aus einem ganz anderen Grund.

Und Berlin ist sehr wow. Der Reichstag, das Bundeskanzleramt, riesige, beeindruckende Gebäude; die gewaltigen Parks; atemberaubend: das Pergamonmuseum. Einfach unglaublich. Ich habe immer gedacht, der Pergamonaltar wäre halt ein Altar, also ein Steinblock. Niemand hat mir gesagt, dass sie den ganzen Tempel auseinandergeschnitten haben und nach Berlin gebracht haben! Unglaublich. Und dann geht man weiter, tritt durch einen Eingang, und denkt sich, hmm, irgendwie habe ich Blau im Nacken. Und da dreht man sich um, und ist gerade durch das Ischtartor getreten. Und es sieht noch viel beeindruckender aus, als in den Büchern. Ist halt auch etwas größer als in den Büchern.

Und Nofretete ist auch hübsch. Nicht ganz so hübsch wie Ginevra, also bleibt letztere mein Desktop-Hintergrund.

Und das Mauermusuem hat mich mehrfach auf meine Tränendrüse gedrückt. Wahnsinnsgeschichten. Und auf dem Weg zu einem allgemeinem Menschenrechtsmuseum. Mit Info zu Gandhi, zu dem Aufstand in Ungarn, zu den Religionen und ihren Gemeinsamkeiten, und vieles mehr. Aber halt auch: wie ist die Mauer gebaut? Die Helden, die Tunnel bauten. Die Helden, die trotz Befehl nicht auf die Flüchtlinge schossen. Die Helden, die den Stacheldraht hochheben, damit das Kind durchkommt. Die Helden, die stark genug waren, keine Gewalt anzuwenden. Ich weiß nicht, ob ich die selbe Stärke hätte.

Habe auch ein Stück Mauer gekauft. Und weggeschickt. Weil die Mauer muss weg.

Genau.

Ach ja, und die Arbeit war auch cool. Wahrscheinlich eine der coolsten Locations überhaupt: Unter den Linden 1. Direkt an der Museumsinsel.

Apricot ist keine Farbe

Zumindest nicht für Männer. Es gibt aber einen kleinen Anteil Frauen, die vier statt drei verschiedene Zapfen im Auge haben. Dadurch können sie tatsächlich mehr verschiedene Farben sehen (bis zu 100 Millionen verschiedene Farben -- statt nur eine Million bei uns Männern, und den meisten Frauen, nur etwa 2-3% der Frauen sind sogenannte Tetrachromanten). Es sei unvorstellbar, wie das aussehe (ja, das kann ich mir vorstellen, ich meine, nicht vorstellen).

Blöd. Ich fühle mich jetzt voll benachteiligt. Gott ist doch Sexist, verflixtnochmal.

Wetten, dass...

„Hast Du die Titelseite gesehen? Schon wieder eine Story über den 11. September…“
„Weißt Du, solange es noch um 2001 geht, ist doch alles in Ordnung.“
„Das ist doch nur für die Medien. Meinst Du, die schlagen wieder zu?“
„Zum Jahrestag?“
„Ja. Wäre doch der Hammer.“
„Ich weiß nicht. Das ist sicher zu vorhersagbar. Alle werden vorbereitet sein.“
„Eben! Stell Dir vor, das gelingt ihnen! Damit zeigen sie, wie fett sie sind.“
„Ja, schon, aber ich glaube, es klappt nicht.“
„Ich würde wetten…“
„Was?!“
„50 Euro. Am 11. September kommt es zu einem weiteren Anschlag.“
„Diesen Jahres?“
„Ja.“
„Ich weiß nicht…“
„Na gut, ich gewinne nur, wenn es mehr als 200 Tote gibt, sonst zählt es nicht.“
„Hmm… abgemacht.“

* * *

… die Polizei war abgelenkt durch einen Bombenalarm im Hauptbahnhof … wegen einem vergessenen Koffer… ein terroristischer Hintergrund … eine Reihe von Explosionen erschütterte das Brandenburger Tor, bevor es schließlich in sich zusammenstürzte … die ersten Schätzungen gehen von mehr als 100 Toten aus … die wollen doch Krieg! Dann geben wir ihnen Krieg! … niemand hätte erwartet, dass ausgerechnet Deutschland … die Bundesregierung wurde in Sicherheit gebracht … die ganze Nation konnte zusehen, wie das Brandenburger Tor … vermutet Al-Quaida … wurde eine Moschee angegriffen … Live Schaltung nach Washington … uneingeschränkte Solidarität mit unseren deutschen Brüdern und Schwestern … tiefe Bestürzung … einziges Ziel der sukzessiven Explosionen war, den Fernsehteams genug Zeit … bestätigt, dass es bislang 200 Todesopfer gab, und eine weitere Frau noch um ihr Leben kämpft …

* * *

Er zündete sich eine Zigarette an. Seine Finger zitterten. Er stand im Treppenhaus des Hospitals. Hier durfte er rauchen. Und die Ruhe genießen. Das dauernde piep, piep, piep zehrte an seinen Nerven. Seine Mutter. Seine Mutter kämpfte um das Leben.
Die Tür ging auf. Sein bester Freund stand da, schaute etwas blass. Er holte seinen Geldbeutel raus, und drückte ihm zwei Zwanziger in die Hand.
„Fehlen noch Zehn. Kriegst Du morgen im Club.“

Er stieg die Treppe runter. Die Tür unten schloss mit einem lauten, tiefen Schlag.

Noch ein Stöckchen

Diesmal von Schwesterchen.

Wann stehst Du zur Arbeit auf?

Sehr unterschiedlich. Manchmal erst zu Mittag, manchmal um 5:45.

Stehst Du rechtzeitig auf oder bleibst Du bis zur letzten Minute liegen?

Ich bleibe so lange liegen wie möglich.

Wie viele Wecker hast Du?

Einen. Seit meiner Grundschulzeit der selbe. So eine grüner, würfelförmiger Plastikradiowecker mit roter LED-Digitalanzeige.

Machst Du Frühsport?

Sollte ich mal anfangen.

Frühstückst Du? Wenn ja, was?

Meistens. Habe gehört, dass es gesund sein soll.

Wie fährst Du zur Arbeit?

Zunächst mit der U-Bahn zum Hauptbahnhof, dann von dort aus mit dem IC nach Karlsruhe, dort mit der Bahn zur Uni. Und das heute seit genau 2 Jahren. Habe Jubiläum :)

So, das Stöckchen gebe ich jetzt, nachdem es erledigt ist, Pflichtgemäß an Ralf und Cindy weiter. Ich gehe davon aus, dass wir Arbeit durch Uni ersetzen können. Außerdem sollte Cindy mal wieder was schreiben.

Hmm, eigentlich würde ich bei Ralf ja gerne den Blogeintrag verlinken, in dem er antwortet. Da dieser aber noch nicht geschrieben ist, geht das nicht. Blödes Web. Sollte man mal was erfinden. Ich kann ja raten, wo die Antwort auf dieses Stöckchen sein wird.

Das Lied vom Herrn Baron

Das Diadem von Elfenhand ist ein DSA Abenteuer von mir, welches in der Abenteuersammlung Leicht verdientes Gold erschienen ist. In dem Abenteuer spielt ein Lied eine wichtige Rolle, und so wurde es das erste (und bisher einzige) DSA Abenteuer mit Noten (Apropos Noten: wer es gespielt hat, kann auf dem Wiki Aventurica seine Meinung dazu abgeben - auch als Spielleiter)

So, hier ein besonderer Autorenservice (nachdem ich heute wieder mal nach der Melodie gefragt wurde), das Midi file zu Der Baron.

Viel Spaß bei dem Abenteuer! Ganz klar mein bestes DSA Abenteuer (bisher... ;)

Geht doch

Kroatien besiegt den Fußballweltmeister Italien 2:0. Gestern abend. Im Fußball.

Semantic MediaWiki officially Beta

Semantic MediaWiki has gone officially Beta. Markus Krötzsch released Version 0.5 yesterday -- download it at Sourceforge and update your installation!

Markus and I are both busy today updating existing installations (and creating new ones -- greetings towards California!). The new version has several new features:

  • One can reuse existing Semantic Web vocabulary, like FOAF. This feature is used so strongly, it led to Swoogle actually believing, FOAF was defined at ontoworld!
  • The unit code was improved a lot -- one can define linear units now from inside the wiki. Africa has a list of all African countries, and you can see their size neatly listed.
  • New datatypes for URLs and Emails.
  • Better code (why we dare to call us Beta)

Check out the new features on ontoworld. Thanks for Markus and S for the coding marathon this weekend, that allowed to make this new release! We are fetching bugs now, and planning 0.6, and our first big stress tests, with lots and lots of data...


Comments are still missing on this post.

Volver

Pedro Almodóvar macht mal wieder einen Film, und natürlich wird er schön. Wo La mala educación auf schöne Männer setzte, spielen diesmal Frauen die Hauptrolle. Und Nebenrollen. Eigentlich, alle Rollen. In Volver, wie schon in La mala educación, sind Männer nur diejenigen, die Unheil über die Welt bringen. Frauen sind diejenigen, die danach alles verkomplizieren. Äußerst scharfsinnige Beobachtung.

Die Geschichte ist so schön kompliziert und verkorkst, dass sie schon fast wahr sein muss. Der Vergangenheit kann man nicht entfliehen, sie holt dich immer wieder ein -- vielleicht ist das die Hauptaussage des Films. Paul Valéry sagte mal, dass wir der Zukunft mit dem Rücken voran entgegengehen, dass wirnicht sehen, was uns erwartet, sondern nur das, was in der Vergangenheit geschehen ist. Und genau so handeln die Helden hier. Wann auch immer es um die Zukunft geht, verschieben sie das Gespräch auf morgen. Sie möchten die Gegenwart meistern. Die Vergangenheit überwinden. Und erst dann sehen wir weiter.

Passend dazu stellt der Film auch die morbide Faszination der Spanier mit dem Tod dar, vor allem in den kleinen Orten. Aber warum das Bild mit den Windmühlen? Teilweise wurde es in der Mancha aufgenommen, wo auch ein anderer Held, der mit Windmühlen zu tun hatte, herstammt. Doch scheint mir, rennt in diesem Film keiner gegen Windmühlen an. Bei solchen Szenen habe ich immer das Gefühl, etwas verpasst zu haben. Na was soll's, der Film war auch so schön genug.

Drama. Komplizierte Familienkiste, bei der der nächste Schritt ein wenig zu leicht zu durchschauen ist (aber das ist ja bei jeder realistischen Familienkiste so), dafür aber der übernächste Schritt meist noch eine Überraschung bereithält. Sehenswert.

ACM Review II

Thanks to Mary-Lynn Bragg for answering my complaint about non-accessible reviews so quickly. She sent me the review to read, and I was quite happy with it. Michael Lesk, one of the guys who built Unix, wrote the review, and recommended it to "designers of volunteer or collaborative systems and [...] those studying cooperative work." Thanks, Michael, thanks Mary-Lynn.

Sooo -- read our straightforward and easy to read paper! :)

By the way -- I still think ACM Computing Reviews should offer their reviews for free to the public. This would increase the impact of their reviews dramatically, and also the impact of the reviewed papers.

Schlagzeuger sind ersetzlich

Durch Maschinen. Diese Studie zeigt eine solche Möglichkeit.

Im ernst, wer Percussion mag, wird sicher dieses Video sehr cool finden. Und wer Musik mag auch. Oder Computerzeichentrick. Ich habe gelesen, dass das Programm, welches das Video erzeugt hat, ein beliebiges Midi reinladen konnte. Man musste dann nur noch die Kameraschwenks und Perspektiven bestimmen. Wow.

Weitere solche Videos bei Google. Die Animusic DVD gibt es auch, in deutlich besserer Qualität, zu kaufen.

Besser spät als nie

Auch ich fang das Stöckchen. Wenn auch spät.

Warum bloggst du?

Weil ich manchmal glaube, etwas sagen zu können, das unterhält, nachdenken lässt, erinnert, berührt, zum lachen bringt, informiert, oder auch einfach nur die Zeit etwas angenehmer vorbeigehen lässt.

Seit wann bloggst du?

Nodix gibt es seit dem 15. Januar 2001. Das erste Mal, dass ich das ganze einen Blog nannte, und auch das Aussehen der Seite den üblichen Blogstandards anpasste, war erst am 21. August 2002, als die ersten 10000 Besucher zusammengekommen waren.

Selfportrait?

Eingebildet (glaubt, gut zu sein), verrückt (Arbeitet auch am Wochenende, und hat Spaß dabei), sich in Details verlierend (und dadurch nicht das tun, was er tun sollte), großspurig (die Menschheit retten wollen, und auch zu glauben, es zu können), dickköpfig (lässt sich schwer umüberzeugen), alles machen wollen (und auch von anderen das selbe verlangen), zu selten das Partizip benutzen (es sollte benutzend heißen), und Wikipedianer (behauptet zumindest die Wikipedia).

Ansonsten findet sich unter Denny eine bessere Beschreibung.

Warum lesen Deine Leser Deinen Blog?

Weil ich hin und wieder das schaffe, was ich in Punkt 1 schreibe. Außerdem schreibe ich so selten, dass die meisten Leser dafür nicht sonderlich viel Zeit brauchen.

Welche war die letzte Suchanfrage, über die jemand auf Deine Seite kam?

gutes karma

Das finde ich aber sehr erfreulich: sucht man bei Google.de nach gutem Karma, kommt man zu mir (Platz 3!)

Welcher Deiner Blogeinträge bekam zu Unrecht zu wenig Aufmerksamkeit?

Oh, ich dachte nicht dass Blogeinträge ein Recht auf Aufmerksamkeit haben. Aber wenn man mich so fragt... dass Männer schlauer als Frauen sind, wurde in diesem Blog schon behauptet, mit eben so viel Recht. Ansonsten schlägt auch das Risiko Alter in eine ähnliche Kerbe. Und dann bleibt noch mein liebster Blogeintrag, über das Pendeln.

Dein aktuelles Lieblings-Blog?

nakit-arts. Oh, ich meine be croative!. Da bleibe ich auf dem Laufenden, was daheim passiert, auch wenn ich nicht daheim bin.

An welche vier Blogs wirfst du das Stöckchen weiter und warum?

Uh, ich glaube, Blogger, die ich kenne, haben das Stöckchen schon gefangen. Ich werfe die jetzt einfach mal in die Luft, mal sehen, wer fängt! Linkt hierher, wenn ihr das Stöckchen gefangen habt!

Wikimania 2006 is over

And it sure was one of the hottest conferences ever! I don't mean just because of the 40°C/100°F that we had to endure in Boston, but also because of the further speakers there.

Brewster Kahle, the man behind the Internet Archive, and who started Alexa and WAIS Inc., told us about his plans to digitalize every book (just a few Petabytes), every movie (just a few Petabytes), every record (just a... well, you get the drill), and to make a snapshot of the web every few months, and archive this. Wow.

Yochai Benkler spoke about the Wealth of Networks. You can download his book from his site, or go to a bookstore and get it there. The talk was really inviting to read it: why does a network thingy like Wikipedia work and not suck? How does this change basically everything?

Next day, there was Mitch Kapor, president of the Open Source Application Foundation -- and I am really sorry I had to miss his talk, because at the same time we were giving our workshop on how to reuse the knowledge within a Semantic MediaWiki in your own applications and websites. Markus Krötzsch, travel companion and fellow AIFB PhD student, and basically the wizard who programmed most of the Semantic MediaWiki extension, totally surprised me by being surprised about what you can do with this Semantic Web stuff. Yes, indeed, the idea is to be able to ask another website to put stuff up on yours. And to mush data.

There was David Weinberger, whose talk made me laugh more than I had for a while (and I am quite merry, usually!). I still have to rethink what he actually said, contentwise, but it made a lot of sense, and I took some notes, it was on the structure of knowledge, and how it changes in the new world we are living in.

Ben Shneiderman, the pope on visualization and User Interfaces had an interesting talk on visualizing the Wikipedia. The two talks before his, by Fernanda Viegas and Martin Wattenberg, were really great, because they have visualized real Wikipedia data -- and showed us a lot of interesting data. I hope their tools will become available soon. (Ben's own talk was rather a bit disappointing, as he didn't seem to have the time to take some real data, but only used fake data to show some general possible visualizations. As i had the chance to see him in Darmstadt last year anyway, I didn't see much new stuff).

The party at the MIT Museum was great! Even though I wasn't allow to drink, because I forgot my ID. I'd never think anyone would consider me looking younger than 21. So I take this as the most sincere compliment. Don't bother explaining they had to check my ID even if I looked 110, I really don't want to hear :) I saw Kismet! Pitily, he was switched off.

Trust me. I was kinda tired after this week. It was lots of fun, it was enormously interesting. Thanks to all the Wikipedians, who made Wikipedia and Wikimania possible. Thanks to all these people for organizing this event and helping out! I am looking forward to Wikimania 2007, wherever it will be. The bidding for hosting Wikimania 2007 are open!

Angeblich kritisch

Die Geschichte von Epic. Der Film ist schon etwas älter (äh, 2005?), jetzt aber endlich in einer hervorragenden deutschen Übersetzung zu haben.

Es geht darum, wie sich das Internet weiterentwickeln könnte. Eine recht spannende Version.

ACM Review

ACM Computing Reviews reviewed our paper on the Semantic Wikipedia. Yay! Pitily, I can't read the review because I need a login.

Would love to see what they're saying, but I guess it's not meant to be open to the public. Or to the authors.


Comments are still missing on this post.

Maybe the hottest conference ever

The Wikipedia Hacking Days are over. We have been visiting Siggraph, we had a tour through the MIT Media Lab, some of the people around were Brion Vibber (Wikimedia's CTO), Ward Cunningham (the guy who invented wikis), Dan Bricklin (the guy who invented spreadsheets), Aaron Swartz (a web wunderkind, he wrote the RSS specs at 14), Jimbo Wales (the guy who made Wikipedia happen), and many other people. We have been working at the One Laptop per Child offices, the office to easily the coolest project of the world.

During our stay at the Hacking Days, we had the chance to meet up with the local IBM Semantic Web dev staff and Elias Torres, who showed us the fabulous work they are doing right now on the Semantic Web technology stack (never before rapid application deployment was so rapid). And we also met up with the Simile project people, where we talked about connecting their stuff like Longwell and Timeline to the Semantic MediaWiki. We actually tried Timeline out on the ISWC2006 conference page, and the RDF worked out of the box, giving us a timeline of the workshop deadlines. Yay!

Today started Wikimania2006 at the Harvard Law School. was not only a keynote by Lawrence Lessig, as great as expected, but also our panel on the Semantic Wikipedia. We had an unexpected guest (who didn't get introduced, so most people didn't even realize he was there), Tim Berners-Lee, probably still jetlagged from a trip to Malaysia. The session was received well, and Brion said, that he sees us on the way of getting the extension into Wikipedia proper. Way cool. And we got bug reports from Sir Timbl again.

And there are still two days to go. If you're around and like to meet, drop a note.

Trust me — it all sounds like a dream to me.


Comments are still missing on this post.

At Wikimania 2006

I am here in Boston now, walking the sacred grounds of Harvard and MIT, and listening and talking with the great people who created the MediaWiki software that runs Wikipedia. At Thursday, Markus and I will host the Semantic Wikipedia panel, where we present the state of our implementation, and talk about how to make it real. We also have a tutorial on Sunday, about how to reuse knowledge from a Semantic MediaWiki.

If you're around Boston, MIT, Harvard, or if you even attend Wikimania and the Hacking Days, and want to meet -- contact me!


Comments are still missing on this post.

An wen...

...erinnert mich nur dieser Chatbot?

ESWC2006 is over

I have been the week in Budva, Montenegro, at the ESWC2006. It was lovely. The keynotes were inspiring, the talks had a good quality, and the SemWiki workshop was plain great. Oh, and the Semantic Wikipedia won the best poster award!

But what is much more interesting is the magic that Tom Heath, the Semantic Web Technologies Co-ordinator, managed: the ESWC website is a showcase of Semantic Web technologies! A wiki, a photo annotation tool, a chat, a search, a bibliography server, a rich semantic client, an ontology, the award winning flink... try it out!

Now I am in Croatia, and taking my first real break since I started on the Semantic Web. Offline for three weeks.

Yay.

Daheim

Nach einer Woche in Montenegro bin ich jetzt in Kroatien angekommen. Und genieße.

Offline.

OWL luna, nicer latex, OWL/XML to Abstract Syntax, and more

After a long hiatus, due to some technical problems, finally I could create a new version of the owl tools. So, version 0.27 of the owl tools is now released. It works with the new version of KAON2, and includes six months of bug fixing, but also a number of new features that have introduced a whole new world of new, exciting bugs as well.

The owl latex support was improved greatly. Translation of an owl ontology is done now more careful, and the user can specify much more of the result than before.

A new tool is owl luna - luna like local unique name assumption. It adds an axiom to your ontology that states that all individuals are different from each other. Due to most ontology editors not allowing to do this automatically, here you find a nice maintenance tool to make your ontology much less ambiguous.

The translations of OWL/RDF to OWL/XML and back have been joined in one new tool, called owl syntax, that allows you to translate owl ontologies also to OWL Abstract Syntax, a much nicer syntax for owl ontologies.

owl dlpconvert has been extended as it now also serialzes it results as RuleML if you like. So you can just pipe your ontologies to RuleML.

So, the tools have both become sharper and more numerous, making your toolbelt to work with owl in daily life more usable. Get the new owl tools now. And if you peek into the source code, the code has underwent a major clean up, and you will also see the new features I am working on, that have to do with ontology evaluation and more.

Have fun with the tools! And send me your comments, wishes, critiques!

Weltuntergang

Gestern war der 6.6.6.

Und die Welt ist doch nicht untergegangen. Da muss für so manchen Zahlenmystiker die Welt untergegangen sein.

(Übrigens war schon zum zweiten Mal der 6.6.6. Vor 1000 Jahren hat das mit dem Weltuntergang auch nicht geklappt.)

WWW2006 social wiki

18 May 2006

The WWW2006 conference next week has a social wiki. So people can talk about evening activities, about planning BOF-Sessions, about their drinking habits. If you're coming to the conference, go there, make a page for yourself. I think it would be fun to capture the information, and to see how much data we can get together... data? Oh, yes, forgot to tell you: the WWW2006 wiki is running on Semantic MediaWiki.

Yay!

Let's show how cool this thing can get!

Semantic Mediawiki 0.4 - Knowledge Inside!

15 May 2006

Until now, Semantic MediaWiki was kind of a nerds project. Yes, you could get a lot of information out in RDF, and actually, I used it as an RDF editor more than once -- but heck, what normal person needs that?

Now, with the freshly implemented feature, the advantages of a Semantic MediaWiki over a normal MediaWiki should become obvious: you can simply ask the wiki for stuff! Wiki, what are the 10 biggest city in the US? Put a list here. Or, wiki, what is the height of the current German chancellor? Put the info here. I have made a writeup on those inline queries on our demo wiki. Go there, read it.

But a lot of other things made it into the 0.4 release. Here's Markus' list:

  • Improved output for Special:Relations and Special:Attributes: usage of
  • relations and attributes is now counted
  • Improved ontology import feature, allowing to import ontologies and to update existing pages with new ontological information
  • Experimental suport for date/time datatype
  • More datypes with units: mass and time duration
  • Support for EXP-notation with numbers, as e.g. 2.345e13. Improved number formating in infobox.
  • Configurable infobox: infobox can be hidden if empty, or switched off completely. This also works around a bug with MediaWiki galeries.
  • Prototype version of Special:Types, showing all available datatypes with their names in the current language setting.
  • "[[:located in::Paris]]" will now be rendered as "located in [[Paris]]"
  • More efficient storage: changed database layout, indexes for fast search
  • Code cleaned up, new style guidelines
  • Bugfixes, a lot of Bugixes

Thanks to everyone who contributed and still contributes to the project! And, connected to this, thanks to the answers to my last blog entry -- I will write more on this tomorrow.

Need help with SQL

11 May 2006

But I have no time to write it down... I will sketch my problem, and hopefully get an answer (I've started this blog in the morning, and now I need to close it, cause I have to leave).

Imagine I have following table (a triple store):

Subject Predicate Object
Adam yahoo im adam@net
Adam skye adam
Berta skype berta

How do I make a query that returns me the following table:

Subject o1 o2
Adam adam@net adam
Berta - berta

The problems are the default value in the middle lower cell. The rest works (as maybe seen in the Semantic Mediawiki code, file includes/SMW_InlineQueries.php -- note that the CVS is not up to date for now, because SourceForge's CVS is down for days!)

It should work in general, with as many columns as I like on the answer table (based on the predicates in the first one).

Oh, and if you solved this -- or have an idea -- it would be nice if it worked with MySQL 4.0, i.e. without Subqueries.

Any ideas?


Comments are still missing on this post.

Rom mit vielen Gesichtern

10 May 2006

Man läuft durch Rom, und man sieht, es ist die Stadt der Päpste. Überall finden sich gigantische Marmorplatten, auf denen erklärt wird, dass diese(r/s) Platz / Kirche / Gebäude / Monument / Brunnen / Brücke von Seiner Heiligkeit, P.M. (nicht etwa Prime Minister, sondern Pontifex Maxmimus) Soundso dem Sounsovielten gemacht wurde, im Jahre 12. Das heißt natürlich nicht etwa, dass das Ding knapp 2000 Jahre alt ist, sondern bezeichnet das 12. Jahr seines Papstseins. Es gibt sogar eine riesige Platte auf dem McDonalds gegenüber vom Pantheon, die auf -- Erinnerung ist schwach -- glaube Leo XII. verweist. Das Mc-Symbol ist da deutlich dezenter. Wieviele Fastfoodketten haben das noch?

Dann läuft man weiter, und entdeckt auch, dass Rom die Stadt eines antiken Imperiums ist. Das Kollosseum (gigantisch!), das Forum Romanum, das Pantheon natürlich, das Mausoleum von Kaiser Hadrian, heute Castel St Angelo, die Statuen überall. Atemberaubend.

Dann läuft man weiter, und entdeckt dass es auch die Stadt Berninis, der die Stadt in der Renaissance wieder wachküsste, und sie in eine Stadt der Brunnen verwandelte. Überall fließt Wasser, wohlgemerkt, allesamt Trinkwasser, was vor allem an den heißeren Tagen sehr beliebt ist, das Plätschern, die wunderschönen Parkanlagen (wirklich atemberaubend mit ihren künstlichen Seen und natürlichen Hügeln, und darauf die jahrhundertealten Bäume).

Und dann läuft man weiter, und man sieht, dass es auch die Stadt der Regierung eines modernen Italiens ist. Der Palast des Präsidenten, mit den wohl coolsten Wächteruniformen überhaupt (vergesst die Uniformen der Schweizer Garde, die von einem gewissen Michelangelo designt wurden). Lange, wehende Mäntel, große blitzende Knöpfe, und ein Ritter, der im Eingang steht, weit hinten, im Dunkeln, und sich nie bewegt, auf sein Schwert gestützt, mit einem großen Helm. Denke nicht, dass er gegen einen Bewaffneten mit 'ner Schusswaffe auch nur den Hauch einer Chance hätte, aber er sieht so morsmäßig cool aus!

Und schließlich, nach dem vielen Gelaufe, wird es dunkel, und dann merkt man, dass Rom auch eine Stadt der jungen Leute ist, mit unzähligen Kleinen und manchen Großen Bars, und viel Leben auf der Straße. Rom hat unglaublich viele Gesichter, aber ich nehme an, in einer Stadt mit 2800 Jahren Geschichte sammelt sich halt mit der Zei was an...

Semantic Web Summer School 2006

The Summer School for the Semantic Web and Ontological Engineering is an annual event that brings together PhD students from all over the world and some of the brightest heads in the Semantic Web, to teach, to socialize, to learn, and to have fun. This year's invited speakers are Jim Hendler himself, and Enrico Motta, Stephan Baumann, Guus Schreiber, and the tutors are John Domingue, Asun Gomez-Perez, Jerome Euzenat, Sean Bechhofer, Fabio Ciravegna, Aldo Gangemi. You will learn a lot. You will have lots of fun. The place is really beautiful, the girls, well at least last year, were really beautiful, the stuff we learned was interesting, and inspired quite some cooperation further on. And it's really great for getting to know a lot of people: at the next conference you're guaranteed to meet someone again, and thus it is also a perfect possibility ot get into the community.

The deadline is May 1st, so be sure to go over to the SSSW2006 website and sign up.

If this didn't convince you, take a look at my series of posts about last year's summer school.


Comments are still missing on this post.

Research visit

So, I have arrived for my very first longer research visit. I am staying at the Laboratory of Applied Ontologies in Rome. I have never been to Rome before, and all my type of non-work experience I'll chat about in my regular blog, in German. Here I'll stick to Semantic Stuff and such.

So, if you are nearby and would like to meet -- give me a note! I am staying in Rome up to the WWW, i.e. up to May 20th. The plan is to work on my dissertation topic, Ontology Evaluation, especially in the wake of the EON2006 workshop, but that's not all it seems. People are interested in and knowledge about Semantic Wikis as well. So there will be quite a lot stuff happening in the next few weeks -- I'm excited about it all.

Who knows what will happen? If my plan works out, at the end of the stay we will have a common framework for Ontology Evaluation. And I am not talking about this paper frameworks -- you know, that are presented in papers with titles starting "Towards a..." or "A framework for...". No, but real software, stuff you can download, and play with.

Preisschock

Innerlich beschwerte ich mich über das teure Zimmer. Gestern erfuhr ich, wieviel es eigentlich kostet -- schlappe 165 Euro pro Nacht. Das Ding ist winzig! Nein, versteht mich nicht falsch -- die Leute sind echt nett, es ist gemütlich, es ist sauber, es ist hervorragend gelegen. Aber 165 Euro pro Nacht für nichtmal 8 Quadratmeter, inklusive Badezimmer und Wandschrank? Ich meine, 1 Euro pro Stunde und Quadratmeter?

Dann bin ich lieber ruhig und bedanke mich für den Preis, den mein Gastgeber am Institut mir organisiert hat.

Ankunft

Ich hoffe, der Steuerzahler weiß zu schätzen, dass ich mich gestern für ihn richtig aufgeopfert habe. Wie gesagt, das Hotel hatte mir angeboten, mich mit einer Limo vom Flughafen abzuholen, mit Schildchen, wo mein Name draufsteht usw., das ganze Programm.

Das kostete mir zuviel, also fuhr ich mit der Bahn zum Römischen Hauptbahnhof Termini. Von dort hätte ich ja auch ein Taxi nehmen können - doch, woran erkennt man in Rom die legalen Taxis? Und um Termini herum, so heißt es, sei das schlechte Viertel von Rom.

Also ab in die Metro (übersichtlich, zwei Linien, ein Ost-West (blau) und eine Nord-Süd (rot), zwei Haltestellen) und dann zu Fuß weiter. Auf der Karte sah das nach etwa einem Kilometer aus, also solte es doch kein Problem sein. Mein Gedächtnis ließ mich nicht im Stich, den Weg fand ich.

Nur ein Kilometer ist mir 30 Kilo Gepäck doch deutlich weiter als gedacht... und das war auf der Karte nicht eingezeichnet. Ich brauchte für die Strecke tatsächlich eine Stunde, mein Rücken tat weh, meine Handflächen brannten, ich war durstig und kaputt. Ohne Handschuhe hätte ich mir die Hände wundgescheuert. Immerhin, das Wetter war angenehm kühl, schattig und windig.

Dafür schlief ich zunächst mal 12 Stunden durch. Wow. Mache ich sonst nie. Jetzt bin ich erholt und bei der Arbeit. Und es macht Spaß.

Abflug

Ich habe Schwesterchen versprochen, regelmäßig aus Rom zu berichten. Wobei ich noch keine Ahnung habe, wie ich in Rom überhaupt verbunden sein werde, mit dem Netz.

Aber von vorne -- warum verschlägt es mich überhaupt in die Ewige Stadt? Es geht dabei um das Thema meiner Doktorarbeit, herauszufinden, was eine gute Ontologie ist. Zugegeben, ich weiß erst seit kurzem, was eine Ontologie überhaupt ist -- und jetzt will ich wissen, wann eine gut ist und wann nicht? Soviel zu hohen Zielen. Jedenfalls gibt es weltweit nicht so wahnsinnig viele Experten auf diesem Gebiet, und einer davon -- der geneigte Leser wird es sicher bereits erahnen -- sitzt in Rom.

So jedenfalls kommt es, dass ich für ein paar Wochen nach Rom fahre, und wir zusammen an diesem Thema arbeiten werden. Die Arbeitsergebnisse werden sich auf Semantic Nodix finden, hier hingegen private Eindrücke und Erlebnisse.

Wie zum Beispiel, dass ich furchtbar nervös bin. Ich realisierte, dass ich noch nie so lange weg von daheim war (wobei ich großzügig daheim als sowohl Stuttgart als auch Brač zähle). Aber ich freue mich auch sehr darauf. Ich war noch nie in Rom. Und kann kein Italienisch. Aber ich habe mir auf Wikitravel schon zahlreiche Informationen über Rom eingeholt, also kann auch gar nichts schief gehen. Da habe ich etwa erfahren, dass der Bahnhof Termini, an dem ich ankomme, berühmt für seine hohe Kriminalität ist. Und dass der Flughafen Leonardo da Vinci zu weit weg ist, um ein Taxi zu nehmen. Das Hotel hat mir jedenfalls einen Limousinenservice angeboten, für läppische 55 €. Habe ich dann doch abgelehnt, mit dem Zug kostet es nur Neuneinhalb. Und es gibt auch, so wie in Sofia, betrügerische Taxen, die ohne Lizenz und zu horrenden Preisen fahren. Oder man wird von Horden von Kindern angefallen, die einem dann die Geldbörse entwenden. Bester Trick: nicht wie ein Tourist aussehen. Stelle ich mir schwierig vor, wenn ich gerade mit Rucksack und zwei Koffern ankomme.

So, wie gesagt, da ich nun informiert bin, kann ich ja ganz beruhigt hinfahren. Es wird sicher sehr schön.

Semantic Mediawiki 0.3

Yay! Markus "the Sorcerer" Krötzsch finished the new release of Semantic MediaWiki today. The demo website is already running version 0.3 for a while.

I'll let Markus speak:

I am glad to finally announce the official release of Semantic MediaWiki 0.3, available as usual at http://sourceforge.net/projects/semediawiki/. The final 0.3 is largely equivalent to the preview version that is still running on wiki.ontoworld.org -- the latest changes mainly concern localization.

Semantic MediaWiki 0.3 now runs on MediaWiki 1.6.1 that was released just yesterday. Older versions of MediaWiki should also work but upgrading is generally recommended.

The main new features of 0.3 are:

  • support for geographical coordinates (new datatype),
  • improved user interface: service links for JScript tooltips, CSS layout,
  • OWL/RDF export of all annotation data,
  • simplified installation process (including special page for setup/upgrade),
  • (almost) complete localization; translations available for English and German,
  • better MediaWiki integration: namespaces, user/content language, support for MediaWiki 1.6,
  • specials for displaying all relations/attributes,
  • experimental (OWL/RDF) ontology import feature,
  • and, last but not least, we also fixed quite some bugs.

The next steps towards 0.4 will probably be the inclusion of query results into existing pages, date/time support, and individual user settings for displaying certain datatypes. We also will have another look at ways of hiding the annotations from uninitiated users.

Have fun.

Markus

P.S.: I am not available during the weekend. Upgrading existing wikis should work (it's what we do all the time ;), but be aware that there is not going to be much support during the next three days.


Comments are still missing on this post.

CeBIT

Am Wochenende war ich auf der CeBIT arbeiten. Wir stellten am Stand des BMBF Ergebnisse des AIFB vor. Einige interessante Besucher dagehabt, aber auch am Sonntag die Chance genutzt, ein wenig über die Messe zu wandern. Ich war zum ersten Mal da.

Zunächst mal: riesig groß, mit viel, viel, richtig viel freie Fläche zwischen den Messehallen. Ich meine, schaut es euch selber an. Und von der Expo 2000 sind noch ein paar nette Gebäude übrig geblieben. Groß heißt aber auch weite Wege. Meine Füße...

Und, was gab es cooles? Nein, ich werde jetzt nicht über die ganzen Semantischen Technologien reden, die es auf der CeBIT gab (Semantic Talk, oder Gnowsis, oder, halt uns), sondern drei andere Impressionen.

Erstens: Demo von Windows Vista gesehen. Positiv überrascht. Nicht nur dass es fast so cool wie MacOS X aussieht, die Metadatenfähigkeit wirkt auch überraschend zugänglich. Aber dennoch, das neu Interface ist echt fesch.

Zweitens: einen 102" Flachbildschirm gesehen. Wow. Und darauf lief der Trailer für X-Men 3. Doppelwow. Ich weiß gar nicht was jetzt cooler war.

Drittens: neben uns saßen Forscher vom Fraunhofer, die das AMI Projekt vorstellten. Eines der Projektergebnisse war eine Kamera, die alle Teilnehmer eines Meetings aufnimmt. Sie findet dabei automatisch die Gesichter in einem Bild, und insbesondere ob sie gerade sprechen, ob sie lächeln, nicken, etc. Sie hatten zur Demonstration vier Plüschtiere zum Meeting hingesetzte, und an dreien wurde auch zuverlässig das Gesicht entdeckt. Am vierten nicht -- ein nachtschwarzer Maulwurf. Da fragte ich, ob es denn auch bei unterschiedlichen Hautfarben klappe, wie eben etwa bei schwarzer Haut. Ja, überhaupt kein Problem, es ist sogar einfacher mit Menschen als mit dem ganzen Plüschzeug, weil Menschen alle die selbe Hautfarbe haben. Nur die Intensität sei unterschiedlich. Aber die Farbe, die ist bei allen gleich.

Was man mit Computern alles lernen kann.

Good ontologies?

We have asked you for your thoughts and papers. And you have sent us those -- thank you! 19 submissions, quite a nice number, and the reviewing is still going on.

Now we ask you for your results. Apply your evaluation approaches! We give you four ontologies on the EON2006 website, and we want you to take them and evaluate them. Are these ontologies good? If they are, why? If not, what can be changed? We want practical results, and we want to discuss those results with you!. So we collected four ontologies, all talking about persons, all coming from very different background and with different properties. Enough talking -- let's get down and make our hands dirty by really evaluating these ontologies.

The set is quite nice. Four ontologies. One of them we found over rdfdata.org, a great resource for ontologies, some of them I would have never found myself. We took a list of Elvis impersonators. One person edited the ontology, it is about a clear set of information, basically RDF. The second ontology is the ROVE ontology about the Semantic Web Summer School in Cercedilla last year. It was created by a small team, and is richly axiomatized. Then there is the AIFB ontology, based on the SWRC. It is created out of our Semantic Portal in the AIFB , and edited by all the members of the AIFB -- not all of them experts in the SemWeb. Finally, there's a nice collection of FOAF-files, taken from all over the web, and to be meshed up together and evaluated as one ontology, created with a plethora of different tools, by more than a hundred persons. So there should be an ontology fitting to each of the evaluation approaches.

We had a tough decision to make when choosing the ontologies. In literally the last moment we got the tempting offer to take three or four legal ontologies and to offer those for evaluation. It was hard, and we would have loved to put both ontology sets up to evaluation, but finally decided for the set mentioned previously. The legal ontologies were all of similar types, and certainly would need a domain expert for proper evaluation, which many of the evaluators won't have at hand at the moment. I hope it is the right decision (in research, you usually never know).

The EON2006 workshop will be a great opportunity to bring together all people interested in evaluating ontologies. I read all the submissions, and I am absolutely positive that we will be able to present you with a strong and interesting programme soon. I was astonished how many people have interest in that field, and was intrigued to discover and follow the paths lead out by the authors. I am looking forward to May, and the WWW!


Comments are still missing on this post.

Visa haben ungleich Visa brauchen

Als ich vor ein paar Wochen in Dortmund war, um mein Visum für Großbritannien zu beantragen, hatte ich ja eines für 5 Jahre beantragt und eines für 2 bekommen. Da habe ich schon gehofft, dass es das letzte Visum für Großbritannien sei, dass ich beantrage, und dass bis zum Ablauf desselben Kroatien Teil der Europäischen Union sein werde.

Mit einem Punkt hatte ich schon mal Recht -- Großbritannien schafft mit Wirkung zum 22. März die Visumspflicht für Kroaten ab. Das mein schönes Visum bis 2008 Gültigkeit hat ist dadurch natürlich vollkommen irrelevant geworden.

Menno, das haben die doch absichtlich gemacht.

Die GUI - XUL, HTML und CSS

Das Herzstück von Mozilla ist die Renderingengine Gecko. Sie liest zum Beispiel (X)HTML ein und stellt dieses dann auf dem Monitor dar. Damit lassen sich bereits sehr interaktive und sich äußerst dynamisch anfühlende Webseiten erstellen, insbesondere wenn man JavaScript mit verwendet. An Beispielen wie GMail oder start.com kann man erkennen, wieviel heute schon mit HTML möglich ist. Dennoch: viele GUI-Elemente wie Listen, Menüs oder Knöpfe sind in HTML eher umständlich umgesetzt. Darum wurde XUL, die XML User Interface Language, eingeführt, eine XML-Sprache zur Beschreibung von graphischen Benutzerschnittstellen.

Mit XUL ist es dann möglich zu beschreiben, wo bestimmte Elemente auftauchen, welche wo auftauchen, etc. Dadurch wird es zum Beispiel leicht fallen, verschiedene Versionen der GUI zu erstellen, eine für Experten und eine für Anfänger. Insbesondere könnte man sich einen Schritt-für-Schritt-Wizard für die Heldenerschaffung vorstellen, wie sie in den meisten anderen Editoren wie Helden praktiziert wird. Meine persönliche Vorliebe bleibt dennoch beim "Immer alles veränderbar"-Modus. Aber die Codebasis wird sehr einfach beides hergeben.

Ein weiterer Vorteil von XUL ist, dass damit das User Interface interpretiert, nicht compiliert wird. Das hat den Vorteil, dass, wenn man das Aussehen des Programms verändern will, man nicht eine ganze Programmierumgebung mit Compiler etc. braucht, sondern einfach nur einen Texteditor und etwas XML- oder HTML-Kenntnisse. Hoffnung dabei: schöne neue Skins und bessere Bedienbarkeit können von wesentlich mehr Leuten beigesteuert werden als bisher.

Apropos Skins: ja, auch das wird möglich sein. Wie von Thunderbird und Firefox gewohnt, sind XUL-basierte Anwendung vollkommen mit CSS und ähnlichem skinbar. Das heißt Farben, Hintergründe, Aussehen der Elemente sind steuerbar. Ich stelle mir jetzt schon ein Horasreich-Skin, ein Myranor-Skin und ein G7-Skin vor. Mal sehen. Es wird letztlich von Eurer Kreativität abhängen.

Auch hier ist das wichtige: man muss dafür keine Programmierumgebung besitzen und keine Programmierkenntnisse haben (auch wenn sie natürlich nicht schaden). Der wichtigste Teil des neuen Designs ist es, die einzelnen Teile der Architektur orthogonal zu gestalten, so dass man an einem bestimmten Teil arbeiten kann, ohne das man ein Experte in allen sein muss. Etwas, was bei der ersten Version sträflichst vernachlässigt wurde.

Nächstes Mal: zum Datenmodell und der Datenhaltung.

EON2006 deadline extension

We gave the workshop on Evaluating Ontologies for the Semantic Web at the WWW2006 in Edinburgh an extension to the end of the week, due to a number of requests. I think it is more fair to give an extension to all the authors than to allow some of them on request and to deny this possibility to those too shy to ask. If you have something to say on the quality of ontologies and ontology assessment, go ahead and submit! You still have a week to go, and short papers are welcomed as well. The field is exciting and new, and considering the accepted ESWC paper the interest in the field seems to be growing.

A first glance of the submissions reveals an enormous heterogeneity of methods and approaches. Wow, very cool and interesting.

What surprised me was the reaction of some: "oh, an extension. You didn't get enough submissions, sorry". I know that this is a common reason for deadline extensions, and I was afraid of that, too. A day before the deadline there was exactly one submission and we were considering cancelling the workshop. It's my first workshop and thus such things make me a whole lot nervous. But now, two days after the deadline I am quite more relaxed. The number of submissions is fine, and we know about a few more to come. Still: we are looking for more submissions actively. For the sole purpose of gathering the community of people interested in ontology evaluation in Edinburgh! I expect this workshop to become quite a leap for ontology evaluation, and I want the whole community to be there.

I am really excited about the topic, as I consider it an important foundation for the Semantic Web. And as you know I want the Semantic Web to lift off, the sooner the better. So let's get these foundations right.

For more, take a peek at the ontology evaluation workshop website.


Comments are still missing on this post.

Zurück aus Sheffield

Was wirklich gemein ist: die ganze Zeit in Sheffield hat es geregnet und gewindet. War ja nichts anders zu erwarten in England, oder?

Aber heute, da ich zurück bin, ist es dort über 10 Grad wärmer als hier, und sie haben Nieselregen und Sonnenschein statt Schnee und Grauingrau.

Die gute Nachricht: der Vortrag ist phantastisch gelaufen, das ganze Review war ein großer Erfolg. Danke allen Daumendrückern. Und danke den nutkidz, die ebenfalls eine Rolle spielten. Mehr dazu bald.

My Erdös Number

After reading a post by Ora and one by Tim Finin, I tried to figure my own Erdös Number out. First, taking Ora's path, I came up with an Erdös of 7:

Paul Erdös - Stephan Hedeniemi - Robert Tarjan - David Karger - Lynn Stein - Jim Hendler - Steffen Staab - Denny Vrandečić

But then I looked more, and with Tim's path I could cut it down to 6:

Paul Erdös - Aviczir Fraenkl - Yaacov Yesha - Yelena Yesha - Tim Finin - Steffen Staab - Denny Vrandečić

The point that unnerved me most was that the data was actually there. Not only a subscription-only database for mathematical papers (why the heck is the metadata subscription only?), but there's DBLP, there's the list of Erdös 1 and 2 people on the Erdös Number project, there's Flink, and still, I couldn't mash up the data. This syntactic web sucks.

The only idea that brought me further - without spending even more time with that - was a Google search for "my erdös number" "semantic web", in the hope to find some collegues in my field that already have found and published their own Erdös number. And yep, this worked quite fine, and showed me two further, totally disjunctive paths to the one above:

Paul Erdös - Charles J. Coulborn - A. E. Brouwer - Peter van Emde Boas - Zhsisheng Huang - Peter Haase - Denny Vrandečić

and

Paul Erdös - Menachem Magidor - Karl Schlechta - Franz Baader - Ian Horrocks - Sean Bechhofer - Denny Vrandečić

So that's and Erdös of 6 on at least 3 totally different paths. Nice.

What surprises me - isn't this scenario obviously a great training project for the Semantic Web? Far easier than Flink, I suppose, and still interesting for a wider audience as well, like Mathematicians and Noble Laureates? (Oh, OK, not them, they get covered manually here).

Update

I wrote the post quite a time ago. A colleague of mine notified me in the meantime that I have a Erdös of only 4 by the following path:

Paul Erdös - E. Rodney Canfield - Guo-Quiang Zhang - Markus Krötzsch - Denny Vrandečić

Wow. It's the social web that gave the best answer.

2019 Update

Another update, after more than a dozen years: I was informed that I have now an Erdös number of 3 by the following path:

Paul Erdös - Anthony B. Evans - Pascal Hitzler - Denny Vrandečić

I would be very surprised if this post requires any further updates.


Comments are still missing on this post.

In Sheffield

Die Stadt aus Stahl. Sheffield ist eine überraschend große Stadt, knapp 520.000 Einwohner und damit die viertgrößte Stadt in Großbritannien. Steht aber weder auf den Wetterkarten, noch hat es einen Flughafen. Es hat nichtmal eine vernünftige Verbindung zum nächsten Flughafen. Wir fuhren über eine Stunde den Snake Pass entlang, eine Strecke mit einer angeblich wunderschönen Aussicht. Es war Nacht.

Ich mach selber ja keine Photos, darum hier die Photos bei Flickr zu der Stadt. Auch auf Google Maps konnte ich das Octogon gegenüber finden. Ist ziemlich cool.

Zur Zeit sitze ich im großen Review unseres Projektes zu Semantischen Technologien. Die EU entscheidet, ob ihr Geld gut ausgegeben war. Ich bin nervös, und habe heute Nachmittag meinen Vortrag. Mit den nutkidz auf den Folien.

Drückt mir die Daumen.

Ungeschick

Ich war's nicht!

Neid auf Freiheit

"Vielleicht klang auch etwas Neid auf euch durch, da die Form von Meinungsäußerung, die ich als Werbetexter seit über 30 Jahren betreibe, alles andere als frei ist: Jedes Wort wird vor der Veröffentlichung lange abgewogen, mit Auftraggebern verhandelt und dann noch repräsentativ auf seine Wirkung getestet." (Quelle)

Jean-Remy von Matt hatte in einer internen eMail, die an die Blogosphäre durchgesickert ist, sich über die Miesepetrigen beschwert, die alles schlecht machen. "Wer hat die denn gefragt?" -- eines der Argumente. Was hörst Du dann hin?, frage ich mich. Wie auch immer, es war eine impulsive Reaktion. Seine, und die Arbeit seiner Kollegen, wurde niedergemacht, und er ärgerte sich. Da soll sich die Blogosphre ruhig an die eigene Nase fassen, wir quasseln noch viel mehr Unsinn als Monsieur von Matt.

Interessant aber fand ich obiges Zitat: er neidet uns die Freiheit, die wir genießen, die er nicht genießen kann.

Jean-Remy, einen Blog zu schreiben ist kinderleicht. Das kannst auch Du. Denn: Du bist Blog!

Architektur

Die bisherige Version des DSA4 Werkzeugs ist in C++ programmiert. Für die Daten entschied ich mich damals für ein XML-Datenformat, was zu der Anbindung von Xerces führte. Die GUI wurde mit wxWindows gemacht (die ersten Versionen, falls sich noch jemand erinnert, beruhten auf MSXML und native Windows GUI Elementen). Der Wechsel auf Xerces und wxWindows wurde durchgeführt, um Plattformunabhängig zu sein. Und tatsächlich: der Code compilierte auch unter Linux (dank hier an die Portierer). Aber richtig laufen tat er nie, es waren immer irgendwelche Bugs in der Linuxversion, die ich nicht in der Windowsversion nachvollziehen konnte.

Außerdem war das verwendete C++ viel zu kompliziert. Ich benutzte massig Templates (vor allem für die Rassen, Kulturen und Professionen), was den Code sehr schwer lesbar und bearbeitbar machte. Auch ein schlichtes compilen war eine recht anspruchsvolle Prozedur. Ich gehe davon aus, dass deswegen nie im größeren Maße Entwicklungsarbeit von anderen als von mir geleistet wurde: mein Code war schlicht zu kompliziert. Aus de selben Grund habe ich selber ja in den letzten Monaten den Code nicht angefasst.

Dies ist die wichtigste Lektion für die neue Version: deutlich einfacherer Code. Änderungen müssen auch ohne sich extrem reinzuarbeiten möglich sein. Idealerweise sollte das ganze Werkzeug interpretiert sein. Kein compilen mehr. Ändern. Neustart. Fertig.

Wie ich das erreichen will verrate ich in den nächsten Blogeinträgen genauer (deswegen habe ich auch nicht so viel geschrieben die letzten Tage -- und weil ich auf Dienstreisen in Kaiserslautern und Düsseldorf war -- ich wollte zunächst ein halbwegs tragbares Konzept haben. Das kommt jetzt die nächsten paar Tage. Hier schonmal eine grobe Übersicht -- ich werde im Folgenden genauer auf die Begriffe und den Aufbau eingehen.

Mozilla, die Open Source Gruppe die uns nicht nur Firefox und Thunderbird beschert hat, hat, was die Basis ihrer Tools ist, ein umfangreiches, sogenanntes Mozilla Application Framework erstellt. Im Großen und Ganzen ist es ein supermächtiges Biest -- ich will es soweit zähmen, damit das DSA4 Werkzeug darauf läuft. Hierbei gibt es eine Hauptengine, die XULRunner heißt. XUL ist so etwas wie HTML, bloß für GUIs von Anwendungen (und man kann tatsächlich auch HTML und JavaScript in XUL mit benutzen). Das bedeutet, das User Interface des neuen DSA4 Werkzeugs zu ändern wird so leicht sein wie HTML-Seiten schreiben, ja, sogar mit CSS arbeitet das ganze zusammen. Interessant ist hierbei vor allem das Verwenden von JavaScript, das eine dynamische GUI erlaubt. Die Applikationslogik hingegen kann entweder in JavaScript implementiert werden, oder auch in C++ oder Java (oder Python), um dann über XPCOM (oder PyXPCOM) darauf zuzugreifen. Potenziell also kann man auch Teile des bisherigen Codes wiederverwenden! Schließlich, die Daten werden in RDF gespeichert, einer XML-basierten Sprache, die aber deutlich Vorteile zu XML (und ein paar Nachteile) aufweist.

All dies unterstützt das Mozilla Application Framework von Haus aus. Anhand von Thunderbird und Firefox sieht man ja, dass das durchaus zu brauchbaren Applikationen führen kann. Ich hoffe, dass es auch hier aufgeht.

WLAN im Zug

Heute fuhr ich nach Düsseldorf, um bei der Britischen Botschaft mein Visum für Großbritannien zu beantragen und, so es klappt, auch gleich zu holen. Ich hatte schon einiges erwartet -- manche Horrorgeschichte wurde erzählt über Beamtentum, Formaliakrieg, Großkotzigkeit und Willkürherrschaft. Doch nichts dergleichen trat ein (zumindest nicht beim Beantragen heute morgen). Äußerst höflich, zuvorkommend und hilfsbereit.

Allerdings muss ich jetzt einige Zeit totschlagen (um am Nachmittag das fertige Visum abzuholen). Da ich die Zeit zum Arbeiten nutzen wollte, und zudem in der Gegend war, dachte ich mir, probiere ich mal das neue Pilotprojekt der Bahn aus, WLAN im ICE. Also fuhr ich nach Dortmund um von dort Richtung München nach Köln im ICE zu fahren. Dummerweise habe ich doch den falschen Zug erwischt, keinen ICE3 scheinbar, und ich kann einfach nicht herausfinden, welche Züge jetzt tatsächlich WLAN haben.

Schade. Na gut, habe ich mich doch in Düsseldorf hingesetzt und hier ein wenig gearbeitet. Und dann geht es zurück zum Konsulat. Ich war ganz frech und habe gleich ein Visum für fünf Jahre beantragt (statt des üblichen halben Jahrs). Mal sehen, was rauskommt... ein unbekannter (Fast)Landsmann, den ich zufällig am Konsulat traf, erklärte mir, dass das nie klappen würde, und dass das ganze immer Willkür sei, er hätte schließlich Erfahrung damit, schließlich muss er schon zum wiederholten Mal verlängern. Na ja, mit der Einstellung, kein Wunder dass es dann Probleme gibt. Ich vermied es, ihn darauf hinzuweisen, dass seine beiden Aussagen ("Willkür" und "das wird nie klappen") sich gegenseitig widersprechen.

Na, schauen wir mal, was rauskommt. Bloß blöd, dass das mit dem WLAN nicht geklappt hat.

Ein neuer Anfang

Alles, was aufwändiger ist als ein Blog scheint zur Zeit nicht zu funktionieren.

Wie viele von Euch wissen, verfolge ich seit langem das Ziel, ein Programm zu erstellen, welches beim Spielen von DSA in der vierten Edition Spielern und Spielleitern zur Hand geht. Wie ebenfalls die meisten wissen, war das Projekt leider die letzten Monate tot. Dies hatte mehrere Gründe: einerseits habe ich schlichtweg viel weniger Zeit als früher, und schließlich habe ich mich mit den Aufgaben beim DSA4 Werkzeug übernommen. Ich wollte sowohl das Programm erstellen - in allen Aspekten, Datenhaltung, User Interface, Logos, als auch die dazugehörige Website betreuen, als auch die Dokumentation schreiben, als auch eine Softwarebibliothek in C++ erstellen, die für DSA4-Tools gedacht ist, als auch die Druckausgabe verfeinern, als auch das XML-Format definieren, als auch die entstehende Community verwalten. Das mag eine ganz kurze Zeit funktioniert haben -- aber es musste, als ich meinen Beruf angetreten habe, schief gehen. Dummerweise liebe ich auch noch meinen Beruf, was mich zwar persönlich erfreut, aber der Entwicklung des DSA4 Werkzeugs nicht gut tut.

Ich werde in diesem Blog anfangen, die weitere Entwicklung des DSA4 Werkzeugs nach außen zu präsentieren. Dies ist ein vorläufiger Ersatz für eine echte Website, die einen Community-Prozess erlaubt. Aber diese Website werde ich nicht aufstellen. Ebenso werde ich tatsächlich nur wenige Emails im Zusammenhang mit dem DSA4 Werkzeug ausführlich beantworten. Ich brauche jemanden, der sich bereit erklärt, eine dazugehörige Website zu pflegen, und jemanden, der mir bei der Community helfen will. Ich erinnere mich, dass ich früher teilweise mehr Zeit in die Website und die Beantwortung von Emails investiert habe, als in die eigentliche Entwicklung. Dies muss ich diesmal vermeiden. Die Community war großartig! Ich brauche bloß an Leute wie Twel oder Wolfgang denken, die unglaubliches geleistet haben. Wer sich berufen fühlt, zu helfen, melde sich bitte bei mir.

Damit klar ist: ich werde auch in Zukunft nicht auf magische Weise mehr Zeit haben. Aber meine Erfahrungen in Industrie und Forschung, sowie das Wissen und die Erfahrung, die ich inzwischen über Open Source Community Prozesse und Software Engineering angesammelt habe, erlauben mir -- so die Hoffnung -- einige Fehler der ersten zwei Versionen des DSA4 Werkzeugs zu vermeiden. Die Architektur des alten DSA4 Werkzeugs war sehr durchdacht, und extrem flexibel. Leider war sie auch sehr kompliziert: metatemplate-Programmierung in C++ ist megacool, aber es macht den Einstieg für andere nicht gerade einfach. Einer der Gründe, warum es in den Monaten der aktiven Entwicklung keine 100 Zeilen von anderen Entwicklern in den Quelltext geschafft haben, oder warum niemand das Projekt aufgegriffen hat, nachdem ich offensichtlich nicht mehr weiterarbeitete. Dies wird der Hauptpunkt, den ich angreifen werde.

In den nächsten Tagen - ja, wirklich, Tagen - werde ich hier anfangen, in Blogeinträgen die neue Architektur zu skizzieren, sowie diverse Punkte auflisten. Zu einem Zeitplan möchte ich mich nicht durchringen, wann das Programm fertig wird. Aber ich kann versprechen, dass ein neuer Anfang gemacht ist. Warum? Weil das nicht Absicht ist, sondern auf erstem Code beruht. Zugegeben, bisher nur auf meiner Platte. Aber sobald es vorzeigbar ist, auch wieder auf SourceForge im CVS.

Ein halbes Jahrzehnt Nodix

Vor genau fünf Jahren habe ich Nodix begründet. Wie jedes Jahr (2002, 2003, 2004, 2005) heißt das Rückblick und Ausblick und ein wenig Zahlenspielchen, sowie Versprechen, die ich nicht einlösen werde.

Laut dem Zähler auf dieser Website hatten die Nodix-Seiten im ersten Jahr 2.000, im zeiten 20.000, dann 44.000, und dann 115.000 Besuche. Und letztes Jahr? Laut dem Zähler ging die Zahl auf 92.000 neue Besuche zurück. Was bedeutet das? Sind es tatsächlich weniger Besucher als früher? Nun ja, überraschend wäre es nicht -- einen Großteil des Jahres waren die Seiten eher ruhig, manche einfach tot, und vor fast zwei Monaten gab es einen fiesen Angriff auf Nodix, der alle Daten löschte, und die Daten immer noch nicht wieder zurück sind.

Genauer in die Zugriffsstatistiken geschaut, können wir aber sehen, dass Nodix nicht unbedingt weniger Besucher anlockt: laut der 1&1-Statistik hatten wir schon 2004 nicht 115.000 Besucher sondern 198.000 -- und letztes Jahr nicht etwa 92.000, sondern sage und schreibe 416.454 Besucher! Nicht Seitenaufrufe.

Woher die Diskrepanz? Ich tippe auf die Feeds. Seit Ende 2004 setzen die Nodix-Seiten vermehrt auf Feeds, und ihr, werte Leser, nutzt diese Gelegenheit natürlich auch, und das ist auch gut so. Der Abruf eines Feeds jedoch erhöht nicht den Zählerstand auf der Seite. Auch weiß ich nicht, ob die Statistik von 1und1 Abrufe des Feeds, der sich nicht verändert hat, auch mitzählt -- sprich, wird jedesmal, wenn jemand, der einen RSS-Feed der Nodix-Seiten abonniert hat, online geht und Feeds checkt, der Zähler erhöht? Dann wäre die Zahl von 416.454 natürlich deutlich überhöht! Aber dann wiederum biete ich auf semantic.nodix.net auch einen Feed im RSS-Format, den Feedburner aus dem Atom-Feed erstellt, den ich nicht mitzählen kann. Und PlanetRDF smusht semantic.nodix-Einträge auch noch mit. Was cool ist. Aber über Reichweite kann man dann letztlich nur noch die Kristallkugel befragen. Sie ist genauso glaubwürdig wie der Zähler von 1und1 oder der auf dieser Seite und jetzt etwa 273.000 anzeigt.

Oder kurz: ich habe eigentlich keine Ahnung, wieviele Leser Nodix eigentlich hat. Nada.

Also kommen wir zu den Rück- und Ausblicken: nutkidz sind wieder da, bisher regelmäßig, und das soll auch so bleiben. Auch die englische Übersetzung läuft prima. Apropos Übersetzungen: nach einer langen Pause ist auch die deutsche Übersetzung von something*positive wieder angelaufen. Ihr kennt das nicht? Ist auch ein Webcomic - und ein sehr fieser dazu! Ich würde sagen, ab 16 und nichts für schwache Gemüter. nakit-arts läuft prima -- Schwesterchen, gratuliere! Dies ist bei weitem der erfolgreichste Blog auf den Nodix-Seiten und auch um längen die schönste Seite von Nodix. Nur weiter so! semantic.nodix läuft und läuft gut. Wer sich für meine Arbeit interessiert, sollte da mit-lesen. Das DSA4 Werkzeug - nun, die Seite ist tot, ich denke noch darüber nach, und wenn alles nach Plan läuft, wird es dieses Jahr einen neuen Start geben. Dazu mehr zu gegebenem Zeitpunkt. XML4Ada95 hat ein Interesse geweckt ähnlich wie die Sprache selbst ;) -- ich will die Seite bald noch ein letztes Mal ändern, und dann ist das Projekt beendet. Nodix ist groß, wie ihr seht. Und es wird dieses Jahr noch weiter wachsen.

Also, auf in die zweite Hälfte des Jahrzehnts! Eine drittelmillion soll der Zähler am Jahresende anzeigen!

GESTS journal invitation! - ideas for better spam

Yeah, isn't that great! Got an invitation to submit my paper to the GESTS Journal "Transactions on Communications and Signal Processing" (won't link to it). Well, not directly my field, and I never heard of the Journal, but hey, a journal paper, isn't that great...

Ehhm, not exactly. Actually it seems to be spam. Another collegue got the same invitation last week. And no one heard about the journal. And it really isn't my field. I don't have to do anything with Signal Processing. And why do they want money for printing my article?

What I was wondering: why didn't they do it some better? With the AIFB OWL Export they could have got the machine processable information about the interests of each person at the AIFB. With a bit of SPARQLing they could have gotten tons of information -- fully machine processable! They could have found out that I am not into Signal Processing, but into Semantic Web. Personalizing Spam would be sooo easy. Spam could become so much more time-consuming to filter out, and much more attractive, if those spammers would just harvest FOAF-data and semantic exports. I really am surprised they didn't do that yet.


Comments are still missing on this post.

Denny macht Milchreis

Heute wollte ich Milchreis kochen. Wie geht das?

Zuerst geht man zu Leo und findet heraus, dass Milchreis auf Englisch rice pudding heißt. Es gibt nämlich keine deutsche Wikipedia-Seite zu Milchreis! Wahrscheinlich den deutschen Wikipedianern nicht enzyklopädisch genug. Mit rice pudding jedenfalls können wir sowohl auf den englischen Wikipediaartikel zu rice pudding zugreifen, als auch in Google Base nach rice pudding suchen. Hmm, die ersten Hits sind nur Fertiggerichte. Heute ist Feiertag - wieso weiß das dämliche Ding nicht, dass ich nicht einkaufen kann?

Also klicke ich auf recipes, um die Suche einzuschränken. Mist, war falsch, ich kriege vor allem Rezeptbücher. Zurück. Auf recipe klicken. Wieder rice pudding eingeben (dass sich das Ding das nicht merken konnte!), und neues Glück. Ja, das Ergebnis sieht gut aus. 45 Milchreisrezepte. Verdammt, die Milch ist übergekocht.

Herdplatte wechseln. Die eine abkühlen lassen. Mit einem Klick kann ich die Suche auf die recipe of the day verkürzen, der Rest will anscheinend nur was verkaufen. Doch die sind nicht weiter kategorisierbar. Doof. Hätte gerne nach Zutaten oder Kalorien weiter verfeinert. Na ja, Google Base ist nicht das Semantic Web, sondern nur eine erste UI Studie dorthin, oder?

Kippe den Reis in die kochende Milch. Überlege mir, dass mir das Semantic Web hier nur hätte helfen können, wenn ich das Haus auch mit Ubiquitous Computing oder Ambient Intelligence ausgestattet hätte. Genug komische Begriffe in die Gegend geworfen.

Der Milchreis klumpt. Suche in Google nach milchreis verklumpt. Dritter Hit sagt (aus einem Gaming-Forum) : "Och Mist, jetzt ist mein Milchreis verklumpt, hätte ich doch schneller rühren müssen!" Rühren! Ich rühre. Sehe derweil einen weiteren Hit: Den Milchreis sollte man nach dem Kochen abspülen und mit Milch weiter kochen. Verflixt! Reis zuerst kochen. Das hat mir meine Mama auch schon gesagt, letztes Mal. Schon wieder vergessen. Aber in den amerikanischen Rezepten auf Google Base wurde das nicht erwähnt.

Der ertse Hit führt übrigens auf Frag Mutti - Das Nachschlagewerk (nicht nur) für Junggesellen. Sachen gibt's. Dort gibt es ein hünsches T-Shirt: Milchreis schmeckt hervorragend, wenn man es kurz vor dem Verzehr durch ein saftiges Steak ersetzt. Oh, bei Frag Mutti im Milchreisrezept heißt es auch, den Reis nicht vorher aufkochen. Anscheinend gibt es mehrere Varianten (und beide von Mama? Ich bin verwirrt). Außerdem soll man den Milchreis für ein bis zwei Stunden ins Bett mitnehmen. Spart Energie. Das hätte meine Mama nie gesagt! Wir sind ja schließlich katholisch.

Und jetzt wisst ihr, warum man mich üblicherweise nicht in die Nähe einer Küche lässt.

Ein neues Jahr steht an

2005 war ein gutes Jahr. Ich werde es vermissen. Und dennoch freue ich mich auch auf 2006. Auch das nächste Jahr wird sicher sehr spannend, und wenn es auch nur halb so gut ist zu mir wie 2005 es war, dann kann ich mich glücklich schätzen.

Vor zwei Jahren versuchte ich mich mit einer Prognose für 2004 für das kommende Jahr, und obwohl sie so unglaublich generisch gehalten war, ging sie in die Hose. Deswegen habe ich es 2004 unterlassen, was für 2005 wahrzusagen, und, wenn ich zurückblicke, war das eine gute Entscheidung. Es wäre wieder in die Hose gegangen. Oder sieht jemand das letzte Einhorn im Kino?

2006 wird sehr spannend. Auf der Arbeit wird das EU Projekt SEKT zu Ende gehen. Das heißt, die ganzen spannenden Case Studies werden die SEKT Technologien benutzen und wir werden furchtbar viele Erfahrungsberichte sammeln. Sehr spannende Zeiten! Auch dem BMBF Projekt SmartWeb, an dem ich eher am Rande beteiligt bin, stehen spannende Zeiten bevor: die Fußball WM ist eines ihrer Case Studies! Na, da hat man sich ja mal was vorgenommen. Die Semantische Erweiterung der Wikipedia steht an, ebenso der Workshop auf der WWW zum Thema Ontologieevaluierung -- dem Thema meiner Dissertation, die nebenher auch noch entstehen soll.

Soviel zur Arbeit. Privat? Das ist immer viel schwerer einzuschätzen. Ich würde gerne eine USA-Reise machen. Am liebsten dieses Jahr, wenn es die Zeit erlaubt, sonst halt 2007. Quer durch die Staaten. In Neuengland anfangen und rüber nach Kalifornien. Das wäre cool. Mal wieder auf den RatCon gehen. Neue Freunde kennenlernen, alte behalten, noch ältere neu entdecken (Michael, wenn Du das liest, meld Dich doch mal!)

Ich will dieses Jahr ebenso gesundbleiben wie die vorhergehenden. Na gut, ein wenig öfter zum Arzt gehen kann nicht schaden. Ich merke, dass ich nicht mehr 16 bin. Etwas abnehmen, das wäre cool! Mehr Bewegung und etwas gesünder essen. Gar nicht mal unbedingt weniger. Nur besser. Und halt Bewegung. Seufz.

Weiterhin so häufig ins Kino gehen. Aber mehr darüber Bloggen. Vernünftigere Kritiken schreiben. Ich merke in letzter Zeit bin ich etwas blahblah. Text ist da, aber kein Inhalt mehr. Will ich wieder ändern. Seht euch nur das gestrige Post an. Wen hätte ich dadurch überzeugt, zum Konzert mitzukommen, wenn man die Bands nicht schon kennt. Das haben weder Amber noch Saltatio verdient.

Die nutkidz sind da, haben gar eine neue Folge erhalten! 2006 werden viel mehr neue Folgen kommen.

Ach ja, die eigenen Projekte. Das DSA4 Werkzeug. Ich habe versprochen, zum Status etwas zu schreiben, noch vor Weihnachten. Wie üblich habe ich mein Versprechen bezüglich des DSA4 Werkzeugs gebrochen. Na ja, fast. Geschrieben habe ich es, aber nicht veröffentlicht. Ich muss das noch umarbeiten. Überhaupt, DSA: dieses Jahr habe ich da einiges getan: sowohl an der Sieben Gezeichneten Kampagne mitgearbeitet, wie auch am Jahr des Feuers. Soviel werde ich nächstes Jahr wohl kaum auf die Beine stellen. Obwohl - ein Abenteuervorschlag ist eingereicht. Ein sehr gewagter. Mal sehen, was rauskommt.

Kurz, 2006 wird ein interessantes Jahr. An guten Vorsätzen mangelt es mir nicht. An Möglichkeiten zum Glück auch nicht. An guten Freunden und Kollegen, mit denen ich meine Ziele erreichen kann, ebenfalls nicht. Was also soll schon schiefgehen? Jeder Tag, nicht nur jedes Jahr, ist ein neuer Anfang. Man darf nur nicht vergessen, sich mal auszuruhen. Hinsetzen. Nachdenken. Das fehlt mir ein wenig. Das wäre eigentlich ein guter Vorsatz.

Mehr Zeit zum Nachdenken.

Im Moment bin ich aber etwas in Eile, verzeiht. Jetzt zunächst zum Konzert, und dann womöglich gleich nach Erfurt, beziehungsweise, nach Lützensömmern, tief im Thüringischen. Dort Silvester feiern, und das nächste Jahr begrüßen. Aber ich werde mir diesen guten Vorsatz im Hinterkopf behalten. Mehr Zeit zum Nachdenken.

Einen guten Rutsch!

Amber in Stuttgart

Yeah! Morgen spielen Amber und Gefährten in Stuttgarter Landespavillon. Ich gehe davon aus, euch zahlreich bei dem Konzert zu sehen. Sie sind als Support für Saltatio Mortis dort. Auch die, sehr lohnenswert.

Amber promotet zur Zeit mit der Tour ihr zweites Album, Rabenflug. Hörproben gibt es auf Ambers Website. Hört rein, kommt zu dem abschließenden Konzert der laufenden Tour in Stuttgart, morgen, am 30.12, und habt eine Menge Spaß!

Komplette nutkidz

Yeah! Die nutkidz sind wieder vollständig online. Alle bisherigen 38 Folgen, in deutsch und in englisch.

Und das beste: Gerüchten zufolge soll bald sogar eine neue Folge kommen... ;)

Zum 500.

Herzlichen Glückwünsch an Schwesterchen für Ihren 500. Eintrag in nakit-arts. Wow, 500 Einträge! Sehr fleißig.

Spaßigerweise ist dieser Eintrag wiederum der 250. Eintrag auf Nodix. Koinzidenz.

FOAF browser

Thanks Josef, thanks Pascal! I have complained that Morten's Foaf explorer is still down, they, instead of complainig as well, pointed me to their own FOAF explorers: Josef has his Advanced FOAF Explorer, very minimalistic, but it works! And Pascal points to Martin Borho's FOAFer. FOAFer has a few nice properties.

Thank you guys, your sites are great!

Is your source code there? Because both of your tools lack a bit in looks, to be honest. And do you really think, users like to see SHA1 sums? Or error messages? (Well, actually, that was OK, that helped me discover a syntax error in the AIFB FOAF files). Please, don't misunderstand me: your site really are great. And I like to use them. But in order to reach a more general audience, we need something slicker, nicer.

Maybe a student in Karlsruhe would like to work on such a thing? Email me.

Nutkidz sind wieder da!

Auf nutkidz.de finden sich die ersten drei Folgen der nutkidz wieder online! Nach dem Angriff auf Nodix mussten die Sicherheitskopien von meinem alten Rechner im Keller geholt werden. Jetzt ist die Technik erneuert: es gibt einen nutkidz-feed, so dass man sich auch mit seinem RSS-Reader den Webcomic ins Haus holen kann! So einfach sollte jeder Webcomic zu lesen sein.

In den nächsten Tagen werden die weiteren Folgen relativ bald aufeinander kommen, ich plane auf mindestens zwei oder drei täglich.

Ach ja -- und auf nutkidz.net gibt es von jetzt an die nutkidz auch auf Englisch.

New tagline is my New Year's resolution

I just changed the tagline of this blog. The old one was rather, hmm, boring:

"Discovering the Semantic Web, Ontology Engineering and related technologies, and trying to understand these amazing ideas - and maybe sharing an idea or two... "

The new one is at the same time my new year's resolution for 2006.

"Kicking the Semantic Web's butt to reality"

'nuff said, got work to do.

Fellow bloggers

Just a few pointers to people with blogs I usually follow:

  • Max Völkel, a colleague from the AIFB, soon moving to the FZI and right now visiting DERI. He obviously likes groups with acronyms. And he's a fun read.
  • Valentin Zacharias, who has deeper thoughts on this whole Semantic Web stuff than most people I know, working at the FZI. He's often a thought-provoking read.
  • Planet RDF. The #1 blog on news for the (S/s)emantic (W/w)eb, both with major and minor initials. That's informative.
  • Nick Kings from BT exact. We are working together on the SEKT project, and he just started to blog. Welcome! A long first post. But the second leads to a great video!
  • Brendan Eich, one of the Mozilla gurus. I want to know where Mozilla is headed to - so I read his musings.
  • PhD. It's not a person, it's a webcomic, granted, but they offer a RSS feed for the comic. Cool. I always look forward for new episodes.

So, if you think I should read you, drop me a note. I especially like peers, meaning, people who like I do are working on the Semantic Web, maybe PhD students, and who don't know the answer to anything, but like to work on it, making the web come real.

Nacktschnecken - Soundtrack?

Heute in der Sneak gewesen im Stuttgarter Metropol - wie fast jeden Montag. In letzter Zeit kamen einige ziemlich coole Filme, die ich eigentlich kaum weiterempfehlen kann. Terkel in Trouble (Trailer auf kino-zeit.de, dänische Website zum Film), der vielleicht krasseste Film seit langem, mit soviel Menschenverachtung und Witz, dass einem South Park schon moralisch vorkommt. Oder Kiss Kiss Bang Bang - einer der besten Filme des Jahres, aber wiederum so überaus fies, dass ich mir schwertue jemanden weiterzuempfehlen. So auch heute.

Nacktschnecken ist ein österreichischer Film, und das hört man auch. Und es geht um einen Pornodreh, und was dabei alles schiefgehen kann. So, wen das nicht abgeschreckt hat, der kann reingehen.

Aber eines nervt mich echt. Weil der Film bislang ja nur in der Sneak lief (und auf einem tschechichsen Festival) finde ich im Netz keine Infos darüber, was im Soundtrack war. Ein paar Songs waren echt cool! Insbesondere der Song, den sie bei diesem Würfelspiel gespielt haben. Wie heißt der? Von wem ist der?

Letztlich: der Link von Mr Sneak ging zu kino-how.de. Hier kann man über die Sneakfilme abstimmen. Mit Gewinnspiel. Für weitere Sneakkarten.

More FOAF

Wow, I never can't get enough FOAF :) Besides my Nodix-FOAF file the AIFB Portal now also offers a FOAF export for all the people at the AIFB (using URIs for the people besides the mailbox SHA1-Sum as identifiers. Hah! FOAFers won't like that, but TimBL told us to do it this way in Galway a few weeks ago).

If you point your smushers at the FOAFs, I wonder if you can also compile the SWRC-output into it, as they use the same URI? And can you also, by adding my own FOAF from Nodix, that I am the same person? Anyone dare's to try? :)

It's a pity Morten's FOAF explorer is down, I'd really like to try it out and browse. Isn't there something similar out there?

A tidbit more on that is also posted on the AIFB blog, but from a different point of view.

Kroatiens nächsten drei WM Spiele 2006

Nachdem gestern die ziemlich sicheren Spiele aufgeschrieben wurde, hier der weitere Verlauf für Kroatien. Gewisse Diskrepanzen zur Realität können sich durch Tatsachen einstellen.

Kaiserslautern, Montag, 26. Juni, 16 Uhr, Zweite Runde: Italien gegen Kroatien

Haushoher Favorit des Spiels ist klar der mehrfache Weltmeister. Die Italien sind jedoch etwas zu leichtherzig, und Kroatien schafft es den alten Nachbar gegenüber der Adria zu besiegen. Das Spiel wird als SmartWeb Demonstrationsobjekt benutzt (erstes K.O.-Spiel in Kaiserslautern!), doch das geht unter weil es danach zu Schlägereien und Randalen kommt in verschiedenen deutschen Städten. Sehr traurig so was.

Hamburg, Freitag, 30. Juni, 20 Uhr, Viertelfinale: Kroatien gegen Frankreich

Mal wieder ein Weltmeister. Diesmal will Frankreich die Schlappe in 2002 wettmachen, doch Kroatien ist in Hochform nach den bisherigen Spielen. Doch es ist eine Wiederholung der WM 1998. Damals besiegte Kroatien im Viertelfinale den Favoriten Deutschland, und verlor dann im Halbfinale gegen den späteren Weltmeister Frankreich. So auch hier. Im Viertelfinale besiegen wir den Favoriten!

München, Mittwoch, 5. Juli, 20 Uhr, Halbfinale: Kroatien gegen Brasilien

Das wir die Brasilianer auch dummerweise jetzt schon wiedertreffen müssen! Im Finale wäre es doch viel spannender. Beide Mannschaften haben eine regelrechte Odyssee durch Deutschland hinter sich gebracht, um sich nach dem Eröffnungsspiel in Berlin hier in München wieder zu begegnen. Wer darf wieder nach Berlin? - nur, diesmal zum Finale. Am Tag zuvor konnte sich Deutschland als einer der beiden Finalteilnehmer qualifizieren - und wer wird nun den Gastgeber im Finale fordern? Alle tippen auf den Weltmeister.

Letztes Spiel wird dann entweder in Stuttgart gegen Holland um Platz Drei, oder in Berlin gegen Deutschland um den Titel. Egal wie es ausgeht: das Spiel wird am nächsten Tag Schlagzeilen machen!

Kroatiens erste drei WM Spiele 2006

Gestern wurde ich durch die Nachricht, dass die WM Gruppen ausgelost werden, eher überrascht (Dank an Rudi für den Hinweis), aber jetzt bin ich über die Ergebnisse erfreut - das werden sehr schöne Spiele!

Berlin, Dienstag, 13. Juni, 20 Uhr: Brasilien gegen Kroatien

Yeah! Das Megaspiel für Kroatien. Der amtierende Weltmeister. Der Favorit. Die Nummer 1 der Weltrangliste. Das erste Spiel auf der WM für beide Länder. Gab es schon mal ein Spiel zwischen den beiden? Nicht in einem offiziellen Turnier. Aber es gab ein Freundschaftsspiel letztes Jahr. Ergebnis? 1:1. Na, nichts ist unmöglich!

Nürnberg, Sonntag, 18. Juni, 14 Uhr: Japan gegen Kroatien

Die Japaner haben wir schon mal besiegt - nur ist das schon ein paar Jahre her, und es war knapp. Beim ersten Auftritt der Kroaten und der Japaner bei einer WM. Doch diesmal sind die Vorzeichen umgedreht. Japan ist Nummer 15 auf der Weltrangliste, Kroatien die 20. Es wird ein enges Spiel. Und - auch Japan konnte dieses Jahr beim Confederations Cup gegen Brasilien ein 2:2 rausholen, und hätte gar fast gewonnen...

Stuttgart, Donnerstag, 22. Juni, 20 Uhr: Kroatien gegen Australien

Ein Heimspiel für die Kroaten. Und in der kroatischen Mannschaft steckt nicht nur ein Australier kroatischer Herkunft. Die Australien waren ganz schön angekäst, als Australier wieder nach Kroatien zurückwanderten, weil sie dort in der Nationalmannschaft vorankommen konnten. Jetzt wird sich Australien rächen wollen. Aber in Stuttgart hat Kroatien eine große kroatische Gemeinde in der Hinterhand. Nur - wird schwerlich dieses Spiel entscheidend werden.

Eine starke Gruppe - aber es gibt ja auch keine leichten Gruppen mehr. Ein klarer Favorit. Aber es kommt auch der Zweite weiter. Noch sechs Monate, dann rollt der Ball!

Und ich interessiere mich gar nicht für Fußball.

Du nimmst meine Schnalle, ich nehm Deine...

Tja, manchmal denkt man, dass es eigentlich ganz einfach ist: ich bin mit Deiner Frau durchgebrannt, hier, nimm Du dafür meine, OK?

Zumindest dachte dass dieser Herr. Na, mal sehen, wie es ausgeht.

A blog for the AIFB

Although I blogged here about the AIFB as well - the great place I am working at - Jens Hartmann suggested to create an own, dedicated AIFB-Blog on ontoworld. It's in beta still, kind of. We hope that other AIFB-people will blog there as well, and so keep you up to date with AIFB-stuff, blog about our papers, workshops, conference attendances, break-through results, but also the great weather in Karlsruhe and stories that happened here.

So, while I will still continue to post on the Semantic Web here, the more workplace related stuff will be found there: at the new AIFB-Blog.

Der Papst ist tot, Hurra?

Die Umfrage Perspektive Deutschland, die ich ansonsten eigentlich nur empfehlen kann, weil die Fragen echt in die Tiefe gehen und die daraus generierten Reports durchaus inhaltlich bemerkenswert sind, hat mich bei der derzeit laufenden Umfrage doch irgendwie überrascht:

Frage: Wie haben die Ereignisse der letzten zwölf Monate, z.B. der Tod von Papst Johannes Paul II., ihre Meinung gegenüber der katholischen Kirche beeinflusst?

Antwortmöglichkeiten: Stark verbessert - verbessert - nicht verändert - verschlechtert - stark verschlechtert - weiß nicht

Sonst denken die von der Umfrage aber durchaus etwas nach.

Nodix schon wieder angegriffen

Nachdem die Nodix-Seiten schon vor ein paar Wochen Opfer eines Angriffs wurden, hat es uns diesmal fies erwischt. Gab es letztes Mal nur ein Defacement, wurden wir diesmal vollständig gelöscht (nur ein perl-Script wurde hochgeladen, mit spanischen Kommentaren, die, laut Übersetzung, recht fies waren. Aber unpersönlich).

Tja, das bedeutet dass, bis ich etwas Zeit finde, die Nodix-Seiten nur teilweise funktionstüchtig sind. Die DSA4 Werkzeug-Seiten sind ganz weg, ebenso Something*Positive. Nutkidz hat eine weiße Seite, nakit-arts ist halbwegs auf den Beinen, Semantic Nodix läuft großteils schon, Nodix selber ist eher schlecht als recht wieder da.

Grrr. Ich Pappnase. 1und1 hat mich noch gewarnt, und mögliche Fehlerquellen genannt, ich hatte das dummerweise ignoriert. Hat man nun davon.

Annotating axioms in OWL - Reloaded

Yesterday I sent a lengthy mail to the OWL-ED Mailinglist] about how to annotate axioms. Peter Patel-Schneider himself, first author of the OWL Semantics specification, told me in nice words that my solution sucked heavily, by pointing out that the semantics of annotations in OWL are a tiny bit different than I thought. Actually, they are not at all as I thought. So, in the evening hours, instead of packing my stuff for a trip, I tried to solve the problem anew. Let's see where the problem will be this time.

Peter, you were right, I was wrong. I took a thorough look at the Semantics, and I had to learn that my understanding of annotations was totally screwed. I thought they would be like comments in C++ or Prolog, but instead they are rather like a second ABox over (almost) the whole universe. This surprised me a lot.

But still, I am not that good at giving up, and I think my solution pretty much works syntactically. Now we need only a proper Semantics to get a few things right.

What would be the problem? Let's make an example. I need some kind of Syntax to give axioms name. I will just take Name":" Axiom. This is no proposal for the Abstract Syntax extension, this is just for now.

Axiom1: SubClassOf(Human Mortal)
Axiom2: Individual(Socrates type(Human))

Do they entail the following?

Axiom3: Individual(Scorates type(Mortal))

Well, pitily they don't. Because the Axiom3 has a name, Axiom3, that is not entailed by Axiom1 and Axiom2. Their contents would be entailed, but the name of the axiom would not.

I guess, this is the problem Peter saw. So, can we solve it?

Well, yes, we can. But it's a bit tricky.

First, we need the notion of Combined Inverse Functional Properties, CIFP. A CIFP has several dimensions. A CIFP with dimension 1 ist a normal Inverse Functional Property. A CIFP with dimension 2 over the properties R, S can be represented with the following rule: a R c, a S d, b R c, b S d -> a = b. This means, in a two dimensional space I can identify an individual with the help of two roles. More on this here: http://lists.w3.org/Archives/Public/semantic-web/2005Feb/0095.html

Second, we extend the semantics of OWL. Every axiom entails reifying annotations. This means:

SubClassOf(Human Mortal)

entails

Individual(Statement1 type(rdf:statement)
annotation(rdf:subject Human)
annotation(rdf:property owl:SubClassOf)
annotation(rdf:object Mortal))

or, in N3:

Human owl:subClassOf Mortal.

entails

Statement1 rdf:type rdf:statement.
Statement1 rdf:subject Human.
Statement1 rdf:property owl:subClassOf.
Statement1 rdf:object Mortal.
rdf:subject rdf:type owl:AnnotationProperty.
rdf:predicate rdf:type owl:AnnotationProperty.
rdf:object rdf:type owl:AnnotationProperty.

Third, we have to state that we have a 3D-CIFP for statements over rdf:subject, rdf:property and rdf:object*. This is to ensure that Statement1 always maps to the same element in the universe, even though an OWL API could give it a blank node, or a different URI everytime (mind you, I am not suggesting to extend the OWL language with CIFPs, I just say that it is used here in order to state that all triples with the same subject, object and predicate actually is the same triple).

Fourth, the above statement also entails

Individual(Axiom1 type(owl11:axiom)
annotation(owl11:consistsOf Statement1))

or, in N3:

Axiom1 rdf:type owl11:axiom.
Axiom1 owl11:consistsOf Statement1.
owl11:consistsOf rdf:type owl:AnnotationProperty.

Fifth, owl11:consistsOf needs to be an n-dimensional CIFP with n being the number of triples the original axiom got translated to (in this case, happy us!, n=1).

This assures that an axiom is always the same, whatever it's name is, as long as it expresses the same thing. Thus, in our example, Axiom3 would indeed be entailed by Axiom1 and Axiom2. So, even if two editors load an ontology an annotate an axiom, they could later interchange and find each others annotation attached to the correct axiom.

This is only a rough sketch of the way, and yes, I see that the Interpretation gets filled up with a lot of annotations, but I still think that this is quite easy to implement, actually. Both the OWL API by Bechhofer and Volz and the KAON2 API by Motik offer access to axioms on an ontology level, and also offer the possibility to check if they are the same anyway, if I remember correctly (which is basically a shortcut for the whole semantic entailment and CIFP-stuff proposed earlier). All they need is a further field containing the URI of the axiom.

As said, this looks far more nasty than it actually is, and for most practical reasons it won't do much harm. Now we finally can annotate axioms, yeeeha!

Merrily awaiting Peter to acknowledge that this is a brilliant solution :) Or else tell me I did it all wrong again, so that I have to think over the weekend how to solve this problem again.

Cheers, denny

 *What I mean with that is the following rule: a=b :- a rdf:subject s, a rdf:property p, a rdf:object o, b rdf:subject s, b rdf:property p, b rdf:object o

Annotating axioms in OWL

This was sent to the OWLED-List by me, that prepares to come up with an OWL 1.1 recommendation. The week before, Alan Rector suggested to add the possibility to annotate axioms in OWL, which is currently not possible. There is many a good use for that, like provenance, trust, and son on. But the discussion wasn't too fruitful, so I suggested the following solution.

After it came up in discussion last week, I hoped an elegant solution for annotating axioms would arise. Pitily, no one had a brilliant idea, so I went ahead and tackled the problem in my mediocre way.

First, what do I want to achieve with my solution:

  1. Don't crack the Semantic Web stack. The solution has to be compatible to XML, RDF and OWL. I don't want to separate OWL from RDF, but to offer a solution that is able to be handled by both.
  2. We want to annotate not just entities, but also axioms. Thus an axiom needs to be able to be a subject in a statement. Thus an axiom needs to have an URI.
  3. The solution must be easy to implement, or either people will get my FOAF-file and see whom I care about and hurt them.

Did I miss something? I found two solutions for this problem.

A) Define the relationship between an ontology (which does have an URI) and the axioms stated inside. Then we can talk about the ontologies, annotate those, add provenance information, etc. Problem: after importing axioms from one ontology into another, those information is lost. We would need a whole infrastructre for Networked Ontologies to achieve that, which is a major and worthy task. With this solution, you can annotate a single axiom by putting it alone into an ontology, and claim that when annotating the ontology you actually annotate the axiom as well. Not my favourite solution, because of several drawbacks which I won't dwell in deeper if not asked.

B) The other solution is using Reification (stop yelling and moaning right now!). I'm serious. And it's not that hard, really. First, the OWL specification offers a standard of how to translate the Axioms into triples. Second, thte RDF specification offers a standard way to reify a triple. With the RDF reification we can give a triple a name. Then we can introduce a new resource type owl11:axiom, where its instances contains the triples that were translated from a certain DL Axiom. This rdf resource of type owl11:axiom is then the name/URI of the original DL Axiom.

RDF-triples that have a subject of type rdf:statement or owl11:axiom don't have semantics with regards to OWL DLs Model Theoretic Semantics, they are just syntactic parts of the ontology in order to allow the naming of axioms in order to annotate them.

For example, we say that all Humans are Mortal. In Abstract Syntax this is

SubClassOf(Human Mortal)

In RDF triples (N3) this is:

:Human rdfs:subClassOf :Mortal.

Now reifiying this we add the triples:

:statement1 rdf:type rdf:statement.
:statement1 rdf:hasSubject :Human.
:statement1 rdf:hasPredicate owl:subClassOf.
:statement1 rdf:hasObject :Mortal.
:axiom1 owl11:consistsOf :statement1.

Now we can make annotations:

:axiom1 :bestBefore "24/12/2011"^xsd:date.
:axiom1 :utteredBy :Aristotle.

Naturally, :bestBefore and :utteredBy have to be Annotation Properties. When an axiom is broken up in more than one triple, the reasone of having an extra owl11:axiom instead of simply using rdf:statement should become clear.

Does this solution fulfill the given conditions?

  1. The Semantic Web stack is safe and whole. RDF Semantics is adhered to, and OWL semantics is fine, and all syntax regulations imposed by XML and RDF/XML are regarded. Everything is fine.
  2. Yep, we can annotate single axioms. Axioms have URIs. We can annotate our metadata! Yeah!
  3. Is it easy to implement? I think it is: for reading OWL ontologies, a tool may just ignore all those extra triples (it can easily filter them out), and still remain faithful to the standard semantics. Tools that allow to name axioms (or annotate them) and want to deal with those, have to simply check for the correct reification (RDF toolkits should provide these anyway), and get the axiom's URI.

Problems that I see: I identified two problems. First, what happens, if those triples get separated from the other actual axiom triples? What if they get ripped apart and mushed into another ontology? Well, that problem is somewhat open for OWL DL and Lite anyway, since not all axioms map to single triples. The answer probably is, that reification would fail in that case. Strict reading could be that the ontology leaves OWL DL then and moves to OWL full, but I wouldn't require that.

Second problem, and this is by far more serious, is that people can't stand reification in RDF, that they simply hate it and that alone for that they will ignore this solution. I can only answer that reification in practise is probably much easier than expected when done properly, due to some short-hand notations available in RDF/XML-serialization, and other syntaxes. No one holds us back from changing the Abstract Syntax and the OWL XML Presentation Syntax appropriately in order to name axioms far more easy than in the proposed RDF/XML-Syntax. Serializations in RDF/XML-Syntax may get yucky, and the RDF graph of an OWL ontology could become cluttered, but then, so what? RDF/XML isn't read by anyone anyway, is it? And one can remove all those extra triples (and then the annotations) automatically if wished, without changing the Semantics of the ontology.

So, any comments on why this is bad? (Actually, I honestly think this is a practicable solution, though not elegant. I already see the 2007 ISWC best paper award, "On the Properties of Higher Order Logics in OWL"...)

I hope you won't kill me too hard for this solution :) And I need to change my FOAF-file now, in order to protect my friends...

Job at the AIFB

Are you interested in the Semantic Web? (Well, probably yes or else you wouldn't read this). Do you want to work at the AIFB, the so called Semantic Web Machine? (It was Sean Bechhofer who gave us this name, at the ISWC 2005) Maybe this is your chance...

Well, if you ask me, this is the best place to work. The offices are nice, the colleagues are great, our impact is remarkable - oh well, it's loads of fun to work here, really.

We are looking for a person to work on KAON2 especially, which is a main building block of many a AIFB software, as for example my own OWL Tools, and some European Projects. Mind you, this is no easy job. But if you finished your Diploma, Master or your PhD, know a lot about efficient reasoning, and have quite some programming skills, peek at the official job offer (also available in German).

Do you dare?

Semantic Web Gender Issue

Well, at least they went quite a way. With Google Base one can create new types of entities, entities themselves, and search for them. I am not too sure about the User Interface yet, but it's surely one of the best actually running onbig amounts of data. Nice query refinement, really.

But heck, there's one thing that scares me off. I was looking today for all the people interested in the Semantic Web, and there are already some in. And you can filter them by gender. I was just gently surprised about the choices I was offered when I wanted to filter them by gender...

Hier fehlt noch ein Bild.

Oh come on, Google. I know there are not that many girls in computer science, but really, it's not that bad!

What is a good ontology?

You know? Go ahead, tell me!

I really want to know what you think a good ontology is. And I will make it the topic of my PhD: Ontology Evaluation. But I want you to tell me. And I am not the only one who wants to know. That's why Mari Carmen, Aldo, York and I have submitted a proposal for a workshop on Ontology Evaluation, and happily it got accepted. Now we can officially ask the whole world to write a paper on that issue and send it to us.

The EON2006 Workshop on Evaluation of Ontologies for the Web - 4th International EON Workshop (that's the official title) is co-located with the prestigous WWW2006 conference in Ediburgh, UK. We also were very happy that so many reknown experts accepted our invitation to the program committee, thus ensuring a high quality of reviews for the submissions. The deadline is almost two months away: January 10th, 2006. So you have plenty of time to write that mind-busting phantastic paper on Ontology Evaluation until then! Get all the details on the Workshop website http://km.aifb.uni-karlsruhe.de/ws/eon2006.

I really hope to see some of you in Edinburgh next May, and I am looking for lively discussions about what makes an ontology a good ontology (by the way, if you plan to submit something - I would love to get a short notification, that would really be great. But it is by no means requested. It's just so that we can plan a bit better).

Regenbogen

Ich bin gerade in Galway, und hier regnet es ständig. Wirklich. Zwar meist nur kurz, aber doch halt immer wieder neu.

Dafür wurde ich heute mit einem schier unglaublichen Regenbogen belohnt: er ging wirklich über den ganzen Horizont, ein vollständiger Bogen! So was habe ich noch nie gesehen. Das Bild ist untertrieben, deutlich, in Wirklichkeit schien er noch viel heller.

Besonders spannend war, dass er kaum zwei-, dreihundert Meter entfernt aus dem Wasser zu steigen schien. Nicht irgendwo weit weg, er war direkt da - man kann sogar die Häuser durch den Regenbogen hindurch auf dem Bild sehen, der Regenbogen war vor den Häusern. So etwas habe ich noch nie zuvor gesehen. Wahnsinnig beeindruckend.

Ich hoffe, dass ich bald noch ein paar bessere Bilder bekomme.

Hier fehlt noch ein Bild.

ISWC impressions

The ISWC 2005 is over, but I'm still in Galway, hanging around at the OWL Experiences and Direction Workshop. The ISWC was a great conference, really! Met so many people from the Summer School again, heard a surprisingly number of interesting talks (there are some conferences, where one boring talk follows the other, that's definitively different here) and got some great feedback on some work we're doing here in Karlsruhe.

Boris Motik won the Best Paper Award of the ISWC, for his work on the properties of meta-modeling. Great paper and great work! Congratulations to him, and also to Peter Mika, though I have still to read his paper to form my own opinion.

I will follow up on some of the topics from the ISWC and the OWLED workshop, but here's my quick, first wrap-up: great conference! Only the weather was pitily as bad as expected. Who decided on Ireland in November?

KAON2 and Protégé

KAON2 is the Karlsruhe Ontology infrastructure. It is an industry strength reasoner for OWL ontologies, pretty fast and comparable to reasoners like Fact and Racer, who gained from years of development. Since a few days KAON2 also implements the DIG Interface! Yeah, now you can use it with your tools! Go and grab KAON2 and get a feeling for how good it fulfills your needs.

Here's a step to step description of how you can use KAON2 with Protégé (other DIG based tools should be pretty the same). Get the KAON2 package, unpack it and then go to the folder with the kaon2.jar file in it. This is the Java library that does all the magic.

Be sure to have Java 5 installed and in your path. No, Java 1.4 won't do it, KAON2 builds heavily on some of the very nice Java 5 features.

You can start KAON2 now with the following command:

java -cp kaon2.jar org.semanticweb.kaon2.server.ServerMain -registry -rmi -ontologies server_root -dig -digport 8088

Quite lengthy, I know. You will probably want to stuff this into a shell-script or batch-file so you can start your KAON2 reasoner with a simple doubleclick.

The last argument - 8088 in our example - is the port of the DIG service. Fire up your Protege with the OWL plugin, and check in the OWL menu the preferences window. The reasoner URL will tell you where Protege looks for a reasoner - with the above DIG port it should be http://localhost:8088. If you chose another port, be sure to enter the correct address here.

Now you can use the consistency checks and automatic classification and all this as provided by Protege (or any other Ontology Engineering tool featuring the DIG interface). Protégé tells you also the time your reasoner took for its tasks - compare it with Racer and Fact, if you like. I'd be interested in your findings!

But don't forget - this is the very first release of the DIG interface. If you find any bugs, say so! They must be squeezed! And don't forget: KAON2 is quite different than your usual tableaux reasoner, and so some questions are simply not possible. But the restrictions shouldn't be too severe. If you want more information, go to the KAON2 web site and check the references.

Noch eine Blume: die Tulpe

Überraschend: auf n-tv lief gerade ein Bericht über die Geschichte der Tulpe. Tulpen! Das kann doch nicht interessant werden.

Oh doch! Ich will nicht alles aus der Sendung wiederholen, aber Tulpen führten vielen hunder Jahren zu einem ersten Zusammenbruch der holländischen Börse. Zwar waren auch die allgemein bekannten einfarbigen Blüten, mit ihren kräftigen Farben, beliebt, doch wirklich teuer waren die mehrfarbigen Blüten wie die hier dargestellte Semper Augusta. 10.000 Gulden wurde für sie bezahlt. 150 Gulden war der normale Verdienst einer Familie zu jener Zeit, im Jahr - umgerechnet bewegen wir uns also, für eine einzelne Blume wohlgemerkt!, im Millionen Euro Bereich!

Eine andere Blume wurde gegen eine Villa im niederländischen Harleem getauscht.

Die Tulpenmanie beschädigte die Niederlande schwer: die Preise waren deutlich zu hoch, teilweise wurde mit Tulpen gehandelt, die noch nicht mal gepflanzt waren. Innerhalb weniger Tage brach der Markt plötzlich zusammen, und viele Niederländer mussten in den Ruin.

Die Semper Augusta, die man hier sieht, gibt es heute leider nicht mehr. Schade. Eine wirklich schöne Blume...

Mehr nachzulesen zum Beispiel im Wikipedia-Artikel zur Tulpenmanie.

Hier fehlt noch ein Bild.

Flower of Carnage

Im Original

Übersetzung: Blume von Tod und Zerstörung

Shindeita
Asa ni
Tomorai no
Yuki ga furu

Schmerzvoller Schnee fällt im Morgengrauen,
Streunende Hunde heulen
und Getas Schritte durchdringen die Luft

Hagure inu no
Touboe
Geta no
Otokishimu

Das Gewicht der Milchstraße drückt auf meine Schultern
aber ein Schirm hält die Dunkelheit,
die alles ist, was ist.

Iin na naomosa
Mitsumete aruku
Yami wo dakishimeru
Janomeno kasa hitotsu

Ich bin eine Frau,
die an der Grenze zwischen Leben und Tod wandelt,
leerte meine Tränen, viele Monde her.

Inochi no michi wo
Yuku onna
Namida wa tooni
Sutemashita

All das Mitleid, Tränen, Träume,
Die verschneiten Nächte und das Morgen haben keine Bedeutung

Furimuita
Kawa ni
Toozakaru
Tabinohima

Ich tauchte meinen Körper in den Fluss des Zorns
und warf meine Weiblichkeit fort, viele Monde her.

Itteta tsuru wa
Ugokasu
Naita
Ame to kaze

Im Auftrag des Himmels,
sie sind unsere Soldaten,
loyal, unbesiegbar und mutig

Kieta mizu mo ni
Hotsure ga miutsushi
Namida sae misenai
Janomeno kasa hitotsu

Jetzt ist ihre Zeit gekommen,
das Land ihrer Eltern zu verlassen,
ihre Herzen aufgeputscht von ermutigenden Stimmen

Urami no michi wo
Yuku onna
Kokoro wa tooni
Sutemashita

Feierlich werden sie bestimmt
nicht lebendig zurückzukehren,
ohne Sieg.

Giri mo nasake mo
Namida mo yume no
Kinou mo ashita mo
Henno nai kotoba

Hier, daheim,
warten die Bürger auf Euch.
In fremden Ländern,
die mutigen Truppen.

Urami no kawa ni
Mi wo yudanete
Honma wa tooni
Sutemashita

Statt Freundlichkeit von Jemandem
der mir egal ist
möchte ich lieber Egoismus
von Dir

Mein Problem mit asiatischen Texten ist, dass ich sie häufig auch nicht verstehe, wenn man sie übersetzt. Aus dem Kill Bill Soundtrack.

Hommingberger Gepardenforelle

Ich kann mir zwar nicht vorstellen, dass es ernsthaft hilft, aber versuchen kann man es ja mal, ein paar Kollegen dabei zu helfen, im Suchmaschinenwettbewerb von Heise beim erfundenen Begriff Homminberger Gepardenforelle möglichst weit nach vorne zu kommen. Linkt noch jemand mit? Wäre cool, wenn die hübsch weit nach vorne kommen. Mehr Erklärung gibt es übrigens hier, auf der Seite der Uni Kassel dazu.

Ab sofort mit Backlinks

So, jetzt hat auch der Nodix Blog dieses coole, unverzichtbare Feature: Backlinks! Ihr verlinkt auf einen Nodix-Post, und die Backlinks merken das.

Na ja, zumindest in der Theorie.

Mit Dank an Blogger für's Implementieren.

KAON2 OWL Tools V0.23

A few days ago I packaged the new release of the KAON2 OWL tools. And they moved from their old URL (which was pretty obscure: http://www.aifb.uni-karlsruhe.de/WBS/dvr/owltools ) to their new home on OntoWare: owltools.ontoware.org. Much nicer.

The OWL tools are a growing number of little tools that help people working with OWL. Besides the already existing tools, like count, filter or merge, partly enhanced, some new entered the scene: populate, that just populates an ontology randomly with instances (which may be used for testing later on) and screech, that creates a split program out of an ontology (you can find more information on OWL Screech' own website).

A very special little thing is the first beta implementation of shell. This will become a nice OWL shell that will allow to explore and edit OWL files. No, this is not meant as a competitor to full-fledged integrated ontology development environments like OntoStudio, Protégé or SWOOP, it's rather an alternative approach. And it's just started. I hope to have autocompletion implemented pretty soon, and some more commands. If anyone wants to join, give me a mail.

Asterix in Gefahr

Dieser Tage erschien der 33. Band der Asterix-Reihe: Gallien in Gefahr. Nein, kein "Asterix rettet Gallien", oder "Asterix bringt Gallien in Gefahr", oder ähnlich, sondern schlicht der reißerische Titel "Gallien in Gefahr". Im französischen heißt der Band Le ciel lui tombe sur la tête, Der Himmel fällt ihnen auf den Kopf, aber bei Asterix waren die Übersetzungen schon immer sehr frei - und meistens dadurch auch herausragend gut! (Ich bin, mein lieber Freund, sehr glücklich, dich zu sehn!" - "Das ist ein Alexandriner." -Asterix und Kleopatra). Vor dem weiterlesen ist es vielleicht sinnvoll, den Band zuerst zu lesen, sonst wird einiges vorweggenommen. Ich verrate zwar nicht, wer stirbt, aber dennoch.

Aber kommen wir zum Inhalt. Überraschend wenig gelacht. Ich muss mal die alten Hefte rauskramen, ob man da auch so wenig gelacht hat, ob man nur die besten Stellen sich gemerkt hat (mein Liebling ist Asterix als Legionär, und da habe ich reichlich gelacht, da bin ich mir sicher). Asterix war immer dafür begann, mit seiner Zeit zu spielen. Das Jahr 50 v. Chr. Ganz Gallien ist von Römern besetzt... Das zeigte einen gewissen Rahmen auf, auch wenn er schon in Vergangenheit arg gestreckt wurde: Die große Überfahrt führte nach Amerika, Asterix im Morgenland nach Indien. Doch diesmal reist Asterix gar nicht, sondern vielmehr kommt die Fremde in das kleine gallische Dorf. Und zwar heftig.

Aliens besuchen das kleine gallische Dorf. Ihr Kampf gegen die Römer hat Gerüchte um ihre letale Geheimwaffe ins ganze Universum verbreitet. Die guten Aliens kommen zuerst, um Asterix zu warnen, und dann kommen die bösen nach, und es kommt zu einer Schlacht zwischen den Außerirdischen. Gewaltig große Bilder, oft eine halbe Seite, ein Panel gar die ganze Seite bedeckend - ungewöhnlich für Asterix.

Man mag den Band befremdlich finden. Aber so schlecht ist er nicht. Es ist zudem nicht schwer durch die oberflächliche Geschichte zu stoßen, zu sehen, was dahinter steckt: das gute Alien ist offensichtlich Micky Maus ohne Ohren, ja, selbst der Name des Planeten von dem sie stammen ist ein Angram auf Walt Disney. Sogar die Details - Tuuns Knöpfe, die Handschuhe, die Gesichtsmimik - stimmen. Und er isst Heißen Hund - deutliche Anspielung auf Amerika. Sein Begleiter, im Stile Schwarzeneggers, eine Mischung aus Terminator und Superman, hingegen ist ein Klon, ein Superheld, austauschbar durch den anderen. Dies sind die Ikonen des Amerikanischen Comics. Ich frage mich nur, was der Name Tuun und seines weisen Mannes Hubs bedeutet?

Die bösen Aliens hingegen stammen vom Planeten Nagma, ebenfalls ein nichts verhehlendes Anagram auf japanische Comics. Auch sind sie so insektenhaft gezeichnet und bewegen und kämpfen, wie man es von machem Manga gewohnt ist. Ihr deutsch, oder vielmehr gallo-römisch, ist schlecht, und die Amerikaner, entschuldigung, die guten Aliens behaupten, dass sie nur alle ihre Erfolgsrezepte kopieren. Das erste, was der Abgesandte der Nagmas macht, als er Asterix entgegentritt, ist es, ihm einen Friedensabkommen anzubieten, doch Asterix begrüßt den Außerirdischen mit Dresche. Und so kommt es zur Schlacht.

Amerikanische Superheldencomics, Walt Disneys Massenproduktion und Mangas überfluten den europäischen Markt, bedrängen die französisch-belgischen Künstler, wollen sich das Geheimnis ihres Zaubertranks aneignen wollen, das Geheimnis ihres Erfolgs. Das hätte man für die Nachricht dieses Bandes halten können. Uderzos Postscriptum zu seiner und Goscinnys Asterix-Reihe, der wahrscheinlich letzte Band des nunmehr 78-jährigen Uderzo. Sein Kommentar zur Geschichte des europäischen und globalen Comics.

Doch nach zwei Dritteln des Bandes fragt man sich, was will er nun sagen? Es wird kreuz und quer Frieden geschlossen und aufeinander geprügelt, die Römer und Piraten kriegen einen Gastauftritt, der wirkt, als ob man die Todo-Sachen der Liste "Was alles in einen Asterix-Comic auftauchen muss" noch abhandeln müsste, und am Ende wird allen das Gedächtnis gelöscht, so dass die Geschichte vollkommen folgenlos bleibt. Uderzo hätte viel sagen können, und man hätte zugehört. So aber deutete er einen Kommentar an, um dann doch nur irgendwie die Geschichte zum Ende zu bringen, bevor Seite 48 erreicht wird. Schade. Und warum sieht der römische Offizier diesmal wie Signore Berlusconi aus?

Sicher nicht der schlechteste Asterix. Wahrscheinlich der ungewöhnlichste.

Why some are disenchanted

In a comment to my last blog entry, Christopher St John wrote:

"I suffered through the 80's Knowledge Representation fad, both academically in the AI program at Edinburgh and as a practitioner at the only company ever to produce a commercial system written in Prolog (that wasn't a Prolog development system.) So I'm familiar with the problems that the Semantic Web effort is attempting to address. Having slogged through real-life efforts to encode substantial amounts of knowledge, I find some of the misty-eyed musings that surround the Semantic Web effort depressing. That "most information on the Web is designed for human consumption" is seen as an obstacle surmountable via tools like RDF is especially sad. On the other hand, I'm always happy to make use of the cool tools that these sorts of things seem to throw off. There's probably a certain Proverbs 26:11 aspect to it as well."

Thanks for your insightful comment, and being new to the field I certainly appreciate some report based on real life experience - and I have to admit to probably be faulty of being misty-eyed myself more than once about the Semantic Web (and probably will be in the future as well).

'"Most information on the Web is designed for human consumption" is seen as an obstacle'. Yes, you are right, this is probably the worst phrased sentence in the Semantic Web vision. Although I think it's somehow true: if you want the computer to help you dealing with today's information overflow, it must understand as much of the information as possible. The sentence should be at least rephrased as "most information on the Web is designed only for human consumption". I think it would be pretty easy to create both human-readable and machine-friendly information with only little overhead. Providing such systems should be fairly easy. But this is only about the phrasing of the sentence - I hope that every Semwebber agrees that the Semantic Web's ultimate goal is to help humans, not machines. But we must help the machines in order to enable them to help us.

The much more important point that Christopher addresses is his own disenchantment with the Knowledge Represenation research in the 80s, and probably by many people with the AI research a generation before. So the Semantic Web may just seem as the third generation of futile technologies to solve AI-complete problems.

There were some pretty impressive results from AI and KR, and the Semantic Web people build on that. Some more, some less - some too much even, forgetting the most important component of the Semantic Web underway: the Web. Yes, you can write whole 15-page papers and file them to Semantic Web conferences and journals and not even once mention anything web-specific. That's bad, and that's what Christopher, like some researchers, does not see as well, the main difference between this work two decades ago and today's line of investigation. The Web changes it all. I don't know if AI and KR had to fail - it probably must have failed, because they were so many intelligent people doing it and so there's no other explanation than that it had to fail due to the premises of its time. I have no idea if the Semantic Web is bound to fail as well today. I have no idea if we will be able to reach as much as AI and KR did in their time, or less, or maybe even more. I am a researcher. I have no idea if the things I do will work.

But I strongly believe it will and I will invest my time and part of my life towards this goal. And so do dozens of dozens other people. Let's hope that some nice thing will be created in the course of our work. Like RDF.

RDF is not just for dreamers

Sometimes I stumble upon posts that leave me wonder, what actually do people think about the whole Semantic Web idea, and about standards like RDF, OWL and the like. Do you think academia people went out and purposefully made them complicated? That they don't want them to get used?

Christopher St. John wrote down some nice experience with using RDF for logging. And he was pretty astonished that "RDF can actually be a pretty nifty tool if you don't let it go to your head. And it worked great."

And then: "Using RDF doesn't actually add anything I couldn't have done before, but it does mean I get to take advantage of tons of existing tools and expertise." Well, that's pretty much the point of standards. And the point of the whole Semantic Web idea. There won't be anything you will be able to do later, that you're not able to do today! You know, assembler was pretty turing-complete already. But having your data in standard formats helps you. "You can buy a book on RDF, but you're never going to buy a book on whatever internal debug format you come up with"

Stop being surprised that some things on the Semantic Web work. And don't expect miracles either.

Eine Viertelmillion

Heute Nacht überschritt die Zahl der Besucher der Nodix-Webseiten die Viertelmillion.

Vielen Dank für die vielen treuen Besucher auf nakit-arts, auf semantic.nodix, hier, und auch auf den zur Zeit deutlich vernachlässigten Seiten nutkidz, something*positive, DSA4 Werkzeug und XML4Ada95. Ihr seid großartig.

Semantic MediaWiki: The code is out there

Finally! 500 nice lines of code, including the AJAX-powered search, and that's it, version 0.1 of the SeMediaWiki project! Go to Sourceforge and grab the source! Test it! Tell us about the bugs you found, and start developing your own ideas. Create your own Semantic Wiki right now, today.

Well, yes, sure, there is a hell of a lot left to do. Like a proper triplestore connecting to the Wiki. Or a RDF-serialization. But hey, there's something you can play with.

Eklige Metzger

Ich will ja gar nicht wissen, was das für ein Zeug ist... bäh!

Hier fehlt noch ein Bild.

Semantic MediaWiki Demo

Yeah! Doccheck's Klaus Lassleben is implementing the Semantic MediaWiki, and there's a version of it running for quite some time already, but some bugs had to be killed. Now, go and take a look! It's great.

And the coolest thing is the search. Just start typing the relation, and it gives you an autoexpansion, just like Google Suggest does (well, a tiny bit better :) Sure, the autoexpansion is no scientific breakthrough, but it's a pretty darn cool feature.

The SourceForge project Semediawiki is already up and running, and I sure hope that Mr Lassleben will commit the code any day soon!

Even better, Sudarshan has already started implementing extensions to it - without having the code! That's some dedication. His demo is running here, and shows how the typed links may be hidden from the source text of the wiki, for those users who don't like it. Great.

Now, go and check the demo!

Auto hat 10000

Nachdem ich diesen Sommer 10.000 geworden bin (Tage), hat es mein Autole heute auch geschafft (Kilometer).

Hier fehlt noch ein Bild.

Angriff auf Nodix

Wow, gestern wurden die meisten Nodix-Seiten Opfer einer Defacement-Attacke, wie auch das Defacement-Archiv Zone-H festhält. Auch nakit-arts erging es so. Und so hat es ausgehen.

So albern es ist, aber ich fühle auch einen gewissen Stolz, Opfer *einer* ersten Defacement-Attacke geworden zu sein. Heißt ja schließlich, dass die Nodix-Seiten entsprechend aufgefallen sind. Und, ach ja, dass die Sicherheit hier nicht sonderlich ist :(

Frage mich bloß, wie das passiert ist und was man dagegen machen kann. Zum Glück haben die Angreifer keine Daten gelöscht, sonder nur die index-Seite ausgetauscht, aber die hätten ja auch viel unfreundlicher sein können.

Tipps?

Geht wählen!

Ist wichtig. Ich darf heute leider nicht mitwählen. Und Schwesterchen auch nicht. Ihr dürft. Also, geht wählen!

New people at Yahoo and Google

Vint Cerf starts working at Google, Dave Becket moves to Yahoo. Both like the Semantic Web (Vint said so in a German interview with c't, and I probably don't have to remind you about Daves accomplishments).

I'm sure, Yahoo got Dave because of his knowledge about the Semantic Web. And I wonder if Google got Vint out of the same reason? Somehow, I doubt it.

Another Semantic MediaWiki

I stumbled about another Semantic MediaWiki, an implementation created by Hideaki Takeda and Muljadi Hendry of the Japanese National Institute of Informatics in Tokyo. Their implementation looks very neat, although it is quite different in a few basic things (that we consider crucial in order to work), take a look at their full paper (it's in their wiki - oh, and it's in Japanese).

The basic difference between their approach and the one we suggest is that they add metadata management abilities to MediaWiki - which is cool. But they don't seem to aim at a full integration into the Wikipedia, i.e. embedding the metadata into the article text instead of appending it at some place. Actually, if we had software that is able to process natural languages, we wouldn't need our approach, but their would still be useful.

Nevertheless, they have a big huge advantage: a running system. Go there, take a look, it's cool! Actually, we have a system online too, but won't yet disclose the link due to a bug that's a kind of showstopper. But expect it to be online next week - including the source and all! It will be just a first version, but I sure hope to gather the people who want to work on it around the code.

Commited to the Big S

Not everyone likes our proposal for the Semantic Wikipedia. That's not a big surprise really. Boris Mann was talking about the advantages of tagging, and some ideas like blessed tags, that sounded very nice, when Jay Fienberg pointed him to the Semantic MediaWiki proposal. Boris answers: "I notice with a shudder however, that the Mediawiki stuff uses a large "S" Semantic, and includes RDF. I admit it, I'm afraid of RDF."

Yes, we do. And we're proud of it. Actually, it's the base for the better half of the possible applications we describe. Jay has some nice answers to it: "I think the MediaWiki folks are just recognizing the connection between their "tags" and the big "S" Semantic Web [you bet!, denny]. There are taxonomies and ontologies behind the popular tagging apps too--folks behind them just aren't recognizing / publicizing this (for a number of reasons, including that tags are often part of a practical application without big "S" Semantic Web goals). [...] I'm not a super huge fan of RDF myself, but I think it's useful to not be afraid of it, because some interesting things will come out of it at some point."

Our idea was to allow the user to use Semantic Web technologies even without really understanding them. No one needs to understand RDF fully, or OWL, to be able to use it. Sure, if she does, well, it surely will help her. Any by the way, RDF really is not complicated at all, it just has a syntax that sucks. So what?

Maybe it's a crude joke of history to start the Semantic Web with syntactic problems...

By the way, does anyone have a spare invitation to GMail for me? I'd really like to check out their service. Thanks, Peter, that was fast.

Semantic Wikipedia

Marrying Wikipedia and the Semantic Web in Six Easy Steps - that was the title of the WikiMania 2005 presentation we gave about a month ago. On the Meta-Wikipedia we - especially Markus Krötzsch - were quite active on the Semantic MediaWiki project, changing and expanding our plans. DocCheck is working right now on a basic implementation of the ideas - they have lots of Wiki-Experience already, with Flexicon, a MediaWiki-based medical lexicon. We surely hope the prototype will be up and running soon!

Wow, the project seems perceived pretty well.

Tim Finin, Professor in Maryland: "I think this is an exciting project with a lot of potential. Wikipedia, for example, is marvelously successful and has made us all smarter. I’d like my software agents to have a Wikipedia of their own, one they can use to get the knowledge they need and to which they can (eventually) contribute." - Wikipedia meets the Semantic Web, Ebiquity blog at UMBC

Mike Linksvayer, CTO of Creative Commons: "The Semantic MediaWiki proposal looks really promising. Anyone who knows how to edit Wikipedia articles should find the syntax simple and usable. All that fantastic data, unlocked. (I’ve been meaning to write on post on why explicit metadata is democratic.) Wikipedia database dump downloads will skyrocket." - Annotating Wikipedia, Mike Linksvayers Blog

Danny Ayers, one of the developers of Atom and Author of Atom and RSS Programming: "The plan looks very well thought out and quite a pile of related information has been gathered. I expect most folks that have looked at doing an RDF-backed Wiki would come to the same conclusion I did (cf. stiki) - it’s easy to do, but difficult to do well. But this effort looks like it should be the one." - Wikipedia Bits, Danny Ayers, Raw Blog

Lambert Heller of the University of Münster wrote a German blog entry on the netbib weblog, predicting world domination. Rita Nieland has a Dutch entry on her blog, calling us heroes - if we succeed. And on Blog posible Alejandro Gonzalo Bravo García has written a great Spanish entry, saying it all: the web is moving, and at great speed!

So, the idea seems catching like a cold in rainy weather, we really hope the implementation will soon be there. If you're interested in contributing - either ideas or implementation - join our effort! Write us!

Männer schlauer als Frauen

Ein Artikel von Paul Irwing und Richard Lynn im Journal of Psychology berichten, dass Männer schlauer als Frauen sind. Die beiden Autoren sind natürlich Männer, und Lynn hat sich schon einen Namen damit gemacht, Menschen unterschiedlicher Hautfarbe ebenfalls unterschiedliche Intelligenz zu attestieren. Diese Nachricht wird mit Sicherheit die nächsten Tage durch die Presse geistern.

Tony Halpin von der Times fragt schon, wie schlau das wohl war. Nicht überliefer ist bisher die Reaktion von Irwings und Lynns Angetrauten. Als WikiNews davon berichtete, war zunächst der Titel der Meldung falsch geschrieben: UK study claims men more intelligent then women. Da der Autor leider anonym war, ist dessen Geschlecht unbekannt.

Statt über die Ergebnisse zu lachen, blicken wir einfach etwas tiefer in die Studie, zumindest darüber, was die die oben erwähnten Artikel berichten (der Text der Studie selber war mir nicht zugänglich). 24.000 Studenten machten einen IQ-Test, wobei die Männer im Schnitt 5 Punkte mehr erreichten. Dabei wird ein IQ von 125 von doppelt so vielen Männern erreicht wie von Frauen, während bei einem IQ von 155, wo man üblicherweise von Genies spricht, das Verhältnis gar 5,5 : 1.

Diese Zusatzfakten führen zu zweu interessanten, mathematischen Konsequenzen, die mit Sicherheit in eben derselben Presse, durch die diese Nachricht verbreitet wird - Sommerloch, ihr wisst schon - vollkommen ignoriert werden:

Erstens: da Männer in den genialen Bereichen in der Studie offensichtlich stark überrepresäntiert sind, bewegt sich der Schnitt unverhältnismäßig weit nach oben für den nicht-genialen Mann. Anders gesagt: rechnet man die furchtbar wenigen übermäßig intelligenten Personen aus der Studie heraus, schrumpft der Vorsprung der Männer deutlich zusammen. Oder, ganz einfach gesagt: die Studie zeigt, dass Genies eher Männer sind. Genies muss man aber eh mit der Lupe suchen, und da die meisten Menschen keine Genies sind, ist dieses Ergebnis für sie kaum von Relevanz. Das wäre so, als ob ein Tausendseelendorf mit dem Erfolg angeben wollte, dass das Jahresdurschnittsgehalt im Ort um eine Million Euro gestiegen sei, nur weil halt Bill Gates zugezogen ist. Die anderen Dorfbewohner werden dadurch auch nicht reicher.

Zweitens: die Studie basiert auf Studenten. Dadurch wurde von vorneherein bereits Bevölkerungsgruppe mit ohnehin überdurschnittlichem IQ ausgewählt. Wenn jetzt der von mir sehr geschätzte Steven Pinker recht hat, der da sagt, dass es bei den männlichen Exemplaren einer Gattung schon allein aus evolutionären Gründen zu einer größeren Varianz in fast allen Bereichen kommt (siehe auch die sehr interessante Debatte mit Elizabeth Spelke), dann heißt dass auch, dass bedeutet die höhere Varianz bei männlichen IQ einfach, dass es zwar mehr männliche Genies gibt, aber halt auch unter den Dummen Männer die deutliche Mehrheit bilden. Und da wir, durch die Beschränkung der Studie auf Studenten, diese ausgeblendet haben, muss es - wenn Pinker recht hat - rein mathematisch zu einem höheren IQ-Schnitt für Männer kommen. Hätten wir die Studie mit Sonderschülern durchgeführt, ist ebenso notwendig ein höherer IQ-Durchschnitt für Frauen zu erwarten.

So, und jetzt nochmal einfach, so dass auch die Männer und Frauen es kapieren: aus der Studie zu schließen, dass man als Mann schlauer als eine Frau ist, ist so falsch wie dumm. Aus der Studie zu schließen, dass Männer im Allgemeinen oder im Schnitt schlauer als Frauen sind, ist ebenso falsch wie dumm. Die Studie ist ein Indiz dafür, dass wenn jemand total dämlich oder aber ein Genie ist, es eher ein Mann ist. Wer blind sichergehen möchte, nicht einen Idiot einzustellen, sollte sich für eine Frau entscheiden. Wer nur Genies haben möchte, wird es bei der Auswahl schwerer haben, aber letztlich ein Team aus überwiegend Männern haben.

Doch für den allgemeinen Fall haben die Aussagen keine Bedeutung.

Hmm, Statistik sollte dringend Pflichtfach an den Schulen werden. Siehe auch Der Hund, der Eier legt. Ein Muss für jeden, der Statistiken verstehen will.

Guter Preis

Ich kann mir nicht vorstellen, dass das der Microsoft-Preis ist. Oder werden da Hunde bevorzugt? Müssen Hunde auch der Lizenz vor Öffnen der Packung zustimmen?

Hier fehlt noch ein Bild.

Noch ein Test

So, ein weiterer Test, wie gut ich von meinem Handy aus bloggen kann. Schon spannend: als ich mit Nodix anfing, habe ich die Einträge per Hand in HTML geschrieben, dann kam ein selbst programmiertes Tool in Delphi, dann Python, welches ich in einer späteren Version auch von anderen PCs verwenden konnte. Schließlich übernahm Blogger den Blog, und dank der standardisierten Schnittstellen kann ich jetzt auch direkt von meinem Handy aus Einträge erstellen. Ob mit - wie gestern - oder ohne Photo.

Ich habe bloß noch keine Ahnung, wie das mit dem verlinken funktioniert. Vielleicht einfach so: <a href="http://notausgang.baumgarf.de">geht auf diesen tollen Blog!</a> Gleich mal ausprobieren.

Update:(Test doppelt misslungen. Der Post landete zunächst auf dem falschen Blog, Semantic Nodix, und das mit dem verlinken hat auch nicht funktioniert. Menno)

Failed test

Testing my mobile blogging thingie (and it failed, should have gone to the other blog). Sorry for the German noise.

Preisuhr

Dies ist ein weiterer Test, ob meine Moblog-Einstellungen funktionieren. Das Bild ist mit meinem Handy aufgenommen und direkt von da an mein Blog geschickt. Cool, ne'?

Hier fehlt noch ein Bild.

Sacre Coeur

Ich teste nur flickr aus. Angeblich kann flickr direkt auf meinen Blog posten. Und das Bild ist wirklich schön. Aber nicht von mir.

Update: Bild ist nunmehr gesperrt. Es handelte sich um dieses Bild der Kirche Sacre Ceour in Paris, doch es scheint nun auch von flickr entfernt worden zu sein

Gotta love it

Don't do research if you don't really love it. Financially, it's desastrous. It's the "worst pay for the investment", according to CNN.

Good thing I love it. And good thing Google loves the Semantic Web as well. Or why else do they make my desktop more and more semantic? I just installed the Desktop2 Beta - and it is pretty cool. And it's wide open to Semantic Stuff.

FOAFing around

I tried to create FOAF-files out of the ontology we created during the Summer School for the Semantic Web. It wasn't that hard, really: with our ontology I have enough data to create some FOAF-skeletons, so I looked into the FOAF-specification and started working on it.

<foaf:person about="#gosia">
  <foaf:knows resource="#anne" />
  <foaf:name datatype="http://www.w3.org/2001/XMLSchema#string">Gosia Mochol</foaf:name>
</foaf:person>
<rdf:description about="#anne">
  <rdfs:isdefinedby resource="http://semantic.nodix.net/sssw05/anne.rdf" />
...

Well, every one of us gets his own FOAF-file, where one can find more data about the person. Some foaf:knows-relations have been created automatically for the people who worked together in a miniproject. I didn't want to assume too much else.

The code up there is valid FOAF as much as I can tell. But all the (surprisingly sparse) tools could not cope with it, due to different reasons. One complained about the datatype-declaration in the foaf:name and then ignored the name at all. Most tools didn't know that rdfs:isDefinedBy is a subproperty of rdfs:seeAlso, and thus were not able to link the FOAF-files. And most tools were obviously surprised that I gave the persons URIs instead of using the IFP over the sha1-sum of their e-Mails. The advantage of having URIs is that we can use those URIs to tag pictures or to keep track of each other publications, after the basic stuff has been settled.

Pitily, the basic stuff is not settled. To me it seems, that the whole FOAF stuff, although being called the most widespread use of the Semantic Web, is still in its infancy. The tools hardly collaborate, they don't care too hard about the specs, and there seems no easy way to browse around (Mortens explorer was down at the time when I created the FOAFs, which was frustrating, but now it works: take a look at the created FOAF files, entering with my generated FOAF file or the one for Enrico Motta). Maybe I just screwed it all up when generating the FOAF-files in the first run, but I don't think so really...

Guess someone needs to create some basic working toolset for FOAF. Does anyone need requirements?

SSSW Last Day

The Summer School on Ontological Engineering and the Semantic Web finished on Saturday, July 16th, and I can't remember having a more intense and packed week in years. I really enjoyed it - the tutorials, the invited talks, the workshops, the social events, the mini project - all of it was awesome. It's a pity that it's all over now.

Today, besides the farewells and thank yous and the party in Madrid with maybe half of the people, also saw the presentation of the mini projects. The mini projects where somewhat similar to the The Semantic Web In One Day we had last year - but without a real implementation. Groups of four or five people had to create a Semantic Web solution in only six hours (well, at least conceptually).

The results were interesting. All of them were well done and highlighted some promising use cases for the Semantic Web, where data integration will play an important role: going out in the evening, travelling, dating. I'd rather not consider too deeply if computer scientists are rather attacking an own itch here ;) I really enjoyed the Peer2Peer theater, where messages wandered through the whole class room in order to visualize the system. This was fun.

Our own mini project modelled the Summer School and the projects itself, capturing knowledge about the buildup of the groups and classifying them. We had to use not only quite complex OWL constructs, but also SWRL-rules - and we still had problems expressing a quite simple set of rules. Right now we are trying to write these experiences down in a paper, I will inform you here as soon as it is ready. Our legendary eternal struggle at the boundaries of sanity and Semantic Web technologies seemed to be impressive enough to have earned us a cool price. A clock.

Thanks to all organizers, tutors and invited speakers of the Summer School, thanks to all the students as well, for making it such a great week. Loved it, really. I hope to stay in touch with all of you and see you at some conference pretty soon!

Lucky Luke gegen die Daltons

Wie meistens montags, auch gestern in der Sneak gewesen. In letzter Zeit gab es - wie man im Notausgang-Blog oder auf nakit-arts mitbekommen konnte - einige echt geniale Filme: 11:14, Mr and Mrs Smith, L.A. Crash, The Fantastic Four, Nicotina, Antikörper, Madagascar, Hitchhikers Guide through the Galaxy oder Garden State. Gestern war der Film nicht ganz so gut.

Genau genommen war er grottig. Ich schreibe ja schon länger keine Filmrezis mehr, weil die Filmrezis auf Baumgarf deutlich besser sind. Ja, klar, ich stimme ihm nicht immer zu, aber meistens. So auch bei den Daltons. Einen so schlechten Film habe ich sehr selten - vielleicht noch nie - gesehen. Auf imdb gibt es eine Rezi mit dem Titel "Not ENTIRELY bad", wobei ich ein Wort aus der Überschrift streichen würde (und es ist nicht das großgeschriebene). Ansonsten sind die Meinungen auf imdb über den Film sehr deutlich: 40% haben dem Film die schlechtesmögliche Note gegeben. Reicht nicht, um den Film in Top 100 der schlechtesten Filme zu katapultieren, überraschenderweise. Ich kenne davon - soweit ich mich erinnere - aber nur einen Film, Alone in the Dark (ich weiß nicht mehr, welche Police Academy-Teile ich gesehen habe). Ob der wirklich schlechter ist? Müsste ich nochmals sehen, um das zu beurteilen. Sprich, ich werde es nie, nie, nie erfahren...

Warum sind die Daltons so grottig schlecht? Erstens, wie bei Clever & Smart schon, hat irgendjemand den Darstellern gesagt: hey, das ist eine Comicverfilmung. Also müsst ihr alle wie doof rumhüpfen und euch seltsam bewegen. Zweitens, wozu Gags? Die Vorlage hat doch schon genug davon. Drittens, die beiden Szenen, die ausnahmsweise gute sind, also die, wo Lucky Luke Joe Dalton das Serum verabreicht und wo Joe Dalton die Gulch Bank für den finalen Überfall betritt, zeigen, dass die Leute es offenbar besser draufhaben. Warum also machen sie es nicht?

Das, was ich wirklich schade finde, sind, dass 27 Millionen Dollar reingeflossen sind und hunderte von Menschen mitgewirkt haben. Muss dann das Ergebnis nicht wie Verrat vorkommen? Oder glauben diese Leute wirklich, dass sie einen guten Film gemacht haben? Darf oder muss man mit ihnen Mitleid haben? Die glauben doch an ihre Arbeit. Wie also kann so ein Murks überhaupt entstehen?

Dafür immerhin einen Narnia-Trailer (hier ist die deutsche Version verlinkt) gesehen. Auf französisch. Sah sehr vielversprechend aus. Endlich darf auch Disney ein Fantasy-Meisterwerk verfilmen, nachdem Tolkien das für den Herrn der Ringe ja testamentarisch ausgeschlossen hatte.

SSSW Day 5

Today (which is July 15th) just one talk. The rest of the day - beside the big dinner (oh well, yes, there was a phantastic dinner speech performed by Aldo Gangemi and prepared by Enrico and Asun if I understood it correctly, which was hilariously funny) and the disco - was available for work on the mini projects. But more about the mini projects in the next blog.

The talk was given by University of Manchester's Carol Goble (I like that website. It starts with the sentence "This web page is undergoing a major overhaul, and about time. This picture is 10 years old. the most recent ones are far too depressing to put on a web site." How many professors did you have that would have done this?). She gave a fun and nevertheless insightful talk about the Semantic Web and the Grid, describing the relationship between the two as a very long engagement. The grid is the old, grudgy, hard working groom, the Semantic Web the bride, being aesthetically pleasing and beautiful.

What is getting gridders excited? Flexible and extensible schemata, data fusion and reasoning. Sounds familiar? Yes, these are exactly the main features of Semantic Web technologies! The grid is not about accessing big computers (as most people think in the US, but they are a bit behind on this as well), it is about knowledge communities. But one thing is definitively lacking: scalability, people, scalability. They went to test a few Semantic Web technologies with a little data - 18 million triples. Every tool broke. The scalability lacks, even thought the ideas are great.

John Domingue pointed out, that scalability is not that much of a problem as it seems, because the TBoxes, where the actual reasoning will happen, will always remain relatively small. And the scalability issue with the ABoxes can be solved with classic database technology.

The grid offers real applications, real users, real problems. The Semantic Web offers a lot of solutions and discussions about the best solution - but lack surprisingly often an actual problem. So it is obvious that the two fit together very nicely. At the end, Carole described them as engaged, but not married yet.

At the end she quotes Trotsky: "Revolution is only possible when it becomes inevitable" (well, at least she claims it's Trosky, Google claims its Carole Goble, maybe someone has a source? - Wikiquote doesn't have it yet). The quote is in line with almost all speakers: the Semantic Web is not Revolution, it is Evolution, an extension of the current web.

Thanks for the talk, Carole!

Wikimania is coming

Wikimania starts on Friday. Looking forward to it, I'll be there with a collegue and we will present a paper on Wikipedia and the Semantic Web - The Missing Links on Friday. Should you be in Frankfurt, don't miss it!

Here's the abstract: "Wikipedia is the biggest collaboratively created source of encyclopaedic knowledge. Growing beyond the borders of any traditional encyclopaedia, it is facing new problems of knowledge management: The current excessive usage of article lists and categories witnesses the fact that 19th century content organization technologies like inter-article references and indices are no longer sufficient for today's needs.

Rather, it is necessary to allow knowledge processing in a computer assisted way, for example to intelligently query the knowledge base. To this end, we propose the introduction of typed links as an extremely simple and unintrusive way for rendering large parts of Wikipedia machine readable. We provide a detailed plan on how to achieve this goal in a way that hardly impacts usability and performance, propose an implementation plan, and discuss possible difficulties on Wikipedia's way to the semantic future of the World Wide Web. The possible gains of this endeavour are huge; we sketch them by considering some immediate applications that semantic technologies can provide to enhance browsing, searching, and editing Wikipedia."

Basically we suggest to introduce typed links to the Wikipedia, and an RDF-export of the articles annotated with these typed links being regarded as relations. And suddenly, you get the a huge ontology, created by thousands and thousands of editors, queryable and usable, a really big starting block and incubator for Semantic Web technologies - and all this, still scalable!

If the Wikipedia community agrees that this is a nice idea, which I hope with all my heart. We'll see this weekend.

SSSW Day 4

This day no theoretical talks, but instead two invited speakers - and much social programme, with a lunch at a swimming pool and a dinner in Segovia. Segovia is a beautiful town, with a huge, real, still standing roman aqueduct. Stunning. And there I ate the best pork ever! The aqueduct survived the huge earthquake of Lisbon of 1755, although houses around it crumbled and broke. This is, because it is built without any mortar - just stone over stone. So the stones could swing and move slightly, and the construction survived.
Made me think of loosely coupled systems. I probably had too much computer science the last few days.

The talks were very different today: first was Mike Woolridge of the University of Liverpool. He talked about Multiagent Systems in the past, the present and the future. He identified five trends in computing: Ubiquity, Interconnection, Intelligence, Delegation and Human-orientation.
His view on intelligence was very interesting: it is about the complexity of tasks that we are able to automate and delegate to computers. He quoted John Alan Robertson - the guy who invented resolution calculus, a professor of philosophy - as exclaiming "This is Artificial Intelligence!", when he saw a presentation of the FORTRAN compiler at a conference. I guess the point was, don't mind about becoming as intelligent as humans, just mind at getting closer.
"The fact that humans were in control of cars - our grandchildren will be quite uncomfortable with this idea."

The second talk was returning to the Semantic Web in a very pragmatic way: how to make money with it? Richard Benjamins of iSOCO just flew in from Amsterdam where he was at the SEKT meeting, and he brought promising news about the developing market for Semantic Web technologies. Mike Woolridge was criticizing Richard's optimistic projections and noted that he also, about ten years ago, spent a lot of energy and money into the growing Multiagent market - and lost most of it. It was an interesting discussion - Richard being the believer, Mike the sceptic, and a lot of young people betting a few years worth of life on the ideas presented by the first one...

Theodor W. Adorno

Hier mein Beitrag zur neuen Pro7-Show die 100 beliebtesten Aphorismen von Theodor W. Adorno (oder so ähnlich - moderiert das eigentlich Oliver Pocher oder Sonya Kraus?)

"Bei vielen Menschen ist es bereits eine Unverschämtheit, wenn sie Ich sagen"

SSSW Day 3

Yeah, sure, the Summer School for the Semantic Web is over for quite a while now, and here I started to blog about it daily, and didn't manage to get over the first three days. Let's face it: it was too much! The program was so dense, the social events so enjoyable, I couldn't even spare half an hour a day to continue the blogging. Now I want to recap some of my notes and memories I have of the second half of the Summer School. My bad memory be damned - if you want to correct something feel free to do so.

This day's invited speaker was Roberto Basili of the University of Rome. He sketched the huge field of natural language processing, and although he illustrated the possible interactions between lexical knowledge bases and ontologies, he nevertheless made a strong distinction between these two. Words are not concepts. "The name should have no value for defining a concept." This is like "Don't look into URIs" for HLT-people. He made a very interesting point: abductions will become very important in the Semantic Web, as they model human thinking patterns much closer than strict deduction does. Up until this day I was quite against abductions, I discussed this issue very stubbornly in Granada. But Roberto made me aware of a slightly different viewpoint: just sell abductive resolutions as suggestions, as proposals to the user - et voilà, the world is a better place! I will have to think abou this a bit more some day, but he did made me think.

The theoretical sessions and workshops today were packed and strenuos: we jumped from annotations to Semantic Web Services and back again. Fabio Ciravegna of the University of Sheffield's NLP-Group, who created tools like Armadillo and GATE, gave us a thorough introduction to annotations for the Semantic Web and the usage of Human Language Technologies in order to enhance this task. He admitted that many of the tools are still quite unhandy, but he tried to make a point by saying: "No one writes HTML today anymore with a text editor like Emacs or Notepad... or do you?"
All students raised their hands. Yes, we do! "Well, in the real world at least they don't..."

He also made some critical comments on the developments of the Semantic Web: the technologies being developed right now allow for a today unknown ability of collecting and combining data. Does this mean, our technologies actually require a better world? One with no secrets, privacy and spam, because there is no need for such ideas? Is metadata just adding hay to the haystak instead of really finding the needle?

John Domingue's Talk on Semantic Web (Web) Services was a deep and profound introduction to the field, and especially to the IRS system developed by the KMi at Open University. He was defending WSMO valiantly, but due to time constraints pitily skipped the comparison with OWL-S. But he motivated the need for Semantic Web Services and sketched a possible solution.

The day ended in Cercedilla, where we besieged a local disco. I guess the people were hiding, "watch it, them nerds are coming!" ;) The music surprisingly old - they had those funny vinyl albums - but heck, Frank Sinatra is never outdated. But the 80s certainly are...

SSSW Day 2

Natasha Noy gave the first talk today, providing a general overview on Mapping and Alignment algorithms and tools. Even though I was not too interested in the topic, she really caught my interest with a good and clean and structured talk. Thank for that! After, Steffen Staab continued, elaborating on the QOM approach to ontology mapping, having some really funny slides, but, as this work was mostly developed in Karlsruhe I already knew it. I liked his appeal for more tools that are just downloadable and usable, without having to fight for hours or days just to create the right environment for them. I totally agree on that!

The last talk of the day was from Aldo Gangemi on Ontology Evaluation. As I consider making this the theme of my PhD-thesis - well, I am almost decided on that - I was really looking forward to his talk. Although it was partially hard to follow, because he covered quite a broad approach to this topic, there have been numerous interesting ideas and a nice bibliography. Much to work on. I especially didn't yet see the structural measures he presented applied to the Semantic Web. Not knowing any literature on them, I am still afraid, that they actually fail [SSSW Day 1|Frank's requirements from yesterday]]: not just to be taken from graph theory, but rather to have the full implications of the Semantic Web paradigm been applied to them and thought through. Well, if no one did that yet, there's some obvious work left for me ;)

The hands-on-sessions today were quite stressy, but nevertheless interesting. First, we had to powerconstruct ontologies about different domains of traveling: little groups of four persons working on a flight agency ontology, a car rental service ontology and a hotel ontology. Afterwards, we had to integrate them. Each exercise had to be done in half a hour. We pretty much failed miserably in both, but we surely encountered many problems - which was the actual goal: in OWL DL you can't even concatenate strings. How much data intefration can you do then?

The second hands-on-session was on evaluationg three ontologies. It was quite interesting, although I really think that many of these things can happen automatically (I will work on this in the next two weeks, I hope). But the discussion afterwards was quite revealing, as it showed how differently people think about some quite fundamental issues, the importance they give to structural measures compared to the functional ones. Or, differently said: the question is, is a crappy ontology on a given domain better than a good ontology that doesn't cover your domain of interest? (The question sounds strange to you? To me as well, but well...)

Pitily I had to miss today's social special event, a football match between the students of the Summer School. Instead I had a very interesting chat with a colleague from the UPM, who came here for a talk, and who also wants to make her PhD in Ontology Evaluation, Mari Carmen Suárez de Figueroa. Interesting times are lying ahead.

SSSW Day 1

Today's invited speaker was Frank von Harmelen, co-editor of the OWL standard and author of the Semantic Web Primer. His talk was on fundamental research challenges generated by the Semantic Web (or: two dozen Ph.D. topics in a single talk). He had the idea after he was asked one day in the cafeteria "Hey Frank, whazzup in the Semantic Web?"

In the tradition of Immanuel Kant's four famous questions on philosophy, Frank posed the four big research challenges:

  • Where does the metadata come from?
  • Where do the ontologies come form?
  • What to do with the many different ontologies?
  • Where's the Web in the Semantic Web?

He derived many research questions that arise when you bring results from other fields (like databases, natural language, machine learning, information retrieval or knowledge engineering) to the Semantic Web and not just change the buzzwords, but take the implications that come along with the Semantic Web seriously.

Some more notes:

  • What is the semantic equivalent to a 404? How should a reasoner handle the lack of referential integrity?
  • Inference can be cheaper than lookup on the web.
  • Today OWL lite would probably have become more like OWL DLP, but they didn't know better than

The other talks were given by Asun Gómez-Pérez on Ontological Engineering, and Sean Bechhofer on Knowledge Representation Languages for the SemWeb, pretty good stuff by the people who wrote the book. I just wonder if it was too fast for the people who didn't know about it already, and too repeting for the others, but well, that's always the problem with these kind of things.

The hands-on session later was interesting: we had to understand several OWL ontologies and explain certain inferences, and Natasha Noy helped us with the new Protégé 3.1. It was harder than I thought quite some times. And finally Aldo Gangemi was giving us some exercises with knowledge representation design patterns, based on DOLCE. This was hard stuff...

Wow, this was a lot of namedropping. The social programme (we were hiking today) around the summer school, and the talks with the peers are sometimes even more interesting than the actual summer school programme itself, but this probably won't be too interesting for most of you, and it's getting late as well, so I just call it a day.

Summer School for the Semantic Web, Day 0

Arrived in Cercedilla today, at the Semantic Web Summer School. I really was looking forward to these days, and now, flipping through the detailed programme I am even more excited. This will be a very intense week, I guess, where we learn a lot and have loads of fun.

I was surprised by the sheer number of students being here: 56 or 57 students have come to the summer school, from all over the world - met someone from Australia, from Pittsburgh, and many Europeans. Happily, I also met quite a number of people I already knew, and thus I know it will be a pleasurable week. But let's just do the math for a second: we have more than 50 accepted students at this summer school. There are at least three other summer schools with related fields, like the one in Ljubljana the week before, there's one in Edinburgh, and the ESSLLI. So, that's about 200 students. Even if we claim that every single PhD student is going to a summer school - which I don't think - that would mean we get 200 theses every year! (Probably this number will be only reached in three years or so)

So, just looking at the sheer amount of people working on it - what's the expected impact?

Interesting times lie ahead.

Madrid Stadt

Mein letzter Blogeintrag aus Madrid war ja eher ein Schnellschuss aus dem Flughafen, diesmal habe ich ein wenig mehr Zeit. Das Hotel in dem ich abgestiegen bin, ansonsten brauchbar, wenn auch nichts besonderes - ich wuerde es nicht weiterempfehlen - hat laut Website Internetanschluss in allen Zimmern und WLAN. Was sie nicht haben, ist beides zusammen, WLAN auf dem Zimmern also, sondern man muss mit seinem Modemkabel in die Wand stoepseln. Und was sie auch auf der Website verschweigen ist der eher unhoefliche Preis von 9 Euro die Stunde...

Aber zurueck zu Madrid. Die Stadt selbst hat ein paar Autos zu viel, dafuer aber sprechen viel zu wenig Menschen Englisch. Na, was gehe ich auch ohne spanisch zu koennen nach Spanien koennte man einwerfen. Aber das Museum de Prada ist wirklich sehenswert: Velazquez Familia sieht so beeindrucken aus, wie ich sie mir vorgestellt habe, die Werke von Goya sind auf zwei Stockwerken ausgestellt: die lustigen, bunten ganz oben, die duesteren, wie der weit bekannte Saturn, darunter, in duesteren Raeumen. Sehr beeindruckend. Und die gewaltige Sammlung von Rubensbildern, allein die Menge an mythischen Figuren - wow. Da moechte man am liebsten nochmal ein Mythologielexikon rausgraben und sich darin eingraben.

Auch der Retito, der grosse Park in Madrid: sehr schoen, beeindruckend, gross. Nicht so gross wie der Stuttgarter Stadtpark, aber welche Stadt hat schon einen solch grossen Park? Die moderne Kathedrale der Stadt ist ebenfalls sehr auffaellig, einfach weil sie ganz anders ist als andere Kathedralen. Blau aufgemalte Himmelsflaechen? Sehr cool. Aber auch nur wegen der Einzigartigkeit, zugegeben.

Wenn man mich fragt, was ich mit Madrid verbinde, habe ich auch eine ganz deutliche Antwort.

Durst.

Joghurt und Coke

Was mir in Spanien sehr positiv auffiel: sie hatten viele Getränkeautomaten rumstehen. Nun, zugegeben, nicht überraschend. Durstig wie ich war suchte ich also gerade einen Getränkeautomaten, und als ich einen solchen erblickte, war ich erstaunt gleich daneben einen Joghurtautomaten zu finden. Was für eine coole Idee! Später fiel mir auf, dass das durchaus häufig der Fall war: überall verkaufte man Joghurt. Warum es das bei uns nicht gibt? Schönen, frischen kühlen Joghurt, überall, labend, lecker, ach, einfach herrlich, dachte ich, und kaufte eine Cola.

Klickfrei

Sehr coole Designidee: eine Webseite, die man vollständig navigieren kann, ohne zu klicken, dontclick.it. Sehr schnell, sehr flüssig und sehr hübsch. Ich stelle mir jetzt noch vor, dass wir keine Maus haben, sondern versuchen, mit unseren Augen den Cursor zu steuern - und schon haben wir ein extrem futuristisches User Interface...

Hilft auch bei RSI.

Wowarich?

Sehr nette idee, von Fred darauf gebracht:

Wo war ich schon überall auf der Welt, bzw. in Europa? (bei einer Weltkarte würde noch die USA dazukommen, der Rest wäre gähnendes Grau, darum habe ich lieber die Europakarte gewählt).

Hier fehlt noch ein Bild.

Könnt ihr auch ganz einfach für euch selber zusammenbasteln, auf World66.

Abraham Bernstein on users

"The regular user is not able to cope with strict inheritance."

Abraham Bernstein of the University of Zürich was today at the AIFB and gave a talk on SimPack - A Generic Java Library for Similarity Measures in Ontologies. Not being an expert in mapping, alignment and similarity I still saw some of the interesting ideas in it, and I liked the big number of different approaches towards measuring similiarity.

Which struck me much more was the above statement, which is based on his experience with, you know, normal users, who are "not brainwashed with object-oriented paradigms". Another example he gave was his 5 years old kid being perfectly able to cope with default reasoning - the "pinguins are birds, but pinguins can't fly"- thing, and thus do not follow strict inheritance.

This was quite enlightening, and leads to many questions: if the user can't even deal with subsumption, how do we expect him to be able to deal with disjunctions, complements or inverse functional properties?

Abraham's statement is based on experience with the Process Handbook, and not just drawn from thin air. There are a lot of use cases for the Semantic Web that do *not* require the participation of the normal end user, thus there still lie plenty of possibilities for great research. But I still believe that the normal end user has to unlock the Semantic Web in order to really make the whole idea lift off and fly. But in order to achieve that we need to tear down the wall that Abraham describes here.

Any ideas how to do this?

Garden State

Der Garden State ist New Jersey, wie Ralf mich freundlicherweise aufklärte. Und aus dem dem fernen, uns allen bekannten Los Angelese kommt der junge Held unserer Geschichte nach New Jersey, in seine Heimat zurück, und trifft auf manchen Geist seiner Vergangenheit, um erwachsen zu werden.

Schöner Film, spritzige Dialoge, skurrile Situationen, charmante Charaktere, treffend gezeichnet, und zudem glaubwürdig. Und dabei eher sparsam auf Sarkasmus und Ironie setzend, sondern vielmehr dieselbe abstrafen durch ehrliche Verrücktheit. Ich kann ihn nur empfehlen!

Eine deutlich längere und bessere Rezi von Garden State gibt es - wie üblich - bei Ralf.

Live from ICAIL

"Your work remindes me a lot of abduction, but I can't find you mention it in the paper..."

"Well, it's actually in the title."

Bologna la dotta

Bologna ist großartig! Das erste Gebäude. das ich hier kennenlernte, war die Sala Borsa. Angekommen, im Hotel eingecheckt, spazieren, und sofort reinmarschiert. Man könnte es einen Mall nennen, aber mit einem sehr ausgewählten Angebot: zwei, drei Bibliotheken, Mediothek, Buchhandlungen, Studentenbedarf (Taschen, Blöcke), Internetcafé, WLAN-Access, Restaurants, unglaublich! Das ganze in einem wunderschänen, dezent renovierten alten Gebäude, das auf einer Ruine gebaut ist, wahrscheinlich aus römischer Zeit. Die Ruine kann man sehen, weil der Boden des Erdgeschosses aus Glas besteht, das ganze Gebäude ist eine große Halle, mehr oder weniger, mit mehreren Galerien, in der die ganzen tollen Geschäfte untergebracht sind.

Phantastischer Ort. Allein deshalb ist Bologna eine Reise wert. Ja, die Universität (gegründet 1088, damit die älteste Uni der Welt) ist auch ganz ansehnlich. Vorlesungen in diesen Hallen zu hören oder Prüfungen abzulegen in Renessaincegeschmückten Räumen wird schon irgendwie was besonderes sein. Daher hat Bologna auch einen seiner Beinamen, Bologna la dotta, Bologna, die Gelehrte.

Auch die Konferenz ist überraschend spannend, aber dazu bald mehr auf Semantic Nodix.

Angelina Jolie wird 30

Bei dieser Gelegenheit erfuhr ich im Radio, dass die Dame auch bei Meat Loaf in einem Video auftrat. Ich versuchte mich krampfhaft durch die Meat Loaf Videos zu erinnern, und von der Zeit her wäre das einzig passende I'd do anything for love (but I won't do that) dachte ich. Und während ich mich so erinnerte, vermeinte ich das Bild der Hauptdarstellerin aus dem Video herauszubeschwören, und war mir in der Erinnerung schon ganz und gar sicher, ja, das war Angelina Jolie!

Heute abend recherchierte ich dann doch ein wenig. Laut imdb spielte Angelina Jolie in Bat out of Hell II: The Picture Show mit. Dieses wiederum ist laut Rotten Tomatoes eine Sammlung von Videos, darunter eben auch besagter Song! Doch Angelina soll einen Runaway gespielt haben, was nicht passt - in dem Video kam kein Runarway vor. Viel mehr wurde die Rolle von Mrs Loud gespielt, die Laut Wikipedia in Wirklichkeit Lorraine Crosby heißt. War es dann Rock'n'Roll dreams come true? Da kam ein Runaway vor.

Ergebnis: Lorraine Crosby sieht so aus. Angelina Jolie so. In meiner Erinnerung sehe ich aber, wenn ich an das Video zu I'd do anything etc. denke nur mehr nur noch Angelina, und nicht Lorraine. Diese verdammte Erinnerung ist doch ein zu wackeliges Ding, ich sollte mich gar nicht mehr darauf verlassen...

Was ich aber nicht herausgefunden habe: in welchem Meat Loaf Video spielte die Jolie denn jetzt mit? Weiß das jemand?

Touristenfalle

Diese Woche auf Kreta, in einem gewaltigen Touristenkomplex. Alles ist strahlend weiß oder himmelblau, das Wetter ist herrlich, der Strand gleich um die Ecke, überall gibt es Möglichkeiten zu Essen, Trinken, Einkaufen, das ganze Krams, man muss den Komplex gar nicht verlassen, ja selbst die Konferenzräume der ESWC2005, die mich nach Kreta führt, finden sich innerhalb des Hotels, und auch das Conference Dinner fand hier statt.

Letzte Woche, Granada, die World Conference on Philosophy of Law and Social Philosophy. Die Konferenzräume waren deutlich beeindruckender, das Essen aber auch schlechter. Die Unterkunft? Mit Kollegen zusammen quartierten wir uns in den umliegenden Bergen in einem Haus ein. Die Räume der Konferenz waren direkt in der Stadt. Social Event - Dinner? Nein, ein ein Besuch der Alhambra. In Kreta gab es kein kulturelles Programm - leider.

Und die Moral von der Geschichte? Noch gar keine, weil das wichtigste nach wie vor die Präsentationen, die Leute und die Inhalte. Und hier wie da waren ausgesprochen schlaue Leute - aber natürlich lag die ESWC mir mehr (auch wenn ich auch auf der World Conference manch Neues lernte). Aber Habermas zu hören war dennoch ein Erlebnis!

ESWC2005 is over

The ESWC2005 is over and there have been a lot of interesting stuff. Check the proceedings! There were some really nice idea, like RDFSculpt, good work like temporal RDF (Best Paper Award), the OWL-Eu extensions, naturally the Karlsruhe stuff like ontology evolution, many many persons to meet, get to know, many chats, and quite some ideas. Blogging from here is quite a mess, the uplouad rate is catastrophal, so I will keep this short, but I certainly hope to pick up on some of the talks and highlight the more interesting ideas (well, interesting to me, at least). Stay tuned! ;)

OWL 2.0

31 May 2005

I posted this to the public OWL dev mailing list as a response to a question posed by Jim Hendler quite some while ago. I publish it here for easier reference.

Quite some while ago the question of OWL 2.0 was rised here, and I wrote already two long replies with a wishlist - but both were never sent and got lost in digital nirvana, one due to a hardware, the second due to a software failure. Well, let's hope this one passes finally through. That's why this answer is so late.

Sorry for the lengthy post. But I tried to structure it a bit and make it readable, so I hope you find some interesting stuff here. So, here is my wishlist.

  1. I would like yet another OWL language, call it OWL RDF or OWL Superlite, or whatever. This is like the subset of OWL Lite and RDFS. For this the difference between of owl:Class and rdf:Class needs to be somehow standardly solved. Why is this good? It makes moving from RDF to OWL easier, as it forces you to keep Individuals, Classes and Relations in different worlds, and forgets about some of the more sophisticated constructs of RDF(S) like lists, bags and such. This is a real beginners language, really easy to learn and implement.
  2. Defined Semantics for OWL FUll. It is unclear -- at least to me -- what some combinations of RDF(S)-Constructs and OWL DL-constructs are meant to mean.
  3. Add easy reification to OWL. I know, I know, making statements about statements is meant to be the root of all evil, but I find it pretty useful. If you like, just add another group of elements to OWL, statements, that are mutually disjoint from classes, instances and relations in OWL DL, but there's a sublanguage that enables us to speak about statements. Or else OWL will suck a lot in comparison to RDF(S) and RDF(S) + Rules will win, because you can't do a lot of the stuff you need to do, like saying what the source of a certain statement is, how reliable this source is, etc. Trust anyone? This is also needed to extend ontologies toward probabilistic, fuzzy or confidence-carrying models.
  4. I would love to be able to define syntactic sugar, like partitionOf (I think, this is from Asun's Book on Ontology Engineering). ((A, B, C) partitionOf D) means that every D is either an A or a B or a C, that every A, B or C is a D, and that A, B and C are mutually disjunct. So you can say this already, but it needs a lot of footwork. It would be nice to be able to define such shotcuts that lever upon the semantics of existing constructors.
  5. That said, another form of syntactic sugar - because again you can use existing OWL constructs to reach the same goal, but it is very strenuous to do so - would be to define UNA locally. Like either to say "all individuals in this ontology are mutually different" or "all individuals with this namespace are mutually different". I think, due to XML constraints the first one would be the weapon of choice.
  6. I would like to be able to have more ontologies in the same file. So you can use ontologies to group a number of axioms, and you also could use the name of this group to refer to it. Oh well, using the name of an ontology as an individual, what does this mean? Does it imply any further semantics? I would like to see this clarified. Is this like named graphs?
  7. The DOM has quite nicely partitioned itself in levels and modules. Why not OWL itself? So you could have like a level 2 ontology of mereological questions, and such stuff, all with well defined semantics, for the generic questions. I am not sure there are too many generic questions, but taxonomy is (already covered), mereology would be, and spatiotemporal and dynamic issues would be as well. Mind you, not everyone must use them, but many will need them. It would be fine to find stan dard answers to such generic questions.
  8. Procedural attachments would be a nice thing. Like have a a standardized possibilities to add pieces of code and have them executed by an appropriate execution environment on certain events or requests by the reasoner. Yes, I am totally aware of the implications on reasoning and decidability, but hey, you asked what people need, and did not ask for theoretical issues. Those you understand better.
  9. There are some ideas of others (which doesn't mean that the rest is necessarily original mine) I would like to see integrated, like a well-defined epistemic operator or streamlining the concrete domains to be more consistent with abstract domains, or to define domain and range _constraints_ on relations, and much more. Much of this stuff could be added optional in the sense of point 7.
  10. And not to forget that we have to integrate with rules later, and to finally have an OWL DL query language. One goal is to make it clear what OWL offers over simply adding rules atop of RDF and ignoring the ontology layer completely.

So, you see, this is quite a list, and it sure is not complete. Even if only two or three points were finally picked up I would be very happy :)

D'Artagnans Tochter

31 May 2005

Am Samstag war ich - zum ersten mal überhaupt - in einer Theaterpremiere: D'Artagnans Tochter, im Alten Schauspielhaus in Stuttgart, geschrieben von Tom Finn und Volker Ullmann.

Kurz gesagt: hat sehr viel Spaß gemacht! Es war sehr flott, Mantel und Degen auf der Bühne, ein netter Plot, lustig, und vor allem das erste: Flott. In den Szenenwechselns fliegen die Schauspieler über die Bühne, kein Moment Langeweile, kaum scheint man in ruhigen Gewässern angekommen zu sein, stürmen die Schergen des Kardinals heran.

Fiese Bösewichte, die Musketiere sind so dargestellt, wie wir sie kennen, der König wird wunderbar gespielt - viel Spaß, viel Spannung - anschauen! Geh mal wieder ins Theater.

Semantic Scripting

28 May 2005

Oh my, I really need to designate some time to this blog. But let's not ranting about time - no one of us has time - let's directly dive into my paper for the Workshop on Scripting for the Semantic Web on the 2nd ESWC in Heraklion next week. Here is the abstract.

Python reached out to a wide and diverse audience in the last few years. During its evolution it combined a number of different paradigms under its hood: imperative, object-oriented, functional, listoriented, even aspect-oriented programming paradigms are allowed, but still remain true to the Python way of programming, thus retaining simplicity, readability and fun. OWL is a knowledge representation language for the definition of ontologies, standardised by the W3C. It reaps upon the power of Description Logics and allows both the definition of concepts and their interrelations as well as the description of instances. Being created as part of the notoriously known Semantic Web language stack, its dynamics and openness lends naturally to the ever evolving Python language. We will sketch the idea of an integration of OWL and Python, but not by simply suggesting an OWL library, but rather by introducing and motivating the benefits a really deep integration offers, how it can change programming, and make it even more fun.

You can read the full paper on Deep Integration of Scripting Languages and Semantic Web Technologies. Have fun! If you can manage it, pass by the workshop and give me your comments, rants, and fresh ideas - as well as the spontaneous promise to help me design and implement this idea! I am very excited about the workshop and looking forward to it. See you there!

Granada ist phantastisch

27 May 2005

Hier in Granada, auf dem XXII. Weltkongress der Philosophie des Rechts, ist es wahnsinn. Phantastisches Wetter, bekannte Sprecher, viele, viele Leute - über 900 Teilnehmer! - ein Wahnsinns Konferenzgebäude, und bei unserem Workshop kamen einige spannende Fragen und Diskussionen auf. Der Vortrag sei sehr gut gelaufen (ic bin mir nur selber nie dessen sicher). Die Abende sind lang und anregend, die morgen dafür schwer zum Aufstehen. Schade, dass ich nicht bis zum Schluss bleiben kann, aber morgen ist die Theaterpremiere von D'Artagnans Tochter, aus der Feder von Tom Finn, zu der ich eingeladen bin, und dazu muss ich halt wieder in Stuttgart sein. Und am Sonntag geht es auf die ESWC...

Das Leben ist schön.

Flughafenblogging

24 May 2005

Richtig fies: morgens um 3:30 aus dem Bett zu gehen, um dann in Madrid zwei Stunden lang am Flughafen warten zu muessen.

Gruß aus Madrid!

Standpunkt (nicht meiner!)

23 May 2005
"We could really speed up the whole process of drug improvement if we did not have all the rules on human experimentation. If companies were allowed to use clinical trials in Third World countries, paying a lot of poor people to take risks that you wouldn't take in a developed country, we could speed up technology quickly. But because of the Holocaust --"

Dark Side of Popularity

"I actually worry a lot that as I get "popular" I'll be able to get away with saying stupider stuff than I would have dared say before. This sort of thing happens to a lot of people, and I would *really* like to avoid it."

Wenn Nerds protestieren

Wer versteht schon so ein Schild?

Wann gibt es das bei den Studentenstreiks hierzulande? (Und überhaupt, was bedeutet es, wenn Studenten streiken? Dass sie die Arbeit niederlegen?)

Pflichtlektüre

Der Aufstieg Chinas zur politischen und wirtschaftlichen Weltmacht

Teure Taxis

Mein erster Blogeintrag aus Sofia!

Eigentlich ist hier das Taxi spottbillig - also im Verleich zu westlichen Ländern. Drei bis vier Leva für zehn bis fünfzehn Kilometer. Aber manchmal erwischt man ein Taxi, dass nicht OK ist (im wahrsten Sinne des Wortes: die zu empfehlende Taxifirma hier ist OK Taxi) - und dann kostet es leicht das Sechsfache.

Das seltsame ist, dass es immer noch günstiger ist als bei uns, und so richtig zu beschweren möchte man dann auch nicht. Zumal wenn man die Sprache nicht beherrscht.

Mit dem eigenen Auto herzukommen ist zudem noch eine besonders dumme Idee, weil es hier eine besonders interessante Regelung gibt: wenn man sein Auto einführt, und es dann gestohlen wird, und man es entsprechend nicht mehr ausführen kann - muss man an der Grenze zudem noch Strafzoll nachzahlen...

Internet jetzt auch Offline

Ihr braucht das Internet bloß hier online runterzuladen, und schon könnt ihr darin ganz ohne Netzverbindung surfen. Cool, ne?

Eine Stimme im Himmel

"Wolfgang, bist Du Dir wirklich sicher?"

Man sitzt ganz gemütlich im Flugzeug. Am Anfang erfährt man noch, dass nicht der Kapitän, sondern der erste Offizier fliegt. Dann, plötzlich, über die Lautsprecher kam der oben stehende Satz. Und man denkt sich: oh, der Anfänger scheint Schwierigkeiten zu haben und fragt den Chef. Zumal es ein wirklich turbulenter Flug ist (im wörtlichen Sinne).

"Wolfgang, hier spricht Dein Gewissen - bist Du Dir wirklich sicher?"

Was bedeutet das? Hat der Offizier, oder gar der Kapitän zufällig den falschen Schalter umgelegt? Doch die Auflösung folgt stante pede, nach dem Wetterbericht:

"Allen Fluggästen einen schönen Tag und wir hoffen, dass Wolfgang seine Entscheidung, das Junggesellendasein zu beenden nicht bereuen wird."

Puh...

Bosc de les Fades

Das - für mich - beeindruckendste Restaurant / Café / Bar -Dings in Barcelona war ganz eindeutig das Bosc de les Fades (Feenwald) welches irgendwie zum Wachsmuseum gehört (von der Wachsmuseumswebsite kommt ihr auf die offizielle Webseite des Bosc de les Fades, ist leider nicht direkt verlinkbar).

Das Ding ist eine Bar, jedoch derart gestaltet, dass sie jedes Rollenspielers Herz (und überhaupt von jedem, der auf Fantasy und Märchen steht) höher schlagen lässt. Ein Feenwald, ein verwunschenes Zimmer, Burgkämmerlein, und manches mehr bilden die Elemente dieser Bar. Die Bilder auf den verlinkten Seiten geben nur ein sehr ungenügendes Bild ab. Sollte es euch nach Barcelona verschlagen, auf keinen Fall verpassen! Einfach zum Wachsmuseum am südlichen Ende der Rambla, und vor dem Eingang einfach rechts.

Und nächstes mal muss ich auch das Wachsmuseum selber besuchen. Ich dachte, nach dem Madame Tussauds und dem Dungeon in London würde das Barceloner Museum nichts spannendes enthalten können - doch ein Museum dass mit Statuen von Superman und C3PO am Eintritt wirbt könnte interessanter sein als gedacht.

Danke an Núria für den Tipp!

Unique Name Assumption - another example

Ian Davis has a very nice example illustrating the Unique Name Assumption: "Two sons and two fathers went to a pizza restaurant. They ordered three pizzas. When they came, everyone had a whole pizza. How can that be?"'

Better than my examples. And much shorter!

Sagrada Familia

Ich bin zwar schon zurück aus Barcelona, aber so zwei, drei Nachwehen werde ich dazu noch bloggen. Heute: die Sagrada Familia, die Kathedrale der Heiligen Familie zu Barcelona (noch im Bau). Man kommt hin, und muss schlappe 8 Euro hinblättern für den Eintritt. Erscheint viel, ist es auch.

Aber wie berechnet man den Eintritt für etwas, das wirklich einzigartig ist? Man sieht vor allem eine Baustelle, umgeben von den gewaltigen bestehenden Türmen. Der Aufzug nach oben kostet nochmal 2 Euro, und der Treppenaufgang ist versteckt und finster und dunkel - und sehr lohnenswert. Der Blick über Barcelona, den man von da oben genißen kann, ist nur das Sahnestück, das eigentlich unglaubliche ist die Kathedrale selbst. Höher und höher, Scharten, Fenster, Durchblicke überall, immer wieder fällt der Blick durch ein Fenster, genau auf ein golden Geschriebens Wort, auf eine Engelsfigur, auf eines der unzähligen Elemente der Kathedrale.

Wie gewaltige Mammutbäume schmiegen sich die Säulen an eines der vier Portale, alles scheint gen Himmel zu fließen, überall Zitate aus Natur und Geschichte. Das Museum öffnet einem die Augen für zahlreiche Details, zeigt die Tricks für die fließende Wirkung, entschlüsselt manches Detail aus der Natur, und ich blieb staunend zurück mit der Frage, was Gaudi heute mit einem CAD-Programm anstellen würde, ob er genialer, ob er konservativer wäre...

Teuer, aber sehenswert. Und zudem ein schönes Glockenspiel. Und sehr, sehr hoch...

Barcelona II

Wieder in Barcelona. Nachdem letztes Mal im November so ein Traumwetter war, war ich diesmal leicht gekleidet losgeflogen - und prompt ist es kälter als im Ländle. Würde mich nicht überraschen, wenn ich eine Erkältung mitschleppe...

Aber dafür habe ich diesmal sogar etwas Zeit, mir die Stadt anschauen. Wahnsinnggemütliche kleine Bars wie die Eclectica, supertolle Restaurants wie die Muscleria, etc. Die letzten zwei Tage war ich mit Barcelonesen unterwegs, heute mit Kai, einem Erasmusstudenten, und seiner Schwester Rosa. Spannend, wie die unterschiedlichen Einstellungen zu den Sehenswürdigkeiten der Stadt sind, was sie empfehlen, wo sie hingehen.

Die Stadt ist toll! Heute schaue ich noch die Sagrada Familia an, das freue ich mich besonders darauf. Und dann brauche ich ganz dringend Schlaf...

New OWL tools

The KAON2 OWL Tools get more diverse and interesting. Besides the simple axiom and entity counter and dumper, the not so simple dlpconverter, and the syntactic transformer from XML/RDF to OWL/XML and back, you now also have a filter (want to extract only the subClassOf-Relations out of your ontology? Take filter), diff and merge (for some basic syntactic work with ontologies), satisfiable (which checks if the ontology can have a satisfying model), deo (turning SHOIN-ontologies in SHIN-ontologies by weakening, should be sound, but naturally not complete) and ded (removes some domain-related axioms, but it seems this one is still buggy).

I certainly hope this toolbox will still grow a bit. If you have any suggestions or ideas, feel free to mail me or comment here.

Kebab Connection

aus der Reihe Filme in 50 Worten

Montag, Sneak im Metropol. Eine der Regeln der Sneak: ein Film, der eine Preview bekommt, kommt nicht in der Sneak. Wurde Montags gebrochen.

Zum Glück. Weil den Film hätte ich sonst nicht gesehen. Dem Titel nach erwartete ich eine platte Komödie über Türken der zweiten Generation, die voll auf cooler Lan, blöde Sprüche, und zwischen den Knöcheln hängende Hosen stehen. Stattdessen: ein witziger Film, mit einer glaubhaften Geschichte, der Klischees nicht nur bedient, sondern sie an den richtigen Stellen auch ernst nimmt und konsequent weitertreibt. Wenn der Vater den Sohn als Sohn eines Esels beschimpft, weil er weiß, dass er im Unrecht ist, oder er verzweifelt versucht, seine Fehler wegzuerklären, weil er zu stolz ist, sie als solche zuzugeben, dann sehen wir überzeugend dargestellte Szenen. Kurz: ein wirklich guter Film, der leider wegen des Marketings und des Titels den größten Teil der potenziellen Zuschauer verfehlen könnte.

One Day in Europe

Aus der Reihe Filme in 50 Worten

Letzten Freitag war im Delphi eine kleine Premiere dieses Streifens, mitsamt Hauptdarstellern, Regisseur und so ein Krams. Das drumherum ist zwar nett, aber hier gehts trotzdem nur um den Film, und der war - trommelwirbel! - lustig, schnell, bildreich, musikalisch prima unterlegt, überzeugend gespielt, interessante Charaktere, und allerlei sonst.

Vier Episoden in Moskau, Istanbul, Santiago de Compostela und Berlin erzählen von kleinen Ereignissen, die in Europa geschehen. Weil es eben nicht um dramatische, existenzielle Dinge geht, hat der Zuschauer die Muse und der Film die Zeit die ganzen vielen Details aufzuzeigen.

Auch sehr nett: auf die Frage, was der Regisseur denn von Europa halte, antwortete er: "Ich maße mich nicht an, über das Vereinigte Europa zu orakeln. Ich weiß nicht, ob es in 20 oder 40 oder 60 Jahren da ist..." Aber dass es da sein wird, schien er nicht einen Moment zu bezweifeln. So ähnlich auch der Film: die verbindenden Elemente Europas sind Sprachenwirrwar und Fußball (und Diebstahl und Versicherungsbetrug, na gut).

Kurz: wer die Gelegenheit bekommt, anschauen! Der Film lohnt sich.

MinCardinality

More on the Unique Name Assumption (UNA), because Andrew answered on it, with further arguments. He quotes Paul: " The initial problem was cardinality and OWL Flight attempts to solve the problem with cardinality. Paul put it succinctly: "So what is the point of statements with the owl:minCardinality predicate? They can't ever be false, so they don't tell you anything! It's kind of like a belt and braces when your belt is unbreakable." "

Again I disagree, this time to Paul: the minimal cardinality axiom does make sense. For what, they ask - well, for saying that there is a minimal cardinality on this relation. Yeah, you are right: this is an axiom which hardly can lead to an inconsisten ontology. But so what? You nevertheless can cut down the number of possible models with it and get more information out of the ontology.

"I would agree - this was my main problem - how do you explain to Joe (and Andrew) that all his CDs are the same rather than different."

That's turning around the argument. If the reasoner would claim that all of Joes CDs are the same, he would be doing a grave mistake. But so would he if he would claim that all are different: the point is, he just doesn't know. Without having someone to state sameness or difference explicitly, well, you can't know.

"I did comment that the resolution, local unique names using AllDifferent, didn't actually seem to solve the problem well enough (without consideration for scalability for example)."

I am not sure why that should be. It seems that Andrew would be happy if there was a file-wide switch claiming "If I use different URIs here I mean different objects. This file makes the UNA." These files would easily be translated to standard OWL files, but there would be less clutter inside (actually, everything that would need to be done is adding an axiom of allDifferent with all the names of the file).

"I have a feeling that context is a better solution to this problem (that might just be my golden hammer though)."

I don't understand this one, maybe Andrew will elaborate a bit on this.

If you imagine an environment with axioms floating around, from repository to repository, being crawled, collected, filtered, mapped and combined, you must not make the Unique Name Assumption. If you remain in your own personal knowledge base, you can embrace UNA. And everything you need between is one more axiom.

Is it that bad?

Hannover spielt

Die diesjährige Hannover spielt 2005 war... langweilig.

Die letzte Hannover spielt die ich besuchte war Mitte der 90er. Es waren damals um die 2000 Besucher da, zahlreiche Workshops liefen ständig und parallel, unzählige Ehrengäste gaben sich die Klinke in die Hand. War's R. A. Salvatore, den ich damals sah? Viele Leute lernte ich damals kennen, die ich heute noch gerne und viel zu selten sehe. Es war toll.

Über Jahre hinweg verplante ich es stets, nach Hannover zu fahren. Dieses Jahr ergab sich die plötzliche Situation, dass der Samstag frei wurde, weil ich eine Abgabe überraschenderweise vor der Zeit schaffte (Osterwochenende sei dank, mein Text für Rückkehr des Kaisers ist draußen). Also in den Zug gesetzt und ab nach Hannover (etwas mehr als 3 1/2 Stunden - super!).

Wenn das ein Spielecon vor der Haustür gewesen wäre, in Tübingen, in Villingen-Schwenningen, gar hier in Stuttgart, das wäre OK. Aber früher hatte sich eben auch eine viel längere Strecke gelohnt! Und das ist es, was ich vermisse. Kaum Programm, kaum bekannte Gesichter, vielleicht 200 Besucher, die Überreichung des Goldenen Bechers war, nun, auch weniger spektakulär als früher, ah ja. Ich glaube, das war meine letzte Hannover spielt.

Allerdings, ich tue dem Con unrecht. Ich hörte, die Veranstalter hätten sich mit Absicht für ein Downsizing entschieden, um das ganze besser zu verwalten und weniger stressfrei zu machen. Ein nachvollziehbarer und mutiger Entschluss. Und ich bin kaum Con-Spieler: mich interessieren die Workshops und die Gelegenheit, Leute zu treffen und zu planen. Insofern ist die Aussage einfach: ich war nicht Zielruppe, und, wie gesagt, tue damit der Hannover Spielt deutlich unrecht. Ich bin mir sicher, es wird wieder Conberichte geben, die mit Spielerlebnissen und viel Spaß aufwarten werden - jeder soll sich also selber die Nachlese aussuchen, die ihm am besten liegt.

Die gute Nachricht: ein paar liebe Menschen gabelten mich auf und brachten mich dann zur Ostermühle, wo ich dann einen Abend und einen Morgen viel mehr nach meinem Geschmack verbrachte. Danke für diese Gelegenheit!

Pforzheim

Soeben, beim ein paar Links zu Blogs verfolgen...

Ich: "Schon wieder Pforzheim! Ich glaube, Pforzheim hat die größte Blogdichte der Welt..."
WG-Mitbewohnerin: "Vielleicht haben die dort sonst nichts zu tun?"

Ich bin mir auch ganz sicher, Pforzheimer haben alle ganz, ganz viel Humor... ;)

Unique Name Assumption

I just read Andrew Newman's entry on the Unique Name Assumption (UNA). He thinks that not having an UNA is "weird, completely backwards and very non-intuitive". Furher he continues, that "It does seem perverse that the basis for this, the URI, is unique." He cites an OWL Flight paper that caused me quite some headache a few weeks ago (cause there was so little in it that I found to like).

Andrew, whose blog I really like to read, makes one very valid point: "It doesn't really say, though, why you need non-unique names."

There was an OWL requirement that gives a short rationale for the UNA, but it seems it is not yet stated obvious enough.
Let's make a short jump to the close future: the Semantic Web is thriving, private homepages offer rich information sources about anything, and even the companies see the value of offering machine-processable information, thus, ontologies and knowledge bases everywhere!

People want to say how they liked the movie they just saw. They enrich their movie review with an RDF-statement that says

http://semantic.nodix.net/movie#Ring_2 http://semantic.nodix.net/rating#rated http://semantic.nodix.net/rating#4_of_5.

Or rather, their editor creates this statement automatically and publishes it along the review.

I'd be highly surprised if imdb would use the same URI for denoting the movie. They would probably use an imdb-URI. And so could I, using the imdb-specified URI for the movie. But I didn't, and I don't have to. If I want to state that this is the same movie, I can assert that explicitly. If I had UNA, I couldn't do that. The two knowledge bases could not work together.

With UNA, many knowledge bases relying on inverse functional properties would break as well. FOAF, for examples, uses this, identifiying persons with an IFP of their eMail-Hash. With UNA, this wouldn't work anymore.

Let's take another example. On my mothers webpage there could be a statement saying she has three kids, Jurica, Rozana and Zdenko. I would state on my page that I am my moms kid. My sister, being the social kind, tells the world about her mom and her two brothers, Jurica and Denny.
Now, if we have UNA, a reasoner would infer that one of us is lying. But all of us are very honest, trustworthy people. The problem here is, that my name is Zdenko, but most people refer to me as Denny. UNA says that Denny and Zdenko are the same person. If we have no UNA, we wouldn't believe that. But still we can state it explicitly: my mom could have said that she has three kids, Jurica, Rozana and Zdenko, and those are mutually distinct. Problem solved.

You could say, wait, if we had UNA we still could just claim that Zdenko owl:sameAs Denny, and the problem wouldn't arise. That is true. But then I would have to consider my moms statements. That maybe OK on a scale like this, but imagine this in the wilds of the web - you would have to consider every statement made about something, before you may state something as well. Impossible! And you would introduce non-monotonic inferences, and you probably wouldn't really want that.

What does this mean? Let's take the following row of statements, and consider the answer to the question "Is Kain on of Adams two sons?". So we know that Adam has two sons, and that there is an entity named Kain.

Adam fatherOf Abel.

UNA and non-UNA both answer: don't know.

Adam fatherOf Cain.

UNA says "No, Kain is no son of Adam". non-UNA says: "Sorry, I still don't know".

Cain sameAs Kain.

UNA says "Yes, Kain is a son of Adam (hope you didn't notice my little lie seconds before)". non-UNA says: "Yes, Kain is a son of Adam".

Assuming that, instead of the last statement, we claimed that

Adam fatherOf Kain.

UNA would say: "I'm messed up, I don't know anything, my database is inconsistent, sorry." , whereas non-UNA would answer: "Yes, Kain is a son of Adam (and by the way, maybe Kain and Abel are the same, or Kain and Cain, or Abel and Cain)."

The problem is, that in the setting of the Semantic Web you have a World Wide Web with thousands of facts, always changing, and you must assume that you didn't fetch all the information about a subject. You really can't know if you know everything there is about Adam. But you still want to be able to ask questions. And you want to get answers, and these answers to be monotonic. You don't want the Semantic Web to answer one day "No", the other "Yes" and sometimes "I don't know", but you could be fine with having it either provide the correct answer or non at all.

OWL-Flight and proponents of UNA actually forgot that it's a Semantic Web, not just a Semantic Knowledge Base. If you want UNA, take your Prolog-engine. The Semantic Web is more. And therefore it has to meet some requirements, and UNA is an astonishingly basic requirement of the Semantic Web. Don't forget, you can create local unique names if needed. But the other way would be much harder.

Still, Andrews arguments lead to a very important question: taking for granted that Andrew is an intelligent guy with quite some experience with this kind of stuff, how probable is it, that Joe Random User will have really big problems with grasping such concepts as non-UNA? How should the primers be written? How should the tools work in order to help users deal with this stuff - without requiring the user to study these ideas in advance?

Still a long way to go.

Ring 2

Filme in 50 Worten

Angst. Anspannung. Ring 2 ist die würdige Fortsetzung von Ring. Wie soll ich sagen? Kinder sind der größte Horror.
Wir tauchen tiefer in die Geschichte von Samara ein, und wieder sehen wir, wie Rachel um ihren Sohn kämpft. Der Mut, den sie dabei aufbringt, grenzt ans unglaubwürdige - bis man sich vor Augen hält, dass sie um das blanke Leben ihres Kindes kämpft.

Ring ist bislang der einzige Horror-Film, der mir überhaupt gefiel - und ich habe so manchen gesehen. Es ist insbesondere, der einzige, der Horror herberief. Nun gesellt sich Ring 2 dazu.

So spannend, dass ich ganz verspannt im Kino saß, und danach echt verspannt war...

Auch das Internet hat ein Ende

Und zwar hier.

Und wieder zu den Google-Suchen

Bei den drei Google-Anfragen von letzter Woche ist zwar etwas Ruhe eingekehrt, aber ein weiterer Effekt ist zu beobachten: bei allen drei Anfragen (geruch der luft nach regen, gefühl von schnee auf der haut, eigene wahrnehmung von dem gefühl von schnee auf der haut) ist genau dieser Blog der Nummer 1-Hit geworden.

Die armen Leute, die auf der Suche nach dem Geruch der Luft nach Regen sind, und das einzige, was sie bekommen, ein Blogeintrag ist, der sich genau darüber lustig macht...

Unter Brüdern

Aus der hochbeliebten, aber nur unregelmäßig fortgeführten Reihe 'Filme in 50 Worten'

Hiermit möchte ich meine Rolle als Multiplikator wahrnehmen, und darauf hinweisen, dass ich über Buddy, und der über den KulturSpiegel, gestern für umme in die Preview des dänischen Films Brothers reingehen konnte. Der hatte mich im Vorfeld nicht wirklich interessiert, aber was man tut man nicht für kein Geld...

Nun ja, der Film überraschte mich. Glaubwürdige Geschichte, überzeugende Darsteller. Drama. Aber so richtig, mit Tränen und verzwickten Situationen, mit Sturköpfen und Helden. Der Krieg in Afghanistan, Angst vor dem eigenen Vater, der Tod eines geliebten Menschen, Besaufen in der Kneipe, Väter, die unverzeihliches sagen - all das wird dargestellt, nicht gewertet, was ich geradezu bewundernswert finde.

Schade, dass manche Szenen übergangen wurden. Hätte ich gerne gesehen, wie der Bruder erfährt, dass sein Bruder gar nicht tot ist. Da wurde geschnitten. Auch das Ende ist dann etwas mittendring statt schon vorbei. Andere Handlungen sind für mich kaum nachvollziehbar. Aber so sind Menschen halt. Auch schade ist: dadurch, dass die Erzählweise nicht wertet, nehmen wir auch nicht so viel mit. Als ob man hier wirklich nur eine Geschichte erzählen wollte, ohne Nachricht. Niemand will mir etwas sagen in dem Film. Es passiert einfach. So wie das Leben. Ob man das von einem Film will mag man selber entscheiden.

Wer Dramen mag, wer gut erzählten, echt wirkenden Geschichten gerne folgt - der sollte sich diesen Film anschauen. Niemand wird aber wegen einer Kulturlücke ausgelacht werden, wenn er den Film doch verpasst.

So, Multiplikatordienst erfüllt. Wann kriege ich die nächste Freikarte?

AIFB OWL tools

Working with ontologies isn't yet as easy as it could be - especially because the number of little helpers is still far too small. After having written dlpconvert and owlrdf2owlxml (the tool with the maybe most clumsy name in the history of the Semantic Web) I noticed how easy it would be to write some more tools based on Boris' KAON2 OWL ontology infrastructure.

And so I went ahead. First I integrated dlpconvert and owlrdf2owlxml (or short, r2x) in it, then I added a simple ontology dumper and axiom and entity counter. Want to know how many individuals are in your ontology? Simply type owl count myontology.owl -individual, and there you go. Want a list of all Classes? Try owl print myontology.owl -owlclass. It's as easy as that.

I'm totally aware that this functionality maybe isn't worth the effort of building a tool for. But this is just a beginning: I want to add more functionality to filter, merge, compare and much more to it. The point is, at the end having a handy little set of OWL tools you can work with. I miss that really with OWL, and now here it is. At least, a beginning.

Grab your copy now of the AIFB OWL Tools.

Philosophische Grundlagen

I had a talk on Philosophical Foundations of Ontologies last week at the AIFB. I prepared it in German (and thus, all the slides were in German) and just before I started I got asked if I may give the talk in English.

Having never heard a single lesson in philosophy in English and having read English philosophy only on Wikipedia before, I said yes. Nevertheless, the talk was very well perceived, and so I decided to upload it. It's pure evil PowerPoint, no semantic slides format, and I didn't yet manage to translate it to English. If anyone can offer me some help with that - I simply don't know many of the technical terms, and I don't have ready access to the sources - I would be very happy! Just drop me a note, please.

Philosophische Grundlagen der Ontologie (PowerPoint, ca. 4,5 MB)

Broken link

Boogey-Man

Ich habe hier schon länger keine Filmrezensionen mehr geschrieben, leider. Und ich will für diesen Film auch nicht damit anfangen, weil es sich dafür nicht lohnt, und zudem Schwesterchen wie auch Buddy schon ihren Senf dazugegeben haben.

Warum ich trotzdem dazu schreibe? Weil Telepolis eine lobende Rezi zu diesem Film gemacht hat. Wer dies liest, muss sich fragen, ob sie vielleicht ernsthaft einen anderen Film gesehen habe. Beim Hamburger Abendblatt konnte ich mich irgendwie deutlich eher wiederfinden...

Lange Nacht der Museen

Ist es eigentlich Etikettenschwindel bei der langen Nacht der Museen von einer lange Nacht zu sprechen? Die hört ja schon um 2 auf!

Gestern war in Stuttgart wieder die Lange Nacht der Museen. Diesmal leider ohne mich, da ich in Karlsruhe noch auf der DenkWerkzeug 2005 war (die wiederum einige Aha-Erlebnisse und interessante Einsichten erbrachte), und ich erst nach Mitternacht in Stuttgart aufschlug (weil ich nicht bedachte, dass Samstags die Züge anders fahren).

Hätte sie wirklich lange gedauert, wäre ich noch hingegangen. Für anderthalb Stunden aber 12 Euro auszugeben, nun ja, ich weiß nicht. Ironisch: die Rückfahrt dauerte dann fast so lange, wie die Museumsnacht noch ging, weil am Schlossplatz eine Bahn entgleist war.

Zur langen Nacht der Museen muss immer etwas passieren, wie etwa vor zwei Jahren, als das Parkhaus uns nicht rausließ.

Dieter-Hamann-Bridge

Leider ohne ursprünglichen Autor, soeben in meinem Mailkasten gelandet:

"Engländer ärgern - macht alle mit!!!

Hallo, das ist doch der Spaß wert ;-)

Wer erinnert sich nicht an das glorreiche 1:0 der deutschen Fußball-Nationalmannschaft im letzten Spiel vor dem Abriss des altehrwürdigen Wembley-Stadions? (Wie sehr die Engländer diese Niederlage geschmerzt hat, lässt sich übrigens gut in David Beckham "My Side" nachlesen.)

Nun ist es an der Zeit, Didi Hamann für seinen Sieg-Freistoß (ca. 25 Meter Entfernung, flach über den nassen Rasen ins untere linke Eck!) entsprechend zu würdigen: Mittlerweile ist das Wembley-Stadion wieder aufgebaut und zum Stadion führt eine neue Brücke, die noch namenlos ist. Deswegen hat die London Development Agency einen Wettbewerb ausgeschrieben, bei dem der Name gewinnt, der am häufigsten genannt wird. Und das ist unsere Chance!

Also hier für "Dietmar-Hamann-Bridge" voten:
http://www.lda.gov.uk/server.php?show=ConForm.9

In der Begründung bitte angeben: 'In tribute to the player who scored the last goal in the old stadium'

P.S. Schickt den Link mal an alle Freunde und Bekannte und/oder stellt ihn in die bekannten und internen Foren - wäre doch gelacht wenn wir die Tommies nicht noch ein bisschen Ärger könnten."

Da mach' ich mit!

What's DLP?

OWL has some sublanguages which are all more or less connected to each other, and they make the mumbojumbo of ontology languages not any clearer. There is the almighty OWL Full, there's OWL DL, the easy* OWL Lite, and then there are numerous 'proprietary' expansions, which are more (OWL-E) or less (OWL Flight) compatible and useful.

We'd like to add another one, OWL DLP. Not because we think that there aren't enough already, but because we think this one makes a difference. Because it has some nice properties, like fully translatable to logic programs, and because it is easy to use and because it is fully compatible to standard OWL, and you don't have to use any extra tools.

If you want to read more, I and some colleagues at the AIFB wrote a short introduction to DLP (and the best thing is: if I say short, I mean short. Just two pages!). It's meant to be easy to understand as well - but if you have any comments on that, please provide them.

 * whatever easy means here

Gutes Karma

Tibetanische Buddhisten glauben, dass das benutzen des Mantras OM MANI PADME HUM die sechs bösen Geister Stolz, Eifersucht, Lust, Unwissen, Gier und Wut vertreibt und die sechs guten herbeiruft, Großzügigkeit, Harmonie, Ausdauer, Begeisterungsfähigkeit, Konzentration und Einsicht.

Statt das man das Mantra selber meditiert, kann man es aber auch auf eine sogenannte Gebetsmühle eintragen. Eine Umdrehung der Gebetsmühle wirkt wie die einmalige Rezitation des Mantras. Gebetsmühlen haben außerdem den Vorteil, dass man das Mantra mehrfach aufmalen kann: 10 mal geschrieben, einmal gedreht, 10 mal Wirkung.

Heute geht das alles viel schneller und effizienter. Moderne Festplatten drehen sich mit ein paar tausend Umdrehungen pro Minute. Das heißt, wenn man ein Gebetsmantra runterlädt und auf der Festplatte abspeichert, dann wirkt das Mantra mit der Zeit ganz automatisch und mit geballter Kraft - viel stärker, als ein Mensch es herbeten könnte.

Mehr Infos zur digitalen Gebetsmühle (engl.).

Social tagging und Co

Cool. Ich sei im Januar Inspiration für diesen Blogeintrag zu Social Tagging mit del.icio.us, flickr und ähnlichem gewesen. Scheint, als hätte ich meine Arbeit nicht soo schlecht gemacht... (und dabei war ich nach genau dieser ersten Sitzung davon überzeugt, dass ich grottig gewesen sei. Aber das bin ich ja von eigener Arbeit zunächst immer...)

New versions: owlrdf2owlxml, dlpconvert

New versions of owlrdf2owlxml and dlpconvert are out.

owlrdf2owlxml got renamed, as it was formerly known as rdf2owlxml. But as a colleague pointed out, this name can easily be misunderstood, meaning to transform arbitrarily RDF to OWL. It doesn't do that, it only transforms OWL to OWL, from RDF/XML-serialisation to XML Presentation Syntax. And it seems to work quite stable, it can even transform the famous wine ontology. Version 0.4 out now.

dlpconvert lost a lot of its bugs. And as most of you were feeding RDF/XML to it, well, now you can do it officially (listen to the users), too. It reads both syntaxes, and creates a Prolog program out of your ontology. Version 0.7 is out.

They are both based on KAON2, the Karlsruhe Ontology Infrastructure module, written by Boris Motik. My little tools are just wrapped around KAON2 and using its functionality. To be honest, I'm thinking of writing quite a number of little tools like this, who offer different functionality, thus providing you with a nice toolkit to handle ontologies efficiently. I don't lack ideas right now, it's just I' m not sure that there's interest in this.

Well, maybe I should just start and we'll see...

By the way, both tools are not only available as web services, but you may also download them as command line tools from their respective websites and play around it on your PC. That's a bit more comfortable than using a browser as your operating system.

200.000

Yeah. Heute Nacht hat der Counter den 200.000. Besucher der Nodix-Seiten erwischt. Danke an all die Besucher über die Jahre hinweg!

Der Googlecount auf die Suchbegriffe der letzten zwei Blogs ist weiter rückläufig: 28,100 / 36,100 / 667. Das bedeutet besonders bei dem letzten Begriff einen Unterschied von weit über 500%!

Google ändern

Huch? Seit gestern haben sich für die dort angegeben Suchen die Anzahl der Hits gewaltig gesenkt. Heute sieht es so aus:

geruch der luft nach regen: 28,200 Treffer
gefühl von schnee auf der haut: 36,600 Treffer
eigene wahrnehmung von dem gefühl von schnee auf der haut: 1,140 Treffer

Hmmm. Wahrscheinlich benutzen se ein approximate Reasoning Algorithmus für die erste Näherung, und, bei mehrfacher Wiederholung machen sie wahrscheinlich dann eine genauere Rechnung auf, die gecacht wird.

Oder so.

Es ist nicht real, wenn es nicht im Internet ist

Eben im ICQ...

Buddy: ich weiß, was man nicht im internet findet, kann nicht real sein
Buddy: ;)
denny: man findet etwas nicht im internet?
denny: was?
Buddy: der geruch der luft nach regen
Buddy: das gefühl von schnee auf der haut
Buddy: etc.
denny: warte mal
Buddy: oh bitte, du googlest doch nicht etwa danach...
denny: geruch der luft nach regen: 38,400 treffer
denny: gefühl von schnee auf der haut: 57,800 treffer
Buddy: ja, du findest die beschreibung anderer davon, aber nicht die eigene wahrnehmung
denny: eigene wahrnehmung von dem gefühl von schnee auf der haut: 3,620 treffer

Unexpected problems

As you know, I'm a strong believer in the vision of the Semantic Web, and I actively pursue this goal. I am not too sure what it means, but I have hundreds of ideas floating through my head, about what will be possible in this future...

But the road seems longer than expected. For some time I have the dlpconvert and rdf2owlxml web services running. It is very enlightening and interesting to see, what kind of ontologies were used for testing. And I most certainly don't mean the domain of the ontologies used, but rather the syntax.

Both services state very clearly what syntaxes you may use. dlpconvert allows only OWL XML presentation syntax, rather obscure, I admit. That's the main reason, rdf2owlxml was offered. But most people didn't care, they just keep on using RDF - and not just OWL in RDF/XML-serialisation, but much more simple, plain RDF.

Yeah, every RDF is in OWL Full. But dlpconvert only deals with OWL DL. That's stated explicitly. And much less does it work with Abstract Syntax or N3. All of this was tested.

I most definitively don't want to rant about users here. You never should rant about users (I mean, in public). Especially, since everyone who uses a service like dlpconvert is probably quite intelligent and has some expertise in the field of Semantic Web. It's not his fault. It isn't mine either, I wrote quite explicitly what is needed. Maybe it's the W3Cs fault, or maybe it's just to blame on politics.

The fine differences between RDF, RDFS, RDF(S), OWL, OWL Full, OWL DL, OWL Lite, DLP - yes, I said fine differences between RDF and OWL DL - it's just too much to cope with. If it is too much for us, what do we expect of the future user of the Semantic Web? The web as we know it grew to its todays size because it was easy. It wasn't because of standards. For the first few years no one really cared about the HTML standard, I mean, not to the extent we do today in the Semantic Web. Even with tons of errors, pages would load and show nice results. It was a very forgiving system. And now, find out why it was so widely adopted?

The problem is: maybe we really need to be as strict as we are. But I hope we don't. I strongly believe into the virtue of "View source" - but this means understandable views on the source. Not RDF/XML-Serialisation. And still easy to copy. Only this way the Semantic Web can lift off from the roots, from the users. The users were creating the Web in the first years, not the companies. I don't know why everybody is turning to the companies today.

Oh, I should stop, it sounds like ranting again.

Pendeln

"Und, pendelst Du immer noch?"
"Na ja, eigentlich lege ich eher Tarot-Karten als zu pendeln, aber manchmal schon."
"Ich meinte zur Arbeit."

Danke für die Heimfahrt!

Wodka

Neulich, im Getränkemarkt...

"Oh, wir könnten noch Bananen- und Kirschsaft kaufen, für KiBa."
"Coole Idee. Das sind zwei Flaschen Banane und eine Flasche Kirsch."
"Nö, das mischt man 1:1. Wir nehmen zwei von jeder."
"Echt? Na gut."
"Oh, schau, Mangosaft! Nehmen wir auch eine Flasche Mangosaft."
"Soviel passt aber nicht in die Kiste. Wir müssen was raustun."
"Hier, nehmen wir nur eine Flasche Banane."
"Aber dann haben wir ja zwei Flaschen Kirsche auf nur eine Flasche Banane."
"Ja und?"
"Du sagtest doch, die mischt man 1:1."
"Ja, schon, aber den Kirschsaft kann man auch für anderes hernehmen."
"Ach ja? Für was denn?"
"Kirsch Wodka zum Beispiel."
"Wir haben Wodka da?"
"Nein."
"..."
"Blogg das bitte nicht!"

Glaubst Du doch selbst nicht. Wer solche Bilder von mir bloggt, der hat kaum Gnade verdient... ;)

Valentinstag

Warum gibt es eigentlich die Valentinsgrußkarte "Ich liebe nur Dich!" auch im Fünferpack?
(gefragt von Schwesterchen)

Die Hasselhof-Rekursion

Achtung! Schäden für das Gehirn sind nicht ausgeschlossen.

Wer hier klickt, ist selber schuld.

Ironie

Ein militanter Gegner des Gesetzes zum verpflichtenden Anschallen mit dem Sicherheitsgurt beim Autofahren stirbt bei einem Autounfall.

Flop of the Year?

IEEE Spectrum Editor Steven Cherry wrote the article Digital Dullard in, well, IEEE Spectrum. Well, he obviously dislikes Paul Allen for his money, and can't stop ranting about him, and about Mr Allen spending millions and millions of Dollars in research projects ("that's just the change that drops down behind the sofa cushions"). Yeah, Mr Cherry, you're totally right - why should he spend more than 100 Million Dollars in research, he should rather invest it in a multi-million house, an airline or produce a Hollywood blockbuster with James Cameron.

The thing is, Cherry claims the whole project of creating a Digital Aristotle, dubbed Project Halo, is naught but thrown out money, because understanding a page of chemistry costs about 10.000$. For one single page! Come on, how many students would learn one page for 10.000$?

Project Halo succeeded in creating a software program that is capable of taking a high school advanced-placement exam in chemistry, and actually, to pass the exam - and it did, and even beating the average student in it. Millions have been spent, says Cherry, for that? Wow...

Cherry fails to recognise two points here, that illustrate the achievement of such a project:

First, sure, it may cost 10.000$ to get a program that understands one page, and it may cost only 20$ to get a human to do the same. So, training a program that is able to replace a human may cost millions and millions, whereas training a human to do so will probably cost a mere few ten thousands of dollars. But ever considered the costs of replication? The program can be copied for an extremely low cost of a few hundred bucks, whereas every human costs the initial price.

Second, even though the initial costs of creating such prototype programs may be extremely high, that's no reason against it. Arguments like this would have hindered the development of the power loom, the space shuttle, the ENIAC and virtually all other huge achievements in engineering history.

It's a pity. I really think that Project Halo is very cool, and I think it's great, Mr Allen is spending some of his money on research instead of sports. Hey, it's his money anyway. I'd thank him immediately if I should ever meet him. The technologies exploited and developed there are presented in papers and thus available to the public. They will probably help in the further development and raise of the Semantic Web, as they are able to spend some money and brain on designing usable interfaces for creating knowledge.

Why do people bash on visions? I mean, what's Cherry's argument? I don't catch it... maybe someone should pay me 20.000$ to understand his two pages...

Introducing rdf2owlxml

Very thoughtful - I simply forgot to publish the last entry of this blog. Well, there you see it finally... but let's move to the new news.
Another KAON2 based tool - rdf2owlxml - just got finished, a converter to turn RDF/XML-serialisation of an OWL-ontology into an OWL/XML Presentation Syntax document. And it even works with the Wine-ontology.

So, whenever you need an ontology in the easy to read OWL/XML Presentation Syntax - for example, in order to XSL it further to a HTML-page representing your ontology, or anything like that, because it's hard to do this stuff with RDF/XML, go to rdf2owlxml and just grab the results! (The results work fine with dlpconvert as well, by the way).

Hope you like it, but be reminded - it is a very early service right now, only a 0.2 version.

Released dlpconvert

There's so much to do right now here, and even much, much more I'd like to do - and thus I'm getting a bit late on my announcements. Finally, here we go: dlpconvert is released in a 0.5 version.

You probably think: what is dlpconvert? dlpconvert is able to convert ontologies, that are within the dlp-fragment, from one syntax - namely the OWL XML Presentation Syntax- into another, here Datalog. Thus you can just take your ontology, convert it and then use the result as your program in your Prolog-engine. Isn't that cool?

Well, it would be much cooler if it were stronger tested, and if it could read some more common syntaxes like RDF/XML-Serialisation of OWL ontologies, but both is on the way. As for testing I would hope that you may test a bit - as for the serialisation, it should be available pretty soon.

dlpconvert is based totally on KAON2 for the reduction of the ontology.

I will write more as soon as I have more time.

Vier Jahre Nodix (und kein bisschen weißer)

Am 14. Januar eröffnete Nodix - meine zweite Website nach einem Vorgänger in den frühen 90ern - die Pforten, und vieles, vieles hat sich seitdem getan. Ein halbes Dutzend eigener Websiten ist aus Nodix seitdem entstanden, und noch ein paar sind geplant, die mehr oder weniger ein kleines Netzwerk von Websiten bilden sollen.

Im ersten Jahr wahr ich sehr stolz, 2.000 Besucher gehabt zu haben. Irre! Größenwahnsinnig startete ich die Aktion 10.000, bis Januar 2003 10.000 Besucher gehabt zu haben - und tatsächlich es gelang, sogar schon bis August 2002. 2002 hatte Nodix insgesamt 20.000 Besucher - eine unglaubliche Verzehnfachung! 2003 wurde die Zahl nochmal mehr als verdoppelt - 44.000 Besucher in jenem Jahr. Und 2004 schließlich, bis heute, hatten über 115.000 Besucher! Für eine private Homepage wirklich famos. Vielen Dank all den Leuten, die Nodix treu sind, und es ab und an besuchen.

Allerdings hat sich in den Jahren auch ein kleines Team um Nodix gebildet. Nicht, dass es sich selber so bezeichnen würde, aber große Teile von Nodix entstehen und entstanden unabhängig voneinander. Jüngster Sproß ist Schwesterchens nakit-arts, das lange nur ihre Galerie enthielt, jetzt aber mit ihrem Blog - einem der schönsten überhaupt - zu den bestbesuchten Seiten hier gehört. Desweiteren Ralf Baumgartner, der, obwohl er eine eigene Website hat, sich demnächst verstärkt um something*positive kümmern wird, einer der Dornröschen-Seiten von Nodix. Genau wie die nutkidz. Doch für beide ist geplant, in den nächsten Wochen sie wiederzubeleben, und die Pläne schreiten voran. Versprochen, nichts davon stirbt.

Dies gilt insbesondere für das DSA4 Werkzeug, das zur Zeit leider nicht einmal eine Website hat. Twel wird sich hier verstärkt darum kümmern, doch es hängt an mir. Zu meinen guten Vorsätzen und Plänen dieses Jahr gehört es aber, einige Aufgaben endlich abzuschließen: dazu gehört zunächst das Verfassen zweier Texte für DSA, einer fast fertig, einer in Planung; und dazu gehört der Relaunch der Website zum und das Fertigstellen des DSA4 Werkzeugs. Ja, genau gelesen, mein Zeitplan für 2005 sieht eine fertige Version des DSA4 Werkzeugs vor - definiert als über 95%-ige Regelkonformanz bei der Generierung und Steigerung. Na, ist das nichts?

So weit zu den Versprechungen und Plänen. Und was hat es mit dem Weißer auf sich? Nun, wie seit dem ersten Tag ist die Hintergrundfarbe von Nodix, einer weltbekannten Spielefirma zu Ehren, eaeaea.

Comments to naming

Richard Newman sent me some thoughtful comments via eMail on the What's in a name series (there were also some great comments on the individual entries, feel free to browse them). He sent them via eMail, cause he thought he couldn't comment - that should be wrong, everyone should be able to comment anonymously. Or did anyone else encounter problems? I should switch to some dedicated software soon, anyway, but right now I don't have the time to dig deeper into it. I especially miss trackback, sigh.

Here's what Richard wrote:

"Your first point, about ISBNs and "what's being referenced" --- I think you'd be interested in FRBR, which is a modelling of the bibliographical domain. It splits things up into

Work -> Expression -> Manifestation -> Item

A work is an abstract concept, like "Politeia". An expression is a realisation of a work, so a particular translation is an expression. A manifestation is physical embodiment of an expression: this is what's given an ISBN. All copies of a certain book are Items; the edition of the book is their Manifestation.

So, you see, when you're discussing Plato's Politeia, you have to be conceptually clear about whether you're talking about works, expressions, manifestations, or items.

E.g.

:PolWork dc:creator "Plato" ;
  rdfs:label "Plato's Politeia, the abstract concept." .
:PolExp1 ex:translator "Mr Smith" ;
  frbr:work :PolWork ;
  rdfs:label "Mr. Smith's translation of Plato's Politeia." .
:PolMan1 ex:publisher "Penguin" ;
  frbr:expression :PolExp1 ;
  rdfs:label "Penguin's edition of Smith's translation." .
:MyCopy ex:owner hg:RichardNewman ;
  frbr:manifestation :PolMan1 ;
  rdfs:label "Richard's copy of the Penguin edition." .

Do you see? Each level has its own properties (and some may be duplicated; e.g. each has a title: the title of the abstract work, the name given to the translation, the name Penguin prints on each book, and the name printed on my copy).

I've done a bit of work on modelling FRBR in RDFS/OWL, but haven't yet finished. "

I think that's really interesting, and taking a look at FRBR it was pretty well done. I sure am looking forward to see Richards interpretation in OWL, and will probably use it.

"Your second issue is the difference between a resource and its representation. A URI should only refer to one thing; it is entirely wrong to use http://www.holygoat.co.uk to refer both to my homepage (as in using RDF to describe its language, or size, or last-modified) and to me (my name, my email address, etc.) which I have seen done.

Your web server should return RDF for http://semantic.nodx.net/#Plato if your browser says that it accepts RDF+XML. A normal browser should have an HTML representation returned. Indeed, it's possible to do the following:

  • the abstract resource. Hit this with a browser, get an HTML page; with an RDF agent, get some RDF.
http://example.com/Plato a rdf:resource .
  • the HTML representation.
http://example.com/Plato/html a ex:representation ;
  ex:representationOf http://example.com/Plato .
    1. the RDF.
http://example.com/Plato/rdf a ex:representation ;
  ex:representationOf http://example.com/Plato .

i.e. you can unambiguously refer to each representation, and the resource. When your client arrives, asking for Plato, you can redirect them to the appropriate place. Clever, huh?

URIs should never give a 404. They should return the appropriate headers or content for whatever the client is requesting; this may be the RDF file in which the resource is defined, if the client understands RDF, or an HTML page.

If you're interested in this sort of thing, it pops up on the W3C's RDF Interest Group list occasionally.

Patrick Stickler and others have come up with an additional HTTP verb, MGET, which will return the RDF description of a resource. Combined with their URIQA architecture, it will give you a Concise Bounded Description for a URI. This stops you having to somehow put descriptions into particular files, and better deals with the distributed nature of the Semantic Web. Check it out; it presents several convincing arguments for not using fragment identifiers to refer to resources, and solves your bandwidth problem. You should never have to dump a whole file to get a description of a URI."

I have to note that Richard wrote me this just after part 4 of the series was published, so I could answer some of the questions already in the last two parts. Just to summarise it: I don't like content negotiation. Although it is technically totally feasible, I disagree that it should be done or is a good solution. If my browser asks for http://semantic.nodix.net/#Plato I don't think I should get different things depending on the content negotiation. This feels like cheating.

I wrote that to Richard already, and he answered:

"I think we agree on the main point, which is that

foaf:name "Richard" ; ex:format "HTML" .

which is a travesty :) "

He is totally right here.

"You still see it happen, though, with people referring to Wikipedia pages as if they were the abstract resource.

The content negotiation (getting different things depending on what you accept) is exactly what the Web is supposed to do. If I'm using a mobile browser, I want a simplified version of a page; if I'm an RDF agent, I want RDF, if it exists, because HTML is of no use to me. A common usage of this is to serve up strict XHTML to Mozilla, and less-strict HTML to Internet Explorer. It is also done all the time to serve PNG where the client accepts it, and GIF if it doesn't, and there is an intentional disconnect on the Web between a resource and its representations.

The lack of such a disconnect would lead to exactly the problem you describe; if I can't return a representation of a resource, because it's abstract, then how do I find out anything about it? I could use MGET, but you can't MGET a person... so, if you want to talk about the real world thing "Plato", he has to 404, or you get the "what am I talking about?" problem. Better, in my view, to redirect a browser to plato.html and a SW agent to a chunk of RDF. "

I would rather like to ask for http://semantic.nodix.net/Plato.rdf to get the RDF/XML representation, http://semantic.nodix.net/Plato.owl to get the OWL/XML representation, http://semantic.nodix.net/Plato.html to get a HTML page for the user to read and Plato.jpg for a picture of Plato. This shouldn't be hidden behind content negotiation. I know, I know, Patrick would strongly disagree here, but I think it feels wrong and actually defies the idea of an URI.

"You can do exactly that (and I agree that the representations should have separate URIs --- conneg is only for when you're trying to get some description of an abstract resource), but then how do you refer to the abstract concept of "Plato"? http://.../Plato is a resource, and I want to make statements about him. But there's no point in it being 404 when dereferenced, because then how would I find out that Plato.html exists? HTTP doesn't return URIs, it returns representations of them.

A URI is simply something that is dereferenced to get a representation, and that representation should be decided on by conneg. In this case, /Plato is an abstract resource, so one of the representations should be returned. We can then make statements about Plato (e.g. foaf:name "Plato"), and about the JPEG and HTML representations, because they have different URIs, but still get something useful back when we want to access /Plato."

I also dislike MGET right now. Maybe I am wrong, but to me, the whole URIQA architecture feels somewhat wrong - but maybe I should just dwell deeper into it, I have to admit, I didn't study it yet enough to really be in a position to bash on it. The problem is, that MGET seems unnecessary to me - and it works on a different conceptual level than the rest of the Semantic Web proposals. I think everything MGET solves can be solved with tools that already exist: Richards example above, where he gives triples telling us which representations are used to describe a resource, shows perfectly well that you actually don't need content negotiation and MGET.

"There are things to question about URIQA, but it does have some good going for it. MGET is actually an implicit query. In the standard Web model, you request URIs and get back document representations. Doing an MGET on a Web server is asking it to return a description, regardless of where on the site descriptions of that resource exist, and you're explicitly asking for meta-data. As Patrick points out, it's similar doing a GET and specifying that you accept RDF, but is likely to be more concise (the difference between a "representation" and a "description"). In fact, this is exactly what the Nokia URIQA server does.

MGET overlaps with query servers a bit, and with GET a bit, but it's a little bit special, too. The whole idea is that from a single URI you can get a useful description of a resource, just by issuing a single MGET. Every other approach needs more work."

This URIQA / MGET stuff sounds more and more interesting. I really should dwell deeper into it.

Also, the idea of Concise Bounded Descriptions may be very neat, I have to study that more as well. Funny thing, the very same day Richard pointed me to it, a colleague told me about it too - this is usually a sign, that this idea is worth considering more.

Richard also wrote "URIs should never give a 404", and as you know, I disagreed with it mildly. He tried to summarise his position:

"I consider that each returned resource should have its own URI --- e.g. Plato.jpg --- and that the original URI should be used to make statements about the abstract resource. This allows you to say

...Plato foaf:name "Plato" .
...Plato.jpg ex:resolution "150dpi" .
...Plato.html dc:creator "Denny" .

Dereferencing the abstract resource, rather than throwing a 404, should do something useful --- e.g. redirecting with a 303 to one of the representations. Have you ever tried viewing a Blogger Atom feed in your browser? If you hit it with an RSS reader, you get the XML, but in a browser Blogger shows you an XHTML transformation of the XML. That's useful, and I think that's how the Semantic Web should work. Imagine if your agent hit /Plato, and got RDF out of it, but when you looked at it with your browser you saw a dynamically-generated HTML page? Handy!

I can understand your objection, though; it does seem wrong that you get different things out of the same URI. However, you should almost always get HTML out of plato.html, and RDF out of plato.rdf. All the conneg is doing is making sure you can see an abstract thing in the best way possible, according to what you've told the server you can understand. "

Richard is pretty good in convincing me, cause he uses the right arguments: it's for the people, dummy, and the machines can work it out anyway.

I still stick to the recommendations I gave yesterday. But just as I am writing, and rereading it all, I am starting to change my mind on content negotiation. Maybe it is a good thing. I will have to think about it some more, and as soon as I come to a solution, I will bother you with it again. I still have a gut feeling about it that tells me 'no', but the reasons given sound very convincing and I agree with most of them, so heck, let's meditate on this as soon as I find a few hours to spare.

Big thanks to Richard and his thoughts, anyway. I hope this discussion helps you to make up your own mind as well.

What's in a name - Part 6

In this series we learned how to make URIs for entities. I know there's a big discussion flaring up every few weeks or so, if we should use fragment identifier or not. For me, this question is pretty much settled. Using a fragment identifier has the advantage of giving you the ability of providing a human readable page for those few lost souls who look up the URI, so maybe it's a tad nicer than using no fragment identifier and returning 404s. Not using fragids has the advantage of probably reducing bandwidth - but this discussion should be more or less academic, because looking up URIs, as we have seen, should not happen.

There is some talking about different representations, negotiating media-types, returning RDF in one, XHTML in the other case, but to be honest, I think that's far too complicated. And you would need to use another web server and extensions to HTTP to make this real, which doesn't really help the advent of the Semantic Web. Look at Nokias URIQA project for more information.

Keep this rules in mind, and everything should be fine:

  • be careful to use unused URIs if you reference a new entity. Take one from an URI space you have control of, so that URI collision won't appear
  • don't put a website under the URI you used to to name an entity. That would lead to URI collision
  • try to make nice looking URIs, but don't try to hard. They are supposed to be hidden by the application anyway
  • provide rdfs:label and rdfs:seeAlso instead. This solves everything you would want to try to solve with URI naming, but in a standard compliant way
  • give your resources URIs. Please. So that other can reference them more easily.

I should emphasise the last one more. Especially using RDF/XML-Syntax easily leads to anonymous nodes, which are a pain in the ass because they are hard or impossible to address. Especially, don't use rdf:nodeID. They don't give your node an ID that's visible to the outer world. This is just a local name. Don't use it, please.

The second is using them like this:

<foaf:person about="me">
  <foaf:knows>
    <foaf:Person>
      <foaf:name>J. Random User</foaf:name>
    </foaf:Person>
  </foaf:knows>
</foaf:Person>

Actually, the Person known to "me" is an anonymous one. You can't refer to her. Again, try to avoid that. If you can, look up the URI the person gave to herself in her own FOAF-file. Or give her a name in your own URI-space. Don't be afraid, you won't run out of it.

Another very interesting approach is to use published subjects. I will return to this in another blog, promised, but so long: never forget, there is owl:sameAs to make two URIs point to the same thing, so don't mind too much if you doublename something.

Well, that's it. I hope you enjoyed the series, and that you learned a bit from it. Looking forward to your comments, and your questions.

What's in a name - Part 5

After calling Plato an XML-Element, making movies out of websites and having several accidents with careless URIs, it seems we return to the very beginning of this series.

http://semantic.nodix.net/document/Politeia dc:creator "Plato".

Whereby http://semantic.nodix.net/document/Politeia explicitly does not resolve but returns a 404, resource not found. Let's remember, why didn't we like it? Because humans, upon seeing this, have the urge to click on it in order to get more information about it. A pretty good argument, but every solution we tried brought us more or less trouble. We didn't get happy with any of them.

But how can I dismiss such an argument? Don't I risk loosing focus with saying "don't care about humans going nowhere"? No, I really don't think so. Due to two reasons, one meant for humans and one for the machines.

First the humans (humans always should go first, remember this, Ms and Mr PhD-student): humans actually never see this URI (or at least, should not but when debugging). URIs who will grace the GUI should have a rdfs:label which provides the label human users will see when working with this resource. Let's be honest: only geeks like us think that http://semantic.nodix.net/document/Politeia is a pretty obvious and easy name for a resource. Normal humans would probably prefer "Politeia", or even "The Republic" (which is the usual name in English speaking countries). Or be able to define their own name.

As they don't see the URI, they actually never feel the urge to click on it, or to copy and paste it to the next browser window. Naming it http://semantic.nodix.net/document/Politeia instead of http://semantic.nodix.net/concept/1383b_xc is just for the sake of readability of the source RDF files, but actually you should not derive any information out of the URI (that's what the standard says). The computer won't either.

The second point is, a RDF application shouldn't look up URIs either. It's just wrong. URIs are just names, it is important that they remain unique, but they are not there for looking up in a browser. That's what URLs are for. It's a shame they look the same. Mozilla realised the distinction when they gave their XUL language the namespace http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul. Application developers should realise this too. rdfs:seeAlso and rdfs:isDefinedBy give explicit links applications may follow to get more information about a resource, and using owl:imports actually forces this behaviour - but the name does not.

Getting information out of names is like making fun of names. It's mean. Remember the in-kids in primary school making fun of out-kids because of their names? You know you're better than that (and, being a geek, you probably were an out-kid, so mere compassion and fond memories should hold you back too)..

Just to repeat it explicitly: if an URI gives back a 404 when you put it in a browser navigation bar - that's OK. It was supposed to identify resources, not to locate them.

Now you know the difference between URIs and URLs, and you know why avoiding URI collision is important and how to avoid it. We'll wrap it all in the final instalment of the series (tomorrow, I sincerely hope) and give some practical hints, too.

By the way, right after the series I will talk about content negotiation, which was mentioned in the comments and in e-Mails.

Uh, and just another thing: the wary reader (and every reader should be wary) may also have noticed that

Philosophy:Politeia dc:creator "Plato".

is total nonsense: it says, that there is a resource (identified with QName Philosophy:Politeia) that is created by "Plato". Rest assured that this is wrong - no, not because Socrates should be credited as the creator of the Politeia (this is another discussion entirely) but because the statement claims that the string "Plato" created it - not a Person known by this name (who would be a resource that should have an URI). But this mistake is probably the most frequent one in the world of the Semantic Web - a mistake nevertheless.

It's OK if you make it. Most applications will cope with it (and some are actually not able to cope with the correct way). But it would not be OK if you didn't know that you are making a mistake.

What's in a name - Part 4

I promised you four solutions to the problem of dubbing with appropriate URIs. So, without further ado, let's go.

The first one you've seen already. It's using anonymous nodes.

_person foaf:interest _security.
http://dmoz.org/Computers/Security/ dc:subject _security.

But here we get the problem, that we can't reference _security from outside, thus loosing a lot of the possibilities inherent in the Semantic Web, because this way you can not say that someone else is interested in the same topic as _person above. Even if you say, in another RDF file,

_person2 foaf:interest _security.
http://dmoz.org/Computers/Security/ dc:subject _security.

_security actually does not have to be the same as above. Who says, websites only have one subject? The coincidental equality of the variable name _security bears as much semantics as the equality of two variables x in a C and a Python-Program.
So this solution, although possible, bears too much short-comings. Let's move on.

The second solution is hardly available to the majority of us puny mortals. It's introducing a new URI schema. Let's return to our very first example, where we wanted to say that the Politeia was written by Plato.

urn:isbn:0192833707 dc:creator "Plato".

Great! No problems here. Sure, your web-browser can't (yet) resolve urn:isbn:0192833707, but no ambiguity here: we know exactly of what we speak.

Do we? Incidentally, urn:isbn:0465069347 also denotes the Politeia. No, not in another language (those would be another handful of ISBN numbers), just a different version (the text is public domain). Now, does the following statement hold?

urn:isbn:0192833707 owl:sameAs urn:isbn:0465069347.

Most definitively not. They have different translators. They have different publishers. These are different books. But it's the same - what? What is the same? It's not the same text. It's not the same book. They may have the same source text they are translated from. But how to express this correctly and still useful?

The urn:isbn: scheme is very useful for a very special kind of entities - published books, even the different versions of published books.
The problem with this solution that you would need tons of schemes. Imagine the number of committees! This would, no, this should never happen. We definitively need an easier solution, although this one certainly does work for very special domains.

Let's move on to the third solution: the magic word is fragment identifier. #. Instead of saying:

http://semantic.nodix.net/Politeia dc:creator http://semantic.nodix.net/Plato.

and thus getting 404s en masse, I just say:

http://semantic.nodix.net/#Politeia dc:creator http://semantic.nodx.net/#Plato.

See? No 404. You get to the homepage of this blog by clicking there. And it's valid RDF as well. So, isn't it just perfect? Everything we wished for?

Not totally, I fear. If I click on http://semantic.nodx.net/#Plato, I actually expect to read something about Plato, and not to see a blog about the Semantic Web. So this somehow would disappoint me. Better than a 404, still...
The other point is my bandwidth. There can be RDF files with thousands of references. Following every single one will lead to considerable bandwidth abuse. For naught, as there is no further information about the subject on the other side. Maybe using http://semantic.nodix.net/person#Plato would solve both problems, with http://semantic.nodix.net/person being a website saying something like "This page is used to reserve conceptual space for persons. To understand this, you must understand the magic of URIs and the Semantic Web. Now, go back whereever you came from and have a nice day." Not too much webspace and bandwith will be used for this tiny HTML-page.

You should be careful though to not have a real fragment identifier "Plato" in the page, or you would actually dereference to this element. URI collision again. You don't want Plato to become half-philosopher / half-XML-element, do you?

We will return to fragment identifiers in the last part of this six part series again. And now let's take a quick look at the fourth solution - we will discuss it more thoroughly next time.

Use a fresh URI whenever you need an URI and don't care about it giving a 404.

What's in a name - Part 3

Last time we merrily published our first statement for the Semantic Web:

http://www.imdb.com/title/tt0088247/ http://purl.org/dc/elements/1.1/creator "James Cameron".

A fellow Semantic Web author didn't like the number-encoded IMdb-URI, but found a much more compelling one and then published the following statement:

http://en.wikipedia.org/wiki/The_Terminator http://purl.org/dc/elements/1.1/date "1984-10-26".

A third one sees those and, in order to foster integration of data offers helpfully the following statement:

http://www.imdb.com/title/tt0088247/ owl:sameAs http://en.wikipedia.org/wiki/The_Terminator.

And now they live merrily ever after. Or do you hear the thunder of doom rolling?

The problem is that the URIs above actually already denote something, namely the IMdb website about the Terminator and the Wikipedia-article on the Terminator. They did not denote the movie itself, but that's how they're used in our examples. Statement #3 above actually says the two websites are the same. The first one says, that "James Cameron" created the IMdb website on the Terminator (they'd wish), and the second one says that the Wikipedia article was created in 1984, which is wrong (July 23, 2001 would be the correct date). We have a classic case of URI collision.

This happens all the time. People working professionally on this do this too:

_person foaf:interest http://dmoz.org/Computers/Security/.

I'd bet, _person (remaining anonymously here) does not have such a heavy interest in the website http://dmoz.org/Computers/Security/, but rather in the Topic the website is about.

_person foaf:interest _security.
http://dmoz.org/Computers/Security/ dc:subject _security.

Instead of letting _security be anonymous, we'd rather give it a real URI. This way we can reference it later.

_person foaf:interest http://semantic.nodix.net/topic/security.
http://dmoz.org/Computers/Security/ dc:subject http://semantic.nodix.net/topics/security.

But, oh pain - now we're exactly at the same spot we've been in the last part. We have an URI that does not dereference to a website (by the way, I do know that the definition of foaf:interest actually says the semantics of foaf:interest is, that the Subject is interested in the Topic of the Object, and not the Object itself, but that's not my point here)
Thinking for a moment about it, we must conclude that it is actually impossible to achieve both goals: either the URIs will identify a resource retrievable over the web and are thus unsuitable as URIs for entities outside the web (like persons, chairs and such) because of URI collision, or they don't - and will then lead to 404-land.

Isn't there any solution? (Drums) Stay tuned for the next exciting installment of this series, introducing not one, not two, not three, but four solutions to this problem!

What's in a name - Part 2

How to give a resource a name, an URI? Let's look at this statement:

movie:Terminator dc:creator "James Cameron".

Happy with that? This is a valid RDF statement, and you understand what I wanted to say, and your RDF machine will be able to read and process it, too, so everything is fine.

Well, almost. movie:Terminator is a QName, and movie: is just a shorthand prefix, a namespace, that actually has to be defined as something. But as what? URIs are well-defined, so we shouldn't just define the namespace arbitrarily. The problem is, someone else could do the same, and suddenly, one URI could denote two different resources - this is called URI collision, and it is the next worst thing to immanentizing the Eschaton. That's why you should grab some URI space for yourself and there you go, you may define as many URIs there as you like (remember, the U in URI means Universal, that's why they make such a fuss about the URI space and ownership of it).

I am the webmaster of http://semantic.nodix.net, and the URI belongs to me and with it, all the URIs starting with it. Thus I decide, that movie: shall be http://semantic.nodix.net/movie/. Our example statement thus is the same as:

http://semantic.nodix.net/movie/Terminator http://purl.org/dc/elements/1.1/creator "James Cameron".

So this is actually what the computer sees. The short hand notation above is just for humans. But if you're like me, and you see the above Subject, you're already annoyed that it is not a link, that you can't click on it. So you copy it into your browser address bar, and go to http://semantic.nodix.net/movie/Terminator. Ups. A 404, the website is not found. You start thinking, oh man, stupid! Why you giving the resource such a name that looks so much like an web address, and then point it to 404-Nirvana?

Many think so. That's because they don't grasp the difference between URIs and URLs, and to be honest, this difference is maybe the worst idea the W3C ever had (that's a hard-to-achieve compliment, considering the introduction of XML/RDF-serialisation and XSD). We will return to this difference, but for now, let's see what usually happens.

Because http://semantic.nodix.net/movie/Terminator leads to nowhere, and I'm far too lazy to make a website for the Terminator just for this example, we will take another URI for the movie. Jumping to IMdb we quickly find the appropriate one, and then we can reformulate our statement:

http://www.imdb.com/title/tt0088247/ http://purl.org/dc/elements/1.1/creator "James Cameron".

Great! Our subject is a valid URI, clicking on http://www.imdb.com/title/tt0088247/ (or pasting it to a browser) will tell you more about the subject, and we have a valid RDF statement. Everything is fine again...

...until next time, where we will discuss the minor problems of our solution.

Pah!

Geht nicht auf diese Seite. Schaut euch auf gar keinen Film das grausige Foto von mir an. Das eigentlich Foto, dass ich dann auch für meine neue Webseite verwendet habe sieht nämlich viel besser aus. Ignoriert also diese Seite. Kapiert?

Habe es hingegen tatsächlich endlich geschafft, hinter den Link über Denny, der schon seit Anfangszeiten von Nodix vorhanden ist, auch mal Inhalt zu bringen. Die Seite verweist jetzt auf die äußerst schreibfreundliche Seite denny.vrandecic.de. Letztlich soll sich hier auch irgendwann CV, Publikationsliste und ähnliches tummeln, eben das, was man als eigentlich Homepage im klassischen Sinn versteht. So, jetzt aber endgültig, auf Richtung Neandertal...

What's in a name - Part 1

There are tons of mistakes that may occur when writing down RDF statements. I will post a six part series of blog entries, starting with this one, about what can go wrong in the course of naming resources, why it is wrong, and why you should care - if at all. I'll try to mix experience with pragmatics, usability with philosophy. And I surely hope that, if you disagree, you'll do so in the comments or in your own blog.

The first one is the easiest to spot. Here we go:

"Politeia" dc:creator "Plato".

If you don't know about the differences between Literals, QNames and URIs, please take a look at the RDF Primer. It's easy to read and absolutely essential. If you know about the differences, you already know that the above said actually isn't a valid RDF statement: you can't have a literal as the subject of a statement. So, let's change this:

philo:Politeia dc:creator "Plato".

What's the difference between these two? In the first one you say that "Plato" is the creator of "Politeia" (we take the semantics of dc:creator for granted for now). But in the second you say that "Plato" is the creator of philo:Politeia. That's like in Dragonheart, where Bowen tries to find a name for the dragon because he can't just call him "dragon", and he decides on "draco". The dragon comments: "So, instead of calling me dragon in your own language, you decide to call me dragon in another language."

Yep, we decide to talk about Politeia in another language. Because RDF is another language. It tries to look like ours, it even has subjects, objects, predicates, but it is not the language of humans. It is (mostly) much easier, so easy in fact even computers can cope with it (and that's about the whole point of the Semantic Web in the first place, so you shouldn't be too surprised here).

"Politeia" has a well defined meaning: it is a literal (the quotation marks tell you that) and thus it is interpreted as a value. "Politeia" actually is just a word, a symbol, a sign pointing to the meant string Politeia (a better example would be: "42" means the number 42. "101010b", "Fourty-Two" or "2Ah" would have been perfectly valid other signs denoting the number 42).

And what about philo:Politeia? How is it different from "Politeia", what does this point to?

philo:Politeia is a Qualified Name (QName), and thus ultimatively a short-hand notation for an URI, an Unified Resource Identifier. In RDF, everything has to be a resource (well, remember, RDF stands for Resource Description Framework), but that's not really a constraint, as you may simply consider everything a resource. Even you and me. And URIs are names for resources. Universally (well, at least globally) unique names. Like philo:Politeia.

You may wonder about what your URI is, the one URI denoting you. Or what the URI of Plato is, or of the Politeia? How to choose good URIs, and what may go wrong? And what do URIs actually denote, and how? We'll discuss this all in the next five parts of this series, don't worry, just stay tuned.

Team America

Ab morgen bin ich im Neandertal, und wer weiß schon, wie da die Internetverbindung ist. Also muss ich noch schnell einen Tipp für alle Kinogänger loswerden, zu einem Film, den niemand verpassen sollte - Team America!

Ein Film mit Alec Baldwin, Liv Tyler, Matt Damon, Sean Penn, Michael Moore, Samuel L. Jackson, Kim Il-Yong, George Clooney und vielen anderen - und das waren nur Nebenrollen! Das South Park Team macht einen bösen, bösen Film, mit schwärzesten Humor und einem Hauch an Kritik an gewissen amerikanischen imperialistischen Tendenzen.

Heißer Sex, unglaubliche Action-Sequenzen, wie sie noch nie zuvor mit dieser Technik (Marionetten) ausgeführt wurden, an den großen Schauplätzen dieser Welt (Nordkorea, Kairo, Paris, Mount Rushmore, Washington) - was will man mehr? Für ausgesprochene Cineasten sind zudem äußerst subtile Zitate auf die großen Kinofilme der letzten Jahrzehnte eingebaut (Star Wars oder Matrix).

Im Kino ab dem 30. Dezember.

Auch auf baumgarf wird ein Review von Team America geboten. Wer lieber etwas anderes ansehen möchte, sollte es mit Ocean's Twelve versuchen - der coolste Film des Winters! Wer auf seichteren Humor steht, kann es auch getrost mit Trouble ohne Paddel versuchen (auch auf Baumgarf findet sich ein Review, jedoch wählt Ralf den Originaltitel Survival Camp, weil er den "deutschen" Titel nicht mag. Sich aber über den schleichenden Verfall des Deutschen beschweren, tsts. Heuchler ;)

Why we will win

People keep saying that the Semantic Web is just a hype. That we are just an unholy chimaera of undead AI researchers talking about problems solved by the database guys 15 years ago. And that our work will never make any impact in the so called real world out there.

As I stated before: I'm a believer. I'm even a catholic, so this means I'm pretty good at ignoring hard facts about reality in order to stick to my beliefs, but it is different in this case: I slowly start to comprehend why Semantic Web technology will prevail and make life better for everyone out there. It' simply the next step in the IT RevoEvolution.

Let's remember the history of computing. Shortly after the invention of the abacus the obvious next step, the computer mainframe, appeared. Whoever wanted to work with it, had to learn to use this one mainframe model (well, the very first ones were one-of-a-kind machines). Being able to use one didn't necessarily help you using the other.

First the costs for software development were negligible. But slowly this changed, and Fred Brooks wrote down his experience with creating the legendary System/360 in the Mythical Man-Month (a must-read for software engineers), showing how much has changed.

Change was about to come, and it did come twofold. Dennis Ritchie is to blame for both of them: together with Ken Thompson he made Unix, but in order to make that, he had to make a programming language to write Unix in, this was C, which he made together with Brian Kernighan (this account is overly simplified, look at the history of Unix for a better overview).

Things became much easier now. You could port programs in a simpler way than before, just recompile (and introduce a few hundred #IFDEFs). Still, the masses used the Commodore 64, the Amiga, the Atari ST. Buying a compatible model was more important than looking at the stats. It was the achievement of the hardware development for the PC and of Microsoft to unify the operating systems for home computers.

Then came the dawning of the age of World Wide Web. Suddenly the operating system became uninteresting, the browser you use was more important. Browser wars raged. And in parallel, Java emerged. Compile once, run everywhere. How cool was that? And after the browser wars ended, the W3Cs cries for standards became heard.

That's the world as it is now. Working at the AIFB, I see how no one cares what operating system the other has, be it Linux, Mac or Windows, as long as you have a running Java Virtual Machine, a Python interpreter, a Browser, a C++ compiler. Portability really isn't the problem anymore (like everything in this text, this is oversimplified).

But do you think, being OS independent is enough? Are you content with having your programs run everywhere? If so, fine. But you shouldn't be. You should ask for more. You also want to be independent of applications! Take back your data. Data wants to be free, not locked inside an application. After you have written your text in Word, you want to be able to work with it in your Latex typesetter. After getting contact information via a Bluetooth connection to your mobile phone, you want to be able to send an eMail to the contact from your web mail account.

There are two ways to achieve this: the one is with standard data formats. If everyone uses vCard-files for contact information, the data should flow freely, shouldn't it? OpenOffice can read Word files, so there we see interoperability of data, don't we?

Yes, we do. And if it works, fine. But more often than not it doesn't. You need to export and import data explicitly. Tedious, boring, error prone, unnerving. Standards don't happen that easily. Often enough interoperability is achieved with reverse engineering. That's not the way to go.

Using a common data model with well defined semantics and solving tons of interoperability questions (Charset, syntax, file transfer) and being able to declare semantic mappings with ontologies - just try to imagine that! Applications being aware of each other, speaking a common language - but without standard bodies discussing it for years, defining it statically, unmoving.

There is a common theme in the IT history towards more freedom. I don't mean free like in free speech, I mean free like in free will.

That's why we will win.

Dick werden...

Was las mir Schwesterchen schönes vor? (ohne jedoch eine Quelle anzugeben)

"Dick wird man nicht von dem, was man Heiligabend bis Neujahr isst, sondern von dem, was man Neujahr bis Heiligabend isst"

Wahr gesprochen!

Love is...

"Love is what's in the room with you at Christmas if you stop opening presents and listen."

Angeblich aus einer Studie, in der 4- bis 8-jährige Kinder gefragt wurden, was Liebe ist. Herzallerliebst.

Leider konnte ich die Originalstudie nicht finden - vielleicht schafft es jemand anderes ja - aber der Text wird ständig im Netz zitiert: hier, hier hier, hier oder hier, aber auch hier. Google findet über 60.000 Treffer... wahrscheinlich, dass der Text erfunden ist.

Trotzdem schön, ihn zu lesen.

Amazonempfehlung

Ich bin von Amazons Empfehlungen wirklich begeistert. Mal ehrlich - welcher Buchhändler kennt heute noch seine Kunden so gut, dass er so treffende Vorschläge machen kann?

Nur heute war ich etwas verdutzt, weil mir Leicht Verdientes Gold empfohlen wurde. Stimmt, eine wirklich hervorragende Anschaffung. Schließlich habe ich ja mitgeschrieben...

Vier fehlende Posts

Wahrscheinlich hat es ohnehin kein Mensch bemerkt, aber bei der Umstellung auf blogger.com bemerkte ich den Verlust von vier Posts: vom 5. November 2002, vom 5. und 18. Dezember 2002 und vom 23. Januar 2003.

Habe sie wiedergefunden und eingestellt (Sicherheitskopien sind was feines). Dies ist übrigens Post Nummer 175 oder so, es scheint hier keine automatische Methode zu geben, das automatisch nachzuzählen. Gar nicht mal so viele, für knapp vier Jahre - Schwesterchen hat im Schönsten Blog von Welt in einem Monat fast 40 Einträge geschafft! Und des Wahnsinns fette Beute brauchte für etwa 140 Einträge auch weniger als drei Jahre.

Kurz, ich bin ein unregelmäßiger, fauler Blogger. Trotzdem danke für die vielen Besuche! (Zu den Besucherzahlen, der Ehrlichheit halber, unter uns: nakit-arts hat inzwischen mehr regelmäßige Besucher als der Nodix-Blog. Gratuliere Dir, Schwesterchen!)

Wissenswertes über Jamba

Herzallerliebst geschrieben, dazu noch äußerst unterhaltsam, und dennoch aufklärerischer und kritischer Inhalt:

http://spreeblick.de/wp/index.php?p=324

Da freut man sich. Was las ich vor kurzem in einem Telepolis-Interview mit Norbert Bolz gelesen?
"Ich lese am liebsten das ›Streiflicht‹ in der Süddeutschen Zeitung und ›Das Letzte‹ in der Zeit, also Glossen. Diese Glossen haben sehr viel mehr Sprengkraft als die Kommentare von irgendeinem Leitartikler. Solche Texte sind so voraussehbar in ihrer politischen Korrektheit, dass sie mich einfach nur anöden. Über die Form des Witzes lassen sich so manche politischen Informationen und Kritik viel besser vermitteln."

Na, dann ist der obige Link ein Beispiel für die Information der Zukunft.

I am weak

Basically I was working today, instead of doing some stuff I should have finished a week ago for some private activities.

The challenge I posed myself: how semantic can I already get? What tools can I already use? Firefox has some pretty neat extensions, like FOAFer, or the del.icio.us plugin. I'll see if I can work with them, if there's a real payoff. The coolest, somehow semantic plugin I installed is the SearchStatus. It shows me the PageRank and the Alexa rating of the visited site. I think that's really great. It gives me just the first glimpse of what metadata can do in helping being an informed user. The Link Toolbar should be absolutely necessary, but pitily it isn't, as not enough people make us of HTMLs link element the way it is supposed to be used.

Totally unsemantic is the mouse gestures plugin. Nevertheless, I loved those with Opera, and I'm happy to have them back.

Still, there are such neat things like a RDF editor and query engine. Installed it and now I want to see how to work with it... but actually I should go upstairs, clean my room, organise my bills and insurance and doing all this real life stuff...

What's the short message? Get Firefox today and discover its extensions!

Nachholtag

Herrje, fast zwei Wochen nichts gepostet. Immerhin, man kriegt anfragen: ist was? Wieso liest man nichts? Allerlei :)

Zunächst: alles prima, danke der Nachfragen. Außer, dass die DSA4 Website nicht tut, und Andre nicht zu erreichen ist, und ich wahrscheinlich wieder umziehen muss mit der Seite. Ich glaube, jetzt hole ich mir einfach einen eigenen Webspace, der auch PHP kann, besser noch Python, und all das für wenig Geld. Aber da ich faul bin, und mit 1&1 eigentlich zufrieden bin, weiß ich nicht, ob ich richtig Lust auf umziehen habe... na, mal sehen.

So, jetzt noch schnell: die letzten Tage habe ich natürlich auch Filme geschaut, aber statt jetzt selber Kritiken zu schreiben, verweise ich auf diejenigen, die den Film mit mir geschaut haben, und dies selbst gebloggt haben, dahinter nur eine ganz kurze Ein-Satz-Wertung von mir:

  • Anatomie einer Entführung (Brilliante Schauspieler langweilen den Zuschauer gezielt zu Tode)
  • New York Taxi (Lustig, aber nichts, was man nicht verpassen darf; die brasilianischen Bankräuberinnen sehen echt verdammt gut aus)
  • Die Unglaublichen (Schwesterchen / baumgarf (Wow! Unglaublich gute Story, da hat jemand bei Frank Miller geklaut - hinreißend komisch auch!)
  • The Lord of the Rings: Return of the King Special Extended Edition (bester Film des letzten Jahres noch besser gemacht, endlich, komplett! Schwärm. Verpasst die Easter Eggs nicht: ins Menü, Szenenauswahl, letzte Szene, und dann eins runter - ein kleiner Ring erscheint. Auswählen. Sehr, sehr lustig! Klappt auf beiden Film-DVDs)

Imagine there's a revolution...

... and no one is going to it.

This notion sometimes scares me when I think abou the semantic web. What if all this great ideas are just to complex to be implemented? What if it remains an ivory tower dream? But, on the other hand, how much pragmatism can we take without loosing the vision?

And then, again, I see the semantic web working already: it's del.icio.us, it's flickr, it's julie, and there's so much more to come. The big time of the semantic web is yet to come, and I think none of us can really imagine the impact it is going to have. But it will definitively be interesting!

Perspektive Deutschland

Ich mag diese Umfrage, und mache schon seit Jahren mit: ich kann auch nur empfehlen, selber mal die Meinung reinzudrücken. Schirmherr ist Richard von Weizsäcker, den älteren von uns noch bekannt als vielleicht meistrespektiertester Bundespräsidents des letzten Vierteljahrhunderts. Also, mitmachen! Ich glaube, gewinnen kann man auch etwas...

AcceLogiChip

Accelerated logic chips - that would be neat.

The problem with all this OWL stuff is, that it is computationally expensive. Google beats you in speed easily, having some 60.000 PCs or so, but indexing some 8 billion web pages, each with maybe a thousand words. And if you ever tried Googles Desktop Search, you will see they can perform this miracles right on your PC too! (Never mind that there are a dozen tools doing exactly the same stuff Googles Desktop Search does, just better - but hey, they lack the name!)

What does the Semantic Web achieve? Well, ever tried to run a logic inferencing engine with a few million instances? With a highly axiomatized TBox of, let's say, just a few thousand terms? No? You really should.

Sure, our PCs do get faster all the time (thanks to Moores Law!), but is that fast enough? We want to see the Semantic Web up and running not in a few more iterations of Moores Law, but much, much earlier. Why not use the same trick graphic magicians did? Highly specialized accelerated logic chips, things that can do your tableu reasoning in just a fraction of the time needed with your bloated all-purpose-CPU.

Wort des Jahres

Merriam-Webster hat das (englischsprachige) Wort des Jahres bekanntgegeben: Blog.

World Wide Prolog

Today I had an idea - maybe this whole Semantic Web idea is nothing else than a big worldwide Prolog program. It's the AI researchers trying to enter the real world through the W3Cs backdoor...

No, really, think about it: almost all most people do with OWL is actually some logic programing. Declaring subsumptions, predicates, conjunctions, testing for entailment, getting answers out of this - but on a world wide scale. And your browser does the inferencing for you (or maybe the server? Depends on your architecture).

They are still a lot of questions open (and the actual semantic differences between Description Logics, and Logic Programming surely ain't the smalles ones of them), like how to infere anything with contradicting data (something that surely will happen in the World Wide Semantic Web), how to treat dynamics (I'm not sure how to do that without reification in RDF), and much more. Looking forward to see this issues resolved...

Das Vermächtnis der Tempelritter

aus der Reihe Filme in 50 Worten

Es gibt keinen Film der besser geeignet ist als dieser, um die Wartezeit auf Indiana Jones 4 zu verkürzen. Ein sehr flotter Film, überraschend intelligenter Plot, annehmbar recherchierter Hintergrund (na ja, ein paar Fehler könnte man natürlich bemäkeln, aber darüber sehe ich hinweg), äußerst witzige Sprüche - er nimmt sich niemals zu ernst, doch vermeidet es auch, albern zu werden, kurz: gute Action, anschauen. Indy hat mindestens so gut zu werden!

Übrigens, meine Mitkinogänger bloggen, wie ich sehe jetzt auch selbst ganz fleißig ihre Meinung zu den angeschauten Filmen - lest Schwesterchens Blogeintrag zum Vermächtnis der Tempelritter und Buddys Blogeintrag zu Alles auf Zucker, von gestern.

Gnowsis and further

Today, Leo Sauermann of the DFKI was here, presenting his work on Gnowsis. It was really interesting, and though I don't agree with everything he said, I am totally impressed by the working system he presented. It's close to some ideas I had, about a Semantic Operating System Kernel, doing nothing but administrate your RDF data and offering it to any application around via a http-protocol. Well, I guess this idea was just a tat too obvious...

So I installed Gnowsis on my own desktop and play around with it now. I guess the problem is we don't really have roundtrip information yet - i.e., Information I change in one place shall magically be changed everywhere. What Gnowsis does is integrate the data from various sources into one view, that makes a lot of applications easily accessible. Great idea. But roundtripping data integration is definitively what we need: if I change the phone number of a person, I want this change to get propagated to all applications.

So again, differing to Gnowsis I would prefer a RDF store, that actually offers the whole data householding for all applications sitting atop. Applications are nought but a view on your data. Integrating from existing applications is done the Gnowsis way, but after that we leave the common trail. Oh well, as said, really interesting talk.

Alles auf Zucker

aus der Reihe Filme in 50 Worten

Als das Intro kam, und es klar war, dass ein deutscher Film gesneakt wird, standen die ersten auf und verschwanden. Selber schuld.

Zwei Brüder, ein kommunistischer Ossi-Jude und ein orthodoxer, Frankfurter-Banken-Reicher Wessi-Jude müssen sich versöhnen, um an das Erbe der Frau Mutter zu gelangen - nachdem sie seit dem Mauerbau nicht mehr gesprochen hatten. Lustig, skurill, nie übertrieben absurd, und dennoch mit zahlreichen witzigen und manchen tiefschwarzen Gemeinheiten versehen. Kein "Muss man gehsehen haben", aber sicher ein "Sehr nett anzuschauen".

Ab 6. Januar 2005 im Kino.

Der Manchurian Kandidat

aus der Reihe Filme in 50 Worten, heute mal ein Film, der schon läuft

Ein hochkomplizierter Plan soll sicherstellen, dass der mächtigste Mensch der Welt, der nächste Präsident der Vereinigten Staaten, eine willenlose Marionette für einen weltweit agierenden Superkonzern wird. Ein Einzelner - wie immer von Denzel Washington dargestellt - beginnt die Wahrheit zu erahnen und stellt sich gegen die Verschwörung, doch es glaubt ihm Niemand...

Warum auch? Der in den Film ausgeübte Plan ist viel zu kompliziert, kann an viel zu vielen Stellen scheitern, als dass es auch nur ein halbwegs intelligenter Mensch wirklich angehen würde. Die Ziele, die Manchurian erreichen möchte, könnten so viel leichter erreicht werden, mit weit weniger Risiken. Ein recht spannender Thriller, mit einigen sehr netten Überraschungen, ein gutgemachter Film - aber die Grundidee des Plots ist hanebüchend.

Mother philosophy

I should start to write some content on this blog soon, but actually I am still impressed with this technology I am learning here every day...

When the FOIS2004 was approaching, an Italian newspapers published this under the heading "Philosophy - finally useful for something" (or so, my Italian is based on a autodidactic half day course). I found this funny, and totally untrue.

Philosophy always had the bad luck, that every time a certain aspect of it provoced wider attention, this aspect became a discipline of its own. Physics, geometry and mathematics are the classical examples, later on theology, linguistics, anthropology, and then, in the 20th century, logic went this way too. It's like philosophy being the big incubator for new disciplines (you can see that still in the anglo-american tradition of almost all doctors actually being Ph.D.s, philosophical doctors.

Thus this misconception becomes understandable. Now, let's look around - what's the next discipline being born from philosophy? Will it be business ethics? Will it be the philosophy of science, being renamed as scientific managment?

My guess is: due to the fast growing area of the Semantic Web, it will be ontology. Today, the Wikipedia already made two articles on it, ontologies in philosophy and ontologies in computer science. This trend will gain momentum, and even though applied ontology will always feed from the fundamental work done from Socrates until today, it will become a full-fledged discipline of its own.

La Grande seduction

aus der Reihe Filme in 50 Worten

Ein kleines Dorf weit, weit ab vom Schuss, mit etwas mehr als 100 Einwohner (bei Bedarf auch doppelt so viele), stirbt langsam vor sich hin. Als es darum geht, eine Fabrik hier anzusiedeln, fehlt vor allem eines - ein ansässiger Arzt. Durch einen dummen Zufall kommt ein schickimicki Schönheitschirurg in das Dorf, und die Dorfbewohner versuchen alles, davon zu überzeugen, dass dies der beste Ort auf Erden ist...

Sympathisch, lustig, unterhaltsam. Man hätte die Figuren vielleicht noch ein wenig überspitzter erwartet, die ganze Geschichte noch absurder gestaltet, den Humor etwas schwärzer - aber stattdessen blieb der Film stets in einem realistischen, wohligen Rahmen. Sehenswert.

Ab dem 2. Dezember im Kino.

I'm a believer

The Semantic Web is promising quite a lot. Just take a look at the most cited description of the vision of the Semantic Web, written by Tim Berners-Lee and others. Many people are researching on the various aspects of the SemWeb, but in personal discussions I often sense a lack of believing.

I believe in it. I believe it will change the world. It will be a huge step forward to the data integration problem. It will allow many people to have more time to spend on the things they really love to do. It will help people organize their lives. It will make computers seem more intelligent and helpful. It will make the world a better place to live in.

This doesn't mean it will safe the world. It will offer only "nice to have"-features, but then, so many of them you will hardly be able to think of another world. I hardly remember the world how it was before e-Mail came along (I'm not that old yet, mind you). I sometimes can't remember how we went out in the evening without a mobile. That's where I see the SemWeb in 10 years: no one will think it's essential, but you will be amazed when thinking back how you lived without it.

Die McDonalds-Verschwörung

Folgendes basiert nur teilweise auf Fakten. Welche Teile mag der geneigte Leser selbst entscheiden, ist ja schließlich eine freie Welt.

In Stuttgart gab es seit jeher (= seitdem ich mich erinnern kann) eine ganze Menge McDonalds. Allein in der Königsstraße +/- 50 Meter waren es läppische vier. Dahinter steckte der Plan, dass durch die Menge der McDonalds zwar in jedem Einzelnen der Gewinn sinkt, aber wenn bloß jeder Platz, wo möglicherweise ein Burger King einziehen hätte können, durch einen Mc ausgefüllt wird, geht dennoch der ganze Gewinn aus FastFood-Konsum ja an das große gelbe M.

Doch nun erzittern die zahlreichen Mitarbeiter! Es kommt der erste Burger King in die Stuttgarter Innenstadt, und zwar in den Hauptbahnhof. Dort wird schon nach Mitarbeitern geworben, und Gerüchten zufolge würden die Topleute bereits abgeworben...

Exorzist - Der Anfang

aus der Reihe unnötige Filme in 50 Worten

Falls jemand, den ihr nicht sonderlich mögt, heute abend noch nichts vorhat, schickt ihn doch in diesen Film. Na gut, er ist nicht grottig schlecht - wie etwa Open Water oder Honey. Aber ist es nicht bedauerlich, wenn das das Beste bleibt, was über den Film zu sagen ist?
Einige sehr coole Aufnahmen, grobe Schnitzer im Hintergrund, völlig unnötige, blutige Szenen - vielleicht mochte ich den Film ja auch nur nicht, weil ich die Vorgänger nicht kannte. Dämonenaustreibung der effektvollen Art, kein kaltes Grausen (na gut, man ist ja auch in Afrika).

Übrigens, es gibt eine Version des Filmes, die mehr auf Psychologie und subtilen Horror setzt. Der Verleih fand das aber nicht massentauglich genug. Diese Version wird allerdings auch auf DVD erhältlich sein - mal sehen, ob er dadurch wirklich besser wird.

Who am I?

Well, as this being a blog, it will turn out that it is more important what I write than who I am. Just for the context, I nevertheless want to offer a short sketch about my bio.

I studied Computer Science and Philosophy at the University of Stuttgart, Germany. In Computer Science, I thought about Software Architectures, Programming Languages and User Interfaces, and my master thesis happened to be the first package to offer a validating XML parser for the programming language Ada 95.
In Philosophy I started thinking a lot of Justice, especially John Rawls and Plato, but finally I had a strong move to Construcitivst Epistemology and the ontological status of neural networks (both papers are in German and available from my website.

It's a pretty funny thing that next week I will listen to talk on neural networks and ontologies again, and nevertheless my then made paper and the talk won't have too much in common ;-)

Well, so how comes I am working on Semantic Web technologies by now? I have the incredible luck to work in the Knowledge Management Group of the AIFB in Karlsruhe, and there on the EU SEKT Project. I still have a lot to learn, but in the last few weeks I aggregated quite a good grasp on Ontology Engineering, RDF and OWL and some other fields. This is all pretty exicting and amazing and I am looking forward to see what's around the next triple.

Sky Captain and the World of Tomorrow

aus der Reihe Filme in 50 Worten

Ein Film, der so mordsmäßig dick aufträgt und sich derart frech bei allen Klassikern bedient, dass es richtig cool wird. Die Schauspieler spielen ihre Rollen perfekt - ob Gwyneth Paltrow, ob Jude Law, und insbesondere Angelina Jolie.
Man merkt gar nicht, dass der ganze Film - außer den Schauspielern (und selbst von ihnen einer, der schon eine Weile lange nicht mehr lebt - Lawrence Olivier) - aus dem Computer stammt. Der Sprung von Final Fantasy zu Sky Captain ist unglaublich.
Die Story ist banal, und dennoch in ihrer naiven Übertriebenheit extrem spaßig, die Figuren sind klischeehaft wie seit langem nicht und die Sprüche dermaßen knackig, dass es eine Freude ist.

Reingehen!

Auch Schwesterchen hat einen Blog, inzwischen! Hier ihr Kommentar zu Sky Captain and the World of Tomorrow.

Welcome!

Welcome to my new blog! Technology kindly provided by Blogger.com

Ein neuer Blogger

Schon wieder habe ich das Bloggingformat gewechselt - diesmal nichts selbstgeschriebenes, sondern einfach Blogger.com verwendet. Superleicht zu bedienen, bloß der Import der alten Daten war etwas hakelig...

Solltet Ihr irgendwo Fehler entdecken, bitte melden.

Essener Spielemesse, Teil 4

Wegen der Auslandseinsätze komme ich gar nicht mehr dazu, meine Essen Spiele Messe Einträge fertig zu machen. Deswegen heute noch die letzten paar Kommentare, im Schnelldurchlauf...

Der Sohn des Ministerpärsidenten NRW spielt kein DSA, sondern Shadowrun, wie er am FanPro-Stand versicherte ... Piper bringt frecherweise das Rad der Zeit neu raus -- und zwar endlich so in einzelne Bücher aufgeteilt, wie auch das Original es ist. Sprich: nur noch 11 Bänder statt über 30 ... Crystal Caste stellt Würfel aus Edelsteinen her. Ein Würfelset (W4-W20, also sieben Würfel) aus Hämatit 25 Euro, Knochen 40 Euro (aber man sieht, besonders bei den W20, dass sie handgemacht sind), Silber 100 Euro ... Der absolute Oberhammer: Mammutelfenbein, mit Zertifikat: 500 Euro... Mein Flutschi ist geplatzt! ... Das gibt ärger. Laufe am Stand zum Erwachsenenspiel Love Cubes vorbei. Sie: "Schatzi, schau, ein Spiel nur für Liebespaare!" -- Er: "Was?! Und für wen nehmen wir das mit?" ... Apropos Erwachsenenspiele: PornStar. Ist so wie Graverobbers from Outer Sprace, bloß mit Pornos ...Und schließlich: wenn Ihr Spiele für Kinder sucht, geht zu Flum Spiele. Einige echt schöne Ideen dabei.

In Barcelona

T'estimo, Catalunya! Ich liebe Dich, Katalonien. Eine herrliche Gegend -- obwohl ich gar nichts davon gesehen habe. Wir kamen Mittwoch in Barcelona an und fuhren direkt zur Universität, die eine eigene Stadt 20 km außerhalb bildet. Dies beruht auf General Francos Versuch, die Intelligenz aus den Städten Spaniens zu verbannen und so für Ruhe zu sorgen. Heute ging es dann direkt von der Uni wieder zurück zum Flughafen.

Wie dem auch sei, eine herrliche Zeit in Katalonien verbracht. Und warum liebe ich es so?

  1. Das Wetter. Heute morgen 20 Grad, Sonnenschein den ganzen Tag, herrlich.
  2. Das Essen. Lange Zeit mied ich die spanische Küche, weil ich in meinem bescheidenen Irrglauben dachte, sie sei wie die mexikanische, und ich die mexikanische seitdem ich sie in St Louis versuchte nicht mehr mag. Tatsächlich war das Essen ausgezeichnet.
  3. Die Landschaft. Zwischen Meer und Gebirge. Wunderschön.

Ich muss da mal dringend privat hin. Vielleicht sehe ich sogar was von Barcelona...

Bad Santa

Brrrr... eiskalt hier im Lande.
Nettes Detail am Rande: wegen eines Problems mit einem Mailserver kam ich tatsächlich noch vor einer von mir verschickten eMail wieder in Deutschland an. Ts, moderne Technik halt... Ach ja, aus der Reihe Filme in 50 Worten

Billy Bob Thornton spielt einen äußerst heruntergekommenen Weihnachtsmann. Er trinkt, säuft, vögelt, klaut und benutzt vor Kindern schlimme Wörter. Und dann trifft er eine hübsche Frau, die noch an den Weihnachtsmann glaubt, und er ändert sich...

...kein bisschen.
Ein Film, der angenehmerweise bis zum Ende durchgehend böse bleibt und nicht irgendwann in friedliches Weihnachtstrallala übergeht -- anschauen, anschauen, anschauen!

Ladenschlussträume

Ich moechte ein Land, wo ich um 21:15 in eine Buchhandlung laufen kann, um mich umzuschauen. Ich moechte ein Land, in dem ich abends nicht in die Stadt hetzen muss, weil die Laeden zu sind.
Ach, halt. In so einem Land bin ich ja!
Also, umformulieren: ich moechte, dass auch endlich in Deutschland das Ladenschlussgesetz gekippt wird. Mehr Freiheit fuer alle!

Und ehrlich: ich glaube nicht, dass es der Untergang des Abendlandes waere, wenn neben den Taxifahrern, Hoteliers, Krankenschwestern, Kellnern, Reportern, Journalisten, Aerzten, Schaffnern, Busfahrern, Feuerwehrmaennern, Polizisten, Bauern, Schichtarbeitern, Webmastern, Freischaffenden, Schluesseldiensten, Telefonsexanbietern, Selbststaendigen, Pfarrern und vielen anderen auch die Einzelhandelsfachverkaeufer Sonntags und abends arbeiten wuerden.

Verkehr in Turin

Ich habe mir Sorgen darueber gemacht, wie Stuttgart unter dem Verkehr kollabieren wuerde, wenn es die Olympischen Spiele tatsaechlich nach Stuttgart geschafft haetten.

Diese Sorgen sind vollkommen ausgestanden: die Winterolympiade 2006 findet in Turin statt, einer wirklich schoenen Stadt, der ich dazu nur beglueckwuenschen kann -- aber wie auch immer die Stadt das verkehrstechnisch ueberstehen will, ist mir ein vollkommenes Rätsel.
Dafuer aber ist die Gastronomie schon darauf vorbereitet: so schnell bedient wie hier in den Restaurants wurde ich noch nirgendwo auf der Welt, sehr beeindruckend.

Meldung aus Turin

Diesmal nicht von der Essener Spielemesse, auch wenn ich dazu noch ein, zwei Sachen schreiben werde... Nein, ich bin in Turin. Ich besuche die FOIS 2004, eine Konferenz ueber formale Ontologien in der Informatik, der ich mit Spannung entgegensehe. Wer mich kennt, weiss, dass ich sowohl Informatik als auch Philosophie studierte, und umso spannender finde ich die Schnittstelle zwischen den beiden -- wie etwa hier.
Dazu waehrend dem Verlauf mehr.
Turin selbst ist insbesondere regnerisch. Und was mir auffiel: mein Zug vom Flughafen in die Stadt hatte satte 45 Minuten Verspaetung. Und das schien hier vollkommen gewoehnlich zu sein -- niemand aergerte sich auch nur! Die anderen Zuege hatten bis zu anderthalb Stunden Verspaetung, kein einziger schien puenktlich.

Allerdings: die Zuege fahren im Stunden- oder Halbstundentakt. Da stellt man sich die Frage, ob der Zug ueberhauipt noch als verspaetet angesehen wird, oder ob man ihn einfach als den naechsten Zug betrachtet?

Essener Spielemesse, Teil 3

So, mein erster Eintrag, den ich gemütlich im Bett liegend mache -- WLAN, Laptop, und den neuen Nodix Blogger sei Dank! :)
Aber auf, mehr zur Essener Spielemesse, wie es aussieht hört das mit den Messe-Skurilla gar nicht mehr auf:
Wie stets beliebt waren Add-Ons und besondere Editionen von bereits erfolgreichen Spielen.

So gab es etwa von AbacusSpiele eine neue Edition des wirklich genialen Anno Domini: die Anno Domini Spiel des Jahres Edition! Dabei ist Anno Domini nicht etwa Spiel des Jahres geworden, und das wurde mit einer besonderen Edition gefeiert, nein, vielmehr sind es Ereignisse, die mit den Themen der Spiele des Jahres zu tun haben, z.B. die UNESCO erklärt den Palast von Alhambra zum Weltkulturerbe der Menschheit. Sehr nette Idee.

Ein wenig nach Geldmacherei sah es bei den Siedlern von Catan aus: nachdem ja das Buch zu dem Spiel Siedler von Catan erschienen ist, erscheint jetzt das Spiel zu dem Buch zu dem Spiel. Ach, Fans kaufen alles... (dazu bald mehr, wenn ich von der neuen Ausgabe des Liber Cantiones schreibe)

Und weiter mit den Sondereditionen: bekanntlich ist dem Herr der Ringe ein bescheidener Erfolg zuteil geworden (Plätze 2, 8 und 9 der besten Filme aller Zeiten, Plätze 2, 5 und 9 nach Einspielergebnis, insgesamt läppische 2,9 Mrd Dollar). Es ist kein Wunder, dass die Filme auch ein paar Spiele inspirierten -- und soferns die Ideen inspirativ waren, habe ich auch nichts dagegen! (Man denke an die genialen Computerspiele von Electronic Arts, wow) Aber nachdem bereits ein Herr der Ringe Risiko offenbar als (sehr coole) Geldmacherei enttarnt wurde, wie nötig war es da noch ein Herr der Ringe Monopoly oder gar ein Herr der Ringe Labyrinth rauszubringen??
Obwohl, ein Herr der Ringe Monopoly?
Fast schon wieder lustig:

Frodo: "Eine 7. Ach nö, schon wieder der Schicksalsberg..."
Sauron: "Gehört mir. Verflixt, habe ich da immer noch kein Hotel gebaut?"
Sam: "Herr Frodo muss wieder ungemütlich auf dem Boden schlafen. Wieviel nehmt Ihr Schurke denn Herrn Frodo dafür ab?"
Sauron: "Sind dann 300 Goldstücke."
Frodo: "Hab ich nicht mehr, verdammt. Und auf Beutelsend ist auch schon eine Hypothek drauf"
Bilbo: "So gehst Du mit meinen Sachen um..."
Frodo: "Hey, Sauron, nimmst Du auch den Ring als Pfand?"

Oder so ähnlich... Genug für heute, es gibt noch mehr Messeimpressionen, bald.

Essener Spielemesse, Teil 2

Wie versprochen, mehr zur Essener Spielemesse:

Wie arrogant sind die Briten eigentlich? Nun gut, sie reisen extra für die Spielemesse auf den Kontinent - aber dafür schaffen sie es nicht, ihre Preise in Euro auszuzeichnen. "All prices in pounds. Please convert." Hätten sie wenigstens den Umrechnungskurs hingeschrieben...

Ich fragte mich auch, was die ganzen Schmuckhändler da wollten -- nun gut, Umsatz wahrscheinlich. Spieler kaufen Schmuck, das war deutlich zu sehen. Aber warum ein Heiligenmedaillon mit dem Bild von General Suharto?

Mein absoluter Favorit in der Klasse Werbesprüche kam von Logika Spiele:
Junge, Alte, Mädchen, Buben
Spielen gern mit Pentakuben

Es scheint sich ein Trend abzuzeichnen: nachdem früher Brettspiele zu Computerspielen verarbeitet wurden, tritt man heute den umgekehrten Weg an.
So passierte es schon vor längerem mit Sid Meiers Civilization, so geschieht es nun mit Age of Mythology. Besonders herauszuheben ist hierbei das Doom Brettspiel. Wir kennen ja alle die hohe Komplexität der Vorlage, und freuen uns auf eine spannende Partie Doom auf dem Brett!
Es ist halt ein Space Quest Klon, ein wenig aufgepeppt (nette Figuren). Ihr kennt Space Quest nicht? War ein eher mau erfolgreicher SpinOff des zeitlosen Hero Quest.

Es gab noch mehr, nächstes Mal wieder...

Essener Spielemesse, Teil 1

Die Essener Spielemesse ist vorbei, und es hat wie stets die Beine und den Geldbeutel strapaziert, aber es gab auch allerlei Neues und Skurilles zu entdecken.

Die Neuigkeiten mag jeder im Web finden, wo er möchte, ich konzentriere mich lieber auf die Skurilla. Wie etwa einen zweieinhalb Meter langen Kicker, an dem pro Seite acht Reihen Fußballer sind. Wir spielten es in zwei Mannschaften, vier gegen vier.
Auch spannend war der lebensgroße Kicker, in dem man Menschen reinstellt, die sich an der Stange halten mussten.
Japaner, die offensichtlich unterbeschäftigt waren, und deswegen im radebrechenden Deutsch und Englisch auf ihren Stand DIN A4 Zettel klebten, dass sie extra aus Japan hierhergekommen seien, um ihre Spiele vorzustellen. Also wollten sie auch nach ihren Spielen gefragt werden.
An einem Stand lief ich am Sonntag vorbei, und der war weg. Nur ein ausgedruckter Zettel "Wir sind weg. Alles ausverkauft. 700 Stück!" Leider erwähnte der Zettel in keinster weise, was da eigentlich verkauft wurde.
Es gibt Trading Card Games zu so ziemlich allem: Six Feet Under, Buffy, 24, Shrek 2, AVP...

Genug für heute, morgen mehr.

Captain Kirk will ins All

Das ist doch mal eine Schlagzeile, wie sie sein muss.

William Shatner, am Bekanntesten für seine Rolle als James T. Kirk in Star Trek, hat bei SpaceShipOne nachgefragt, wann er denn als Tourist in den Weltraum starten könnte.
An der Stelle von SpaceShipOne würde ich ihn für den Selbstkostenpreis hochfliegen.

Meine Arbeit

An was forsche ich eigentlich?
Zunächst, eingestellt bin ich am AIFB, dem Institut für Angewandte Informatik und formale Beschreibungssprachen, der Universität Karlsruhe.
Hier wird schon seit Jahren am Semantic Web geforscht, und das Institut hat sich bereits einen hervorragenden Namen hierbei gemacht.

Der bekannteste Verfechter des Semantic Webs dürfte unzweifelhaft Sir Tim Berners-Lee sein, der, noch gar nicht so lange her, das World Wide Web ins Rollen brachte.

Was nun aber ist das Semantic Web?
Inzwischen habe ich ein paar grobe Ideen, und in den kommenden Blog-Einträgen werde ich versuchen mich der Antwort zu nähern. Natürlich sollen aber auch weiterhin die sonstigen Themen von Nodix nicht kürzer kommen als bisher. Aber eben mit ein wenig semantischer Würze.

Auswärts bloggen

Es funktionier wirklich. Ich schreibe hier meinen ersten Blog-Eintrag von "nicht daheim".

Übrigens bin ich immer überrascht, wie manche Leute denken. Das erkennt man an so Kleinigkeiten wie meinem Handymenü. Wenn ich es auf lautlos stelle, dann deaktiviere ich nicht etwa das Klingeln, sondern aktiviere vielmehr die Stille.
Eigentlich, ein sehr schöner Gedanke. Manchmal sollten wir alle ein wenig Stille aktivieren.

Vor allem im Kino.

Der neue Pythonblogger

Wow, es funktioniert! Jetzt kann ich auch bloggen, wenn ich nicht zu Hause bin -- was gerade in den letzten Wochen allzu häufig der Fall war. Dadurch sollte diese Seite endlich wieder ein wenig aufleben, ich hoffe sehr, dass es klappt.

Das Problem war ja, dass dies ein sehr einfacher Webspace ist, und darin keine Scripte erlaubt sind. Also besteht die ganze Seite aus statischen HTML-Seiten. Es gibt keine Webschnittstelle, wie es der normale Blogger gewohnt ist, wo er einfach von überall aus seinen Eintrag machen kann.
Jetzt also habe ich einen Weg erstellt, dies zu umgehen. Der Nodix-Blogger ist ein Python-Skript, dass zunächst den derzeitigen Stand von der Website bezieht, dann den neuen Eintrag aufnimmt, die Seiten generiert und dann alles hochlädt.
Python wird es auf jeder Linux- und Mac-Kiste geben, lediglich bei Windows ist das noch nicht so verbreitet. Vielleicht mach ich ja noch ein Windows-Executable daraus, dann kann ich es wirklich von überall nutzen.

Ich weiß nicht, wie verbreitet dieses Problem ist: sollten sich noch drei Leute per Mail bei mir melden, dass sie dieses Werkzeug auch gerne hätten, werde ich es auch veröffentlichen.

Also jetzt: bis bald wieder!

Neuer Blogger

Ich teste eine neue Version des Nodix-Bloggers.

Der ist von überall verwendbar, wo Python läuft. Also auch von der Arbeit, wenn ich mal einen schnellen Eintrag machen möchte.

Lebenssituation

Nur zu euer aller Update: ich habe inzwischen mein Studium der Philosophie und Informatik abgeschlossen und habe jetzt eine Stelle an der Fakultät für Wirtschaftswissenschaften, an der ich - so alles gut geht - den Doktor in Politk machen werde (eine Stelle, über die ich übrigens wirklich irre glücklich bin).

Im Ernst.

Übrigens, ich werde am Wochenende auf dem RatCon sein, wer mich dort treffen will, schreibe mir eine kurze eMail.

Meine Familie

Mitschnitt eines Gespräches zwischen meiner Mutter und Schwesterchen:
"Und sag, wo ist Dein Bruder?"
"Zur AOK und dann zur Polizei."
"Was!? Die AOK hat Samstags offen?"

Da kann man doch sehr leich einen falschen Eindruck erhalten, oder?

Back again!

"Ein wenig Schmerz hat noch keinem weh getan"

Ich bin wieder zurück! Noch bin ich beim eMail-Beantworten, beim Überblick-verschaffen, beim in-neue-Lebenssituation-eingewöhnen und so fort. Schon bald gibt es mehr.
Danke jetzt schon für eure unglaubliche Treue! Ihr seid unglaublich!

N24 überzeugt

12 May 2004

Soeben wurde ich von der Qualität des Nachrichtensenders N24 restlos überzeugt:
"Wenn die Erde zu beben beginnt, tritt die furchtbarste aller Naturgewalten zu Tage: Erdbeben!"

Bitte, liebe(r) Autor(in), der/die Du dafür verantwortlich bist: kündige, und such' Dir einen Job, den Du mit mehr Leidenschaft machen willst.
Für eine bessere Zukunft.

Vergiss Mein Nicht

11 May 2004

Mit den letzten paar Einträgen kam ich mir ja schon richtig nölig vor. Immer nur Kritik an den ganzen, aufwändig gemachten Filmen. Heute aber der Beweis, dass dies an den Filmen und nicht an mir lag (und ganz sicher nicht am besser gewordenen Wetter), ein weiterer, sehr erfreulicher Beitrag aus der Reihe Filme in 50 Worten, ein Film, wo auch die letzten zwei Worte reichen würden

Traumhafter Film über die Liebe, die Erinnerung, vor allem an die Liebe, und wie wir die Realität konstruieren. Ich darf nichts über die Story verraten, weil ich niemanden den Spaß rauben möchte - und sehr, sehr hoffe, dass ihn jeder sieht! Würde ich mich heute erneut in meinen Partner verlieben, wenn ich ihn erst jetzt kennenlernen würde? Gibt es sowas wie den Anderen, der für jemanden bestimmt ist? Wer findet, dass Kirsten Dunst, in Unterwäsche rumhopsend, verdammt süß aussieht?
Dieser Film zeigt mal wieder, was mit dem Medium Film alles machbar ist. Es geschehen Sachen, die in keinem anderen Medium möglich sind - kein Wunder also, dass nicht das Buch zum Film verkauft wird, sondern das Drehbuch zum Film.
Von dem Autor von Being John Malkovich und Adaption kommt jetzt ein Film, der mit Fug und Recht in deren Fußstapfen treten kann. Bisher der beste Film des Jahres (und das sage ich trotz Big Fish).

Kurz: unbedingt anschauen!

Van Helsing

10 May 2004

aus der Reihe Filme in 50 Worten

Der vielleicht berechenbarste Film des Jahres: Kate Beckinsale in einem Vampirfilm (hat aber im Gegensatz zu Underworld die Seiten gewechselt). Sie verliebt sich in Einen, der zum Werwolf wird (genau wie in Underworld), gespielt von Hugh Jackman (ein Wolf, wie in X-Men), der zuvor noch einen riesenhaften Mr Hyde erledigt (dargestellt wie in der Liga der außergewöhnlichen Gentlemen, aus der auch die Idee genommen wurde, alles zusammenzuwerfen, dessen Copyright schon abgelaufen ist). Ihnen hilft Faramir (wie im Herrn der Ringe), ein Q des 19. Jahrhunderts (James Bond), und der Bösewicht ist der Duke (wie in Moulin Rouge). Van Helsing ist hinter Dracula her, Frankensteins Monster ist eigentlich ganz lieb, und Vampire und Werwölfe bekämpfen sich. Gemacht von dem Team von der Mumie, untermalt von einem Soundtrack von Alan Silvestri (Forrest Gump) und etwa ein Dutzend Deus Ex Machinas.
Kurz: ein Film 100%ig nach Rezept. Ordentlich gekocht, aber etwas fad.

Hätte der Film nicht wenigstens eine Überraschung haben können? Wenigstens einen Hauch von Originalität? Er wirkt, als wäre er von vorne bis hinten vom Marketingleuten entworfen, die alles zusammengeklaut haben (die Liste oben ist bei weitem nicht vollständig), was irgendwie erfolgreich war in den letzten paar Jahren. Und wer kam auf die alberne Idee, den Prolog aus dem 19. Jahrhundert schwarz-weiß zu machen - die Handlung aber, die den immensen Zeitraum von einem Jahr später (immer noch 19. Jahrhundert) stattfindet, in knallbunten Farben?

2LDK

noch eine Preview für die Reihe Filme in 50 Worten

Japanische Zicken beim Extremmobbing.
Mit kleinen Sticheleien fängt es an, mit etwas größeren hört es auf.
Merke: stecke nie zwei Frauen, die denselben Job wollen, über Nacht in eine Wohnung.

Einer der krankesten Filme des Jahres. Blutiger als Kill Bill 2. Mir hat er Spaß gemacht - aber ich würde ihn nie, niemals!, weiterempfehlen.

Alt, neu, geliehen und blau

Danke! Eine Achtelmillion Besucher konnten seit Bestehen von Nodix verzeichnet werden. Wow! Als dankeschön wieder eine Preview in der Reihe Filme in 50 Worten

Sie wird übermorgen heiraten. Doch das Schicksal spielt verworrene Spiele: der Freund ihrer Schwester, der diese vor zwei Jahren einfach sitzen ließ, und wegen dem die Schwester seitdem wegen Depression in der Geschlossenen sitzt, taucht überraschend auf, und durch die Wirrungen der Ereignisse schlucken sie am Tag davor etwas Tiefblaues, dass sie auf einen ziemlichen Trip wirft. Die Schwester hingegen sieht ständig ziemlich zusammengeprügelte Sänger, ganz ohne die Pille. Und dann wird es leicht durcheinander.

Ähh, ein dänischer Dogma-Film halt. Ich habe nicht prinzipiell was gegen Dogma, aber ich möchte endlich einen sehen, der seine Story ernst nimmt. Wie meistens verliert sich der Film in der Darstellung zahlreicher, lebensnaher, für den Plot vollkommen irrelevanter Details. Er hat durchaus Charme, und auch mal Witz, aber die Liebesgeschichte kommt völlig unglaubwürdig daher, und durch das vollständige Fehlen von Filmmusik (abgesehen von Mettes Nur-in-ihrem-Kopf-Sängern) gelingt es dem Film nicht, die Emotionen entsprechend zu transportieren.
Ein guter Beweis dafür, dass nicht jeder kleine, unabhängige Film auch gut ist.

Kleeblatt Versand

Heute erhielt Schwesterchen einen Brief mit einer "Letzten Anweisung für Ihren Gewinn-Abruf". Ihre Gewinnnummer hatte nämlich Glück und brachte ihre einen "Anteils-Gewinn aus der großen 1.618,00 Euro Super-Jackpot-Vergabe"! Toll, oder? 1618 Euro! Da freut man sich. Besonders, wenn man nicht mitgespielt hat.
Wie stets ist der Teufel im Detail: schauen wir mal genauer nach, was hier der Kleeblatt Versand aus Graz durch die Bundesrepublik schickt.
Zunächst mal ist das ganze nicht durch einen Notar, sondern durch einen offiziellen Rechtsberater bestätigt. Nein, nicht, dass ich glaube, der Kleeblatt Verlag würde hier lügen und betrügen, und deswegen nicht einmal einen richtigen Notar hinzuziehen - beileibe nicht! Eher, dass ein Notar wohl zu teuer gewesen wäre, und die ganze Aktion ja - wie es in den Regeln ausdrücklich steht (die in unleserlichen Majuskeln in Hellgrau auf der Rückseite des zweiten Blattes stehen) - "zum Zweck der Gewinnung von neuen Kunden und Interessierten durchgeführt", und die Aktion natürlich am Besten so billig wie möglich sein soll.

"Wie aber?", fragt man sich. "Ich habe doch gewonnen! Ich brauche nur diese 0190er Nummer anrufen, und ich kriege meinen Gewinn!" - ja, schon. Dieser Gewinn aber ist ein Anteil der 1618 Euro. Und der Anruf geht an eine 0190er Nummer, die 1,86 Euro die Minute kostet. Lassen wir den Anruf, sagen wir mal, sechs Minuten dauern (ist nicht lang: Buchstabieren des Namens, der Adresse, es wird nach ein paar Daten gefragt, die im Brief verteilt sind, vielleicht am Anfang noch ein wenig - ebenso zahlungspflichtige, musikuntermalte "Bitte Warten"-Schleifen - und prompt sind 10 Euro unterwegs.
"10 Euro? Das ist doch OK, wenn man 1618 Euro bekommt. Ich meine, die habe ich doch sicher!" Falsch! Man hat einen Anteils-Gewinn an den 1618 Euro erhalten. Das heißt, wenn nur 1000 Leute aus ganz Deutschland anrufen - eine lächerlich kleine Zahl bei einer so überzeugend wirkenden Gewinnmitteilung - bekommt jeder 1,61 Euro. Das heißt, jeder - bis auf den Kleeblatt-Versand. Der kassierte durch die Anrufe über 10.000 Euro.

"Aber wenn das jetzt bekannt ist, dann rufen doch weniger an, und mein Anteil wächst" - oder alternativ "Der macht das doch nur, um seinen eigenen Anteil zu erhöhen, und will, dass wir alle nicht anrufen!" Hier kämen wir in ein klassisches Problem der Spieltheorie: machen wenige mit, ist der Gewinn des Einzelnen größer, aber wenn viele mitmachen etc. Dazu gibt es ausreichend theoretische Texte, ich will hier nicht weiter darauf eingehen.
Zumal das hier nicht zutrifft: ich kann mit Nodix gar nicht die Massen erreichen, welche diesen Gewinn-Abruf erhalten, dazu habe ich schlicht nicht die Reichweite. Ein paar hundert Leser am Tag, das ist nicht das Ding.
Zweitens: ich werde nicht teilnehmen (weil ich ja gar keine Einladung erhalten habe) und ich werde auch Schwesterchen davon abraten. Ich schreibe das also nicht aus persönlichem Interesse, zumal, wie eben gesagt, dass auch nicht wirklich helfen würde.

Alles, was ich mit diesem Beitrag wollte, ist auf die schiere Frechheit zu verweisen, mit der hier Gutgläubigkeit ausgeschlachtet wird. Der Kleeblatt Versand missbraucht hier, um an bestätigte Adressen zu gelangen - nichts Anderes ist ihr Ziel - die Naivität der Leute. Dies führt zu mehr Misstrauen in der Welt. Nein, es ist kein Betrug, glaube ich - ich bin kein Rechtsanwalt - weil alle Informationen ja korrekt da stehen. Sie sind aber irreführend, und ich bin mir sicher, dass jeder Teilnehmer mit einem wesentlich größeren Gewinnanteil rechnet, als den, den er erhalten wird.
Natürlich, wie stets gilt: "Der Rechtsweg ist ausgeschlossen. Teilnehmer oder Dritte können nicht auf die Zuteilung eines bestimmten Gewinnes klagen. Jeder Teilnehmer bestätigt mit seiner Teilnahme, dass er die Teilnahmebedingungen gelesen und verstanden hat, anerkennt und akzeptiert."
Wozu also noch beschweren?

Ich würde mich freuen, wenn ihr diesen Beitrag verlinkt, wenn ihr die Möglichkeit habt. Auch dürft ihr ihn kopieren und sinnerhaltend kürzen und widergeben, solange mein Name oder ein Link hierher angegeben ist. Auf so einen Schmarrn - wie man in Graz sagen würde - sollte man deutlich hinweisen!
Danke.


Ursprünglich auf Nodix von Denny veröffentlicht.

Vorhergehender Beitrag:
Nachfolgender Beitrag:


Lautlos

aus der Reihe Filme in 50 Worten

Ein Serienkiller. Ein absoluter Profi, der seit seinem 10. Lebensjahr mordet. Der keine Spuren hinterlässt. Der sich dummerweise in eine im Schlaf summende Frau verliebt, und ihr dann nicht nur nicht das Leben nimmt, sondern auch noch das ihre rettet, damit sie ihm das seine retten kann. Doch die Polizei ist hinter ihnen her, darunter auch ein ausgebuffter, mit geradezu übersinnlich erscheinenden Fähigkeiten ausgestatteter Kommisar, der immer die richtige Theorie tippt. Wird er ein neues Leben beginnen können, oder wird ihn seine Vergangenheit einholen?

Joachim Krol spielt den emotionslosen Killer perfekt. Selbst wenn er sich verliebt, ist er vollkommen überzeugend emotionslos. Die Vorschau auf diesen Film hat so gewirkt, als ob nach Lola rennt, Goodbye Lenin oder Liegen Lernen mal wieder ein überraschend guter deutscher Film ins Kino rollt, der Germanikus ungeschehen macht. Nö. Ich meine, er ist besser als Germanikus, ja, aber das sagt mehr über Germanikus als über Lautlos. Die Liebesgeschichte scheint vollkommen unmotiviert, und außer ein halbes Dutzend guter Dialogzeilen wird in dem Film gar nichts geboten. Lautlos ist Aktionslos, Spannungslos, Herzschmerzlos und Humorlos. Das witzigste an dem Film ist noch das nach Specht klingende Instrument des Soundtracks. Also, nicht dass er grottig schlecht ist. Aber ich freue mich doch sehr, dass es eine kostenlose Preview des Filmes war, und nicht etwa eine Samstagabendfastzehneuroteurevorstellung.
Nee, Liebesgeschichte und Polizeiarbeit vermengt zur Zeit das schon laufende Taking Lives wesentlich besser als Lautlos.


Ursprünglich auf Nodix von Denny veröffentlicht.

Vorhergehender Beitrag:
Nachfolgender Beitrag:


The girl next door

Wieder mal in der Sneak Preview gewesen, und deswegen, gleich für die werten Leser, ein weiterer Beitrag in der überraschend beliebten Reihe Filme in 50 Worten

Ein Junge in der Abschlussklasse, der nicht zu denen dazugehört, die In sind. Nebenan zieht ein wunderschönes Mädchen ein, die auch noch mit ihm ausgeht. Ob mangelnder Alternative verliebt er sich unsterblich in sie - und erfährt dann, dass sie ein Pornostar ist!
Wie in Liebeskomödien üblich, stellt der davor so besonnene Held der Geschichte eine unvorstellbare Anzahl immer größer werdender Dummheiten an, um sie zu gewinnen, und um sich aus dem Schlamassel zu befreien, in das er sich dadurch bringt. Dies gelingt natürlich nicht, bietet dem Zuschauer aber viel Gelegenheit zu lachen.

Überraschend lustiger Film, dem es gelingt, trotz des schlüpfrigen Themas nicht selber ins Schlüpfrige abzugleiten, mit einigen interessanten Ideen und Darstellungen derselben. Dieser Film ist zwar kein Muss (wie der zuletzt rezensierte), aber man er ist mehr als OK. Eine Liebeskomödie, die es schafft, tatsächlich auf das Lustige abzuzielen, statt auf die Tränendrüse. Lasst die Taschentücher zu Hause (Achtung! Ob des Themas war der Spruch zweideutig!)

Marketing

Das nenn ich mal Cross-Marketing par excellence: passend zum aktuellen Film von Mel Gibson gibt es jetzt sogar das dazugehörige Getränk, Fanta Citrell Passion Fruit.
Und wenn wir schon bei dem Film sind: bei so einem Erfolg muss es doch auch eine Fortsetzung geben. Wie wäre es mit Christ 2: Resurrection oder Christ II - Judgement Day?
Nur so Ideen...

Apropos, ich habe die Passion immer noch nicht gesehen. Niemand in Stuttgart, der rein möchte? Melde Dich!

Kill Bill Vol. 2

Huch. Gestern wollte ich eigentlich in Kill Bill Vol. 1 - stattdessen rutschte ich in Kill Bill Vol. 2. Macht nichts, dacht ich mir, habe ich hier halt im Rahmen der Filme in 50 Worten, voller stolz...

Ein Film, wo ich von der Story noch weniger verraten darf als sonst, sonst werde ich womöglich geköpft. Und das ist bei den potenziellen Zuschauern dieses Filmes nicht aus der Luft gegriffen! Wer Teil 1 geliebt hat, der wird auch Teil 2 mögen! Megacoole Actionszenen. Heiße Musik. Wunderschöne Aufnahmen. Geradlinger (für einen Tarantino, meine ich) Handlungsbogen. Schnelle Duelle. Was den Inhalt angeht: nun ja, der Titel verrät eigentlich schon ziemlich alles...

Kurz: wer Teil 1 gesehen hat, weiß eh, ob er hier reingeht oder nicht. Wozu also noch groß rezensieren?
Das einzig überraschende mag sein, dass der Film - ein zweiter Teil, wohlgemerkt - ab 16 Jahren freigegeben ist, während der erste ab 18 war.

Taking Lives

Ach, wie peinlich. Im gestrigen Eintrag waren gleich zwei (nunmehr korrigierte) Fehler. Die Hauptdarstellerin ist nich Judd (eigentlich Jude) Law sondern Ashley Judd, und der bessere Film des Genres heißt nicht Taking Lifes sondern Taking Lives. Danke an Kai und Ralf für die Hinweise! Als kleine Gegenleistung gibt es genau diesen Film als weitere Folge aus der Reihe Filme in 50 Worten

Wollten wir nicht schon immer mal jemand anderes sein? Wo wir uns aber nicht trauen, lebt der Schurke dieses Films diesen Traum. Dummerweise heißt die Konsequenz, dass er den anderen Personen zuvor das Leben nehmen muss, um dasselbe zu Leben. Nicht überraschend folgt daraus eine Mordserie, und gleich nach dem ersten entdeckten Mord ruft der kanadische Polizeichef das FBI um Hilfe, und das schickt Lara Croft, Angelina Jolie, die mit Vorliebe in Gräbern liegt und Leichen zum Frühstück betrachtet.
Die kanadische Polizei bestätigt mit ihren Fähigkeiten im Folgenden mehrfach die Entscheidung des Polizeichefs mit ihrer erschütternden Inkompetenz bei Hausbewachung und Personenschutz. Ein klein wenig wird sich Freud freuen, weil wir herausfinden, dass die Mutter des Täters an allem Schuld trägt und...

Ach, man kann gar nicht viel mehr erzählen, ohne die Pointe zu verraten (die man aber ohnehin die ganze Zeit erahnt). Ethan Hawke und Angelina Jolie biten ein glaubwürdiges Team, Kiefer Sutherland hat hingegen keine exponierte Rolle, egal, wie groß sein Name auf den Plakaten steht. Die Storyautoren haben sich einiges einfallen lassen, aber eigentlich das Beste am Film sind die ersten paar Minuten. Alles sonstige läuft mit einer gewissen Konsequenz ab, die kaum Raum für echte Überraschungen lässt.
Kurz: solider Film für alle, die das Genre mögen. Für alle anderen: warum Taking Lives, wenn doch Big Fish läuft?

Twisted

Und ich dachte, nach dem Studiumende werde ich mehr Zeit haben... ganz im Gegenteil, alles ist viel chaotischer geworden.
Wie auch immer, nur damit diese Seite nicht allzusehr verwaist wirkt, und um zu beweisen, dass ich noch da bin, hier ein weiterer Hammerbeitrag aus der Reihe Filme in 50 Worten

Ashley Judd wird Inspektorin in der Mordkomission, doch da macht sie sich schnell viele Feinde. Ihrer erster Job entpuppt sich gleich als eine Mordserie, wobei ihr im Nachhinein auffällt, dass sie mit allen Opfern geschlafen hat. Auch wenn ihr sonstiger Lebensstil eher dazu veranlassen würde, dass sie sich über diesen Zufälligkeit keine Gedanken zu machen bräuchte, da sie ja mit jedem ins Bett hüpft, macht sie sich kurz Gedanken und ersäuft diese dann erfolgreich mit Alkohol.
Schließlich hält sie sich selber für übergeschnappt, und beim ach so überraschenden Ende kommt wieder heraus, wer der tatsächliche Täter ist. Untermalt wird das ganze von einer alten, drohend guckenden Chinesin.

Ein deutlich unterdurchschnittlicher Kriminalfilm. Man hätte einen schönen Psychothriller aus der Grundidee machen können, doch dann wandelt er sich zum sehr konventionellen Krimi-Thriller. Die Schauspieler (neben Ashley Judd Andy Garcia und Samuel L. Jackson) haben wir alle schon mehr leisten sehen, deswegen ist es bedauerlich, dass ihre doch recht ansehnlichen Namen hierfür werben. Mit Taking Lives gibt es zur Zeit eine deutlich bessere Alternative dieses Genres im Kino.

Philosophie bestanden

Hallo, hier geht es vielleicht chaotisch zu! Da denkt man noch, dass mit den letzten Prüfungen endlich Ruhe einkehrt, aber ich habe noch nichtmal die Zeit gefunden, meine Bücher wieder zu sortieren. Der erste Tag, wo ich ein wenig Zeit gehabt hätte, wäre vorgestern gewesen - aber an einem 1. April Nachrichten zu verkünden ist immer so gewagt...

Also, zunächst, damit es alle wissen (Danke für die vielen Nachfragen, übrigens!) - ja, die Prüfungen sind vorbei, und ich habe sie bestanden! Die mündliche mit einer 1,3, und die beiden Hausarbeiten, die sich auch hier auf der Website einsehen lassen, gar jeweils mit einer 1,0 - sehr zufriedenstellend!
Damit ist mein Studium beendet und ich bin jetzt ganz hochoffiziell Diplom-Informatiker! Das Zeugnis jedoch scheint noch ein wenig Zeit zu brauchen, da meine Diplomarbeit selbst immer noch keine Note hat - ach ja, immer das Selbe. Und jetzt kommt die spannende Suche nach dem richtigen Beruf...
Wünscht mir viel Glück! Und wer von einer interessanten Stelle erfährt, darf es mir gerne flüstern.

Erste Bewerbung

Wie peinlich: die erste Bewerbung ist raus, und da befindet sich diese Website in einem so desolaten, unordentlichen Zustand (und natürlich verweise ich auf Nodix, wenn ich mich bewerbe). Und dennoch, wie bereits geschrieben, am Freitag und Montag ist (sind? - es ist ja nur eine Prüfung, die aber in zwei Teilen) die Prüfung, im Moment habe ich einfach nicht die Zeit, an Nodix weiterzuarbeiten.
Tja, dann kann ich nur hoffen, dass die Inhalte von Nodix überzeugend sind, auch mit der derzeitigen, nicht ganz optimalen Navigationsstruktur...

Apropos Inhalte: heute gibt es auch endlich neue Inhalte! In der Abteilung Philosophie stehen ab sofort meiine beiden letzten Hausarbeiten zum Lesen bereit. Viel Spaß damit! Auf Kommentare und heiße Diskussionen bin ich natürlich sehr gespannt (und bitte nicht wundern, wenn ich erst Dienstag antworte!)

Abschlussprüfung

So, die nächsten Tage muss ich büffeln. Am Freitag ist meine schriftliche Abschlussprüfung (zum Thema Gerechtigkeit), und zwar morgens von 9:30 bis 12:30. Am Montag ist dann der mündliche Teil der Prüfung dran, mit den Themen Zweifel, Trost und Utilitarismuskritik, und zwar um 14:00, eine halbe Stunde lang etwa.

Dies für alle, die mir die Daumen drücken wollen.

Pagerank hoch

So, endlich wieder weiter an Nodix gebastelt. Nachdem ich gestern ganz fleißig war, und an meiner Hausarbeit konzentriert gearbeitet habe, gönnte ich mir zum Abschluss des Tages noch ein wenig an Nodix zu basteln. Hierbei wurde die erste Ebene der neuen Hierarchie endlich auch eingeführt. Heißt das, es gibt irgendwelche Änderungen?
Tatsächlich, nein. Immer noch werdet ihr beim Navigieren durch Nodix furchtbar verwirrt durch die Gegend irren, tut mir leid. Erst wenn alles fertig ist, wird sich Nodix wieder angenehm lesen lassen, angenehmer als je zuvor.

An dieser Stelle vielen Dank an Google, die mir heute das Google-Ranking von 4 auf 5/10 verbesserten! Die ständige Arbeit an Nodix scheint sich bezahlt zu machen.

Gothika

aus der Reihe Filme in 50 Worten

Halle Berry ist Psychiaterin und glaubt nur an rational erklärbare Sachverhalte und ihren Patienten deswegen kein Wort. Wie für so einen Menschen in Hollywoodfilmen üblich, passieren ihr umgehend rational nicht erklärbare Sachverhalte, weswegen sie eingewiesen wird. Ach ja, auch weil sie ihren Mann mit einer Axt in kleine, handliche Portionen aufgeteilt hat. Sie besteht darauf, nicht verrückt sondern nur besessen zu sein. Robert Downey Jr., ehemaliger Arbeitskollege, ist jetzt ihr Psychiater und glaubt ihr kein Wort. Genug zur Story, der Rest derselben ist ebenso klischeehaft wie das oben bereits Erzählte. Es gibt genau eine Schrecksekunde im Film und keine überraschenden Wendungen. Leider reiht sich der Film nicht in die interessante Sparte der Psycho-Thriller ein, wie etwa Fight Club, A beautiful mind oder Identity. Umso bedauerlicher, da der Anfang genau das Potenzial dazu erkennen lässt. Dummerweise gleitet er dann in eine übernatürliche Geistergeschichte ab, und schnell verliert man das Interesse am weiteren Verlauf der Geschichte, weil man sie eigentlich schon ein paar Dutzend Mal gesehen hat. Das Beste an der zweiten Hälfte des Films ist der Abspann mit dem Limp Biskit Cover von Behind Blue Eyes.

Drumline

aus der Reihe Filme in 50 Worten (Frage: gehe ich eigentlich zu häufig ins Kino?)

Eine Geschichte, von der wir uns fragen, warum sie nicht schon lange erzählt wurde: ein junger und überaus talentierter Trommler bekommt darob ein Stipendium für die renommierte A&T, wo er unbedingt in den Spielmannszug aufgenommen werden will. Härteste Trainingsbedingungen, ein Disziplin verlangender Drum Major, seine eigene Arroganz und eine völlig unmotivierte Liebesgeschichte mit dem ersten Mädchen, dass er auf dem Campus anspricht - wer hätte gedacht, dass Trommler für Spielmannszüge so hart rangenommen werden, bis sie endlich ihre Lektion gelernt haben. "Da draußen hört Dich keiner, da draußen hört man nur die Band!"

Sehr coole Trommeleinlagen, für die sich ein THX-Kino oder eine wirklich gute Anlage lohnen. Ansonsten erschien mir der Film eher lustig, weil die Leute dass alles so furchtbar ernst nahmen (ich weiß, wäre ich da drin, würde ich ganz anders empfinden).

Definitiv eine unverbrauchte Geschichte, frische Schauspieler, tolle Trommeln, wenig Überraschungen.

Big Fish

aus der Reihe Filme in 50 Worten

Ein Vater hat die Angewohnheit, bei seinen Geschichten stets zu lügen. Alle lieben ihn dafür, bloß sein Sohn versucht, die Wahrheit über ihn herauszufinden, und begibt sich dabei auf eine Reise durch die Vergangenheit seines Vaters, die ihn erkennen lässt, dass nicht alles, was eine Lüge zu sein scheint, auch wirklich gelogen ist...

Es ist ein wunderschöner Film, der einen viel zu Lachen bringt, auch zum Nachdenken, und auch zum Lächeln (nicht jeder Film, der einen Lachen macht, bringt einen auch zum Lächeln). Der Film ist bunt, bilderfroh, skurill, genau so wie wir es von Tim Burton kennen und lieben.

Anschauen. Nicht verpassen. Unbedingt anschauen.

Jubiläum für den Papst

Mit dem heutigen Tag überschreitet die Dauer des Pontifikats von Johannes Paul II. das von Leo XIII. Hier meine persönlichen Gratulationen nach Vatikanstadt!

Es gibt hierbei die Frage, ob JP2 damit zum zweit- oder zum drittlängst regierenden Papst wird.Pius IX. regierte 31 Jahre und ist damit mit Sicherheit vor JP2, aber was ist mit dem heiligen Petrus? Abgesehen davon, dass weder das Datum seines Todes bekannt ist (zwischen 64 und 67) noch das Jahr, in dem er zum Fels der Kirche wurde (um 30?), weiß man nicht einmal, ob man Petrus überhaupt als Papst zählen kann. Sich selber hätte er mit Sicherheit nicht so bezeichnet.

Dies bloß zum Hintergrund. Es soll also nicht überraschen, wenn es dieser Tage Meldungen gibt, die JP2 sowohl zum zweit- wie auch zum drittlängstregierenden Papst ernennen.Viel amüsanter ist jedoch die Lösung, die Associated Press veröffentlicht hat (und die an zahlreichen Stellen widergegeben wurde):
Now, Pope John Paul II has added another milestone to his career, passing Pope Leo the 13th to become the third longest-serving pontiff in the history of the Roman Catholic Church.
The longest pontificate -- almost 32 years -- was that of the 19th-century Pope Pius the Ninth.

Liebe Associated Press, wer war dann Nummer 2? Vertraue nie den Nachrichten.

Unterwegs nach Cold Mountain

aus der Reihe Filme in 50 Worten

Der amerikanische Bürgerkrieg ist ein ganz schlimmer Krieg, in dem beide Seiten unmenschliche Gräueltaten verursachen (ach). Besonders schlimm ist es für Jude Law und Nicole Kidman, die sich eben erst unsterblich ineinander verliebt haben (besonders hilfreich dabei war, dass sie nie die Gelegenheit hatten, sich kennenzulernen - eine hervorragende Voraussetzung für schmachtende Liebe). Er muss an die Front, sie bleibt daheim, um auf die Farm aufzupassen - nicht, dass sie Ahnung davon hätte... Beide müssen Prüfungen überstehen (er muss Menschen töten, fliehen und Natalie Portman verlassen, nachdem sie zusammen geschlafen haben; sie muss kochen lernen), um dann, im großen Finale -

Halt, ich habe ja versprochen, nicht alles zu erzählen, in diesen Zusammenfassungen. Wer das unglaublich überraschende Ende wissen möchte, soll ins Kino gehen. Ein paar nette Sprüche weist der Film durchaus auf, und er ist gar nicht so kitschig, wie alle vermuten - die Schlachtszene am Anfang ist durchaus ein würdiger Herausforderer für den Soldaten James Ryan , aber insgesamt ist er doch ein paar Spuren zu vorhersagbar, und hält sich zu sehr an Klischees und den klassischen Aufbau einer Geschichte.

Kurz: er hat genausoviele Oscars erhalten, wie er verdient hat. Vielleicht einen mehr oder weniger, aber das war's auch schon.

Timeline

Nachdem ich vor ein paar Tagen über eine Website stolperte, in der es darum ging, Bücher in 50 Worten zusammenzufassen (lest selbst), dachte ich mir, oh, gute Idee, mache ich auch.
Nur halt nicht mit Büchern, sondern mit Filmen, die ich vor kurzem im Kino gesehen habe. Und natürlich nicht so, dass ich alles verrate.

Ein defektes, dafür aber überdimensioniertes Faxgerät schickt eine Gruppe von Archäologen und drei Marines 600 Jahre in die Vergangenheit. Die Marines sterben alle, und dabei geht das defekte Faxgerät kaputt. Gefangen im 100jährigen Krieg (wo zum Glück alle Englisch oder Französisch des 20. Jahrhunderts sprechen) hauen und stechen die Archäologen um ihr Überleben, wohl wissend, dass sie nur 6 Stunden Zeit haben (warum auch immer). In der Gegenwart haben die Leute auch nur 6 Stunden, um das kaputte defekte Faxgerät zu reparieren. Mensch, eine Zeitmaschine wäre jetzt echt praktisch...

Spaßiger und netter Actionfilm, man hätte aber von dem Autor von Jurassic Park mehr erwartet. Die physikalischen und manche historischen Grundlagen werden vollkommen ignoriert.

Macht einen Heidenspaß, wenn man jedesmal, wenn einer der Begriffe "600 Jahre" oder "14. Jahrhundert" oder "1357" fallen, einen Schluck trinken muss (je alkoholreicher, desto spaßiger).

Legolas ist gar nicht so toll

Wer erlegt in dem Film "Der Herr der Ringe Teil 3: Die Rückkehr des Königs" eigenhändig einen Olifanten?

Genau, Éowyn - sie braucht dafür etwa 10 Sekunden, und Éomer, der in etwa derselben Zeit sogar zwei Stück auf einmal erledigt.
Ach ja, da wäre noch Legolas, der ungefähr zwei Minuten lang auf einem Olifanten rumhüpft, während dieser weiter durch die eigenen Kampfesgefährten rast, bis er ihn endlich zu Fall bringt.

Was lernen wir daraus? Es kommt weniger darauf an, was man erreicht hat, als darauf, wie man es verkauft.

Schnitt!

Dass dem Chirurgen notfalls Gesundheit vor einem Tattoo geht, ist ja nur verständlich. Dass durch den Schnitt des Chirurgen ab und an Tattoos beschädigt werden, leider unvermeidbar.

Dumm nur, wenn das Tattoo die Worte I like women bildete, und das wo durch den Schnitt verloren geht... (aus der Zeit)

Bescheidenheit

Don't be humble, you are not that great. - G. Meir

Vernunft

Wenn wir von einem neuen Versuch hören, Vernunft oder Sprache oder das freie Entscheiden naturalistisch zu erklären, so sollten wir darauf reagieren, wie auf die Kunde, jemandem sei die Quadratur des Kreises gelungen oder der Beweis, dass die Wurzel von 2 eine rationale Zahl ist. Es ist nur eine gelinde Neugier am Platze: Wie gut ist der Fehlschluss versteckt worden? - Peter Geach

Nachricht vom Nodix-Team

Ist das nicht herzallerliebst? Heute früh erreichte mich folgende e-Mail:

Dear user, the management of Nodix.de mailing system wants to let you know that,
Your e-mail account has been temporary disabled because of unauthorized access.
For details see the attach.
Best wishes,
The Nodix.de team http://www.nodix.de

Wie dumm, dass ich hinter Nodix.de stecke. Mein e-Mail-Account tut jedenfalls wie bisher, und ich habe ihn auch nicht vorübergehend deaktiviert. Natürlich befand sich im Attachment ein Virus, denselbigen ich nicht geöffnet habe. Die eMail kam ja von staff@nodix.de (eine, in diesem Fall zufällig nicht existierende Adresse), und das sah durchaus vernünftig aus, deswegen eine Warnung an alle Leser:
die Rücksendeadressen von eMails können unglaublich einfach gefälscht werden. Nur weil die eMail von jemandem kommt, dem ihr traut, heißt das noch lange nicht, dass Ihr der eMail bedenkenlos trauen dürft! Bei unabgesprochen zugesandten Attachments einfach nachfragen!

Oscars - die Auswertung

Na, ich habe doch gestern gar nicht so schlecht gelegen. Von 20 Tipps lagen 17 richtig, zudem stimmten meine Voraussagen bei allen wichtigen Oscars mit den Tatsachen überein. Spitze! Also, nächstes Jahr hier wieder vorbeischneien, wenn ich wieder mit prophetischen Gaben die Oscars prophezeie...Ich meine, zwei meiner drei Fehltipps habe ich auch nur dem Gedanken zu verdanken "Der Herr der Ringe kann ja nicht alle Nominierungen verwirklichen!"

Er kann.

Oscars

So, heute Nacht gibt es endlich wieder Oscars. Es war ein tolles Kinojahr! Und hier kommen meine persönlichen Favoriten, damit ich auch morgen noch kraftvoll blamieren kann (in Klammern findet Ihr, sofern es sich von meinem Favoriten unterscheidet, den Kandidaten, den ich gewählt hätte, und den ich am liebsten gewinnen sähe - dummerweise unterscheidet er sich von dem Kandidaten, den ich am wahrscheinlichsten halte):

  • Nicht-adaptiertes Drehbuch: Lost in Translation
  • Adaptiertes Drehbuch: The Lord of the Rings: The Return of the King
  • Visuelle Effekte: The Lord of the Rings: The Return of the King
  • Sound mixing: The Lord of the Rings: The Return of the King
  • Tonschnitt: Pirates of the Caribbean: The Curse of the Black Pearl (Finding Nemo)
  • Animierter Kurzfilm: zwar nix gesehen, dennoch tippe ich auf Boundin' und hoffe auf Destino
  • Bester Song: Annie Lennox für Into the West in The Lord of the Rings: The Return of the King
  • Filmmusik: The Lord of the Rings: The Return of the King
  • Makeup: The Lord of the Rings: The Return of the King
  • Schnitt: City of God
  • Kostüme: The Last Samurai (The Lord of the Rings: The Return of the King)
  • Cinematography: Master and Commander: The Far Side of the World (City of God)
  • Art direction: The Lord of the Rings: The Return of the King
  • Bester Zeichentrickfilm: Finding Nemo
  • Weibliche Nebenrolle: Renee Zellweger in Cold Mountain (Marcia Gay Harden in Mystic River)
  • Weibliche Hauptrolle: Charlize Theron in Monster (Diane Keaton in Something's Gotta Give - hier soll nochmal erwähnt werden, dass Keisha Castle-Hughes' Nominierung für Whale Rider einfach unglaublich ist! Schade, dass sie keine Chancen zu gewinnen hat)
  • Männliche Nebenrolle: Tim Robbins in Mystic River (Benicio Del Toro in 21 Grams)
  • Männliche Hauptrolle: Sean Penn in Mystic River (Johnny Depp in Pirates of the Caribbean: The Curse of the Black Pearl, ganz ganz knapp vor dem brillianten Bill Murray in Lost in Translation)
  • Regie: Peter Jackson für The Lord of the Rings: The Return of the King (wobei in auch Fernando Meirelles für City of God und Sofia Coppola für Lost in Translation mehr als verdient hätten!)
  • Bester Film: The Lord of the Rings: The Return of the King (Schade für Lost in Translation, doch selbst in der Klammer würde The Lord of the Rings: The Return of the King stehen. Warum konnte Miss Coppola nicht einfach ein Jahr warten mit ihrem Film?)

Zu ein paar Kategorien (Best documentary short subject, Best documentary feature, Best live action short film, Best foreign language film of the year) kann ich nichts sagen, weil ich die Filme nicht kenne. Von den Kategorien oben hingegen kenne ich die meisten Kandidaten (bis auf Seabiscuit und Cold Mountain). Also, immerhin ein informierter Tip!
So, wer mir vor heute Nacht noch eine eMail mit seinen Tipps schickt (oder eh hier vorbeikommt, um sich die lange Oscarnacht reinzuziehen ;), mit dem wette ich gerne. Um ein Ü-Ei oder alternativ eine Erdnuss, je nach Wunsch.

Risiko Alter

Die Bundesregierung hat ein kleines Heftchen veröffentlicht, mit dem zauberhaften Titel "agenda 2010 - Deutschland bewegt sich". Ich hatte mir es heute, aus Interesse und auch aus mangelnder Alternative in der Straßenbahn durchgelesen, und fand einiges Interessant. Herrje, wieviel Geld des erarbeiteten Einkommens gar nicht an mich gehen wird! Als unbedarfter Student ahnt man das ja gar nicht, ich werde hoffentlich bald persönlich in den Genuss kommen, die Begriffe Brutto und Netto mit Bedeutung zu füllen.
Aber darum geht es nicht. Auch nicht darum, dass einige sehr nette Ideen drin waren, und einige fehlleitende Informationen und offensichtlich verfälschende Statistiken. Das ist ja üblich, und man sollte das gewohnt sein. Viel erschreckender jedoch führte mir ein kleiner Satz im Glossar die Einstellung der Bundesregierung vor Augen:
"Soziale Sicherungssysteme [...] Damit sind alle Versicherten solidarisch gegen die großen Risiken Alter und Erwerbsminderung [...] abgesichert."
Das Risiko Alter? Auf der Suche nach einer Quellenangabe im Netz stieß ich auf über 300 Treffer, die über das Risiko Alter sprechen.

Meine sehr geehrte Bundesregierung, liebe Herren und Damen Versicherer, lasst Euch hiermit deutlich gesagt sein, dass das Alter meiner bescheidenen Meinung nach kein Risiko ist, sondern vielmehr Ergebnis des offensichtlichen Meisterns zahlreicher vorhergehender Risiken. Mir ist durchaus bewusst, dass für einen Versicherer wie auch für die Bundesregierung, die ja unter anderem für eine Sicherung des Rentensystems zuständig ist, das Alter der einzelnen Personen durchaus ein Risiko ist, doch glauben sie mir, wir, als Alternde, wünschen selten mit so offensichtlichem Zynismus und unüblicher Ehrlichkeit darauf hingewiesen zu werden.
Auf alle, die das Risiko Alter bereit sind einzugehen! (- und das nicht nur ob der zweifelhaften Alternative)

Umbau von Nodix

So, ich habe doch versprochen, dass Nodix noch umgebaut wird. Dies ist ein weiterer Schritt, die Titelseite sieht doch damit schon wesentlich aufgeräumter aus. Von den inneren Seiten ist noch nichts umgestellt, aber hier seht ihr schon mal, wie es etwa aussehen wird in Zukunft.

Nodix selbst wird einerseits eine Art Dachseite für die restlichen Seiten der Nodix-Gruppe darstellen, andererseits wird es insbesondere noch diesen Blog beinhalten, den ihr gerade lest, und schließlich noch eine Reihe von Texten und Essays aus meiner Feder.

Mit den vielen, unübersichtlichen Ebenen von früher ist es damit aus. Alle Inhalte sollen mit bis zu zwei Klicks erreichbar werden. Wie üblich, freue ich mich über Kommentare.

Fun with Google

Ich habe soeben eine lustige Art entdeckt, wie man sich über andere lustig machen kann. Und Google hilft dabei! Gebt einfach ein beliebiges, falsch geschriebenes Wort bei Google ein, und schaut nach, wieviele Deppen heute im Internet publizieren, und nicht einmal korrekt schreiben können. Nicht, dass Nodix fehlerfrei wäre, aber dann mache man sich halt auch über Nodix lustig...
Es ist ja fast bedauerlich, dass es keine Grammatikalische Suchmaschine gibt, damit man nach lustigen Zeiten und ähnlichen suchen kann.

Hier schonmal ein erster Fund: "Las mich bitte nicht alleinIch habe Angst durch die wircklichkeit zu wegetieren". Der Autor droht auch noch an, jedem, der interesse hat an seinen Gedichten, weitere zu schicken...

Wer mag, kann mir ja weitere Fundstücke schicken. Dabei stellt sich die Frage, ob man inzwischen "nähmlich" schreiben darf... über 40,000 Belege für diese Schreibweise sollten eigentlich zumindest zum Nachdenken über eine weitere Rechtschreibreform animieren.

Hausarbeit fertig

So, meine Philosophie-Hausarbeit zum ontologischen Status Neuronaler Netze ist endlich abgeschlossen. Bald wird sie auch auf Nodix zur Verfügung gestellt werden. Noch befindet sich Nodix ja im Umbau, ich hoffe aber, dass noch bevor der 100.000. Besucher diese Website beehrt, ich mit dem Umbau weitgehend fertig bin.

An dieser Stelle eine Entschuldigung an meine anderen Projekte, insbesondere das DSA4 Werkzeug und XML4Ada95. Studium und Jobsuche haben zur Zeit Priorität. Sobald hier alles geregelt ist, bin ich wieder mit dem alten und vielem neuen Elan zurück!

Wenn mir jemand einen Job hat, würde das ganze natürlich arg beschleunigt werden ;) Meldet Euch!

Frohen Valentinstag

Einen frohen Valentinstag!
Ich weiß, ich weiß, letztes Jahr wurde die Seite noch auf Rosa umgestellt, zur Feier des Tages, aber seien wir ehrlich: schöner wurde sie dadurch trotzdem nicht. Genau genommen erfuhr ich damals einige der stärksten Reaktionen auf Nodix, und das von Leuten, von denen ich nie erwartet hätte, dass sie diese Seiten überhaupt lesen...

Was ist das besondere am Valentinstag? Es ist einer der wenigen Feiertage im Jahr, die ich wirklich mag. Einerseits, er ist nicht so kommerziell überfrachtet wie etwa Weihnachten oder Ostern. Ja, natürlich wird er auch kommerziell genutzt, aber eben nicht so wahnsinnig überladen wie manch anderer Feiertag. Und was ist schon dabei, eine Rose zu verschenken? Einer schönen Frau zu verraten, dass man ab und zu an sie denkt? Oder, wenn man gar das Glück hat, tatsächlich verliebt zu sein, sich die Liebe zu gestehen. Natürlich, dazu braucht man keinen im Kalender festgemachten Tag - aber schaden tut er auch nicht... außerdem gewährt er eine gewisse Narrenfreiheit. Es ist das eine, einer Unbekannten an einem schönen Valentinssamstag eine Blume zu schenken, und etwas anderes, dies an einem regnerischen Septemberabend zu machen.

Ein Gedanke, der mich heute umtrieb, war die Vergänglichkeit der Geschenke. Bitte, bitte, fangt nie an, Diamanten am Valentinstags zu schenken! Es müssen Blumen sein, die bald verwelken, Naschwerk, welches noch bälder verschwinden wird, nur noch eine süße Erinnerung... ist das nicht viel passender für den Festtag der Liebe, als irgendetwas ewig haltbares, kristallklares, kaltes? Genießt den Augenblick, das Vergängliche, die Erinnerung...

Wer war eigentlich dieser Sankt Valentin? Wie üblich gibt die Wikipedia hier Auskunft. Interessant, dass die katholische Kirche den Valentinstag 1969 offiziell abschaffte, weil der zugrundeliegende Heilige wohl eher legendären Ursprungs gewesen ist - ein weiterer Grund, warum ausgerechnet dieser Tag für etwas so Vergängliches so schön passend ist: selbst der Ursprung ist vergessen, vergangen, nur noch Legende...

Wem das ganze nicht passt: heute ist auch der Namenstag des Heiligen Kyrill, genau derjenige, der das kyrillische Alphabet erfunden hat. Wer also eher auf hochgeistiges steht, kann ja hier passender gedenken...

100.000 Besucher!

Heute wurde die Marke von 100.000 Besuchern überschritten! Danke, danke Euch allen!
Nodix, eine kleine, private Website, konnte in nur 3 Jahren und 2 Monaten 100.000 Besucher anlocken - wobei die zweite Hälfte dieser 100.000 allein in den letzten fünfeinhalb Monaten zusammenkam!
Das größenwahnsinnige, letztes Jahr gestartete Projekt 100.000 hat sein Ziel damit um nur 2 Monate und 2 Tage verfehlt, und heute konnte es endlich erfüllt werden. Und wem habe ich das zu verdanken?

Euch! Den Lesern! Jedem Einzelnen von Euch! Danke, vielen Dank für Eure Treue, und umso mehr tut es mir leid, mich hier ständig für meine mangelnde Zeit entschuldigen zu müssen. Ich glaube ganz fest daran, dass, sobald das Studium abgeschlossen und ich eine Stelle habe, ja, dann wird mein ganzes Leben wieder einen geregelteren Gang gehen, und dann werde ich auch regelmäßig die Zeit finden, Euch hier mit Updates zu beglücken.
Bloß eines kann ich Euch nicht versprechen: wann das sein wird.

Gestern desweiteren in der Sneak gewesen: Das Urteil, Verfilmung eines Justizthrillers von John Grisham. Klasse Besetzung (Dustin Hoffman, Gene Hackman, John Cusack, Rachel Weisz), grandiose Story, spannender Film. Wer Justizthrillern auch nur ein wenig was abgewinnen kann, wird hierin sicher seine Freude haben. Lohnt sich allemal!

Ein Film sie alle zu finden

11 Oscar-Nominierungen für den Herrn der Ringe Teil 3 - Die Rückkehr des Königs!

Status

Viel zu wenig Zeit... sorry, für die Inaktivität auf Nodix selbst. Ich bin gerade heftig daran, meine Hausarbeit über den ontologischen Status Neuronaler Netze zu schreiben - hochinteressant! Wer mir dazu Fragen beantworten kann, soll sich bitte mal bei mir melden, ich knacke da noch an einigen Nüssen, zu denen ich keine Literatur finde...
Sobald sie fertig ist, wird auch diese Arbeit hier auftauchen. Auch die noch übriggebliebenen Nodix-Rubriken werden demnächst weichen - Nodix wird zu einer reinen Sammlung von Essays, Aufsätzen und sonstigen Texten aus meiner Feder (na ja, eher Tastatur). Die sonstigen Themen werden je mehr als erfolgreich von den restlichen Seiten der Nodix-Gruppe abgedeckt.

Hier nur zu den Webcomics:
nutkidz erscheinen wieder halbwegs regelmäßig (freu)! Ohne zuviel versprechen zu wollen, zielen wir wieder auf eine wöchentliche Erscheinungsweise, wahrscheinlich Ende der Woche jeweils. Und es geht rund: aus den Logs entnehme ich, dass alleine die nutkidz inzwsichen etwa 300-400 regelmäßige Leser anziehen! Hurra! Auch gibt es für die nutkidz endlich ein Gästebuch! Schwesterchen und ich würden uns sehr über Eure Einräge freuen!
Something*Positive auf Deutsch ist hervorragend angelaufen, und dank der Hilfe von Buddy konnten wir auch - trotz meiner mehrtätigen Abwesenheit - weiterhin täglich einen neuen Comic hochfahren! Klasse, und hier nochmal Dank, auch und gerade für die Übersetzungen.

Hier fehlt noch ein Bild.

Und, noch zum Abschluss: hier die Besuche der letzten Woche. Als ich die Statistik das erste Mal sah, überlegte ich mir, ob mir mein Provider nicht vielleicht irgendetwas mitteilen möchte... :)

Happy Birthday, Nodix

Heute wird Nodix 3 Jahre alt!

Am 14. Januar 2001 ging Nodix online - damals mit etwa einem halben Dutzend Seiten. Übrigens exakt ein Tag vor der Wikipedia, die allerdings - das muss man neidlos zugestehen - leicht größer geworden ist... :)

Heute besteht die Nodix-Gruppe aus mehr als 500 Seiten. Im ersten Jahr konnten wir 2.000 Besucher willkommen heißen, im zweiten Jahr waren es weitere 20.000, und im letzten Jahr konnten wir diese Zahl nochmal mehr als verdoppeln und über 44.000 Besucher fanden ihren Weg nach Nodix - Doppelt soviel wie in den beiden Jahren zuvor! Hier das ganze schön grafisch aufbereitet (die Anzahl von Requests auf den Webspace)

Hier fehlt noch ein Bild.

Die Spitze war erreicht kurz bevor die Seiten zum DSA4 Werkzeug umgezogen waren. Seitdem haben wir einen deutlich ruhigeren Verlauf - es ist eigentlich sehr amüsant für mich zu beobachten, da ich nahezu jede Spitze und die meisten Täler erklären kann, woran das lag, was da passierte, usw. Ich dachte mir, vielleicht findet Ihr es ja interessant....

Apropos DSA4 Werkzeug: in diesen drei Jahren Nodix sind bislang vier eigenständige Seiten hier entsprungen - alle im letzten Jahr. Das DSA4 Werkzeug, XML4Ada95, die nutkidz und natürlich nakit-arts. Auf allen diesen Seiten wurden heute Updates gefahren: das DSA4 Werkzeug erhielt eine neue Nachricht über deren eigene Zukunft, und ein kurzen Überblick über die Stärken und Schwächen des Programms; XML4Ada95 erhielt nach langer Zeit endlich eine News, in welcher wir auf den neuesten Stand gebracht werden bezüglich des Projekts; ein neues Bild ist in Schwesterchens Galerie zu finden; ein neuer nutkidz ist auch da...

Und letzterer enthält einen subtilen Hinweis auf die große Überraschung: eine weitere Website wird hiermit in der Nodix-Gruppe willkommen geheißen! Something*positive ist die deutsche Übersetzung von Randy Milhollands Webcomic, und täglich werden wir die Truppe um Davan, Peejee und Aubrey nun durch ihre Abenteuer begleiten können. Nehmt euch in Acht: s*p ist böse und gemein. Und ich mein es! Wer so etwas nicht mag, soll lieber nicht hinklicken davon lassen... alle anderen, herzlich Willkommen in einer neuen, ganz und gar sarkastischen Welt.

Da es auf der s*p-Website noch nicht steht, danke an Buddy für die Hilfe bei der Übersetzung. Wer übrigens nicht genug Gemeinheiten hat, der mag einen Blick auf das neue Editorial von baumgarf werfen. Wer auf Skurilla und Blasphemie steht, wird dieses Editorial lieben...

Und die Umstrukturierung der eigentlichen Nodix-Seite ("Rest-Nodix"? :) ist ja noch nicht einmal beendet, und dabei wurde sie schon heftig verkleinert. Die Bedienung wird stark vereinfacht, und der Inhalt leichter zugänglich werden - aber bleibt einfach weiterhin Nodix treu, und ihr werdet es erleben.Tja, also - willkommen im vierten Jahr Nodix!, und 66.000 Mal Danke! Ich glaube, uns stehen interessante Zeiten bevor.

Mozilla und nakit-arts

Ui, seit einer Woche Ruhe von meiner Seite - aber das Glück wird nicht lange anhalten! Zunächst mal hier einen ganz herzlichen Dank an die wunderschöne Lena, die mich darauf hinwies, dass die nakit-arts-Website unter Mozilla nicht funktionierte. Wie peinlich - da rede ich soviel von Open Source, und dann teste ich die Seite nicht einmal mit dem wichtigsten Open Source Browser...

Die Ruhe ist auch eher eine Ruhe vor dem Sturm: der dritte Geburtstag von Nodix kommt näher. Nur noch 5 Tage, und dann kommt die große Überraschung! Und ich kann sagen, freut euch darauf.

Weniger freuen tue ich mich darüber, dass Sourceforge in letzter Zeit so viele Schwierigkeiten hat - zur Zeit kann ich nichtmal auf die aktuelle Downloadadresse verweisen, und die Benutzer ärgern sich mit einem veralteten File herum... gnn! Vielleicht zieht die DSA4 Werkzeug-Website deswegen auch vollständig um - bloß wohin? Es hat nicht zufällig jemand zuviel Webspace und -traffic über, den man entsprechend konfigurieren könnte, oder? :(

Nun mit Impressum

Und weiter am Nodix rumbasteln. Nun erfährt, endlich dem deutschen Gesetz gemäß, jede Seite auch einen Link zum Impressum. Damit sollte ich zumindest von dieser Art von Abmahnungen geschützt sein... :) Übrigens, willkommen in 2004! Ich hoffe, ihr hattet alle einen guten Rutsch, und kommt langsam wieder zu Kräften. Viel Spaß jedenfalls im neuen Jahr!, mit oder auch, wenn es denn sein muss, ohne Nodix...

Blick in die Kristallkugel

So, von hier aus einen guten Rutsch in das neue 2004!Wie überall üblich zu Jahresenden soll auch hier eine Vorschau stehen, wenn auch nur eine ganz kurze.Hierbei werde ich gar nicht über Nodix reden, da dieses selber ja bald Geburtstag feiert, und dafür ist das viel besser geeignet.

Da ich mindestens so gut wie 95% der Astrologen bin, werde ich also eine Vorschau auf 2004 wagen (eine Rückschau wäre zuviel Recherche - begnadet derjenige, der hierbei den Gag erkannt hat).

Dieses Jahr wird es, entgegen dem Trend, der sich fast bis an den Anfang des Jahrtausends zurückverfolgen lässt, keinen Herr der Ringe-Film geben.Alle Süchtigen müssen sich mit etwa einer Dreiviertelstunde neuer Szenen im Herbst begnügen, danach heißt es, sich auf drastische Entzugserscheinungen einzustellen.Als Metadon der Fantasyfans kann man die Realverflimung zum Letzten Einhorn betrachten, welches uns Ende des Jahres erfreuen sollte.Bedauerlicherweise erscheint der Film aber nicht rechtzeitig :( Dies wird frühestens im März bestätigt werden.
Die guten Nachrichten für alle Fantasyfans sind, dass der Herr der Ringe natürlich wieder Oscars einstreicht, zudem erreichen uns mehr Infos zur Verfilmung des Narniazyklus.

Auf Softwareseite ist Ultima X zu erwarten!Und zur allgemeinen Überraschung - bedenkt man, die Verschiebungsorgien von Ultima IX - erscheint es auch tatsächtlich 2004. Duke Nukem Forever hingegen erscheint nicht.

Die Amerikaner beweisen gegen Ende des Jahres, dass sie das Zählen geübt haben.Selbst Florida schafft es im ersten Anlauf, diese verantwortungsvolle Aufgabe zu meistern.Wiederholt werden die Angebote der Russen und Chinesen, Wahlbeobachter zu schicken, abgelehnt.

Aber solche allgemeinen Informationen interessieren eh Niemanden.Was wir einzig wissen wollen ist doch, was uns nächstes Jahr widerfährt.
Also, blicken wir tief in die Kristallkugel...
Das neue Jahr wird 366 Tage anbieten, die allesamt voller neuer Gelegenheiten sind.Die Tarotkarten verraten mir, dass wir jeden einzelnen Tag neue Fähigkeiten aneignen dürfen, interessanten Personen begegnen werden - wir müssen sie nur erkennen und ansprechen -, mit neuen Ideen jonglieren, und endlich dran gehen dürfen, lang gehegte Pläne in die Wirklichkeit umzusetzen.2004 ist die naheliegendste Möglichkeit, uns weiterzuentwickeln.
Auch 2004 werden wieder einzelne Rückschläge versuchen, uns zu entmutigen, aber wir sind als Einzige dafür verantwortlich, ob wir diese als Erfahrungen abbuchen wollen, oder uns durch diese Entmutigen lassen wollen.Die Sterne stehen günstig für die erste Möglichkeit.2004 beginnt mit der Sonne im Steinbock, einem ehrgeizigen Zeichen, das hoch hinaus will - was also sollte uns aufhalten, außer wir selbst?

Also, es wird ein spannendes Jahr.Ich hoffe Euch allen alles Beste, einen guten Rutsch und ein wunderschönes 2004!

Renovierung die zweite

So, die letzten Tage mal wirklich nichts getan, nur Warcraft gespielt - Schwesterchen hat es mir zu Weihnachten geschenkt! So konnte ich diese Weihnachten behaglich damit verbringen, allerlei virtuelle Leben zu vernichten, Städte niederzubrennen, und einer ziemlich coolen Story (für ein Strategiespiel!) zu folgen.Der neue nutkidz-comic liegt eigentlich schon ein Weilchen bei mir auf der Festplatte, aber aus oben genannten Gründen kam ich erst heute dazu, in hochzuladen. Nächstes Jahr versuchen wir wieder regelmäßiger zu erscheinen - ach, diese guten Vorsätze... :)Schließlich habe ich auch endlich die Galerie nach nakit-arts verschoben, so dass auch dort mal etwas Leben in die Bude kommt. Schwesterchen plant zwar immer noch an nakit-arts, so wie sie es haben möchte, aber bis es soweit ist, finden sich schon mal Ihre Bilder dort.Das ein weiterer Schritt beim Umbau von Nodix. Nodix selbst wird damit einerseits zu einem Portal zu den ganzen Seiten, die hier ihren Anfang hatten - links die Übersicht mit den Logos, das sind nakit-arts, nutkidz, DSA4 Werkzeug und XML4Ada95. Andererseits wird Nodix selber noch weiter umgebaut - meine eigenen Texte finden sich ja hier, die will ich irgendwie zugänglicher einrichten, und auf dieser Seite schließlich noch diese Einträge.Bis Nodix fertig umgebaut ist, dauert es noch ein wenig. Eine große Überraschung ist noch für das Jubiläum von Nodix geplant - bald wird Nodix 3 Jahre alt! - und vielleicht ist bis dahin ja auch der neue Aufbau ersichtlich. Ich suche nur noch nach einer Möglichkeit, wie man sich schnell über die Updates auf den diversen Subsiten informieren kann... Über Kommentare, Anregungen, Kritiken usw. würde ich mich sehr freuen, schließlich mache ich ja die Seite für Euch! Schreibt einfach an die übliche Adresse - denny@nodix.de.

Renovierung

So, ein weiterer Schritt bei der Umgestaltung von Nodix ist getan. Eben diese Texte, die ihr hier lest, die sogenannten News, werden jetzt von einem neuem Stück Software verwaltet, dem Nodix Bloggerskript, in Python geschrieben. Ich werde bei Gelegenheit mal eine Übersicht darüber schreiben, wie Nodix eigentlich technisch funktioniert.

Jedenfalls sorgt er für eine automatische Verwaltung der News, dafür, dass die Titelseite nicht zu lang wird, und dass das Archiv entsprechend erstellt wird. Als kleinen Gag bietet er bei jedem Eintrag einen kleinen schwarzen Kasten an. Wenn man den Mauszeiger über diesen Kasten verharren lässt, erfährt man, wie man genau diese Meldung verlinken kann. Mag vielleicht praktisch sein...

Auch sind die Subseiten von Nodix nun direkt von der Nodix-Titelseite erreichbar, wie ihr links sehen könnt. Dies ist noch nicht die endgültige Fassung - die Umstrukturierung von Nodix geht noch weiter. Es wird insgesamt leichter zu bedienen, schöner, übersichtlicher, auch schneller, und sogar für mich einfacher zu verwalten. Was will ich mehr?

Übrigens wurde ich darauf aufmerksam gemacht, dass die Rückkehr des Königs gar nicht mehr auf Platz 4 der besten Filme laut imdb ist, sondern schon Platz 3. Na, es wird spannend - knackt der Film die ewige Herrschaft des Paten, wie es eines Königs gebührt? Warten wir's ab.

DSL, nutkidz, HdR3

Und kurzfristig fiel mein Anschluss wieder aus. Noch bin ich mit Beantworten von eMails beschäftigt - und außerdem wird an Nodix fleißig weitergebastelt. Die alten nutkidz-Seiten sind jetzt allesamt in Verweise auf die neue nutkidz-Website www.nutkidz.de verwandelt worden, so wie es sich für eine liebe Website gehört.

Außerdem finde ich DSL toll. Ein ganz neues Interneterlebnis. Außerdem: auf der Liste der besten Filme auf imdb ist die Rückkehr des Königs inzwischen auf Platz 4 angelangt, noch vor den beiden Türmen und den Gefährten! Mehr als 100 Million US-Dollar sind auch schon eingespült. Ich kann nur sagen: der Film ist wahnsinn! Anschauen! Der kriegt den Oscar für den Besten Film.

Nutkidz.de

Und wieder zurück. Zwei Hundertschaften eMails warten jetzt auf Beantwortung, es wird also noch ein wenig dauern. Aber die Internetpause wurde sinnvoll genutzt: unter anderem wurden die nutkidz vollständig neu designt und vor allem, die nutkidz haben jetzt ihre eigene Website! www.nutkidz.de selbst wird diese Tage konnektiert werden, aber hier auf Nodix erhaltet ihr bereits eine Vorschau auf das neue Layout! Viel Spaß damit.

Herr der Ringe III

Dies ein Notupdate - zur Zeit habe ich kein Internetzugang (deswegen kann ich auch nicht auf eMails antworten, sorry jetzt schon!), und mache auch das hier von einem Internetcafe aus. Aber die neuen nutkidz waren fest versprochen, und hier sind sie auch! Endlich geht es weiter, yeah!

Außerdem war gestern Nacht die Premiere des Herrn der Ringe III - Die Rückkehr des Königs! Und ich war dabei, und ich kann sagen: Wooooooooooooooowwww!!!! Anschauen. Dies ist keine Übung. Unbedingt anschauen. Jetzt. Sofort.

Viva die Moderatoren

Zwei Sachen gibt es, auf die wir uns nächste Woche freuen können! Alle Details im Bild.

Übrigens, von hier aus ein Dank an Viva! Heute fühlte ich mich wieder richtig schlau, als ich mal vorbeizappte. Aus irgendeinem Grund glauben die Programmmacher von Viva ja, Sendungen zwischen die Musikclips zu bringen, die ohnehin 83% der Zuschauer langweilen. Für die Moderatoren dieser Sendung hofft man wirklich, dass sie besser bezahlt werden als sie es verdienen. So auch die heutige Moderatorin, die mit verblüfften Blick feststellte, dass ihr Interviewpartner mit dem Hinweis auf das Lied "Maria" nicht sofort etwas anfangen konnte. "Du weißt schon, das Lied von Scooter". Nach wie vor blickte der Interviewte etwas unwissend in die Kamera, als sie schier an ihm zu verzweifeln schien.

Doch da sagte der Interviewte (dessen Namen ich nicht abgewartet habe): "Ach, ich dachte Du meinst den Song von Leonard Bernstein für West Side Story" - und, schau an, da war es an der Moderatorin, verdattert zu schauen, und dann einen Blick in die Kamera zu werfen, der so deutlich aussagte "Ach ja, dieser altmodische Spinner, wovon der wieder redet..." - offensichtlich keine Ahnung habend.

Man kann wirklich für einen Musiksender arbeiten, ohne die West Side Story zu kennen? Harald Schmidt, wie kannst Du uns in diesem Lande alleine lassen? Mit John Rawls hätte es das nicht gegeben...

60.000

Hurra, 60.000 Besucher! Das habe ich natürlich nur Euch zu verdanken, die Ihr hier vorbeischneit. Danke, danke, nochmals danke! Nun ja, zugegeben, es wird etwas kanpp, um bis zum Ende des Jahres 100.000 zu erreichen - aber wir setzen einfach die Aktion fort. 2004, ja 2004 werden wir zusammen die 100.000 knacken, da besteht kein Zweifel!

Nächstes Jahr wird Nodix komplett überarbeitet, so dass auch das erweitern leichter fällt. Derzeitig wird das mit dem Nodix Webseiten-Generator V8 erledigt, ein selbstgeschriebenes Tool, dass seine Arbeit ja auch sehr ernst und eifrig vollführt, doch hat es auch einige Schwächen. Und da ich mich immer mehr mit Python anfreunde, überlege ich, vielleicht das Tool mit Python neu zu schreiben, statt das Delphitool weiter zu warten.

Zudem sollen die Teile von Nodix selbstständiger werden. Die nutkidz sollen auf eine eher eigene Website, usf. Und vielleicht gibt es ja auch bald eine weitere Überraschung, das zeigt sich alles. Alles neu macht der Jahresanfang. So weit, so schön, eine frohe Adventszeit noch!

Herr der Ringe 2, die DVD

So, heute geht es in die Sneak Nummer 200! Mal sehen, was das wird - ich rechne ja mit einer echten Überraschung, und nicht etwa mit S.W.A.T. oder Buddy oder irgendsowas... Letzte Woche, der Vollständigkeit halber, kam Shanghai Knights, überraschend lustiger Film mit Jackie Chan - der Kerl hat einfach eine unglaublich einfallreiche und spaßige Kampfchoreographie, wow!

Was mich ein wenig enttäuscht hat, in den letzten Tagen, sind zwei DVDs, einmal Moulin Rouge, und dann der Herr der Ringe II - Die zwei Türme, Extended Edition. Nein, nicht die Filme - beide Filme sind unglaublich gut, 10 von 10 Punkte, überhaupt keine Frage! Es sind die DVDs! Moulin Rouge, zum Beispiel, bietet nicht die Möglichkeit, den Film so zu sehen, wie man es im Kino tat - Deutscher Text, Englische Songs mit deutschen Untertiteln. Entweder man schaltet die Untertitel durchgehend an, oder aber durchgehend aus. Blöd.

Noch viel schlimmer aber treibt es die HdR2-SEE: da zahlt man für die übersetzte Version unglaubliche 95,- Euro (inzwischen, in manchen Läden, auch noch mehr, vor Kurzem sah ich sie für 120,- Euro; wohlgemerkt für die limitierte Sammlerausgabe), statt etwa 60,- Euro für eine andere Sprachversion - und die meisten Specials haben nicht einmal Untertitel! Auch die Audiokommentare sind ohne Untertitel. Und der eigentlich erwartete Trailer für den dritten Teil fehlt auch. Dies zu übersetzen hätte doch drin sein müssen, gerade für diesen Preis, der mit Recht scharf kritisiert wurde.

Nodix bietet da besseren Service: meine Hausarbeiten zu Entwurfsmustern und zu Softwarearchitekturen gibt es jetzt auch als PDF-Version, ich hoffe, dass sie so eine noch weitere Verbreitung findet. Und noch 8 Mal Schlafen gehen, dann kommt auch schon der Herr der Ringe... pfeif auf den Weihnachtsmann dieses Jahr!

Tolle Abenteuer und Sneak

Am Wochenende war ich in Oberhausen, zum DSA spielen. Habe schon ewig kein DSA mehr gespielt, wie mir auffiel :) und es machte wieder höllischen Spaß. Von Freitag bis Sonntag abend spielten wir fast durch, nur zwei Mal Schlaf zu je drei Stunden, und eine kurze Unterbrechung am Samstag Abend... wow, ich bin doch noch nicht so alt wie ich dachte ;)

Wir spielten die "Unsichtbaren Herrscher", ein grandioses Abenteuer für alle Freunde des Horasreiches, unheimlich spannend, gut geschrieben, und mit einigen überraschenden Wendungen - es lohnt sich alle Mal! Ich kann lediglich dazu raten, es aber nicht an einem Wochenende durchspielen zu wollen, sondern sich mehr Zeit zu nehmen. Bei uns wurde an mehreren Stellen heftig gekürzt, einfach weil wir keine Zeit gehabt hätten.

Schließlich: ich gehe regelmäßig in die Sneakpreview in Stuttgart. Sehr empfehlenswert! Aber was ich wissen möchte: seid Ihr daran interessiert, von den Filmen dort zu erfahren? Wollt Ihr wissen, ob ein Film gut ist, bevor er anläuft? Die letzten paar Wochen kamen "Ein unmöglicher Härtefall", "Findet Nemo!" und "Tatsächlich... Liebe" - drei Filme, wo ich bei jedem einzelnen sagen würde: unbedingt anschauen! Wer sie noch nicht gesehen hat, jetzt rein!

Gestern hingegen war's "The Mystic River". Eine technisch brilliante Umsetzung einer schwachen Story. Klasse Schauspieler, gute Regie - doch wem es darum geht, sollte lieber "Eigentlich... Liebe" anschauen. Noch mehr gute Schauspieler :) Aber der Plot von Mystic River ist schlicht zu vorhersehbar... da sind Geschichten wie "Fight Club", "Identity" oder "Memento" um Klassen besser.

Schade eigentlich, dass so viel gute Arbeit für so eine langweilige Story verbraten wurde...

Abschlussvortrag Diplomarbeit

Morgen findet der Abschlussvortrag für meine Diplomarbeit statt (hier finden sich die Folien zum Abschlussvortrag). Wünscht mir alles Gute! Ich kann offensichtlich ein wenig Glück gebrauchen: bei der Vorbereitung der Demo verreckte plötzlich mein CygWin, und ich konnte gar nichts mehr ausprobieren...

Das wird morgen etwas hektischer dadurch :(

Rettet die Kinos!

Soeben wurde ich von Thomas Glatzer auf ein äußerst interessantes Anliegen aufmerksam gemacht:

MeinKinoBinIch.de ist ein Portal deutscher Kinobetreiber und weist auf ein für alle Kinofreunde sicher interessantem Gesetzesvorhaben hin. Da ich ein großer Freund des Kinos bin, würde ich mich freuen, wen die Kampagne mit den dort dargestellten Zielen Erfolg hätte. Wenn jemand mit Thomas Kontakt deswegen aufnehmen möchte, hier seine ICQ: 321969141

(Anmerken möchte ich zudem noch, rein persönlich: ob Internet "Unterschriften"-Sammlungen sinnvoll sind, weiß ich nicht. Ob die Kampagne korrekt dargestellt wird, habe ich nicht recherchiert. Aber mündige Leser werden dazu sicher in der Lage sein, und wenn jemand neutralen Informationen findet, mag er mich gerne ins Bild setzen. Seiten von Kinobetreibern sind keine neutrale Information.)

Keine Spielmesse

Dieses Jahr komme ich doch nicht auf die Essener Spielemesse. Geld, Wetter, Reisemöglichkeit, Zeit, usf., mögen mir keine rechte Gelegenheit geben. Schade, wollte ich doch einige Leute sprechen...

Nun ja, was soll's, nächstes Jahr hoffentlich wieder. Und jetzt noch die Treffen absagen...

DSA4 Werkzeug 1.24

Es geht richtig rund. Die Entwicklung des DSA4 Werkzeugs ist wieder angelaufen. Dies konnte mit der am Sonntag veröffntlichten Version 1.24 allen Unkenrufen zu Trotz bewiesen werden. Jetzt geht es darum, dem Werkzeug die Magie näherzubringen...

Na, mal schauen. Dieses Wochenende ist Essener Spielemesse - vielleicht komme ich am Samstag vorbei. Wenn sich dort jemand mit mir treffen möchte, kann er mir ja kurz noch eine e-Mail schreiben, dann können wir uns noch kurzschließen - falls ich komme.

Dazu morgen mehr.

XML4Ada95 0.9

Version 0.9 von XML4Ada95 ist erschienen! Das bedeutet, die Dokumentation ist ausgebessert, ein paar grobe Bugs sind raus, und ich teile der ganzen Welt mit: hier ist es! Holt es euch...

Ich befürchte nur ernsthaft, dass meine Veröffentlichung des Pakets für meine Diplomarbeit von Nachteil sein kann: dadurch, dass schon jetzt viele Augen das Projekt begutachten werden, wird es auch viel Kritik, auch gerechtfertigte geben. Ich hoffe, dass dies die Note nicht negativ beeinflusst. Dem Projekt kann es ja nur zu Gute kommen, wenn es korrigiert wird. Ächz, ich hoffe, das Richtige getan zu haben, indem ich auf mein Gefühl hörte.

XML4Ada95 wächst weiter

Weiterer Wachstum für XML4Ada95. Beispiele, Erweiterung der Dokumentation (mehr denn 100 Seiten inzwischen, gut, dass dieser Teil nicht in der Ausarbeitung der Diplomarbeit mitgedruckt wurde, puh!)

Es geht voran. Ich mache mir auch schon wieder die ersten Notizen zum DSA4 Werkzeug und werde auch daran bald wieder arbeiten. Zudem steht eine Überarbeitung von Nodix an, aber das eilt weniger - diese Website hat sich doch anders entwickelt als gedacht, und dem sollte Rechnung getragen werden.

Den Juni, Juli und Augst von dieser Seite ins Archiv geschoben, und auch die History rechts mal wieder gekürzt (das macht die Anfangsseite kleiner und somit schneller zu laden).

Schönes Wochenende allen!

XML4Ada95 wächst

XML4Ada95 ist ganz schön gewachsen, drei Dutzend neue Seiten wurden hinzugefügt. Und die Seite ist noch lange nicht fertig!

Auch auf Nodix wurde ein klein wenig aufgeräumt, von den Einträgen auf der Titelseite wurde der Mai ins Archiv geschoben. War aber eher mal wieder zum üben, ich vergesse sonst, wie der Nodix Webseiten Generator bedient wird...

Diplomarbeit abgegeben

Yehhhhhaaa, heute fristgerecht die Ausarbeitung der Diplomarbeit abgegeben. Der Quelltext dazu ist auch schon fertig, und jetzt geht es darum, so schnell wie möglich die dazugehörige Website aus dem Boden zu stampfen.

Und hier ist sie, eine weitere Website, die auf Nodix ihren Anfang nimmt: XML4Ada95!

Dem Aufmerksamen wird auffallen, dass das Layout großteils von der Website zum DSA4 Werkzeug übernommen ist. Dazu kann ich nur sagen: stimmt. Aber da ich lediglich so schnell wie möglich den Inhalt online stellen wollte, habe ich den DSA4WzDocGen einfach leicht erweitert, damit er mir jetzt meine XML4Ada95-Seiten erstellen kann. Vielleicht trenne ich die beiden Layouts ja wieder, sobald ich etwas mehr Zeit habe, aber im Moment geht es eher um Inhalte, Inhalte, Inhalte... (und häßlich ist doch das Layout allemal nicht, oder?)

50.000 Besucher

Hurra, 50.000 Besucher!

Der halbe Weg zum Projekt 100.000 ist getan. Dennoch, immer noch in der Sommerpause. Abgabe der Diplomarbeit ist der 7. Oktober. Bis dahin, noch viel Spaß!

Sommerpause 2003

So, jetzt gibt es zunächst mal Sommerpause mit den nutkidz, Nodix selbst wird sich vorerst auch kaum verändern. In den nächsten Wochen wird meine Diplomarbeit hier vorgestellt, und die Ergebnisse online gestellt, aber für mehr ist zur Zeit schlicht keine Zeit vorhanden. Sorry, hätte Euch früher informieren sollen!

Aber keine Angst, wir kommen wieder!

Diplomarbeit Halbzeit

Der Zwischenvortrag zu meiner Diplomarbeit ist gelaufen, die Folien dazu - in verschiedensten Formaten - finden sich für den interessierten, unter der Adresse www.nodix.de/da. Kommentare sind stets hoch willkommen!

Übersetzungen

Heute mal die nutkidz um ein paar Schreibfehler ärmer gemacht und gleichzeitig die englischen Übersetzungen, die ausstanden, nachgeliefert. Danke bei dieser Gelegenheit noch an Buddy, der die Übersetzungen stets rechtzeitig Woche für Woche schickte - hier, sieh selbst, die Arbeit war nicht umsonst!

Zwischenvortrag

So, hier der neue nutkidz. Und wen es interessiert: am 1. August findet meine Zwischenpräsentation zu meiner Diplomarbeit statt, Thema XML für Ada 95. Wen es interessiert, kann sich melden, ansonsten werden hier danach auch die Folien runterladbar werden.

Prüfungen und Verspätungen

Verzeihung ob der unregelmäßigen nutkidz! In den nächsten Tagen sollten auch die englischsprachigen auf den neuesten Stand gebracht werden, doch hier geht es nach wie vor drunter und drüber. Schwesterchen hat heute ihre Abschlussfeier, und seit dieser Woche darf sie sich staatlich geprüfte Designerin nennen! Und wegen ihrer mündlichen, praktischen und schriftlichen Prüfungen - die sie allesamt glänzend bestanden hat! - hatte sie wenig Zeit! Ich bin stolz auf Dich.

Ich selber tümmle mich mit meiner DA, wo ich ein wenig in Zeitnot verrutsche. Diese Tage bereite ich mich auf die Zwischenpräsentation vor, und mache gerade meine Folien, aber zunächst geht es zu Schwesterchen. So, Danke vielmals für Eure Geduld, und wie immer viel Spaß mit den nutkidz!

Happy Birthday, Emelie!

Auch wenn ihr es leid sein dürftet, die stetigen Entschuldigungen für die Verspätungen bei den nutkidz zu hören: diesmal war es geplant, dass der Comic erst heute erscheint, denn erst heute hat Emelie, der jüngste Spross der Familie, Geburtstag! Den zu Recht stolzen Eltern herzlichen Glückwunsch!

Dogma, 22 Uhr, Pro7

Hui, diese Woche habe ich die pünktlichen nutkidz einfach vergessen. Sorry, soll nicht wieder vorkommen! Dafür bastel ich an der Diplomarbeit mit Erfolg weiter! Und nicht vergessen, heute abend, Dogma um 22:20 auf Pro7!

Das letzte Einhorn

Wow, was habe ich gestern, nur durch Zufall entdeckt? Als ich mal wieder auf IMDb nach etwas suchte, schlug ich kurz nach, in wie vielen Filmen Christopher Lee schon mitspielte, und dann entdeckte ich die Ankündigung seines 223. Films.

Es gibt wahrscheinlich für jeden einen Film, der so etwas wie eine heilige Kindheitserinnerung ist. Für mich war es der erste Film, den ich überhaupt im Kino gesehen habe: Das letzte Einhorn.

Und - angekündigt für Weihnachten 2004 - kommt die Realverfilmung des letzten Einhorns ins Kino!

Christopher Lee bleibt uns auch nächstes Weihnachten mit einem Fantasyspektakel erhalten (nach drei Jahren Herr der Ringe) und zwar als König Haggard, den er auch schon im Zeichentrickfilm vor über zwei Jahrzehnten sprach. Angela Lansbury - bekannt aus "Mord ist ihr Hobby" - spielt Mamma Fortuna, welche sie auch damals gesprochen hat, und Mia Farrow, die damals Amalthea, also das Einhorn, sprach, spielt die gute Molly. Prinz Lir und Amalthea selber sind noch nicht gecastet, für Prinz Lir schwirrt die Gerüchteküche für Heath Ledger, unseren Ritter aus Leidenschaft, für Amalthea wird jede Dame von Natalie Portman über Uma Thurman bis Cate Blanchett genannt - und keine Infos von offizieller Seite.

Die Musik von America klingt uns wohl immer noch in den Ohren, für die Neuverfilmung wird jedoch u.a. Adiemus für Musik sorgen! Und auf der offiziellen Website des Films gibt es noch viel mehr Material, unter anderem auch das komplette Skript des kommenden Films!

Was garantiert uns einen guten Film? Christopher Lee setzt sich sehr für eine gute Umsetzung ein, und der Autor der Buchvorlage selbst - Peter Beagle - hat das Skript beigesteuert. Wow! Nachdem dieses Jahr der Herr der Ringe Teil III - Die Rückkehr des Königs wohl das bis dahin erfolgreichste Fantasyepos im Kino abschließen wird, haben wir Freunde der Zauberei nächstes Jahr wieder einen zauberhaften Film zu erwarten! Yippee!!!

Außerdem wurde die Nodix-Titelseite wieder ein wenig verkürzt, und die Meldungen von Februar bis April ins Archiv geschoben.

GIMP

Wie der getreue nutkidz-Leser wissen wird, steht der Scanner, mit dem die nutkidz vom Papier den Eintritt in die digitale Welt finden, bei Schwesterchen. Da Schwesterchen diese Tage aber nicht bei sich weilt, sondern hier in Stuttgart - kein Scanner! Heute haben wir uns also in Unkosten gestürzt, im Internetcafe das Bild eingescannt und dann hat statt Schwesterchen meine Wenigkeit das Bild bearbeitet - mit dem kostenfreien GIMP, welches ich an dieser Stelle noch mal ausdrücklich loben möchte (damit werden die Sprechblasen schon seit jeher erstellt). Ein hervorragender Beweis dafür, dass OpenSource-Projekte eine unheimliche Qualität erreichen können. Lediglich die Bedienung ist deutlich nicht anfängerfreundlich, aber ansonsten: ausstrucksstark wie kaum ein Programm unter 200 Euro!

Abschlussprüfungen

30 May 2003

Und wieder ein Tag Verspätung. Nun ja, ein kleines Update an alle treuen Leser: Schwesterchen hat diese Woche ihre erste Runde Abschlussprüfungen geschrieben, und hat wenig Zeit. Ich schreibe an meiner Diplomarbeit und bin gerade einem wichtigen und schwierigen Problem auf der Spur, und will es alsbald lösen. Habe auch wenig Zeit.

Wir werden es wohl auch weiterhin schaffen, neue Folgen der nutkidz zu veröffentlichen, aber restliche Updates werden noch einige Zeit lang rar bleiben, sorry. Dies wird sich bessern, aber es können noch ein paar Wochen ins Land ziehen. Danke für Eure Treue!

Und Wow! Ihr seid Klasse! Die 40.000 sind überschritten! Noch ist das Projekt 100.000 nicht aufgegeben, auch wenn es im Moment ein wenig ruhiger wirkt, aber ich habe ein paar Sachen im Kopf, die diese Seite wieder ungemein beleben werden, sobald ich wieder Zeit habe. Vertraut mir!

Bis bald, und liebe Grüße in die ganze weite Welt!

Nutkidz reloaded

23 May 2003

Wie bereits angekündigt kommt diese Woche der nutkidz-Comic mit ein wenig Verspätung: dafür aber passend zu Matrix Reloaded! Die nutrix ist wieder da!

Diplomarbeit am Horizont

15 May 2003

Diese Woche, wenn auch knapp, doch pünktlich: hier die neue Folge der nutkidz!

In nächster Zeit stehen die ersten Infos zu meiner Diplomarbeit auf dieser Seite an. Desweiteren noch eine Ankündigung: nächste Woche gibt es aufgrund des besonderen Tages einen besonderen nutkidz-Comic! Weil der brandaktuell sein wird, kann er unter Umständen ein wenig später kommen. Wer noch nicht dahinter gekommen ist, wovon ich rede, soll sich nächste Woche überraschen lassen!

Erster Ausfall

Zum ersten Mal fiel eine Folge der nutkidz aus, und zwar aus eben dem Grund, der im Comic dargestellt wird. Letzte Woche fuhren Schwesterchen und ich nach Berlin, und verpassten es schlichtweg, davor noch einen Comic online zu stellen. Da wir auch keine mehr vorrätig hatten - Schwesterchen befindet sich zur Zeit in der Abschlussphase ihrer Ausbildung - konnten wir nicht schnell einen nachliefern, und ich kam diese Woche noch schlichtweg nicht dazu, euch zumindest auf dem laufenden zu halten. Auf der Website zum DSA4 Werkzeug sieht es gar noch trüber aus...

Doch hier endlich der neue nutkidz. Hätte schon gestern kommen sollen, doch war ich kurzfristig anderweitig verhindert. Der neue Comic ist wahrhaft ein selffulfilling Gag!

Ab nächstes Mal, ganz sicher, wieder regelmäßig jeden Donnerstag eine neue Folge!

Viva liebt dich

So schönes Wetter, und gleichzeitig kann ich mich freuen, denn auf Viva Plus kam soeben die SMS-Nachricht über den Äther getickert: "Ich liebe Denny." - leider ohne Abesender...

Melde Dich doch mal :)

Keine Updates

Weiterhin keine großen Updates, ich habe für die Uni noch einiges aufzuholen. Aber wie auch schon letzte Woche kommt pünktlich ein neuer nutkidz-Comic!

Neue nutkidz

Na, keine Updates heißt natürlich nicht, dass keine neuen nutkidz erscheinen! Diesmal wieder eine spannende Folge aus der aktuellen, nervenzerfetzenden Storyline! Aber wie wird das bloß ausgehen?! Sehen sie das nächste Woche - in dem großen Finale!

Apropos nächste Woche: da ich nun für eine Woche wegfahre, kann es zu einer Verzögerung des neuen Comics um ein, zwei Tage kommen. Solange, viele liebe Grüße und eine besinnliche Karwoche!

Aprilscherz

Die Meldung vom 1. April war leider nur ein Aprilscherz. Zur Zeit sind leider kaum Updates zu erwarten: ab Freitag bin ich für eine Woche weg, und zuvor will ich soviel wie möglich für meine Diplomarbeit erledigen. Aber ich bastel schon an einem größeren Umbau an Nodix - doch wird er nicht vor Mai oder Juni vollzogen werden, tut mir leid. Danke jedenfalls für euer Verständnis!

Nutkidz bei Carlsen

Wow! Dass der Krieg im Irak solche Auswirkungen haben wird? Einige amerikanische Comiczeichner, schon seit jeher für Ihren Patriotismus bekannt (man denke an Captain America), haben jetzt, wie CNN berichtet, beschlossen, dass ihre Comics in Frankreich und Deutschland nicht mehr erscheinen sollen, bis sich die Sache gelegt hat! Heute erhalte ich eine e-Mail vom Carlsen-Verlag, dass sie jetzt auf allen Fronten nach neuen Comiczeichner suchen - und dabei haben sie einige Webcomicautoren angeschrieben - so auch uns, wegen den nutkidz!!

Wenn alles gut geht, wird es in etwa fünf Monaten einen ersten Softcover nutkidz-Comicband geben!! Unglaublich, aber war. Der von mir gleich aufgebrachte Vorschlag, auch auf der Comicverfilmungswelle mitzuschwimmen und einen entsprechenden Film zu drehen (das Filmplakat seht ihr oben - lustigerweise hat das Schwesterchen schon vor ein paar Tagen angefertigt, es ist der obercoole Comic den ich vorletzte Woche angesprochen habe) hat Carlsen leider als Aprilscherz missverstanden.

So oder so: demnächst, die nutkidz, bei Carlsen! Und nach wie vor jede Woche neu auf www.nodix.de!

Bug im Parkhaus

Von Samstag auf Sonntag war in Stuttgart die Lange Nacht der Museen, ein wie stets hervorragend besuchte Veranstaltung, bei der man Catch the Bus mit Museen statt mit Discos spielt. Die Lange Nacht endete um 2 Uhr Nachts, also genau zu dem Zeitpunkt, an dem auch auf Sommerzeit gestellt wurde. Um die Uhrzeit wollten alle Leute die Stadt verlassen, vornehmlich auch mit dem Auto. Wir parkten wie so häufig in meinem Lieblingsparkhaus in der Stuttgarter City, die Hofdienergarage (ich hoffe jetzt, dass sie so heißt) - vor allem wegen der unschlagbaren Abend-, Nacht- und Wochenendpauschale von nur einem Euro.

Doch was geschah am Samstag? Nachdem man das Parkticket löst hat man eine gemütliche Viertelstunde um zum Auto zu kommen und das Parkhaus zu verlassen. So auch an diesem Abend, bloß: wer schlau war, und ein wenig vor 2 Uhr schon das Parkhaus verlassen wollte - um den Andrang zuvorzukommen - schaute in die Röhre: man kam nicht raus.

Warum? Programmiererfehler.

Was war passiert: wer sein Ticket um, sagen wir, 1:57 löste, und sagen wir, 6 Minuten später an der Schranke war, kam nicht raus. Es war 3:03, und damit, so meinte der Computer, waren die 15 Minuten lange vorbei. Dadurch bildete sich eine so lange Schlange, dass die, die nach 2 lösten, sprich, nach 3 Uhr, auch erst nach mehr als einer Viertelstunde zur Schranke kamen. Wir brauchten nur 20 Minuten, aber hinter uns standen die Autos das ganze Parkhaus hoch - ich will nicht wissen, wie lange das noch so ging.

Ehrlich, so eine blöde Geschichte! Ich bin froh, dass wir zusammenpuzzeln konnten, was geschehen war, weil hätte ich nicht gewusst, warum ich da stehe, ich glaube ich wäre ganz schön genervt gewesen. So fande ich es traurig-komisch...

Wenn noch jemand Geschichten von ähnlichen Computerbugs hat, kann sie mir gerne schicken. Diese werden allen Programmierern da draußen als Mahnung, Belehrung und der Allgemeinheit zur Erheiterung bekannt gemacht!

1/3 des Weges

Hurra!

Besucher Nummer 33333 ist auf Nodix gewesen! Damit haben wir bereits ein Drittel des Projekt 100.000 erreicht - und das noch bevor ein Drittel des Jahres vorbei ist! Ihr seid schlicht fantastisch! Damit liegt Nodix noch voll im Plan ob des Projekts 100.000: den Prognosen sind die Tatsachen tatsächlich noch zwei, drei Wochen vorraus!

Danke, danke, danke, liebe Besucher! Und großes Ehrenwort: auch wenn es zur Zeit ein wenig wie eine Flaute wirkt, in Wirklichkeit stehen wieder Erweiterungen, Erneuerungen und Entstaubungsaktionen an, um Nodix noch beliebter und attraktiver zu machen.

Wenn Ihr irgendwelche Vorschläge, Anregungen, Kritiken habt: immer her damit!

Dramatische nutkidz-Folge

Schwesterchen machte mich darauf aufmerksam, dass ich den supercoolen nutkidz, den ich letzte Woche ankündigte, gar nicht heute zeigen kann, weil die derzeitige Story noch läuft... und die mag ich nicht unterbrechen. Hier also die Fortsetzung!

Und die Folge endet mit dramatischer Musik: wieso hat Till keine Seele? Was wird aus Chrissies Seele? Wird der Teufel alle nutkidz beherrschen? Kann der Geistliche die nutkidz noch irgendwie retten? Wollen die das überhaupt? Bleibt dabei, wenn nächste Woche ein paar Fragen geklärt und viele neue aufgeworfen werden - bei den nutkidz!

Treue

"Ein treuer Gatte ist ein Ehemann, der sich nicht auch noch auswärts blamieren möchte" - Anonym

Ich habe auch schon die nutkidz für nächste Woche gesehen - wow, dies sind genial, so gut, dass ich sie als Hintergrundbild verwende. Freut euch darauf! Es gibt eine Überraschung.

Danke, Harald

"Das, was Bush mit Saddam gemacht hat, ist eigentlich nichts anderes als eine Räumungsklage wegen Eigenbedarfs" - Harald Schmidt

DSA4 Werkzeug 1.22

Das DSA4 Werkzeug hat Version 1.22 erreicht - man kann seine Helden jetzt auch Drucken und sonstwie ausgeben. Schaut mal vorbei, vielleicht interessiert es den werten Leser...