Semantic search


Zen and the Art of Motorcycle Maintenance

13 May 2021

During my PhD, on the topic of ontology evaluation - figuring out what a good ontology is and what is not - I was running circles up and down trying to define what "good" means for an ontology (Benjamin Good, another researcher on that topic, had it easier, as he could call his metric "Good metric" and be done with it).

So while I was struggling with the definition in one of my academic essays, a kind anonymous reviewer (I think it was Aldo Gangemi) suggested I should read "Zen and the Art of Motorcycle Maintenance".

When I read the title of the suggested book, I first thought the reviewer was being mean or silly and suggesting a made-up book because I was so incoherent. It took me two days to actually check whether that book existed, as I wouldn't believe it.

It existed. And it really helped me: it allowed me to set boundaries on how far I can go in my own work, and it showed me that it is OK to have limitations, and that trying to solve EVERYTHING leads to madness.

(Thanks to Brandon Harris for triggering this memory)

Keynote at Web Conference 2021

Today, I have the honor to give a keynote at the WWW Confe... sorry, the Web Conference 2021 in Ljubljana (and in the whole world). It's the 30th Web Conference!

Join Jure Leskovec, Evelyne Viegas, Marko Grobelnik, Stan Matwin and myself!

I am going to talk about how Abstract Wikipedia and Wikifunctions aim to contribute to Knowledge Equity. Register here for free:

Update: the talk can now be watched on VideoLectures:

Building a Multilingual Wikipedia

Communications of the ACM published my paper on "Building a Multilingual Wikipedia", a short description of the Wikifunctions and Abstract Wikipedia project that we are currently working on at the Wikimedia Foundation.


Jochen Witte

Jochen Witte was a friend from my school days. I learned a lot from him; he could do all these practical things that I never found a way into and often wished I could do. From him I learned what a good sound system needs, why subwoofers have to be big, and what subwoofers even are. Together we lugged heavy speakers around to make lower-school discos, graduation pranks, and talks possible. From him I learned the virtues of gaffer tape, and that it is not just silver duct tape. He was the first to bring manga and anime a little closer to me; he was particularly passionate about Akira. He was the first to let me hear the electronic music of Chris Hülsbeck and Jean-Michel Jarre. He read ASM, I read Power Play. We played DSA together for a while. He was the first person I knew with a pager. He always seemed as if he could repair anything, and it was good to know someone like that.

At the same time, some of my friends and I were not always kind to him, oh no, on the contrary, sometimes I was downright cruel. I made fun of his glasses or his weight, and could score points by making jokes about him. I knew it was wrong. We were already the outsiders in our class, and I tried to make him the outsider of the outsiders. My only excuse is that we were children, and I did not yet have the strength to be better. I learned a lot from it, and never wanted to be like that again. Over time I came to understand myself better. Where this cruelty came from. And that it was not about Jochen, but in me. I am ashamed of much of what I did. I do not know whether I ever apologized to him.

And yet I believe we were friends.

After school we lost sight of each other. He studied chemistry in Esslingen; we met now and then at the Movie Dick for the sneak preview. He moved to Staig in the Alb-Donau district and found himself as a goth. But over the years we got in touch every now and then.

One of our shared memories was driving together to a talk by Erich von Däniken. It was my car. We had a flat tire, and while he got it running again - as I said, he could repair anything - he asked me when I had last checked the oil. I must have looked so sheepish that he could only laugh. The answer was "never", and he saw it in my face. Every time we met, he brought up that evening.

Jochen helped me with my move to Karlsruhe. The guest bed didn't fit together properly. He said he could tighten it, but I would never get it apart again. It would be difficult to move with it. I said that's OK, it's just a cheap IKEA guest bed couch thingy. I don't plan to move with it, I assured him.

I moved with it from Karlsruhe to Berlin. From Berlin to Alameda. Within Alameda. From Alameda to Berkeley. It gave the movers a headache every time, just as Jochen had promised. Last week a piece broke off. I am sitting on it now, writing this. After almost a decade, I should probably finally replace it.

The last time we met was entirely by chance, in 2017 at Stuttgart station. I had been back in Germany only once in the last half decade. And there, at the station, I ran into him. It was lovely to see Jochen again, and we talked as if we still saw each other daily, like twenty years earlier. As if the Abitur had been only yesterday.

This week I learned from Michael that Jochen has died. He died only a few months after our chance meeting, in April 2018. He was only forty years old.

I am sorry.

And even more: thank you.

Rest in peace, Jochen Witte.

The name Zdenko

Today I saw that the article on Zdenko - my actual name - had been changed on the English Wikipedia. Someone had changed the meaning of the name from what I believed to be correct (Slavic form of Sidonius) to something I had never heard before (diminutive of Zdeslav), but had not updated the source. I thought this would be a quick fix, but looked into the source anyway - and, lo and behold, the source said neither the one nor the other, but claimed the name derives from the Slavic word zidati, to build, to erect.

That led me on a two-hour odyssey through various sources from the 19th and 20th centuries, where I found support for all three meanings - as well as sources claiming that the name is derived from the Slavic word zdenac, a well, that the name Sidney also comes from Sidonius, and a Hessian source vehemently complaining that Zdenko and Sidonius have nothing to do with each other (the Slovenian Wikipedia, too, says that the names Zdenko and Sidonius share a name day but are not the same name). The same source, however, goes on to say that the name Denje, used in East Hesse, probably comes from Zdenka (so close to Denny!)

I like Denje as a name.

In short: if you think etymology is complicated, be warned: anthroponomastics is far worse!

The name Zdenko

Today I saw that the Wikipedia article on Zdenko - my actual name - was edited, and the meaning of the name was changed from something I considered correct (Slavic form of Sidonius) to something that I had never heard of before (diminutive of Zdeslav), but the reference stayed intact, so I thought that'll be an easy revert. Just to do due diligence, I checked the given source - and funnily enough, it said neither one nor the other, but gave an etymology from the Slavic word zidati, to build, to create.

That led me down a two-hour rabbit hole through different sources spanning the 19th and 20th centuries, finding sources that claim the name is derived from the Slavic word zdenac, a well, or that Zdenko is cognate to Sidney, a Hessian source explaining that it is considered the root for the name Denje (so close to Denny!) (and saying it has nothing to do with Sidonius), and much more.

In short, if you think that etymology is messy, I tell you, anthroponymy is far worse!

Time on Mars

This is a fascinating and fun listen about the Mars mission. Because a day on Mars takes 40 minutes longer than on Earth, the people working on that mission had to live on Mars time, as the Mars rovers work with solar panels. So they have watches showing Mars time. They invent new words in their language, speaking about sol instead of day, of yestersol, and they start calling themselves Martians. 11 minutes.
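To get a feeling for what living on Mars time does to an Earth schedule, here is a small sketch of the arithmetic. It uses the rounded 40 minutes from above (a sol is in fact a little shorter, about 24 hours 39 minutes 35 seconds), and the 9:00 shift start and the function name are just assumptions for illustration.

```python
from datetime import timedelta

EARTH_DAY = timedelta(hours=24)
# Rounded figure: a sol is treated as 40 minutes longer than an Earth day.
SOL = EARTH_DAY + timedelta(minutes=40)


def shift_start_on_earth_clock(sols_elapsed, first_start=timedelta(hours=9)):
    """Earth clock time (offset since midnight) at which a shift begins
    after `sols_elapsed` sols, if it always begins at the same Mars time.
    The default 9:00 start is a made-up example value."""
    drift = (SOL - EARTH_DAY) * sols_elapsed
    return (first_start + drift) % EARTH_DAY


# Number of sols until the shift is back at its original Earth clock time.
sols_until_wraparound = EARTH_DAY / (SOL - EARTH_DAY)

print(shift_start_on_earth_clock(3))   # start time after three sols
print(sols_until_wraparound)
```

With these rounded numbers a shift drifts two hours later on the Earth clock every three sols, and wanders once around the whole clock in 36 sols.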

Katherine Maher to step down from Wikimedia Foundation

Today Katherine Maher announced that she is stepping down as the CEO of the Wikimedia Foundation in April.

Thank you for everything!

Boole and Voynich and Everest

Did you know?

George Boole - after whom the Boolean data type and Boolean logic were named - was the father of Ethel Lilian Voynich - who wrote The Gadfly.

Her husband was Wilfrid Voynich - after whom the Voynich manuscript was named.

Ethel's mother and George Boole's wife was Mary Everest Boole - a self-taught mathematician who wrote educational books about mathematics. Her life is of interest to feminists as an example of how women made careers in an academic system that did not welcome them.

Mary Everest Boole's uncle was Sir George Everest - after whom Mount Everest is named.

And her daughter Lucy Everest was the first woman Fellow of the Royal Institute of Chemistry.

Geoffrey Hinton, great-great-grandson of George and Mary Everest Boole, received the Turing Award for his work on deep learning.

Abraham Taherivand to step down from Wikimedia Deutschland

Today Abraham Taherivand announced that he is stepping down as the CEO of Wikimedia Deutschland at the end of the year.

Thank you for everything!

Twenty years

On this day, twenty years ago, on January 15, 2001, I started my third Website, Nodix, and I have kept it up ever since (unlike my previous two Websites, which are lost to history, as the Internet Archive never captured them, it seems). A few years later I renamed it to Simia.

Here is the first entry: Willkommen auf der Webseite von Denny Vrandecic!

My Website never became particularly popular, although I was meticulously keeping track of how many hits I got and all of this. It was always a fun side project for which I had sometimes more and sometimes less time.

The funniest thing is that it was - and that was completely incidental - exactly the same day that another Website was started, which I, over the years, spent much more time on: Wikipedia.

Wikipedia changed my life, not only once, but many times.

It is how I met Kamara.

It is how I met a lot of other very smart people, too. It became part of my research work and my PhD thesis. It became the motivation for many of the projects I have started, be it Semantic MediaWiki, Wikidata, or Abstract Wikipedia. It is the reason for my career trajectory over the last fifteen years. It is hard to overstate how influential Wikipedia has been on my life.

It is hard to overstate how important Wikipedia has become for modern AI and for the Web of today. For smaller language communities. For many, many people looking for knowledge. And for the many people who realised that they can contribute to it too.

Thanks to the Wikipedia community, thanks to this marvellous project, and happy anniversary and many returns to Wikipedia!

Happy New Year 2021!

2020 was a challenging year, particularly due to the pandemic. Some things were very different, some things were dangerous, and the pandemic exposed the fault lines in many societies in a most tragic way around the world.

Let's hope that 2021 will be better in that respect, that we will have learned from how the events unfolded.

But I'm also amazed by how fast the vaccine was developed and made available to tens of millions.

I think there's some chance that the summer of '21 will become one to sing about for a generation.

Happy New Year 2021!

Keynote at SMWCon Fall 2020


I have the honor of being the invited keynote speaker for the SMWCon Fall 2020. I am going to give a talk titled "From Semantic MediaWiki to Abstract Wikipedia", discussing fifteen years of Semantic MediaWiki, how it all started, where we are now - crossing Freebase, DBpedia, Wikidata - and now leading to Wikifunctions and Abstract Wikipedia. But, more importantly, how Semantic MediaWiki, over all these years, still holds up and what its unique value is.

Page about the talk on the official conference site: https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2020/Keynote:_From_Semantic_Wikipedia_to_Abstract_Wikipedia

Site went down

The site went down, again. First time was in July, when Apache had issues, this time it's due to MySQL acting up and frying the database. I found a snapshot from July 2019, and am trying to recreate the entries from in between (thanks, Wayback Machine!)

Until then, at least the site is back up, even though there might be some losses in the content.

P.S.: it should all be back up. If something is missing, please email me.

Wikidata crossed Q100000000

Wikidata crossed Q100000000 (and, in fact, skipped it and got Q100000001 instead).

Here's a small post by Lydia Pintscher and me: https://diff.wikimedia.org/2020/10/06/wikidata-reaches-q100000000/

Mulan

I was surprised when Disney made the decision to sell Mulan on Disney+. So if you want to watch Mulan, you not only have to buy it, so far so good, but you have to join their subscription service first. The price for Mulan is $30 in the US, in addition to the monthly streaming fee of $7. So the $30 don't buy you Mulan, but allow you to watch it as long as you keep up your subscription.

Additionally, on December 4 the movie becomes free for everyone with a Disney+ subscription.

I thought, that's a weird pricing model. Who'd pay that much money for streaming the movie a few weeks earlier? I know, it will be very long weeks due to the world being so 2020, but still. Money is tight for many people. Also, the movie had very mixed reviews and a number of controversies attached to it.

According to the linked report, Disney really knows what they're doing. 30% of subscribers bought the early streaming privilege! Disney made hundreds of millions in extra profit within the first few days (money they really will be thankful for right now given their business with the cruise ships and theme parks and movies this year).

The most interesting part is how this will affect the movie industry. Compare to Tenet - which was reviewed much better and which was the hope to revive the moribund US cinema industry, but made less than $30M - money which also needs to be shared with the theaters, and which came with much higher distribution costs. Disney keeps a much larger share of the $30 for Mulan than Tenet makes for its production company.

The lesson that production companies experimenting with novel pricing models will draw from Mulan and Trolls 2, which also did much better than I would ever have predicted, could be disastrous for theaters.

I think we're going to see even more experimentation with pricing models. If the new Bond movie and/or the new Marvel movie should be pulled from cinemas, this might also be the end of cinemas as we know them.

I don't know how the industry will change, but the swing is from AMC to Netflix, with the producers being caught in between. The pandemic massively accelerated this transition, as it did so many others.

https://finance.yahoo.com/amphtml/news/nearly-onethird-of-us-households-purchased-mulan-on-disney-for-30-fee-data-221410961.html

Gödel's naturalization interview

When Gödel went to his naturalization interview, his good friend Einstein accompanied him as a witness. On the way, Gödel told Einstein about a gap in the US constitution that would allow the country to be turned into a dictatorship. Einstein told him to not mention it during the interview.

The judge they came to was the same judge who had already naturalized Einstein. The interview went well until the judge asked whether Gödel thought that the US could face the same fate and slip into a dictatorship, as Germany and Austria did. Einstein became alarmed, but Gödel started discussing the issue. The judge noticed, changed the topic quickly, and the process came to the desired outcome.

I wonder what it was that Gödel found, but that's lost to history.

Gödel and Leibniz

Gödel in his later years became obsessed with the idea that Leibniz had written a much more detailed version of the Characteristica Universalis, and that this version was intentionally censored and hidden by a conspiracy. In it, Gödel believed, Leibniz had discovered what he had hunted for his whole life, a way to calculate truth and end all disagreements.

I'm surprised that it was Gödel in particular who obsessed over this idea, because I'd think that someone with Leibniz' smarts would have benefitted tremendously from Gödel's proofs, and it might have been a helpful antidote to his own obsession with making truth a question of mathematics.

And wouldn't it seem likely to Gödel that even if there were such a Characteristica Universalis by Leibniz, that, if no one else before him, he, Gödel himself would have been the one to find the fatal bug in it?

Starting Abstract Wikipedia

I am very happy about the Board of the Wikimedia Foundation having approved the proposal for the multilingual Wikipedia aka Abstract Wikipedia aka Wikilambda aka we'll need to find a name for it.

In order to make that project a reality, I will as of next week join the Foundation. We will be starting with a small, exploratory team, which will allow us to have plenty of time to continue to socialize and discuss and refine the idea. Being able to work on this full time and with a team should allow us to make significant progress. I am very excited about that.

I am sad to leave Google. It was a great time, and I learned a lot about running *large* projects, and I met so many brilliant people, and I ... seriously, it was a great six and a half years, and I will very much miss it.

There is so much more I want to write but right now I am just super happy and super excited. Thanks everyone!

Lexical masks in JSON

We have released lexical masks as ShEx files before, schemata for lexicographic forms that can be used to validate whether the data is complete.

We saw that it was quite challenging to turn these ShEx files into forms for entering the data, such as Lucas Werkmeister’s Lexeme Forms. So we adapted our approach slightly to publish JSON files that keep the structures in an easier to parse and understand format, and to also provide a script that translates these JSON files into ShEx Entity Schemas.
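To give an idea of why a plain JSON structure is easier to work with, here is a minimal sketch of checking a lexeme against a mask. The JSON layout, the field names, and the `missing_forms` helper are all made up for illustration and are not the actual format of the published files; real masks would refer to grammatical features by their Wikidata items rather than by plain words.

```python
import json

# Hypothetical mask: every entry lists the grammatical features that one
# required form must carry. This is NOT the format of the published files.
MASK_JSON = """
{
  "language": "en",
  "part_of_speech": "noun",
  "required_forms": [
    ["singular"],
    ["plural"]
  ]
}
"""


def missing_forms(mask, lexeme_forms):
    """Return the feature sets required by the mask that no form provides."""
    present = {frozenset(form["features"]) for form in lexeme_forms}
    return [
        sorted(required)
        for required in mask["required_forms"]
        if frozenset(required) not in present
    ]


mask = json.loads(MASK_JSON)
# A lexeme that so far only has its singular form entered.
forms = [{"representation": "book", "features": ["singular"]}]
print(missing_forms(mask, forms))
```

The same list of required forms could just as well drive an input form with one field per entry, which is roughly the use case that was hard to serve from the ShEx files directly.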

Furthermore, we published more masks for more languages and parts of speech than before.

Full documentation can be found on wiki: https://www.wikidata.org/wiki/Wikidata:Lexical_Masks#Paper

Background can be found in the paper: https://www.aclweb.org/anthology/2020.lrec-1.372/

Thanks Bruno, Saran, and Daniel for your great work!