Part of the vision of the semantic web is that you can take data from two random sources, load it into a SPARQL store, and start writing queries. URI namespaces help, but if people use truly random identifiers it 'just works' and there is nothing paranoid about it.
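The "just load both and query" idea can be sketched with a toy triple store in plain Python (a hypothetical illustration, not SPARQL itself: the `triple_store` helper, the predicates, and the `urn:uuid:` subject are all made up for the example). The point is that when both sources identify an entity with the same globally unique opaque ID, merging is mere concatenation and a join across sources falls out for free.

```python
import uuid

def triple_store(*sources):
    """Union of (subject, predicate, object) triples -- merging is concatenation."""
    return [t for source in sources for t in source]

# Source A holds weather data; source B holds geo data. The shared
# opaque subject ID is what lets a query join across the two sources.
amsterdam = f"urn:uuid:{uuid.uuid4()}"    # collision-free opaque identifier
source_a = [(amsterdam, "temp_c", 12)]
source_b = [(amsterdam, "label", "Amsterdam"), (amsterdam, "country", "NL")]

store = triple_store(source_a, source_b)

# A SPARQL-ish join in plain Python: temperatures for things labelled "Amsterdam".
temps = {o2
         for (s, p, o) in store if p == "label" and o == "Amsterdam"
         for (s2, p2, o2) in store if s2 == s and p2 == "temp_c"}
print(temps)   # {12}
```

With slug-style local identifiers, the same merge would need a reconciliation step to make sure the two sources' "Amsterdam"s are really the same thing.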
Well, isn't Wikipedia one of the biggest SPARQL-queryable DBs out there? It uses slug-based IDs.
I don't oppose randomization, but my point is that time is entropy: use it as a prefix. And UUIDv7, which does use time, is still too long and ugly.
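A minimal sketch of the "time as a prefix" idea: a millisecond timestamp prefix (so IDs sort by creation time) plus a few random bytes, rendered in base36 to stay short and URL-friendly. The function name and field sizes here are illustrative assumptions, not any standard.

```python
import os
import time

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def base36(n: int) -> str:
    # Encode a non-negative integer in base36.
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

def short_time_id(random_bytes: int = 4) -> str:
    # Timestamp prefix: entropy that also makes IDs sort chronologically.
    ms = int(time.time() * 1000)
    suffix = int.from_bytes(os.urandom(random_bytes), "big")
    return f"{base36(ms)}-{base36(suffix)}"
```

At current timestamps this yields roughly 16 characters versus the 36 of a canonical UUID string, while keeping the time-sortable prefix that makes UUIDv7 attractive.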
WordPress powers roughly half the web and there are no UUIDs in its URLs... so in practice UUIDs just aren't aesthetically pleasing in URLs.
Postscript: and in fact LLMs seem better at fulfilling the Tim Berners-Lee and Bill Gates dreams of universal interop. They can just say "oh, here's the weather in Amsterdam according to this JSON" without any rigid interop ID or protocol.
... And be right 85% of the time.
Back in the 1990s there was that sordid episode when Microsoft Office used version-1 UUIDs, which embed a timestamp and the machine's MAC address, so any Office document could be traced back to the computer it was created on and when.
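The leak described above is easy to demonstrate with Python's standard `uuid` module: a version-1 UUID exposes its 48-bit node field (usually the real MAC address, though some systems substitute a random one) and a 60-bit timestamp counted in 100 ns intervals since the Gregorian calendar epoch of 1582-10-15.

```python
import uuid
from datetime import datetime, timedelta, timezone

u = uuid.uuid1()  # on most systems the node field is the real MAC address

# u.node is the 48-bit node/MAC; u.time counts 100 ns ticks since 1582-10-15.
mac = ":".join(f"{(u.node >> s) & 0xff:02x}" for s in range(40, -1, -8))
gregorian_epoch = datetime(1582, 10, 15, tzinfo=timezone.utc)
when = gregorian_epoch + timedelta(microseconds=u.time // 10)

print(mac, when.isoformat())  # who made it, and when
```

Both fields are recoverable from the identifier alone, which is exactly what made Office documents trackable.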
In my lost half-decade I was pursuing the dream of "Real Semantics", which, in retrospect, was a baby Cyc: not so much a master knowledge base as a system for maintaining knowledge bases for other systems, one that could hold several parallel knowledge bases at once. I read everything public about Cyc and also tried using DBpedia and Freebase as a prototype "master" database. Lenat strongly believed you'd get in trouble pretty quickly if you tried to make non-opaque identifiers, but Wikipedia has done a pretty good job of it for roughly 7M items, with the caveat that Wikipedia doesn't have a consistent level of modelling granularity (e.g. it is much easier for a video game to be notable than a book, some artists have all their major songs in Wikipedia while others don't, there is no such thing as a "Ford Thunderbird" but there is a "7th generation Ford Thunderbird", etc.)
Use UUIDv7 and be done with it. It solves your database indexing problem, too.
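For Pythons whose `uuid` module lacks a `uuid7()`, here is a rough sketch of the RFC 9562 layout: a 48-bit Unix-millisecond timestamp up front, then the version and variant bits, then random bits. Because the timestamp leads, freshly generated IDs are nearly sequential, so a B-tree primary-key index appends at its right-hand edge instead of splitting pages at random insertion points, which is the indexing win mentioned above.

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    # 48-bit millisecond timestamp in the top bits makes IDs time-sortable.
    ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (ms << 80) | rand
    value &= ~(0xF << 76)
    value |= 0x7 << 76      # version = 7
    value &= ~(0x3 << 62)
    value |= 0x2 << 62      # variant = RFC 4122/9562 ('10')
    return uuid.UUID(int=value)
```

Two IDs generated a moment apart compare in creation order as plain bytes, which is exactly the property a clustered index wants from a primary key.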