The Semantic Web has a branding problem: It was built to manage data, not semantics. Somewhere along the line, insiders renamed it the “Data Web”. That was a great move for Web researchers, but what will the semantics crowd do with the name? Just as “semantics” was misplaced in the Data Web, “web” is misplaced in our vision of a global semantic network. The Semantic Web won’t act like a web at all.
The reason is that form must follow function, and “web” is the wrong form for semantics. Do you remember why you stopped using the Yahoo Directory and switched to Google? Both provide lists of Web pages organized by category. The difference is that search engines involve you in the creation of those categories through your queries. Once search engines became comparable to the directories at assembling relevant lists, there was no going back. The form of a directory, as a largely static structure, is incompatible with the function of search.
The same holds for semantics. The Data Web, a “giant, global graph”, implies a persistent data structure. Most semantic data cannot be represented this way. Semantic data is highly personal. Like a search engine query, it doesn’t exist without context provided by consumers. As a result, the difference in quality and scale between persistent data and semantic data is as dramatic as the difference between a directory and a search engine. One is bounded; the other is not.
Data structures can provide a historical snapshot of semantics. But even if we wanted to tombstone semantics like a directory, how would we encode such an immense data structure? Our industry is aware of the scalability challenges of globally linked data, but we’ve yet to come to terms with the scale of semantic data. For a very rough estimate, multiply all the content that’s available on the Internet by every individual consumer, and then by every individual perspective each consumer might have on that content. Semantics is data at a scale that will dwarf the Web of today.
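The rough estimate above is a single multiplication, and a few lines of Python make its shape concrete. This is only a sketch: the function name and every magnitude below are placeholders invented for illustration, not measurements of the Internet or of anyone's audience.

```python
def semantic_scale(content_items: int, consumers: int, perspectives_per_consumer: int) -> int:
    """Rough count of potential semantic records: one per
    (content item, consumer, perspective) combination."""
    return content_items * consumers * perspectives_per_consumer


# Hypothetical magnitudes, chosen only to show how the factors compound.
content_items = 10**9
consumers = 10**9
perspectives_per_consumer = 10

estimate = semantic_scale(content_items, consumers, perspectives_per_consumer)

# How many times larger the semantic data is than the content it describes.
growth_factor = estimate // content_items

print(f"potential semantic records: {estimate:.1e}")
print(f"growth over raw content:    {growth_factor:.1e}x")
```

Whatever numbers you plug in, the growth factor is the number of consumers times the perspectives each one holds, so the result outgrows the underlying content by many orders of magnitude.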
So as not to raise the ire of the Semantic (er, Data) Web community, let me say I’m a fan. As a standard for data exchange, it’s brilliant. We use it here at Primal and will only extend our use of it over time. But semantics requires more fluid and probabilistic models. We need a different organizing basis for it, one that includes both the data structures and the dynamic processes that generate them, and that, most notably, makes room for personal perspective.
Perhaps the Semantic Web isn’t so much a misnomer as a composite of two terms: Semantic Engines and the Data Web. Semantic Engines will provide a contextual and ever-changing flow of semantic data onto the network. This data may be encoded using Data Web standards, but I don’t think consumers will experience it as a web. It’ll be in a state of constant churn, fuzzy and calculated. As with Google, we’ll be very aware of the presence and value of these services, but we won’t see them as a static network of data to traverse. A truly Semantic Web will emerge around each consumer.