The Semantic Web (“Web 3.0”) is coming in one form or another, and perhaps sooner rather than later. But there are still some major issues that need to be resolved before it’s a practical reality.
The concept of Linked Data (a.k.a. the Semantic Web, or Web 3.0) seems to be picking up momentum. One sign that the Semantic Web is percolating up to the mainstream is a brief but engaging recent article, published in a heating and air conditioning journal, on the importance of the Semantic Web in marketing. (This article provides a very understandable, easy-to-scan overview of how the concept applies to the real world: in this case, marketing applications.)
In fact, the concept sounds almost like “repository-less MDM,” in which all forms of data are linked together in a meaningful manner. Just by asking the question or making the query, we get back precisely the information we need, in an immediately consumable format, rather than a collection of documents and Web pages that require further gleaning and distilling. MDM consultant Dan Power, in an excellent overview of semantic technologies and product MDM, says it “sounds a little bit like science fiction.”
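To make the idea of "asking the question and getting back precisely the information we need" concrete, here is a minimal sketch in plain Python. It models linked data as subject-predicate-object triples, the general shape that RDF uses, and answers a question by following links between records. The product and supplier identifiers, the predicate names, and the helper functions `match` and `products_made_in` are all hypothetical, invented for illustration; a real deployment would more likely use an RDF store and a query language such as SPARQL.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts, as if published by different data sources.
TRIPLES: List[Triple] = [
    ("product:1001", "hasName", "Furnace filter"),
    ("product:1001", "madeBy", "supplier:acme"),
    ("supplier:acme", "locatedIn", "Ohio"),
    ("product:1002", "hasName", "Thermostat"),
    ("product:1002", "madeBy", "supplier:bolt"),
    ("supplier:bolt", "locatedIn", "Texas"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def products_made_in(triples: List[Triple], place: str) -> List[str]:
    """Names of products whose supplier is located in `place`."""
    # Step 1: find suppliers located in the requested place.
    suppliers = {subj for subj, _, _ in match(triples, p="locatedIn", o=place)}
    names: List[str] = []
    # Step 2: follow the madeBy link from each product to its supplier.
    for product, _, supplier in match(triples, p="madeBy"):
        if supplier in suppliers:
            # Step 3: follow the hasName link to get a readable answer.
            names.extend(obj for _, _, obj in match(triples, s=product, p="hasName"))
    return names


print(products_made_in(TRIPLES, "Ohio"))  # prints ['Furnace filter']
```

The answer comes back as a product name rather than a list of documents to read through, because the query follows the links between records instead of searching text.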
This concept has huge implications for the master data management world, especially applied MDM such as business intelligence, business process management, and multichannel publishing. However, some of the concerns arising from the Semantic Web appear to be the same ones we face in the current practice of MDM. For example:
Security — We’re already concerned about gaps in Web 2.0 security. There would be even more concerns about security in a Web 3.0 environment where information is linked at the data level rather than the document level. In fact, essays and journals have been discussing this topic for much of the past decade, focusing on security schemas rooted in the Resource Description Framework (RDF) schema. However, “The security, trust, information quality and privacy issues arising from the vision of the Semantic Web as a global information integration infrastructure are mainly unsolved.”
Reliability — Tim Berners-Lee is the inventor of the original World Wide Web and author of “Semantic Web” and “Linked Data,” the urtexts behind those concepts. In a TED talk on Linked Data, he cites “crowd-sourcing” as an example and a benefit of the Semantic Web (“You do your bit, everybody else does their bit”), using as an example his having labeled the lecture hall where he’s speaking on Open Street Map. Assuming this really works as an example of Linked Data: what if he has mischievously or maliciously mis-labeled that location? That bad data would then infect every other application that consumes it. We will need some kind of protocol that assures the reliability of the Linked Data.
Metadata – More IT professionals are growing aware of the vital importance of metadata to sound data management practices. In the Semantic Web world, metadata becomes even more critical as the means of establishing protocols and vocabularies among data sources and the machines that host them. In fact, in Web 3.0, “metadata provides the connections as well as the descriptions of content.” Ensuring that all data are properly attributed and the metadata properly stored, maintained, and integrated becomes essential – and a further challenge to MDM practitioners.
Permissions and accessibility — Berners-Lee also says that we need to “demand raw data now” to help make this scenario possible. But data owners and stewards, many of whom already resist the notion of making their data accessible, will surely resist making it available in a linked, semantic way, particularly given the extra burdens this brings (e.g., more elaborate crosswalking of metadata). Some of those concerns will be rooted in passive-aggressive corporate politics, as they’ve always been. But given the data integration, metadata management, and security issues involved, the data guardians might have better grounds for resisting the idea of sharing their data with other users and other applications.
Completeness — One hundred years ago, the best information was that which used the best data available at the time. Obviously it was harder to collect the data then. But the Semantic Web will only assure the ease of acquiring data in a meaningful manner. It doesn’t necessarily assure us that all the data we need is linked on the Semantic Web. (Hans Rosling’s compelling TED Talk on the developing world becomes an inspiring cry for data owners and policymakers in business and government to meet the demand for “raw data now” and for technologists to develop new tools to present this data in meaningful ways.)
Data quality and governance — If we think it’s hard to enforce data quality standards now, wait until the entire population of Web users has become a team of data-entry clerks, each making up their own nomenclature, taxonomies, and other standards as they go. That implies the need for a global data governance protocol that will enforce standards for data quality and completeness in real time. It’s hard enough doing that with enterprises and their supply chain partners, much less the entire population of the Web. Greg Satell explains the concept of ontologies and how they will be enforced through the use of RDF. In explaining these concepts, he builds up the exciting potential that the Semantic Web holds if these and other challenges can be overcome:
The possibilities are exciting and applications are already being rolled out. Advertiser’s data about brands can be matched with media data about consumers. Data about poverty and hunger stored in computers around the world can be combined and analyzed. Through combining databases, we will be more likely to identify problems and find solutions.
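The reliability, metadata, and governance concerns listed above can be illustrated with a small sketch, again in plain Python. Each statement carries provenance metadata (who asserted it), and a simple review step accepts only statements that use an agreed vocabulary and come from a trusted source. Everything here is an assumption made for illustration: the `Statement` type, the predicate names, the source names, and the `review` function are hypothetical, and a real governance protocol would need far more than an allow-list.

```python
from typing import List, NamedTuple, Tuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # provenance metadata: who asserted this statement


# Hypothetical agreed vocabulary and trusted publishers.
ALLOWED_PREDICATES = {"hasName", "locatedIn"}
TRUSTED_SOURCES = {"registry.example", "survey.example"}


def review(statements: List[Statement]
           ) -> Tuple[List[Statement], List[Tuple[Statement, str]]]:
    """Split statements into accepted ones and (statement, reason) rejections."""
    accepted: List[Statement] = []
    rejected: List[Tuple[Statement, str]] = []
    for st in statements:
        if st.predicate not in ALLOWED_PREDICATES:
            # The contributor made up a term outside the shared vocabulary.
            rejected.append((st, "unknown predicate"))
        elif st.source not in TRUSTED_SOURCES:
            # The term is fine, but nothing vouches for the contributor.
            rejected.append((st, "untrusted source"))
        else:
            accepted.append(st)
    return accepted, rejected


incoming = [
    Statement("place:hall", "hasName", "Lecture hall", "registry.example"),
    Statement("place:hall", "hasName", "Not a lecture hall", "anonymous.example"),
    Statement("place:hall", "nickname", "The big room", "survey.example"),
]

accepted, rejected = review(incoming)
print(len(accepted), "accepted")
for st, reason in rejected:
    print("rejected:", st.obj, "-", reason)
```

In this sketch the mislabeled location from an unknown contributor is held back before any downstream application consumes it, and the made-up `nickname` term is flagged as outside the shared vocabulary. Deciding who maintains the vocabulary and the list of trusted sources is exactly the governance question that remains open.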
If industrial journals consider “marketing and the Semantic Web” to be a relevant topic for their readership, then the Semantic Web is certainly coming — in one form or another, and perhaps sooner rather than later. But there still seem to be major issues that need to be resolved before it’s a practical reality. MDM evangelists have been addressing these same issues for some time, and will have much to contribute to the work of bringing the World Wide Web 3.0 online.