Wednesday, July 19, 2006

Good point: disinformation and the Semantic Web

I wish that I had gone to the AAAI conference this year. I am keenly interested in the application of AI techniques to the Semantic Web, and Tim Berners-Lee gave a keynote speech largely on the Semantic Web.

After Berners-Lee's talk, Peter Norvig used the question period to pose the problem of people publishing fake data, in much the same way that they try to cheat to increase the page ranking of their web sites. I had not thought of that problem, and it is a tough one to deal with: what happens to trust mechanisms when some people actively try to fake the metadata on their web sites?

While I was walking to lunch with Norvig a few years ago, I brought up a related problem: assume that, for narrow domains of discourse (e.g., political news or financial news), you could largely automate the creation of RDF from natural language text on web sites. I personally believe that this is achievable right now, with a lot of effort. The problem that I posed at lunch was (besides the technical challenges of dealing with potentially trillions of RDF triples) how to deal with lots of conflicting information while factoring in different levels of trust.
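
To make the conflicting-information problem a little more concrete, here is a minimal Python sketch of one naive approach: sum a per-source trust weight behind each competing value and keep the best-supported one. Everything in it is an assumption for illustration only: the function name `resolve_conflicts`, the example sources and triples, the trust weights, and the simplification that each (subject, predicate) pair has exactly one true object. It is not a proposal for how a real system should work, and it does nothing about sources that collude to inflate their combined weight, which is exactly the problem Norvig raised.

```python
from collections import defaultdict


def resolve_conflicts(claims, trust):
    """For each (subject, predicate), pick the object with the highest total trust.

    claims: iterable of (source, subject, predicate, object) tuples.
    trust:  dict mapping source -> weight; unknown sources count as 0.
    Assumes each (subject, predicate) pair has a single correct object.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for source, subj, pred, obj in claims:
        # Each claim contributes its source's trust weight to that object.
        scores[(subj, pred)][obj] += trust.get(source, 0.0)

    resolved = {}
    for key, candidates in scores.items():
        best = max(candidates, key=candidates.get)
        resolved[key] = (best, candidates[best])
    return resolved


# Hypothetical data: two fairly trusted sources agree, one untrusted source disagrees.
claims = [
    ("news-a.example", "AcmeCorp", "hasCEO", "Alice"),
    ("news-b.example", "AcmeCorp", "hasCEO", "Alice"),
    ("spam.example", "AcmeCorp", "hasCEO", "Mallory"),
]
trust = {"news-a.example": 0.9, "news-b.example": 0.7, "spam.example": 0.1}

resolved = resolve_conflicts(claims, trust)
print(resolved[("AcmeCorp", "hasCEO")][0])
```

The hard part, of course, is where the trust weights come from in the first place, and how to keep them from being gamed.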
