soma

latest entries from
Piotr's R&D blog

Extracting meaning from the web

Saturday, January 10, 2004, 12:20AM - category Semantic Web

As a follow-up to an earlier blurb, I note (thanks to John Battelle's Searchblog) that IBM is working on WebFountain. This huge system seems to consist of three parts: an information crawler that covers the web, news, IRC, and other sources; automated annotators that mark up the retrieved documents with XML tags specific to their domains of expertise; and analysis engines that pull interesting results out of the annotated information. (My summary is based on the IEEE Spectrum article.) If this project is even mildly successful, then the semantic web as proposed by the W3C is as good as dead. In a decade (or less), this kind of computing power will be on everybody's desktop, and companies will specialize in selling annotator results. And it'll all be done in XML, since the cost of entry for RDF is just too high...
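
To make the three-part shape concrete, here is a toy sketch in Python of a crawl, annotate, analyze pipeline. It is only an illustration of the idea as summarized above and has nothing to do with how WebFountain is actually built: the function names (crawl, annotate, analyze), the org tag, and the hard-coded documents are all made up for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the crawler: returns a fixed list of
# (document id, text) pairs instead of fetching anything.
def crawl():
    return [
        ("doc1", "IBM is working on WebFountain."),
        ("doc2", "The W3C proposed the semantic web."),
    ]

# Hypothetical annotator with one narrow "domain of expertise":
# it wraps known organization names in an <org> tag.
ORGS = ("IBM", "W3C")
ORG_PATTERN = re.compile("(" + "|".join(map(re.escape, ORGS)) + ")")

def annotate(doc_id, text):
    root = ET.Element("document", id=doc_id)
    # Splitting on a pattern with one capturing group yields
    # [plain, match, plain, match, ..., plain].
    parts = ORG_PATTERN.split(text)
    root.text = parts[0]
    for i in range(1, len(parts), 2):
        org = ET.SubElement(root, "org")
        org.text = parts[i]
        org.tail = parts[i + 1]
    return root

# Hypothetical analysis engine: counts how often each organization
# was tagged across all annotated documents.
def analyze(annotated_docs):
    counts = {}
    for doc in annotated_docs:
        for org in doc.iter("org"):
            counts[org.text] = counts.get(org.text, 0) + 1
    return counts

annotated = [annotate(doc_id, text) for doc_id, text in crawl()]
result = analyze(annotated)
```

The point of the sketch is that the analysis step never looks at raw text, only at the tags the annotator left behind. Plain XML markup like this is the low cost of entry I have in mind.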

