The goal of LivingKnowledge is to bring a new quality to search and knowledge management technology, yielding more concise, complete and contextualised search results.
Posted February 16, 2010
Testbed
The LivingKnowledge testbed will provide all the infrastructure necessary to manage, retrieve and exploit facts and opinions, traceable along several dimensions: content, time and bias.
The testbed will
- provide us with valuable requirements to guide our research on diversity-enabled retrieval and interpretation of information
- enable us to thoroughly test, integrate and evaluate all technologies and algorithms designed and developed in other work packages (WPs)
- support the wider research community in testing new approaches, by providing an infrastructure as well as an extensible set of algorithms and test collections, thus fostering research collaboration beyond the consortium as well.
To facilitate these applications, Yahoo! will set up an infrastructure that will provide the following three major components:
1. Indexing: a novel indexing framework which will model the three dimensions of our opinion/fact database. The first dimension, content, will be modelled using state-of-the-art information retrieval methods capable of indexing the lexical content of events. Note that by lexical content we do not mean just terms: we will also index entire events, defined as semantic frames, i.e., who did what to whom, when and where. Second, we will index events along the temporal dimension: for each event we will store the period during which the event was observed or reported. Lastly, we will attribute a confidence value to each opinion or fact stored, depending on occurrence, source, and link structure, among other factors.
2. Search: efficient search and ranking mechanisms which can exploit this multi-dimensional opinion/fact space. Queries will combine content and temporal constraints, as in the query “gas prices AFTER 2100”. The results will be ranked based not only on the degree to which they satisfy the query constraints but also on the confidence attached to the corresponding opinion or fact.
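To make the two components above more concrete, the following is a minimal sketch in Python, not the testbed's actual implementation. All names (EventRecord, content_score, search) are hypothetical. It rests on several assumptions that the description above does not fix: the content score is a simple term-overlap ratio standing in for a real retrieval model, an AFTER constraint keeps events whose observation period ends after the cutoff date, and the final rank is the product of content score and confidence.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical index entry: a semantic frame plus time span and confidence."""
    who: str
    what: str
    whom: str
    where: str
    text: str             # lexical content the frame was extracted from
    observed_from: date   # start of the period the event was observed/reported
    observed_to: date     # end of that period
    confidence: float     # assumed to lie in [0, 1]


def content_score(query_terms, record):
    """Toy content match: fraction of query terms found in the record text."""
    if not query_terms:
        return 0.0
    tokens = set(record.text.lower().split())
    hits = sum(1 for term in query_terms if term.lower() in tokens)
    return hits / len(query_terms)


def search(records, query_terms, after=None):
    """Return (score, record) pairs, best first.

    Assumption: 'AFTER d' keeps records whose observation period ends after d.
    Assumption: score = content match * confidence.
    """
    results = []
    for record in records:
        if after is not None and record.observed_to <= after:
            continue  # fails the temporal constraint
        match = content_score(query_terms, record)
        if match == 0.0:
            continue  # no content overlap with the query
        results.append((match * record.confidence, record))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results
```

A query such as "gas prices AFTER <date>" would then be expressed as search(records, ["gas", "prices"], after=some_date), where some_date is a datetime.date. A multiplicative combination is only one option; a weighted sum or a learned ranking function would fit the same interface.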
Yahoo! Research and the European Archive Foundation (EA) will provide the document collections for this testbed. We will use the crawls dating back to 2003-2004, stored by the EA, to train and retrospectively test our ideas. We will use collections of current news documents from Yahoo! News (http://news.yahoo.com) to evaluate our approach in a commercial setting with present-day information. Yahoo! will also provide appropriate crawls of user-generated data, including blogs and forums, which we will use, together with the news collections, to support our media research analysis application.