September 2014 ⋅ Part of: QA Track at CLEF 2014  

Question Answering over Linked Data (QALD-4)

 

QALD-4 is the fourth in a series of evaluation campaigns on multilingual question answering over linked data, this time with a strong emphasis on interlinked datasets and hybrid approaches using information from both structured and unstructured data. QALD-4 is part of the Question Answering lab at CLEF 2014.

 

Motivation

While more and more structured data is published on the web, the question of how typical web users can access this body of knowledge becomes of crucial importance. Over the past years, there has been a growing amount of research on interaction paradigms that allow end users to profit from the expressive power of Semantic Web standards while at the same time hiding their complexity behind an intuitive and easy-to-use interface. Natural language interfaces in particular have received wide attention, as they allow users to express arbitrarily complex information needs in an intuitive fashion and, at least in principle, in their own language. Multilingualism has, in fact, become an issue of major interest for the Semantic Web community, as both the number of actors creating and publishing data in languages other than English and the number of users who access this data and speak native languages other than English are growing substantially.

The key challenge is to translate the users' information needs into a form such that they can be evaluated using standard Semantic Web query processing and inferencing techniques. Over the past years, a range of approaches has been developed to address this challenge, showing significant advances towards answering natural language questions with respect to large, heterogeneous sets of structured data. However, only a few systems so far address the fact that the structured data available nowadays is distributed among a large collection of interconnected datasets, and that answers to questions can often only be provided if information from several sources is combined. In addition, a lot of information is still available only in textual form, both on the web and in the form of labels and abstracts in linked data sources. Therefore, approaches are needed that can deal not only with the specific character of structured data but also with finding information in several sources, processing both structured and unstructured information, and combining the gathered information into one answer.