SMS-based FAQ Retrieval

FIRE 2011 Task


Joint Task Coordinators: COER and IBM Research

The number of internet users in the world is estimated at around 2 billion. India, with nearly 17% of the world's population, accounts for merely 81 million of these users (6.9% of its population). On the other hand, the number of telecom service users in India, specifically mobile phone users, is nearly 10 times larger than the number of internet users. The mobile phone is a cheap and easy-to-use communication device and is increasingly being used as a source of information. With this in mind, FIRE 2011 has included an "SMS-based FAQ Retrieval" task. The goal of this task is to find the question Q* from a corpus of FAQs (frequently asked questions) that best matches the SMS query S.

SMS queries are written in "SMS language" and tend to be noisy: users compress text by omitting letters, using slang, and so on, because of the cap on message length (one SMS holds 160 characters) and the lack of screen space (which makes reading large amounts of text difficult). The messages also frequently contain unintended typographical errors, owing to the small keypads on mobile phones and to users' limited language skills. This noise makes the task different from, and more challenging than, traditional QA retrieval tasks.
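To make the two kinds of noise concrete, here is a minimal Python sketch. It is an illustration only, not part of the task definition or any official baseline; the function name is_subsequence and the example tokens are assumptions chosen for this example. It checks whether an SMS token could have been produced purely by dropping letters from a full word.

```python
def is_subsequence(sms_token: str, word: str) -> bool:
    """Return True if the characters of sms_token occur in word in the same order.

    This models only one kind of SMS noise: letters being omitted.
    """
    remaining = iter(word.lower())
    # "ch in remaining" advances the iterator until ch is found,
    # so each character must appear after the previous match.
    return all(ch in remaining for ch in sms_token.lower())


# Hypothetical examples of compressed SMS tokens.
dropped_letters = is_subsequence("tmrw", "tomorrow")  # letters omitted only
slang_spelling = is_subsequence("plz", "please")      # "z" replaces "se"
```

Under this check, "tmrw" is explained by letter omission alone, while "plz" is not, because slang spellings substitute characters as well as dropping them. A practical system would therefore need a more tolerant similarity measure than plain subsequence matching.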

The SMS retrieval task consists of three sub-tasks.

Participants can submit results for one or more of the sub-tasks listed below.

Task details

Task 1: Mono-Lingual FAQ Retrieval

In this sub-task, the SMS query and the FAQ corpus are in the same language. The goal is to find the best matching question Q* from a mono-lingual collection of FAQs Q.

For each SMS query, a ranked list of at most 5 best-matching questions must be returned, along with their scores.
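The expected output can be sketched in Python as follows. This is a hedged illustration, not the organizers' method: the function name rank_faqs, the FAQ identifiers, and the choice of character-level similarity from the standard library's difflib are all assumptions made for this example. The task itself only requires a ranked list of at most 5 questions with scores.

```python
from difflib import SequenceMatcher


def rank_faqs(sms_query, faq_questions, max_results=5):
    """Rank FAQ questions against one SMS query.

    faq_questions maps a (hypothetical) FAQ id to its question text.
    Returns up to max_results (faq_id, score) pairs, best match first.
    """
    scored = []
    for faq_id, question in faq_questions.items():
        # Character-level similarity in [0, 1]; a stand-in for a real
        # noise-tolerant matching model.
        matcher = SequenceMatcher(None, sms_query.lower(), question.lower())
        scored.append((faq_id, matcher.ratio()))
    # Highest score first, then keep only the top entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_results]
```

Character-level similarity is used here only because it tolerates dropped letters to some degree without any training data; any scoring function that yields comparable scores per query could be substituted.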

Task 2: Cross-lingual FAQ Retrieval

In this sub-task, the SMS query and the FAQ corpus are in different languages. That is, if the SMS query is in language L1, then the FAQ corpus will be in a language other than L1.

Thus, the goal in this task is to find the best matching question Q* from the set of FAQs in language L2, while the SMS query is in a different language L1.

Data is provided for English SMS queries matched against Hindi FAQs.

For each SMS query, a ranked list of at most 5 best-matching questions must be returned, along with their scores.

Task 3: Multi-lingual FAQ Retrieval

In this sub-task, the SMS queries can be in multiple languages and can match FAQ collections in multiple languages. For example, SMS queries could be written in English, Hindi, or Malayalam, and these queries could match FAQ collections in English, Hindi, or Malayalam.

For each SMS query, a ranked list of at most 5 best-matching questions must be returned, along with their scores. Note: results from the different FAQ collections should be merged and ranked into a single result list.
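One possible way to merge per-collection results into a single ranking is sketched below in Python. The task description does not prescribe a merging method; the function name merge_rankings, the collection labels, and the use of min-max normalization to make scores comparable across collections are assumptions for illustration only.

```python
def merge_rankings(per_collection, max_results=5):
    """Merge ranked lists from several FAQ collections into one list.

    per_collection maps a collection label (e.g. a language name) to a
    list of (faq_id, score) pairs. Scores are min-max normalized within
    each collection so that they can be compared across collections.
    Returns up to max_results (collection, faq_id, normalized_score)
    tuples, best first.
    """
    merged = []
    for collection, ranked in per_collection.items():
        if not ranked:
            continue
        scores = [score for _, score in ranked]
        lowest, highest = min(scores), max(scores)
        span = highest - lowest
        for faq_id, score in ranked:
            # If all scores in a collection are equal, treat them as 1.0.
            normalized = (score - lowest) / span if span > 0 else 1.0
            merged.append((collection, faq_id, normalized))
    merged.sort(key=lambda item: item[2], reverse=True)
    return merged[:max_results]
```

Min-max normalization forces the top result of every collection to the same score, which may not be desirable when one collection clearly contains no good match; it is shown here only as the simplest way to put differently scaled scores on a common footing.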

To register and participate, please send an email to fire2011smstask@gmail.com.



