Multimedia Document Retrieval
Query Expansion

Query expansion means adding words to a query to increase its information content. For instance, for the query "Tell me about the Northern Ireland peace process" the retriever will find documents that contain the words "Northern Ireland" and "peace process", but it will not find documents that only mention the I.R.A. and decommissioning. To retrieve these documents as well, it can be useful to expand the original query with the words "I.R.A." and "decommissioning". The MDR demo system supports two methods of query expansion. As mentioned before, these methods can be activated by choosing one or both of the radio buttons next to the "Expand Query" button at the top of the results page.

Relevance Feedback

As pointed out in the section Presentation of Search Results, the user can mark documents found by the retriever as relevant or not relevant by clicking the "Relevant?/Relevant!" toggle button next to the document extract. If the user chooses to expand the query based on the relevant documents, a list is created of words that are specific to the relevant documents and distinguish them from the non-relevant ones. An example of such a list can be found below. Here several of the documents that were returned for the query "Which films were nominated for Oscars this year?" have been marked as relevant. The list of words in this example almost answers the question: it contains the word "cider" from the title "The Cider House Rules", the words "angela" and "ash" for "Angela's Ashes", and the name "Ripley" for "The Talented Mister Ripley". The latter name appears in the list in its stemmed form, "riplei". (For further information about word stemming please refer to the third paragraph of the section Presentation of Search Results.)

A user who does not know the titles of the nominated films cannot know that these words are worth selecting. Even then it still makes sense to add more general words to the query, such as "academy", "award", "director", "Hollywood" or "movie", which appears in the list as the word stem "movi". Once the words to be added have been selected, the user can press the "Search again" button on this page, which repeats the search with all the new words added to the original query. This process can be repeated several times until the results are satisfactory.
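
The idea of picking words that are specific to the relevant documents can be illustrated with a minimal Python sketch. It does not describe how the MDR system actually scores words. As an assumption made only for illustration, each word is scored by the fraction of relevant documents that contain it minus the fraction of non-relevant documents that contain it. The function names are hypothetical, and no stemming is applied, so "movie" would stay "movie" rather than "movi".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def suggest_expansion_terms(relevant, non_relevant, query, top_n=10):
    """Suggest words that occur in relevant documents more often than in
    non-relevant ones. The scoring rule is an illustrative assumption."""
    query_words = set(tokenize(query))

    # Count in how many documents of each group a word occurs.
    rel_df = Counter()
    for doc in relevant:
        rel_df.update(set(tokenize(doc)))
    non_df = Counter()
    for doc in non_relevant:
        non_df.update(set(tokenize(doc)))

    scores = {}
    for word, count in rel_df.items():
        if word in query_words:
            continue  # the word is already part of the query
        score = count / len(relevant) - non_df[word] / max(len(non_relevant), 1)
        if score > 0:
            scores[word] = score

    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
```

Called with a handful of documents marked relevant and the remaining ones as non-relevant, the function returns a ranked list of candidate words. The user would still choose which of them to add before searching again.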

In the demo snapshot several documents have been pre-selected as relevant for each query. Following the link "Expand query from relevant documents" at the top of the results page brings you to the list of words that have been suggested as additional query words based on the selection of relevant documents. It is not possible in the demo snapshot to select another set of relevant documents. The "Relevant?/Relevant!" toggle buttons have been disabled.

Semantic Correspondence

Another way to expand a query is directly from the query words. In this case prior information about which words are topically related is exploited. One can assume, for example, that somebody who wants to find documents about the UK is also interested in documents on Scotland, England, Wales and Northern Ireland. Similarly, documents about London, Edinburgh, etc. might be interesting. This example shows that there is a hierarchical structure of concepts in which concepts on a higher level group together concepts on a more detailed level. It is this hierarchical structure that is used when a query is expanded directly from the query words. In this example, the list of words suggested for expanding the original query would be "Scotland, England, Wales, Northern Ireland, London,...".
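
A small Python sketch can make the use of such a hierarchy concrete. The hierarchy below is a hand-written toy example built from the places named above, and the function name and the `max_depth` parameter are hypothetical. It is not the concept hierarchy or the algorithm used by the MDR system.

```python
# Toy concept hierarchy: each concept maps to its more detailed concepts.
# The entries are illustrative assumptions, not MDR data.
HIERARCHY = {
    "uk": ["scotland", "england", "wales", "northern ireland"],
    "scotland": ["edinburgh"],
    "england": ["london"],
}


def expand_from_query_words(query_words, hierarchy, max_depth=2):
    """Collect concepts found below the query words in the hierarchy,
    descending at most max_depth levels."""
    seen = {w.lower() for w in query_words}
    frontier = [w.lower() for w in query_words]
    suggestions = []
    for _ in range(max_depth):
        next_frontier = []
        for concept in frontier:
            for child in hierarchy.get(concept, []):
                if child not in seen:  # suggest each word only once
                    seen.add(child)
                    suggestions.append(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return suggestions
```

For the query word "UK" this sketch suggests the four countries first and the two cities second. Limiting the depth keeps the suggestions close to the original query words.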

In the demo snapshot you can follow the link "Expand query from query words" at the top of the results page to see the list of additional words that are suggested for the current query. Where no words were suggested, this link has been disabled.

