Date 27-9-2010
Time 10:00
Room/Location DISI, Sala Conferenze (conference room), 3rd floor
Title Determining the Spatial Reader-scope of News Sources with Local Lexicons
Speaker Dott. Gianluca Quercini
Affiliation Faculty Research Assistant at the Institute for Advanced Computer Studies, University of Maryland
Abstract Information sources on the Internet (e.g., web versions of newspapers) usually have an implicit spatial reader-scope, termed the audience location, which is the geographical location for which the content has been primarily produced. Knowledge of the spatial reader-scope facilitates the construction of a news search engine that provides readers with a set of news sources relevant to the location in which they are interested. In particular, it plays an important role in disambiguating toponyms (i.e., textual specifications of geographical locations) in news articles, as the interpretation chosen for a toponym often reduces to the one that seems natural to those familiar with the audience location. The key to determining the spatial reader-scope of news sources is the notion of a local lexicon, which for a location s is a set of concepts, such as, but not limited to, names of people, landmarks, and historical events, that are spatially related to s. Techniques to automatically generate the local lexicon of a location by using the link structure of Wikipedia are described and evaluated. A key contribution is the improvement of existing methods from the semantic relatedness domain so that they extract concepts spatially related to a given location from Wikipedia. Experimental results are presented indicating that knowledge of the audience location significantly improves the disambiguation of textually specified locations in news articles, and that the local lexicon is an effective means of determining the spatial reader-scope of a news source.
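As a rough illustration of the local-lexicon idea only, and not the speaker's actual method, the sketch below scores candidate locations by counting how many of their lexicon concepts are mentioned in a source's articles. The function names, the hand-made toy lexicons, and the plain substring matching are all assumptions made for this example; the abstract says the real lexicons are generated from the link structure of Wikipedia, which is not reproduced here.

```python
from collections import Counter


def score_locations(articles, local_lexicons):
    """Count, per candidate location, the lexicon concepts mentioned in the articles.

    Assumption: a concept counts once per article in which its name occurs as a
    case-insensitive substring. This is a crude stand-in for real concept matching.
    """
    scores = Counter()
    for text in articles:
        lowered = text.lower()
        for location, lexicon in local_lexicons.items():
            scores[location] += sum(
                1 for concept in lexicon if concept.lower() in lowered
            )
    return scores


def guess_audience_location(articles, local_lexicons):
    """Return the best-scoring location, or None if no concept is mentioned."""
    scores = score_locations(articles, local_lexicons)
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)


# Hypothetical, hand-made lexicons for two locations (toy data, not from the talk).
toy_lexicons = {
    "College Park, MD": {"Terrapins", "Testudo", "McKeldin Library"},
    "Genoa": {"Lanterna", "Sampdoria", "Porto Antico"},
}

# Hypothetical article snippets from a single news source.
toy_articles = [
    "The Terrapins won at home; fans gathered near McKeldin Library.",
    "The Testudo statue was repaired over the weekend.",
]

print(guess_audience_location(toy_articles, toy_lexicons))
```

In this toy setting a source is assigned to the location whose lexicon it mentions most often. A realistic system would need weighting and proper entity matching, which this sketch deliberately omits.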
