Home Page Università di Genova

Seminar Details

Date 6-5-2011
Time 14:30
Room/Location DISI-Sala Conferenze III piano
Speaker Prof. Hanan Samet
Affiliation Department of Computer Science, University of Maryland, College Park, MD 20742, e-mail: hjs@cs.umd.edu
Link http://www.cs.umd.edu/
Abstract PLACE INFORMATION SYSTEMS: TEXTUAL LOCATION IDENTIFICATION AND VISUALIZATION (An ACM Distinguished Speaker Program Lecture)
The popularity of web-based mapping services such as Google Earth/Maps and Microsoft Virtual Earth (Bing) has led to an increasing awareness of the importance of location data and its incorporation into both web-based search applications and the databases that support them. In the past, attention to location data had been primarily limited to geographic information systems (GIS), where locations correspond to spatial objects and are usually specified geometrically. However, in web-based applications, the location data often corresponds to place names and is usually specified textually. An advantage of such a specification is that the same specification can be used regardless of whether the place name is to be interpreted as a point or a region. Thus the place name acts as a polymorphic data type in the parlance of programming languages. Its drawback is that it is ambiguous. In particular, a given specification may have several interpretations, not all of which are names of places. For example, "Jordan" may refer to a person as well as a place. Moreover, there is additional ambiguity even when the specification has a place name interpretation. For example, "Jordan" can refer to a river or a country, while there are a number of cities named "London".
In this talk we examine the extension of GIS concepts to textually specified location data and review search engines that we have developed to retrieve documents where the similarity criterion is based not solely on an exact match of elements of the query string but also on spatial proximity. Thus we want to take advantage of spatial synonyms so that, for example, a query seeking a rock concert in Nervi would be satisfied by a result finding a rock concert in Albaro or Sampierdarena.
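The spatial-synonym idea can be illustrated with a minimal Python sketch. It is not the ranking used by the systems presented in the talk: the gazetteer entries, the approximate coordinates, the exponential distance decay, the `scale_km` parameter, and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical mini-gazetteer: place name -> (latitude, longitude) in degrees.
# Coordinates are rough illustrative values, not authoritative data.
GAZETTEER = {
    "Nervi": (44.38, 9.04),
    "Albaro": (44.40, 8.96),
    "Sampierdarena": (44.41, 8.89),
    "London": (51.51, -0.13),
}

# Hypothetical documents, each already tagged with one place name.
DOCS = [
    {"id": "a", "text": "rock concert tonight", "place": "Albaro"},
    {"id": "b", "text": "rock concert tonight", "place": "London"},
    {"id": "c", "text": "rock concert tonight", "place": "Sampierdarena"},
    {"id": "d", "text": "flower market", "place": "Nervi"},
]


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def score(doc, keywords, place, scale_km=10.0):
    """Multiply keyword overlap by a distance-decayed spatial term.

    The exponential decay and scale_km are assumed choices for this sketch.
    """
    words = set(doc["text"].lower().split())
    text_score = sum(k.lower() in words for k in keywords) / len(keywords)
    distance = haversine_km(GAZETTEER[place], GAZETTEER[doc["place"]])
    return text_score * math.exp(-distance / scale_km)


def search(docs, keywords, place):
    """Return documents with a non-zero score, best match first."""
    scored = [(score(d, keywords, place), d) for d in docs]
    ranked = sorted((s for s in scored if s[0] > 0),
                    key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked]


if __name__ == "__main__":
    for doc in search(DOCS, ["rock", "concert"], "Nervi"):
        print(doc["id"], doc["place"])
```

A query for a rock concert in Nervi matches no document tagged with Nervi itself, yet the concerts in nearby Albaro and Sampierdarena rank ahead of the one in London, while the Nervi flower market is dropped because it shares no keyword with the query.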
We have applied this idea of spatial synonyms to develop the STEWARD (Spatio-Textual Extraction on the Web Aiding Retrieval of Documents) system for finding documents on the website of the Department of Housing and Urban Development. This system relies on the presence of a document tagger that automatically identifies spatial references in text, PDF, Word, and other unstructured documents. The thesaurus for the document tagger is a collection of publicly available data sets forming a gazetteer containing the names of places in the world. Search results are ranked according to the extent to which they satisfy the query, which is determined in part by the prevalent spatial entities that are present in the document. The same ideas have also been adapted to collections of news articles and to Twitter tweets, resulting in the NewsStand and TwitterStand systems, respectively. These systems will be demonstrated along with STEWARD, together with a discussion of some of the underlying issues that arose and the techniques used in their implementation. Future work involves applying these ideas to spreadsheet data.
Biography Hanan Samet (http://www.cs.umd.edu/~hjs/) is a Professor of Computer Science at the University of Maryland, College Park, and is a member of the Institute for Computer Studies. He is also a member of the Computer Vision Laboratory at the Center for Automation Research, where he leads a number of research projects on the use of hierarchical data structures for database applications involving spatial data. He has a Ph.D. from Stanford University.
He is the author of the recent book "Foundations of Multidimensional and Metric Data Structures", published by Morgan Kaufmann, San Francisco, CA, in 2006 (http://www.mkp.com/multidimensional), an award winner in the 2006 best book in Computer and Information Science competition of the Professional and Scholarly Publishers (PSP) Group of the Association of American Publishers (AAP). He is also the author of the first two books on spatial data structures, titled "Design and Analysis of Spatial Data Structures" and "Applications of Spatial Data Structures: Computer Graphics, Image Processing and GIS", published by Addison-Wesley, Reading, MA, 1990. He is the founding chair of ACM SIGSPATIAL, a recipient of the 2009 UCGIS Research Award and the 2010 CMPS Board of Visitors Award at the University of Maryland, a Fellow of the ACM, IEEE, AAAS, and IAPR (International Association for Pattern Recognition), and an ACM Distinguished Speaker.