Washington, June 19: A new computerized method, called the IM2GPS algorithm, devised by Carnegie Mellon University researchers, can tell where a particular photo was taken by matching it to similar GPS-tagged photos in the Flickr online photo collection.
Unlike humans, the IM2GPS algorithm developed by computer science graduate student James Hays and Alexei A. Efros, assistant professor of computer science and robotics, doesn't attempt to scan a photo for location clues, such as types of clothing, the language on street signs, or specific types of vegetation.
Instead, it analyzes the composition of the photo, notes how textures and colors are distributed and records the number and orientation of lines in the photo. Then it searches Flickr for photos that are similar in appearance.
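The matching idea can be illustrated with a very small sketch. It is not the researchers' implementation: it assumes a single coarse color histogram stands in for the richer mix of texture, color and line features described above, and the function names, the toy "photos" (plain lists of RGB pixels) and the coordinates are all hypothetical.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalized coarse RGB histogram of (r, g, b) pixels with values 0..255.

    Assumption: a color histogram is a stand-in here for the fuller set of
    appearance features (texture, color, line orientation) the article mentions.
    """
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


def histogram_distance(h1, h2):
    """L1 distance between two sparse histograms (0 means identical)."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)


def estimate_location(query_pixels, tagged_photos):
    """Return the GPS tag of the stored photo that looks most like the query."""
    query_hist = color_histogram(query_pixels)
    best = min(
        tagged_photos,
        key=lambda p: histogram_distance(query_hist, color_histogram(p["pixels"])),
    )
    return best["gps"]


# Hypothetical toy data: a sandy scene and a forest scene with made-up GPS tags.
sandy = [(230, 200, 150)] * 90 + [(120, 180, 240)] * 10
forest = [(30, 110, 40)] * 80 + [(90, 60, 30)] * 20
database = [
    {"pixels": sandy, "gps": (36.98, -110.10)},
    {"pixels": forest, "gps": (47.86, -123.93)},
]

# The query resembles the sandy scene, so its GPS tag is returned.
query = [(235, 205, 155)] * 85 + [(125, 185, 245)] * 15
print(estimate_location(query, database))
```

The sketch never tries to name what is in the picture. It only ranks stored photos by how similar they look and borrows the location of the closest one, which is the spirit of the approach Efros describes.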
"We're not asking the computer to tell us what is depicted in the photo but to find other photos that look like it. It was surprising to us how effective this approach proved to be. Who would have guessed that similarity in overall image appearance would correlate to geographic proximity so well?" said Efros.
They found that they could accurately geolocate images to within 200 kilometers for 16 percent of the more than 200 photos in their test set, which was up to 30 times better than chance. And even when their algorithm failed to identify the specific location, it still narrowed the possibilities, such as by identifying the locale as a beach or a desert.
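A "within 200 kilometers" score of this kind can be computed roughly as follows. This is a sketch under stated assumptions, not the researchers' evaluation code: it uses the standard haversine great-circle formula with a mean Earth radius of 6,371 km, and the function names and sample coordinates are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def fraction_within(estimates, truths, radius_km=200.0):
    """Share of estimates that fall within radius_km of the true location."""
    if not estimates:
        return 0.0
    hits = sum(
        1 for est, true in zip(estimates, truths)
        if great_circle_km(est, true) <= radius_km
    )
    return hits / len(estimates)


# Illustrative coordinates (approximate city centers, not data from the study).
paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
versailles = (48.8049, 2.1204)

truths = [paris, paris]
estimates = [versailles, london]  # first guess is close, second is too far away
print(fraction_within(estimates, truths))  # one of the two guesses is within 200 km
```

Under this measure, an estimate that lands in a neighboring town counts as a hit, while one that lands in the wrong country does not.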
"It seems there's not as much ambiguity in the visual world as you might guess. Estimating geographic information from images is a difficult, but very much a doable, computer vision problem," said Hays.
This computer system for geolocating photos might prove useful in enhancing image search techniques by identifying the locale of a photo, making them less dependent on captions or associated text. This could also prove useful in finding family photos from a specific trip and in some forensic applications.
Determining the location of photos also makes it possible to combine them with geographic databases related to climate, population density, vegetation, topography and land use. Hays said that knowledge of locale can also aid in such computer vision tasks as object identification. For example, if a computer recognizes that a photo was likely taken in Japan, the computer will have a better idea of what a taxicab should look like.
Hays said many online photos have some sort of geographic label, but these human descriptions can often be incorrect or overly broad. Using photos with both geographic keywords and GPS coordinates enabled the researchers to find more than six million photos that were useful and accurately geolocated.
The IM2GPS algorithm readily located photographs of such landmarks as the Cathedral of Notre Dame in Paris. More surprisingly, it was able to recognize that a narrow street in Barcelona was typical of Mediterranean villages, rather than an American alleyway.
However, there were also some odd matches: the architecturally unique Sydney Opera House seemed to the computer to be similar to a hotel in Mississippi as well as a bridge in London. A shot of the Eiffel Tower at dusk was matched to other Eiffel Tower shots, but also to San Francisco's Coit Tower and New York's Statue of Liberty, both shot at dusk.
Hays said that one reason for this confusion can be that the algorithm is not designed to recognize specific objects so much as it is to recognize geographic areas. For instance, an image of Utah's Monument Valley caused the IM2GPS algorithm to successfully retrieve a number of other images from Monument Valley and the American Southwest, rather than images of a specific rock formation.
The study will be presented at the IEEE Computer Society Conference on Computer Vision and Pattern Recognition in Anchorage, Alaska.