Imagine a version of Google Street View where you could hit the rewind button and see any point in time over the last five years. Cornell researchers are building something like that, at least for a few much-visited places.
Noah Snavely, associate professor of computer science, has already collected millions of images from Internet photo-sharing sites and combined views of popular locations taken from many different angles to create 3-D models. Now Snavely and graduate student Kevin Matzen have added the fourth dimension of time with their Scene Chronology system. In models they have developed of Times Square, the Akihabara shopping district in Tokyo and 5Pointz, a graffiti display area in Queens, an observer can navigate around inside a virtual 3-D space while using a slider control to move forward and backward in time.
In Times Square, theater marquees keep pace with movie releases. In the Akihabara model, entire buildings can be seen to change. When applied to graffiti, the system offers a new way to preserve and compare ephemeral art.
“When reasoning about time in image collections, every observation has something to say,” the researchers said. They described their work in September at the 2014 European Conference on Computer Vision in Zurich, Switzerland, where they received a Best Paper award.
Their software works with flat surfaces in the image – signs, walls, storefronts or theater marquees – and treats them as “patches” that are stitched together to create the total scene. Each patch can be thought of as a series of planes stacked up into a three-dimensional solid, like index cards in a file drawer, where the front to back dimension is time. To create a 3-D model of the overall scene at a particular moment in time, the computer joins slices taken from each solid at that time.
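The index-card picture can be made concrete with a small sketch. What follows is only an illustration of the idea as described here, not the researchers' code: the names `Card`, `Patch` and `render_scene` are hypothetical, and each "card" is reduced to a text label with the dates it was visible.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Card:
    """One 'index card': a patch appearance and the dates it was visible."""
    label: str
    start: date
    end: date  # inclusive


@dataclass
class Patch:
    """A flat surface (sign, wall, marquee) with its stack of cards over time."""
    name: str
    cards: List[Card]

    def slice_at(self, when: date) -> Optional[Card]:
        """Return the card visible on the given date, or None if there is none."""
        for card in self.cards:
            if card.start <= when <= card.end:
                return card
        return None


def render_scene(patches: List[Patch], when: date) -> Dict[str, str]:
    """Join one slice from each patch to describe the scene on a given date."""
    scene = {}
    for patch in patches:
        card = patch.slice_at(when)
        if card is not None:
            scene[patch.name] = card.label
    return scene


# Made-up example data: a marquee that changes and a billboard that appears later.
marquee = Patch("marquee", [
    Card("Movie A", date(2013, 1, 1), date(2013, 3, 31)),
    Card("Movie B", date(2013, 4, 1), date(2013, 8, 31)),
])
billboard = Patch("billboard", [
    Card("Soda ad", date(2013, 6, 1), date(2013, 12, 31)),
])
scene_in_february = render_scene([marquee, billboard], date(2013, 2, 14))
scene_in_july = render_scene([marquee, billboard], date(2013, 7, 4))
```

Moving the time slider corresponds to calling `render_scene` with a different date: the February scene contains only the first marquee, while the July scene shows the second marquee and the billboard.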
So far the display shows only the planes, because the algorithm doesn’t recognize the “boring” flat grey and brown surfaces of walls and pavement, Matzen explained. That will be an enhancement for the future, he said.
A challenge is getting the time-stamping right. Amateur photographers don’t always set the clocks in their cameras correctly. In early experiments, human observers found anomalies in images, like people wearing winter clothes in July or ads for movies that hadn’t come out yet.
So the computer compiles lists of positive observations of a given feature and negative observations – when the feature isn’t there – and computes a time span over which the feature exists. In rendering the entire scene at a particular moment, the computer selects the features whose time span includes that moment. Once a 4-D model of a scene is created, a new photo can be time-stamped by comparing it with the model. A given combination of billboards and movie posters in an image of Times Square could pinpoint the exact day a photo was taken.
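A rough sketch shows how positive and negative observations can outvote a badly set camera clock. It is a simplified stand-in and not the method from the paper: the brute-force scoring rule and the names `estimate_span` and `date_photo` are assumptions made for illustration, and the dates are invented.

```python
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple

Span = Tuple[date, date]


def estimate_span(positives: List[date], negatives: List[date]) -> Optional[Span]:
    """Pick the [start, end] span that agrees with the most observations.

    A positive observation agrees if it falls inside the span; a negative
    observation agrees if it falls outside. Candidate endpoints are the
    positive timestamps themselves (brute force, fine for small inputs).
    """
    candidates = sorted(set(positives))
    best, best_score = None, -1
    for i, start in enumerate(candidates):
        for end in candidates[i:]:
            inside = sum(start <= t <= end for t in positives)
            outside = sum(not (start <= t <= end) for t in negatives)
            if inside + outside > best_score:
                best, best_score = (start, end), inside + outside
    return best


def date_photo(visible: Iterable[str], spans: Dict[str, Span]) -> Optional[Span]:
    """Intersect the spans of the features seen in a photo.

    Returns the window in which all of them coexisted, or None if the
    features are unknown or their spans never overlap.
    """
    known = [spans[name] for name in visible if name in spans]
    if not known:
        return None
    start = max(s for s, _ in known)
    end = min(e for _, e in known)
    return (start, end) if start <= end else None


# A poster seen on July 10-13, plus one photo wrongly stamped August 9.
positives = [date(2013, 7, d) for d in (10, 11, 12, 13)] + [date(2013, 8, 9)]
negatives = [date(2013, 7, 5), date(2013, 7, 20), date(2013, 7, 30),
             date(2013, 8, 4), date(2013, 8, 14)]
poster_span = estimate_span(positives, negatives)

spans = {
    "poster": (date(2013, 1, 1), date(2013, 3, 31)),
    "billboard": (date(2013, 3, 15), date(2013, 6, 30)),
}
window = date_photo(["poster", "billboard"], spans)
```

Here the August 9 sighting is outvoted by the surrounding negative observations, so the poster's estimated span is July 10 to July 13. A photo showing both the poster and the billboard from the second example can only have been taken between March 15 and March 31; the more features a photo contains, the narrower that window becomes.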
The system can process tens to hundreds of thousands of photos of a given scene. The Times Square database contains a quarter-million images. The number of photos of many other locations available online is increasing rapidly, the researchers point out. According to Facebook and other social networking services, about 1.8 billion new photos are uploaded every day.
“Decades from now, once we’ve amassed a huge body of photos, we could go back and process locations that now have only a few,” Matzen said. “Imagine a future where 100 years’ worth of densely sampled imagery is available for any scene.” We may also change the way we think about photography, he added. Someday, “Any photo you might think of taking, someone else has taken.”
The research was supported by the National Science Foundation, the Intel Science and Technology Center for Visual Computing, and Amazon Web Services in Education.