People moving in and out of photographs used to be reserved for the world of Harry Potter, but computer scientists have now brought that magic to real life.
Their algorithm, Photo Wake-Up, can take a person from a 2D photo or a work of art and make them run, walk, or jump out of the frame.
The system also lets users view the animation in three dimensions using augmented reality tools. The researchers will present their results June 19 at the Conference on Computer Vision and Pattern Recognition (CVPR) in Long Beach, California. The research first attracted media attention when a preprint appeared on arXiv in December.
“This is a very hard fundamental problem in computer vision,” says coauthor Ira Kemelmacher-Shlizerman, an associate professor at the University of Washington’s Paul G. Allen School of Computer Science & Engineering. “The big challenge here is that the input is only from a single camera position, so part of the person is invisible. Our work combines technical advancement on an open problem in the field with artistic creative visualization.”
Previously, researchers thought it would be impossible to animate a person running out of a single photo.
“There is some previous work that tries to create a 3D character using multiple viewpoints,” says coauthor Brian Curless, a professor in the Allen School. “But you still couldn’t bring someone to life and have them run out of a scene, and you couldn’t bring AR into it. It was really surprising that we could get some compelling results using just one photo.”
The researchers envision that Photo Wake-Up could lead to a new way for gamers to create avatars that actually look like them, a way for visitors to interact with paintings in an art museum (say, sitting down to have tea with the Mona Lisa), or a tool that lets children bring their drawings to life.
Examples in the research paper include animating the Golden State Warriors’ Stephen Curry to run off the court, Paul McCartney to leap off the cover of the “Help!” album, and Matisse’s “Icarus” (1944) to leave his frame.
To make the magic a reality, Photo Wake-Up starts by identifying a person in an image and making a mask of the body’s outline. From there, it matches a 3D template to the subject’s body position. Then the algorithm does something surprising: In order to warp the template so that it actually looks like the person in the photo, it projects the 3D person back into 2D.
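The authors’ full pipeline is considerably more involved, but the core idea of fitting in 3D and then correcting in 2D can be sketched in a few lines of Python. Everything in this sketch is a hypothetical simplification for illustration (the pinhole camera parameters, the nearest-boundary snap), not the paper’s implementation:

```python
# Minimal sketch of the "fit in 3D, refine in 2D" step described above.
# All names and parameters are hypothetical stand-ins, not the authors' code.
import numpy as np

def project(vertices, f=500.0, cx=256.0, cy=256.0):
    """Pinhole projection of an Nx3 array of template vertices to pixels."""
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    return np.stack([f * x / z + cx, f * y / z + cy], axis=1)

def warp_to_silhouette(template_2d, silhouette_2d):
    """Snap each projected template point to the nearest silhouette pixel.

    Rather than deforming the mesh in 3D, the correction happens in image
    space, where the person's outline is known exactly from the mask.
    """
    d = np.linalg.norm(template_2d[:, None, :] - silhouette_2d[None, :, :], axis=2)
    return silhouette_2d[d.argmin(axis=1)]
```

In the real system the warp is dense and preserves the mesh’s structure; the nearest-point snap above is only meant to show why an exact correction is easier to make in image space than in 3D.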
“It’s very hard to manipulate in 3D precisely,” says coauthor Chung-Yi Weng, a doctoral student in the Allen School. “Maybe you can do it roughly, but any error will be obvious when you animate the character. So we have to find a way to handle things perfectly, and it’s easier to do this in 2D.”
Photo Wake-Up stores 3D information for each pixel: its distance from the camera or the artist’s viewpoint, and how the person’s joints are connected. Once the template has been warped to match the person’s shape, the algorithm pastes on the texture (the colors from the image). It also generates the back of the person using information from the image and the 3D template, then stitches the two sides together to make a 3D person who can turn around.
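As a toy illustration of how a front depth sheet and a generated back sheet can close into a two-sided character, here is a hypothetical numpy sketch. Mirroring the front depth across a reference plane is a stand-in for illustration; the paper derives the back surface from the template and image cues:

```python
import numpy as np

def two_sided_depths(front_depth, mask):
    """Close a front depth map into front and back sheets.

    front_depth: HxW per-pixel distance from the viewpoint (arbitrary units).
    mask:        HxW boolean silhouette of the person.
    The back sheet here is simply the front reflected across a plane through
    the body; stitching the two sheets along the silhouette yields a crude
    closed 3D character.
    """
    mid = front_depth[mask].mean()                       # reference plane
    back = np.where(mask, 2.0 * mid - front_depth, 0.0)  # mirror the front
    return np.where(mask, front_depth, 0.0), back
```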
Once the 3D character is ready to run, the algorithm needs to set up the background so that the character doesn’t leave a blank space behind. Photo Wake-Up fills in the hole behind the person by borrowing information from other parts of the image.
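This hole-filling step is a classic image-inpainting problem. Whatever completion method the system uses, the basic idea of borrowing colors and gradients from surrounding pixels can be demonstrated with OpenCV’s built-in inpainting; the file names below are placeholders:

```python
import cv2
import numpy as np

image = cv2.imread("photo.jpg")                             # hypothetical input
mask = cv2.imread("person_mask.png", cv2.IMREAD_GRAYSCALE)  # 255 where the person was

# Dilate the mask a little so the fill also covers the silhouette's soft edges.
mask = cv2.dilate(mask, np.ones((5, 5), np.uint8))

# Propagate surrounding colors into the hole (Telea's fast marching method).
background = cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("background.png", background)
```

A learned inpainting model would synthesize richer texture, but the principle is the same: the background behind the character is filled in from the rest of the image.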
Right now, Photo Wake-Up works best with images of people facing forward, and it can animate both artistic creations and photographs of real people. The algorithm can also handle some photos where a person’s arms block part of their body, but it cannot yet animate people whose legs are crossed or who are blocking large parts of themselves.
“Photo Wake-Up is a new way to interact with photos,” Weng says. “It can’t do everything yet, but this is just the beginning.”
Funding came from the National Science Foundation, UW Animation Research, UW Reality Lab, Facebook, Huawei, and Google.
Source: University of Washington