Google has created a program that lets a viewer “fly into” a still photo, using artificial intelligence (AI) to generate 3D imagery from the image.
In a new paper entitled InfiniteNature-Zero, the researchers take a landscape photo and then use AI to “fly” into it like a bird, with the software synthesizing new landscape imagery through machine learning.
To pull this off, the researchers had to fill in information that a still photo doesn’t provide, such as areas hidden from view. A spot obscured behind trees, for example, needs to be generated. This is done by “inpainting”: the AI simulates what it thinks should be there, drawing on machine-learning models trained on huge datasets.
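To make the idea concrete, here is a toy sketch of inpainting (an illustration only, not Google’s learned method): masked pixels in a tiny grayscale grid are filled with the average of their known neighbors. A neural inpainter replaces this simple averaging with a trained model’s prediction of plausible content.

```python
def inpaint(grid, mask):
    """Fill cells where mask[r][c] is True using averages of known neighbors,
    sweeping repeatedly so holes fill inward from their edges."""
    h, w = len(grid), len(grid[0])
    grid = [row[:] for row in grid]  # don't mutate the caller's image
    unknown = {(r, c) for r in range(h) for c in range(w) if mask[r][c]}
    while unknown:
        for r, c in sorted(unknown):
            # gather values of neighbors that are already known
            vals = [grid[nr][nc]
                    for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in unknown]
            if vals:  # fill only once at least one neighbor is known
                grid[r][c] = sum(vals) / len(vals)
                unknown.discard((r, c))
    return grid
```

A real learned inpainter would predict texture and structure (branches, sky, rock) rather than a smooth average, but the interface is the same: image plus hole mask in, completed image out.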
Similarly, to get the flying effect, the AI has to generate what lies outside the photograph’s borders. This is called “outpainting” and works much like Photoshop’s content-aware tools: the AI generates a wider image based on the original photo, aided by deep learning on massive datasets.
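The crudest possible stand-in for outpainting is replicate padding, where the border pixels are simply repeated outward. The sketch below (an assumption for illustration, not the paper’s method) shows the shape of the operation; a generative model would invent new, plausible scenery in that padding instead.

```python
def outpaint(img, pad):
    """Grow a 2-D grayscale image by `pad` pixels on every side,
    repeating edge values -- a naive placeholder for generative outpainting."""
    # extend each row left and right
    wide = [[row[0]] * pad + list(row) + [row[-1]] * pad for row in img]
    # then extend the result up and down (copy rows, don't alias them)
    top = [wide[0][:] for _ in range(pad)]
    bottom = [wide[-1][:] for _ in range(pad)]
    return top + wide + bottom
```

Each fly-forward step needs new content at the frame edges, so an operation with this signature runs every frame; the quality of what fills the new border is exactly what the learned model improves.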
As anyone who has ever zoomed into a photo knows, image quality falls apart as it breaks down into blurry pixels. To stop this from happening, Google uses superresolution, a process in which AI turns a noisy, pixelated image into a crisp one.
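For contrast, here is the naive upscaling that superresolution improves upon: nearest-neighbor enlargement, where every pixel just becomes a block. This is why plain zooming looks blocky; a superresolution network instead synthesizes plausible high-frequency detail at the larger size. (Toy code, not Google’s model.)

```python
def upscale_nearest(img, factor):
    """Nearest-neighbor upscaling: each pixel becomes a factor x factor block.
    No new detail is created, which is exactly the blockiness
    that learned superresolution avoids."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]  # stretch horizontally
        out.extend([wide[:] for _ in range(factor)])    # repeat vertically
    return out
```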
The program, which the researchers describe as “Perpetual View Generation of Natural Scenes from Single Images,” combines these three techniques (inpainting, outpainting, and superresolution) to create the flying effect.
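The per-frame loop can be sketched roughly as follows. This is a self-contained guess at the structure the article implies (move forward, fill what is revealed, extend the borders, sharpen), with trivial stand-ins for each stage rather than the paper’s actual networks.

```python
def zoom_crop(img, margin):
    """Stand-in for the camera moving forward: keep the center of the frame,
    which would then be enlarged back to full size."""
    return [row[margin:len(row) - margin] for row in img[margin:len(img) - margin]]

def next_frame(img):
    """One step of a perpetual-view loop (illustrative placeholders only)."""
    # 1. "fly" forward by cropping toward the image center
    frame = zoom_crop(img, 1)
    # 2./3. outpaint the borders back to full size (here: replicate edges;
    #        a real system would also inpaint newly revealed regions)
    frame = [[row[0]] + row + [row[-1]] for row in frame]
    frame = [frame[0][:]] + frame + [frame[-1][:]]
    # 4. superresolution would sharpen the result (no-op in this sketch)
    return frame
```

Feeding each output frame back in as the next input is what makes the view “perpetual,” and also why errors compound; that drift is the failure mode the new paper pushes back against.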
In the researchers’ previous attempts, the image broke down almost immediately as the viewer flew in. In the latest paper, credited to Google Research, Cornell University, and UC Berkeley, the image holds up much better and for longer. It is still far from perfect, but it is a vast improvement over previous efforts.
The latest paper also represents a step forward in that previous perpetual view generators were trained on real-life drone footage, whereas these new examples were created from single photographs of landscapes.
“This AI is so much smarter than the previous one that was published just a year ago,” says Károly Zsolnai-Fehér, from Two Minute Papers.
“And it requires training data that is much easier to produce at the same time. And, I wonder what we will be capable of just two more papers down the line. So cool!”
Google’s AI Research
Google’s AI team has been using Neural Radiance Fields (NeRF), which previously allowed researchers to build detailed 3D models of real-world locations and to powerfully denoise images, effectively enabling users to “see in the dark.”
However, those programs relied on a large cache of images of the location being generated, whereas the new perpetual view generator needs only a single image.
Earlier this year, PetaPixel reported on Samsung Labs developing MegaPortraits, a way to create high-resolution avatars, or deepfakes, from a single still photo.