Technology Details for Auto 3D™
Methods of Production of Stereoscopic 3-D Video Content
There are three ways to produce stereoscopic 3-D video content. In the first method, 3-D animation software is used to generate stereoscopic views of animated imagery. Most 3-D movies in theaters today are stereoscopic animated videos generated in this way. The other two methods are used to produce live-action (real-life) stereoscopic 3-D videos. The more common of the two is to shoot in stereoscopic 3-D using two video cameras and a 3-D camera rig. This technology works reasonably well but, due to today's improperly designed camera rigs and optics, it often produces image distortions (which can cause eyestrain) and provides inconsistent amounts of 3-D at different shooting distances and zoom settings. Another limitation is that, because the rigs are heavy and bulky (and often mounted on a short dolly track), each 3-D camera system on a shoot cannot be moved more than a short distance. This makes close-ups and pans around people impossible. The third method involves shooting in 2-D with conventional cameras (which can be easily moved and panned) and later, in postproduction, converting the 2-D video to stereoscopic 3-D.
Overview of Conventional Conversion Technology
Several companies can currently convert 2-D video to 3-D. They all use a technique called "rotoscoping" (for instance, see: http://en.wikipedia.org/wiki/Rotoscoping) or a somewhat faster version of it called "rotobrushing" (for instance, see: http://tv.adobe.com/watch/after-effects-cs5-feature-tour/rotoscope-with-rotobrush/). With either technique, a graphic artist starts with the first frame of video and traces and removes various objects from the frame. For instance, each person in the frame is removed separately. Removing an object leaves a hole in the image, which must be filled in with new pixels that resemble the pixels remaining around the hole. This is why a graphic artist is required: the new pixels must look natural. All objects in the scene that are to appear at different depths must be cut out separately in the same way. After all desired objects are cut out, each one is placed back in the frame as a double image on a different layer. The distance between each image and its double determines the depth at which the object will be seen when the video is viewed in 3-D. The graphic artist must decide the depth at which each object should appear, making a subjective guess each time. The software then tracks each identified object and performs the same operations until the object moves out of frame or a new scene begins. The graphic artist must repeat these steps whenever a new object appears in the frame. This must be done for all the frames in a video.
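The double-image step described above can be illustrated with a minimal Python sketch. It is not taken from any conversion product: the names Cutout, double_image, and perceived_placement are hypothetical, and the sign convention (separation measured as right-eye position minus left-eye position) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Cutout:
    """A traced object (hypothetical structure, for illustration only)."""
    name: str
    x: float           # horizontal position of the cutout in the 2-D frame, in pixels
    separation: float  # right-eye x minus left-eye x, in pixels (chosen by the artist)


def double_image(cutout):
    """Return (left_eye_x, right_eye_x) for the cutout's double image."""
    half = cutout.separation / 2.0
    return cutout.x - half, cutout.x + half


def perceived_placement(separation):
    """Describe where a given separation places an object relative to the screen."""
    if separation > 0:
        return "behind screen"       # uncrossed (positive) parallax
    if separation < 0:
        return "in front of screen"  # crossed (negative) parallax
    return "at screen"               # zero parallax


actor = Cutout(name="actor", x=100.0, separation=-8.0)
left_x, right_x = double_image(actor)
```

Here the artist's subjective choice is the single number stored in separation. Every pixel of the cutout receives the same offset, so the whole object sits at one depth.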
Since each object is cut out as a single "cutout", it tends to look like a cardboard cutout in the final 3-D movie, which can resemble a series of flat cardboard cutouts at different depths. This can be improved by superimposing many triangles of different shapes, sizes, and orientations on each cutout. After computer processing, this results in a more rounded, dimensional look for each object. However, this extra work takes more time and costs considerably more. The graphic artist must, again, arbitrarily decide which part of each object is at what depth. In addition, graphic artists can take the alpha channel for each object (basically a solid silhouette of the object) and "paint" it with various shades of white, gray, and black, representing different depth locations on the object (again, as subjectively judged by the graphic artist). Additional software can then analyze the white, gray, and black alpha silhouettes to generate double images of each group of pixels within an object, giving them different separations based on the "painted" gray level. This also adds work and cost, but can give varying depths to different parts of cut-out objects. A drawback of these techniques is that they don't work well with semi-transparent imagery, especially highly detailed imagery such as fire, smoke, fog, and mist.
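The gray-level "painting" approach can be sketched in Python as a per-pixel horizontal shift. This is a simplified illustration, not the software any conversion company actually uses: the function name shift_row, the max_shift parameter, and the assumption that white (255) means nearest and black (0) means farthest are all hypothetical.

```python
def shift_row(row, gray, max_shift):
    """Build one eye's view of a single pixel row.

    row       -- pixel values for the row
    gray      -- painted depth per pixel, 0..255 (assumed: 255 = nearest)
    max_shift -- shift in pixels applied to the nearest (white) pixels
    Returns the shifted row; None marks a hole that still has to be filled.
    """
    width = len(row)
    out = [None] * width
    # Draw far pixels first so that nearer pixels overwrite them.
    for i in sorted(range(width), key=lambda k: gray[k]):
        shift = round(max_shift * gray[i] / 255)
        j = i + shift
        if 0 <= j < width:
            out[j] = row[i]
    return out


row = ["a", "b", "c", "d", "e"]
gray = [0, 0, 255, 0, 0]  # only the middle pixel is painted as "near"
view = shift_row(row, gray, max_shift=1)
```

In this example the near pixel "c" moves one position to the right and covers "d", leaving a hole where "c" used to be. That hole corresponds to the gap a graphic artist has to fill with natural-looking pixels.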
As there are about 175,000 frames in an average movie, it takes a large group of graphic artists working simultaneously for several months to convert a single movie to 3-D. Consequently, the cost is about $5 million to $15 million for the 3-D conversion alone. Although this cost and timeframe are usually acceptable for a new movie (since ticket prices are higher for 3-D movies), they aren't acceptable for most TV shows or for older movies, since the profits generated wouldn't be high enough to cover such a high conversion cost.
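To put these figures in perspective, the short Python calculation below converts the frame count into running time and the quoted cost range into a cost per frame. The frame rate of 24 frames per second is an assumption (the standard rate for film); the text above does not state one. The function names are hypothetical.

```python
FRAMES = 175_000  # frames in an average movie, from the text above
FPS = 24          # assumption: standard film frame rate


def running_time_minutes(frames=FRAMES, fps=FPS):
    """Running time implied by the frame count, in minutes."""
    return frames / fps / 60


def cost_per_frame(total_cost, frames=FRAMES):
    """Average conversion cost per frame, in dollars."""
    return total_cost / frames


minutes = running_time_minutes()
low = cost_per_frame(5_000_000)
high = cost_per_frame(15_000_000)
print(f"{minutes:.1f} minutes, ${low:.2f} to ${high:.2f} per frame")
```

Under that assumption, 175,000 frames correspond to roughly a two-hour movie, and the quoted range works out to tens of dollars of manual work per frame.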
Auto 3D™ Conversion
The 3-D Vision Auto 3D™ conversion process, however, works on completely different principles and doesn't rely on graphic artists to make arbitrary decisions. The process is based on decades of research on how the brain works, how it creates the experience of 3-D, and how stereoscopic 3-D images can be recorded. The brain creates the 3-D experience from several different perceived factors within an image, as well as from stored memories of previous 3-D experiences, even if these are triggered only by observations of 3-D in small areas. By comparing frames of a regular 2-D video, Auto 3D™ automatically provides a stereo pair for every pixel in the frame based on changing occlusions of background objects by foreground objects (an important 3-D cue), while the brain is led to further interpret 3-D by the displayed reduction in color, saturation, brightness, contrast, and size of objects as they recede in depth. Any object in front of another object becomes a "foreground object" relative to the object behind it, providing multiple layers of depth. Using multiple views from different frames also yields stereoscopic information that automatically provides continuous depth, as well as natural and correct rounding and other depth contours of objects within the frames. This technique works equally well with all kinds of imagery, including fire, smoke, fog, and mist. The computer operator only has to assist the Auto 3D™ system in making difficult choices in certain frames that our algorithms can't currently interpret as well as the human brain can. This task is much less difficult and time-consuming than conventional 3-D conversion, making Auto 3D™ much faster and less expensive. We expect future improvements to our algorithms to eliminate the need for human operators altogether.
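The idea that different frames of a 2-D video can supply different views of a scene can be illustrated with a simple, well-known technique called frame-delay pairing. The Python sketch below is not the Auto 3D™ algorithm, whose details are not given here. It only shows the general principle for the special case of a camera moving sideways at a steady rate. The function name delayed_stereo_pairs and its parameters are hypothetical.

```python
def delayed_stereo_pairs(frames, delay, camera_moves_right=True):
    """Pair each frame with an earlier one to form (left_eye, right_eye) views.

    Assumes the camera translates horizontally, so a frame taken `delay`
    frames earlier was shot from a slightly different horizontal position.
    This is plain frame-delay pairing, shown only as an illustration.
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 frame")
    pairs = []
    for i in range(delay, len(frames)):
        earlier, current = frames[i - delay], frames[i]
        if camera_moves_right:
            # The earlier frame was shot from further left: use it for the left eye.
            pairs.append((earlier, current))
        else:
            # Camera moving left: the earlier frame was shot from further right.
            pairs.append((current, earlier))
    return pairs


frames = ["f0", "f1", "f2", "f3"]  # stand-ins for decoded video frames
pairs = delayed_stereo_pairs(frames, delay=1)
```

Frame-delay pairing breaks down when the camera is still or when objects move independently, which is one reason a practical system has to analyze cues such as changing occlusions rather than simply reuse whole frames.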
With this revolutionary new process, a full-length movie (or TV show) can be converted to 3-D in less than a month by only two or three computer operators, at a fraction of the cost of conventional conversion. Furthermore, since stereo pairs for all pixels in the frame are created from actual 3-D information present in 2-D videos, the resulting 3-D doesn't look like cardboard cutouts; continuous, rounded depth is perceived instead. The relatively low cost, short conversion time, and realistic 3-D make Auto 3D™ the only practical process available for 3-D conversion of TV shows, previously made movies, and other lower-end video applications. The first nationwide broadcast of a TV show converted to 3-D using Auto 3D™ was the Rachael Ray show on October 29, 2010.