Stability AI yesterday announced the release of Stable Video 3D (SV3D), a generative AI model that creates 3D videos from a single 2D image. Stability AI is an open-source generative AI firm that develops models for a variety of applications.
The model, based on Stable Video Diffusion, aims to advance 3D technology by delivering improved quality and multi-view consistency compared to previous models like Stable Zero123.
SV3D comes in two variants: SV3D_u, which generates orbital videos from single images without camera conditioning, and SV3D_p, which accommodates both single images and orbital views, allowing for 3D video creation along specified camera paths. “Stable Video 3D leverages its multi-view consistency to optimise 3D Neural Radiance Fields (NeRF) and mesh representations to improve the quality of 3D meshes generated directly from novel views,” Stability AI stated in its blog post.
The model is available for both commercial and non-commercial use. Commercial users require a Stability AI membership starting at $20 per month, while non-commercial users can download the model weights from Hugging Face.
Varun Jampani, lead researcher at Stability AI, said, “Stable Video 3D is a valuable tool for generating 3D assets, especially within the gaming sector. Additionally, it enables the production of 360-degree orbital videos, which are useful in e-commerce, providing a more immersive and interactive shopping experience.”
SV3D’s release follows other recent advancements in AI-generated video, such as OpenAI’s Sora, Runway ML, and Google Dream Fields. However, SV3D differentiates itself by focusing on generating 3D videos from single images rather than relying on text inputs.
As AI continues to evolve, models like Stable Video 3D showcase the potential for transforming 2D content into immersive 3D experiences, with applications spanning gaming, e-commerce, and beyond.
Stability AI has been on a roll, releasing several AI models in recent months. Last month, the company released Stable Diffusion 3, its most capable text-to-image model, with improved performance in multi-subject prompts, image quality, and spelling abilities. It also launched Stable Cascade, a text-to-image AI model designed for efficiency on consumer hardware.