Google strikes back with an answer to OpenAI’s Sora launch

Google DeepMind

Google’s DeepMind division unveiled its second-generation Veo video generation model on Monday, which can create clips up to two minutes in length and at resolutions reaching 4K quality. That’s six times the length and four times the resolution of the 20-second/1080p clips Sora can generate.

Of course, those are Veo 2’s theoretical upper limits. The model is currently only available on VideoFX, Google’s experimental video generation platform, and its clips are capped at eight seconds and 720p resolution. VideoFX is also waitlisted, so not just anyone can log on to try Veo 2, though the company announced that it will be expanding access in the coming weeks. A Google spokesperson also noted that Veo 2 will be made available on the Vertex AI platform once the company can sufficiently scale the model’s capabilities.

Veo 2 on VideoFX


“Over the coming months, we’ll continue to iterate based on feedback from users,” Eli Collins told TechCrunch, “and [we’ll] look to integrate Veo 2’s updated capabilities into compelling use cases across the Google ecosystem … We expect to share more updates next year.”

Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥

We’re also releasing an improved version of our text-to-image model, Imagen 3 – available to use in ImageFX through… pic.twitter.com/h6ejHaMUM4

Google DeepMind (@GoogleDeepMind), December 16, 2024

Veo 2 reportedly holds a number of advantages over its predecessors, including a better understanding of physics (think better fluid dynamics and better lighting and shadow effects) as well as the capacity to generate “clearer” video clips, in that generated textures and images are sharper and less prone to blurring when moving. The new model also offers improved camera controls, enabling the user to position the virtual camera lens with greater precision than before.

As TechCrunch notes, Veo 2 has not yet perfected the video generation process, though it does appear to hallucinate far less than competitors like Sora, Kling, Movie Gen, or Gen 3 Alpha. “Coherence and consistency are areas for growth,” Collins said. “Veo can consistently adhere to a prompt for a couple minutes, but [it can’t] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There’s also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism.”

Google also announced improvements to Imagen 3 on Monday, enabling the commercial image generation model to produce “brighter, better-composed” outputs. The model, available on ImageFX, will also offer additional descriptive suggestions based on keywords in the user’s prompt, with each keyword spawning a drop-down menu of related terms.