Gemini Omni Model Generates Videos Using Multi-Modal Input



It has been over a year since the Veo 3 generative artificial intelligence (AI) model was launched at Google I/O 2025. In that time, Google has developed a new Gemini Omni model that accepts multi-modal input to generate the videos users ask for.


Omni generates videos from video, images, audio, text, or any combination of them at once. The generated videos come with matching audio and look more realistic. Gemini Omni Flash can also modify or process the input it receives, changing styles, special effects, objects, camera direction and more according to the user's instructions.


Omni Flash's understanding of real-world physics has also improved compared to previous models. All generated videos carry a SynthID watermark so that they can be detected and distinguished from original footage.


The Gemini Omni Flash model launched this morning in the Gemini app and Google Flow for Google AI Plus, AI Pro and AI Ultra customers. It is also available for free to YouTube Shorts and YouTube Create app users starting this week.
