Gemini Omni Model Generates Videos Using Multi-Modal Input

by thecekodok - May 20, 2026

It has been over a year since the Veo 3 generative artificial intelligence (AI) model was launched at Google I/O 2025. In that time, Google has developed a new model, Gemini Omni, which accepts multi-modal input to generate the videos users want.

Omni generates videos from video, images, audio, text, or any combination of them at once. The generated videos come with matching audio and look more realistic. Gemini Omni Flash can also take existing footage as input and modify it: based on the user's instructions, it can change styles, special effects, objects, camera direction and more.

Omni Flash also has a better understanding of real-world physics than previous models. All generated videos carry a SynthID watermark so that they can be detected and distinguished from real footage.

Gemini Omni Flash launched this morning in the Gemini app and Google Flow for Google AI Plus, AI Pro, and AI Ultra subscribers. It is also available for free to YouTube Shorts and YouTube Create app users starting this week.
