Google / Gemini Omni Video — Text to Video (gemini-omni-video)
Gemini Omni Video is a multimodal video generation model that supports text, image, video, audio, and character references for rich, controllable output.
Highlights
•Multimodal – Combine text, images, video clips, audio, and characters.
•4K support – Generate videos at up to 4K resolution.
•Character consistency – Reuse character IDs across generations.
•Flexible duration – 4, 6, 8, or 10 seconds.
Parameters
•prompt* – Text description of the video to generate
•image_urls – Reference images (max 5). A reference video counts as 2 of the 7 total image slots.
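The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not Renderful's own validation: it assumes the asterisk on prompt means "required", and the duration argument name is hypothetical, since this page lists the allowed durations but does not name the parameter.

```python
ALLOWED_DURATIONS = (4, 6, 8, 10)  # seconds, taken from the highlights list
MAX_IMAGE_URLS = 5                 # taken from the image_urls description


def validate_inputs(prompt, image_urls=None, duration=None):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    # Assumption: the "*" on prompt marks it as a required parameter.
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be non-empty")
    if image_urls is not None and len(image_urls) > MAX_IMAGE_URLS:
        problems.append(f"image_urls accepts at most {MAX_IMAGE_URLS} entries")
    # Assumption: "duration" is a hypothetical name for the length setting.
    if duration is not None and duration not in ALLOWED_DURATIONS:
        problems.append(f"duration must be one of {ALLOWED_DURATIONS}")
    return problems
```

Calling validate_inputs("a paper boat in the rain", duration=8) returns an empty list, while a 5-second duration or a sixth image URL each adds one message.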
Gemini Omni Video costs $0.69 per generation through Renderful's API. No subscription is required; you pay only for what you use.
How do I use Gemini Omni Video via API?
Sign up for a free Renderful API key, then send a POST request to the /v1/predictions endpoint with model "gemini-omni-video". See the documentation at renderful.ai/docs for code examples in Python, JavaScript, and cURL.
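As a rough illustration of the request described above, the sketch below builds the POST to /v1/predictions with the standard library only. Several details are assumptions that this page does not state: the base URL, bearer-token authentication, and a flat JSON body with model, prompt, and image_urls at the top level. The documentation at renderful.ai/docs is the authority on all three.

```python
import json
import urllib.request

# Assumption: placeholder base URL; the real host is given in renderful.ai/docs.
BASE_URL = "https://api.renderful.ai"


def build_prediction_request(api_key, prompt, image_urls=None):
    """Build (but do not send) a POST request to /v1/predictions."""
    # Assumption: parameters sit at the top level of the JSON body.
    payload = {"model": "gemini-omni-video", "prompt": prompt}
    if image_urls:
        payload["image_urls"] = list(image_urls)
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: bearer-token auth; the docs may specify another scheme.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be inspected before anything goes over the network, for example req.get_method() and req.full_url, and then passed to urllib.request.urlopen(req) once the endpoint and auth scheme have been confirmed against the documentation.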
What type of content does Gemini Omni Video generate?
Gemini Omni Video is a text-to-video model by Google. Key features include 4 to 10 second videos, 4K support, and image references.
Is the Gemini Omni Video API fast?
Gemini Omni Video has medium generation speed. Results are delivered asynchronously, either by polling or through a webhook callback.
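The polling option can be sketched as a small loop. This is a generic pattern under stated assumptions, not Renderful's client: fetch_status is a caller-supplied function that returns the current prediction as a dict, and the "status" key with the values "succeeded" and "failed" is hypothetical.

```python
import time


def wait_for_result(fetch_status, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Call fetch_status() until it reports a terminal state or the timeout elapses.

    fetch_status is supplied by the caller and returns a dict with a "status" key.
    The status names checked below are assumptions, not documented values.
    """
    waited = 0.0
    while True:
        result = fetch_status()
        if result.get("status") in ("succeeded", "failed"):
            return result
        if waited >= timeout_s:
            raise TimeoutError("prediction did not finish in time")
        sleep(interval_s)
        waited += interval_s
```

Passing the sleep function as an argument keeps the loop easy to test without real delays. A webhook callback avoids the loop entirely, at the cost of exposing a public endpoint to receive the result.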