Introducing Gemini Omni

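Last year, Nano Banana brought Gemini's intelligence to image generation and editing. Since then, it has helped millions of users restore old photos, turn sketches into designs, and bring their ideas to life in ways that were previously unimaginable. We built Gemini to be natively multimodal from the start. Now we are taking an exciting next step.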

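We are introducing Gemini Omni, a new model that combines Gemini's reasoning and creative capabilities to create any form of content from any input, starting with video generation. With Omni, you can freely mix images, audio, video, and text as input and, grounded in Gemini's real-world knowledge, generate high-quality video. You can even edit your videos through simple conversation, just like chatting.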

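Today, we are launching Gemini Omni Flash, the first model in the Omni family, and rolling it out to the Gemini app, Google Flow, and YouTube Shorts. Over time, we will add support for more output modalities, such as images and audio.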

Edit video through conversation

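Gemini Omni lets you edit video in plain, natural language. Each instruction builds on the previous one, so characters stay visually consistent, the physics stays plausible, and the scene remembers what happened earlier.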

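Change the world in front of you: Fine-tune a specific object in the frame, or swap out the entire setting. A casually shot video can now be the starting point for a new world, letting you create footage you could never have filmed yourself.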

> Prompt: Make the sculpture out of bubbles.

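Reimagine action and plot: Upload a video you have already shot and ask Omni to change what happens in it, such as adjusting the action, adding new characters or objects, or turning an ordinary moment into an unexpected surprise.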

> Prompt: When the person touches the mirror, make the mirror ripple beautifully like liquid, and the person's arm turns into reflective mirror material

> Prompt: Dim the lights in the room. Put a black and white checkerboard room inside a glass sphere that floats tracking above the hand, inside it contains a recursive representation.

> Prompt: The lights of the apartments start turning on in sync with the music.

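Refine video details iteratively: Adjust the environment, camera angle, style, or even specific small details while preserving the continuity of the original scene.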

> Prompt: A video of a violinist playing a song.

> Prompt: Make the violin invisible

> Prompt: Change the camera angle to be over the violinist's shoulder.

Combined with Gemini's real-world knowledge

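Omni does more than build realistic scenes; it can also reason about what should happen next. By combining physical common sense with Gemini's broad knowledge of history, science, and culture, Omni narrows the gap between realistic visuals and meaningful storytelling.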

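Visuals that better follow real-world physics: Omni has a substantially improved understanding of gravity, kinetic energy, and fluid dynamics, which makes generated scenes more realistic.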

> Prompt: A marble rolling fast on a chain reaction style track, continuous smooth shot.

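Blend knowledge and creativity: Drawing on Gemini's knowledge, Omni goes beyond simple pattern matching to build a deeper understanding across language, imagery, and the meaning behind them.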

> Prompt: The video shows items of the alphabet. An unusual item starting with each letter is shown sitting on a table (like a Capybara for C, disco globe for D and Lava Lamp for L). All 26 letters must be represented by 26 items with matching lower thirds displaying the letter. Only one item and lower third at a time. Each lower third must look like a black marker written on a slip of paper in the bottom left. Rapid fire, roughly 9 frames per item at 24FPS. Last frame is a slip of paper "THE END". The whole video is accompanied by calm smooth music.
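As a quick sanity check on the pacing this prompt asks for, the sketch below works out the running time implied by its own numbers (26 items, roughly 9 frames each, 24 FPS). The function and constant names are hypothetical and are not part of any Omni interface. The prompt does not say how long the closing "THE END" slip stays on screen, so it is left out of the total.

```python
from fractions import Fraction

# Figures taken directly from the prompt text.
ITEMS = 26            # one item per letter of the alphabet
FRAMES_PER_ITEM = 9   # "roughly 9 frames per item"
FPS = 24              # "at 24FPS"


def item_segment_seconds(items: int, frames_per_item: int, fps: int) -> Fraction:
    """Exact duration, in seconds, of the per-item portion of the video."""
    return Fraction(items * frames_per_item, fps)


per_item = Fraction(FRAMES_PER_ITEM, FPS)
total = item_segment_seconds(ITEMS, FRAMES_PER_ITEM, FPS)

print(f"per item: {float(per_item)} s")  # 0.375 s
print(f"all items: {float(total)} s")    # 9.75 s
```

Each item is therefore on screen for about three eighths of a second, and the 26 items together run a little under ten seconds before the final slip.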

Visualize complex concepts: From a short prompt, Omni can produce vivid explainer videos that use visuals to help you grasp dry, complex concepts.

> Prompt: claymation explainer of protein folding, everything is made out of clay, no hands, stop motion, accurate

Mix different inputs to generate video

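Reference anything: Omni can blend any reference, whether image, text, video, or audio, into a single, stylistically coherent output. For audio input, we are initially supporting voice recordings as reference material and will continue to add other audio input types.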

> Prompt: Dynamic sci-fi film style video based on image_0.png. Elements light up similar to video_0.mp4 synchronized to the beat of the music from audio_0.wav

> Prompt: Referring to the extreme camera movement, perspective, and distortion in video-0, create a front-facing full-body walk cycle of the character from image-0, quickly style-shifting between the different reference images.

> Prompt: Add harp sounds synchronized to when I touch each fern leaf. Change the leaf structure to all resemble semi translucent 3d bioluminescent plant life, with bioluminescent firefly particles floating around.

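Start with what you have: Provide references such as character images, scene images, or hand-drawn sketches to create work that matches what you have in mind.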

> Prompt: Imagine the world gradually changing into retro futuristic style (grainy and moody as image-1) as I walk. Use the audio for a retro-futuristic background music. 10s.

> Prompt: turn this into realistic footage, using the drawing only as a guide for movement, do not show the drawing in the final video

> Prompt: Apply the pose and motion from input video to provided character from this image. Apply style from image reference to the new video

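Apply styles, motion, or effects: Use reference material to define the visual style, or simply describe it in natural language. Omni blends all the references you provide into a stylistically consistent short video.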

> Prompt: edit this keeping everything the same. add animated motion effects coming out of the skateboard

> Prompt: Apply the motion of the whale swimming from the provided video to the provided image of fluid reflective material. Do not show the whale or water; instead, have this reflective material move like the whale.

Create videos with your own avatar

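We are committed to developing AI responsibly, and we have clear policies in place to protect users and govern how our AI tools are used. With Avatars, users can create a digital version of themselves and generate videos that look and sound like them. The ability to further edit videos to modify audio and speech is still being tested and evaluated, so that we can bring this technology to the public responsibly.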

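Every video created with Omni includes an imperceptible SynthID digital watermark. You can verify whether a video was generated by Gemini Omni in the Gemini app, Gemini in Chrome, and Google Search. To learn more about how we are expanding content transparency and verification tools, and about how content on the web is created and edited, see our related blog post.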

Try Gemini Omni today

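Starting today, we are launching Gemini Omni Flash, the first model in the Omni family. All Google AI Plus, Pro, and Ultra subscribers can try it in the Gemini app and Google Flow. In addition, starting this week, users can try it at no cost in YouTube Shorts and the YouTube Create app.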

In the coming weeks, we will also make this capability available to developers and enterprise customers through the API.