Gemini Omni
Google's new video model lets you direct a clip by conversation — change the camera, swap an object, light the scene — but it will not let you edit the speech, and it signs everything it makes.
Gemini Omni folds Google's text reasoning into video so editing becomes a conversation: ask for a different camera angle or a new background and each instruction builds on the last, the way Google's image tool already works for stills. The first release makes ten-second clips with sound and edits footage you already have, rather than only conjuring video from a sentence.
The line worth noticing is the one Google drew around the model. Omni is good enough to put new words in a real person's mouth, and that is exactly the capability Google held back. You can lend your own voice to a generated clip, but you cannot rewrite the speech in someone else's footage; Google says it is 'still working to test this' before releasing it 'responsibly.' Read plainly, in an election year: the deepfake button exists and was left off the panel.
So the story is restraint, not reach. The frontier of what these models can do now runs ahead of what their makers will ship: the gating happens at release, by choice, not at the limit of the technology. To make that choice legible, every clip Omni produces carries, by default, an invisible Google watermark and a tamper-evident origin record that can be verified in Chrome and Search and cannot be switched off. The capability is being rationed and labelled at the same time, which is roughly the only honest answer anyone has to synthetic video.
Watch where a maker draws its own line — what it can build but chooses not to release tells you more than the demo reel.