Podcast Episode

ByteDance Suspends Seedance 2.0 Voice Feature After AI Generates Voice From Photo Alone

February 11, 2026


ByteDance has launched Seedance 2.0, its most advanced AI video generation model, but suspended a key feature after a tech media founder found that it could generate a realistic clone of his voice from a photograph alone. The incident has reignited the deepfake debate even as the model's capabilities sent Chinese tech stocks surging.

ByteDance Launches Seedance 2.0 Then Scrambles to Contain Deepfake Risk

ByteDance, the parent company of TikTok, has released Seedance 2.0, a next-generation AI video generation model that has been called the most advanced of its kind. The model debuted in limited beta on the Jimeng AI platform on February 7, 2026, with broader availability rolling out days later. Swiss consultancy CTOL has claimed that the model surpasses both OpenAI's Sora 2 and Google's Veo 3.1 in practical testing.

Capabilities That Stunned the Industry

Seedance 2.0 introduces native audio-video generation, producing synchronised sound effects, ambient audio, and phoneme-level lip synchronisation in at least eight languages. It supports 2K resolution output and generates video roughly 30 percent faster than its predecessor. The model accepts up to 12 reference files, including images, video clips, and audio, enabling what ByteDance calls multi-lens storytelling, in which a single prompt can produce coherent multi-shot sequences with consistent characters and lighting.

Stock Markets React

The launch triggered a rally across Chinese tech and media stocks. COL Group hit its daily 20 percent price limit, while Shanghai Film and Perfect World each gained approximately 10 percent. The CSI 300 Index rose 1.4 percent, reflecting renewed investor enthusiasm for homegrown AI technologies.

The Voice Cloning Discovery

However, the celebration was short-lived. Pan Tianhong, founder of tech media outlet MediaStorm, discovered that uploading only a facial photograph, with no voice samples whatsoever, caused the system to generate audio closely matching his actual voice. He described the experience as terrifying, using the word six times in his account of the discovery.

Emergency Suspension

ByteDance's risk control team moved quickly, suspending the feature that allowed real photographs to be used as reference subjects. The Jimeng platform now requires users to complete live verification, recording their own image and voice, before creating digital avatars. The incident underscores the growing tension between AI innovation and safety in a market that analysts project could reach 3.67 billion dollars in 2026.

Published February 11, 2026 at 2:57am