arXiv:2503.23307

MoCha: Towards Movie-Grade Talking Character Synthesis

Published on Mar 30 · Submitted by lim142857 on Apr 1

#1 Paper of the day

Authors: Cong Wei, Haoyu Ma, Ji Hou, Felix Juefei-Xu, Xiaoliang Dai, Luxin Zhang, Tingbo Hou, Animesh Sinha, Wenhu Chen

Abstract

Recent advancements in video generation have achieved impressive motion realism, yet they often overlook character-driven storytelling, a crucial task for automated film and animation generation. We introduce Talking Characters, a more realistic task to generate talking character animations directly from speech and text. Unlike talking-head generation, Talking Characters aims at generating the full portrait of one or more characters beyond the facial region. In this paper, we propose MoCha, the first model of its kind to generate talking characters. To ensure precise synchronization between video and speech, we propose a speech-video window attention mechanism that effectively aligns speech and video tokens. To address the scarcity of large-scale speech-labeled video datasets, we introduce a joint training strategy that leverages both speech-labeled and text-labeled video data, significantly improving generalization across diverse character actions. We also design structured prompt templates with character tags, enabling, for the first time, multi-character conversation with turn-based dialogue, allowing AI-generated characters to engage in context-aware conversations with cinematic coherence. Extensive qualitative and quantitative evaluations, including human preference studies and benchmark comparisons, demonstrate that MoCha sets a new standard for AI-generated cinematic storytelling, achieving superior realism, expressiveness, controllability and generalization.
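
The abstract does not give the details of the speech-video window attention, so the following is only a minimal sketch of the general idea of a local attention window, not MoCha's implementation. It assumes that video and speech tokens span the same time interval at uniform rates, and that each video token may attend only to speech tokens within a fixed radius of its aligned position. The function name `window_attention_mask` and the parameters `num_video_tokens`, `num_speech_tokens`, and `radius` are hypothetical.

```python
def window_attention_mask(num_video_tokens, num_speech_tokens, radius):
    """Build a boolean mask where mask[v][s] is True if video token v
    may attend to speech token s (hypothetical local-window scheme)."""
    mask = []
    for v in range(num_video_tokens):
        # Assumption: both sequences cover the same duration at uniform
        # rates, so a video index maps linearly onto the speech timeline.
        center = (v * num_speech_tokens) // num_video_tokens
        lo = max(0, center - radius)
        hi = min(num_speech_tokens - 1, center + radius)
        mask.append([lo <= s <= hi for s in range(num_speech_tokens)])
    return mask


# Example: 4 video tokens, 8 speech tokens, window radius of 1 speech token.
mask = window_attention_mask(num_video_tokens=4, num_speech_tokens=8, radius=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In a cross-attention layer, a mask like this would typically be applied by setting the disallowed attention scores to negative infinity before the softmax, so that each video token draws only on temporally nearby speech.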

Community

lim142857

Paper author Paper submitter 8 days ago

edited 8 days ago

X: https://x.com/CongWei1230/status/1906877381899415945
Website: https://congwei1230.github.io/MoCha/

nityanandmathur

7 days ago

Any plans to release the code base?

lim142857

Paper author 4 days ago

edited 4 days ago

Hi, thank you for your interest in MoCha! Releasing the source code and model weights requires approval from Meta. However, I can try to implement MoCha using open-source video generation models.
Meanwhile, MoChaBench will be released soon—please stay tuned.

osma77

7 days ago

Generate a joke

altryne

6 days ago

Will MoChaBench be open sourced?

lim142857

Paper author 4 days ago

Hi, thank you for your interest in MoCha! Yes, MoChaBench will be released soon—please stay tuned. In fact, all the demo videos on our website were generated using MoCha models running on MoChaBench.