CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance Paper • 2503.10391 • Published about 22 hours ago • 5
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation Paper • 2503.07265 • Published 4 days ago • 4
MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing Paper • 2502.21291 • Published 14 days ago • 4
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 15 days ago • 27
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation Paper • 2502.13995 • Published 23 days ago • 8
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 28 days ago • 51
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer Paper • 2502.05979 • Published Feb 9 • 8