ChatMusician: Understanding and Generating Music Intrinsically with LLM Paper • 2402.16153 • Published Feb 25, 2024 • 60
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners Paper • 2402.17723 • Published Feb 27, 2024 • 16
ComposerX: Multi-Agent Symbolic Music Composition with LLMs Paper • 2404.18081 • Published Apr 28, 2024 • 2
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions Paper • 2407.20962 • Published Jul 30, 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling Paper • 2406.04321 • Published Jun 6, 2024 • 1
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 16 days ago • 59
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published 14 days ago • 21