Conversational speech generation
Tuning-free subject-driven generation
Generate text responses to user prompts
Generate videos from text or images
OmniParser: turn your LLM into a GUI agent
Generate high-quality audio from text using various controls
A unified multimodal understanding and generation model