Post 1154 Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 15 days ago • 29
TEXGen: a Generative Diffusion Model for Mesh Textures Paper • 2411.14740 • Published Nov 22, 2024 • 17
Image Inpainting via Iteratively Decoupled Probabilistic Modeling Paper • 2212.02963 • Published Dec 6, 2022
Is synthetic data from generative models ready for image recognition? Paper • 2210.07574 • Published Oct 14, 2022
Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing Paper • 2207.09935 • Published Jul 20, 2022
GO-NeRF: Generating Virtual Objects in Neural Radiance Fields Paper • 2401.05750 • Published Jan 11, 2024
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation Paper • 2312.08754 • Published Dec 14, 2023 • 11
ObjectMover: Generative Object Movement with Video Prior Paper • 2503.08037 • Published 3 days ago • 3
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap Paper • 2309.12382 • Published Sep 21, 2023
What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis Paper • 1904.01906 • Published Apr 3, 2019
Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models Paper • 2305.15080 • Published May 24, 2023
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation Paper • 2401.06591 • Published Jan 12, 2024 • 4