
ZechenBai/LOVA3-llava-v1.5-phi1.5-fuyu


This repository contains the model for LOVA3: Learning to Visual Question Answering, Asking and Assessment. LOVA3 is a framework designed to equip multimodal large language models (MLLMs) with the ability to answer, ask, and assess questions about images.

Code: https://github.com/showlab/LOVA3

🎓 Citation

If you find LOVA3 useful, please cite using this BibTeX:

@inproceedings{
    zhao2024lova,
    title={{LOVA}3: Learning to Visual Question Answering, Asking and Assessment},
    author={Hengyuan Zhao and Pan Zhou and Difei Gao and Zechen Bai and Mike Zheng Shou},
    booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
    year={2024},
    url={https://openreview.net/forum?id=vIOKLMl6wu}
}


Model size: 1.72B params (Safetensors)

Tensor type: BF16

