Can this model be used for video captioning?
#16, opened by HugTibers
I want to use this model to identify actions, such as falls, which cannot be judged by a single image.
I have the same question. Any new ideas?
Hi, refer to V-BLIP for video captioning: https://huggingface.co/models?other=video-captioning
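Since an action such as a fall cannot be judged from a single image, any video approach first has to pick several frames from the clip. Below is a minimal sketch of that sampling step only, assuming the total frame count is already known. `sample_frame_indices` is a hypothetical helper written for illustration; it is not part of this model, V-BLIP, or any library, and how the selected frames are then fed to a model depends on the model you choose.

```python
from typing import List


def sample_frame_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick up to num_samples frame indices spread evenly across a clip.

    Hypothetical helper for illustration; not an API of any library.
    """
    if num_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the clip contains.
    num_samples = min(num_samples, num_frames)
    # Split the clip into equal-width segments and take each segment's midpoint.
    step = num_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Example: choose 4 frames from a 100-frame clip.
    print(sample_frame_indices(100, 4))
```

Taking segment midpoints rather than the first frame of each segment avoids always including frame 0, which is often a title card or a static scene before the action begins.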