Building and better understanding vision-language models: insights and future directions
Paper: arXiv:2408.12637
Idefics3-8B-Llama3 was trained with the data in the following collection, using this script.
@misc{laurençon2024building,
      title={Building and better understanding vision-language models: insights and future directions},
      author={Hugo Laurençon and Andrés Marafioti and Victor Sanh and Léo Tronchon},
      year={2024},
      eprint={2408.12637},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Base model: HuggingFaceM4/Idefics3-8B-Llama3