Unicorn unifies the network architecture and the learning paradigm across four tracking tasks. Using the same model parameters for all of them, it sets new state-of-the-art results on many challenging tracking benchmarks. This model has an input size of 800x1280.
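To make the fixed input size concrete, here is a minimal sketch of how a frame could be fitted into an 800x1280 canvas while preserving its aspect ratio. It rests on assumptions not stated in this card: that 800 is the height and 1280 the width, and that frames are resized and then padded at the bottom and right. The function name `letterbox_geometry` is hypothetical, and the actual Unicorn preprocessing may differ.

```python
def letterbox_geometry(src_h, src_w, dst_h=800, dst_w=1280):
    """Compute resize and padding values for fitting a frame into dst_h x dst_w.

    Assumption: 800 is the height and 1280 the width; the card does not say.
    Returns (scale, new_h, new_w, pad_bottom, pad_right).
    """
    if src_h <= 0 or src_w <= 0:
        raise ValueError("source dimensions must be positive")

    # Use the smaller ratio so the resized frame fits inside the canvas.
    scale = min(dst_h / src_h, dst_w / src_w)

    # Round to the nearest pixel and clamp against floating-point overshoot.
    new_h = min(dst_h, round(src_h * scale))
    new_w = min(dst_w, round(src_w * scale))

    # Remaining space is filled with padding (bottom/right is an assumption).
    return scale, new_h, new_w, dst_h - new_h, dst_w - new_w


if __name__ == "__main__":
    # A 1080x1920 frame is limited by its width when fitted to 800x1280.
    print(letterbox_geometry(1080, 1920))
```

Predicted boxes and masks would then need to be divided by `scale` to map them back to the original frame coordinates.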
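This model can be used for four tasks simultaneously: single object tracking (SOT), multiple object tracking (MOT, e.g. MOT17), video object segmentation (VOS), and multi-object tracking and segmentation (MOTS, e.g. the MOTS Challenge).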
MOT17 MOTA (%): 77.2
MOTS sMOTSA (%): 65.3
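For readers unfamiliar with the two reported metrics, the sketch below shows their standard definitions: MOTA from the CLEAR MOT metrics and sMOTSA from the MOTS benchmark. The function names and arguments are illustrative only. This is not the official evaluation code, which also performs the matching between predictions and ground truth that is taken as given here.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def smotsa(tp_mask_ious, false_positives, id_switches, num_gt_masks):
    """sMOTSA = (sum of mask IoUs over true positives - FP - IDS) / GT masks.

    tp_mask_ious holds one IoU value per matched (true positive) mask.
    """
    if num_gt_masks <= 0:
        raise ValueError("num_gt_masks must be positive")
    soft_tp = sum(tp_mask_ious)
    return (soft_tp - false_positives - id_switches) / num_gt_masks


if __name__ == "__main__":
    # Toy counts, not taken from any benchmark.
    print(f"MOTA:   {100 * mota(10, 5, 5, 100):.1f}%")
    print(f"sMOTSA: {100 * smotsa([0.9, 0.8], 0, 0, 4):.1f}%")
```

Both metrics can be negative when the number of errors exceeds the number of ground-truth objects, so a higher value is better but 0 is not a lower bound.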
@inproceedings{unicorn,
title={Towards Grand Unification of Object Tracking},
author={Yan, Bin and Jiang, Yi and Sun, Peize and Wang, Dong and Yuan, Zehuan and Luo, Ping and Lu, Huchuan},
booktitle={ECCV},
year={2022}
}