MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
We present MMIE, a Massive Multimodal Interleaved understanding Evaluation benchmark, designed for Large Vision-Language Models (LVLMs). MMIE offers a robust framework for evaluating the interleaved comprehension and generation capabilities of LVLMs across diverse fields, supported by reliable automated metrics.
Website | Code | Dataset | Results | Evaluation Model | Paper
| # | Model | Model Type | Situational analysis | Project-based learning | Multi-step reasoning | AVG |
|---|---|---|---|---|---|---|
| 1 | MiniGPT-5 | Interleaved LVLM | 47.63 | 55.12 | 42.17 | 50.92 |
| 2 | EMU-2 | Interleaved LVLM | 39.65 | 46.12 | 50.75 | 45.33 |
| 3 | GILL | Interleaved LVLM | 46.72 | 57.57 | 39.33 | 51.58 |
| 4 | Anole | Interleaved LVLM | 48.95 | 59.05 | 51.72 | 55.22 |
| 5 | GPT-4o \| Openjourney | Integrated LVLM | 53.05 | 71.40 | 53.67 | 63.65 |
| 6 | GPT-4o \| SD-3 | Integrated LVLM | 53.00 | 71.20 | 53.67 | 63.52 |
| 7 | GPT-4o \| SD-XL | Integrated LVLM | 56.12 | 73.25 | 53.67 | 65.47 |
| 8 | GPT-4o \| Flux | Integrated LVLM | 54.97 | 68.80 | 53.67 | 62.63 |
| 9 | Gemini-1.5 \| Openjourney | Integrated LVLM | 48.08 | 67.93 | 60.05 | 61.57 |
| 10 | Gemini-1.5 \| SD-3 | Integrated LVLM | 47.48 | 68.70 | 60.05 | 61.87 |
| 11 | Gemini-1.5 \| SD-XL | Integrated LVLM | 49.43 | 71.85 | 60.05 | 64.15 |
| 12 | Gemini-1.5 \| Flux | Integrated LVLM | 47.07 | 68.33 | 60.05 | 61.55 |
| 13 | LLAVA-34b \| Openjourney | Integrated LVLM | 54.12 | 73.47 | 47.28 | 63.93 |
| 14 | LLAVA-34b \| SD-3 | Integrated LVLM | 54.72 | 72.55 | 47.28 | 63.57 |
| 15 | LLAVA-34b \| SD-XL | Integrated LVLM | 55.97 | 74.60 | 47.28 | 65.05 |
| 16 | LLAVA-34b \| Flux | Integrated LVLM | 54.23 | 71.32 | 47.28 | 62.73 |
| 17 | Qwen-VL-70b \| Openjourney | Integrated LVLM | 52.73 | 71.63 | 55.63 | 64.05 |
| 18 | Qwen-VL-70b \| SD-3 | Integrated LVLM | 54.98 | 71.87 | 55.63 | 64.75 |
| 19 | Qwen-VL-70b \| SD-XL | Integrated LVLM | 52.58 | 73.57 | 55.63 | 65.12 |
| 20 | Qwen-VL-70b \| Flux | Integrated LVLM | 54.23 | 69.47 | 55.63 | 63.18 |
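
For readers who want to work with these results programmatically, the following is a minimal sketch of parsing the leaderboard rows and ranking them by AVG. The names `Entry`, `parse_rows`, and `LEADERBOARD` are hypothetical helpers for illustration and are not part of the MMIE codebase. The sketch assumes each row has the shape `rank | model [| second model] | model type | three category scores | AVG`, and joining the two names of an integrated entry with `" + "` is only a display choice.

```python
from typing import List, NamedTuple

# Rows copied from the leaderboard above (hypothetical constant name).
LEADERBOARD = """
1 | MiniGPT-5 | Interleaved LVLM | 47.63 | 55.12 | 42.17 | 50.92
2 | EMU-2 | Interleaved LVLM | 39.65 | 46.12 | 50.75 | 45.33
3 | GILL | Interleaved LVLM | 46.72 | 57.57 | 39.33 | 51.58
4 | Anole | Interleaved LVLM | 48.95 | 59.05 | 51.72 | 55.22
5 | GPT-4o | Openjourney | Integrated LVLM | 53.05 | 71.4 | 53.67 | 63.65
6 | GPT-4o | SD-3 | Integrated LVLM | 53 | 71.2 | 53.67 | 63.52
7 | GPT-4o | SD-XL | Integrated LVLM | 56.12 | 73.25 | 53.67 | 65.47
8 | GPT-4o | Flux | Integrated LVLM | 54.97 | 68.8 | 53.67 | 62.63
9 | Gemini-1.5 | Openjourney | Integrated LVLM | 48.08 | 67.93 | 60.05 | 61.57
10 | Gemini-1.5 | SD-3 | Integrated LVLM | 47.48 | 68.7 | 60.05 | 61.87
11 | Gemini-1.5 | SD-XL | Integrated LVLM | 49.43 | 71.85 | 60.05 | 64.15
12 | Gemini-1.5 | Flux | Integrated LVLM | 47.07 | 68.33 | 60.05 | 61.55
13 | LLAVA-34b | Openjourney | Integrated LVLM | 54.12 | 73.47 | 47.28 | 63.93
14 | LLAVA-34b | SD-3 | Integrated LVLM | 54.72 | 72.55 | 47.28 | 63.57
15 | LLAVA-34b | SD-XL | Integrated LVLM | 55.97 | 74.6 | 47.28 | 65.05
16 | LLAVA-34b | Flux | Integrated LVLM | 54.23 | 71.32 | 47.28 | 62.73
17 | Qwen-VL-70b | Openjourney | Integrated LVLM | 52.73 | 71.63 | 55.63 | 64.05
18 | Qwen-VL-70b | SD-3 | Integrated LVLM | 54.98 | 71.87 | 55.63 | 64.75
19 | Qwen-VL-70b | SD-XL | Integrated LVLM | 52.58 | 73.57 | 55.63 | 65.12
20 | Qwen-VL-70b | Flux | Integrated LVLM | 54.23 | 69.47 | 55.63 | 63.18
"""


class Entry(NamedTuple):
    rank: int
    model: str
    model_type: str
    situational_analysis: float
    project_based_learning: float
    multi_step_reasoning: float
    avg: float


def parse_rows(text: str) -> List[Entry]:
    """Parse pipe-separated leaderboard rows into Entry records."""
    entries = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # The first cell is the rank and the last five are the model type
        # and four scores; whatever remains in between is the model name
        # (one cell for interleaved entries, two for integrated ones).
        rank, *name_parts, model_type, sa, pbl, msr, avg = cells
        entries.append(
            Entry(int(rank), " + ".join(name_parts), model_type,
                  float(sa), float(pbl), float(msr), float(avg))
        )
    return entries


entries = parse_rows(LEADERBOARD)
top3 = sorted(entries, key=lambda e: e.avg, reverse=True)[:3]
for e in top3:
    print(f"{e.model:<24} {e.avg:.2f}")
```

The sketch reads AVG directly from the table rather than recomputing it, because the reported AVG is not the unweighted mean of the three category scores: for MiniGPT-5 that mean would be about 48.31, against a reported 50.92. The average is presumably weighted in some way, for example by category size, but that is an assumption and is not stated in this section.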