metadata
license: apache-2.0
language:
- en
tags:
- scene-graph-generation
- object-detection
- visual-relationship-detection
- pytorch
- yolo
pipeline_tag: object-detection
library_name: sgg-benchmark
model-index:
- name: REACT++ yolo12m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: VG150
      type: vg150
    metrics:
    - type: mR@20
      value: 10.81
      name: mR@20
    - type: R@20
      value: 18.76
      name: R@20
    - type: mR@50
      value: 14.42
      name: mR@50
    - type: R@50
      value: 24.63
      name: R@50
    - type: mR@100
      value: 16.78
      name: mR@100
    - type: R@100
      value: 28.47
      name: R@100
    - type: F1@20
      value: 13.72
      name: F1@20
    - type: F1@50
      value: 18.19
      name: F1@50
    - type: F1@100
      value: 21.11
      name: F1@100
    - type: e2e_latency_ms
      value: 20.5
      name: e2e_latency_ms
- name: REACT++ yolo26m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: VG150
      type: vg150
    metrics:
    - type: mR@20
      value: 10.81
      name: mR@20
    - type: R@20
      value: 21.12
      name: R@20
    - type: mR@50
      value: 14.6
      name: mR@50
    - type: R@50
      value: 28.34
      name: R@50
    - type: mR@100
      value: 18.36
      name: mR@100
    - type: R@100
      value: 33.7
      name: R@100
    - type: F1@20
      value: 14.3
      name: F1@20
    - type: F1@50
      value: 19.27
      name: F1@50
    - type: F1@100
      value: 23.77
      name: F1@100
    - type: e2e_latency_ms
      value: 19.8
      name: e2e_latency_ms
- name: REACT++ yolov8m
  results:
  - task:
      type: object-detection
      name: Scene Graph Detection
    dataset:
      name: VG150
      type: vg150
    metrics:
    - type: mR@20
      value: 12.22
      name: mR@20
    - type: R@20
      value: 22.89
      name: R@20
    - type: mR@50
      value: 16.31
      name: mR@50
    - type: R@50
      value: 29.96
      name: R@50
    - type: mR@100
      value: 18.45
      name: mR@100
    - type: R@100
      value: 34.09
      name: R@100
    - type: F1@20
      value: 15.93
      name: F1@20
    - type: F1@50
      value: 21.12
      name: F1@50
    - type: F1@100
      value: 23.94
      name: F1@100
    - type: e2e_latency_ms
      value: 18.7
      name: e2e_latency_ms
REACT++ Scene Graph Generation — VG150 (yolo12m, yolo26m, yolov8m)
This repository contains REACT++ model checkpoints for scene graph generation (SGG) on the VG150 benchmark, one for each of three YOLO backbones: yolo12m, yolo26m, and yolov8m.
REACT++ is a parameter-efficient, attention-augmented relation predictor built on top of a YOLO backbone. It uses:
- DAMP (Detection-Anchored Multi-Scale Pooling), a new simple pooling algorithm for one-stage object detectors such as YOLO
- SwiGLU gated MLP for all feed-forward blocks (½ the params of ReLU-MLP at equal capacity)
- Visual x Semantic cross-attention — visual tokens attend to GloVe prototype embeddings
- Geometry RoPE — box-position encoded as a rotary frequency bias on the Q matrix
- Prototype Momentum Buffer — per-class EMA prototype bank
- P5 Scene Context — AIFI-enhanced P5 tokens provide global context via cross-attention
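The Prototype Momentum Buffer item describes a per-class exponential moving average (EMA) of features. The following is a minimal pure-Python sketch of that general idea only. The class name `PrototypeBank`, the momentum value of 0.9, and the choice to initialise a prototype from the first sample are illustrative assumptions, not details taken from the REACT++ code.

```python
class PrototypeBank:
    """Per-class EMA prototype bank (illustrative sketch, not the REACT++ implementation)."""

    def __init__(self, momentum=0.9):  # momentum value is an assumption
        self.momentum = momentum
        self.prototypes = {}  # class id -> list of floats

    def update(self, class_id, feature):
        """Blend a new feature vector into the running prototype of its class."""
        feature = [float(v) for v in feature]
        old = self.prototypes.get(class_id)
        if old is None:
            # Assumption: the first sample seen for a class initialises its prototype.
            self.prototypes[class_id] = feature
        else:
            m = self.momentum
            self.prototypes[class_id] = [
                m * o + (1.0 - m) * f for o, f in zip(old, feature)
            ]
        return self.prototypes[class_id]


bank = PrototypeBank(momentum=0.9)
bank.update("on", [1.0, 0.0])  # first sample: prototype is the feature itself
bank.update("on", [0.0, 1.0])  # later samples move the prototype slowly
print(bank.prototypes["on"])
```

A high momentum keeps each prototype stable across noisy batches, at the cost of adapting more slowly to new features.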
The models were trained with the SGG-Benchmark framework and are described in the REACT++ paper (Neau et al., 2026).
Results — SGDet on VG150 test split (CUDA, max_det=100, batch_size=1)
Metrics from end-to-end evaluation (tools/evaluate.py). Latency = model forward only.
| Backbone | R@20 | R@50 | R@100 | mR@20 | mR@50 | mR@100 | F1@20 | F1@50 | F1@100 | Lat. (ms) |
|---|---|---|---|---|---|---|---|---|---|---|
| yolo12m | 18.76 | 24.63 | 28.47 | 10.81 | 14.42 | 16.78 | 13.72 | 18.19 | 21.11 | 20.5 |
| yolo26m | 21.12 | 28.34 | 33.70 | 10.81 | 14.60 | 18.36 | 14.30 | 19.27 | 23.77 | 19.8 |
| yolov8m | 22.89 | 29.96 | 34.09 | 12.22 | 16.31 | 18.45 | 15.93 | 21.12 | 23.94 | 18.7 |
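The F1@K columns are numerically consistent with the harmonic mean of R@K and mR@K. The sketch below recomputes them from the table's own values under that assumption; the definition is inferred from the numbers rather than stated on this card, and `f1_at_k` is a hypothetical helper name.

```python
def f1_at_k(recall, mean_recall):
    """Harmonic mean of R@K and mR@K (assumed definition of F1@K)."""
    if recall + mean_recall == 0:
        return 0.0
    return 2.0 * recall * mean_recall / (recall + mean_recall)


# (R@20, R@50, R@100, mR@20, mR@50, mR@100), copied from the results table.
RESULTS = {
    "yolo12m": (18.76, 24.63, 28.47, 10.81, 14.42, 16.78),
    "yolo26m": (21.12, 28.34, 33.70, 10.81, 14.60, 18.36),
    "yolov8m": (22.89, 29.96, 34.09, 12.22, 16.31, 18.45),
}

for backbone, row in RESULTS.items():
    recalls, mean_recalls = row[:3], row[3:]
    f1 = [round(f1_at_k(r, mr), 2) for r, mr in zip(recalls, mean_recalls)]
    print(backbone, f1)  # F1@20, F1@50, F1@100
```

Because the inputs are already rounded to two decimals, a recomputed value may differ from the reported one in the last digit.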
Checkpoints
| Variant | Sub-folder | Checkpoint files |
|---|---|---|
| yolo12m | yolo12m/ | yolo12m/model.onnx (ONNX) · yolo12m/best_model_epoch_19.pth (PyTorch) |
| yolo26m | yolo26m/ | yolo26m/model.onnx (ONNX) · yolo26m/best_model_epoch_18.pth (PyTorch) |
| yolov8m | yolov8m/ | yolov8m/model.onnx (ONNX) · yolov8m/best_model_epoch_6.pth (PyTorch) |
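The table above can be captured as a small lookup so that scripts select a file by variant and format rather than hard-coding paths. This is a convenience sketch: `CHECKPOINTS`, `checkpoint_filename`, and `download_checkpoint` are hypothetical names, and the file names are copied from the table as-is.

```python
REPO_ID = "maelic/REACTPlusPlus_VG150"

# File names copied from the Checkpoints table.
CHECKPOINTS = {
    "yolo12m": {"onnx": "yolo12m/model.onnx", "pytorch": "yolo12m/best_model_epoch_19.pth"},
    "yolo26m": {"onnx": "yolo26m/model.onnx", "pytorch": "yolo26m/best_model_epoch_18.pth"},
    "yolov8m": {"onnx": "yolov8m/model.onnx", "pytorch": "yolov8m/best_model_epoch_6.pth"},
}


def checkpoint_filename(variant, fmt="onnx"):
    """Return the in-repo path for a backbone variant and format ('onnx' or 'pytorch')."""
    try:
        return CHECKPOINTS[variant][fmt]
    except KeyError:
        raise ValueError(f"unknown variant/format: {variant!r}/{fmt!r}") from None


def download_checkpoint(variant, fmt="onnx"):
    """Download the selected file from the Hub and return its local path."""
    # Imported lazily so the lookup itself works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename(variant, fmt),
        repo_type="model",
    )
```

For example, `download_checkpoint("yolov8m", "pytorch")` would fetch the yolov8m PyTorch weights listed in the table.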
Usage
ONNX (recommended — no Python dependencies beyond onnxruntime)
from huggingface_hub import hf_hub_download

onnx_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_VG150",
    filename="yolo12m/model.onnx",  # file name as listed in the Checkpoints table
    repo_type="model",
)
# Run with tools/eval_onnx_psg.py or load directly via onnxruntime
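To load the exported model directly with onnxruntime, a reasonable first step is to open a session and inspect its inputs and outputs, since this card does not document the tensor names or shapes. The sketch below assumes `onnx_path` comes from the download snippet above; `open_session` and `describe_io` are hypothetical helper names.

```python
def open_session(onnx_path):
    """Create an inference session, preferring CUDA when onnxruntime provides it."""
    import onnxruntime as ort  # imported lazily; install with `pip install onnxruntime`

    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    available = ort.get_available_providers()
    providers = [p for p in preferred if p in available]
    return ort.InferenceSession(onnx_path, providers=providers)


def describe_io(session):
    """Return (name, shape, type) for every input and output of the model."""
    def summarise(nodes):
        return [(n.name, n.shape, n.type) for n in nodes]

    return {
        "inputs": summarise(session.get_inputs()),
        "outputs": summarise(session.get_outputs()),
    }


# Usage (requires the downloaded model file):
# session = open_session(onnx_path)
# print(describe_io(session))
```

Inspecting the session first avoids guessing the expected input layout before calling `session.run`.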
PyTorch
# 1. Clone the repository
#    git clone https://github.com/Maelic/SGG-Benchmark
# 2. Install dependencies (from the repository root)
#    pip install -e .
# 3. Download checkpoint + config
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_VG150",
    filename="yolo12m/best_model_epoch_19.pth",  # file name as listed in the Checkpoints table
    repo_type="model",
)
cfg_path = hf_hub_download(
    repo_id="maelic/REACTPlusPlus_VG150",
    filename="yolo12m/config.yml",
    repo_type="model",
)
# 4. Run evaluation (from the SGG-Benchmark repository root)
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "tools/relation_eval_hydra.py",  # same interpreter as this script
        "--config-path", str(cfg_path),
        "--task", "sgdet",
        "--eval-only",
        "--checkpoint", str(ckpt_path),
    ],
    check=True,  # raise if the evaluation script exits with an error
)
Citation
@article{neau2026reactpp,
  title  = {REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation},
  author = {Neau, Maëlic and Falomir, Zoe},
  year   = {2026},
  url    = {https://arxiv.org/abs/2603.06386},
}