Dataset and results for SLD (https://arxiv.org/abs/2507.21184)
Andy Lin
pkuHaowei
AI & ML interests
NLP, continual learning, computational biology
Recent Activity
Updated a dataset about 11 hours ago: pkuHaowei/llm-srbench
Published a dataset about 11 hours ago: pkuHaowei/llm-srbench
Upvoted a paper 10 days ago: Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces