Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference Paper • 2604.07394 • Published 10 days ago • 16
Elastic-Attention Collection Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers • 17 items • Updated Jan 28 • 3