Collections

Discover the best community collections!

Collections including paper arxiv:2412.13663

Reproducing ModernBERT for Retrieval

Reproducing ModernBERT for information retrieval tasks.

joe32140/ModernBERT-base-msmarco

Sentence Similarity • 0.1B • Updated Jan 26 • 3.19k • 10
joe32140/ModernBERT-large-msmarco

Sentence Similarity • 0.4B • Updated Jan 26 • 128 • 4
joe32140/ColModernBERT-base-msmarco-en-bge

Sentence Similarity • 0.1B • Updated Dec 21, 2024 • 21 • 1
joe32140/gte-en-mlm-base-msmarco

Sentence Similarity • 0.1B • Updated Dec 25, 2024 • 4

Bringing BERT into modernity via both architecture changes and scaling

answerdotai/ModernBERT-base

Fill-Mask • 0.1B • Updated Jan 15 • 789k • 967
answerdotai/ModernBERT-large

Fill-Mask • 0.4B • Updated Jan 15 • 75.4k • 436
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

Papers - Text - Encoders - DeBERTa

BERTs are Generative In-Context Learners

Paper • 2406.04823 • Published Jun 7, 2024 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

Bookmark::Models

madhurjindal/autonlp-Gibberish-Detector-492513457

Text Classification • 67M • Updated May 14 • 183k • 65
answerdotai/ModernBERT-base

Fill-Mask • 0.1B • Updated Jan 15 • 789k • 967
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158
answerdotai/ModernBERT-large

Fill-Mask • 0.4B • Updated Jan 15 • 75.4k • 436

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published Sep 16, 2024 • 43
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Paper • 2409.11242 • Published Sep 17, 2024 • 7
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published Sep 17, 2024 • 22
On the Diagram of Thought

Paper • 2409.10038 • Published Sep 16, 2024 • 13

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158
A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25, 2024 • 46
No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 43
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 50

Text Classification

LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification

Paper • 2411.19638 • Published Nov 29, 2024 • 6
Word Sense Linking: Disambiguating Outside the Sandbox

Paper • 2412.09370 • Published Dec 12, 2024 • 10
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158
Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

Rethinking Data Selection at Scale: Random Selection is Almost All You Need

Paper • 2410.09335 • Published Oct 12, 2024 • 16
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9, 2024 • 37
Emergent properties with repeated examples

Paper • 2410.07041 • Published Oct 9, 2024 • 8
Personalized Visual Instruction Tuning

Paper • 2410.07113 • Published Oct 9, 2024 • 70

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 93
StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published Dec 10, 2024 • 20
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158
Autoregressive Universal Video Segmentation Model

Paper • 2508.19242 • Published Aug 26 • 28

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22
