BAAI BGE-M3
BAAI
Capabilities
Strengths
- Strong multilingual retrieval across 100+ languages
- Dense + sparse + multi-vector in one model
- MIT licence
- Small enough to run on CPU or on a GPU with about 2 GB of VRAM
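To illustrate what "dense + sparse + multi-vector in one model" can mean at query time, here is a minimal sketch of hybrid scoring over the three kinds of output. It uses toy inputs and plain Python rather than the model itself. The function names and the mixing weights are hypothetical, chosen for illustration only, and are not part of any BGE-M3 library API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def sparse_score(q_weights, d_weights):
    """Lexical score: sum of weight products over tokens present in both query and document."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)


def multi_vector_score(q_vecs, d_vecs):
    """Late-interaction score: for each query token vector take the best-matching
    document token vector, then average over query tokens. Assumes non-empty inputs."""
    return sum(max(cosine(q, d) for d in d_vecs) for q in q_vecs) / len(q_vecs)


def hybrid_score(dense, sparse, multi, weights=(0.4, 0.2, 0.4)):
    """Weighted sum of the three signals. The default weights are an assumption."""
    w_dense, w_sparse, w_multi = weights
    return w_dense * dense + w_sparse * sparse + w_multi * multi


if __name__ == "__main__":
    # Toy data standing in for real model outputs.
    dense = cosine([0.2, 0.8, 0.1], [0.25, 0.7, 0.05])
    sparse = sparse_score({"solar": 0.6, "panel": 0.3}, {"solar": 0.5, "roof": 0.2})
    multi = multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])
    print(hybrid_score(dense, sparse, multi))
```

In practice the mixing weights would be tuned on a validation set for the target corpus. A single signal (often dense alone) may be enough for simple workloads.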
Weaknesses
- Self-hosted operational overhead
- 8,192-token input limit; longer documents must be chunked
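One common way to work within the input limit is to split long documents into overlapping windows before embedding. The sketch below operates on an already tokenized list of IDs. The `chunk_tokens` helper and the 256-token overlap are hypothetical choices, and a real tokenizer may add special tokens that count toward the limit, so leaving some headroom below 8,192 is advisable.

```python
def chunk_tokens(token_ids, max_len=8192, overlap=256):
    """Split a token sequence into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that text cut at a
    boundary still appears with some context in the next window.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        # Stop once the current window already reaches the end of the input.
        if start + max_len >= len(token_ids):
            break
    return chunks


if __name__ == "__main__":
    fake_document = list(range(20_000))  # stand-in for real token IDs
    for i, chunk in enumerate(chunk_tokens(fake_document)):
        print(i, len(chunk))
```

Each chunk is then embedded separately, and the per-chunk vectors are either indexed individually or pooled, depending on the retrieval design.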
Pricing
Input / 1M tokens: Free (self-host)
Output: N/A
Hosting: Open weights
Embedding specs
Output dimension: 1024
Max input: 8,192 tokens
Matryoshka: No
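The specs above translate directly into storage cost for a dense vector index. The following is a rough back-of-the-envelope sketch. It assumes the full 1024 dimensions are stored for every vector (the card lists no Matryoshka support, so no truncated variants are assumed) and that values are stored as 4-byte floats. The `index_size_bytes` helper is hypothetical, and the figure excludes index overhead and any sparse or multi-vector outputs.

```python
def index_size_bytes(num_vectors, dim=1024, bytes_per_value=4):
    """Raw storage for dense vectors only: one value per dimension per vector."""
    return num_vectors * dim * bytes_per_value


if __name__ == "__main__":
    for n in (1, 1_000_000):
        size = index_size_bytes(n)
        print(f"{n} vectors: {size} bytes ({size / 1e9:.3f} GB)")
    # Storing values as 2-byte floats halves the raw footprint.
    print(index_size_bytes(1_000_000, bytes_per_value=2))
```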
Transparency
Open weights: 10.0 / 10
Open training data: 5.0 / 10
Open methodology: 7.0 / 10
Licence openness: 9.0 / 10
Provider disclosure: 7.0 / 10
FMTI company score: N/A
Open weights with permissive licence; methodology partially documented.
Sustainability
Inference energy: 8.5 / 10
Training footprint: N/A
Provider infrastructure: 7.0 / 10
Small enough to run on local hardware, so the user controls the energy source.
MTEB quality
Grounded in the MTEB (Massive Text Embedding Benchmark) Overall average published by the model authors. Bearing collapses MTEB's retrieval, STS, classification, and clustering categories into a single quality signal because they correlate strongly for embedding models.