BAAI BGE-M3

BAAI

Open source

Capabilities

Multilingual

Strengths

Strong multilingual retrieval across 100+ languages
Dense, sparse, and multi-vector retrieval in one model
MIT licence
Small enough to run on a CPU or on a GPU with 2 GB of VRAM
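
To make the "dense, sparse, and multi-vector" point concrete, here is a minimal sketch of how relevance is commonly scored in each of the three modes. It is plain Python with made-up toy values. The function names are hypothetical and this is not the model's or any library's API; a real pipeline would take the vectors and token weights from the model's output.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense_score(q_vec, d_vec):
    """Dense mode: one vector per text, compared by cosine similarity."""
    return dot(normalize(q_vec), normalize(d_vec))


def sparse_score(q_weights, d_weights):
    """Sparse mode: sum of weight products over tokens found in both texts."""
    shared = q_weights.keys() & d_weights.keys()
    return sum(q_weights[t] * d_weights[t] for t in shared)


def multi_vector_score(q_vecs, d_vecs):
    """Multi-vector mode (late interaction): each query token keeps its
    best-matching document token, and the best matches are averaged."""
    d_norm = [normalize(v) for v in d_vecs]
    best = [max(dot(normalize(q), d) for d in d_norm) for q in q_vecs]
    return sum(best) / len(best)


# Toy inputs (assumed values, not real model output).
scores = {
    "dense": dense_score([1.0, 0.0], [1.0, 1.0]),
    "sparse": sparse_score(
        {"open": 0.5, "weights": 0.4},
        {"weights": 0.5, "licence": 0.3},
    ),
    "multi_vector": multi_vector_score(
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [1.0, 1.0]],
    ),
}

for mode, value in scores.items():
    print(f"{mode}: {value:.4f}")
```

Hybrid retrieval setups often combine the three scores with a weighted sum. The weights are a tuning choice and are not specified on this card.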

Weaknesses

Self-hosted operational overhead
8,192-token input limit
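
One common way to work within the input limit is to split long documents into overlapping windows and embed each window separately. The sketch below assumes the text has already been tokenized. The function name, the overlap value, and the stand-in token ids are all hypothetical, and a real tokenizer may add special tokens that count toward the limit, so a slightly smaller window may be needed in practice.

```python
MAX_TOKENS = 8192  # input limit stated on this card


def split_into_windows(tokens, max_tokens=MAX_TOKENS, overlap=0):
    """Split a token sequence into windows of at most max_tokens,
    with `overlap` tokens shared between neighbouring windows."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_tokens])
        # Stop once a window has reached the end of the sequence.
        if start + max_tokens >= len(tokens):
            break
    return windows


# Stand-in for real token ids (assumed: a 20,000-token document).
tokens = list(range(20000))
windows = split_into_windows(tokens, overlap=256)

for i, window in enumerate(windows):
    print(f"window {i}: {len(window)} tokens")
```

Each window is then embedded on its own. How the per-window vectors are combined or indexed is an application-level decision.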

Pricing

Input / 1M tokens

Free (self-host)

Output

—

Hosting

Open weights

Embedding specs

Output dimension

1024

Max input

8,192 tokens

Matryoshka

Transparency

Open weights

10.0 / 10

Open training data

5.0 / 10

Open methodology

7.0 / 10

Licence openness

9.0 / 10

Provider disclosure

7.0 / 10

FMTI company score

N/A

Composite: 7.5 / 10

Open weights with permissive licence; methodology partially documented.

Sustainability

Inference energy

8.5 / 10

Training footprint

N/A

Provider infrastructure

7.0 / 10

Composite: 7.8 / 10

Small enough to run on local hardware, so the user controls the energy source.

MTEB quality

Embedding

77%

Based on the MTEB (Massive Text Embedding Benchmark) overall average published by the model authors. Bearing collapses MTEB's retrieval, STS, classification, and clustering categories into a single quality signal because they correlate strongly for embedding models.