X Algorithm — For You Feed
Recommendation system powering the For You feed on X. Open-sourced by xAI under Apache 2.0. Written primarily in Rust (62.9%) and Python (37.1%).
Pipeline overview
The feed is assembled in seven stages on every request. The stages run in order, except stage 2, where the two retrieval sources run in parallel:

    Feed request
        │
        ▼
    1. Query hydration   ← fetch user context first
        │
        ├──────────────────────────────────┐
        ▼                                  ▼
    2a. Thunder                        2b. Phoenix retrieval
        (in-network)                       (out-of-network)
        │                                  │
        └──────────────┬───────────────────┘
                       ▼
            3. Candidate hydration
                       │
                       ▼
            4. Pre-scoring filters (10 boolean filters)
                       │
                       ▼
            5. Phoenix scorer (Grok transformer)
               Weighted scorer: Score = Σ(w × P(action))
               Author diversity scorer
                       │
                       ▼
            6. Selection: top K
                       │
                       ▼
            7. Post-selection filter (spam / violence / deleted)
                       │
                       ▼
            Ranked feed response

Components
| Component | Language | Role |
|---|---|---|
| Home Mixer | Rust | Orchestration layer |
| Thunder | Rust | In-network retrieval |
| Phoenix | Python/Rust | Out-of-network retrieval + ranking |
| Candidate Pipeline | Rust | Reusable pipeline framework |
Key design decisions
1. No hand-engineered features
The Grok-based transformer learns all relevance signals directly from raw engagement sequences. No manual feature pipelines for things like "user follows author" or "post contains keyword." This simplifies the data infrastructure and moves all complexity into the model.
2. Candidate isolation in ranking
During transformer inference, candidates cannot attend to each other — each post only attends to the user context. Scores are therefore independent of batch composition, making them consistent and cacheable.
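The isolation property can be illustrated with a toy attention mask. This is a minimal sketch and not the model's actual attention code: `build_isolation_mask` and `attend` are hypothetical names, tokens are used directly as queries, keys, and values (no learned projections), and it assumes a candidate may attend to itself in addition to the user context.

```python
import math

def build_isolation_mask(n_context, n_candidates):
    """mask[i][j] is True when token i may attend to token j.

    Tokens 0..n_context-1 are user context; the rest are candidates.
    Assumption: a candidate also sees itself, but never another candidate.
    """
    n = n_context + n_candidates
    return [[j < n_context or i == j for j in range(n)] for i in range(n)]

def attend(tokens, mask):
    """Toy single-head dot-product attention restricted by the mask."""
    dim = len(tokens[0])
    outputs = []
    for i, query in enumerate(tokens):
        allowed = [j for j in range(len(tokens)) if mask[i][j]]
        logits = [
            sum(q * k for q, k in zip(query, tokens[j])) / math.sqrt(dim)
            for j in allowed
        ]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        outputs.append([
            sum(e / total * tokens[j][d] for e, j in zip(exps, allowed))
            for d in range(dim)
        ])
    return outputs

context = [[1.0, 0.0], [0.0, 1.0]]          # two user-context tokens
post_a, post_b, post_c = [0.5, 0.5], [2.0, -1.0], [-3.0, 4.0]
mask = build_isolation_mask(n_context=2, n_candidates=2)

# Score post_a next to two different neighbours.
with_b = attend(context + [post_a, post_b], mask)
with_c = attend(context + [post_a, post_c], mask)

# post_a sits at index 2 in both batches; its output ignores the neighbour.
print(with_b[2] == with_c[2])
```

Because the output for `post_a` never depends on which other posts share the batch, a score computed once for a given user context could in principle be reused, which is what makes caching possible.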
3. Multi-action prediction
Rather than predicting a single "relevance" score, the model predicts ~15 distinct action probabilities. This allows negative signals (block, mute, report) to actively suppress unwanted content.
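A minimal sketch of what a multi-action output could look like. The action list below contains only the actions named in this document (the full set of ~15 is not listed here), and using one independent sigmoid per action is an assumption, chosen because the actions are not mutually exclusive.

```python
import math

# Partial list: only the actions named in this document.
ACTIONS = [
    "like", "reply", "repost", "share", "follow",
    "block", "mute", "report", "not_interested",
]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def action_probabilities(logits):
    """Map one logit per action to an independent probability.

    Assumption: each action has its own sigmoid head, so the
    probabilities do not need to sum to 1.
    """
    if len(logits) != len(ACTIONS):
        raise ValueError("expected one logit per action")
    return {action: sigmoid(z) for action, z in zip(ACTIONS, logits)}

# Hypothetical logits for one (user, post) pair.
probs = action_probabilities([1.2, -0.5, -1.0, -2.0, -3.0, -6.0, -5.0, -7.0, -2.5])
print(probs["like"] > probs["block"])
```

Keeping the negative actions as separate outputs means a post can have a high like probability and a high block probability at the same time, which a single relevance score could not express.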
4. Hash-based embeddings
Both retrieval and ranking look up embeddings through multiple hash functions instead of a fixed vocabulary table. A fixed-size table can therefore cover an unbounded feature space, and combining several hashes lowers the chance that two features collide on every lookup.
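A minimal sketch of a multi-hash embedding lookup, under stated assumptions: the class name, table sizes, the use of one table per hash function, and summing the looked-up rows are all illustrative choices, not details taken from the Phoenix code.

```python
import hashlib
import random

class MultiHashEmbedding:
    """Embedding lookup through several hash functions (hypothetical sketch).

    Each hash function indexes its own fixed-size table; the rows are summed.
    Any string key maps to a vector, so no vocabulary is needed.
    """

    def __init__(self, num_hashes=2, num_buckets=1024, dim=4, seed=0):
        rng = random.Random(seed)  # stands in for learned parameters
        self.num_buckets = num_buckets
        self.dim = dim
        self.tables = [
            [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(num_buckets)]
            for _ in range(num_hashes)
        ]

    def _bucket(self, key, hash_index):
        # Prefixing the key with the hash index gives a distinct hash per table.
        data = f"{hash_index}:{key}".encode("utf-8")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_buckets

    def lookup(self, key):
        vector = [0.0] * self.dim
        for hash_index, table in enumerate(self.tables):
            row = table[self._bucket(key, hash_index)]
            vector = [v + r for v, r in zip(vector, row)]
        return vector

emb = MultiHashEmbedding()
vec = emb.lookup("post:1234567890")    # never registered anywhere beforehand
print(len(vec))
```

Two keys that share a bucket in one table still get different vectors unless they also collide in every other table, which is the usual argument for using more than one hash.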
5. Composable pipeline architecture
The candidate-pipeline crate separates business logic from execution and monitoring. Stages run in parallel where possible with graceful error handling.
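The idea can be sketched as follows. The real candidate-pipeline crate is written in Rust and its API is not shown in this document, so every name below (`Candidate`, `run_pipeline`, the sample sources) is hypothetical; the sketch only shows stages passed in as plain functions, sources running in parallel, and a failed source being skipped instead of failing the request.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Candidate:
    post_id: int
    author: str
    score: float = 0.0

def run_pipeline(user_id, sources, filters, scorer, top_k):
    """Run retrieval, filtering, scoring, and selection (illustrative only)."""
    candidates = []
    # Retrieval: all sources are submitted at once and run in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(source, user_id) for source in sources]
        for future in futures:
            try:
                candidates.extend(future.result())
            except Exception as exc:
                # Graceful degradation: continue with the remaining sources.
                print(f"source failed, continuing without it: {exc}")
    # Pre-scoring filters: keep a candidate only if every filter accepts it.
    kept = [c for c in candidates if all(f(c) for f in filters)]
    for c in kept:
        c.score = scorer(c)
    # Selection: highest score first, truncated to top K.
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]

def in_network(user_id):
    return [Candidate(1, "alice"), Candidate(2, "bob")]

def out_of_network(user_id):
    return [Candidate(3, "carol")]

def broken_source(user_id):
    raise RuntimeError("retrieval backend unavailable")

feed = run_pipeline(
    "user-1",
    sources=[in_network, broken_source, out_of_network],
    filters=[lambda c: c.author != "bob"],      # e.g. a muted author
    scorer=lambda c: float(c.post_id),          # placeholder scorer
    top_k=2,
)
print([c.post_id for c in feed])
```

The business logic (which sources, which filters, which scorer) lives entirely in the arguments, while `run_pipeline` owns execution and error handling; that separation is the point of the design, whatever the concrete Rust API looks like.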
Scoring formula
$$\text{Score} = \sum_{i} w_i \cdot P(\text{action}_i)$$
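Where $w_i$ is the weight assigned to action $i$ and $P(\text{action}_i)$ is the predicted probability of that action:
- Positive actions (like, reply, repost, share, follow) get positive weights
- Negative actions (block, mute, report, not interested) get negative weights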
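The formula can be sketched in a few lines. The weight values below are placeholders invented for illustration; this document does not give the real weights, only their signs.

```python
# Hypothetical weights: only the signs follow the document, not the magnitudes.
WEIGHTS = {
    "like": 1.0, "reply": 2.0, "repost": 1.5, "share": 1.5, "follow": 3.0,
    "block": -10.0, "mute": -5.0, "report": -10.0, "not_interested": -2.0,
}

def weighted_score(probs, weights=WEIGHTS):
    """Score = sum over actions of weight * predicted probability.

    Raises KeyError if a probability is given for an action with no weight.
    """
    return sum(weights[action] * p for action, p in probs.items())

# Two posts with the same positive engagement but different block risk.
benign = {"like": 0.30, "reply": 0.05, "block": 0.001}
risky = {"like": 0.30, "reply": 0.05, "block": 0.200}

print(weighted_score(benign) > weighted_score(risky))
```

With these placeholder weights, the higher block probability alone is enough to push the second post's score below zero, which is how a negative signal suppresses content that would otherwise rank well.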
Related pages
- Thunder: in-memory in-network post store
- Phoenix: two-tower retrieval + Grok transformer ranker
- X Algorithm — Pre-Scoring Filters: full filter stack
- X Algorithm — Scoring Pipeline: scorer details
- X Algorithm — Key Concepts: concept glossary