Phoenix
ML component with two functions: (1) two-tower retrieval for out-of-network post discovery, and (2) Grok-based transformer ranking that predicts engagement probabilities.
Two functions
Phoenix
├── Retrieval → find relevant posts from global corpus
└── Ranking → score each candidate post

Part 1: Retrieval — two-tower model
Architecture
User engagement history All posts in corpus
│ │
▼ ▼
User tower Post tower
(hash embeddings) (hash embeddings)
│ │
▼ ▼
User embedding Post embeddings (all)
│ │
└──────────┬────────────────────┘
▼
Dot product similarity
user · post = cos(θ) [after L2 normalisation]
│
▼
Top-K candidates → out-of-network pool

How dot product similarity works
Both vectors are L2-normalised, so the dot product equals the cosine of the angle between them:
$$\text{similarity} = \mathbf{u} \cdot \mathbf{p} = \cos(\theta)$$
- High score (→ 1.0): vectors point in the same direction → post is relevant
- Low score (→ 0.0): vectors are orthogonal → no relationship
- Negative score: vectors point away from each other → likely not relevant
Posts closest to the user vector in embedding space become out-of-network candidates.
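The retrieval step above can be sketched in a few lines. This is a minimal illustration with made-up dimensions and random tower outputs, not the production Phoenix code:

```python
import numpy as np

# Assumed, illustrative sizes: 64-dim embeddings, a 10k-post corpus.
rng = np.random.default_rng(0)
user_embedding = rng.normal(size=64)             # output of the user tower
post_embeddings = rng.normal(size=(10_000, 64))  # outputs of the post tower

# L2-normalise both sides so the dot product equals cos(theta).
user = user_embedding / np.linalg.norm(user_embedding)
posts = post_embeddings / np.linalg.norm(post_embeddings, axis=1, keepdims=True)

# One matrix-vector product scores the user against every post in the corpus.
scores = posts @ user                  # shape (10_000,), each value in [-1, 1]

# Posts closest to the user vector become the out-of-network candidates.
K = 5
top_k = np.argsort(scores)[::-1][:K]   # indices of the K highest-scoring posts
```

In practice the "all posts" side would be served from an approximate nearest-neighbour index rather than a brute-force matrix product, but the geometry is the same.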
Hash-based embeddings
Rather than a fixed vocabulary lookup table, both towers use multiple hash functions per feature. This handles an unbounded feature space (billions of post IDs, author IDs, etc.) without a rigid vocab.
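A minimal sketch of the multi-hash idea: each raw ID is hashed by several independent hash functions into a fixed-size table, and the selected rows are combined (summed here). The table size, hash count, and sum-combination are assumptions for illustration, not Phoenix's actual configuration:

```python
import hashlib
import numpy as np

NUM_BUCKETS = 1 << 20   # fixed table size, independent of vocabulary size
NUM_HASHES = 3          # multiple hash functions per feature
EMBED_DIM = 64

rng = np.random.default_rng(42)
table = rng.normal(scale=0.01, size=(NUM_BUCKETS, EMBED_DIM))

def _bucket(seed: int, feature_id: str) -> int:
    """Deterministic hash of (seed, id) into a table row."""
    h = hashlib.blake2b(f"{seed}:{feature_id}".encode(), digest_size=8)
    return int.from_bytes(h.digest(), "big") % NUM_BUCKETS

def embed(feature_id: str) -> np.ndarray:
    """Embed a feature of unbounded cardinality (e.g. a post or author ID)."""
    vec = np.zeros(EMBED_DIM)
    for seed in range(NUM_HASHES):
        vec += table[_bucket(seed, feature_id)]
    return vec

# Any ID works -- there is no vocabulary to build or maintain:
v = embed("post:1879301827364823041")
```

Using several hash functions reduces the impact of collisions: two IDs that collide under one hash are unlikely to collide under all of them.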
Part 2: Ranking — Grok transformer
Input
- User context: engagement history sequence (likes, replies, reposts, clicks, etc., in order)
- Candidate post: text, author, media features — all encoded via hash embeddings
Candidate isolation
Posts cannot attend to each other during inference. Each post only attends to the user context.
This is a deliberate design decision:
- Scores are batch-independent — the same post gets the same score regardless of what else is in the batch
- Scores are therefore cacheable
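Candidate isolation boils down to an attention mask. A sketch, with illustrative sizes: for a user-context sequence of length U and C candidate posts scored in one batch, each candidate may attend to the user context and to itself, but never to another candidate:

```python
import numpy as np

U, C = 6, 3                  # user-context length, candidates per batch
L = U + C                    # total sequence length
mask = np.zeros((L, L), dtype=bool)   # True = attention allowed

mask[:, :U] = True           # every position can attend to the user context
for i in range(U, L):
    mask[i, i] = True        # each candidate additionally attends to itself

# Candidate rows have False in every other candidate's column, so each
# post's score is independent of what else is in the batch.
```

Because a candidate's attention pattern never touches the other candidates, its score is a pure function of (user context, post), which is what makes the scores cacheable.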
Output — action probabilities
The model outputs ~15 probabilities per post:
| Category | Actions |
|---|---|
| Positive | like, reply, repost, quote, click, share, follow author, video view |
| Neutral | dwell, photo expand, profile click |
| Negative | block author, mute author, report, not interested |
Scoring formula
$$\text{Score} = \sum_{i} w_i \cdot P(\text{action}_i)$$
Positive actions get positive weights; negative actions get negative weights, actively suppressing unwanted content.
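As a worked illustration of the formula, with made-up weights (not production values) and an abbreviated action set:

```python
# Illustrative weights: positive actions up-weight, negative actions suppress.
weights = {
    "like": 1.0, "reply": 2.0, "repost": 1.5,   # positive
    "dwell": 0.1,                               # neutral
    "report": -5.0, "not_interested": -3.0,     # negative
}

def score(probs: dict[str, float]) -> float:
    """Score = sum_i w_i * P(action_i); missing actions count as P = 0."""
    return sum(w * probs.get(action, 0.0) for action, w in weights.items())

# A post likely to draw engagement outscores one likely to be reported:
engaging = score({"like": 0.3, "reply": 0.1, "dwell": 0.5})  # 0.3 + 0.2 + 0.05
risky = score({"like": 0.3, "report": 0.2})                  # 0.3 - 1.0
```

Note how a modest report probability (0.2) is enough to drag the second post's score negative: the large negative weight is what does the suppression.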
Implementation notes
- Transformer architecture ported from xAI's Grok-1 open-source release
- Adapted for recommendation use cases (not generative text)
- No hand-engineered features — the transformer learns all relevance from raw engagement sequences
Related
- X Algorithm — For You Feed — where Phoenix fits in the pipeline
- Thunder — in-network retrieval counterpart
- X Algorithm — Scoring Pipeline — weighted scorer and diversity scorer details
- X Algorithm — Key Concepts — user embedding, dot product similarity