75 PyTorch problems from real ML/AI interviews at Google, Meta, Anthropic, and more.
Follow me on Twitter | Try the Terminal | AI Tutor | Send Feedback
I struggled to grind for ML/AI interviews, so I went back to the basics and compiled this list after careful research. These are real problems drawn from engineers' first-person interview reports.
Important
Don't use GPT. The whole point is to struggle through these yourself. If you paste these into ChatGPT you're wasting your time. The goal is to deeply understand PyTorch, not to get an answer. I used GPT to help write some of the initial code, but I tested and solved every problem myself. That's where the learning happens.
Turn any AI assistant into your PyTorch interview coach. The TorchLeet MCP server gives your AI access to all 90 problems, progressive hints, company prep plans, and learning paths, while enforcing a no-spoilers teaching style.
# Clone the repo first
git clone https://github.com/Exorust/TorchLeet.git
cd TorchLeet
# Then connect the AI Tutor (pick your client)
# Claude Code
claude mcp add torchleet -- npx -y torchleet-mcp
# Codex
codex mcp add torchleet -- npx -y torchleet-mcp

Claude Desktop / Cursor / VS Code
Add this to your MCP config:
{
  "mcpServers": {
    "torchleet": {
      "command": "npx",
      "args": ["-y", "torchleet-mcp"]
    }
  }
}

Four learning guides:
| Guide | What it does |
|---|---|
| torchleet-tutor | Guides you through problems with progressive hints |
| torchleet-interview-prep | Timed mock interviews for specific companies |
| torchleet-review | Senior ML engineer reviews your code |
| torchleet-explain | Deep-dives from intuition to math to code |
Set up the AI Tutor | torchleet-mcp on npm
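If your client already has an MCP config with other servers in it, pasting the JSON snippet by hand can overwrite them. Here is a minimal sketch of merging the `torchleet` entry into an existing file instead. The helper name `add_torchleet_server` and the config path are hypothetical; where the config file lives depends on your client, so check its documentation.

```python
import json
from pathlib import Path


def add_torchleet_server(config_path):
    """Merge the torchleet entry into an MCP config file, keeping other servers.

    `config_path` is whatever file your client reads (an assumption here;
    the location differs between clients).
    """
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Same entry as the JSON snippet above; existing servers are left untouched.
    servers = config.setdefault("mcpServers", {})
    servers["torchleet"] = {"command": "npx", "args": ["-y", "torchleet-mcp"]}

    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_torchleet_server("path/to/your/mcp-config.json")` creates the file if it is missing and otherwise only adds or updates the `torchleet` key.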
75 problems across three tracks:
| Track | Focus | Questions |
|---|---|---|
| Basics | Core PyTorch, classical ML, fundamentals | 24 |
| LLM Learning Path | Build an LLM from scratch in order | 23 |
| Advanced | Systems, kernels, modern architectures, alignment | 48 |
Questions overlap between tracks. Company-tagged questions tell you exactly what Google, Anthropic, Meta, and others ask.
# Install PyTorch
# https://pytorch.org/get-started/locally/
# Pick a problem, fill in the TODOs, compare with the solution
jupyter notebook torch/basic/lin-regression/lin-regression.ipynb

Each problem has a question file and a _SOLN solution file. Fill in the ... and #TODO blocks, then check your work.
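To give a feel for the fill-in-the-TODO workflow, here is a rough sketch of what a completed linear regression exercise might look like. The layout, class name, data, and hyperparameters are all illustrative assumptions; the actual notebook in the repo may be structured differently.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: y = 2x + 3 plus a little noise (illustrative values).
X = torch.rand(100, 1) * 10
y = 2 * X + 3 + 0.1 * torch.randn(100, 1)


class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        # In a question file this line would be a TODO; one possible fill-in:
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)


model = LinearRegressionModel()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Plain full-batch gradient descent.
for epoch in range(2000):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

weight = model.linear.weight.item()
bias = model.linear.bias.item()
print(f"learned weight={weight:.2f}, bias={bias:.2f}")
```

A quick self-check before opening the solution file: the learned weight and bias should land close to the values used to generate the data (2 and 3 here).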
Build an LLM from scratch, one question at a time. Recommended order:
| Problem | Links |
|---|---|
| Implement Byte Pair Encoding from Scratch | Q |
| Implement Sinusoidal Embeddings | Q / S |
| Implement ROPE Embeddings | Q / S |
| Implement RMS Norm | |
| Implement Attention from Scratch | Q / S |
| Problem | Links |
|---|---|
| Implement Multi-Head Attention | Q / S |
| Implement Grouped Query Attention | Q / S |
| Implement KV Cache | Q / S |
| Implement Sliding Window Attention | Q / S |
| Problem | Links |
|---|---|
| Implement SmolLM from Scratch | Q / S |
| Problem | Companies | Links |
|---|---|---|
| Implement KL Divergence Loss | | |
| Implement LoRA | Meta, Google, Anthropic, OpenAI | Q / S |
| Apply SFT on SmolLM | | |
| Implement DPO Loss | Anthropic, OpenAI, DeepMind, Meta | Q / S |
| Implement PPO for RLHF | Anthropic, OpenAI, DeepMind, Meta | Q / S |
| Implement GRPO (DeepSeek-R1) | DeepMind, Anthropic, OpenAI | Q / S |
| Problem | Companies | Links |
|---|---|---|
| Temperature Sampling | OpenAI, Anthropic, Cohere | Q / S |
| Top-k Sampling | Anthropic, OpenAI, DeepMind | Q / S |
| Top-p (Nucleus) Sampling | Anthropic, OpenAI, DeepMind | Q / S |
| Speculative Decoding | Google, DeepMind, Anthropic | Q / S |
| Continuous Batching | Perplexity, Together AI, Meta | Q / S |
| Build a Complete LLM Inference Engine | Perplexity, Together AI, Fireworks AI | Q / S |
| Problem | Companies | Links |
|---|---|---|
| Mixture of Experts Layer | Google, DeepMind, Mistral, xAI | Q / S |
Core PyTorch and classical ML fundamentals.
| Problem | Difficulty | Links |
|---|---|---|
| Implement Linear Regression | Basic | Q / S |
| Custom Dataset and DataLoader | Basic | Q / S |
| Custom Activation Function | Basic | Q / S |
| Custom Loss Function (Huber Loss) | Basic | Q / S |
| Implement a Deep Neural Network | Basic | Q / S |
| Visualize Training with TensorBoard | Basic | Q / S |
| Save and Load PyTorch Model | Basic | Q / S |
| Implement a CNN on CIFAR-10 | Easy | Q / S |
| Implement an RNN from Scratch | Easy | Q / S |
| Data Augmentation with torchvision | Easy | Q / S |
| Add Benchmarking to PyTorch Code | Easy | Q / S |
| Train an Autoencoder for Anomaly Detection | Easy | Q / S |
| Quantize Your Language Model | Easy | Q / S |
| Mixed Precision Training | Easy | Q / S |
| Implement Softmax (numerically stable) | Easy | Q / S |
| Implement K-Means Clustering | Easy | Q / S |
| Implement KNN in PyTorch | Easy | Q / S |
| Implement Logistic Regression | Easy | Q / S |
| KL Divergence Loss | Easy | |
| RMS Norm | Easy | |
| Byte Pair Encoding | Easy | Q |
| CNN Parameter Initialization | Medium | Q / S |
| Implement a CNN from Scratch | Medium | Q / S |
| Implement an LSTM from Scratch | Medium | Q / S |
Company-tagged questions from real ML/AI interviews. Sorted by topic.
| Problem | Difficulty | Companies | Links |
|---|---|---|---|
| Contrastive Loss (InfoNCE) + CLIP | Medium | OpenAI, Anthropic, DeepMind, Midjourney | Q / S |
| 2D Positional Embeddings | Medium | Anthropic, DeepMind, Midjourney, Runway | Q / S |
| Sliding Window Attention | Medium | Mistral, Anthropic, Google, DeepMind | Q / S |
| Knowledge Distillation | Medium | Google, Apple, Meta, Qualcomm, Tesla | Q / S |
| Mixture of Experts Layer | Hard | Google, DeepMind, Mistral, Databricks, xAI | Q / S |
| DDPM (Denoising Diffusion) | Hard | Midjourney, Runway, Stability AI, Adobe, Google | Q / S |
| DDIM Sampling + Classifier-Free Guidance | Hard | Midjourney, Runway, Stability AI, Adobe | Q / S |
| Selective State Space Model (Mamba) | Hard | DeepMind, Google, Anthropic | Q / S |
| Vision Transformer + MAE Pretraining | Hard | Meta, Google, Apple, Tesla, Waymo | Q / S |
| Problem | Difficulty | Companies | Links |
|---|---|---|---|
| Implement LoRA | Medium | Meta, Google, Anthropic, OpenAI, Databricks | Q / S |
| Implement DPO Loss | Hard | Anthropic, OpenAI, DeepMind, Meta | Q / S |
| Implement PPO for RLHF | Hard | Anthropic, OpenAI, DeepMind, Meta | Q / S |
| Gradient Checkpointing | Hard | Meta, Google, NVIDIA, Tesla | Q / S |
| Implement GRPO (DeepSeek-R1) | Expert | DeepMind, Anthropic, OpenAI | Q / S |
| Apply SFT on SmolLM | Hard | | |
| Problem | Difficulty | Companies | Links |
|---|---|---|---|
| Implement KV Cache | Medium | Anthropic, OpenAI, Meta, Perplexity | Q / S |
| Speculative Decoding | Hard | Google, DeepMind, Anthropic, Apple | Q / S |
| Continuous Batching | Hard | Perplexity, Together AI, Anyscale, Meta | Q / S |
| GPTQ Quantization | Hard | | |
| RAG Search of Embeddings | Medium | | |
| Build a Complete LLM Inference Engine | Expert | Perplexity, Together AI, Anyscale, Fireworks AI | Q / S |
| Problem | Difficulty | Companies | Links |
|---|---|---|---|
| Fused Softmax Kernel in Triton | Expert | NVIDIA, Meta, Google, xAI, Tesla | Q / S |
| FlashAttention-2 in Triton | Expert | NVIDIA, Meta, Together AI, xAI | Q / S |
| FSDP (Fully Sharded Data Parallel) | Expert | Meta, Google, NVIDIA, Anthropic, xAI | Q / S |
| Ring Attention for Long Contexts | Expert | Anthropic, Google, Meta, xAI | Q / S |
| Problem | Difficulty | Links |
|---|---|---|
| Custom Autograd Function (SILU) | Hard | Q / S |
| Write a Transformer from Scratch | Hard | Q / S |
| Write a GAN | Hard | Q / S |
| Sequence-to-Sequence with Attention | Hard | Q / S |
| Explainable AI (GradCAM/SHAP) | Hard | Q / S |
"If I'm interviewing at X, which questions should I prioritize?" Numbers reference the v3-tagged questions above.
| Company | Priority Questions |
|---|---|
| Anthropic | 5, 6, 7, 8, 10, 12, 13, 14, 15, 18, 22, 26, 27, 30 |
| OpenAI | 5, 7, 8, 10, 11, 12, 14, 15, 27 |
| DeepMind | 5, 6, 7, 8, 9, 13, 14, 15, 17, 18, 22, 27 |
| Meta | 1, 2, 3, 4, 9, 11, 12, 14, 15, 16, 19, 23, 24, 25, 26, 30 |
| | 1, 2, 4, 9, 11, 13, 16, 17, 18, 20, 22, 23, 24, 26, 29, 30 |
| Apple | 1, 5, 9, 18, 23, 29 |
| NVIDIA | 16, 24, 25, 26 |
| Midjourney / Runway / Stability AI | 5, 6, 20, 21 |
| Perplexity / Together AI / Anyscale | 7, 10, 12, 19, 25, 28 |
| Tesla / Waymo | 16, 23, 24, 29 |
| xAI | 17, 24, 25, 26, 30 |
| Mistral / Cohere | 7, 8, 10, 13, 17 |
Found a bug? Have a question from your own interview? PRs are welcome. Follow the notebook structure (question file + _SOLN file) and tag the authors.
If you found this helpful, follow me on Twitter. I post about ML interviews, PyTorch tips, and what I'm building next. Or just send me feedback; I read everything.