A movie recommendation API powered by a Deep Learning Recommendation Model (DLRM). You send it your preferences, it runs them through a neural network, pulls real movie data from TMDB (posters, ratings, summaries — the works), and hands you back personalized recommendations. This is the backend that powers CinemaScopeAI, an iOS app for discovering movies you'll actually want to watch.
The API is live right now. Try it:
curl -X POST https://recommendersystem-l993.onrender.com/api/v1/predict \
-H "Content-Type: application/json" \
-d '{"continuous_features": [0.76, 0.5, 0.3, 0.4, 0.72, 0.6, 0.8, 0.52], "categorical_features": [1, 2]}'

Note: It's on Render's free tier, so the first request might take ~30 seconds to wake up. After that, responses come back fast.
You'll get something like this:
[
{
"title": "The Shawshank Redemption",
"genre": "Unknown",
"rating": "N/A",
"score": 0.87,
"poster_url": "https://image.tmdb.org/t/p/w500/q6y0Go1tsGEsmtFryDOJo3dEmqu.jpg",
"director": "N/A",
"release_year": 1994,
"summary": "Imprisoned in the 1940s for the double murder of his wife and her lover..."
},
{
"title": "Interstellar",
"genre": "Unknown",
"rating": "N/A",
"score": 0.84,
"poster_url": "https://image.tmdb.org/t/p/w500/gEU2QniE6E77NI6lCU6MxlNBvIx.jpg",
"director": "N/A",
"release_year": 2014,
"summary": "The adventures of a group of explorers who make use of a newly discovered wormhole..."
}
]

Five movies, ranked by how well they match your taste profile, each with a poster URL you can render directly in a client app.
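The same request can be made from Python instead of curl. A minimal sketch, assuming a hypothetical `build_payload` helper (not part of the repo) that mirrors the server-side feature-count checks:

```python
import json

# Hypothetical client-side helper: validates a preference payload locally
# before sending it to /api/v1/predict (8 continuous + 2 categorical features).
def build_payload(continuous, categorical):
    if len(continuous) != 8:
        raise ValueError(f"expected 8 continuous features, got {len(continuous)}")
    if len(categorical) != 2:
        raise ValueError(f"expected 2 categorical features, got {len(categorical)}")
    if not all(0.0 <= x <= 1.0 for x in continuous):
        raise ValueError("continuous features should be normalised to [0, 1]")
    return {"continuous_features": list(continuous),
            "categorical_features": list(categorical)}

payload = build_payload([0.76, 0.5, 0.3, 0.4, 0.72, 0.6, 0.8, 0.52], [1, 2])
body = json.dumps(payload)
# You would then POST `body` to the endpoint, e.g. with httpx:
#   httpx.post("https://recommendersystem-l993.onrender.com/api/v1/predict",
#              content=body, headers={"Content-Type": "application/json"})
```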
┌─────────────────────────────────────────────────────────────────────┐
│ CLIENT REQUEST │
│ POST /api/v1/predict │
│ { "continuous_features": [0.76, 0.5, 0.3, 0.4, 0.72, 0.6, 0.8, 0.52],│
│ "categorical_features": [1, 2] } │
└──────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ INPUT VALIDATION │
│ Pydantic checks types + FastAPI validates feature counts │
└──────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ DLRM MODEL (PyTorch) │
│ │
│ continuous features ──► Dense Layer ──────────┐ │
│ ├──► MLP ──► Score │
│ categorical features ──► Embedding Tables ────┘ [512→64→32→1] │
│ + Sigmoid │
│ → [0.0, 1.0] │
└──────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ TMDB API (async via httpx) │
│ Fetches popular movies: titles, ratings, poster paths, summaries │
└──────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ RANKING + RESPONSE │
│ Sorts movies by how closely they match the prediction score │
│ Returns top 5 with full metadata + poster URLs │
└─────────────────────────────────────────────────────────────────────┘
Step by step:
- You send preferences — 8 continuous features (engineered stats like average rating, activity, item popularity) and 2 categorical features (user ID and item ID).
- DLRM processes them — the model takes your 8 continuous features through a dense layer and your category choices through separate embedding tables (`[943, 1682]`, each 128-dim). It also computes an element-wise user × item interaction (the two embedding vectors multiplied together). Everything gets concatenated and pushed through a multi-layer perceptron (512 → 64 → 32 neurons) with Dropout(0.2) that outputs a single score between 0 and 1.
- Real movies get fetched — the API calls TMDB asynchronously to grab currently popular movies with their metadata.
- Results get ranked — movies are sorted by how well their TMDB rating aligns with the model's prediction score, and the top 5 are returned with titles, posters, release years, and summaries.
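The model step above can be sketched in PyTorch. Shapes follow the architecture diagram; `TinyDLRM` is an illustrative stand-in, not the real model (which lives in `models/dlrm.py`):

```python
import torch
import torch.nn as nn

class TinyDLRM(nn.Module):
    def __init__(self, num_users=943, num_items=1682, emb_dim=128):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, emb_dim)
        self.item_emb = nn.Embedding(num_items, emb_dim)
        self.dense = nn.Linear(8, emb_dim)  # 8 continuous features -> 128-dim
        # concat(dense, user, item, user*item) = 4 * 128 = 512,
        # matching the MLP [512 -> 64 -> 32 -> 1] in the diagram
        self.mlp = nn.Sequential(
            nn.Linear(4 * emb_dim, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(32, 1),
        )

    def forward(self, cont, cat):
        u = self.user_emb(cat[:, 0])                            # user embedding
        i = self.item_emb(cat[:, 1])                            # item embedding
        x = torch.cat([self.dense(cont), u, i, u * i], dim=1)   # u * i = interaction
        return torch.sigmoid(self.mlp(x)).squeeze(1)

model = TinyDLRM().eval()
scores = model(torch.rand(4, 8), torch.tensor([[1, 2], [0, 5], [10, 20], [3, 3]]))
# scores: shape (4,), each value in [0, 1]
```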
The DLRM (Deep Learning Recommendation Model) is built from scratch in PyTorch. What makes it interesting:
- Handles mixed feature types — 8 continuous features (engineered user/item stats) go through a dense layer, while categorical features (user ID, item ID) each get their own embedding table. This is how real recommendation systems at companies like Meta handle the mix of "how much" and "which one" data.
- Embedding tables — `[943, 1682]` — two embedding tables (users, items), each 128-dimensional.
- Explicit feature interaction — computes the element-wise user × item embedding product before the MLP, capturing direct collaborative filtering signal.
- MLP interaction layer — a 3-layer network (512 → 64 → 32) with ReLU activations and Dropout(0.2) that learns how continuous features, embeddings, and their interaction combine.
- Binary classification — ratings >= 4 are positive, trained with BCELoss. Sigmoid output produces a score between 0 and 1.
- Regularization — Dropout(0.2), weight decay (1e-5), early stopping (patience=5).
- Input validation — the forward pass checks for None inputs, mismatched feature counts, and MLP shape mismatches, raising clear errors instead of crashing silently.
- TMDB integration — fetches currently popular movies from The Movie Database API, so recommendations always reflect what's actually out there.
- Async fetching — uses `httpx.AsyncClient` so the API doesn't block while waiting on TMDB.
- Full metadata — each recommendation comes with title, rating, poster URL (ready to render in a UI), release year, and a plot summary.
- Versioned endpoints — `/api/v1/predict`, with a legacy `/predict/` that delegates to it for backward compatibility.
- Health checks — `GET /health` reports model status, app version, and whether everything is loaded.
- Model introspection — `GET /api/v1/models` returns architecture details, parameter count, device info, and layer configuration.
- Structured errors — every error (400, 404, 422, 500, 502, 503) returns consistent JSON with `success`, `error`, and `detail` fields. No raw stack traces in production.
- Request logging middleware — every request gets logged with method, path, status code, and duration in milliseconds.
- Makefile — one command for anything: `make run`, `make test`, `make lint`, `make docker-build`.
- uv for dependency management — fast, reproducible installs via `pyproject.toml`.
- Docker with health checks — the production Dockerfile includes a `HEALTHCHECK` directive that pings `/health` every 30 seconds.
- Comprehensive tests — 71 tests covering model forward passes, API endpoints, input validation, edge cases (empty batches, wrong types, missing fields), structured error responses, preprocessing config, and DLRM interaction verification.
- Ruff for linting — fast Python linting and formatting with a 120-char line length.
| Technology | Why |
|---|---|
| PyTorch | Full control over the DLRM architecture — custom forward pass, manual embedding tables, easy to extend |
| FastAPI | Async by default, automatic OpenAPI docs, Pydantic validation on every request, and it's fast |
| httpx | Async HTTP client for non-blocking TMDB API calls inside async FastAPI endpoints |
| uv | 10-100x faster than pip for dependency resolution and installs |
| Pydantic | Type-safe request/response schemas that auto-generate API documentation |
| Docker | Consistent environment from dev to production, with multi-stage builds and health checks |
| Ruff | Linting + formatting in one tool, written in Rust, runs in milliseconds |
| pytest | Test framework with fixtures, parametrize, and async support for testing FastAPI endpoints |
| Faiss | Approximate nearest-neighbor search for fast candidate retrieval in the two-stage pipeline |
| W&B | Experiment tracking — logs training curves, hyperparameters, and evaluation metrics |
| Render | Simple container deployment with auto-deploy from GitHub pushes |
See EVALUATION.md for full ablation studies, failure analysis, and reproduction instructions.
| Parameter | Value |
|---|---|
| Dataset | MovieLens 100K (100,000 ratings, 943 users, 1,682 items) |
| Embeddings | [943, 1682] x 128-dim (user, item) |
| Continuous features | 8 dense features → Linear(8, 128) |
| MLP | [512 → 64 → 32] with ReLU + Dropout(0.2) between layers |
| Output | Linear(32, 1) → Sigmoid |
| Loss | BCELoss (binary: rating >= 4 is positive) |
| Optimizer | Adam (lr=0.001, weight_decay=1e-5) |
| Scheduler | ReduceLROnPlateau (patience=3, factor=0.5) |
| Regularization | Dropout 0.2, early stopping (patience=5) |
| Training | 20 epochs max, batch_size=256 |
| Parameters | ~372K total (~90% embeddings, ~10% MLP + dense layers) |
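The early-stopping rule in the table (patience=5) can be sketched in plain Python. This is illustrative only; the real trainer's implementation may differ:

```python
# Illustrative early stopping: stop when the monitored loss fails to
# improve for `patience` consecutive epochs.
class EarlyStopping:
    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Record one epoch's loss; return True when training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=5)
losses = [0.70, 0.65, 0.64, 0.64, 0.64, 0.64, 0.64, 0.64]
stopped_at = next(i for i, loss in enumerate(losses) if stopper.step(loss))
# Training stops at epoch index 7: the fifth consecutive epoch without improvement.
```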
| # | Feature | Source | Description |
|---|---|---|---|
| 0 | `user_mean_rating` | User | Average rating the user gave, normalised by max rating (5.0) |
| 1 | `user_rating_count` | User | Number of ratings by user, normalised by max count in training set |
| 2 | `user_rating_var` | User | Variance of user's ratings, normalised by max possible variance (4.0) |
| 3 | `user_days_active` | User | Days between first and last rating, normalised |
| 4 | `item_mean_rating` | Item | Average rating the item received, normalised |
| 5 | `item_rating_count` | Item | Number of ratings for item, normalised |
| 6 | `item_popularity_rank` | Item | Rank-ordered popularity, normalised to [0, 1] (1.0 = most popular) |
| 7 | `user_item_deviation` | Interaction | user_mean - item_mean, shifted from [-1, 1] to [0, 1] |
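As a concrete illustration, features 0, 4, and 7 could be computed like this on a toy ratings list (a sketch; the actual logic lives in `data/preprocessing.py`):

```python
# Toy (user_id, item_id, rating) triples standing in for MovieLens rows.
ratings = [(1, 10, 5.0), (1, 20, 4.0), (1, 30, 3.0), (2, 10, 2.0), (2, 20, 5.0)]

def mean(xs):
    return sum(xs) / len(xs)

def user_mean_rating(user_id, max_rating=5.0):
    # Feature 0: user's average rating, normalised by the max rating.
    return mean([r for (u, _, r) in ratings if u == user_id]) / max_rating

def item_mean_rating(item_id, max_rating=5.0):
    # Feature 4: item's average rating, normalised.
    return mean([r for (_, i, r) in ratings if i == item_id]) / max_rating

def user_item_deviation(user_id, item_id):
    # Feature 7: user_mean - item_mean lies in [-1, 1]; shift to [0, 1].
    dev = user_mean_rating(user_id) - item_mean_rating(item_id)
    return (dev + 1.0) / 2.0

u1 = user_mean_rating(1)        # (5 + 4 + 3) / 3 / 5.0 = 0.8
d = user_item_deviation(1, 10)  # (0.8 - 0.7 + 1) / 2 = 0.55
```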
| Metric | Value |
|---|---|
| NDCG@10 | 0.6723 |
| Precision@10 | 0.5774 |
| Recall@10 | 0.3758 |
| HitRate@10 | 0.9965 |
| AUC | 0.5541 |
Evaluated on 283 held-out users, 20K test ratings. Train time: ~13s. Inference latency: ~0.05ms/user.
| Model | Approach | NDCG@10 | Prec@10 | Hit@10 |
|---|---|---|---|---|
| LogReg | StandardScaler → Logistic Regression (C=1.0, lbfgs) on 8 features | 0.8023 | 0.6880 | 1.0000 |
| XGBoost | 200 trees, max_depth=6, lr=0.1, subsample=0.8 on 8 features | 0.7836 | 0.6707 | 1.0000 |
| LightGBM | 200 trees, max_depth=6, lr=0.1, subsample=0.8 on 8 features | 0.7820 | 0.6661 | 0.9965 |
| DLRM | Learned user/item embeddings (128-dim) + interaction + 3-layer MLP | 0.6723 | 0.5774 | 0.9965 |
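For reference, NDCG@k with binary relevance and a log2 discount can be computed as below. This is the standard definition as a generic sketch, not the repo's evaluation code:

```python
import math

def ndcg_at_k(relevances, k):
    """relevances: binary relevance of each item, in ranked order."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

score = ndcg_at_k([1, 0, 1, 1, 0], k=5)  # relevant items at ranks 1, 3, 4
```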
| Endpoint | Strategy | Description |
|---|---|---|
| `/recommend/{user_id}` | Brute-force | Scores all 1,682 items in a single forward pass, returns top-K |
| `/recommend_v2/{user_id}` | Two-stage (ANN + rerank) | Faiss ANN retrieves top-100 candidates, DLRM reranks to top-K |
| Cold-start | Popularity fallback | Unknown users receive popularity-ranked recommendations |
Stage 1: Candidate Generation (~1ms) Stage 2: Reranking
┌──────────────────────────────┐ ┌─────────────────────────────┐
│ User embedding (128-dim) │ │ Full DLRM forward pass │
│ ↓ │ │ on 100 candidates │
│ Faiss IndexFlatIP (cosine) │──────►│ 8 features + embeddings │
│ 1,682 items → top 100 │ │ → top-K ranked results │
└──────────────────────────────┘ └─────────────────────────────┘
- Stage 1 uses Faiss inner-product search over normalised item embeddings to retrieve ~100 candidates in ~1ms.
- Stage 2 runs the full DLRM (8 continuous features + user/item embeddings → MLP) on the candidate set and returns the final top-K.
- Falls back to sklearn `NearestNeighbors` (brute cosine) if `faiss-cpu` is not installed.
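Stage 1 can be sketched with plain NumPy inner-product search over L2-normalised embeddings, which is the same idea as Faiss `IndexFlatIP` (random embeddings stand in for the trained table here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the trained 128-dim item embedding table (1,682 items).
item_emb = rng.standard_normal((1682, 128)).astype(np.float32)
item_emb /= np.linalg.norm(item_emb, axis=1, keepdims=True)  # L2-normalise rows

def retrieve_candidates(user_vec, k=100):
    user_vec = user_vec / np.linalg.norm(user_vec)
    scores = item_emb @ user_vec      # inner product == cosine after normalisation
    return np.argsort(-scores)[:k]    # indices of the top-k items

candidates = retrieve_candidates(rng.standard_normal(128).astype(np.float32))
# Stage 2 would now run the full DLRM forward pass on these candidates only.
```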
- Python 3.10+
- uv (recommended) or pip
- A TMDB API key (get one free here)
git clone https://github.com/AkinCodes/RecommenderSystem.git
cd RecommenderSystem
make install

This runs `uv sync --all-extras`, which installs everything from `pyproject.toml` including dev dependencies (pytest, ruff, etc.).
cp .env.example .env

Open `.env` and add your TMDB credentials:
TMDB_API_KEY=your_tmdb_api_key_here
TMDB_BEARER_TOKEN=your_tmdb_bearer_token_here
make run

This starts uvicorn with hot reload at http://localhost:8000. You should see:
INFO: Configuration loaded and validated successfully.
INFO: DLRM model loaded successfully from 'trained_model.pth'.
INFO: Uvicorn running on http://0.0.0.0:8000
curl -X POST http://localhost:8000/api/v1/predict \
-H "Content-Type: application/json" \
-d '{"continuous_features": [0.76, 0.5, 0.3, 0.4, 0.72, 0.6, 0.8, 0.52], "categorical_features": [1, 2]}'

Or visit http://localhost:8000/docs for the interactive Swagger UI.
make test

You should see all tests pass:
tests/test_model.py::TestDLRMModel::test_forward_output_shape PASSED
tests/test_model.py::TestDLRMModel::test_forward_output_range PASSED
tests/test_model.py::TestPredictEndpoint::test_predict_endpoint PASSED
...
Status check. Returns a simple alive message.
{ "status": "healthy", "message": "Recommendation API is running." }

Health check with model status and version info.
{
"status": "healthy",
"model_loaded": true,
"version": "1.0.0"
}

Returns details about the loaded DLRM model.
{
"architecture": "DLRM (Deep Learning Recommendation Model)",
"num_parameters": 372097,
"device": "cpu",
"num_continuous_features": 8,
"num_categorical_features": 2,
"mlp_layers": [512, 64, 32]
}

The main endpoint. Send user preferences, get back 5 movie recommendations.
Request:
{
"continuous_features": [0.76, 0.5, 0.3, 0.4, 0.72, 0.6, 0.8, 0.52],
"categorical_features": [1, 2]
}

Response (200):
[
{
"title": "Movie Title",
"genre": "Unknown",
"rating": "N/A",
"score": 0.85,
"poster_url": "https://image.tmdb.org/t/p/w500/poster.jpg",
"director": "N/A",
"release_year": 2024,
"summary": "A brief plot overview from TMDB."
}
]

Error responses:
| Status | When |
|---|---|
| 400 | Wrong number of categorical features |
| 422 | Invalid types (strings instead of numbers, missing fields) |
| 502 | TMDB API is down or returned no movies |
| 503 | Model isn't loaded |
All errors return structured JSON:
{
"success": false,
"error": "HTTP 400",
"detail": "Expected 2 categorical features, got 1."
}

Personalized top-K recommendations for a known MovieLens user. Scores every item in one forward pass and returns the highest-scoring movies with metadata.
Query parameters:
| Parameter | Default | Description |
|---|---|---|
| `top_k` | 10 | Number of recommendations to return |
Example:
curl "http://localhost:8000/api/v1/recommend/1?top_k=5"

Response (200):
{
"user_id": 1,
"recommendations": [
{
"item_id": 50,
"score": 0.9312,
"title": "Star Wars (1977)",
"genres": ["Action", "Adventure", "Romance", "Sci-Fi", "War"]
}
]
}

Response status codes:
| Status | When |
|---|---|
| 200 | Unknown user — returns popularity-based fallback with "note": "cold-start" |
| 404 | Unknown user AND no popular items available (rare) |
| 503 | Model or serving context not loaded |
Legacy endpoint. Delegates to /api/v1/predict — same request format, same response.
The training pipeline uses PyTorch with early stopping, learning rate scheduling, and W&B logging.
uv run python scripts/train.py

This will:
- Create a synthetic dataset (1000 samples with 10 continuous + 5 categorical features)
- Train the DLRM for 5 epochs with BCE loss
- Save checkpoints to `lightning_logs/checkpoints/`
- Run validation after training
- Load the best checkpoint and run a test inference
All hyperparameters live in configs/config.yaml:
| Parameter | Default | What it controls |
|---|---|---|
| `num_features` | 10 | Number of continuous input features |
| `embedding_sizes` | [10, 10, 10, 10, 10] | Vocabulary size for each categorical embedding table |
| `mlp_layers` | [384, 64, 32] | Hidden layer dimensions for the interaction MLP |
| `learning_rate` | 0.001 | Adam optimizer learning rate |
| `epochs` | 5 | Number of training epochs |
| `batch_size` | 32 | Samples per training batch |
The ModelCheckpoint callback saves the top 3 models by training loss, plus a last.ckpt that always has the most recent weights. Checkpoint files are named like dlrm-epoch=04-train_loss=0.65.ckpt. To use a trained model in the API, export its state dict to trained_model.pth in the project root.
The trainer separates dense parameters (linear layers) and sparse parameters (embeddings) into two different optimizers — Adam for dense, SparseAdam for sparse. This is a standard pattern in recommendation systems because embedding gradients are naturally sparse (only the rows that were looked up get updated).
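The dense/sparse split can be sketched as follows. The parameter grouping here is assumed for illustration; see the training scripts for the real setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(943, 128, sparse=True)  # sparse grads: only looked-up rows update
dense = nn.Linear(8, 128)

# Two optimizers: Adam for dense params, SparseAdam for the embedding table.
# (SparseAdam does not support weight decay, so it applies only to the dense group.)
opt_dense = torch.optim.Adam(dense.parameters(), lr=1e-3, weight_decay=1e-5)
opt_sparse = torch.optim.SparseAdam(emb.parameters(), lr=1e-3)

w_before = emb.weight.detach().clone()
ids = torch.tensor([1, 7])                 # only these embedding rows get gradients
loss = emb(ids).sum() + dense(torch.rand(2, 8)).sum()
loss.backward()
opt_dense.step(); opt_sparse.step()

# Row 0 was never looked up, so SparseAdam leaves it untouched; rows 1 and 7 moved.
row0_untouched = torch.equal(emb.weight[0], w_before[0])
```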
| Command | What it does |
|---|---|
| `make install` | Install all dependencies including dev extras via `uv sync --all-extras` |
| `make run` | Start the FastAPI dev server with hot reload on port 8000 |
| `make test` | Run the full test suite with verbose output |
| `make lint` | Check code with Ruff (linting + format check) |
| `make docker-build` | Build the Docker image as `recommender-system` |
| `make docker-run` | Run the container on port 8000, injecting `.env` variables |
| `make clean` | Remove `__pycache__`, `.pyc`, `.pytest_cache`, `.ruff_cache`, and build artifacts |
| Variable | Required | Description |
|---|---|---|
| `TMDB_API_KEY` | Yes | TMDB v3 API key (used for authentication) |
| `TMDB_BEARER_TOKEN` | Yes | TMDB v4 read-access token (used in the Bearer auth header) |
Get both at themoviedb.org/settings/api — you'll need a free account.
The app will still start without them (`model_loaded` will still be true), but predictions will fail at the TMDB fetch step with a 502 error.
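If you'd rather fail fast at startup than hit a 502 later, a small check like this works. The `load_tmdb_config` helper is hypothetical; the real app reads these variables through its own settings:

```python
import os

# Hypothetical fail-fast check for the two required TMDB variables.
def load_tmdb_config(env=None):
    env = os.environ if env is None else env
    missing = [k for k in ("TMDB_API_KEY", "TMDB_BEARER_TOKEN") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {"api_key": env["TMDB_API_KEY"], "bearer_token": env["TMDB_BEARER_TOKEN"]}

cfg = load_tmdb_config({"TMDB_API_KEY": "your_key", "TMDB_BEARER_TOKEN": "your_token"})
```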
RecommenderSystem/
├── api/
│ └── app.py # FastAPI app — endpoints, middleware, TMDB client, error handling
├── models/
│ ├── __init__.py
│ ├── dlrm.py # DLRM model — embeddings, interaction, MLP, forward pass
│ └── classical.py # Classical baselines (XGBoost, LightGBM, LogReg)
├── data/
│ ├── preprocessing.py # Shared preprocessing — PrepConfig, validation, feature engineering
│ └── ml-100k/ # MovieLens 100K dataset
├── scripts/
│ ├── train_movielens.py # Train DLRM on MovieLens with W&B logging
│ ├── retrain_and_compare.py # Train DLRM + 3 classical baselines, generate comparison report
│ ├── run_experiment.py # A/B experiment runner (Welch's t-test, Cohen's d, power analysis)
│ ├── experiment_framework.py # A/B testing framework — statistical comparison engine
│ ├── baseline_comparison.py # Heuristic baselines (random, popular, user-mean) vs DLRM
│ ├── benchmark_inference.py # PyTorch vs ONNX inference latency benchmarks
│ ├── drift_detection.py # Data drift detection between train/test distributions
│ ├── fairness_analysis.py # Fairness audit — popularity bias, user activity gap, diversity
│ ├── export_onnx.py # Export DLRM to ONNX format
│ └── train.py # Demo training script with synthetic data
├── tests/
│ ├── test_api_integration.py # 31 API endpoint tests (predict, recommend, errors, health)
│ ├── test_model.py # 20 DLRM unit tests + API mock tests
│ ├── test_upgrades.py # 10 tests for PrepConfig + interaction feature
│ └── test_validation.py # 11 tests for data validation and split checks
├── reports/
│ └── model_comparison.json # Benchmark results (DLRM vs XGBoost vs LightGBM vs LogReg)
├── configs/
│ └── config.yaml # Hyperparameters — features, layers, learning rate, epochs
├── deploy/
│ ├── task-definition.json # AWS ECS Fargate task definition
│ └── trust-policy.json # AWS IAM trust policy for ECS execution role
├── Dockerfile # Production Dockerfile (Python 3.11, uv, health checks)
├── Makefile # Dev shortcuts — install, run, test, lint, docker, clean
├── pyproject.toml # Project metadata, dependencies, tool config (ruff, pytest)
├── EVALUATION.md # Full ablation studies and reproduction instructions
├── .env.example # Template for required environment variables
└── README.md # You are here
make docker-build
make docker-run

The production Dockerfile uses Python 3.11 with uv for fast installs and includes a health check that pings /health every 30 seconds.
The API is deployed on Render at https://recommendersystem-l993.onrender.com. It auto-deploys on every push to main. Render builds the Docker image, sets the environment variables, and runs the container.
The repo includes an ECS task definition (task-definition.json) configured for Fargate with 256 CPU units and 512MB memory. The container image is pushed to ECR at YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/cinemascope-recsys:latest.
To deploy to ECS:
# Build and push to ECR
docker build -t cinemascope-recsys .
docker tag cinemascope-recsys:latest YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/cinemascope-recsys:latest
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
docker push YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/cinemascope-recsys:latest
# Register task and run
aws ecs register-task-definition --cli-input-json file://task-definition.json
aws ecs update-service --cluster your-cluster --service your-service --force-new-deployment

- Sequential model — add SASRec or BERT4Rec to capture temporal user behavior (what the user watched recently, not just aggregate stats)
- Genre features in DLRM — genre data is already loaded but not used in training. Adding content features would improve cold-start for items.
- Negative sampling — explicit hard-negative mining during training to improve ranking quality
- Feature store — replace the pickle-based serving context with a feature store (Feast/Redis) for fresh user features
- Model versioning — track which model version served each prediction for reproducibility
- Rate limiting — protect the TMDB integration from getting throttled under heavy load
- Caching — Redis or in-memory cache for TMDB responses since popular movies don't change every second
- CinemaScopeAI — the iOS app that talks to this API. Built with SwiftUI, displays recommendations with posters, ratings, and summaries.
- MoviePosterAI — poster analysis using computer vision.
Akin Olusanya