ProtoMech


This is the official code repository for the paper "Protein Circuit Tracing via Cross-layer Transcoders", by Darin Tsui, Kunal Talreja, Daniel Saeedi, and Amirali Aghazadeh, accepted into ICML 2026. A link to the paper can be found here.

Additionally, one can explore protein circuits through our web-based visualizer!

Quick Start

The easiest way to get started with ProtoMech is through our interactive Google Colab notebook. No local installation is required.

Workflow

1. Models: ProtoMech currently supports running on ESM2-8M and ESM2-35M!
2. Circuit Discovery (optional): Train a probe on your custom dataset (Binary classification or Regression) to identify circuits.
3. Interactive Visualization: Generate files required for our website and visualize circuits!

If you skip step 2, you can obtain circuit files in two ways:

Use Our Pre-discovered Library: If you want to explore circuits from our paper, we provide a curated list of circuits here, which you can access through our notebook.
Auto-generate Your Own: Even without a custom dataset, you can still generate a circuit! Just leave the circuit option blank.

Environment Setup

Create the conda environment by running:

conda env create -f clt.yml
conda activate clt

Repository Structure

ProtoMech/
├── training/              # CLT training code
├── training_block/        # Windowed CLT training code
├── training_transcoder/   # PLT training code
├── circuit_utils/         # Core circuit discovery utilities
├── family_circuit/        # Protein family-based circuit discovery
├── function_circuit/      # DMS function-based circuit discovery
├── steering/              # Probe and DMS steering experiments
├── esm_steering/          # CAA (Contrastive Activation Addition) steering
├── visualization/         # Circuit analysis and PyMOL visualization
├── data/                  # Training data generation
└── plots_and_tables/      # Plots and tables

Model Architectures

Cross-Layer Transcoder (CLT)

Location: training/clt_model.py

Replaces ESM2 MLP blocks with a sparse transcoder using information from all preceding layers:

Top-K Activation: Only top-k latents are active (enforces sparsity)
Cross-Layer Decoding: Layer l reconstructs using latents from layers 0 to l
AuxK Loss: Encourages rarely-used latents to activate
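
The first two ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in training/clt_model.py: the function names, the nested-list decoder layout, and the toy numbers are all assumptions, and biases, normalization, and the AuxK loss are left out.

```python
def top_k_activation(pre_acts, k):
    """Keep the k largest pre-activations and zero out the rest."""
    if k <= 0:
        return [0.0] * len(pre_acts)
    # Indices of the k largest values (ties go to the lower index).
    order = sorted(range(len(pre_acts)), key=lambda i: -pre_acts[i])
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_acts)]


def cross_layer_reconstruct(latents, decoders, target_layer):
    """Reconstruct the MLP output at target_layer from layers 0..target_layer.

    latents[s] is the sparse latent vector of source layer s.
    decoders[(s, t)] is a (num_latents x d_model) matrix, as nested lists,
    mapping latents of source layer s to the output of target layer t.
    (This layout is hypothetical and chosen for readability.)
    """
    d_model = len(decoders[(0, target_layer)][0])
    out = [0.0] * d_model
    for source in range(target_layer + 1):
        weights = decoders[(source, target_layer)]
        for j, activation in enumerate(latents[source]):
            if activation == 0.0:
                continue  # inactive latents contribute nothing
            for d in range(d_model):
                out[d] += activation * weights[j][d]
    return out


# Toy example: 2 layers, 3 latents per layer, model width 2.
latents = [
    top_k_activation([0.2, 1.5, -0.3], k=1),  # layer 0 keeps one latent
    top_k_activation([2.0, 0.1, 1.0], k=2),   # layer 1 keeps two latents
]
decoders = {
    (0, 0): [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    (0, 1): [[0.0, 0.0], [2.0, 0.0], [0.0, 0.0]],
    (1, 1): [[1.0, 1.0], [0.0, 0.0], [0.0, 3.0]],
}
print(cross_layer_reconstruct(latents, decoders, target_layer=1))
```

Note that layer 1's reconstruction sums contributions from the layer 0 and layer 1 latents, which is what distinguishes a CLT from a per-layer transcoder.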

Windowed Cross-Layer Transcoder (Windowed CLT)

Location: training_block/clt_model.py

Variant of the CLT that restricts cross-layer connectivity to localized windows, trading some cross-layer coverage for lower compute time.

Block size: Sets the window size for cross-layer connectivity.
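
To make the effect of the block size concrete, here is a hedged sketch of which source layers feed each target layer. It assumes a sliding window over the most recent `block_size` layers; the code in training_block/ may define blocks differently (for example as non-overlapping groups), and `source_layers` and `num_decoder_pairs` are hypothetical helper names.

```python
def source_layers(target_layer, block_size=None):
    """Layers whose latents feed the reconstruction at target_layer.

    block_size=None gives the full CLT (layers 0..target_layer).
    Otherwise only the block_size most recent layers are used. This is an
    assumed sliding-window reading of "block size"; block_size=1 reduces
    to one source layer per target, as in a per-layer transcoder.
    """
    if block_size is None:
        start = 0
    else:
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        start = max(0, target_layer - block_size + 1)
    return list(range(start, target_layer + 1))


def num_decoder_pairs(num_layers, block_size=None):
    """Count (source, target) decoder pairs across all layers."""
    return sum(len(source_layers(t, block_size)) for t in range(num_layers))


num_layers = 6  # ESM2-8M has 6 transformer layers
print(source_layers(4))                              # full CLT
print(source_layers(4, block_size=2))                # windowed
print(num_decoder_pairs(num_layers))                 # full CLT
print(num_decoder_pairs(num_layers, block_size=2))   # windowed
```

Under this reading, the number of decoder pairs grows quadratically with depth for the full CLT but only linearly for a fixed window, which is where the compute savings come from.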

Per-Layer Transcoder (PLT)

Location: training_transcoder/plt_model.py

Baseline where each layer has independent encoder/decoder pairs. Layer l only uses its own latents.

Training

# Train CLT
cd training && sh main.sh

# Train Windowed CLT
cd training_block && sh main.sh

# Train PLT
cd training_transcoder && sh main_plt.sh

If you would like to train your own model, download training_sequences_5m.a2m from https://huggingface.co/datasets/ktalreja/ProtoMechData and put it in the data folder.

Circuit Discovery

Identifies minimal subsets of latents that recover a target property (family classification or DMS fitness).
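
As a rough illustration of what "minimal subset that recovers a target property" can mean, the sketch below greedily keeps the highest-contributing latents until a chosen fraction of the total score is covered. This is an assumption-laden toy, not the procedure in circuit_utils/: the contribution scores, the `recovery` threshold, and the function name `greedy_circuit` are all hypothetical.

```python
def greedy_circuit(contributions, recovery=0.9):
    """Pick the fewest latents whose summed contribution reaches
    `recovery` times the total.

    contributions maps a node id, here (layer, latent_index), to a
    non-negative score, for example a probe weight times the latent's
    mean activation (an assumed scoring rule).
    """
    if any(value < 0 for value in contributions.values()):
        raise ValueError("this sketch assumes non-negative contributions")
    total = sum(contributions.values())
    if total <= 0:
        return []
    # Largest contributions first; with non-negative scores this yields
    # the smallest set that reaches the threshold.
    ranked = sorted(contributions.items(), key=lambda item: -item[1])
    circuit, covered = [], 0.0
    for node, value in ranked:
        if covered >= recovery * total:
            break
        circuit.append(node)
        covered += value
    return circuit


contributions = {(0, 12): 4.0, (2, 7): 3.0, (3, 1): 2.0, (5, 40): 1.0}
print(greedy_circuit(contributions, recovery=0.75))
```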

Core Utilities (`circuit_utils/`)

| File | Description |
| --- | --- |
| `clt_circuit.py` | CLT circuit discovery |
| `plt_circuit.py` | PLT circuit discovery |
| `esm_activation.py` | ESM-2 activation extraction |

Family Circuit Discovery (`family_circuit/`)

Discovers circuits distinguishing protein families (InterPro domains).

cd family_circuit
sh main.sh                        # Full run for all families
sh main.sh --target IPR000724     # Specific family

You can download our Swiss-Prot data used for our family circuits, swissprot_seqid30_75k_all_info_with_3di.parquet, from https://huggingface.co/datasets/ktalreja/ProtoMechData and put it in the data folder.

Function Circuit Discovery (`function_circuit/`)

Discovers circuits using DMS fitness data.

cd function_circuit && sh main.sh

Steering

Modifies sequence generation by amplifying or ablating circuit nodes.
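
Amplifying or ablating a circuit node amounts to rescaling its latent activation before decoding. The following is a minimal sketch under that assumption; `steer_latents` and the dictionary layout are hypothetical and do not mirror the replacement-model classes in steering/.

```python
def steer_latents(latents, circuit, scale):
    """Return a copy of latents with the circuit nodes multiplied by scale.

    latents: dict mapping (layer, latent_index) -> activation
    circuit: iterable of (layer, latent_index) nodes to intervene on
    scale:   0.0 ablates the nodes, values above 1.0 amplify them
    """
    nodes = set(circuit)
    return {
        key: (value * scale if key in nodes else value)
        for key, value in latents.items()
    }


latents = {(0, 12): 1.5, (2, 7): 0.5, (3, 1): 2.0}
circuit = [(0, 12), (3, 1)]
print(steer_latents(latents, circuit, scale=0.0))  # ablation
print(steer_latents(latents, circuit, scale=2.0))  # amplification
```

In a full pipeline the modified latents would then be decoded by a replacement model and passed on to the language model head to sample sequences.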

Replacement Models (`steering/`)

| File | Description |
| --- | --- |
| `full_replacement_models.py` | `FullCLTReplacementModel`, `FullPLTReplacementModel` |
| `local_replacement_models.py` | Local replacement model for CLT |
| `run_probe_steering.py` | Probe-based steering |

CAA Steering (`esm_steering/`)

Contrastive Activation Addition steering using steering vectors from contrastive pairs.

cd esm_steering && sh main_caa_steering.sh
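
For readers unfamiliar with CAA, the core computation is a difference of mean activations between the two sides of the contrastive pairs, added back to a hidden state with a scaling factor. The sketch below shows that arithmetic on plain lists; the function names and `alpha` are illustrative assumptions, and the choice of layer and token positions used in esm_steering/ is not shown.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def steering_vector(positive_acts, negative_acts):
    """Mean activation on positive examples minus mean on negative ones."""
    positive_mean = mean_vector(positive_acts)
    negative_mean = mean_vector(negative_acts)
    return [p - n for p, n in zip(positive_mean, negative_mean)]


def apply_steering(hidden, vector, alpha=1.0):
    """Add the scaled steering vector to a hidden state."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


positive_acts = [[1.0, 2.0], [3.0, 4.0]]  # toy activations, property present
negative_acts = [[0.0, 1.0], [2.0, 1.0]]  # toy activations, property absent
vector = steering_vector(positive_acts, negative_acts)
print(vector)
print(apply_steering([0.5, 0.5], vector, alpha=2.0))
```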

Visualization

Location: visualization/

| File | Description |
| --- | --- |
| `circuit_analysis.py` | Family-level circuit analysis |
| `circuit_analysis_function.py` | Function/DMS-level analysis |
| `generate_pymol_view.py` | PyMOL visualization scripts |
| `compute_activations.py` | Computes top-10 sequences per act |

If you want to use compute_activations.py instead of using the pre-saved top activation results found in top10_activations.pt (which can be found here), download swissprot_full.parquet from https://huggingface.co/datasets/ktalreja/ProtoMechData and put it in the data folder.
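
Conceptually, a top-10 table of this kind can be built by ranking sequences by activation for each latent. The sketch below assumes that reading and uses a hypothetical record layout of (sequence id, latent id, activation); it is not the implementation in compute_activations.py.

```python
import heapq


def top_sequences_per_latent(records, n=10):
    """Return, for each latent, the n sequence ids with the highest activation.

    records: iterable of (sequence_id, latent_id, activation) tuples.
    """
    by_latent = {}
    for sequence_id, latent_id, activation in records:
        by_latent.setdefault(latent_id, []).append((activation, sequence_id))
    return {
        latent_id: [seq for _, seq in heapq.nlargest(n, scored)]
        for latent_id, scored in by_latent.items()
    }


records = [
    ("P1", 0, 0.9), ("P2", 0, 0.4), ("P3", 0, 1.2),
    ("P1", 1, 0.1), ("P3", 1, 0.7),
]
print(top_sequences_per_latent(records, n=2))
```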

Previous Data

You can find the models at https://huggingface.co/ktalreja/ProtoMechModels and the data used in this paper at https://huggingface.co/datasets/ktalreja/ProtoMechData.
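
If you prefer to script the downloads, something like the following may work. It is a sketch under several assumptions: that the `huggingface_hub` package is installed (it may not be part of clt.yml), that each file sits at the root of the dataset repository, and that you run it from the repository root so that `data` is the intended folder. The helper names are hypothetical.

```python
from pathlib import Path

DATA_REPO = "ktalreja/ProtoMechData"  # dataset repository named above


def local_target(filename, data_dir="data"):
    """Path where the README expects a downloaded file to live."""
    return Path(data_dir) / filename


def fetch_data_file(filename, data_dir="data"):
    """Download one file from the dataset repo into data_dir (needs network)."""
    # Imported lazily so the helper above works without the package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=DATA_REPO,
        filename=filename,      # assumed to be at the repo root
        repo_type="dataset",
        local_dir=data_dir,
    )


# Example call (not run here because it requires network access):
# fetch_data_file("training_sequences_5m.a2m")
print(local_target("training_sequences_5m.a2m"))
```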

Citation

If you use ProtoMech and enjoy it, please consider citing our paper!

@inproceedings{tsui2026protomech,
  title={Protein Circuit Tracing via Cross-layer Transcoders},
  author={Tsui, Darin and Talreja, Kunal and Saeedi, Daniel and Aghazadeh, Amirali},
  booktitle={Proceedings of the 43rd International Conference on Machine Learning},
  year={2026}
}
