Accepted by SIGGRAPH 2026
Yue Ma1*✉, Xu Ye1*, Qinghe Wang2, Yucheng Wang1, Hongyu Liu1, Yinhan Zhang1, Xinyu Wang3, Yuanpeng Chen4, Shanhui Mo4, Paul Liang5, Fangneng Zhan5, Qifeng Chen1
1Hong Kong University of Science and Technology 2Dalian University of Technology 3Tsinghua University 4Independent 5Massachusetts Institute of Technology (MIT)
*Equal Contribution ✉Corresponding Author
Your star means a lot to us in developing this project! ⭐⭐⭐
Place the final demo video here.

📖 Table of Contents
- [EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation](#easyvfx-frequency-driven-decoupling-for-resource-efficient-vfx-generation)
We introduce EasyVFX, a resource-efficient framework that achieves realistic VFX synthesis under stringent constraints. Our core philosophy lies in frequency-domain decomposition: we observe that the complexity of VFX can be significantly mitigated by decoupling high-frequency components, which represent intricate spatial appearances, from low-frequency components, which encapsulate global motion dynamics. This spectral disentanglement transforms a high-dimensional learning problem into manageable sub-tasks, thereby lowering the optimization barrier and reducing data dependency. Building upon this insight, we propose a two-stage training paradigm. First, we design a Frequency-aware Mixture-of-Experts (Freq-MoE) architecture. By utilizing a soft routing mechanism, our model assigns specialized experts to distinct spectral bands, enabling them to cultivate robust priors for appearance and motion dynamics. This specialization allows the model to acquire foundational VFX knowledge with fewer GPU resources. Second, we introduce a Test-Time Training strategy powered by a novel Frequency-constraint Loss. This allows the pre-trained model to swiftly adapt to specific, unseen effects through localized optimizations, requiring only about 100 steps on a single GPU.
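To make the idea of frequency-domain decomposition concrete, here is a minimal sketch that splits a single 2D frame into a low-frequency part and a high-frequency part with a radial FFT mask. It is an illustration only and not the EasyVFX implementation: the function name `split_frequency_bands`, the `cutoff` parameter, and the hard radial mask are all assumptions made for this example.

```python
import numpy as np


def split_frequency_bands(frame, cutoff=0.1):
    """Split a 2D array into low- and high-frequency parts.

    Hypothetical helper for illustration. `cutoff` is the radius of the
    low-pass region in cycles per sample (each axis spans -0.5 to 0.5).
    """
    h, w = frame.shape
    # Frequency coordinate of every FFT bin along each axis.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(frame)
    # The mask is symmetric in frequency, so both parts are real
    # up to floating-point error for a real-valued input.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


# Usage: the two bands sum back to the original frame.
rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 64))
low, high = split_frequency_bands(frame, cutoff=0.1)
assert np.allclose(low + high, frame)
```

Because the two masks are complementary, the split loses no information; each band could then be handled by a separate component, which is the general intuition behind assigning experts to spectral bands.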

Environment Requirement 🔧
Step 1: Clone this repo
git clone https://github.com/TencentARC/FlexiAct.git
Step 2: Install required packages
bash env.sh
conda activate EasyVFX
Data Preparation ⏬
Option 1: Prepare data
You can download the data used in our paper here.
cd EasyVFX
git clone https://huggingface.co/datasets/BianYx/VAP-Data
You need to organize your training dataset in the following structure:
|-- benchmark
    |-- captions
        |-- some VFX
            |-- VFX
                |-- crop.csv
                |-- val_image.csv
    |-- reference_videos
        |-- some VFX
            |-- VFX
                |-- 0.mp4
                |-- 1.mp4
                |-- ...
    |-- extract_vid_and_crop.py
    |-- target_image
        |-- some VFX
            |-- 1.jpg
            |-- 2.webp
            |-- ...
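Before running the preparation script, it can help to confirm that the folders are in place. The sketch below is a hypothetical helper, not part of this repository. It assumes the tree above nests as `benchmark/captions/<some VFX>/<VFX>/` with `crop.csv` and `val_image.csv` inside each effect folder; adjust it if your layout differs.

```python
from pathlib import Path


def check_benchmark_layout(root):
    """Return a list of problems found under `root` (empty list means OK).

    Hypothetical helper; the expected nesting is an assumption based on
    the directory tree in this README.
    """
    root = Path(root)
    problems = []
    for name in ("captions", "reference_videos", "target_image"):
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    if not (root / "extract_vid_and_crop.py").is_file():
        problems.append("missing file: extract_vid_and_crop.py")

    # Each captions/<some VFX>/<VFX>/ folder should hold both CSV files.
    captions = root / "captions"
    if captions.is_dir():
        for effect_dir in sorted(captions.glob("*/*")):
            if not effect_dir.is_dir():
                continue
            for csv_name in ("crop.csv", "val_image.csv"):
                if not (effect_dir / csv_name).is_file():
                    rel = effect_dir.relative_to(root).as_posix()
                    problems.append(f"missing {csv_name} in {rel}")
    return problems


# Usage: print every problem found, or a short confirmation.
for problem in check_benchmark_layout("benchmark") or ["layout looks OK"]:
    print(problem)
```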
Step 1: Prepare your reference video. Execute:
python extract_vid_and_crop.py
Checkpoints 📊
Download the base model CogVideoX-5B-I2V to {your_cogvideoi2v_path}:
git lfs install
git clone https://huggingface.co/THUDM/CogVideoX-5b-I2V {your_cogvideoi2v_path}
Training
Training script:
bash scripts/train/RefAdapter_train.sh -v CUDA_VISIBLE_DEVICES
bash scripts/train/FEI_train.sh -v CUDA_VISIBLE_DEVICES -a {your_vfx_name}
Inference
bash scripts/inference/Inference.sh
This repository borrows heavily from FlexiAct and CogVideoX. Thanks to the authors for sharing their code and models.