EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation

Accepted by SIGGRAPH 2026

Yue Ma1*✉, Xu Ye1*, Qinghe Wang2, Yucheng Wang1, Hongyu Liu1, Yinhan Zhang1, Xinyu Wang3, Yuanpeng Chen4, Shanhui Mo4, Paul Liang5, Fangneng Zhan5, Qifeng Chen1
1Hong Kong University of Science and Technology, 2Dalian University of Technology, 3Tsinghua University, 4Independent, 5Massachusetts Institute of Technology (MIT)
*Equal Contribution, ✉Corresponding Author

Your star means a lot to us in developing this project! ⭐⭐⭐

Place the final demo video here. Demo

🛠️ Method Overview

We introduce EasyVFX, a resource-efficient framework that achieves realistic VFX synthesis under stringent constraints. Our core philosophy lies in frequency-domain decomposition: we observe that the complexity of VFX can be significantly mitigated by decoupling high-frequency components, which represent intricate spatial appearances, from low-frequency components, which encapsulate global motion dynamics. This spectral disentanglement transforms a high-dimensional learning problem into manageable sub-tasks, thereby lowering the optimization barrier and reducing data dependency. Building upon this insight, we propose a two-stage training paradigm. First, we design a Frequency-aware Mixture-of-Experts (Freq-MoE) architecture. Using a soft routing mechanism, our model assigns specialized experts to distinct spectral bands, enabling them to develop robust priors for appearance and motion dynamics. This specialization allows the model to acquire foundational VFX knowledge with fewer GPU resources. Second, we introduce a Test-Time Training strategy powered by a novel Frequency-constraint Loss. This allows the pre-trained model to adapt swiftly to specific, unseen effects through localized optimization, requiring only about 100 steps on a single GPU.
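To make the low/high frequency split more concrete, here is a minimal, standard-library-only sketch on a 1D signal. It is not the EasyVFX implementation: this README does not specify how the decomposition is computed, so the naive DFT, the hard cutoff mask, and the names `split_bands` and `cutoff` are all assumptions chosen for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns a list of complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_bands(signal, cutoff):
    """Split a real signal into (low, high) parts.

    `cutoff` is a hypothetical parameter: frequency indices up to and
    including it go to the low band, everything above goes to the high band.
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(spectrum):
        # Bins k and n - k are mirror images of the same frequency,
        # so both are assigned to the same band.
        freq = min(k, n - k)
        if freq <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0j)
        else:
            low_spec.append(0j)
            high_spec.append(coeff)
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


# Toy signal: a slow oscillation (stand-in for global motion) plus a
# fast, smaller one (stand-in for fine appearance detail).
n = 32
slow = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
fast = [0.3 * math.sin(2 * math.pi * 6 * t / n) for t in range(n)]
signal = [s + f for s, f in zip(slow, fast)]

low, high = split_bands(signal, cutoff=2)
```

In this toy case `low` recovers the slow component, `high` recovers the fast one, and the two bands sum back to the original signal. A real video model would operate on 2D or 3D tensors and would likely use a smooth rather than hard band boundary.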

🚀 Getting Started

Environment Requirement 🔧

Step 1: Clone this repo

git clone https://github.com/TencentARC/FlexiAct.git

Step 2: Install required packages

bash env.sh
conda activate EasyVFX

Data Preparation ⏬

Option 1: Prepare data

You can download the data used in our paper from Hugging Face:

cd EasyVFX
git clone https://huggingface.co/datasets/BianYx/VAP-Data

Organize your training dataset in the following structure:

|-- benchmark
    |-- captions
        |-- some VFX
            |-- VFX
                |-- crop.csv
                |-- val_image.csv
    |-- reference_videos
        |-- some VFX
            |-- VFX
                |-- 0.mp4
                |-- 1.mp4
                |-- ...
        |-- extract_vid_and_crop.py
    |-- target_image
        |-- some VFX
            |-- 1.jpg
            |-- 2.webp
            |-- ...
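Before training, it can help to check that a dataset follows the layout above. The sketch below is a hypothetical helper, not part of this repository. It assumes that `some VFX` stands for the effect name, that the inner folder is literally named `VFX`, and that at least one `.mp4` reference video is expected. Adjust these if your layout differs.

```python
from pathlib import Path
import tempfile


def missing_entries(benchmark_root, effect, inner="VFX"):
    """Return the expected paths (relative to benchmark_root) that are absent.

    `effect` replaces the "some VFX" placeholder in the tree above;
    `inner` is the nested folder name (assumed to be "VFX").
    """
    root = Path(benchmark_root)
    required_files = [
        root / "captions" / effect / inner / "crop.csv",
        root / "captions" / effect / inner / "val_image.csv",
    ]
    video_dir = root / "reference_videos" / effect / inner
    required_dirs = [video_dir, root / "target_image" / effect]

    missing = [p.relative_to(root).as_posix() for p in required_files if not p.is_file()]
    missing += [p.relative_to(root).as_posix() for p in required_dirs if not p.is_dir()]

    # The video folder exists but holds no .mp4 files.
    if video_dir.is_dir() and not any(video_dir.glob("*.mp4")):
        missing.append((video_dir / "*.mp4").relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # Demo on an empty temporary directory: every expected entry is reported.
    with tempfile.TemporaryDirectory() as tmp:
        for entry in missing_entries(Path(tmp) / "benchmark", "my_effect"):
            print("missing:", entry)
```

An empty result means every expected entry was found. The helper only checks that files and folders exist and does not validate their contents.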
    

Step 1: Prepare your reference videos. The script sits in benchmark/reference_videos in the structure above. Run:

python extract_vid_and_crop.py

Checkpoints 📊

Download the base model CogVideoX-5B-I2V to {your_cogvideoi2v_path}:

git lfs install
git clone https://huggingface.co/THUDM/CogVideoX-5b-I2V {your_cogvideoi2v_path}

🏃🏼 Running Scripts

Training

Training script:

bash scripts/train/RefAdapter_train.sh -v CUDA_VISIBLE_DEVICES

bash scripts/train/FEI_train.sh -v CUDA_VISIBLE_DEVICES -a your vfx name

Inference

bash scripts/inference/Inference.sh

🤝🏼 Cite Us

pass

🙏 Acknowledgement

This repository borrows heavily from FlexiAct and CogVideoX. Thanks to the authors for sharing their code and models.
