WorldString: Actionable World Representation

WorldString is a neural interactive digital twin for skinned, articulated, and soft objects. It takes a keypoint-based state as input and outputs 3D point clouds, and it models the state manifold of real-world objects by learning directly from point clouds or RGB-D video streams.

Links: Paper (arXiv) · Project Page · Data Generation · Code (this repo)

Release Plan

Status	Milestone
✅	Project page
✅	Data generation code (sim and real world)
✅	Checkpoints and visualization demo (XHand, Unitree Go2, Unitree H1)
⬜	Open-source training data
⬜	Training code and model

Local demo preview

Local checkpoint visualization (python demo_worldstring.py): drag the joint sliders and compare simulator ground truth (left) with the neural prediction (right) for XHand, Go2, and H1 on one page.

Auto-looping preview of the unified Gradio demo. Full recording (MP4)

Installation (Local Demo)

The instructions below set up a Conda environment for checkpoint-based interactive visualization. No training data download is required for the demo.

Supported robots

Robot	Simulator	Checkpoint	Demo tab
XHand Right	PyBullet	`ckpts/xhand_right/xhand_best.pth`	XHand Right
Unitree Go2	MuJoCo	`ckpts/go2/go2_best.pth`	Unitree Go2
Unitree H1	MuJoCo	`ckpts/h1/h1_best.pth`	Unitree H1

1. Prerequisites

OS: Linux recommended (Ubuntu 20.04+). macOS may work for the CPU-only demo; Windows is untested.
GPU (optional): an NVIDIA GPU with CUDA speeds up inference. CPU-only works, but the first render is slow.
Conda: Miniconda or Anaconda.
Git: to clone this repository.

2. Clone the repository

git clone git@github.com:MaureenZOU/worldstring.git
cd worldstring

3. Create and activate the Conda environment

We recommend Python 3.11:

conda create -n worldstring python=3.11 -y
conda activate worldstring

4. Install PyTorch

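With an NVIDIA GPU (recommended). The command below uses the `cu130` index, which serves CUDA 13.0 builds; for CUDA 12.x, substitute the index suffix that matches your CUDA version: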

pip install torch --index-url https://download.pytorch.org/whl/cu130

Verify:

python -c "import torch; print(torch.__version__, 'cuda:', torch.cuda.is_available())"

5. Install demo dependencies

pip install \
  gradio \
  numpy \
  scipy \
  pyyaml \
  plotly \
  open3d \
  pybullet \
  mujoco \
  trimesh

Versions tested locally:

Package	Version
Python	3.11
torch	2.12
gradio	6.16
mujoco	3.9
pybullet	3.2
open3d	0.19
plotly	6.8
trimesh	4.12
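
To compare a local environment against the table above, a minimal sketch using only the standard library is shown below. It assumes the distribution names match the `pip install` names from step 5; `PACKAGES` and `installed_versions` are hypothetical names, not part of this repo.

```python
from importlib import metadata

# Distribution names as passed to `pip install` in steps 4 and 5.
PACKAGES = ["torch", "gradio", "mujoco", "pybullet", "open3d", "plotly", "trimesh"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:10s} {version or 'NOT INSTALLED'}")
```

A version that differs from the table is not necessarily a problem; the table only records what was tested.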

Note: flash_attn is optional. If it is not installed, the model automatically falls back to PyTorch scaled dot-product attention.
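
To see which attention path an environment would take, one option is to probe for the package without importing it. This is a sketch under assumptions: `attention_backend` is a hypothetical helper, and the model's own fallback logic may be implemented differently.

```python
import importlib.util


def attention_backend():
    """Report which attention implementation is available in this environment."""
    # find_spec returns None when the top-level package cannot be located.
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attn"
    return "torch scaled dot-product attention"


print(attention_backend())
```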

6. Prepare checkpoints and demo assets

Ensure the following layout exists under ckpts/:

ckpts/
├── xhand_right/
│   ├── xhand_best.pth
│   ├── config.yaml
│   └── demo/current_pose/          # created at runtime; pose_current.da written here
├── go2/
│   ├── go2_best.pth
│   ├── config.yaml
│   └── demo/current_pose/
│       ├── go2_init_state.da       # reference init frame for keypoint binding
│       ├── init_world_min_max.json # normalization bounds for keypoints / inference
│       └── pose_joint_state_init.json
└── h1/
    ├── h1_best.pth
    ├── config.yaml
    └── demo/current_pose/
        ├── h1_init_state.da
        ├── init_world_min_max.json
        └── pose_joint_state_init.json

Robot meshes / URDFs are already under assets/:

assets/
├── xhand_right/urdf/xhand_right.urdf
├── go2/go2.xml
└── unitree_h1/h1.xml

If checkpoint files are distributed separately (e.g. Google Drive / Hugging Face), download them into the paths above before launching the demo.
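
Before launching, the layout above can be checked with a short script. The sketch below only tests that the listed files exist; `REQUIRED` and `missing_files` are hypothetical names, and the file list is copied from this step rather than read from the demo code.

```python
from pathlib import Path

# Files listed in step 6. The XHand demo/current_pose/ directory is created
# at runtime, so it is not checked here.
REQUIRED = [
    "ckpts/xhand_right/xhand_best.pth",
    "ckpts/xhand_right/config.yaml",
    "ckpts/go2/go2_best.pth",
    "ckpts/go2/config.yaml",
    "ckpts/go2/demo/current_pose/go2_init_state.da",
    "ckpts/go2/demo/current_pose/init_world_min_max.json",
    "ckpts/go2/demo/current_pose/pose_joint_state_init.json",
    "ckpts/h1/h1_best.pth",
    "ckpts/h1/config.yaml",
    "ckpts/h1/demo/current_pose/h1_init_state.da",
    "ckpts/h1/demo/current_pose/init_world_min_max.json",
    "ckpts/h1/demo/current_pose/pose_joint_state_init.json",
    "assets/xhand_right/urdf/xhand_right.urdf",
    "assets/go2/go2.xml",
    "assets/unitree_h1/h1.xml",
]


def missing_files(root="."):
    """Return the relative paths from REQUIRED that are not files under root."""
    root = Path(root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoint and asset files are present.")
```

Run it from the repository root so the relative paths resolve.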

7. Run the unified demo (all three robots, one page)

From the repository root:

conda activate worldstring
python demo_worldstring.py

Open in your browser:

http://127.0.0.1:6040

Use the tabs XHand Right, Unitree Go2, and Unitree H1 to switch robots. Each tab has joint sliders on the left and two point-cloud views on the right:

Left plot: ground truth from the simulator mesh (PyBullet or MuJoCo)
Right plot: neural prediction from the loaded checkpoint

Click Submit after moving sliders, or Reset to Initial Pose to restore the default configuration.

First load: the page runs inference for all three robots once at startup and may take 1–3 minutes depending on GPU/CPU. Subsequent updates per tab are faster.
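
Because the first load can take minutes, a script that depends on the demo may want to wait until the page is served. The following is a sketch using only the standard library; `wait_for_server` is a hypothetical helper, and it only detects that the URL answers, not that the startup inference has finished.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout_s=300.0, poll_s=2.0):
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""
    # The demo runs on localhost, so bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet
        time.sleep(poll_s)
    return False


if __name__ == "__main__":
    ready = wait_for_server("http://127.0.0.1:6040")
    print("demo is up" if ready else "demo did not respond in time")
```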

8. Run individual demos (optional)

Script	Robot	URL
`python demo_xhand.py`	XHand	http://127.0.0.1:6037
`python demo_go2_mujoco.py`	Go2	http://127.0.0.1:6038
`python demo_h1_mujoco.py`	H1	http://127.0.0.1:6039
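
For scripting, the table above can be captured as a small lookup. This is a sketch only: `DEMOS` and `demo_command` are hypothetical names, and the ports are copied from the table rather than read from the scripts.

```python
import sys

# Script name and local port for each single-robot demo, copied from the table.
DEMOS = {
    "xhand": ("demo_xhand.py", 6037),
    "go2": ("demo_go2_mujoco.py", 6038),
    "h1": ("demo_h1_mujoco.py", 6039),
}


def demo_command(robot):
    """Return (argv, url) for one robot; raises KeyError for unknown names."""
    script, port = DEMOS[robot]
    return [sys.executable, script], f"http://127.0.0.1:{port}"


argv, url = demo_command("go2")
print(" ".join(argv), "->", url)
```

Passing `argv` to `subprocess.run(argv, check=True)` from the repository root would start that demo with the interpreter of the active environment.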

9. Coordinate frames (visualization)

Robot	Frame shown in GT and neural panels
XHand	Y-up normalized training frame
Go2 / H1	MuJoCo Z-up world coordinates (inference output is denormalized with `init_world_min_max.json`)

Model inputs are always normalized keypoints written to pose_current.da at runtime; only the displayed point clouds are transformed for consistent viewing.
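
As an illustration of the denormalization step, the sketch below assumes per-axis min-max scaling with normalized coordinates in [0, 1]. Both that value range and the way bounds are stored in `init_world_min_max.json` are assumptions; the bounds here are made-up numbers passed in directly, and `normalize` / `denormalize` are hypothetical helpers.

```python
def normalize(points, world_min, world_max):
    """Map world-frame points into an assumed [0, 1] range per axis."""
    return [
        tuple((p - lo) / (hi - lo) for p, lo, hi in zip(point, world_min, world_max))
        for point in points
    ]


def denormalize(points, world_min, world_max):
    """Inverse of normalize: map [0, 1] coordinates back to world units."""
    return [
        tuple(lo + p * (hi - lo) for p, lo, hi in zip(point, world_min, world_max))
        for point in points
    ]


# Illustrative bounds only; real values would come from init_world_min_max.json.
WORLD_MIN = (-1.0, -1.0, 0.0)
WORLD_MAX = (1.0, 1.0, 2.0)

predicted = [(0.0, 0.5, 1.0)]
print(denormalize(predicted, WORLD_MIN, WORLD_MAX))  # [(-1.0, 0.0, 2.0)]
```

Keeping the two functions as exact inverses means keypoints normalized for the model and point clouds denormalized for display share the same bounds.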

Repository layout (overview)

worldstring/
├── assets/              # Robot URDF / MJCF and meshes
├── ckpts/               # Pretrained weights, configs, demo init data
├── config/              # YAML config loader
├── dataset/             # Inference-time keypoint / voxel dataset
├── modeling/            # AWR model (PolytopeModel)
├── robot_backends/      # PyBullet / MuJoCo robots, keypoint trackers, demo sessions
├── demo_worldstring.py  # Unified Gradio demo (recommended)
├── demo_xhand.py
├── demo_go2_mujoco.py
└── demo_h1_mujoco.py

Training data generation (sim + real world) lives in the separate repo WorldString_data_gen.

Citation

If you use WorldString in your research, please cite:

@misc{xu2026worldstringactionableworldrepresentation,
      title={WorldString: Actionable World Representation},
      author={Kunqi Xu and Jitao Li and Jianglong Ye and Tianshu Tang and Isabella Liu and Sifei Liu and Xueyan Zou},
      year={2026},
      eprint={2605.18743},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.18743},
}
