VOID: Video Object and Interaction Deletion
Project Page | GitHub
Upload a video and its quadmask, enter a prompt describing the scene after removal, and VOID will erase the object along with its physical interactions.
Built on CogVideoX-Fun-V1.5-5B, fine-tuned for interaction-aware video inpainting.
Quadmask format
The quadmask is a grayscale video where each pixel value encodes what role that region plays:
| Pixel value | Meaning |
|---|---|
| 0 (black) | Primary object to remove |
| 63 (dark grey) | Overlap of primary object / affected zone |
| 127 (mid grey) | Affected region: shadows, reflections, new and old trajectories |
| 255 (white) | Background โ keep as-is |
Use the VLM-Mask-Reasoner pipeline included in the repo to generate quadmasks automatically.
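For quadmasks assembled by hand, the value table above can be applied with a few NumPy operations. The following is a minimal sketch, not the repo's own code: `compose_quadmask` and `snap_to_levels` are hypothetical helper names, and it assumes the 63 level marks pixels that belong to both the primary object and the affected region. `snap_to_levels` additionally assumes that lossy video encoding may shift grey values slightly, so it rounds each pixel to the nearest of the four levels.

```python
import numpy as np

# Pixel values taken from the quadmask table above.
REMOVE, OVERLAP, AFFECTED, KEEP = 0, 63, 127, 255


def compose_quadmask(object_mask, affected_mask):
    """Combine two boolean masks of equal shape into a uint8 quadmask.

    Works on a single frame (H, W) or a whole clip (T, H, W).
    Assumption: 63 marks pixels present in BOTH masks.
    """
    object_mask = np.asarray(object_mask, dtype=bool)
    affected_mask = np.asarray(affected_mask, dtype=bool)
    if object_mask.shape != affected_mask.shape:
        raise ValueError("masks must have the same shape")

    quad = np.full(object_mask.shape, KEEP, dtype=np.uint8)  # background
    quad[affected_mask] = AFFECTED                           # shadows, reflections, ...
    quad[object_mask] = REMOVE                               # primary object
    quad[object_mask & affected_mask] = OVERLAP              # both roles
    return quad


def snap_to_levels(frame):
    """Round every pixel to the nearest of the four quadmask levels.

    Assumption: encoding artifacts may perturb the exact grey values.
    """
    levels = np.array([REMOVE, OVERLAP, AFFECTED, KEEP], dtype=np.int16)
    frame = np.asarray(frame, dtype=np.int16)
    nearest = np.abs(frame[..., None] - levels).argmin(axis=-1)
    return levels[nearest].astype(np.uint8)


# Toy 4x6 frame: a 2x2 object whose right column overlaps the affected zone.
obj = np.zeros((4, 6), dtype=bool)
obj[1:3, 1:3] = True
aff = np.zeros((4, 6), dtype=bool)
aff[1:3, 2:5] = True
quad = compose_quadmask(obj, aff)
```

In the toy frame, `quad[1, 1]` is object-only, `quad[1, 2]` is overlap, `quad[1, 3]` is affected-only, and `quad[0, 0]` is background. Writing the object mask after the affected mask, then the overlap last, keeps the assignment order independent of which mask is larger.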
Sample sequences: click to load inputs