Try it yourself — interactive app

The trained model itself runs right here in your browser: the RSSM dynamics, the count and mass heads, and the controller actor have been ported to JavaScript and verified to match the PyTorch version. Scope: up to 4 objects. The model counts reliably from a single frame up to about 4 objects; beyond that it needs temporal context.

You act, the model believes. Take actions and watch the model's belief track reality. The model runs open-loop (it sees only your action, not the new image), so its belief slowly drifts, especially its mass estimate. Hit Peek to let it look at the current image and snap its belief back to reality.
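To make the open-loop idea concrete, here is a toy sketch in Python. It is not the RSSM: `ToyWorld`, `ToyBelief`, and the size set `(1, 2, 5)` are hypothetical stand-ins chosen for illustration. The belief is updated from actions alone, so it has to guess the size of each added shape, and `peek` overwrites the belief with the true state.

```python
import random

SIZES = (1, 2, 5)  # assumed shape sizes for this toy world


class ToyWorld:
    """Ground truth: the list of shape sizes currently on screen."""

    def __init__(self, rng):
        self.rng = rng
        self.shapes = []

    def step(self, action):
        if action == "add":
            self.shapes.append(self.rng.choice(SIZES))  # size is random
        elif action == "remove" and self.shapes:
            self.shapes.pop()

    @property
    def count(self):
        return len(self.shapes)

    @property
    def mass(self):
        return sum(self.shapes)


class ToyBelief:
    """Hand-written stand-in for the model's belief (sees actions only)."""

    def __init__(self):
        self.count = 0
        self.mass = 0.0

    def predict(self, action):
        if action == "add":
            self.count += 1
            self.mass += sum(SIZES) / len(SIZES)  # best guess: mean size
        elif action == "remove" and self.count > 0:
            self.mass -= self.mass / self.count  # guess: an average shape
            self.count -= 1

    def peek(self, world):
        # Observation replaces the guess with what is actually there.
        self.count = world.count
        self.mass = float(world.mass)


def run(actions, seed=0):
    """Apply the same actions to the world and to the open-loop belief."""
    world, belief = ToyWorld(random.Random(seed)), ToyBelief()
    for action in actions:
        world.step(action)
        belief.predict(action)
    return world, belief
```

In this toy the count is fully determined by the actions, so only the mass estimate drifts; that is a simplification of the behaviour described above, where the real model's belief drifts slowly overall and mass drifts most. Calling `run(["add"] * 4)` followed by `belief.peek(world)` shows the mass error before and after a peek.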

Model belief vs reality

Choose a target count; the controller drives the world to it. The actor — trained only inside the model's imagination — picks add/remove/no-op, reaches your target, and holds it. It controls count, not mass (added size is random).
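The behaviour described above (add, remove, or no-op until the target is reached, then hold) can be summarised with a hand-written rule. This is only a sketch of the observable behaviour: the real actor is a learned network trained in imagination, and `count_policy` and `drive` are hypothetical names.

```python
def count_policy(count, target):
    """Hand-written stand-in for the learned actor's choice."""
    if count < target:
        return "add"
    if count > target:
        return "remove"
    return "noop"


def drive(count, target, steps):
    """Run the policy in closed loop and record (action, count) pairs."""
    effect = {"add": 1, "remove": -1, "noop": 0}
    trace = []
    for _ in range(steps):
        action = count_policy(count, target)
        count += effect[action]
        trace.append((action, count))
    return trace
```

For example, `drive(1, 3, 4)` adds twice and then holds at 3 with no-ops.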

Goal

Choose a target TOTAL MASS; the mass controller gets close. Unlike count, mass is only approximately controllable — the size of each added shape is random — so expect it to land within ~1–2 of the target, not exactly on it.
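A small simulation shows why random sizes make mass only approximately controllable. Everything here is an assumption for illustration: the sizes `(1, 2, 5)`, the uniform draw, and the hand-written stopping rule `THRESHOLD = 3` stand in for whatever the trained mass controller actually does.

```python
import random

SIZES = (1, 2, 5)  # assumed sizes; each add draws one at random
THRESHOLD = 3      # hypothetical rule: stop adding once the gap is below 3


def mass_policy(mass, target):
    """Add while the remaining gap is large enough to be worth the risk."""
    return "add" if target - mass >= THRESHOLD else "noop"


def drive_mass(target, seed=0, max_steps=100):
    """Start from an empty world and return the total mass after holding."""
    rng = random.Random(seed)
    mass = 0
    for _ in range(max_steps):
        if mass_policy(mass, target) == "noop":
            break
        mass += rng.choice(SIZES)  # the controller cannot pick the size
    return mass
```

With these assumptions the final error is at most 2 in either direction: stopping at a gap below 3 leaves an undershoot of at most 2, and a last add of at most 5 from a gap of at least 3 overshoots by at most 2. The controller often misses the exact target, which is the point of the panel.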

Goal (mass)

Choose a target mass; the planner builds it with the FEWEST shapes. Here each add chooses a size (S=+1 / M=+2 / L=+5), so mass is fully controllable. The agent plans inside the world model, with no training: at each step it imagines every possible add and takes the largest one that doesn't overshoot (the greedy coin-change rule), then holds once the target is reached. Compare the number of shapes it uses to the provable optimum.
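The greedy rule and the optimum it is compared against can be sketched as follows. This is a plain-arithmetic illustration, not the planner itself: the real agent evaluates each add by imagining it inside the world model, whereas here the outcome of an add is just its size. `greedy_plan` and `min_shapes` are hypothetical names.

```python
SIZES = {"S": 1, "M": 2, "L": 5}  # sizes as listed above


def greedy_plan(target):
    """Repeatedly take the largest add that does not overshoot."""
    plan = []
    remaining = target
    while remaining > 0:
        fitting = [(size, name) for name, size in SIZES.items() if size <= remaining]
        size, name = max(fitting)  # largest size that still fits
        plan.append(name)
        remaining -= size
    return plan


def min_shapes(target):
    """Fewest shapes summing to target, by dynamic programming."""
    best = [0] + [None] * target
    for mass in range(1, target + 1):
        # Size 1 always fits, so there is always at least one candidate.
        best[mass] = 1 + min(best[mass - s] for s in SIZES.values() if s <= mass)
    return best[target]


print(greedy_plan(8))  # ['L', 'M', 'S']
print(min_shapes(8))   # 3
```

For the sizes 1, 2, and 5 the greedy choice matches the dynamic-programming optimum for every target checked here, which is why the planner's result can be compared directly against the provable minimum. That property depends on the size set; it does not hold for arbitrary denominations.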

Goal (min-objects)