Try it yourself — interactive app

The real trained model runs right here in your browser: the RSSM dynamics, the count and mass heads, and the controller actor have been ported to JavaScript and verified to match the PyTorch version. Scope: up to 4 objects. The model counts reliably from a single frame up to about 4 objects; beyond that it needs temporal context.

You act, the model believes. Take actions and watch the model's belief track reality. The model runs open-loop (it sees only your action, not the new image), so its belief slowly drifts, especially for mass. Hit Peek to let it look and snap its belief back to reality.
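To make the open-loop idea concrete, here is a minimal toy sketch in Python. It is not the ported RSSM: the `World` and `Belief` classes, the size set `SIZES`, and the update rules are all assumptions made up for illustration. The belief is updated from the action alone, so the count stays exact while the mass estimate can drift away from the true mass until `peek` resets it.

```python
import random
from dataclasses import dataclass, field

MAX_OBJECTS = 4      # scope stated on this page
SIZES = (1, 2, 3)    # assumed sizes of an added shape (hypothetical)


@dataclass
class World:
    """Ground truth: the sizes of the shapes currently present."""
    sizes: list = field(default_factory=list)

    def step(self, action, rng):
        if action == "add" and len(self.sizes) < MAX_OBJECTS:
            self.sizes.append(rng.choice(SIZES))  # the added size is random
        elif action == "remove" and self.sizes:
            self.sizes.pop()


@dataclass
class Belief:
    """Toy stand-in for the model's belief about count and total mass."""
    count: int = 0
    mass: float = 0.0

    def predict(self, action):
        # Open-loop update: uses only the action, never the new image.
        if action == "add" and self.count < MAX_OBJECTS:
            self.count += 1
            self.mass += sum(SIZES) / len(SIZES)  # best guess: the mean size
        elif action == "remove" and self.count > 0:
            self.mass -= self.mass / self.count   # guess: an average shape left
            self.count -= 1

    def peek(self, world):
        # Closed-loop correction: look at the world and snap back.
        self.count = len(world.sizes)
        self.mass = float(sum(world.sizes))


if __name__ == "__main__":
    rng = random.Random(0)
    world, belief = World(), Belief()
    for action in ["add", "add", "remove", "add", "add"]:
        world.step(action, rng)
        belief.predict(action)
        print(action, "believed mass:", belief.mass, "true mass:", sum(world.sizes))
    belief.peek(world)
    print("after peek:", belief.mass)
```

In this toy the count never drifts because adds and removes are deterministic in count, while mass drifts because the true added size is random and the belief can only guess it. That mirrors the behaviour described above, under the stated assumptions.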

Model belief vs reality

Choose a target count; the controller drives the world to it. The actor, trained only inside the model's imagination, picks add, remove, or no-op at each step, reaches your target, and holds it. It controls count, not mass, because the size of each added shape is random.
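The reach-and-hold behaviour can be sketched with a hand-written rule. This is only a stand-in: the real actor is a learned policy, and the function names `choose_action` and `drive_to_target` are hypothetical. The rule compares the believed count with the target and picks one of the three actions named above.

```python
def choose_action(believed_count: int, target: int) -> str:
    """Hand-written stand-in for the learned actor (an assumption)."""
    if believed_count < target:
        return "add"
    if believed_count > target:
        return "remove"
    return "noop"  # on target: hold


def drive_to_target(start: int, target: int, max_steps: int = 10):
    """Apply the rule repeatedly; return the final count and the actions taken."""
    count = start
    actions = []
    for _ in range(max_steps):
        action = choose_action(count, target)
        actions.append(action)
        if action == "add":
            count += 1
        elif action == "remove":
            count -= 1
    return count, actions


if __name__ == "__main__":
    print(drive_to_target(0, 3, max_steps=6))
```

Because each add or remove changes the count by exactly one, a rule like this reaches any target exactly and then emits only no-ops, which is why count is the easy quantity to control.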

Goal

Choose a target total mass; the mass controller gets close. Unlike count, mass is only approximately controllable because the size of each added shape is random, so expect it to land within about 1–2 of the target rather than exactly on it.
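A small simulation shows why mass can only be hit approximately. Everything here is an assumption for illustration: the size set `SIZES`, the `DEADBAND` threshold, and the rule in `mass_action` are not the trained mass controller. The rule adds a shape while the mass is well below target, removes one while it is well above, and otherwise holds.

```python
import random

SIZES = (1, 2, 3)   # assumed sizes of an added shape (hypothetical)
MAX_OBJECTS = 4     # scope stated on this page
DEADBAND = 2        # act only when at least this far from the target (assumed)


def mass_action(mass: int, count: int, target: int) -> str:
    """Hand-written stand-in for the mass controller (an assumption)."""
    gap = target - mass
    if gap >= DEADBAND and count < MAX_OBJECTS:
        return "add"
    if gap <= -DEADBAND and count > 0:
        return "remove"
    return "noop"


def run_mass_controller(target: int, seed: int, steps: int = 12) -> int:
    """Start from an empty world and return the total mass after `steps` actions."""
    rng = random.Random(seed)
    shapes = []
    for _ in range(steps):
        action = mass_action(sum(shapes), len(shapes), target)
        if action == "add":
            shapes.append(rng.choice(SIZES))  # the controller cannot pick the size
        elif action == "remove":
            shapes.pop()
    return sum(shapes)


if __name__ == "__main__":
    for seed in range(5):
        print("target 5, final mass:", run_mass_controller(5, seed))
```

The controller chooses when to add but not how large the new shape is, so the last add can overshoot or undershoot. In this toy the final mass ends up near the target but often not on it, which is the same qualitative effect described above.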

Goal (mass)