Image-to-video models often keep the visuals looking right but ignore parts of the text instructions.
VerseCrafter is a video world model that lets you steer both the camera and multiple moving objects by editing a single 4D world state.
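
To make "editing a single 4D world state" concrete, here is a minimal sketch. It is not VerseCrafter's actual representation or API. It assumes "4D" means 3D positions tracked over time, and the names `WorldState`, `camera_path`, `object_paths`, and `shift_object` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, replace

# Hypothetical types for illustration only; not taken from VerseCrafter.
Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class WorldState:
    """Toy 4D state: 3D positions indexed by frame (time)."""
    camera_path: tuple[Vec3, ...]              # camera position per frame
    object_paths: dict[str, tuple[Vec3, ...]]  # object name -> position per frame


def shift_object(state: WorldState, name: str, offset: Vec3) -> WorldState:
    """Return a new state with one object's whole trajectory translated."""
    dx, dy, dz = offset
    moved = tuple((x + dx, y + dy, z + dz) for (x, y, z) in state.object_paths[name])
    # Copy the mapping so the original state is left untouched.
    return replace(state, object_paths={**state.object_paths, name: moved})


state = WorldState(
    camera_path=((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    object_paths={
        "car": ((1.0, 0.0, 5.0), (2.0, 0.0, 5.0)),
        "dog": ((0.0, 0.0, 3.0), (0.0, 0.0, 3.0)),
    },
)

# Steer only the car; the camera and the dog keep their trajectories.
edited = shift_object(state, "car", (0.0, 1.0, 0.0))
```

The point of the sketch is that camera and object motion live in one structure, so a single edit changes one trajectory while everything else stays fixed.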