Zoo | MBS Series
The modern MBS Series Zoo (MBS-3) is the most ambitious yet. It introduces:
Notably, MBS-3 introduced dynamic difficulty scaling. If a model answers correctly, the next question gets harder—mirroring how a zookeeper might introduce enrichment puzzles to a clever animal.
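As a rough illustration of the idea, here is a minimal Python sketch of a difficulty scaler. It is not the MBS-3 implementation: the class name `DifficultyScaler`, the 1-to-5 level scale, and the rule that a wrong answer leaves the level unchanged are all assumptions made for the example. The source only states that a correct answer makes the next question harder.

```python
from dataclasses import dataclass


@dataclass
class DifficultyScaler:
    """Tracks the difficulty level for the next question (hypothetical helper)."""

    level: int = 1       # current difficulty; 1 is the easiest (assumed scale)
    max_level: int = 5   # assumed ceiling; the source does not specify one

    def update(self, answered_correctly: bool) -> int:
        """Return the difficulty level to use for the next question."""
        if answered_correctly:
            # A correct answer makes the next question harder, up to the cap.
            self.level = min(self.level + 1, self.max_level)
        # Assumption: an incorrect answer leaves the level unchanged.
        return self.level


# Simulate four answers: correct, correct, incorrect, correct.
scaler = DifficultyScaler()
history = [scaler.update(outcome) for outcome in (True, True, False, True)]
print(history)
```

Capping the level keeps the sketch from running past the hardest available question; a real benchmark might instead lower the level after a miss, which this example deliberately leaves out.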
To truly understand the MBS Series Zoo, you need to understand its evolutionary lineage. Each "Series" adds new enclosures (tasks) while retiring outdated ones.
Exhibit: Group Dynamics & Social Learning
Monkeys mimic, compete, share, and form alliances — a perfect display of social learning theory, reward systems, and informal networks. Notice how one monkey’s discovery of a tool spreads through the troop. That’s organizational culture in action.
Despite multilingual tasks in MBS-2, the majority of tasks focus on English. The forthcoming MBS-4 "Pangolin" series promises to address this with 100+ languages, but as of 2025, the zoo remains tilted toward high-resource languages.
In the real world, captive animals often suffer from zoochosis (repetitive, neurotic behaviors). In the MBS Series, animals are data constructs. They do not miss the wild because they have no consciousness—only perfect simulations of behavior. You can watch a jaguar hunt without the guilt of confining it.
The Open Zoo Initiative allows any researcher to submit a new task (a "species") to the MBS Series, subject to peer review. This democratizes benchmarking but risks bloat.
The developers behind the MBS engine recently released their 2030 roadmap. Here is what is coming: