Priori Labs
during pretraining.
Ideas
AGI will be recognized by generality and sample-efficient learning ability. ASI will be recognized by capability.
Do LLMs have an intrinsic sense of novelty or "interestingness"?
We tested the ability of frontier AI models to play a novel variant of the classic Battleship game.
Tools
An open-source Sokoban puzzle builder for evaluating LLM spatial reasoning and multi-step planning.
An open-source dungeon crawler game for evaluating LLM spatial reasoning and multi-step planning.
An open-source maze-building toolkit for LLM spatial reasoning evaluations.
Sandbox environments for evaluating data exfiltration risks with frontier coding agents.
A multi-stage LLM pipeline tool to plan, review, and synthesize solutions to hard problems.