We release two agentic benchmarks: NatureBench (AI for AI) and EnterpriseClawBench (real-world enterprise tasks).