AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents
Muna Aghamelu, Abhinav Moudgil, Emanuel Tewolde, Roberta Raileanu, Abhishek Charnalia, Alberto Pepe, Alexander Miller, Alexis Audran-Reiss, Alisia Lupidi, Amar Budhiraja, Anton Protopopov, Bassel Al Omari, Bhavul Gauri, Chee Hau Leow, Daniel Izcovich, Derek Dunfield, Despoina Magka, Edan Toledo, Gaurav Chaurasia, Hossam Mossalam, Isabel Urrego, Ishita Mediratta, Jakob Foerster, Jean-Christophe Gagnon-Audet, Jordi Armengol-Estape, Karen Hambardzumyan, Kelvin Niu, Lucia Cipolina-Kun, Martin Josifoski, Michael Shvartsman, Nicolas Baldwin, Parth Pathak, Saba Nazir, Sandra Lefdal, Tatiana Shavrina, Thomas Simon Foster, Yoram Bachrach