Currently, human scientists significantly outperform AI agents, particularly when it comes to managing complex, multistep tasks. According to the Artificial Intelligence Index Report 2026, the best AI agents score roughly half as well as human specialists with PhDs when performing autonomous scientific workflows.
As AI is rapidly integrated into research, several key areas of comparison stand out:
- Handling Complexity: AI agents still struggle to reliably execute multistep workflows that human scientists handle with expertise. Experts note that the scientific community is still far from understanding how to use these agents effectively.
- Productivity and Quality: Although AI use in the natural sciences grew 30-fold between 2010 and 2025, there is little concrete evidence yet that AI is actually improving productivity. Some researchers argue that the rapid adoption has happened too quickly for scientific norms to adjust, potentially causing a “nosedive” in the quality of research.
- Domain-Specific Knowledge: A new trend is the development of science foundation models, which are trained on massive, domain-specific datasets. For example, the astronomy model AION-1 can classify galaxies and estimate celestial properties, tasks that were previously the sole domain of human astronomers.
- Dependency and Adoption: Despite the performance gap, human scientists have become highly dependent on AI; experts suggest there would be a “riot” if the technology were removed because it has become an essential part of the modern workflow.
- Autonomy Milestones: AI agents are reaching new milestones in autonomy, such as the first fully AI-generated paper passing the peer-review process in 2025 and the launch of the first operational AI-powered weather forecasts.
In summary, while AI agents are becoming indispensable tools and reaching specific milestones in automation, human specialists remain vastly superior at the complex, integrated reasoning required for high-level scientific research.