In a study published in Nature Human Behaviour, researchers found that large language models (LLMs), a type of artificial intelligence trained on extensive text datasets, can predict the results of proposed neuroscience studies more accurately than human experts. The finding suggests that LLMs can distill patterns from the vast scientific literature and use them to forecast experimental outcomes, pointing to a role in accelerating research that goes well beyond knowledge retrieval.
The researchers developed BrainBench, a benchmark for evaluating how well LLMs can predict neuroscience results. Each test item pairs a genuine abstract with an altered version whose reported results have been changed, and the task is to identify the genuine one. The team tested 15 general-purpose LLMs and 171 human neuroscience experts on these items, and every LLM outperformed the neuroscientists: the models averaged 81% accuracy, the humans 63%. This suggests that LLMs can synthesize knowledge from the literature to anticipate experimental outcomes, potentially changing how scientific research is planned and conducted.
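To make the evaluation setup concrete, the sketch below shows one way such a two-alternative forced-choice item could be scored with an open-source causal language model: compute each abstract's perplexity and guess that the less "surprising" version is the genuine one. This is a minimal illustration assuming a perplexity-based decision rule; the model choice, example abstracts, and helper functions are illustrative, not the study's actual code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model choice; any causal LM could stand in here.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated mean token-level cross-entropy under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return torch.exp(loss).item()

def pick_real(abstract_a: str, abstract_b: str) -> str:
    """Guess which abstract is genuine: the one the model finds less surprising."""
    return "A" if perplexity(abstract_a) < perplexity(abstract_b) else "B"

# Toy example: a genuine result versus an altered version with the result flipped.
original = ("Participants showed increased hippocampal activation "
            "during successful memory retrieval.")
altered = ("Participants showed decreased hippocampal activation "
           "during successful memory retrieval.")
print(pick_real(original, altered))
```

Under this kind of scoring rule, benchmark accuracy is simply the fraction of items for which the lower-perplexity choice turns out to be the genuine abstract.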