Inspect AI Evals for the Reversal Curse

I've been thinking a lot about what it'll take to dramatically improve LLM performance across the board. My running hypothesis? We need to crack three core concepts from relational frame theory (RFT), and the "reversal curse" is one of them. The reversal curse, described by Berglund et al., is the phenomenon where large language models trained on plenty of instances of "A is B" fail to generalize and learn "B is A".
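The core of a reversal-curse eval is paired prompts: for each fact "A is B", one question asks for B given A (the direction seen in training) and a mirrored question asks for A given B. Here's a minimal sketch of how such pairs might be constructed; the facts, field names, and helper are illustrative placeholders, not Berglund et al.'s actual dataset or prompt templates:

```python
# Illustrative facts of the form (A, B), standing in for "A is B" statements.
FACTS = [
    ("Tom Cruise's mother", "Mary Lee Pfeiffer"),
    ("the ninth Chancellor of Germany", "Olaf Scholz"),
]

def make_pairs(facts):
    """For each (A, B) fact, emit a forward sample (given A, expect B)
    and a reverse sample (given B, expect A)."""
    pairs = []
    for a, b in facts:
        pairs.append({"direction": "forward",
                      "input": f"Who is {a}?",
                      "target": b})
        pairs.append({"direction": "reverse",
                      "input": f"Who is {b}?",
                      "target": a})
    return pairs

samples = make_pairs(FACTS)
```

Dicts like these map naturally onto eval-framework sample objects (in Inspect AI, for instance, an input/target pair per sample), and scoring forward versus reverse accuracy separately is what exposes the asymmetry.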