Research from Harvard and the University of Michigan reveals that AI models can grasp concepts such as color and size far earlier than conventional tests indicate. The study found that these capabilities emerge abruptly during training but often remain hidden until the model is prompted in specific ways: in experiments, models had mastered certain concepts up to 2,000 training steps before standard prompting could detect them. Techniques such as 'linear latent intervention' can surface these hidden abilities, suggesting that traditional benchmarks underestimate what models actually know. The researchers liken this to understanding a foreign language before being able to speak it fluently: models internalize concepts before they can demonstrate them. The finding calls for revised testing protocols in AI development that can assess these hidden capabilities.
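The core idea behind a linear latent intervention is to nudge a model's hidden state along a learned "concept direction". A minimal sketch of that idea, assuming a simple difference-of-means direction on toy data (all names, dimensions, and numbers below are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for hidden states from prompts that do / do not
# express a target concept (e.g. a particular color).
d_model = 64
h_with = rng.normal(loc=1.0, scale=0.5, size=(100, d_model))
h_without = rng.normal(loc=0.0, scale=0.5, size=(100, d_model))

# 1. Estimate a linear concept direction as the difference of class means,
#    then normalize it to unit length.
direction = h_with.mean(axis=0) - h_without.mean(axis=0)
direction /= np.linalg.norm(direction)

# 2. Intervene: shift a hidden state along the concept direction.
def intervene(h, direction, alpha=3.0):
    """Linear latent intervention: add alpha * direction to the hidden state."""
    return h + alpha * direction

h = h_without[0]
h_steered = intervene(h, direction)

# The steered state projects more strongly onto the concept direction,
# which is how an intervention can elicit a capability the model
# already encodes but does not yet express in its outputs.
score_before = h @ direction
score_after = h_steered @ direction
assert score_after > score_before
```

In practice such an intervention would be applied to an intermediate activation inside the network during generation, not to a standalone vector; the sketch only shows the linear-algebra step.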