2024-12-24 17:32:10

More evidence from an OpenAI employee that o3 uses the same paradigm as o1: "[...] progress from o1 to o3 was only three months, which shows how fast progress will be in the new paradigm of RL on chain of thought to scale inference compute."