
And there is Reinforcement Learning, which is essential to make models act "conversational" and coherent, right?

But I wanted to stay abstract and not go into too much detail outside my knowledge and experience.

With the GPT-2 and GPT-3 base models, you could easily produce "conversations" by writing fitting preludes (e.g. in interview style), but these went off the rails quickly, often in comedic ways. A rough sketch of that trick, using the Hugging Face transformers library and an illustrative prelude (model choice and sampling settings are just examples, not from the original setup):
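    # Minimal sketch: prompt a base (non-RLHF) model with an
    # interview-style prelude and let it continue the transcript.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prelude = (
        "The following is an interview with a leading AI researcher.\n"
        "Interviewer: Thanks for joining us. What first drew you to the field?\n"
        "Researcher:"
    )

    # The base model just continues the text; without RLHF the
    # "conversation" tends to drift after a few turns.
    out = generator(prelude, max_new_tokens=80, do_sample=True, temperature=0.9)
    print(out[0]["generated_text"])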

Part of that surely is also due to model size.

But RLHF seems more important.

I enjoyed the rambling and even that was impressive at the time.

I guess the "anthropic principle" you are referring to works in a similar direction, although in a different way (selection, not training).

The only context in which I've heard details about post-training selection processes so far was this article about OpenAI's model updates from GPT-4o onwards, discussed earlier here:

https://news.ycombinator.com/item?id=46030799

(there's a gift link in the comments)

The parts about A/B testing are pretty interesting.

The focus is ChatGPT as an enticing consumer product and maximizing engagement, not so much the benchmarks and usefulness of models. It briefly addresses the friction between usefulness and sycophancy though.

Anyway, it's pretty clever to use the wording "anthropic principle" here; I only knew the metaphysical usage (why do humans exist).




