I don't really agree with the opening paragraph here. We've seen numerous examples of GPT-4 "reasoning" via what is essentially crude metacognition. For example, when querying Wolfram Alpha through the new plugin, GPT-4 will correct itself if the query returns an inappropriate response or if the generated Wolfram Language code doesn't yield the desired output. And if the code fails outright, GPT-4 will acknowledge that. Whether this "reflection" or "internal monologue" counts as intelligence is a different question. It's like the old meme about moving the goalposts of AI every time we accomplish something new.