Otherwise the LLM can just write tests against whatever it wrote rather than against what is expected. This happens often, even with the top models.
Someone needs to check that the tests work and review that they cover edge cases, etc.
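
To make that concrete, here is a minimal sketch of the failure mode. Everything in it is hypothetical: `apply_discount`, its one-line spec, and the two check functions are invented for illustration and don't come from any real project.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical spec: return `price` reduced by `percent` percent."""
    # Deliberate bug: subtracts the percentage as an absolute amount.
    return price - percent


def check_mirrors_implementation() -> bool:
    # Expected value was copied from what the code returns, so this passes
    # even though the behavior is wrong.
    return apply_discount(200.0, 10.0) == 190.0


def check_follows_spec() -> bool:
    # Expected value was worked out from the spec: 10% off 200 is 180.
    return apply_discount(200.0, 10.0) == 180.0


if __name__ == "__main__":
    print("mirrors implementation:", check_mirrors_implementation())
    print("follows spec:", check_follows_spec())
```

The first check is green and the second is red. A reviewer who only looks at whether the suite passes would miss the bug if the second check was never written.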