How do you know whether it meets your bar for reliability if you don’t understand the output? I don’t think the compiler analogy is apples to apples. A compiler doesn’t produce its answer by statistically generating something that should look like the right answer.