This article makes the same Stats 101 mistakes with p-values that the Bloomberg article does.
All this article can say is that it cannot reject the null hypothesis (chatgpt does not produce statistical discrepancies).
It certainly cannot state that chatgpt is definitively not racist. The article moves the discussion in the right direction though.
Also, I didn't look too closely, but their table under "Where the Bloomberg study went wrong" has unreasonable expected frequencies. But then I noticed it was because it was measuring "name-based discrimination." This is a terrible proxy to determine racism in the resume review process, but that is what Bloomberg decided on so wtv lol. Not faulting the article for this, but this discussion seems to be focused on the wrong metric.
If you are going to argue with people over stats, then don't make the same mistakes...
Author here. We mentioned in the piece that we can't rule out that ChatGPT is racist, and that bias might become detectable with a larger sample size. A caveat is that these tests might show evidence of bias if the sample size were increased to, say, 10,000 rather than 1,000. That is, with a larger sample size, the p-value might show that ChatGPT is indeed more biased than random chance. The thing is, we just don’t know from their analysis, though it certainly rules out extreme bias.
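To make the sample-size point concrete, here is a minimal sketch of a chi-square goodness-of-fit test run on the same observed share at n = 1,000 and n = 10,000. The numbers are hypothetical and not taken from either article: it assumes four groups, so the null share of top rankings for any one group is 25%, and an invented observed share of 27%. The helper name `chi2_gof_two_cells` is also made up for this illustration.

```python
import math

def chi2_gof_two_cells(observed_hits, n, null_share):
    """Chi-square goodness-of-fit test with two cells (hit / miss), df = 1.

    Returns (statistic, p_value). For one degree of freedom the chi-square
    survival function has the closed form erfc(sqrt(x / 2)).
    """
    expected_hits = n * null_share
    expected_misses = n - expected_hits
    observed_misses = n - observed_hits
    stat = ((observed_hits - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical values, chosen only to illustrate the effect of sample size.
NULL_SHARE = 0.25      # share expected under "no discrimination" with 4 groups
OBSERVED_SHARE = 0.27  # invented observed share, identical at both sample sizes

results = {}
for n in (1_000, 10_000):
    hits = round(n * OBSERVED_SHARE)
    results[n] = chi2_gof_two_cells(hits, n, NULL_SHARE)
    stat, p = results[n]
    print(f"n={n:>6}: chi2={stat:.2f}, p={p:.3g}")
```

Under these assumptions the same two-point gap is not significant at the 0.05 level with 1,000 trials but is with 10,000. Failing to reject the null at the smaller sample size says the data are compatible with no bias, not that bias is absent.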