
I’m curious, is adding “do not hallucinate” to prompts effective in preventing hallucinations? The author does this.


It will work - you can see it well with a Chain of Thought (CoT) model: it will keep asking itself "am I hallucinating? let's double check" and then self-reject thoughts if it can't find proper grounding. In fact, this is the best part of a CoT model: you can see where it goes off the rails and add a message to the prompt to fix it.

For example, there is the common challenge "count how many r letters are in strawberry". You can see the issue is not the counting itself, but that the model does not know whether "rr" should be treated as a single "r": it is not sure if you are counting r "letters" or r "sounds", and when you sound out the word there is a single "r" sound where it is spelled with a double "r". So if you tell the model that a double "r" counts as 2 letters, it will get it right.
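
Roughly, that kind of prompt clarification looks like the sketch below (using the OpenAI Python SDK; the model name and the exact wording of the instruction are just placeholders I made up, not anything from this thread):

    # Minimal sketch: spell out the counting rule in the system prompt so the
    # model doesn't have to guess whether "rr" counts as one letter or two.
    # Assumes the OpenAI Python SDK (openai>=1.0) and an example model name.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "Count letters as they are spelled, not as they sound: "
                    "a double letter such as 'rr' counts as 2 letters. "
                    "Do not hallucinate; if you are unsure, say so."
                ),
            },
            {"role": "user", "content": "How many r letters are in 'strawberry'?"},
        ],
    )

    print(response.choices[0].message.content)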


Apple were using that in their Apple Intelligence system prompts last year, I don't know if they still have that in there. https://simonwillison.net/2024/Aug/6/apple-intelligence-prom...

I have no idea if it works or not!


I added it because of the Apple prompts! I figured it was worth a try. The results are good, but I did not test it extensively.


I don't know about this specific technique, but I have found it useful to add a line like 'it's OK if you don't know or this isn't possible' at the end of queries. Otherwise LLMs have a tendency to tilt at whatever windmill you give them. Managing tone and expectations with them is a subtle but important art.
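
Something like this, as a sketch of the "give the model an out" idea; the helper name and the exact suffix wording are just for illustration:

    # Append a short suffix that makes "I don't know" an acceptable answer,
    # instead of nudging the model to produce something at any cost.
    def with_escape_hatch(query: str) -> str:
        suffix = "It's OK if you don't know the answer or if this isn't possible."
        return f"{query.rstrip()}\n\n{suffix}"

    if __name__ == "__main__":
        print(with_escape_hatch("Which flag makes `grep` print only the match count?"))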


It seems absurd, but I suppose it's the same as deliberately misspelling a word with similar enough trigrams to get the best autocorrect results.



