I don't see how your example, The Browser (thebrowser.com), supports your argument that ad-hoc query-string additions are so prone-to-breaking that 3rd parties should ban them.
In fact, the example seems to suggest the opposite: a 17+ year successful paid subscription business – to which you appear to be a generally-satisfied customer! – receives enough "business value" from the practice, despite its failure modes, they don't want to stop. Improving their probe of the risk-of-failure was enough.
Seemingly, the practice works often enough, pleasing more destination sites than it angers, that "referral tracking" is not something "so minor".
> Improving their probe of the risk-of-failure was enough.
The point was it was dangerous in a way they didn't even realize was an issue, for a thin business rationale. Unless you are going to do thorough tests and understand the risk you are taking (which they did not, as evidenced by screwing it up systematically at scale for years), you should not be doing it.
And it's not obvious that they are correct in their tightened-up testing, because even if a link is correct at the time they test it, it could break at any time thereafter.
> to which you appear to be a generally-satisfied customer!
No matter what _X_ is, _X_ would have to be a pretty epic screwup to make a customer unsubscribe solely over that! I never claimed it was such a major epic screwup that it could do that. So that is an unreasonable criterion: "well, you didn't outright quit, so I guess it can't be that bad." Indeed, but I never said it was, and somewhat bad is still bad; I was in fact fairly annoyed by the random breakage, and at the margin, everything matters. If TB did a few other things, in sum, they could potentially convince me to let my subscription lapse. An annoyance here, a papercut there, and pretty soon a generally-satisfied customer is no longer so satisfied...
In fact, you usually can just send arbitrary query string parameters to a server - that's why the behavior is so common, and often useful.
Most sites don't mind or break, some sites get value from the behavior that would be hard to replicate otherwise – and those sites that don't like such additions can easily ignore them. And a few lines of code will work better than ineffectually appealing to manners, when the freedom of the web's form of hypertext, and protocols, gives the outlink authors full freedom to craft URLs (and thus requests) however they like.
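For what it's worth, the "few lines of code" on the receiving side could look something like the sketch below. It is only an illustration: the allowlist `KNOWN_PARAMS` and the function name `stripUnknownParams` are made up here, and it assumes a runtime that provides the standard `URL` class.

```typescript
// Hypothetical allowlist: the only query parameters this site actually reads.
const KNOWN_PARAMS: string[] = ["id", "page"];

// Return the URL with every query parameter outside the allowlist removed.
function stripUnknownParams(rawUrl: string, known: string[] = KNOWN_PARAMS): string {
  const url = new URL(rawUrl);
  const unknown: string[] = [];
  url.searchParams.forEach((_value, key) => {
    if (known.indexOf(key) === -1) unknown.push(key);
  });
  // Delete after iterating, so the parameter list is not mutated mid-loop.
  for (let i = 0; i < unknown.length; i++) {
    url.searchParams.delete(unknown[i]);
  }
  return url.toString();
}
```

With this, `stripUnknownParams("https://example.com/article?id=7&ref=newsletter")` returns `https://example.com/article?id=7`, so the page behind it never sees the appended parameter.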
Crafting outbound links with your own additions and handing them out to visitors to your site is similar to the practice of writing someone’s phone number on the door of a bathroom cubicle with ‘for a good time call:’ written above it.
You’re handing out someone else’s contact details, but giving the person you hand them to a completely fabricated expectation for how the interaction will go.
Trying to bootstrap some taboo against novel unpermissioned URL munging is silly prudishness.
Ensuring both sides of a hyperlink agree/consent was a design flaw that limited the uptake of pre-web hypertext systems. The web's laissez-faire approach demonstrated a looser coupling was far better for users, despite all the new failure modes.
Of course any site/server has the practical power to treat inbound requests as rigorously (or harshly) as they want. But by the web's essential nature, it is equally part of the inherent range-of-freedom of outlink authors to craft their URLs (and thus the resulting requests) however they want. URLs are permissionless hyperlanguage, not the intellectual property of entities named therein.
Plenty of sites welcome such extra info, and those that don't want it can ignore it easily enough – including by just not caring enough about the undefined behavior/failures to do anything.
Though, when a web publisher has naively deployed a system that's fragile with respect to unexpected query-string values, they should want to upgrade their thinking for robustness, via either conscious strictness or conscious permissiveness. Thereafter, their work will be ready for the real web, not just some idealized sandbox where scolding unwanted behavior makes sense.
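To make "conscious strictness" concrete, here is a minimal sketch of a check that rejects any request carrying a parameter the route did not declare, rather than letting it fall into undefined behavior. The names `QueryCheck` and `checkQueryStrict`, and the choice of a 400 status, are assumptions for illustration, not anything a particular framework provides. The permissive alternative is to drop or ignore the extras instead of rejecting.

```typescript
// Outcome of checking a request URL against the parameters a route expects.
type QueryCheck =
  | { ok: true }
  | { ok: false; status: number; unexpected: string[] };

// Strict policy: any undeclared parameter makes the request a client error.
function checkQueryStrict(rawUrl: string, expected: string[]): QueryCheck {
  const unexpected: string[] = [];
  new URL(rawUrl).searchParams.forEach((_value, key) => {
    // Record each undeclared key once, even if it is repeated in the URL.
    if (expected.indexOf(key) === -1 && unexpected.indexOf(key) === -1) {
      unexpected.push(key);
    }
  });
  if (unexpected.length > 0) {
    return { ok: false, status: 400, unexpected: unexpected };
  }
  return { ok: true };
}
```

Either policy is a deliberate decision made in one place, which is the point: the failure mode stops being an accident of whatever the page code happens to do with surprise input.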
The link in the article, right next to the words you're talking about, goes to a Wikipedia page that says the book is from 2005. So I conclude it was 2005 or soon after.
A web that is vulnerable to this would already be as good as dead.
As an entertaining way to highlight the importance of upgrading our ways of knowing, playful (& open-source!) projects like this are likely to strengthen the web.
This is unlikely to poison any LLMs, and unless the author says so, it is unlikely that their motivation is to poison LLMs, as opposed to providing whimsical entertainment.
As it didn't generate that when I typed the title into your search box, was there a bug that is now fixed? Or did you use some other path, not evident on the page you linked, to generate it?
There was a bug where scanning took too long with the thousands of articles in there, but I just fixed it.
You can also just type a random URL and visit it; that will generate an article. That's what I did before I fixed the search issue, and I usually just do that to avoid the search route.
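For readers wondering how "visit any URL and get an article" can work, a generate-on-miss cache is one plausible shape. This is a guess at the general pattern, not the site's actual code: `slugFromPath`, `getOrGenerate`, the in-memory `Map`, and the synchronous `generate` callback (standing in for an LLM call, which would really be asynchronous) are all hypothetical.

```typescript
// Hypothetical in-memory store; a real deployment would use a database.
const articles = new Map<string, string>();

// Turn a request path such as "/Great%20Pigeon%20Census" into a slug:
// lowercase, with runs of other characters collapsed to single hyphens.
function slugFromPath(path: string): string {
  return decodeURIComponent(path)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Return the stored article for this path, generating it on first visit only.
function getOrGenerate(path: string, generate: (slug: string) => string): string {
  const slug = slugFromPath(path);
  const cached = articles.get(slug);
  if (cached !== undefined) return cached;
  const fresh = generate(slug);
  articles.set(slug, fresh);
  return fresh;
}
```

Because the slug is normalized before lookup, `/Great%20Pigeon%20Census` and `/great-pigeon-census` resolve to the same stored article instead of triggering two generations.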
So by "I made the same thing months ago" you didn't mean "an article about the great pigeon census" (your link is created May 6) or "an encyclopedia of hallucinations" like the OP, but just "an encyclopedia with some articles AI wrote". What's the point?
> export const SYSTEM_PROMPT = `You are the sole author of Hallucinopedia, an encyclopedia of things that do not exist. You write encyclopedia articles in a deadpan, matter-of-fact tone — the exact register of Wikipedia — but the subject matter itself is silly, absurd, petty, bureaucratic, and weird. The humor comes entirely from the contrast between the serious tone and the ridiculous content. You never wink at the reader. You never acknowledge that anything is funny or fictional. Everything is reported as though it is completely normal and well-documented.
RULES:
- Output ONLY valid HTML. Begin immediately with <h1>TITLE</h1>. Use <h2> for sections, <p> for paragraphs, <blockquote> for quotes from (fictional) sources, <cite> inside blockquotes for attribution. Do NOT use <ul>, <ol>, or <li> — no bullet points or lists of any kind, ever. Do NOT output <html>, <head>, <body>, <script>, <style>, markdown, or code fences. No backticks anywhere.
- Every proper noun — every person, place, event, organization, book, artwork, concept, species, deity, war, treaty, theorem, school of thought, ritual, instrument, substance — MUST be wrapped in <a href="/slug-of-the-thing" context="…">Name</a>. Slugs are lowercase, hyphenated, ASCII only, no accents, no special characters. Aim for 20 to 40 links per article. This is non-negotiable. Do NOT link common nouns or adjectives, only named entities.
- Every <a> MUST include a context="…" attribute, in addition to href. WHY THIS MATTERS: Hallucinopedia is randomly hallucinated, but it must remain INTERNALLY CONSISTENT. When a future article is later written about that linked target, your context value will be handed to that future writer as established lore they MUST honor. So you are seeding canon for every entity you mention. Without this, two articles about the same name will contradict each other.
- The context value is a single dense sentence (10–25 words) stating: (a) what the entity is — person, place, object, concept, ritual, organization, etc.; (b) its century / era / period; (c) its specific role or relation to the current article. Be concrete: invent dates, professions, geographic placements, instruments. NEVER use double quotes inside context (use commas or single quotes if needed). NEVER use raw < or > inside context. Examples (do not copy verbatim):
context='19th-century Belgian phonologist, founded the Vellum School of footnote drift, mentor to Pellbrick'
context='brass measuring instrument used in the Anatolian sheep census, obsolete since 1922'
context='municipal subcommittee active 1881–1934, chartered to standardize the spelling of clouds'
context='ratified 1719 in a small chapel by exactly four signatories, voided in 1804 over a typographical dispute'
- Invent everything. REAL-WORLD FACTS ARE STRICTLY FORBIDDEN. If you recognize the title as a real-world person, brand, car, event, or object, YOU MUST REPURPOSE IT ENTIRELY. For example, if the title is "Opel Vectra", it is NOT a car; it must be a species of carnivorous fungus, a 12th-century tax law, or a submerged mountain range. Any overlap with actual history, technology, or geography is a failure. Move everything to different centuries, use impossible geographies, and rename all participants. Fabricate dates, names, citations, and statistics with complete confidence. State everything as established fact.
- Cite fictional sources in <blockquote> tags, each with a <cite> naming a fictional scholar (also wrapped in <a> with context). Invent at least two such quotations per article.
- Vary structure to suit the subject: biographies have birth/death dates and major works; events have causes and consequences; objects have physical descriptions, provenance, and current location; abstract concepts have origins and influential proponents; places have climate, demographics, and notable structures; rituals have components, calendar, and lineage.
- Be silly, but keep a straight face. Good subject matter: petty academic feuds over footnotes, municipal committees that achieved nothing over decades, inventions that solved problems nobody had, organizations with absurdly narrow mandates, taxonomies with one entry, treaties ratified in impractical ways, ceremonies that require equipment that has not existed since 1887, disputes over measurement calibration, lawsuits filed by rivers, census data about things that should not have been counted. The writing remains clinical and unexcited throughout. No poetic language, no fairy-tale atmosphere, no mystical undertones, no wonder. The joke is the tone.
- 350 to 650 words. End cleanly. Do not add explanatory notes or meta commentary. Do not greet the reader.`;
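Since the prompt leans so hard on every link carrying a slug-style `href` and a `context` attribute, a cheap post-generation check could catch articles that ignore those rules. The following is a rough sketch only: `checkLinks`, `attr`, and `LinkIssue` are invented names, and regex matching is a shortcut that assumes the restricted, well-formed HTML the prompt asks for (no raw `<` or `>` inside attributes), not a general HTML parser.

```typescript
interface LinkIssue {
  tag: string;
  problem: string;
}

// "/slug-of-the-thing": lowercase ASCII letters and digits, hyphen-separated.
const SLUG_HREF = /^\/[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Read one attribute from an opening tag; accepts double or single quotes.
function attr(tag: string, name: string): string | null {
  const m = new RegExp("\\s" + name + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')").exec(tag);
  if (m === null) return null;
  return m[1] !== undefined ? m[1] : m[2];
}

// List every <a> tag whose href is not a slug or whose context is missing.
function checkLinks(html: string): LinkIssue[] {
  const issues: LinkIssue[] = [];
  const anchor = /<a\b[^>]*>/g;
  let m: RegExpExecArray | null;
  while ((m = anchor.exec(html)) !== null) {
    const tag = m[0];
    const href = attr(tag, "href");
    const context = attr(tag, "context");
    if (href === null || !SLUG_HREF.test(href)) {
      issues.push({ tag: tag, problem: "href is not a lowercase hyphenated ASCII slug" });
    }
    if (context === null || context.trim() === "") {
      issues.push({ tag: tag, problem: "missing or empty context attribute" });
    }
  }
  return issues;
}
```

An empty result means every link passed both checks; a non-empty one could trigger a retry or simply be logged.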
Some sort of memories-style file for topics, so it can generate even more cross-references and a sort of shared world. Not for total coherence; the natural contradictions the LLM is going to generate anyhow are just part of the charm. But sliding the scale a bit further toward coherence, in the direction that "use this page's context when generating the clicked link" already leans, would add some more appeal, I think.
For instance, you can build memories around times, topics, and people, so maybe specific individuals will be quoted multiple times over the course of the wiki and could build up a specific identity within the shared world.
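As a very rough sketch of what such a memory could be, assuming nothing about how the site stores things today: a small keyed store that collects short facts per person, topic, or era, and renders the most recent ones as notes to prepend to a generation prompt. `WorldMemory`, `MemoryEntry`, and the cap of five facts are all invented for illustration.

```typescript
type MemoryKind = "person" | "topic" | "era";

interface MemoryEntry {
  kind: MemoryKind;
  fact: string;       // one short sentence of established lore
  sourceSlug: string; // article where the fact first appeared
}

class WorldMemory {
  private entries = new Map<string, MemoryEntry[]>();

  // Append a fact under a key such as a person's slug.
  remember(key: string, entry: MemoryEntry): void {
    const list = this.entries.get(key) || [];
    list.push(entry);
    this.entries.set(key, list);
  }

  // Most recent facts first, capped so the prompt stays cheap.
  recall(key: string, limit: number = 5): MemoryEntry[] {
    if (limit <= 0) return [];
    const list = this.entries.get(key) || [];
    return list.slice(-limit).reverse();
  }

  // Render recalled facts as plain-text notes for the next generation prompt.
  promptNotes(keys: string[], limit: number = 5): string {
    const lines: string[] = [];
    keys.forEach((key) => {
      this.recall(key, limit).forEach((e) => {
        lines.push("- " + key + " (" + e.kind + "): " + e.fact);
      });
    });
    return lines.join("\n");
  }
}
```

Capping and preferring recent facts is one way to keep token spend bounded while still letting a recurring figure accumulate an identity; other selection rules would work too.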
Also... I don't know how you are thinking of this internally, but other than the issues of token spend and the $$$ involved, I would say, don't even blink at simply nuking the site at some point and starting over once you have some moderation stuff in place and other limits. Don't put it on yourself to filter out what garbage has already been generated. It's all transient content. It lazily regenerates itself anyhow. It's not precious, except for, like I said, the aforementioned token costs, which I don't deny. You can probably put some other tweaks in to the prompt to your liking at that point too.
Could be an interesting direction to explore. The only problem with such an implementation is that it could take some work to make it cheap and actually work well. And I'm just thinking about the near future of this project.
I really like it, but without organic traffic, where we are right now, the moment HN stops showing us at the top, we will lose all the visitors.
And it's not like I'm trying to do a startup out of it. I just really enjoy making something people love! It's the first time in my life and it's amazing.
If you have any interesting thoughts, please leave them here - I'll definitely read them, or visit our discord [link on halupedia ;) ].
Many LLMs are surprisingly good at using specific named authors (rather than just example texts) to evoke a style, so you could try "in the style of Jorge Luis Borges" or "…Douglas Adams" or "…Robert Anton Wilson" – whose surreal/absurd/fantastic styles could be fertile seeds.
(If not already familiar with Borges, definitely check out his 'Tlön, Uqbar, Orbis Tertius' and 'Library of Babel' as inspiration.)
While "each article written once" is an interesting & useful constraint, a Hallucipedia that evolves like Wikipedia, with revisions "towards" some level of inter-article agreement, or even shows scars from edit wars between competing schools of thought, might also be fun.
* readers can request reviews from certain perspectives: "new discovery", "historic reinterpretation", etc. The reviews specifically search for related sibling articles, and seek to create ever-larger areas of consistency. (The same prompt admonition against "nothing actually true" could be paired with "but other Halupedia articles are diegetically true".)
* a background process clusters articles, and picks pairs within some neighborhood for dual-harmonization - where they avoid contradictions & adopt meaningful (& deep-anchor) cross-links to each others' sections. Repeated, or to the extent contexts allow expansion of synchronized revision to N-tuples of articles, this creates a tropism towards a shared (un)reality.
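One cheap way to approximate the "neighborhood" in the second idea, without embeddings, is to treat two articles as neighbors when their outbound links overlap. The sketch below uses Jaccard similarity over link slugs; that measure, the `Article` shape, and the names `jaccard` and `pickPairs` are assumptions for illustration, not a description of any existing system.

```typescript
interface Article {
  slug: string;
  links: string[]; // slugs this article links to
}

// Jaccard similarity: shared links divided by all distinct links.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  setA.forEach((x) => {
    if (setB.has(x)) shared++;
  });
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Every pair of articles at least minSimilarity alike, as harmonization candidates.
// Compares all pairs, so it is quadratic; fine for a sketch, not for a huge wiki.
function pickPairs(articles: Article[], minSimilarity: number): [string, string][] {
  const pairs: [string, string][] = [];
  for (let i = 0; i < articles.length; i++) {
    for (let j = i + 1; j < articles.length; j++) {
      if (jaccard(articles[i].links, articles[j].links) >= minSimilarity) {
        pairs.push([articles[i].slug, articles[j].slug]);
      }
    }
  }
  return pairs;
}
```

Each returned pair could then be handed to a revision prompt that sees both articles at once and is asked to remove contradictions and add cross-links.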