Hacker News | 4sak3n's comments

I couldn't agree more.

The obvious quirks which are easy to pick up and identify, like em dashes or overuse of a specific rhetorical device, are also easy to adapt and change. The things which consistently help me clock LLMs are the deeper lack of cohesion, thoughtfulness, idiosyncratic quirks and just plain interesting writing. These lacks are, in my opinion, fundamental to the technology, and no amount of post-training or RLHF will be able to supply those qualities.

It's the long tail of genuine human communication, the 0,0001% of training data which is so niche or idiosyncratic that it gets discarded in order to compress and encode the rest, which is what people subconsciously pick up on. Most of the hallmarks which people are so quick to point out, the em-dashes et al, are addressable with tweaks to the system, but that lack cannot be, because it is a fundamental weakness of the technology.


Not if you see the IPO as your only remaining exit strategy for a juggling act that is threatening to rain down on your head when it collapses.


Adjectives can be used as nouns in informal speech


Yes, but I find it really, really irritating.


Is he really criminally underrated if My Stars the Destination features in just about every "best scifi books" list I've ever read? Maybe not as high up as it deserves on those lists but that's hardly criminal underappreciation


Sure, but it's pretty rare to see people talk much about Bester compared to other SF novelists. Also astonishing to me that none of his work has ever been made into a film that I'm aware of. Same with Jack Vance.


The Stars My Destination ... how did I mess that up?


> Vietnamese is one example. Having tones attached to syllables means that words and sentences are shorter.

This does not entail that more can be expressed than in other languages. Please see my other reply which goes into (admittedly only slightly) more detail.


Yes, sure, "expressability" is something that is hard to quantify.

Otoh, there should be some connection between grammar complexity and written culture. It is only my hypothesis, but, say, a culture with a rich novel-writing tradition leads to a more complicated grammar in its language. A 3 page long sentence, anyone? One can see how Middle Egyptian literature made the underlying language more complex.


> Tonal languages allows individuals to express way more than Latin based languages.

Not true. There was a study that showed that information density is pretty consistent across all languages, regardless of the average number of phonemes used in that language or the strategies that are employed to encode structural information like syntax. I can only assume you are referring to density with your statement, based on the subject matter in that article as well as the fact that, given enough time and words, any language can represent anything any other language can represent.

I apologise if my terms are not exact, it's been years since I've studied linguistics. In addition, since I'm not going to dig up a reference to that paper, my claim here is just hearsay. However, the finding lines up with pretty much all the linguistic theory I learned regarding language acquisition and production, as well as theory on how language production and cognition are linked, so I was confident in the paper's findings even though a lot of the theory went over my head.


Language density is one thing but what about legibility?

How achievable is literacy?


I'm pretty sure that you are correct. Or at the very least it is a reference to that specific aphorism. The title is far too idiomatically Latin (if you overlook the awkwardness with the syntactic subject) to be a coincidence.


"PyPI is growing fast. If this dangerous expansion not stopped, our advanced machine learning models predict that in only 8 years the number of packages will outnumber human beings."

This is one of the funniest things I've read all week.


The WITNESS THIS INEVITABLE FUTURE button that slowly, then steadily started increasing the date on the graphs is executed very well. Truly hysterical, I laughed out loud.


Thank you, I’m glad you appreciate it. I spent way more time on it than I would like to admit. The original version actually did run a small model trained on the data in the browser to generate the predictions.


Maybe it's forward thinking of the ML folk to create big data to train the models that will replace them.


I don't know whether this level of accuracy is important for you or not (if it isn't, please excuse the correction) but the plural of "corpus" is "corpora".


Oh my gosh, thank you so much. <3 :)


> A developer working on a function may suddenly discover the need to, say, left-pad a string with blanks. Rather than go though the pain of implementing this challenging functionality ...

The irony is palpable.

