I think the comparison to giving change is a good one, especially given how frequently the LLM hype crowd uses the fictitious "calculator in your pocket" story. I've been in the exact situation you've described, long before LLMs came out, and cashiers have had calculators in front of them for longer than we've had smartphones.
I'll add another analogy. I tell people that when I tip I "round off to the nearest dollar, move the decimal place (10%), and multiply by 2" (generating a tip that will be in the ballpark of 18%), and I'm always told "that's too complicated". It's a three-step process where the hardest thing is multiplying a number by 2 (and usually a two-digit number...). It's always struck me as odd that the response is that this is too complicated, rather than that it's a nice tip (pun intended) for figuring out how much to tip quickly and with essentially zero thinking. If any of those three steps seems difficult to you, then your math skills are below elementary school level.
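For anyone who wants to see just how little is involved, here's the whole thing as a minimal sketch (the bill amount is made up):

    bill = 47.60                 # hypothetical bill
    rounded = round(bill)        # step 1: round to the nearest dollar -> 48
    ten_percent = rounded / 10   # step 2: move the decimal point -> 4.8
    tip = ten_percent * 2        # step 3: double it -> 9.6
    print(f"tip: ${tip:.2f}")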
I also see a problem with how we look at math and coding. I hear "abstraction is bad" so often, yet that is all coding (and math) is. It is fundamentally abstraction. The ability to abstract is what makes humans human. All creatures abstract; it is a necessary component of intelligence, but humans certainly have a unique capacity for it. Abstraction is no doubt hard, but when in life was anything worth doing easy? Unfortunately, I think we are willing to put significantly more effort into justifying our laziness than into not being lazy. My fear is that we will abdicate doing worthwhile things because they are hard. It's something people do every day. So many people love to outsource their thinking, be it to a calculator, Google, "the algorithm", their favorite political pundit, religion, or anything else. Anything to abdicate responsibility. Anything to abdicate effort.
So I think AI is going to be no different from calculators, as you suggest. It can be a great tool that helps people do so much, but it will far more commonly be used to outsource thinking, even by many people considered intelligent. Skills atrophy. It's as simple as that.
I briefly taught a beginner CS course over a decade ago, and even at the time it was surprising and disappointing how many of my students would reach for a calculator to do single-digit arithmetic, something we were required to commit to memory when I was still in school. Not surprisingly, teaching them binary and hex was extremely frustrating.
> I tell people that when I tip I "round off to the nearest dollar, move the decimal place (10%), and multiply by 2" (generating a tip that will be in the ballpark of 18%), and I'm always told "that's too complicated".
I would tell others to "shift right once, then divide by 2 and add" for 15%, and get the same response.
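Same spirit, as a minimal sketch (again with a made-up bill):

    bill = 47.60                         # hypothetical bill
    ten_percent = bill / 10              # "shift right once" -> 4.76
    tip = ten_percent + ten_percent / 2  # divide by 2 and add -> 7.14, i.e. 15%
    print(f"tip: ${tip:.2f}")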
However, I'm not sure what you mean by there being a problem with thinking abstraction is bad. Yes, abstraction is bad: it is a way to hide and obscure the actual details, and one could argue that such dependence on opaque things, just like a calculator or AI, is the actual problem.
I'm sorry, but I think you are teaching people the wrong thing if you make the blanket statement "abstraction is bad". You are throwing the baby out with the bathwater. You can "over-abstract", and that certainly is not good, but over-abstraction is not easy to define because it is extremely problem dependent. With these absurd blanket statements you just push code quality and performance down.
Over-abstraction is bad when it makes code too difficult to read, or when it de-optimizes programs. "Too difficult to read or maintain" is ultimately a skill issue. We don't let the juniors decide that, but neither should we have abstraction that only wizards can maintain. Both are errors.
But abstraction can also greatly increase readability and help maintain code. It's the reason we use functions. It's the reason we use OOP. It can help optimize code, it can reduce how much we write, and it does many other beneficial things.
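To make that concrete, here's a throwaway sketch (the function and data are invented for illustration): the caller reads intent, not mechanics.

    # The sort-and-index mechanics are hidden behind a name;
    # callers read one line and know what it means.
    def median(values: list[float]) -> float:
        ordered = sorted(values)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2

    latency_ms = [12.0, 15.5, 11.2, 30.1, 14.8]
    print(median(latency_ms))  # 14.8 -- detail hidden, not obscured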
Lumping everything together is just harmful.
Saying abstraction is bad is no different than saying "python is bad", or any duck typing language (including C++'s auto), because you're using an abstract data type. The "higher level" the language, the more abstract it is.
Saying abstraction is bad is no different than saying templates are bad.
Saying abstraction is bad is no different than saying object oriented programming is bad.
Saying abstraction is bad is saying coding is bad.
I'm sorry, but literally everything we do is abstraction. Conflating "over-abstraction" with "abstraction" is just as grave an error as the misrepresentation of Knuth's "premature optimization is the root of all evil." Dude said "grab a fucking profiler" and everyone heard "don't waste time making things work better".
If you want to minimize abstraction then you can go write machine code. Anything short of that has abstracted away many actions and operations. I'll admire your skill, but this is a path I will never follow nor recommend. Abstraction is necessary, and our ability to abstract is foundational to making code work at all.
*I will die on this hill*
> because it is a way to hide and obscure the actual details
That's not abstraction, that's obfuscation. Do not conflate these things.
> one could argue that such dependence on opaque things, just like a calculator or AI, is the actual problem.
> Powered by Gemini, a multimodal large language model developed by Google, EMMA employs a unified, end-to-end trained model to generate future trajectories for autonomous vehicles directly from sensor data. Trained and fine-tuned specifically for autonomous driving, EMMA leverages Gemini’s extensive world knowledge to better understand complex scenarios on the road.
This strikes me as a skunkworks project to investigate a technology that could be used for autonomous vehicles someday, as well as to score some points with Sundar and the Alphabet board, who've decreed the company is all-in on Gemini.
Production Waymos use a mix of machine learning and computer vision (particularly on the perception side) and conventional algorithmic planning. They're not end-to-end machine learning at all; they use it as a tool when appropriate. I know because I have a number of friends who have gone to work for Waymo, some of whom did compiler/build infrastructure for the cars, and I've browsed through their internal Alphabet job postings as well.
You were confidently wrong in judging them to be confidently wrong.
> While EMMA shows great promise, we recognize several of its challenges. EMMA's current limitations in processing long-term video sequences restricts its ability to reason about real-time driving scenarios — long-term memory would be crucial in enabling EMMA to anticipate and respond in complex evolving situations...
They're still in the process of researching it; nothing in that post implies VLMs are actively being used by those companies for anything in production.
I should have taken more care with which article I linked, but I was trying to link something clearer.
But mind you, everything Waymo does is under research.
So let's look at something newer to see if it's been incorporated.
> We will unpack our holistic AI approach, centered around the Waymo Foundation Model, which powers a unified demonstrably safe AI ecosystem that, in turn, drives accelerated, continuous learning and improvement.
> Driving VLM for complex semantic reasoning. This component of our foundation model uses rich camera data and is fine-tuned on Waymo’s driving data and tasks. Trained using Gemini, it leverages Gemini’s extensive world knowledge to better understand rare, novel, and complex semantic scenarios on the road.
> Both encoders feed into Waymo’s World Decoder, which uses these inputs to predict other road users behaviors, produce high-definition maps, generate trajectories for the vehicle, and signals for trajectory validation.
They also go on to explain model distillation. Read the whole thing; it's not long.
But you could also read the actual research paper... or any of their papers. All of them in the last year are focused on multimodality and a generalist model, for a reason that I think is not hard to figure out since they spell it out.
To the best of my knowledge every major autonomous vehicle and robotics company is integrating these LVLMs into their systems in some form or another, and an LVLM is probably what you're interacting with these days rather than an LLM. If it can generate images or read images, it is an LVLM.
The problem is no different from LLMs though: there is no generalized understanding, and thus they cannot differentiate the more abstract notion of context. As an easy-to-understand example: if you see a stop sign with a sticker below it that says "for no one", you might laugh to yourself, understanding that in context the sticker does not override the actual sign. It's just a sticker. But L(V)LMs cannot compartmentalize and "sandbox" information like that. All information is processed equally. The best you can do is add lots of adversarial examples and hope the machine learns the general pattern, but there is no inherent mechanism in them to compartmentalize these types of information, no mechanism to differentiate this nuance of context.
I think the funny thing is that the more we adopt these systems the more accurate the depiction of hacking in the show Upload[0] looks.
Because I linked elsewhere and people seem to doubt this, here is Waymo a few years back talking about incorporating Gemini[1].
Also, here is the DriveLM dataset, mentioned in the article[2]. Tesla has mentioned that they use an "LLM-inspired" system and that they approach the task like an image captioning task[3]. And here's 1X talking about their "world model" using a VLM[4].
I mean come on guys, that's what this stuff is about. I'm not singling these companies out; I'm just using them as examples. This is how the field does things, not just them. People are really trying to embody AI, and the whole point of going towards AGI is to be able to accomplish any task. That Genie project on the front page yesterday? It is far, far more about robots than it is about videogames.
Many large companies have research departments that do experimental work that will never make it into the product. This raises prestige, increases visibility, and helps hire smart people.
Things like Waymo's EMMA are an example of this. Will the production cars use LVLMs somewhere? Sure, probably a great idea for things like sign recognition. Will they use a single end-to-end model for all driving, like EMMA? Hell no.
Driving vehicles with people on board requires extremely reliable software, and LLMs are nowhere close to this. Instead, it'll be the usual layered software: LLMs, traditional AI models, and tons of hardcoded logic.
(This all only applies to places where failure is critical. All that logic is expensive to write, so if there is no loss of life involved, people will do all sorts of crazy things, including end-to-end models)
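As a purely hypothetical toy, not anyone's actual stack, "layered" here means something like: a learned model proposes, and hardcoded logic gets the final word.

    # Toy sketch of a layered planner (entirely made up, for illustration only).
    from dataclasses import dataclass

    @dataclass
    class Trajectory:
        speed_mps: float
        crosses_stop_line: bool

    def ml_planner_propose() -> Trajectory:
        # stand-in for whatever learned model produces a candidate plan
        return Trajectory(speed_mps=14.0, crosses_stop_line=False)

    def hardcoded_safety_check(t: Trajectory, speed_limit_mps: float) -> bool:
        # deterministic rules that never defer to the model
        return t.speed_mps <= speed_limit_mps and not t.crosses_stop_line

    candidate = ml_planner_propose()
    if hardcoded_safety_check(candidate, speed_limit_mps=13.4):
        plan = candidate
    else:
        plan = Trajectory(speed_mps=0.0, crosses_stop_line=False)  # fall back to a safe stop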
But people like that they aren't shying away from negative results, and that builds some trust. Though let's not ignore that they're still suggesting AI + manual coding.
But honestly, this sample size is so small that we need larger studies. The results around what is effective and ineffective AI usage are a complete wash with n<8.
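To put a rough number on that, here's a back-of-the-envelope Wilson interval; the 5-out-of-8 split is made up purely to show how wide the uncertainty is at this sample size.

    from math import sqrt

    # 95% Wilson confidence interval for a proportion with tiny n
    successes, n, z = 5, 8, 1.96   # hypothetical split: 5 of 8 participants "helped"
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    print(f"{center - half:.2f} to {center + half:.2f}")  # roughly 0.31 to 0.86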
Also, anyone else feel the paper is a bit sloppy?
There are a bunch of minor things, but Figure 17 (the first figure in the appendix) is just kinda wild, and there are trivial ways to fix its glaring error. The more carefully you look at even just the figures, the more you say "who the fuck wrote this?" I mean, how the fuck do you even generate Figure 12? The numbers align with the grid but the boxes are shifted. And Figure 16 has the experience levels shuffled for some reason. And there's a hell of a lot more confusing stuff you'll see if you do more than glance...
> It shouldn't matter, because whoever is producing the work product is responsible for it, no matter whether genAI was involved or not.
I hate to ask, but did you RTFA? Scrolling down ever so slightly (emphasis not my own):
| *Who authorized this class of action, for which agent identity, under what constraints, for how long; and how did that authority flow?*
| A common failure mode in agent incidents is not “we don’t know what happened,” but:
| > We can’t produce a crisp artifact showing that a specific human explicitly authorized the scope that made this action possible.
They explicitly state that the problem is you don't know which human to point at.
> They explicitly state that the problem is you don't know which human to point at.
The point is "explicitly authorized", as the article emphasizes. It's easy to find who ran the agent(article assumes they have OAuth log). This article is about 'Everyone knows who did it, but did they do it on purpose? Our system can figure it out'
Also, what's weird is that this project seems to be primarily written in JavaScript. I can't imagine that's a pleasant user experience for generating tool paths...
It's a combination of JS, WASM, and WebGPU. The JIT engines are so much faster than you would imagine, especially if you tune your code right. Workers allow for parallel processing on all of your CPU cores. WebGPU, at least in Chrome, is kind of amazing.
People seem to be misunderstanding what's going on here. Running the particle accelerator generates a lot of heat and thus needs a pretty large-scale cooling system. What this is saying is: instead of dumping that heat into the atmosphere, pump it towards homes.
Is this a good source of heating? I mean yeah, the heat is being generated anyways. Should you build a particle accelerator to heat homes? Fuck no. But if you already have one, why not?
Considering the top comment is a joke about Bitcoin mining, and others are a joke about the Sun, April Fools, a conspiracy theory, and a question about which homes (obviously local ones), it seems like quite a few.
Or maybe I'm misreading, and HN really is becoming Reddit, because the thread is full of low-quality, off-topic comments. I wasn't surprised to see that most of the accounts are at most a few years old.
I think what confuses me is that Apple is squeezing so hard for profit that it ends up reducing their profits.
It's a classic direct-indirect management problem. Think about Android for a second. It costs next to nothing to put an app on their app store. People can make apps for themselves and then just publish, because either "why not" or it's an easy way to distribute to friends and family. So it makes app creation easy. Meanwhile, Apple charges you $100/yr to even put something up on the store and makes it hard to sideload, so people charge for apps, which Apple rejoices at since they get a 30% cut (already double dipping: profiting from devs and profiting from the devs' customers).
BUT WE'RE TALKING ABOUT SMARTPHONES
A smartphone is useless without apps! People frustrated they can't find the apps they want on iPhone? They switch to Android. People on Android who want to get away from Google but can't do half the shit they want on iPhones (and the other half costs $0.99/mo)? They bite their tongue or rage-quit to Graphene.
The only reason this "fuck over the user" strategy works is because there's an effective monopoly.
All of this is incredibly idiotic, because the point of a smartphone is that it is a computer that also makes phone calls. We have made a grave mistake in thinking they are anything but general-purpose computers. All our conversations around them seem really silly or downright idiotic once you recognize they are general-purpose computers. And surprise surprise, seeing how profitable and abusive the smartphone market can be leads to a pretty obvious result: turn laptops and desktops into smartphone-like devices, where everything must be done through the app stores, where they lock you out of basic functionality, where they turn the general-purpose computer into a locked-down, for-their-purposes computer.
The thing that made the smartphone and the computer so great was the ability to write programs, the ability to do with it as you want. You can't build one product that works for everyone. But the computer? It's an environment. You can make an environment that anyone can turn into the thing they want and need. THAT is the magic of computers. So why are we trying to kill that magic?
It doesn't matter that 90% of people don't use it that way, and all those arguments are idiotic. Like with everything else, it is a small minority of people that move things forward. A small percentage of players account for the majority of microtransactions in videogames. A small percentage of fans buy the majority of merchandise from their favorite musicians. And in just the same way, it is a small number of computer users (i.e. "power users") that drive most of the innovation, find most of the bugs, and do most of the things. I mean come on, how long did it take Apple and Google to put a fucking flashlight into the OS? Flashlight apps were among the most popular apps on both their stores for a long time before the feature got built in. Do you really think they're going to be able to do all the things themselves?