bredren's comments

The action is hot, no doubt. This reminds me of Spacewar! -> Galaxy Game / Computer Space.

I co-founded Gliph, which was one of the first commercial, cross-platform messaging apps to provide end-to-end encryption.

One area of exposure was push notifications. I wonder if the access described wasn't to the messages themselves but to content-rich notifications.

If so, both parties could be ~correct. Except the contractors would have been seeing what is technically metadata.


I'm unfamiliar with Gliph. What were the protocols/constructions you used?

How about Apple? How is Apple training its next foundation models?

Apple is sitting this whole thing out. Bizarre.

The options for a company in their position are:

1. Sit out and buy the tech you need from competitors.

2. Spend to the tune of ~$100B+ in infra and talent, with no guarantee that the effort will be successful.

Meta picked option 2, but Apple has always had great success with 1 (search partnership with Google, hardware partnerships with Samsung etc.) so they are applying the same philosophy to AI as well. Their core competency is building consumer devices, and they are happy to outsource everything else.


Yep. They only stop to build something when it will benefit them and impact the bottom line.

This whole thread is about whether the most valuable startup of all time will be able to raise enough money to see the next calendar year.

It's definitely rational to decide to pay wholesale for LLMs given:

- consumer adoption is unclear. The "killer app" for OS integration has yet to be shipped by any vendor.

- owning SOTA foundation models can put you into a situation where you need to spend $100B with no clear return. This money gets spent up front regardless of how much value consumers derive from the product, or if they even use it at all. This is a lot of money!

- as Apple has "missed" the last couple of years of the AI craze, there have been no meaningful ill effects on their business. Beyond the tech press, nobody cares yet.


I mean, they tried. They just tried and failed. It may work out for them, though — two years ago it looked like lift-off was likely, or at least possible, so having a frontier model was existential. Today it looks like you might be able to save many billions by being a fast follower. I wouldn’t be surprised if the lift-off narrative comes back around though; we still have maybe a decade until we really understand the best business model for LLMs and their siblings.

> I mean, they tried. They just tried and failed.

They tried to build something that probably would have looked like Copilot integration in Windows, and they chose not to ship it, because they discovered that it sucked.

So, they failed in an internal sense, which is better than the externalized kind of failure that Microsoft experienced.

I think that the nut that hasn't been cracked is: how do you get LLMs to replace the OS shell and the core set of apps that folks use? I think Microsoft is trying by shipping stuff that sucks and pissing off customers, while Apple tried it internally and declined to ship it. OpenClaw might be the most interesting stab in that direction, but even that doesn't feel like the last word on the subject.


I think you are right. Their generative AI was clearly underwhelming. They have been losing many staff from their AI team.

I’m not sure it matters though. They just had a stonking quarter. iPhone sales are surging ahead. Their customers clearly don’t care about AI or Siri’s lacklustre performance.


> Their customers clearly don’t care about AI or Siri’s lacklustre performance.

I would rather say their products just didn't lose value for not getting an improvement there. Everyone agrees that Siri sucks, but I'm pretty sure they tried to replace it with a natural language version built from the ground up and realised it just didn't work out yet. Yes, they have a bad but at least kinda-working voice assistant with lots of integrations into other apps. But replacing that with something that promises to do stuff and then does nothing, takes a long time to respond, and has fewer integrations due to the lack of keywords would have been a bad idea if the technology wasn't there yet.


Honestly, what it seems like is financial discipline.

We do know that they made a number of promises on AI[1] and then had to roll them back because the results were so poor[2]. They then went on to fire the person responsible for this division[3].

That doesn't sound like a financial decision to me.

[1] https://www.apple.com/uk/newsroom/2024/06/wwdc24-highlights/

[2] https://www.bloomberg.com/news/features/2025-05-18/how-apple...

[3] https://nypost.com/2025/12/02/business/apple-shakes-up-ai-te...


Well they tried and they failed. In that case maybe the smartest move is not to play. Looks like the technology is largely turning into a commodity in the long run anyways. So sitting this out and letting others make the mistakes first might not be the worst of all ideas.

From a technology standpoint, I don't feel Apple's core competency is in AI foundation models.

They might know something?

More like they don't know the things others do. Siri is a laughing stock.

Sure, Siri is, but do people really buy their phone based off of a voice assistant? We're nowhere near having an AI-first UX a la "Her" and it's unclear we'll even go in that direction in the next 10 years.

To use the parlance of this thread: "next" foundation models is doing a lot of heavy lifting here. Am I doing this right?

My point is, does Apple have any useful foundation models? Last I checked they made a deal with OpenAI, no wait, now with Google.


Apple does have their own small foundation models but it's not clear they require a lot of GPUs to train.

Do you mean like OCR in photos? In that case, yes, I didn't think about that. Are there other use cases aside from speech-to-text in Siri?

I think they are also used for translation, summarization, etc. They're also available to other apps: https://developer.apple.com/documentation/FoundationModels

Thanks, I am a dumb dumb about Apple, and mobile in general. I should have known this. I really appreciate the reply so that I know it now.

I think Apple is waiting for the bubble to deflate, and will then do something different. And they have a ready-made user base to provide whatever they can make money from.

If they were taking that approach, they would have absolutely first-class integration between AI tools and user data, complete with proper isolation for security and privacy and convenient ways for users to give agents access to the right things. And they would bide their time for the right models to show up at the right price with the right privacy guarantees.

I see no evidence of this happening.


As an outsider, the only thing the two of you disagree on is timing. I probably side with the ‘time is running out’ team at the current juncture.

They apparently are working on and are going to release 2(!) different versions of Siri. Idk, that just screams "leadership doesn't know what to do and can't make a tough decision" to me. But who knows? Maybe two versions of Siri is what people will want.

Arena mode! Which reply do you prefer? /s

But seriously, would one be for newer phone/tablet models, and one for older?


It sounds like the first one, based on Gemini, will be a more limited version of the second ("competitive with Gemini 3"). IDK if the second is also based on Gemini, but I'd be surprised if that weren't the case.

Seems like it's more a ramp-up than two completely separate Siri replacements.


Apple can make more money from shorting the stock market, including their own stock, if they believe the bubble will deflate.

They are in-housing their AI to sell it as a secure way to do AI, which 100% puts them in the lead for the foreseeable future.

For CC, I suspect it also needs to be testing and labeling separate runs against subscription, public API, and Bedrock-served models?

It’s a terrific idea to provide this. ~Isitdownorisitjustme for LLMs would be the parakeet in the coalmine that could at least inform the multitude of discussion threads about suspected dips in performance (beyond HN).

What we could also use is similar stuff for Codex, and eventually Gemini.

Really, the providers themselves should be running these tests and publishing the data.

The availability status information is no longer sufficient to gauge service delivery, because the service is by nature non-deterministic.
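To make that concrete: such a probe mostly comes down to replaying a fixed prompt against each serving path on a schedule and logging the latency plus the raw output for later scoring. A rough Python sketch, where the endpoint URLs and payload shape are pure placeholders and not any real provider API:

    # Rough sketch of a "is the model degraded?" probe. The endpoints and
    # payload shape below are placeholders, not a real provider API.
    import json
    import time
    import urllib.request

    SERVING_PATHS = {
        "subscription": "https://example.invalid/v1/subscription",
        "public-api": "https://example.invalid/v1/api",
        "bedrock": "https://example.invalid/v1/bedrock",
    }
    PROBE_PROMPT = "Write a function that reverses a linked list."

    def probe(label, url):
        payload = json.dumps({"prompt": PROBE_PROMPT}).encode()
        req = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"})
        start = time.monotonic()
        with urllib.request.urlopen(req, timeout=60) as resp:
            body = resp.read()
        # Real benchmarking would score the output; latency and size are a start.
        return {"path": label,
                "latency_s": round(time.monotonic() - start, 2),
                "response_bytes": len(body)}

    for label, url in SERVING_PATHS.items():
        print(probe(label, url))

Scoring the outputs is the hard part, of course, but even published latency and error-rate numbers per serving path would be more than we get today.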


You may have explained this elsewhere, but if not: what kind of post-processing did you do to upscale or refine the RealSense video?

Can you add any interesting details on the benchmarking done against the RED camera rig?


This is a great question, would love some feedback on this.

I assume they stuck with RealSense for proper depth maps. However, those are both limited to a 6 meter range, and their depth imaging isn't able to resolve features smaller than their native resolution allows (it gets worse after 3 m too, as there is less and less parallax, among other issues). I wonder how they approached that as well.


I had never noticed this before. Can you point at any examples?

I have long noticed high profile people going to court with some kind of cast on, though.


I heard that Altman does it. I don't care about him enough to check, though. More silly gimmicks, like Holmes talking in a man's voice or Jobs wearing the same turtleneck.


I had thought a main problem for professional video editors with FC had to do with video editor UX philosophy. Something difficult to pivot away from.

I’m hand waving there because I’m not a pro but my neighbor is and I don’t recall the details.

But I'm curious how you see FC also losing the semi-pro market to DaVinci specifically.


DaVinci Resolve is free. At least the non-Studio version is. (There are a few Studio-only features, but almost everything is available in the free version of Resolve.) And a lot of people want to learn Resolve anyway for color grading. Why not just edit in Resolve too? Resolve Studio is also quite cheap, given you buy it once and own it forever, including updates.

I spent last week helping out at a short filmmaking course. The DP running it has used Final Cut for his entire career. But not a single student chose to edit their film using Final Cut. The class was split between Resolve and Premiere Pro. (Premiere was chosen by a lot of people because it's what they use at school, and they have a free licence to Premiere from their school while they're studying.)


This, plus:

- The studio version of DaVinci is still affordable should you need it.

- DaVinci has many good tutorials


Plus, purchasing any BMD camera usually gets you a "free" license of DaVinci :) That's how I got my license many moons ago.

Now BMD have "prosumer" cameras available too that don't cost half a liver, and the second-hand market seems flush with them, so you can grab really good hardware for "cheap" and get excellent software with it too, as the license is movable across hosts :)


I'm surprised to hear the software moves with the hardware! This and the other comments help explain the spread.


The 'cut' page in DaVinci specifically exists to replicate the FC editing UX.

It's an optional way of editing separate from the 'edit' tab.


Oh, that's where Cut comes from. I could never get used to editing in the Cut screen.


Agreed, I hate it.


Yes. I am not aware of a model shipping with Windows nor announced plans to do so. Microsoft’s been focused on cloud based LLM services.


This thread is full of hallucinations ;)


I've been exploring the internals of Claude Code and Codex via the transcripts they generate locally (these serve as the only record of your interactions with the products)[1].

Given the stance of the article, just the transcript formats reveal what might be a surprisingly complex system once you dig in.

For Claude Code, beyond the basic user/assistant loop, there's uuid/parentUuid threading for conversation chains, queue-operation records for handling messages sent during tool execution, file-history-snapshots at every file modification, and subagent sidechains (agent-*.jsonl files) when the Task tool spawns parallel workers.
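To give a flavor of what those records look like, here is a minimal sketch of rebuilding one conversation chain by following parentUuid links. The field names match what I see in my local transcripts, but the directory layout is an assumption, so treat this as illustrative rather than a spec:

    # Sketch: rebuild one conversation chain from a Claude Code session
    # transcript by walking parentUuid pointers. Field names are from my
    # local transcripts; the path layout is an assumption.
    import json
    from pathlib import Path

    def load_records(transcript: Path) -> dict:
        """Index every JSONL record in a session transcript by its uuid."""
        records = {}
        for line in transcript.read_text().splitlines():
            if not line.strip():
                continue
            rec = json.loads(line)
            if "uuid" in rec:
                records[rec["uuid"]] = rec
        return records

    def chain_from(leaf_uuid: str, records: dict) -> list:
        """Walk parentUuid pointers from a leaf record back to the root."""
        chain = []
        node = records.get(leaf_uuid)
        while node is not None:
            chain.append(node)
            node = records.get(node.get("parentUuid"))
        return list(reversed(chain))

    # Example call; adjust the path to wherever your transcripts live:
    # recs = load_records(Path.home() / ".claude" / "projects" / "<project>" / "<session>.jsonl")

Sidechains add another layer on top of this, since the agent-*.jsonl files carry their own records alongside the main session.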

So "200 lines" captures the concept but not the production reality of what is involved. It is particularly notable that Codex has yet to ship queuing, as that product is getting plenty of attention and is still highly capable.

I have been building Contextify (https://contextify.sh), a macOS app that monitors Claude Code and Codex CLI transcripts in real-time and provides a CLI and skill called Total Recall to query your entire conversational history across both providers.

I'm about to release a Linux version and would love any feedback.

[1] With the exception of Claude Code Web, which does expose "sessions" or shared transcripts between local and hosted execution environments.


IMO these articles are akin to "Twitter in 200 lines of code!" and "Why does Uber need 1000 engineers?" type articles.

They're cool demos/POCs of real-world things (and indeed are informative to people who haven't built AI tools). The very first version of Claude Code probably even looked a lot like this 200-line loop, but things have evolved significantly from there.


> IMO these articles are akin to "Twitter in 200 lines of code!"

I don't think it serves the same purpose. Many people understand the difference between a 200-line Twitter prototype and the real deal.

But many of those may not understand what the LLM client tool does and how it relates to the LLM server. It is generally consumed as one magic black box.

This post isn't to tell us how everyone can build a production-grade claude-code; it tells us what part is done by the CLI and what part is done by the LLMs, which I think is a rather important ingredient in understanding the tools we are using and how to use them.


Nice, I have something similar [1], a super-fast Rust/Tantivy-based full-text search across Claude Code + Codex-CLI session JSONL logs, with a TUI (for humans) and a CLI/JSONL mode for agents.

For example there’s a session-search skill and corresponding agent that can do:

    aichat search --json [search params]
So you can ask Claude Code to use the searcher agent to recover arbitrary context of prior work from any of your sessions, and build on that work in a new session. This has enabled me to completely avoid compaction.

[1] https://github.com/pchalasani/claude-code-tools?tab=readme-o...


That is a cool tool. Also, one can set "cleanupPeriodDays" in ~/.claude/settings.json to extend how long transcripts are kept around. There is so much information these tools keep around that we could use.
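For anyone looking for it, that setting is just a top-level key in the JSON file; something like this (the value is in days, and 365 is an arbitrary example):

    {
      "cleanupPeriodDays": 365
    }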

I came across this one the other day: https://github.com/kulesh/catsyphon


This is very interesting, especially if you could then use an LLM across that search to figure out what has and maybe has not been completed, and then reinject those findings into a new Claude Code session.


I haven't written the entry yet but it is pretty incredible what you can get when letting a frontier model RAG your complete CLI convo history.

You can find out not just what you did and did not do but why. It is possible to identify unexpectedly incomplete work streams, build a histogram of the times of day you get most irritated with the AI, etc.
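As a rough sketch of the histogram idea, assuming each transcript record carries "type" and "timestamp" fields (which is what I see locally), you can bucket user messages by hour before handing anything to a model:

    # Sketch: hour-of-day histogram of user messages across transcripts.
    # Assumes "type" and "timestamp" fields exist on each JSONL record and
    # that transcripts live under ~/.claude/projects; adjust as needed.
    import json
    from collections import Counter
    from datetime import datetime
    from pathlib import Path

    hours = Counter()
    for transcript in (Path.home() / ".claude" / "projects").rglob("*.jsonl"):
        for line in transcript.read_text().splitlines():
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                continue
            if rec.get("type") == "user" and "timestamp" in rec:
                ts = datetime.fromisoformat(rec["timestamp"].replace("Z", "+00:00"))
                hours[ts.hour] += 1

    for hour in sorted(hours):
        print(f"{hour:02d}:00  {hours[hour]}")

The "irritated" part is then a matter of having a model classify the message text in each bucket, which is where the frontier model earns its keep.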

I think it is very cool and I have a major release coming. I'd be very appreciative of any feedback.


For that you'd be better off having the LLM write TODO stubs in the codebase and search for that. In fact, most of the recent models just do this, even without prompting.


> So "200 lines" captures the concept but not the production reality of what is involved.

How many lines would you estimate it takes to capture that production reality of something like CC? I ask because I got downvoted for asking that question on a different story[1].

I asked because in that thread someone quoted the CC dev(s) as saying:

>> In the last thirty days, I landed 259 PRs -- 497 commits, 40k lines added, 38k lines removed.

My feeling is that a tool like this, while it won't be 200 lines, can't really be 40k lines either.

[1] If anyone is interested, https://news.ycombinator.com/item?id=46533132


My guess is <5k for a coherent and intentional expert human design. Certainly <10k.

It’s telling that they can’t fix the screen flickering issue, claiming “the problem goes deep.”


I think it is interesting. Is there any other company in a position today that could put together endorsement quotes from such high ranking people across tech?

Also: Tim Cook / Apple is noticeably absent.


That's because of financial links. They are so intertwined propping up the same bubble that they are absolutely going to share quotes instantly. FWIW I just skimmed through, and the TL;DR sounds to me like "Look at the cool kid, we play together, we are cool too!" without any real information, anything meaningful or insightful, just boring marketing BS.


> They are so intertwined propping up the same bubble they are absolutely going to share quotes instantly.

Reading this line, I had a funny image form of some NVidia PR newbie reflexively reaching out to Lisa Su for a supporting quote and Lisa actually considering it for a few seconds. The AI bubble really has reached a level of "We must all hang together or we'll surely hang separately".


Why is that interesting?


It could be an indicator that Apple is not as leveraged up on NVIDIA as to provide a quote. Cook did make a special one of a kind product for the current POTUS, so he is nothing if not pragmatic.

