Hacker News | ramoz's comments

> DeepSeek, Qwen, Kimi, GLM — running on the LangChain, vLLM, llama.cpp, and Ollama stack

"running on the LangChain" ??

EDIT: look, I think the general discussion is important, so I don't want to denounce the article. I, for one, am excited for better control, ownership, and accessibility of models. The ride labs take us on can be quite frustrating. Maybe there's even a signal that model progression is stalling (i.e., Opus 4.7). If that's true, then some of the notions made in the article are important to discuss. Ref https://x.com/ClementDelangue/status/2046622235104891138?s=2...

EDIT: this is not a complaint about the grammar. Look at my reply in the comments.


There was a bigger opportunity here to mention OpenCode, Pi, etc: open harnesses that provide accessibility to the OSS models, a platform others can build around, and something enterprises can adopt in reliable ways, for the most dominant use case of AI today.

More than that, it's pretty clear that there is an insane underinvestment in the harness layer. I've been iterating on my own ideas in that area through the lens of increasing reliability, and holy crap, there is so much low-hanging fruit. I literally can't figure out a sustainable way to do the work without commercializing at that layer.

Running on the stack consisting of LangChain etc.

Yeah not sure about both ollama and llama.cpp though lol


"the" is connected to "stack", not "LangChain". "LangChain" is an adjective that modifies "stack".

Cutting it off at "the LangChain" is like if I took the first sentence of your edit and said "look, I think the general" ?? You think the general?


I wasn't complaining about grammar. I didn't even notice that.

Indeed, one day people told me they're running on the LAMP stack, and I said what's that and they said Linux Apache MySQL and PHP and I said "running on the Linux"? Everyone laughed really loud and the people who told me that were run out of the room. I then went to their desks and pissed on it because I'm quite clever that way. They let you do that when you're smart.

Rigghht. If you're trying to make a point, go ahead and be straighter about it.

Straighter about it??

Can you contribute to our community when writing this stuff? Put it in a blog post, think about it, etc. You’ve been around long enough to know this is trash shit.

What provider do you use for Kimi?

The provider is a massive issue. People moving off Claude tend to assume this is solved.

Claude's uptime is terrible. The uptime of most other providers is even worse...and you get all the quantization, don't know what model you are actually getting, etc.


Kimi 2.5 was like using Sonnet 4 on a flaky ADSL line. I haven't tried K2.6 yet, but the physical unreliability of the connection was too off-putting.

OpenRouter, and I'm toying around with Hermes. Seems good so far, but I haven't really gotten into anything heavy yet. Though the "freedom" of not sweating the token pause, with the costs not being too high, is real.

Straight from them. I know other providers like io.net can be faster, but I like to directly support the project.

Thx. I'll try it with my personal projects (because due to the data collection and ToS, most providers are forbidden at my company), if I can opt out of training on my input.

I'm just getting a bit tired of using Opus 2.6, which eats my whole allowance and then some £££ going through the 4 kB prompt to review a ~13 kB text file twice, and that's on top of the sometimes utterly bonkers, bad, lazy answers that I don't even get from the local Gemma 4 E4B.


Opus 4.7 is very rough to work with, specifically for long-horizon work (we were told it was trained specifically for this, needing less handholding).

I don't have trust in it right now. More regressions, more oversights; it's pedantic in weird ways. Ironically, it requires more handholding.

Not saying it's a bad model; it's just not simple to work with.

for now: `/model claude-opus-4-6[1m]` (you'll get different behavior around compaction without [1m])


From what I've seen, none of this is that complex: one could simply 'draw a circle around your house', pull all the "anonymized" device pings inside it, and just trace those.
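The "circle around your house" attack can be sketched in a few lines. This is a toy illustration of the idea only; the ids, coordinates, and record shape are all made up:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Hypothetical "anonymized" ad-SDK pings: stable device id plus location.
pings = [
    {"ad_id": "a1", "lat": 40.7128, "lon": -74.0060},   # at the target address
    {"ad_id": "a1", "lat": 40.7306, "lon": -73.9352},   # same device, elsewhere
    {"ad_id": "b2", "lat": 34.0522, "lon": -118.2437},  # unrelated device
]
home = (40.7128, -74.0060)

# 1. "Draw the circle": keep ids seen within ~150 m of the house.
ids_at_home = {p["ad_id"] for p in pings
               if haversine_km(home[0], home[1], p["lat"], p["lon"]) < 0.15}

# 2. Trace those ids everywhere else they appear in the dataset.
trail = [p for p in pings if p["ad_id"] in ids_at_home]
```

The "anonymization" does nothing here: as long as the id is stable across pings, one known location is enough to follow the device everywhere.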

Plannotator (open source, runs locally) has code review: https://github.com/backnotprop/plannotator

and a code tour feature about to ship: https://x.com/backnotprop/status/2043759492744270027/video/1

- integrated comment feedback for agents

- inline chat

- integrated AI review (uses Codex and Claude Code defaults)

Stage's (OP's product) navigation tour is nice UX, about a day's worth of work in addition to the incoming code tour.


This is for AI agent work, though. That's cool, but not every team that wants better UX for complex work uses agents. Even if it "just works" for real scenarios, the marketing could be better.

Fair. There are users who simply just use the diff and integrated GitHub view/comment/approval-sync experience for local reviews of PRs. But it's _marketed_ as an integrated agent experience.

> So, cyber security of tomorrow will not be like proof of work in the sense of "more GPU wins"; instead, better models, and faster access to such models, will win

tomayto, tomahto


So kind of like how you would get nowhere by buying more gpus if there's already ASICs in play.

Unfortunately, verifiable privacy is not physically possible on MacBooks of today. Don't let a nice presentation fool you.

Apple Silicon has a Secure Enclave, but not a public SGX/TDX/SEV-style enclave for arbitrary code, so these claims are about OS hardening, not verifiable confidential execution.

It would be nice if it were possible. There are a lot of cool innovations possible beyond privacy.


I wrote a whole SDK for using SGX, it's cool tech. But in theory on Apple platforms you can get a long way without it. iOS already offers this capability and it works OK.

macOS has a strong enough security architecture that something like Darkbloom would have at least some credibility if there was a way to remotely attest a Mac's boot sequence and TCC configuration combined with key-to-DR binding. The OS sandbox can keep apps properly separated if the kernel is correct and unhacked. And Apple's systems are full of mitigations and roadblocks to simple exploitation. Would it be as good as a consumer SGX enclave? Not architecturally, but the usability is higher.


As if you get privacy with the inference providers available today? I have more trust in a randomly selected machine on a decentralized network not being compromised than in a centralized provider like OpenAI pinky promising not to read your chats.

Inference providers don't claim private inference. However, they must uphold certain security and legal compliance requirements.

You have no guarantees over any random laptop connected across the world.


I would say the chances of OpenAI itself getting hacked and your secrets in logs getting leaked are about the same or less as the chances of a randomly selected machine on a decentralized network being reverse-engineered by a determined hacker. There's no risk-free option, every provider comes with risks. If you care about infosec you have to do frequent secret rotation anyway.

Every hardware key will be broken if there is enough incentive to do so. Their claims read like pure hubris.

Who cares about AI privacy? Most people don’t. If you do, run locally.

Macs do not have an accessible hardware TEE.

Macs have secure enclaves.


Good point!

But they argue that:

> PT_DENY_ATTACH (ptrace constant 31): Invoked at process startup before any sensitive data is loaded. Instructs the macOS kernel to permanently deny all ptrace requests against this process, including from root. This blocks lldb, dtrace, and Instruments.

> Hardened Runtime: The binary is code-signed with hardened runtime options and explicitly without the com.apple.security.get-task-allow entitlement. The kernel denies task_for_pid() and mach_vm_read() from any external process.

> System Integrity Protection (SIP): Enforces both of the above at the kernel level. With SIP enabled, root cannot circumvent Hardened Runtime protections, load unsigned kernel extensions, or modify protected system binaries. Section 5.1 proves that SIP, once verified, is immutable for the process lifetime.

gives them memory protection.

To me that is surprising.
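The quoted PT_DENY_ATTACH mechanism is essentially one syscall. A minimal sketch via ctypes, assuming macOS (the constant is Apple-specific, so this guards on platform and is a no-op elsewhere):

```python
import ctypes
import sys

PT_DENY_ATTACH = 31  # Apple-specific ptrace request; not defined on other OSes

def deny_attach() -> int:
    """Best-effort anti-debugging: ask the macOS kernel to refuse all
    future ptrace attaches to this process. Returns -1 off macOS."""
    if sys.platform != "darwin":
        return -1
    libc = ctypes.CDLL(None, use_errno=True)
    # 0 on success; subsequent lldb/dtrace attach attempts are denied.
    return libc.ptrace(PT_DENY_ATTACH, 0, 0, 0)
```

Note this is purely a request to the local kernel: it only helps if it runs before anything attaches, and a modified kernel can simply ignore it.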


Looking at their paper at [1], there's a gaping hole: there's no actual way to verify the contents of the running binaries. The binary hash they include in their signatures is self-reported, and can be modified. That's simply game over.

[1] https://github.com/Layr-Labs/d-inference/blob/master/papers/...


A note, as others have posted on this thread: I mention this as a concrete and trivial flaw in their whole strategy, but the issue is fundamental: there's no hardware enclave for third-party code available to do the type of attestation that would be necessary. Any software approach they develop will ultimately fall to that hole.
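The self-reported-hash hole is easy to demonstrate. A toy sketch (all names and payloads invented, not their actual protocol): the node signs whatever hash it claims to be running, so nothing binds the claim to the bytes actually executing.

```python
import hashlib

def self_reported_attestation(binary: bytes, claimed_hash: str) -> dict:
    # The node reports a hash of its own choosing; `binary` is never
    # independently measured by anyone.
    return {"binary_hash": claimed_hash}

def verify(report: dict, expected_hash: str) -> bool:
    # The verifier can only compare the claim against the expected value.
    return report["binary_hash"] == expected_hash

honest = hashlib.sha256(b"genuine inference binary").hexdigest()

# A tampered node runs different code but reports the honest hash:
tampered_report = self_reported_attestation(b"backdoored binary", honest)
assert verify(tampered_report, honest)  # verification still passes
```

A hardware enclave closes this gap by having the CPU itself measure the loaded code and sign that measurement with a key software can't touch; without that root of trust, any software-only scheme reduces to the above.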

Couldn't someone just uhh... patch their macOS/kernel, mock these things out, then behold, you can now access all the data?

If it's not running fully end to end in some secure enclave, then it's always just a best effort thing. Good marketing though.


Right.

Apple is perfectly capable of doing remote attestation properly. iOS has DCAppAttest which does everything needed. Unfortunately, it's never been brought to macOS, as far as I know. Maybe this MDM hack is a back door to get RA capabilities, if so it'd certainly be intriguing, but if not as far as I know there's no way to get a Mac to cough up a cryptographic assertion that it's running a genuine macOS kernel/boot firmware/disk image/kernel args, etc.

It's a pity because there are a lot of unique and interesting apps that'd become possible if Apple did this. Darkbloom is just one example of what's possible. It'd be a huge boon to decentralization efforts if Apple activated this, and all the pipework is laid already, so it's really a pity they don't go the extra mile here.


> iOS has DCAppAttest which does everything needed. Unfortunately, it's never been brought to macOS, as far as I know.

Apple's docs claim it's been available on macOS since macOS 11. Am I missing something here?

https://developer.apple.com/documentation/devicecheck/dcappa...


All lies. They mean the symbols exist and can be linked against, but

https://developer.apple.com/documentation/devicecheck/dcappa...

> If you read isSupported from an app running on a Mac device, the value is false. This includes Mac Catalyst apps, and iOS or iPadOS apps running on Apple silicon.


That really sucks! TIL. So app attestation is iOS 14.0+, iPadOS 14.0+, tvOS 15.0+ and watchOS 9 only.

Yes. Running attested workloads on macOS if you are not Apple is nontrivial.

You can probably just tap the HTTP(S) connection to spy on the data coming through. I think it's a mistake to assume any kind of privacy for this service.

The biggest argument for remote attestation I can think of is to make sure nobody is returning random bullshit and cashing in prompt money on a massive scale.


> PT_DENY_ATTACH

All you have to do is attach to the process before it does that, and then prevent this call from going through.


They quite frankly have no idea what they are talking about.

I'm not arguing anything. This is how it works. There is no but.

Protection here is conditional, best-effort. There are no true guarantees, nor actual verifiability.


Aside from the sentiment and arguments made:

You don't need to train new models. Every single frontier model is susceptible to the same jailbreaks they were 3 years ago.

Only now, an agent reading the CEO's email is much more dangerous because it is more capable than it was 3 years ago.


Are they? I'm sure they're vulnerable to certain jailbreaks, but many common ones were demonstrably fixed.


I retract that.

I think what I meant to say was, they're as simple to jailbreak as they were three years ago.

Different methods, still simple. I'm working with researchers who are able to get very explicit things out of them. Again, it feels much worse than before, given the capability of these models.

There are basically guardrails encoded into the fine-tuned layers that you can essentially weave through via prompting. These 'guardrails' are where labs work hard for benevolent alignment, yet where it falls short (while enabling exceptional capability alignment). Again, nothing really different from three years ago.


Still doing https://plannotator.ai

I use it daily and so do others, for better UX, feedback, and review surfaces for AI coding agents.

1. Plan review & iterative feedback.

2. Now code review with iterative feedback.

Free and open source: https://github.com/backnotprop/plannotator

