
But how can you be a responsible builder if you don't trust the LLMs to do the "right thing"? Suppose you're the head of a software team and you've picked the best candidates for a given project. In that scenario I can see how one is able to trust the team members to orchestrate the implementation of your ideas and intentions without being intimately familiar with the details yourself. Can we place the same trust in LLM agents? I'm not sure. Even if one could somehow prove that LLMs are very reliable, the fact that AI agents aren't accountable beings renders the whole situation vastly different from the human equivalent.

Trust but verify:

I test all of the code I produce via LLMs, usually in fairly tight cycles. I also review the unit test coverage manually, so that I have a decent sense that it really is testing things - the goal is less perfect unit tests and more just quickly catching regressions. If I have a lot of complex workflows that need testing, I'll have it write unit tests and spell out the specific edge cases I'm worried about, or set up cheat codes I can invoke to test those workflows out in the UI/CLI.
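
To give a concrete (made-up) example of what "spelling out the edge cases" looks like for me - here parse_port is a hypothetical stand-in for whatever helper the LLM wrote, and I'm using plain asserts rather than any particular test framework:

    #include <cassert>
    #include <optional>
    #include <string>

    // Hypothetical stand-in for an LLM-written helper under test.
    std::optional<int> parse_port(const std::string& s) {
        if (s.empty() || s.size() > 5) return std::nullopt;
        int value = 0;
        for (char c : s) {
            if (c < '0' || c > '9') return std::nullopt;
            value = value * 10 + (c - '0');
        }
        if (value < 1 || value > 65535) return std::nullopt;
        return value;
    }

    int main() {
        // The edge cases I'm worried about, written down explicitly so
        // regressions get caught on the next tight cycle.
        assert(!parse_port("").has_value());       // empty input
        assert(!parse_port("abc").has_value());    // non-numeric
        assert(!parse_port("70000").has_value());  // out of range
        assert(parse_port("8080") == 8080);        // happy path
        return 0;
    }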

Trust comes from using them often - you get a feeling for what a model is good and bad at, and what LLMs in general are good and bad at. Most of them are a bit of a mess when it comes to UI design, for instance, but they can throw together a perfectly serviceable "About This" HTML page. Any long-form text they write (such as that About page) is probably trash, but that's super-easy to edit manually. You can often just edit down what they write: they're actually decent writers, just very verbose and unfocused.

I find it similar to management: you have to learn how each employee works. Unless you're in the Top 1%, you can't rely on every employee giving 110% and always producing perfect PRs. Bugs happen, and even NASA-strictness doesn't bring that down to zero.

And just like management, some models are going to be the wrong employee for you because they think your style guide is stupid and keep writing code how they think it should be written.


You don't simply put a body in a seat and get software. There are entire systems enabling this trust: college, resumes, work samples, referrals, interviews, tests and CI, monitoring, mentoring, and performance feedback.

And accountability can still exist? Is the engineer that created or reviewed a Pull Request using Claude Code less accountable than one that used PICO?


> And accountability can still exist? Is the engineer that created or reviewed a Pull Request using Claude Code less accountable than one that used PICO?

The point is that in the human scenario, you can hold the human agents accountable. You cannot do that with AI. Of course, you as the orchestrator of agents will be accountable to someone, but you won't have the benefit of holding your "subordinates" accountable, which is what you do in a human team. IMO, this renders the whole situation vastly different (whether good or bad I'm not sure).


You can switch to another LLM provider or stop using them altogether. It's even easier than firing a developer.

It is as easy as getting rid of Microsoft Teams at your org.

Of course he is - because he invested so much less.

What I really want from Codex is checkpoints à la Copilot. There are a couple of issues [0][1] open about it on GitHub, but it doesn't seem to be a priority for the team.

[0] https://github.com/openai/codex/issues/2788

[1] https://github.com/openai/codex/issues/3585


They routinely mention in GitHub that they heavily prioritize based on "upvotes" (emoji reacts) in GitHub issues, and they close issues that don't receive many. So if you want this, please "upvote" those issues.

Gemini CLI has this

I’ve never understood checkpoints / forks. When do you use them?

Usually, I tell the agent to try out an idea, and if I don't like the implementation or approach I want to undo the code changes. Then I start again, feeding it more information so it can execute a different idea, or the same one with a better plan. This also helps keep the context window small.

Can’t you use git for that? I do that often and just revert changes. It does require me to commit often but that’s probably good anyways.

It's about not polluting the context. The AI doesn't need information about things that didn't work in the new request's context.

That's interesting. I use those moments to show it what not to do. Does it not just repeat the mistakes?

> That's probably because we have yet to discover any universal moral standards.

This is true. Moral standards don't seem to be universal throughout history. I don't think anyone can debate this. However, this is different from claiming there is an objective morality.

In other words, humans may exhibit varying moral standards, but that doesn't mean that those are in correspondence with moral truths. Killing someone may or may not have been considered wrong in different cultures, but that doesn't tell us much about whether killing is indeed wrong or right.


It seems worth thinking about it in the context of evolution. To kill other members of our species limits the survival of our species, so we can encode it as “bad” in our literature and learning. If you think of evil as “species limiting, in the long run”, then maybe you have the closest thing to a moral absolute. Maybe over the millennia we’ve had close calls and learned valuable lessons about what kills us off and what keeps us alive, and the survivors have encoded them in their subconscious as a result. Prohibitions on incest come to mind.

The remaining moral arguments seem to be about all the new and exciting ways that we might destroy ourselves as a species.


Using some formula or fixed law to compute what's good is a dead end.

> To kill other members of our species limits the survival of our species

Unless it helps allocate more resources to those more fit, to better ensure survival, right? ;)

> species limiting, in the long run

This allows unlimited abuse of other animals who are not our species but can feel and evidently have sentience. By your logic there's no reason to feel morally bad about it.


> Using some formula or fixed law to compute what's good is a dead end.

Who said anything about a formula? It all seems conceptual and continually evolving to me. Morality evolves just like a species, and not by any formula other than "this still seems to work to keep us in the game".

> Unless it helps allocate more resources to those more fit, to better ensure survival, right? ;)

Go read a book about the way people behave after a shipwreck and ask if anyone was "morally wrong" there.

> By your logic there's no reason to feel morally bad about it.

And yet we mostly do feel bad about it, and we seem to be the only species who does. So perhaps we have already discovered that lack of empathy for other species is species self-limiting, and built it into our own psyches.


> Who said anything about a formula?

In this thread some people say this "constitution" is too vague and should have specific norms. So yeah... those people. Are you one of them?)

> It all seems conceptual and continually evolving to me. Morality evolves just like a species

True

> keep us in the game"

That's a formula right there my friend

> Go read a book about the way people behave after a shipwreck and ask if anyone was "morally wrong" there.

?

> And yet we mostly do feel bad about it, and we seem to be the only species who does. So perhaps we have already discovered that lack of empathy for other species is species self-limiting, and built it into our own psyches.

Or perhaps the concept of "self-limiting" is meaningless.


> In this thread some people say this "constitution" is too vague and should have specific norms. So yeah... those people. Are you one of them?)

I have no idea what you're talking about, so I guess I'm not "one of them".

> That's a formula right there my friend

No, it's an analogy, or a colloquial metaphor.


> I have no idea what you're talking about

Read the top level comment and "objective anchors". It's always great to know the context before replying.

https://news.ycombinator.com/item?id=46712541

There are no objective anchors, because we don't have objective truth. Every time we think we do, then 100 years later we're like, wtf were we thinking.

> No, it's an analogy, or a colloquial metaphor

Formula IS a metaphor... I wrote "formula or fixed law" ... what do you think we're talking about, actual math algebra?


> There are no objective anchors, because we don't have objective truth. Every time we think we do, then 100 years later we're like, wtf were we thinking.

I believe I'm saying the same thing, and summing it up in the word "evolutionary". I have no idea what you're talking about when you suggest that I'm perhaps "one of those people". I understand the context of the thread, just not your unnecessary insinuation.

> Formula IS a metaphor... I wrote "formula or fixed law" ... what do you think we're talking about, actual math algebra?

There is no "is" here. There "is" no formula or fixed law. Formula is metaphor only in the sense that all language is metaphor. I can use the word literally this context when I say that I literally did not say anything about a formula or fixed law, because I am literally saying there is no formula or fixed law when it comes to the context of morality. Even evolution is just a mental model.


> you suggest

No, I asked, because it was unclear.


I hate it too, but to my surprise, all of my colleagues (with an iPhone) said they love it because it looks great.


I’ll add a third perspective that probably often goes unsaid: I love it on Apple TV, and kinda like it on iPhone and Mac. It definitely needs to be improved, though. There are a whole bunch of usability issues, but they shouldn’t be too hard to fix, and Apple has shown a willingness to iterate until they get it right. Unlike Microsoft, which just moves on to the next thing (the system settings UI design in Windows 11 is fine... but can they pleeeease just integrate all settings into that UI now... how many generations of settings / control panels are there in Windows now?)

The huge corner radius is one thing I do wish they would revert in macOS.


I love it both on an iPhone and a Mac. It runs great and it looks great. It’s a mistake to look at what people say on the internet. They are usually hopelessly contrarian.


The navigation top bar on iOS is a huge improvement overall, especially for views that stack a safe-area bar on the second line.


I had friends use those exact words but replaced with "Windows Phone" and "Windows 8 Computer".


> It’s a mistake to look at what people say on the internet.

Touché.


In terms of performance, it's quite far from something like Blend2D or Vello though.


Blend2D is a CPU-only rendering engine, so I don't think it's a fair comparison to ThorVG. If we're talking about CPU rendering, ThorVG is faster than Skia (no idea about Blend2D), but at high resolutions CPU rendering has serious limitations anyway. Blend2D is still more of an experimental project whose JIT hurts compatibility, and Vello is not yet production-ready and is WebGPU-only. There's no point arguing about speed today if it's not usable in real-world scenarios.


How does JIT kill compatibility if it's only enabled on x86 and AArch64? You can compile Blend2D without it and it would just work.

So no, it doesn't kill any compatibility - it only shows a different approach.

BTW, GPU-only renderers suck, and many renderers that have both GPU and CPU engines suck when the GPU is not available or has bugs. Strong CPU rendering performance is simply necessary for any kind of library if you want true compatibility across various platforms.

I have seen broken rendering on the GPU many, many times without any ability to switch to the CPU. And the biggest problem is that the more exotic the HW you run it on, the less chance that somebody will be able to fix it (talking about GPUs).


As an aside, does anyone here use drawing tablets for work? I got a cheap Wacom tablet and found it super useful for sketching ideas or understanding something before starting to implement new code.


For the last few years, I have been using small Wacom Intuos S tablets as a replacement for mice, trackballs or touchpads.

I configure the tablets in "Relative" mode, in which they behave exactly like a mouse, unlike in their default "Absolute" mode. I configure left click to be generated by touching the tablet with the stylus, and the 2 buttons on the stylus to generate right click and double left click.

The advantage over a mouse or trackball is the much more comfortable position of the hand and also the much higher speed and accuracy of positioning. Moving the pointer to any location on the screen is instantaneous and effortless, due to the lightness of the stylus and to the lack of contact with the tablet.

Because the stylus is extremely light, I can touch type on the keyboard while still keeping the stylus between my fingers. This allows faster transitions between keyboard and graphic pointer than with a standard mouse (because the time needed to grip the mouse is eliminated). Only when I type longer texts do I drop the stylus on the tablet.

The tablet is no bigger than a traditional mouse pad, so it does not need a bigger space on the desk.

After switching exclusively to a graphics tablet, I would never want to go back to a mouse, trackball, trackpoint or touchpad. I only regret that I never thought to try this earlier.

Besides being a better mouse than a mouse, a tablet obviously allows you to do things for which a mouse is inappropriate, e.g. drawing or handwriting (e.g. signing a document).

I should mention that I have always used the Wacom tablets with Linux. I have never tried them on Windows, so I do not know whether they work as well there.


This is pretty cool! Never considered this


Possibly not the use-case you're thinking of, but I've been using a Wacom Intuos tablet as a mouse replacement for a few years now on macOS and on Linux. I use it in pen mode (where the area of the tablet maps to the screen); you can also configure it in mouse mode (like a touchpad, where the movement is relative to where the cursor is on the screen), which should work better with multi-display setups, though it's not to my preference. I have my pen/stylus set up so that tapping it onto the tablet acts as a left/primary click, the larger button on the pen is right click, and holding the smaller button and dragging on the tablet is scroll/pan.

macOS is well-supported once the drivers are installed, though sometimes the driver doesn't seem to pick up the tablet (either after the laptop or the tablet goes to sleep). Restarting the driver fixes this, though this bug seems to have been fixed in the latest driver release. Linux works out of the box (at least on KDE/Arch), though sadly customization support on Wayland isn't quite there yet compared with what you could do on X11 (with the xsetwacom utility). For drawing it should work perfectly, but as far as I know you can't customize the button functionality, which is a bummer when using it as a pointing device.

The main benefit for me is that it feels much more ergonomic compared with a regular mouse or even a vertical mouse or trackball and I don't get anywhere near as much wrist or shoulder pain - especially in the cold temps in the middle of winter where I am. There is a bit of an adjustment period and I find for interacting with small UI elements such as buttons it can be a bit tricky, but for me the benefits outweigh the downsides. The only other downside I can think of is that when using the tablet over bluetooth (wired is also an option and tracks a little more smoothly) the battery only lasts 1½ days compared with the weeks/months a wireless mouse would go for.


I'm an artist and haven't used a mouse since somewhere in the 00's when I developed some RSI in my index finger while working in the Flash animation mines.

Annoyances: games that require you to push the cursor against the edge of the screen to move the view, app/website developers who force tiny scrollbars that constantly hide themselves despite me setting the OS to never hide scrollbars, having to restart the tablet drivers most of the time when I move between having the laptop docked with the big screen and big tablet on the desk, and taking it out to a cafe or the park and using the smaller tablet that lives in my laptop bag.


Yes.

I've dreamed of using a stylus and tablet since reading _The Mote in God's Eye_ when I was young, and have preferred to use them since using a "Koalapad" attached to a Commodore 64 in the school computer lab when I was young.

The NCR-3125 I had was donated to The Smithsonian by the guy I sold it to, along w/ a lot of other materials on pen computing --- PenPoint was my favourite OS alongside NeXTstep, and the high-watermark of my computing experience was using the NCR running PenPoint as a portable, then cabling it up to my NeXT Cube to transfer data --- had a Wacom ArtZ attached to the Cube, so still had a stylus, just it wasn't a screen.

Futurewave Smartsketch is still my favourite drawing program, and I was very glad that its drawing system made its way through Flash and into Freehand/MX (which I still use by preference and despair of replacing). If you have a graphics tablet, be sure to try out:

https://www.wickeditor.com/#/

Hopefully the folks making Graphite will figure out that it's a core functionality for a drawing program to work w/ a graphics tablet --- haven't been able to do anything when I've tried.

I sketch (either on a Samsung Galaxy Note 10+ or Kindle Scribe or Wacom One or Samsung Galaxy Book 3 Pro 360), take notes (mostly on the Scribe), do block-programming (Wacom One or Book 3), or draw (on the Book 3).


Yes, but Wacom recently discontinued macOS driver support for older versions of the Intuos, and I had to downgrade to an older driver to make it work.

When it doesn’t work anymore, I’ll need to get something else, probably an iPad so I can also use it as a 2nd screen.


In case you need an alternative to the official driver, https://opentabletdriver.net/ is a well-maintained driver.


I am doing exactly the same, and also writing down raw ideas in it so I don't forget them.


> And please keep in mind that Blend2D is not really in development anymore - it has no funding so the project is basically done.

That's such a shame. Thanks a lot for Blend2D! I wish companies were less greedy and would fund amazing projects like yours. Unfortunately, I do think that everyone is a bit obsessed with GPUs nowadays. For 2D rendering the CPU is great, especially if you want predictable results and avoid having to deal with the countless driver bugs that plague every GPU vendor.


> There's also the issue of just how many billions of line segments you really need to draw every 1/120th of a second at 8K resolution

IMO, one of the biggest benefits of a high-performance renderer would be power savings (very important for laptops and phones). If I can run the same workload using half the power, then by all means I'd be happy to deal with the complications that the GPU brings. AFAIK though, no one really cares about that, and even efforts like Vello are just targeting fps gains, which correlate with reduced power consumption only indirectly.


Adding power draw into the mix is pretty interesting. Just because a GPU can render something 2x faster in a particular test doesn't mean you have consumed 50% less power, especially when we talk about dedicated GPUs that can have power draw in the hundreds of watts.

Historically 2D rendering on CPU was pretty much single-threaded. Skia is single-threaded, Cairo too, Qt mostly (they offload gradient rendering to threads, but it's painfully slow for small gradients, worse than single-threaded), AGG is single-threaded, etc...

In the end only Blend2D, Blaze, and now Vello can use multiple threads on the CPU, so finally CPU vs GPU comparisons can be made more fairly - and power draw is definitely a nice property for a benchmark. BTW, Blend2D was probably the first library to offer multi-threaded rendering on the CPU (just an option to pass to the rendering context, same API).
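
If I remember the Blend2D API correctly, opting in looks roughly like this (a minimal sketch; exact field names and overloads may differ between versions):

    #include <blend2d.h>

    int main() {
        BLImage img(1920, 1080, BL_FORMAT_PRGB32);

        // Same rendering context API; the worker thread count is just an
        // option passed at creation time.
        BLContextCreateInfo createInfo {};
        createInfo.threadCount = 4;

        BLContext ctx(img, createInfo);
        ctx.clearAll();
        ctx.setFillStyle(BLRgba32(0xFF4287F5u));
        ctx.fillCircle(960, 540, 300);
        ctx.end();  // flushes the render queue and waits for worker threads

        BLImageCodec codec;
        codec.findByName("PNG");
        img.writeToFile("output.png", codec);
        return 0;
    }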

As far as I know, nobody has done good benchmarking between CPU and GPU 2D renderers - it's very hard to do a completely unbiased comparison, and you would be surprised how good the CPU is in this mix. Modern CPU cores consume maybe a few watts, and you can render to a 4K framebuffer with a single CPU core. Put text rendering into the mix and the numbers start to get very interesting. GPU memory allocation should also be included, because rendering fonts on the GPU means pre-processing them as well, etc...

2D is just very hard, on both CPU and GPU you would be solving a little bit different problems, but doing it right is insane amount of work, research, and experimentation.


It's not a formal benchmark, but my Browser Engine / Webview (https://github.com/DioxusLabs/blitz/) has pluggable rendering backends (via https://github.com/DioxusLabs/anyrender) with Vello (GPU), Vello CPU, Skia (various backends incl. Vulkan, Metal, OpenGL, and CPU) currently implemented

On my Apple M1 Pro, the Vello CPU renderer is competitive with the GPU renderers on simple scenes, but falls behind on more complex ones. And especially seems to struggle with large raster images. This is also without a glyph cache (so re-rasterizing every glyph every time, although there is a hinting cache) which isn't implemented yet. This is dependent on multi-threading being enabled and can consume largish portions of all-core CPU while it runs. Skia raster (CPU) gets similarish numbers, which is quite impressive if that is single-threaded.


I think Vello CPU would always struggle with raster images, because it does a bounds check for every pixel fetched from a source image. They have at least described this behavior somewhere in Vello PRs.

The obsession with memory safety just doesn't pay off in some cases - if you can batch 64 pixels at once with SIMD, it just cannot be compared to a per-pixel processor that has a branch in the hot path.
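
A hypothetical sketch of that trade-off (not Vello's or Blend2D's actual code): the first function pays for a branch on every fetch, while the second clamps once up front and leaves the compiler a straight loop it can vectorize.

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Per-pixel fetch with a bounds check on every sample: the branch sits in
    // the inner loop and blocks vectorization.
    std::uint32_t fetch_checked(const std::vector<std::uint32_t>& src,
                                std::size_t w, std::size_t h,
                                std::size_t x, std::size_t y) {
        if (x >= w || y >= h)
            return 0;  // transparent fallback
        return src[y * w + x];
    }

    // Batched fetch of a whole span: bounds are clamped once, outside the hot
    // loop, so the inner copy is a branch-free loop the compiler can turn into SIMD.
    void fetch_span(const std::vector<std::uint32_t>& src, std::size_t w,
                    std::size_t y, std::size_t x0, std::size_t count,
                    std::uint32_t* dst) {
        std::size_t in_bounds = (x0 < w) ? std::min(count, w - x0) : 0;
        for (std::size_t i = 0; i < in_bounds; ++i)
            dst[i] = src[y * w + x0 + i];   // auto-vectorizable body
        for (std::size_t i = in_bounds; i < count; ++i)
            dst[i] = 0;                     // out-of-bounds tail filled once
    }

    int main() {
        std::vector<std::uint32_t> src(64 * 64, 0xFFFFFFFFu);
        std::uint32_t row[80];
        fetch_span(src, 64, 0, 32, 80, row);  // 32 in-bounds pixels, 48-pixel zero tail
        return fetch_checked(src, 64, 64, 70, 0) == 0 ? 0 : 1;
    }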


It's an argument you can make in any performance effort. But I think the "let's save power using GPUs" ship sailed even before Microsoft started buying nuclear reactors to power them.


Vello [0] might suit you although it's not production grade yet.

[0] https://github.com/linebender/vello


And yet none of those "outsiders" have figured out a way to economically remunerate developers for their work. Flathub had an initiative a few years ago to add payments to help developers fund their projects, but I haven't seen anything come out of it.


Most Linux distros require that the software they distribute is open source, and they link to the home pages of applications, so effectively donations are the only way to pay for those. There are paid distros (which are almost always about support, though there was a paid GNUstep distro many years ago).

On the other hand, Steam et al are app stores where developers can get paid.


Open source doesn't mean free.

> On the other hand, Steam et al are app stores where developers can get paid.

Yes, this is exactly my point. App stores have a reason to exist. They provide discoverability and a streamlined way to monetise your app, something that is sorely lacking in open source projects. A case in point is Krita, which is published as a paid app on the Microsoft Store. The revenue generated by the sales goes to fund the development of the project. Linux needs an equivalent.


I'm not sure what you're talking about. I can download Blender from almost any package manager, and the devs of Blender are paid.


This works for Blender because they're a big fish and receive money from big corporations. The vast majority of good open source projects are underfunded or unpaid. Linux distros need a way to streamline payments to open source apps. As I mentioned above, Flathub [0] had an initiative in this direction, but I'm not sure what happened to it.

[0] https://itsfoss.com/news/flathub-paid-apps

