Hacker News
LLM-generated code must not be committed without prior written approval by core (netbsd.org)
122 points by beardyw on May 18, 2024 | 111 comments


Every time I hear about AI-generated code, I'm reminded of this comment from Linus Torvalds (taken out of context, of course):

"You copied that function without understanding why it does what it does, and as a result your code is GARBAGE. AGAIN."


I don't know if it is just me, but since this AI "summer" started I have never pulled a single snippet from any of the available services. I'm genuinely curious what kind of code people are getting from them that is so much more valuable than writing things from scratch and using the corresponding documentation, examples, or even just the source code of the frameworks/libraries needed to accomplish a task.


'My customer has given me the documentation for arcane, antiquated format X (insert PDF here, but it includes 24-bit integers, hex-encoded data, and semantically significant whitespace). Here is a sample of the message format, and this struct should represent the contents. Please write me a Python parser which takes an input file in the format and produces the output. In particular, given input X, the output should be equivalent to Y. There should be a set of unit tests for important functionality, which should explain to a reader what is complicated about the format.' That is something ChatGPT-4 will just drop a working solution to in a few seconds.

If your job is to make an accurate parser for it, probably you want to hand code it. If your job is to make sense of the data the customer has provided you with, this is merely an impediment to your actual job, and ChatGPT has you covered. Yes, there'll be mistakes. But ChatGPT can do in a few seconds what'd take you hours.
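To make that concrete, here's a rough sketch of the kind of parser scaffolding being described. The record layout, field names, and hex encoding below are invented for illustration, not taken from any real spec:

```
# Hypothetical illustration: the record layout and field names are invented;
# a real parser would follow the customer's spec exactly.
import struct
from dataclasses import dataclass

@dataclass
class Record:
    msg_id: int      # 24-bit unsigned integer, hex-encoded in the file
    payload: bytes   # hex-encoded blob

def read_uint24(buf: bytes, offset: int) -> int:
    """Decode a big-endian 24-bit unsigned integer."""
    hi = buf[offset]
    lo = struct.unpack_from(">H", buf, offset + 1)[0]
    return (hi << 16) | lo

def parse_record(line: bytes) -> Record:
    """Parse one line: 3-byte hex id, whitespace, then a hex payload."""
    raw_id, hex_payload = line.split(None, 1)   # whitespace is significant here
    return Record(
        msg_id=read_uint24(bytes.fromhex(raw_id.decode()), 0),
        payload=bytes.fromhex(hex_payload.decode()),
    )

# A unit test that doubles as documentation of the format's quirks.
def test_parse_record():
    rec = parse_record(b"0001FF  DEADBEEF")
    assert rec.msg_id == 0x0001FF
    assert rec.payload == b"\xde\xad\xbe\xef"

if __name__ == "__main__":
    test_parse_record()
    print("ok")
```

The value isn't that this particular skeleton is right; it's that the model fills in the tedious decode/test boilerplate so you can spend your time checking it against the sample data.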


Would you apply the same criteria to code instead of file formats? As in "my customer has given me this code for architecture X and I need it working on architecture Y"?

For those who answer "yes" to the above, I'd encourage them to read the story of the Therac-25 [1], a machine where hardware mechanisms in model A were replaced by software mechanisms in model B, leading to a race condition that delivered massive doses of radiation to patients.

> Yes, there'll be mistakes.

"Over the following weeks the patient experienced paralysis of the left arm, nausea, vomiting, and ended up being hospitalized for radiation-induced myelitis of the spinal cord. His legs, mid-diaphragm and vocal cords ended up paralyzed. He also had recurrent herpes simplex skin infections. He died five months after the overdose." [1]

"But your honor, I specifically wrote 'add unit tests for important functionality'!"

[1] https://en.m.wikipedia.org/wiki/Therac-25


The story of the Therac-25 is an important one. I first read it in college and re-read it every 5-10 years.

That said, I don’t think it’s a strong rebuttal to the arguments here. Context matters. An earlier poster mentioned they’d use it for a React front end but not a backend deletion endpoint.

Knowing the tool and the appropriateness of the tool in context is one of the lessons to be learned from the Therac-25 tragedy.


'Please translate this AVX-256 algorithm to Arm NEON' is absolutely a prompt I'd give ChatGPT, and similar prompts have revealed new and useful intrinsics. Of course, results to be checked.

Assurance is a complex topic, and any safety-critical device should have a carefully thought-through architecture and a rigorous testing program which minimises the risk of incidents. It therefore seems scarcely relevant here, beyond the fact that a well-defined delivery system should be able to handle multiple human errors during implementation without crucial failure modes occurring.


This is a great example of LLM usefulness. It also illustrates how it's half of the solution. Certainly the sense-making part works reasonably well. What it lacks is the formal rigor part. With them both in place, the computer could generate a perfectly correct solution, not just approximately correct.

Computers are very good at formal rigor, and we have quite a few rigorous methods of program synthesis.

Whoever manages to connect these dots will own much of the industry.


I tried to get it to show me how to implement a feature I wanted in GCC once. It seemed like the code would get worse with every iteration. It did provide a very useful high level overview of the codebase and abstractions involved though. I don't think I'd ever have even gotten started without that help.


If you don't understand the format yourself how can you ever trust that GPT is actually giving you an accurate result?

The problem with using an ML model to parse stuff you don't understand is that you then have no way to verify the accuracy.

If there are no stakes to the results then that's fine to trust blindly, but if this is something you need for your job, it's risky and frankly stupid to trust it to ML.


Yes, if you take a problem you don't understand, ask a GPT to write a solution, do nothing to check the solution, trust it blindly, and then use the solution for some safety critical problem, you're playing with fire. But there's a spectrum, and please don't assume I'm at that end of it.

Lots of the time you understand the problem, but the problem is repetitive. Parsing a weird file format might well be that. Beyond that, you have solutions that are easily checked. For example, if I ask ChatGPT to optimise an algorithm for a certain CPU cache, I can easily read whether it did that. And then, there are parts of a software job that are crucial and subtle, and parts that are not.

As a practitioner, traditionally that leads to a shift in the focus and speed with which you approach a task - some pieces of code are 100 lines that took you 2 weeks to get to and were hard fought, some are 2000 lines which you wrote in a day.

Lastly, so much of solid software is being able to understand a probably unfamiliar domain, and ChatGPT can be a great buddy in terms of gaining problem context and finding the limits of your own understanding.

I don't use co-pilot like things, but I've found ChatGPT to be a massive enabler in terms of being able to be productive in unfamiliar problem-spaces.


Sometimes you're just sketching in a notebook, or working on a shell script, a Makefile, something you don't do often enough to have the syntax and idiosyncrasies at the tip of your fingers. Sometimes you're doing something very mechanistic but not easy to automate, like converting some filtering logic from one DSL to another. Sometimes there's just a lot of typing required to do something conceptually very simple. I find Copilot very useful in these side quests, but when I'm working on the hard stuff, the core functionality, it often just outputs nonsense that takes more time to review/fix than it would take to just write it myself. So it takes some learning of when to use it and when to ignore it but I've come around to appreciate it at times.


Copilot for me basically seems like semantic copy/paste. I’m doing some refactoring and make the same style change in a couple of files, then copilot kicks in and understands enough to suggest a legit completion for me.

I only use it sparingly though because it can too easily become a crutch where you start to feel a bit lost without it.


I now write most of my personal scripts in English and have ChatGPT compile it to whatever programming language. I don't use it for anything too complex but it makes it easier to write things like grabbing data from an API or website, manipulating it and putting it in a database, etc. Honestly so long as I provide enough detail, the code is more likely to work on the first run than something I wrote by hand.
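For example, a prompt along the lines of "fetch the items from this API and put them in a SQLite database" tends to come back as something of this shape; the endpoint URL, JSON fields, and table schema here are made up for illustration:

```
# Hypothetical sketch: the API URL and JSON fields are placeholders.
import json
import sqlite3
import urllib.request

API_URL = "https://api.example.com/items"   # placeholder endpoint

def fetch_items(url: str) -> list:
    """Download and decode a JSON list of items."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def store_items(items: list, db_path: str = "items.db") -> None:
    """Upsert the items into a small SQLite table."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    con.executemany(
        "INSERT OR REPLACE INTO items (id, name, price) VALUES (?, ?, ?)",
        [(i["id"], i["name"], i["price"]) for i in items],
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    store_items(fetch_items(API_URL))
```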


Having a working version of some code to refine is often helpful. I most often just use the output as better documentation that I can interact with. I'll definitely grant you that I never end up using AI generated code directly in production. I find the technology extremely valuable though, dare I say transformative.


Software development has become too diverse. Recently I had a case where, while programming in Matlab, I had to use some PowerShell commands. I have no idea how PowerShell even works, but I know enough programming that when phind.com spit out a 4-line script I was able to use it. All this happened in about 10 minutes. I've done something like this before, almost always trying to work around Matlab limitations, but those cases took a whole day and so much frustration. I find the AI experience very valuable just for the mental peace.

Edit: before someone accuses me of doing exactly what Torvalds is describing, I understand that code now. I was able to get the model to explain it to me to the same degree that I understand my own code. It was the same feeling as when my team built on my previous work and I just asked them how they did it.


It is helpful as a starting point when I’m switching to a different language I don’t use much. It saves me a lot of googling of "how do I do x and y in Python". I like an "editing" workflow when I’m not super comfortable with the stack yet - I get that blank page syndrome where I don’t know where to start. But so far they are not up to snuff for understanding my day-to-day work or the style of our codebase, so I don’t use it for that.


It is a good starting point for switching to a language you do not use much when that language is python or js, or maybe a few other languages with a lot of examples in their training set. For many languages (as well as less common libraries etc) it hallucinates too much. My go-to task when trying a new LLM is asking it to import a csv file in julia. You would be surprised (or maybe not) how many fail.


I've recently started writing eslint rules for my project at work. I'll ask ChatGPT how to do it, and it will get the specific implementation wrong every time, but it will give me a decent foundation to start on. It will usually correctly identify the AST node types I need to work with, saving me from having to sift through the possibilities.


I use it for things I want but don't care enough to learn. I recently wanted to present a bunch of labeled pngs as an HTML overview in a certain order, after grabbing them from a webservice with REST. Took me maybe twenty minutes with Opus, including prettying it up with css. I could have done it myself, sure, but not nearly that quickly.
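Roughly the shape of that kind of throwaway script, assuming a hypothetical listing endpoint and label scheme (both invented here):

```
# Hypothetical sketch: the endpoint, JSON fields, and sort order are placeholders.
import json
import urllib.request
from pathlib import Path

LIST_URL = "https://example.org/api/images"   # returns [{"label": ..., "url": ...}, ...]

def fetch(url: str) -> bytes:
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def build_overview(out_dir: Path = Path("overview")) -> None:
    """Download each labeled PNG and write a single-page HTML overview."""
    out_dir.mkdir(exist_ok=True)
    images = json.loads(fetch(LIST_URL))
    figures = []
    for img in sorted(images, key=lambda i: i["label"]):
        name = f'{img["label"]}.png'
        (out_dir / name).write_bytes(fetch(img["url"]))
        figures.append(
            f'<figure><img src="{name}" width="320">'
            f'<figcaption>{img["label"]}</figcaption></figure>'
        )
    (out_dir / "index.html").write_text(
        "<style>figure{display:inline-block;margin:8px;font-family:sans-serif}</style>\n"
        + "\n".join(figures)
    )

if __name__ == "__main__":
    build_overview()
```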


It’s just faster, involves less cognitive load, and frees you up to think more strategically rather than just focusing on implementation.


I'm a hobby programmer. I mostly ask ChatGPT for stuff that I could do myself but want to save some time on, or that I have absolutely no clue how to do, for example getting the type annotations right in Rust (still a super noob!).

In both cases, even after back and forth with the AI for half an hour, the result is absolute garbage. And never works. I don't understand at all how people get even barely usable stuff out of AI...


Well, ChatGPT surprised me when I asked it to write me a glTF parser. At first it printed out some Python code, but after asking it to do the same in C++, it printed out a perfectly working parser, listing buffer sizes and buffer view sizes of a GLB file using the nlohmann JSON library.
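For context, a GLB file is a 12-byte header followed by length-prefixed chunks; the first chunk is the glTF JSON, which lists the buffers and buffer views. A rough Python equivalent of that "list the buffer and buffer view sizes" task looks like this (the C++ version mentioned above used nlohmann/json; the file name below is a placeholder):

```
# Sketch of a minimal GLB reader: header, then length-prefixed chunks,
# of which the first must be the glTF JSON chunk.
import json
import struct
import sys

def read_glb_json(path: str) -> dict:
    with open(path, "rb") as f:
        magic, version, length = struct.unpack("<4sII", f.read(12))
        assert magic == b"glTF", "not a GLB file"
        chunk_len, chunk_type = struct.unpack("<II", f.read(8))
        assert chunk_type == 0x4E4F534A, "first chunk must be JSON"
        return json.loads(f.read(chunk_len))

if __name__ == "__main__":
    gltf = read_glb_json(sys.argv[1] if len(sys.argv) > 1 else "model.glb")
    for i, buf in enumerate(gltf.get("buffers", [])):
        print(f"buffer {i}: {buf['byteLength']} bytes")
    for i, view in enumerate(gltf.get("bufferViews", [])):
        print(f"bufferView {i}: {view['byteLength']} bytes")
```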


Back and forth degrades quality. Try resetting more frequently. And make sure you're using a frontier model, i.e. GPT-4, Opus, Gemini, etc.


Today I used it to convert a non-trivial function from Julia to R because I wanted to use a certain R package for a project.


My main code-adjacent use cases broadly fall into the following categories, and I'll give a few more specific examples afterward:

(1) Using LLMs as a learning aid (or docs augmentation)

(2) Writing one-off _whatevers_

(3) Finding the appropriate docs/info in the first place

(4) As a meta-tool to sniff out whether other people have gone down this path before

(5) Cloning any sort of boilerplate "pattern" which requires more than simple text substitution (nvim macros, custom CLI tools applied to a visual selection, ... can get you a long ways, but they're not good for everything)

For a few select examples:

(1a) You can copy in a snippet of code and ask what _this_ character does _here_, finding appropriate search terms to actually learn about the topic.

(1b) Generated examples are fine, even if they're wrong, if you use them as a mental scaffolding to learn more about a library. Analyze them and prove them right or find why they're wrong, and that process of analysis teaches you in a different way than just reading the docs by themselves.

(1c) Sometimes the docs are just unwieldy. Ask the LLM for the snippet immediately after the "-B" flag, and then you can search for that more-unique snippet instead of, e.g., 1000 occurrences of other fields referencing "-B" and not defining it.

(2a) Say you want to know how your webserver responds when the client sends each byte in a separate syscall, waits a second between each one, and then immediately starts a new pipelined request doing the same shenanigans without waiting for a response to the first one. The LLM can generate exactly what you need, or when it fails, it's still faster than typing it yourself. It's a task that's much more succinct to describe in English than code, yet simple enough that the LLM has a high success rate, and a brief scan of the code can confirm it's actually hitting localhost instead of downloading not-a-virus.exe.txt. The stakes are low if it's wrong because you can visually tell if it's doing the right thing. (A rough sketch of such a client is given after this list.)

(3a) I recalled reading once about the design of the voting/scoring algorithm used internally in Google's TGIF question ranker and knew there was a brief public blog post going over the more important details in a consumable format. Current search engines mostly can't point me to an old, unmonetized, niche blog post, especially when I don't remember any of the important keywords. LLMs can often point me to the exact blog, and barring that can usually give me the keywords I need to find it myself (similarly with finding things like the AWS SDK C API, which exists but isn't advertised and is hard to search for because the other keywords dominate). In this case, I wanted to read about Wilson's Scoring by Evan Miller [0].

(4a) Ask the same question a few different ways and a lot of times. If you only get garbage hallucinated blog posts, research articles, and keywords not related to your current task, it's suggestive of the specific task you're doing not being popular. If at a high level you know your end goal has been implemented many times, that's suggestive of you going down an incorrect path.

(5a) I was writing a program heavily leveraging lazy type instantiation to generate a zero-cost (de-virtualized, inlined where appropriate) iterator library. Anyone can write map and reduce, but it's a little more annoying to have a meta-layer doing that sort of thing in the type system, especially with some other constraints I had in the project. However, showing the LLM a few examples of this type-level way of defining the things I'm doing and then asking for implementations of `peek` or `skip` or whatever worked swimmingly. They're well suited to that task because the boilerplate isn't just simple textual substitution, because the tasks are simple enough that the LLM has a high chance of getting them right, and because the nature of the library is that a couple very simple hand-written tests will catch all the subtle mistakes, and a human reading the code will catch all the non-subtle errors.
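To make (2a) concrete, here's a minimal sketch of that kind of throwaway client, assuming a hypothetical server on localhost port 8080 (the port, path, and headers are placeholders):

```
# Hypothetical sketch: send an HTTP request one byte per syscall with a delay,
# then immediately pipeline a second request without reading the first response.
import socket
import time

HOST, PORT = "127.0.0.1", 8080   # placeholder local server

def drip_feed(sock: socket.socket, data: bytes, delay: float = 1.0) -> None:
    """Send each byte in its own send() call, pausing between bytes."""
    for i in range(len(data)):
        sock.sendall(data[i:i + 1])
        time.sleep(delay)

request = (b"GET / HTTP/1.1\r\n"
           b"Host: localhost\r\n"
           b"Connection: keep-alive\r\n\r\n")

with socket.create_connection((HOST, PORT)) as sock:
    drip_feed(sock, request)
    sock.sendall(request)        # pipelined second request, no wait for a reply
    print(sock.recv(65536).decode(errors="replace"))
```

The whole thing is short enough to scan in a few seconds and confirm it only ever talks to localhost.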

And so on. I don't use LLMs for everything. If I want to do anything involving infrequently used Python dunders I'll probably go straight to the data model docs [1] and search within that page. I write most code by hand (give or take many custom tools I've integrated into my nvim workflow). Sometimes an LLM makes my life easier though.

[0] https://www.evanmiller.org/how-not-to-sort-by-average-rating...

[1] https://docs.python.org/3/reference/datamodel.html


https://hachyderm.io/@inthehands/112006855076082650

> You might be surprised to learn that I actually think LLMs have the potential to be not only fun but genuinely useful. “Show me some bullshit that would be typical in this context” can be a genuinely helpful question to have answered, in code and in natural language — for brainstorming, for seeing common conventions in an unfamiliar context, for having something crappy to react to.

> Alas, that does not remotely resemble how people are pitching this technology.



I use GPT for AutoHotkey scripts for personal use. It fixes things Windows can't do natively. I used it briefly for a shell script on Linux but I don't use Linux much these days. When I used Linux daily I had a folder full of scripts I either adapted or copy pasted from the internet. I can't program but I can copy and paste and make minor changes to make a simple script suitable to me. GPT makes a big difference on my day-to-day, Windows 10 is under my control and I didn't need to learn AHK scripting. I'm not in IT, I'm just a regular person with a Hacker News account.


It does help in a specific case: where I need to write a very obvious but repeated/time consuming self-contained component.

I do know how to write it, but I don't want to write boilerplate stuff again and again, and I want to manage time efficiently. That's when I use, say, ChatGPT to create a React component that is circular and has a click handler (just an example).

I know how to do it, I know how it works, I just want to do it faster.

And I can verify that it works by simply using the component and seeing it works.

Though this is React and frontend. I would probably not blindly do the same for a database connected entity deletion endpoint at a backend.


For me, the advantage is that for simple cases like the one you mention, it will hardly ever make dumb mistakes like misspelling a CSS property name or missing the return keyword. So it will not only type faster than me, but it often bypasses the "OK, it's not working, let's see what I missed" phase.

This high-level approach usually breaks down for more complex scenarios, but for those, you can often explain in some detail how you want the code to look, and it will give you the code and auto-correct any glaring mistakes you made.


Linus had a lot of positive things to say about AI and Linux at the end of this recent interview: https://youtu.be/cPvRIWXNgaM?si=dVHbqyeBY8z-MYIH


Classic.

```
[Linux HDD drivers]

// sata_promise.c - Promise SATA
// Copyright 2003-2004 Red Hat, Inc.

[...]

// Except for the hotplug stuff, this is voodoo from the Promise driver.
// Label this entire section "TODO: figure out why we do this"
```

https://github.com/torvalds/linux/blob/0450d2083be6bdcd18c95...


I do agree anyway (there are more insidious reasons, mainly in how it destroys thinking), but this is a pretty useless working model of AI.


Isn't "mainly in how it destroys thinking" precisely what the parent comment was referring to with the Torvalds quote? I am not sure what your disagreement even is.


I don't think the analogy is that AI is just copy/pasting code, but rather that getting an AI to generate code for you and directly using the output is like copy/pasting code.


But to get AI to write you a working solution, more often than not you are supposed to understand the problem at hand along with possible solutions, right?


At that point, you're re-writing the entire solution, albeit without actually typing it out.

That makes the AI completely pointless.


Or, more appropriately, for people that aren't novices using AI

"I copied the code because it looks just like what I'd do and oh it knew the library calls I couldn't remember off the top of my head. Let me just fix some of the hallucination here and... all good. That saved me like 30 minutes :)"


Everyday life at my company.


Copying functions and not understanding or getting told that it’s garbage?


While I understand the concepts of derivatives and tainted code, this AI/human-dichotomy is not as good as that reasoning requires it to be. Every statement of code I commit is in fact a derivative of work that potentially had an incompatible license.


I agree. For the people who think that every tiny piece of code has copyright, have you never written the same code for two different projects?

There are some utility functions that I have written dozens of times, for different projects. Am I supposed to be varying my style each time so that each has its own unique copyright? I do not believe any court would interpret copyright to work in this way.


> have you never written the same code for two different projects?

But note that this prohibition is of "code that was not written by yourself", rather than code written by yourself twice.


I am pretty sure that I have implemented a lot of functions before that would be almost (or completely) identical (not counting variable names, indentation, etc.) to someone else's work. The smaller the function, the more likely this is to be true, and some programming languages encourage you to write concise functions.


Yes my point is that the prohibition is based on a theory that tiny pieces of code have copyright and so all AI generated code is tainted.

If you write code for one company/client then the IP belongs to that company as a work for hire.

If you write the same function again for someone else then you would be infringing the first company's copyright - if you truly believe in this theory that little pieces of code have copyright.

If you believe in this theory then you need to have a database of all the code you've ever written and be constantly checking every function you write for potential infringement of a previous employer's rights.


You can’t have it both ways, though: if an AI can’t hold copyright and is merely a tool being used by a human engineer who bears legal responsibility and ownership, then the concept of “code that was not written by yourself” is completely incoherent.

It is impossible for code to not be written by yourself under that view because it is just a tool. The fact that it’s mad-libs autocomplete instead of normal intellij auto-complete is totally irrelevant.


It is coherent e.g. if the "AI" is used as a tool to copy another's code.

> The fact that it’s mad-libs autocomplete instead of normal intellij auto-complete is totally irrelevant

Completely relevant if the former had and gave no permission and the latter did.


Okay, but if you’ve become very familiar with a particular code base and then later produce code that’s an exact match for a significant piece of that first code base, that would be copyright infringement.

There’s a reason companies will use clean-room strategies to avoid poisoning a code base.

So, I’d be okay with a compromise wherein the makers of ChatGPT aren't liable for copyright infringement for the act of training the model, but are liable for the output produced by the model. So, when the AI spits out exact copies of copyrighted works in response to a prompt, copyright has been violated and the AI creator should be liable.


An exact match is of course a derivative, let's call it the identity derivative. So in:

let new_code = f previous_work

where f could be identity. But it could be other transformations such as rename_all_the_things, move_items_around, object_oriented_to_functional, and whatnot. Probably a combination. But here's the kicker: all combinations of these functions are derivatives.

And we don't have to be naive here, thinking that I'm talking about using refactoring tools on an actual code base because I want to hide the fact that I want to steal some piece of code.

I'm talking about the fact that I've seen so much code and this has in fact taught me basically everything I know. I have trained myself on a stream of works. Just like an AI. The difference is in scale, not in nature.


You can pretend the identity function is a derivative no different from other kinds of derivatives, with no logical distinction.

A judge and jury are going to laugh in your face.

An exact copy is completely different than “produced new code based on everything it’s seen”. You can try your “it’s just an identity function derivative” if you want, but… judges and juries aren’t idiots. They know that a copy is a copy. You’re not going to pull the wool over their eyes


I was trying to explain, so that you might understand, that there is a continuum from copying (the identity function) to more complex combinations of other transformations via your memory/intuition.

I am definitely not trying to conclude that copyright or licensing is bad.

I have read the scheduler implementations in a number of different operating systems. If I were to implement a scheduler in an operating system, would you say that there is a probability that my scheduler will be a derivative, to some extent, of those that I have read? How is this materially different from training a GPT on those same pieces of code, and then asking it to construct a scheduler?

I would argue that while there are differences, there are also similarities. That means that the dichotomy is not true.


> How is this materially different from training a GPT on those same pieces of code, and then asking it to construct a scheduler?

I am saying sometimes GPT doesn’t do the thing you’re describing. Sometimes it produces exact copies of something that it was trained on.

When it does that (not a “superficially similar” one, but an exact copy) can OpenAI be held liable for copyright infringement?

Because it seems like they want to avoid liability on both ends. They don’t want to be liable for copyright infringement based on ingesting works, and they don’t want to be liable when their tools produces a *copy* of existing works.

I don’t think that’s an acceptable framework.


Thing is: derivatives are verboten, sometimes. Otherwise we could all just steal and rename all the things.


Okay, I think we may be arguing basically the same thing.

I’m arguing that copying is not allowed. And you’re arguing derivative works are not always allowed.

I originally thought you were arguing that derivative works are allowed, and that since copying was a derivative work, that copying was therefore also allowed.

Sorry for my misunderstanding of what you were originally saying.


> An exact match is of course a derivative

Not if neither is derived from the other.


So this comes down to probabilities then. What is the probability that my TotallyNotLinux kernel matches the actual Linux kernel line by line? I didn't copy it. Promise! But my function called println, which is very, very similar to some other one - well, it might not be a derivative.


> So this comes down to probabilities then.

Agreed. No "of course" about it.


There's a key dichotomy here:

While it is technically possible for you to read some GPL'd code, memorize it, and then reproduce it later by accident, that's not how programmers work. What humans remember is not the copyrightable code, but the patentable algorithm. (And there are few algorithms that are simple enough to memorize on a cursory reading, novel enough to be patentable, but not so novel that you'll remember their source.)

AI does not work in algorithms. It's a language model. It deals purely in the copyrightable code. (Both figuratively; LLMs are structurally incapable of the high level abstract reasoning required, and literally by way of the training data)


There are two salient differences here:

1) You are a thinking adult - or a thoughtful teenager :) - and therefore you understand copyright law well enough to take responsibility under it, and in particular you are capable of having standing in legal matters. AI is not, but it is quite capable of producing infringing code (e.g. GNU stuff) that human reviewers wouldn't even know was infringing. So it is much better for any honest and competent organization to ban commercial LLMs entirely. (I am fine with in-house solutions with 100% validated training data... but those aren't very good yet, are they?)

2) I am a broken record on this, but the biggest problem with the "stochastic parrots" analogy is that transformer ANNs are dramatically dumber than parrots, or any other jawed vertebrate (I am not sure about lampreys). As applied to code generation: when I first tested ChatGPT-3.5, I was shocked to discover it was plagiarizing hundreds of lines of F#, verbatim, including from my own GitHub. Obviously that's outrageous in terms of OpenAI's ethics. But it is also amazing how dumb the AI is! Imagine a human programmer who is highly proficient in Python, and pretty good at Haskell, yet despite having read every public F# project on GitHub can't solve intermediate F# problems without shameless copying.

It is a completely misleading comparison to say that humans reading source code is anything like transformers learning patterns in text. The most depressing thing about the current AI bubble is watching tech folks devalue human intelligence - especially since the primary motivation is excusing the failures of a computer which is less intelligent than a single honeybee.


You can ingest millions of lines of code in minutes/hours? If AI can ignore copyright, why can't a human? Are we already inferior to machines just because of our copyright system?


That is my point right there.


They seem to have stopped just short of a total ban.

> Code generated by a large language model or similar technology, such as GitHub/Microsoft's Copilot, OpenAI's ChatGPT, or Facebook/Meta's Code Llama, is presumed to be tainted code, and must not be committed without prior written approval by core.


Does this include single-line suggestions from Copilot, I wonder? Those are pretty nice because it just finishes what I already intended to type.


Should we grant them the benefit of the doubt and believe it’s just poor wording?

Surely if an AI writes “foo” it’s because it’s learnt that everyone else writes that. It’s different from asking “suggest me an implementation of FFT”, and then pasting that in wholesale.

It’s similar to how research literature works. Eventually concepts like “computer” and “DNA” become so common that we don’t have to reference 1800s literature unless very pertinent to the question.


> Should we grant them the benefit of doubt and believe it’s just poor wording?

As a NetBSD developer, I don't think it is poor wording.

There is GPL code in the NetBSD tree already, such as GCC, but how it gets used is carefully controlled. If someone proposed adding something AI generated in a way that it was kept separate from everything else then it could potentially get approved.


I'm sceptical of the implication that there's no downside there, even if it's a more limited use of the technology.

If it introduced a subtle mistake (e.g. wrong variable) there's a chance the programmer could miss it, but would not have made the same mistake themselves.

There's also a chance the machine-generated code might lack a bug that the programmer would have introduced if writing it themselves. Overall though, I get the impression these tools are generally harmful to code quality.

Edit: As ekianjo points out, they specifically mention Copilot as unacceptable.


"write code" isn't a uniform task.

Sometimes writing code involves having built a good mental model of the domain problem & parts of the codebase.

Sometimes writing code involves fairly straightforward manipulation of tools you might not be so familiar with.

For some tasks (e.g. "take the sample code from this blog post / answer, adjust it to use the variables available in this context"), I'd expect an LLM to save time over not using one.


Do you feel the same way about normal autocomplete?


It's funny how most people just assume engineers aren't checking the output of Copilot's suggestions.

Reading code for correctness takes only a moment: you either know how to verify its correctness or you don't; writing the code yourself doesn't change that.

Furthermore, code which is meant to be depended upon should come with tests, and if those tests pass with bad code, the tests are wrong.


> It's funny how most people just assume engineers aren't checking the output of Copilot's suggestions.

I didn't say that.

I'm assuming the engineer is checking Copilot's output, but that they will sometimes fail to catch bugs in Copilot's suggested code.

> Reading code for correctness takes only a moment, you either know how to verify its correctness or you don't; writing the code yourself doesn't change that.

I don't buy this framing. I think it's entirely possible for a programmer to miss an error when reviewing code, that they would not have made if writing the code themselves.

I've encountered this many times when copying+pasting+updating my own code to use in a slightly different context. It's easy to miss small mistakes, like failing to find+replace a variable name, despite that I wouldn't have made that mistake if writing the code afresh.

> Furthermore, code which is meant to be depended upon should come with tests, and if those tests pass with bad code, the tests are wrong.

Test suites should improve code-quality regardless of your policy on use of AI tools. Using AI tools may still degrade code-quality.


> Reading code for correctness takes only a moment

It absolutely does not, especially when you take a step outside the simplest code.

Even if I know how to verify it, that can take longer than writing it myself because there might be many different ways to solve the problem, with different logic, trade-offs and failure states.

> Furthermore, code which is meant to be depended upon should come with tests, and if those tests pass with bad code, the tests are wrong.

You can't "fix" bad code with tests. Even if you could get perfect test coverage for bugs (you can't), there's no test that captures soft attributes like readability and maintainability.


Regular autocomplete (from jetbrains IDEs for example) can produce small mistakes as well


it specifically says Copilot


How is this enforceable?


In the same way a lot of the other rules on that list are. How are they going to know you are familiar with the code you are committing? How are they going to know it is tested?

Sometimes rules are established to set up a common community base line. You are trusted to follow them in good faith. If over time, through code review, questioning or testing, it shows you aren't following the rules then you can't claim you didn't know it wasn't okay.


This stance is CYA. There's no way to enforce it, but by stating it explicitly like this, they save themselves possible legal pain.


The headline of this HN post is misleading, it isn't really a "ban" as such and doesn't have to be "enforceable." These are guidelines whose target audience is good-faith developers who want to make positive contributions to NetBSD.

The comments saying "heh, those stoopid NetBSD maintainers would never know if I slipped a bit of AI into their codebase" are missing the point. The point of the guideline is to say that people who use Copilot/etc for NetBSD development are dumb assholes, so don't be a dumb asshole.


Seems impossible to me, although it looks more like some sort of protection against legal problems, should someone add AI-generated code that turns out to contain code copyrighted by someone else. They must show they're doing something against that, even if that something turns out to be probably useless.


It's not about enforceability, it's simply about covering their ass.


This is what people do not seem to understand. It is all about legalities; copyright, licensing issues, whatever.


Yeah, doesn't make any sense lol.

If you remove comments and shift it to your own style, then it's not going to be distinguishable, except for sufficiently complex code where your own codebase's unique idioms may show through and the AI doesn't use them.


Come now, you can have guidelines that aren't enforceable but which you expect people to adhere to. After all, you asked them to adhere to them, and they're (hopefully) not arseholes.


I doubt the people who think it’s fine not to respect the NetBSD ethos are even capable of contributing anything to it, so I’d say the project is fine.


Sufficiently complex code is the one most susceptible to coming from a proprietary solution


But are coding large language models in the wild trained on proprietary code bases?


>But are coding large language models in the wild trained on proprietary code bases?

There are source-available projects on GitHub, but the BSD license is not compatible with the GPL, and for sure there is a ton of GPL code in the LLMs.

The only solution I can think of is to train LLMs on BSD-compatible code to be used on BSD projects, GPL-compatible code to be used on GPL projects, etc.

See the link below, where Copilot was caught stealing code, proving that it is memorizing and reproducing rather than learning concepts and then generating original code.

https://news.ycombinator.com/item?id=27710287


If you don't know, it's a yes


I don’t believe in unicorns, but maybe they exist.


That ship has sailed, sunk, and sat on the ocean floor for a century. There is no way to enforce this and there never will be.


Good call. AI-generated code is the same as most copied code: it might work, but you won't know why, which means that further development will be more difficult.



Do even slightly competent programmers ever actually commit LLM-generated code as-is, rather than using it as a clue for ideas on how to possibly do something? I thought only people with zero coding skills would just copy-paste; everyone else would want to change a lot.


Of course not. I'm not sure why this scenario is such a popular straw man. GenAI in the programming domain is augmentative. There's a whole bunch of cognitive work that isn't necessarily _reasoning_ that nonetheless needs to be done for a given task. GenAI is fantastic at yak shaving. Used skillfully, GenAI is good for shortening the distance to understanding. Is this worth the energy costs and downstream environmental and social impact of the technology? I ... I'm not sure. Probably not. But it's here, it's useful, and it's not going back into the bottle.


I think AI would be better used for explaining code. I’d use it for that instead of creating code.


“Automation has always helped people write code, I mean, this is nothing new at all [..] I see that as tools that can help us be better at what we do.” — Linus Torvalds on LLM code generation/review (https://m.youtube.com/watch?v=VHHT6W-N0ak).

NetBSD still has an edge with its memory hardening, NPF, kernel-level blacklist, and “legacy support”. But I fear that this out-of-touch policy might eventually tip it into irrelevance.


It’s not out of touch. It’s a licensing issue.

The BSDs were burned by this in the 4.4BSD days.[1] It makes sense that they don’t want to be burned again.

[1] https://en.wikipedia.org/wiki/UNIX_System_Laboratories,_Inc.....


The current generation of AI code generators is loaded with ethics and copyright problems. Plus, there's the "copying something without understanding" angle.

The most advanced tools used to be template-based generators and real-time static checkers plus language servers. AI makes things way more complicated than they need to be.

It's not only the bleeding of GPL code into MIT projects. It's also the bleeding of source-available licensed corpora into AI models. These things leak their training data like crazy. Ask the right question and you get functions from the training set verbatim.

When everything is combined, this is a huge problem. It's not that these problems are individually OK. They're huge already. The resulting problem is a sum of huge problems.


Everyone in the industry has been copying code from Stack Overflow (and also generally shitting on the concept of IP altogether) for years and nobody cares, but suddenly LLMs come out everybody is a copyright stickler. Give me a break.


> Everyone in the industry

I doubt this generalization applies to codebases like the *BSDs, or the Linux kernel.


The code on Stack Overflow is already licensed with Creative Commons, plus people put their code there with the intent of being shared and used.

There are GPL projects which provide people their livelihoods, because they get grants to develop that code under a GPL license. Ingesting the same code into a model sans its license not only infringes on the license, but allows this code to seep into places where it shouldn't (by design), and puts those people's livelihoods in jeopardy.

Companies frowned upon the GPL for years because of its liability, and now they can feast on this code with these models.

The same goes for source-available repositories. These companies put their code out for eyes only, not for reproduction and introduction into other code bases. These systems infringe on those licenses too, and attack the business models of these companies.

I’m not a copyright stickler. I just respect people and their choices they made with their code.

P.S.: I can share the tweets of that researcher if you want.


> The code on Stack Overflow is already licensed with Creative Commons

It is CC-BY-SA so it requires attribution (+ share alike).[1] That is the hard part with code written by LLMs.

[1] https://stackoverflow.com/help/licensing


Considering all the code I write is GPL-licensed, and I always write a comment on top of SO-inspired code blocks with their respective URLs, I don't think I'm doing anything wrong.

Update: CC BY-SA 4.0 accepts GPLv3 as compatible, and I use GPLv3 exclusively. I'm in the clear.

LLMs know neither the provenance nor the licenses of the things they generate; you're right. I think that part of the problem is ignored not just because it's hard, but because it's convenient to ignore, too.


If my subordinate was copying code from StackOverflow without attribution I would be annoyed enough to send a grouchy email. Behavior like that is bad hacker citizenship, and bad for long-term maintenance. You should at least include a hyperlink to the SO question.

I also think SO is different when it comes to mindless copy-pasting. Outside of rote beginner stuff, it’s infrequent that someone has the exact same question as you and that the best answer works by simple copy-pasting. Often the modification is simple enough that even GPT can do it :) But making sure the SO question is relevant, and modifying the answer accordingly, is a check on understanding that LLMs don’t really have. In particular, a SO answer might be “wildly wrong” syntactically but essentially correct semantically. LLMs can give you the exact opposite problem.


The last time I copied something from SO it was this:

> DENSE_RANK() OVER (ORDER BY TotalMark DESC) AS StudRank

And then I filled in my column names and alias. This is 90% of what is happening with LLMs / SO copying. Copy / paste of syntax like this absolutely does not need a link or attribution and is in no way copyrightable in the first place.


This is not 90% of what's happening with LLMs. Everyone I've seen using LLMs was requesting whole algorithms, or even entire program skeletons that don't contain much boilerplate but tons of logic.

Case in point: https://x.com/docsparse/status/1581461734665367554

This is not akin to copying a 2-line trick from SO.

On the other hand, the most significant part of code I copied from SO was using two iostream iterators to automatically tokenize an incoming string. 5-6 lines at most.

This block has a 10+ line comment on top of it, not only explaining how it works but also linking to the original answer at SO.


> "Automation has always helped people write code

... doesn't say automatic copying.


They're still using CVS?


One of the NetBSD developers is a Mercurial developer and I think the general plan for a while has been to switch to Mercurial once a few technical issues have been fixed. I'm not sure if that has fully happened now and it is just waiting on the right people having enough time or what the status is. I think NetBSD and OpenBSD are the last two actively maintained open source projects that use CVS, at least as far as I am aware.

However, at least for NetBSD (don't know about OpenBSD) there is an official mirror on Github and you can submit pull requests there if you want (I'm fairly sure this will stay the case even after a switch to Mercurial). Some of the main developers mostly use git until a commit is almost ready. I think the main practical difference from the project using git as the main repository for anyone who doesn't have commit access is that while commit messages will mention the contributor I don't think it is linked on github to the contributor in the same way it would usually be with an accepted pull request. At most there might be a tiny number of extremely minor issues with the repository conversion left, and I think those might have been fixed by now as well.


If it works for them why not, and compared to rcs it's really not too bad ;)

In some aspects the old centralized versioning systems are superior to distributed systems like git (for starters, they give you a single linear history with incremental revision numbers).


Let’s go back to writing in assembly because if you’re writing in python you don’t understand what you’re doing (this is sarcasm of course).

AI has decoupled code design from fabrication. We’re just going to need to improve our design language and fabs to get better yield.




