This hinges on a thorough understanding of safety, and it does not even mention trading things off. Both are topics that would require further clarification. And so on, as it were.
"Because it turns out that people aren't using the language too much yet, and it makes some things that are easy to do in C, hard to do in order to keep the language safe. It's all tradeoffs."
At some point the person asking is satisfied and that tangent ends. This is how conversation works.
You aren't trying to teach a layman how to write production-ready Rust code, and they aren't interested in learning how.
What does it mean for a thing to be hard to do in the context of different programming languages?
Hard and easy here are extremely deep and complex subjects. Is goto hard or easy, for example?
The person is going to give up (or you are), because you are not going to be able to just have this conversation for any meaningful length of time. This is a (bad) fantasy, one that books sometimes attempt (and that I hate), and some people do fantasise this way.
They are not going to Ask The Next Question because if they could do that, it would come from a position of already understanding in a sort of anachronistic way. (In that they already have your answer, yet still want it.)
They lack intuition for all the new concepts, and they have a bunch of false friends (to borrow from linguistics, in that they think they understand some of the concepts - but they are different concepts with the same or similar names).
If you are a good teacher, then I am sure you will be very successful in explaining, but it's just not going to happen quickly, nor with just anyone.
Guided to want it. Sure. Everyone else, all those other folks with other lives, opinions and preferences, they are brainwashed by my enemies. Come on, man :)
I just wanted Passwords to be its own app because the Settings applet(?) is obnoxious to interact with in some scenarios. My passwords are already all in there.
Now, I use a Windows laptop too and would love for Apple to make the Passwords thing work there too. It probably won't :)
Happily interested in this, I discover that they have broken the scrolling on the product's homepage, for reasons. This tells me that since they have never tried their own homepage, they don't really care about usability, so their product is probably not going to be very good.
As someone who had input on the website, I can say we're very concerned about that and aiming to fix it as soon as possible. Honestly, we just didn't get it right before the launch.
Why are you even doing anything with the scrolling? Something someone can do in JavaScript in a web page is better than the implementation in my touchpad on my Mac?
You're not wrong. But there's a tradeoff between creating a cinematic experience (which really does resonate with people in a strong way) and adhering to standard methods. We debated both approaches and went with this one, in part because that's part of our ethos: interface design is in a local maximum, and we've had success in our software prototypes creating new interface primitives that, for example, dramatically improve the reading experience.
In order for this current design to work smoothly across platforms, devices, and input methods, however, it'll need to be re-implemented, and it was probably a mistake to try something so ambitious in the time we had available.
Yes. Mobile-friendly sites tend to suck because they take us back to WAP and the '90s. Even desktop sites suffer from a weird recent trend where all text is unbearably HUGE.
While I understand the concepts of derivatives and tainted code, this AI/human dichotomy is not as clean as that reasoning requires it to be. Every statement of code I commit is in fact a derivative of work that potentially had an incompatible license.
I agree. To the people who think that every tiny piece of code has copyright: have you never written the same code for two different projects?
There are some utility functions that I have written dozens of times, for different projects. Am I supposed to be varying my style each time so that each has its own unique copyright? I do not believe any court would interpret copyright to work in this way.
I am pretty sure that I have implemented a lot of functions that would be almost (or completely) identical (not counting variable names, indentation, etc.) to someone else's work. The smaller the function, the more likely this is to be true, and some programming languages encourage you to write concise functions.
Yes, my point is that the prohibition is based on a theory that tiny pieces of code have copyright, and so all AI-generated code is tainted.
If you write code for one company/client then the IP belongs to that company as a work for hire.
If you write the same function again for someone else then you would be infringing the first company's copyright - if you truly believe in this theory that little pieces of code have copyright.
If you believe in this theory then you need to have a database of all the code you've ever written and be constantly checking every function you write for potential infringement of a previous employer's rights.
You can’t have it both ways, though: if an AI can’t hold copyright and is merely a tool being used by a human engineer who bears legal responsibility and ownership, then the concept of “code that was not written by yourself” is completely incoherent.
It is impossible for code to not be written by yourself under that view, because the AI is just a tool. The fact that it’s mad-libs autocomplete instead of normal IntelliJ autocomplete is totally irrelevant.
Okay, but if you’ve become very familiar with a particular code base, then later produce code that’s an exact match for a significant piece of code from that first code base, that would be copyright infringement.
There’s a reason companies will use clean-room strategies to avoid poisoning a code base.
So, I’d be okay with a compromise wherein the makers of ChatGPT aren’t liable for copyright infringement for the act of training the model, but are liable for the output produced by the model. So, when the AI spits out exact copies of copyrighted works in response to a prompt, copyright has been violated and the AI's creator should be liable.
An exact match is of course a derivative; let's call it the identity derivative. So in:

    let new_code = f previous_work

f could be the identity. But it could also be other transformations, such as rename_all_the_things, move_items_around, object_oriented_to_functional, and whatnot. Probably a combination. But here's the kicker: every composition of these functions still yields a derivative.
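To make the shape of that argument concrete, here is a minimal, hypothetical sketch in Rust (the transformation names mirror the ones above; the bodies are placeholders standing in for real refactoring passes, not actual tooling):

    // Hypothetical sketch: every transformation of an existing work,
    // including the identity, maps it to a derivative of that work.
    // Bodies are placeholders; only the composition structure matters.
    type Code = String;

    fn identity(code: Code) -> Code {
        code // an exact copy: the trivial transformation
    }

    fn rename_all_the_things(code: Code) -> Code {
        code.replace("previous", "new") // stand-in for a renaming pass
    }

    fn move_items_around(code: Code) -> Code {
        code // stand-in for reordering declarations
    }

    fn main() {
        let previous_work = Code::from("fn previous_work() {}");
        // Whether f is the identity or any composition of the passes
        // above, new_code is still mapped from previous_work.
        let new_code = move_items_around(rename_all_the_things(identity(previous_work)));
        println!("{new_code}");
    }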
And we don't have to be naive here, thinking that I'm talking about using refactoring tools on an actual code base because I want to hide the fact that I want to steal some piece of code.
I'm talking about the fact that I've seen so much code and this has in fact taught me basically everything I know. I have trained myself on a stream of works. Just like an AI. The difference is in scale, not in nature.
You can pretend the identity function yields a derivative that is logically no different from other kinds of derivatives.
A judge and jury are going to laugh in your face.
An exact copy is completely different from “produced new code based on everything it’s seen”. You can try your “it’s just an identity-function derivative” argument if you want, but… judges and juries aren’t idiots. They know that a copy is a copy. You’re not going to pull the wool over their eyes.
I was trying to explain, so that you might understand, that there is a continuum from copying (the identity function) to more complex combinations of other transformations via your memory/intuition.
I am definitely not trying to conclude that copyright or licensing is bad.
I have read the scheduler implementations in a number of different operating systems. If I were to implement a scheduler in an operating system, would you say that there is a probability that my scheduler would be, to some extent, a derivative of those that I have read? How is this materially different from training a GPT on those same pieces of code and then asking it to construct a scheduler?
I would argue that while there are differences, there are also similarities. That means the dichotomy does not hold.
> How is this materially different from training a GPT on those same pieces of code, and then asking it to construct a scheduler?
I am saying sometimes GPT doesn’t do the thing you’re describing. Sometimes it produces exact copies of something that it was trained on.
When it does that (not a “superficially similar” one, but an exact copy), can OpenAI be held liable for copyright infringement?
Because it seems like they want to avoid liability on both ends. They don’t want to be liable for copyright infringement based on ingesting works, and they don’t want to be liable when their tool produces a *copy* of existing works.
Okay, I think we may be arguing basically the same thing.
I’m arguing that copying is not allowed. And you’re arguing derivative works are not always allowed.
I originally thought you were arguing that derivative works are allowed, and that since copying was a derivative work, that copying was therefore also allowed.
Sorry for my misunderstanding of what you were originally saying.
So this comes down to probabilities then. What is the probability that my TotallyNotLinux kernel matches the actual Linux kernel line by line? I didn't copy it. Promise! But my function called println, which is very, very similar to someone else's? Well, it might not be a derivative.
While it is technically possible for you to read some GPL'd code, memorize it, and then reproduce it later by accident, that's not how programmers work. What humans remember is not the copyrightable code but the patentable algorithm. (And there are few algorithms that are simple enough to memorize on a cursory reading, novel enough to be patentable, but not so novel that you'll remember their source.)
AI does not work in algorithms. It's a language model. It deals purely in the copyrightable code. (Both figuratively, in that LLMs are structurally incapable of the high-level abstract reasoning required, and literally, by way of the training data.)
1) You are a thinking adult - or a thoughtful teenager :) - and therefore you understand copyright law well enough to take responsibility under it; in particular, you are capable of having standing in legal matters. AI is not, but it is quite capable of creating infringing code (e.g. GNU stuff) that human reviewers wouldn't even know was infringing. So it is much better for any honest and competent organization to ban commercial LLMs entirely. (I am fine with in-house solutions with 100% validated training data... but those aren't very good yet, are they?)
2) I am a broken record on this, but the biggest problem with the "stochastic parrots" analogy is that transformer ANNs are dramatically dumber than parrots, or any other jawed vertebrate (I am not sure about lampreys). As applied to code generation: when I first tested ChatGPT-3.5, I was shocked to discover it was plagiarizing hundreds of lines of F#, verbatim, including from my own GitHub. Obviously that's outrageous in terms of OpenAI's ethics. But it is also amazing how dumb the AI is! Imagine a human programmer who is highly proficient in Python and pretty good at Haskell, yet despite reading every public F# project on GitHub, can't solve intermediate F# problems without shameless copying.
It is a completely misleading comparison to say that humans reading source code is anything like transformers learning patterns in text. The most depressing thing about the current AI bubble is watching tech folks devalue human intelligence - especially since the primary motivation is excusing the failures of a computer which is less intelligent than a single honeybee.
You can ingest millions of lines of code in minutes/hours? If AI can ignore copyright, why can't a human? Are we already inferior to machines just because of our copyright system?
> [...], I can confidently say that async and its interactions with memory are the hard part. (Edit: Making async code properly clean up after itself is also the hard part, unless you use a better async runtime than tokio.)
I would love to read more about this, are there any specific areas or things I might search for?
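Not the commenter you're quoting, but a minimal sketch of what I assume the edit is pointing at: Rust's Drop is synchronous, so a cancelled tokio task cannot await its async cleanup. (The tokio calls here are standard; the Connection type is made up for illustration.)

    use std::time::Duration;

    struct Connection; // hypothetical resource that wants async cleanup

    impl Drop for Connection {
        fn drop(&mut self) {
            // We would like to flush asynchronously here, but Drop cannot
            // be async, so cleanup must be synchronous or skipped.
            println!("dropped: cannot `.await` an async flush here");
        }
    }

    #[tokio::main]
    async fn main() {
        let handle = tokio::spawn(async {
            let _conn = Connection;
            // If the task is aborted at this await point, nothing after
            // it runs; only the synchronous Drop above fires.
            tokio::time::sleep(Duration::from_secs(60)).await;
        });

        tokio::time::sleep(Duration::from_millis(100)).await;
        handle.abort(); // cancellation drops the future at its await point
        let _ = handle.await; // the join result reports the cancellation
    }

If that's the right reading, searching for "async Drop" and "cancellation safety" (the term the tokio docs use) should surface the relevant discussions.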