On page 18, they show Copilot produces the following code:
>function isEven(n) {
> return n % 2 === 0;
>}
They then say, "Copilot’s Output, like Codex’s, is derived from existing code. Namely, sample code that appears in the online book Mastering JS, written by Valeri Karpov."
Surely everyone reading this has written that code verbatim at some point in their lives. How can they assert that this code is derived specifically from Mastering JS, or that Karpov has any copyright to that code?
There is no way in hell that isEven is covered by copyright.
"In computer programs, concerns for efficiency may limit the possible ways to achieve a particular function, making a particular expression necessary to achieving the idea. In this case, the expression is not protected by copyright."
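To make that concrete, here is a rough sketch of about the only sensible ways to spell this check in JavaScript. The function names are made up for illustration and none of this comes from the complaint or from Mastering JS:

```javascript
// Hypothetical names; each function is a different spelling of the same idea.

// The version quoted in the complaint.
function isEvenMod(n) {
  return n % 2 === 0;
}

// Bitwise variant. The & operator converts its operand to a 32-bit integer,
// so this is only appropriate for integers in that range.
function isEvenBitwise(n) {
  return (n & 1) === 0;
}

// Relies on 0 being falsy; the remainder is 0 exactly when n is even.
function isEvenNot(n) {
  return !(n % 2);
}
```

For ordinary integers all three agree, and the first is the one nearly everybody reaches for, which is the point of the quoted passage: there is very little room for distinct expression here.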
> There is no way in hell that isEven is covered by copyright.
Hey, I said the same thing about APIs, but here we are.
Edit: Actually, the Supreme Court declined to rule on whether APIs are copyrightable, but it did say that if they are, reusing them the way Google reused the Java APIs in Android would fall under fair use. Given that lower courts did think that APIs should be copyrightable, we don't know if they are anymore.
What's interesting in that case is that I would argue the interface code is MORE important than the implementing code. You can hire any SWE to re-implement the entire API, and it would be pretty straightforward. The interface code was what actually mattered to developers, and it took creative expression, design, sequence, and organization to put together, plus feedback and iteration over the years. Google knew this too: taking the interface code was WAY more important for them than if they had done the reverse.
True, but trademark and copyright are pretty different on a fundamental level, both in the purpose behind the law and in how it's implemented. I suspect that if it weren't for the term "intellectual property" tying trademark, copyright, and patents together, we wouldn't really think of them in such a unified way, since they are all really different from each other.
It's possible the complaint is using a trivial example to illustrate the type of argument plaintiffs want to make during any trial. A 200-line example is too unwieldy for non-programmers to digest, especially given the formatting constraints of a legal brief.
Look at paragraphs 90 and 91 on page 27 of the complaint[1]:
"90. GitHub concedes that in ordinary use, Copilot will reproduce passages of code verbatim: “Our latest internal research shows that about 1% of the time, a suggestion [Output] may contain some code snippets longer than ~150 characters that matches” code from the training data. This standard is more limited than is necessary for copyright infringement. But even using GitHub’s own metric and the most conservative possible criteria, Copilot has violated the DMCA at least tens of thousands of times."
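For a sense of what a "snippet longer than ~150 characters that matches" check could look like, here is a naive sketch. It is an assumption for illustration only, not GitHub's actual methodology, and the function names and the strict greater-than comparison are my own choices:

```javascript
// Assumption: a naive longest-common-substring check, NOT GitHub's real method.

// Returns the length of the longest run of characters that appears
// verbatim in both strings (classic O(a.length * b.length) dynamic programming).
function longestCommonSubstringLength(a, b) {
  let best = 0;
  // prev[j] holds the length of the common suffix ending at a[i-2] and b[j-1].
  let prev = new Array(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const curr = new Array(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) {
        curr[j] = prev[j - 1] + 1;
        if (curr[j] > best) best = curr[j];
      }
    }
    prev = curr;
  }
  return best;
}

// Hypothetical helper: flags a suggestion whose longest verbatim overlap
// with a known source is longer than the threshold (150 by default).
function hasLongVerbatimMatch(suggestion, source, threshold = 150) {
  return longestCommonSubstringLength(suggestion, source) > threshold;
}
```

Under this sketch, a three-line function like the `isEven` example is far shorter than 150 characters, so it could never be flagged on its own, even when compared against an identical copy.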
Does distributing licensed code without attribution on a mass scale count as fair use?
If Copilot is inadvertently providing a programmer with copyrighted code, is that programmer and/or their employer responsible for copyright infringement?
There are a lot of interesting legal complications that I think the courts will want to adjudicate.
They changed it, I'm 100% sure. The profile picture was Saul from Breaking Bad. I assume they read the comments here and changed it within a minute or two.
They determined the other `isEven()` function was cribbed from Eloquent JavaScript because of matching comments. I wonder if the complaint just left off telltale comments from that Mastering JS one?
Yeah, I found the other one much more persuasive. The extra comments were unequivocally reproduced from the claimed source (although that output was from Codex rather than Copilot).
Yep, same. Not in JS, but in Haskell, for the Even Fibonacci Project Euler problem. Something like a million people have submitted right answers for that problem, and assuming half wrote their own filter rather than importing an isEven library, that's half a million people right there.
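For anyone who hasn't done that problem: it asks for the sum of the even-valued Fibonacci terms that do not exceed four million. A hand-rolled solution looks roughly like this. It's a JavaScript sketch to match the thread rather than the Haskell version mentioned above, and `sumEvenFibonacci` is an illustrative name:

```javascript
// The same one-liner everyone ends up writing.
function isEven(n) {
  return n % 2 === 0;
}

// Sums the even Fibonacci terms that do not exceed `limit`,
// using the sequence 1, 2, 3, 5, 8, ...
function sumEvenFibonacci(limit) {
  let sum = 0;
  let a = 1;
  let b = 2;
  while (a <= limit) {
    if (isEven(a)) sum += a;
    [a, b] = [b, a + b];
  }
  return sum;
}

console.log(sumEvenFibonacci(4000000)); // 4613732
```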
That seems like a really bad choice of example for this, but I haven't read the document and don't have any context beyond what you've posted here, so I have to take your word for it that that's the purpose of this snippet.
However, if you are looking to understand the reasoning behind this lawsuit, there are lots of better examples online where Copilot blatantly ripped off open source code.