[cctalk] Re: Open source a panacea?

3 Feb 2025

...
  On Feb 3, 2025, at 3:40 PM, Alexander Schreiber via
cctalk <cctalk(a)classiccmp.org> wrote:
 ...
 On top of that: A lot of those LLMs are built on theft at an epically large
 scale. They hoovered up everything in sight (and then some) without even
 pretending to care about intellectual property rights - e.g. the NY Times
 has taken OpenAI to court because they managed to make the OpenAI LLMs
 spit out long verbatim fragments of NY Times content. The hilarious part
 is that DeepSeek essentially stole from OpenAI that which OpenAI previously
 stole from everyone else and OpenAI is very angry about the lack of honor
 among thieves or something ;-) 
Excellent point.  I tend to refer to LLMs as "derived work generators" to point
out the copyright problems that are fundamental to what they do.
I also tend to wonder about web hoovering as a training scheme, given that a lot of web
content is fiction.  And I don't mean "misinformation", I just mean novels
and the like.  What happens to an LLM that inhales "The Martian" or
"Ringworld"?
        paul
