Alex, your posts come over with the "flag" set (I get a red "flag" on my iPhone). Did you
mean to flag all your responses for some reason?
Sent from my iPhone
On Feb 3, 2025, at 13:15, Alexander Schreiber via cctalk <cctalk(a)classiccmp.org> wrote:
On Mon, Feb 03, 2025 at 03:54:31PM -0500, Paul Koning via cctalk wrote:
On Feb 3, 2025, at 3:40 PM, Alexander Schreiber via cctalk <cctalk(a)classiccmp.org> wrote:
... On top of that: A lot of those LLMs are built on theft on an epically
large scale. They hoovered up everything in sight (and then some) without
even pretending to care about intellectual property rights - e.g. the NY
Times has taken OpenAI to court because they managed to make the OpenAI
LLMs spit out long verbatim fragments of NY Times content. The hilarious
part is that DeepSeek essentially stole from OpenAI that which OpenAI had
previously stolen from everyone else, and OpenAI is very angry about the
lack of honor among thieves or something ;-)
Excellent point. I tend to refer to LLMs as "derived work generators" to
point out the copyright problems that are fundamental to what they do.
I just call them "bullshit generators", based on Harry Frankfurt's "On Bullshit".
I also tend to wonder about web hoovering as a training scheme, given that a
lot of web content is fiction. And I don't mean "misinformation", I just
mean novels and the like. What happens to an LLM that inhales "The Martian"
or "Ringworld"?
That's probably a lot less harmful than what already happened: more than
one model had to be pulled and deleted (along with the corpus it was
trained on) because its makers had unknowingly hoovered up CSAM, trained
the model on it, and the model was cheerfully spitting that filth back out.
If you blindly hoover up the entire Internet, you're going to find stuff
that you probably don't want to have on your systems.
Kind regards,
Alex.
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison