On Mon, Feb 3, 2025 at 12:45 PM Alexander Schreiber via cctalk <cctalk(a)classiccmp.org> wrote:
On Mon, Feb 03, 2025 at 07:08:32PM -0000, Donald Whittemore via cctalk wrote:
On top of that: A lot of those LLMs are built on theft at an epically large
scale. They hoovered up everything in sight (and then some) without even
pretending to care about intellectual property rights - e.g. the NY Times
has taken OpenAI to court because they managed to make the OpenAI LLMs
spit out long verbatim fragments of NY Times content. The hilarious part
is that DeepSeek essentially stole from OpenAI that which OpenAI previously
stole from everyone else, and OpenAI is very angry about the lack of honor
among thieves or something ;-)
My understanding was that OpenAI accused DeepSeek of "distilling" their
model, presumably by making API queries to OpenAI's service. Normally,
though, "distillation" is the process of generating a smaller ("student")
model from a larger ("teacher") model (see the sketch below), except in
this case DeepSeek apparently created something more of a peer to the
teacher. Maybe there was some "veneer" final training, but the basic
assertion of "they stole our work" is probably more a case of OpenAI
trying to control the narrative. Now, whether DeepSeek stole N different
entities' IP, that's a different question. As you said, there is no way
to reproduce the model, so what's on GitHub isn't "open source" in most
people's understanding. Still, it's better than Microsoft/OpenAI, where
the model is "closed" behind an API.
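
For what it's worth, "classic" teacher/student distillation looks roughly
like the toy sketch below (Python/PyTorch, tiny made-up models, nothing to
do with whatever OpenAI or DeepSeek actually run): the student is trained
to match the teacher's softened output distribution. Over a public API you
never see logits, so "distillation" there usually just means collecting
the teacher's text responses and fine-tuning the student on them as
ordinary training data.

# Toy sketch of knowledge distillation: hypothetical models, not any
# vendor's actual setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# A big frozen "teacher" and a small trainable "student".
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature: softens the teacher's output distribution

for step in range(100):
    x = torch.randn(64, 32)      # unlabeled inputs; the teacher's outputs
    with torch.no_grad():        # are the only training signal
        t_logits = teacher(x)
    s_logits = student(x)
    # KL divergence between the softened teacher and student distributions.
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()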