US chip-maker Nvidia led a rout in tech stocks Monday after the emergence of a low-cost Chinese generative AI model that could threaten US dominance in the fast-growing industry.

The chatbot developed by DeepSeek, a startup based in the eastern Chinese city of Hangzhou, has apparently matched the capabilities of US AI pace-setters for a fraction of the investment made by American companies.

Shares in Nvidia, whose semiconductors power the AI industry, fell more than 15 percent in midday deals on Wall Street, erasing more than $500 billion of its market value.

The tech-rich Nasdaq index fell more than three percent.

AI players Microsoft and Google parent Alphabet were firmly in the red while Meta bucked the trend to trade in the green.

DeepSeek, whose chatbot became the top-rated free application on Apple’s US App Store, said it spent only $5.6 million developing its model – peanuts when compared with the billions US tech giants have poured into AI.

US “tech dominance is being challenged by China,” said Kathleen Brooks, research director at trading platform XTB.

“The focus is now on whether China can do it better, quicker and more cost-effectively than the US, and if they could win the AI race,” she said.

US venture capitalist Marc Andreessen has described DeepSeek’s emergence as a “Sputnik moment” – when the Soviet Union shocked Washington with its 1957 launch of a satellite into orbit.

As DeepSeek rattled markets, the startup on Monday said it was limiting the registration of new users due to “large-scale malicious attacks” on its services.

Meta and Microsoft are among the tech giants scheduled to report earnings later this week, offering opportunity for comment on the emergence of the Chinese company.

Shares in another US chip-maker, Broadcom, fell 16 percent while Dutch firm ASML, which makes the machines used to build semiconductors, saw its stock tumble 6.7 percent.

“Investors have been forced to reconsider the outlook for capital expenditure and valuations given the threat of discount Chinese AI models,” said David Morrison, senior market analyst at Trade Nation.

“These appear to be as good, if not better, than US versions.”

Wall Street’s broad-based S&P 500 index shed 1.7 percent while the Dow was flat at midday.

In Europe, the Frankfurt and Paris stock exchanges closed in the red while London finished flat.

Asian stock markets mostly slid.

Just last week following his inauguration, US President Donald Trump announced a $500 billion venture to build infrastructure for AI in the United States led by Japanese giant SoftBank and ChatGPT-maker OpenAI.

SoftBank tumbled more than eight percent in Tokyo on Monday while Japanese semiconductor firm Advantest was also down more than eight percent and Tokyo Electron off almost five percent.

  • UnderpantsWeevil@lemmy.world · 3 days ago

    The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.

      • Not_mikey@slrpnk.net · 3 days ago

        It’s even worse/funnier in the app: it will generate the response, then once it realizes it’s about Taiwan it will delete the whole response and say sorry, I can’t do that.

        If you ask it “what is the Republic of China”, it will generate a couple of paragraphs on the history of China, then it’ll get a couple of sentences in about the retreat to Taiwan and then stop and delete the response.

        • Womble@lemmy.world · 3 days ago

          In fairness, that is also exactly what ChatGPT, Claude and the rest do for their online versions too when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if it thinks the response has gone over the line.
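The monitor-and-cancel loop described in that comment can be sketched roughly like this; every name here (`fake_generator`, `fake_moderator`, `moderated_stream`) is a hypothetical stand-in for illustration, not any vendor's actual pipeline:

```python
def fake_generator(prompt):
    """Stand-in for a streaming LLM: yields a reply token by token."""
    yield from "Here is some history about a forbidden topic".split()

def fake_moderator(text_so_far):
    """Stand-in for the second 'monitor' LLM: True means over the line."""
    return "forbidden" in text_so_far.lower()

def moderated_stream(prompt, generate, moderate):
    """Stream a response, re-checking the accumulated text after every
    token; if the monitor flags it, retract everything shown so far."""
    shown = []
    for token in generate(prompt):
        shown.append(token)
        if moderate(" ".join(shown)):
            # The delete-and-apologize path users see in the apps.
            return "Sorry, I can't do that."
    return " ".join(shown)
```

The retract-on-flag branch is what produces the “generate, then delete the whole response” behaviour: the text has already streamed to the screen before the monitor catches up and cancels it.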

          • JasSmith@sh.itjust.works · 2 days ago

            Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.

        • Womble@lemmy.world · 3 days ago

          If they wanted to make a point, maybe they should have been clearer instead of saying people were joking about it doing something that it actually does.

          • Bronzebeard@lemm.ee · 2 days ago

            People caring more about “China bad” than about what the tech they made can actually do is the issue.

            You needing this explicitly spelled out for you does not help the case.

            • ikt@aussie.zone · 2 days ago

              ngl I’m still confused

              what the tech they made can actually do

              It’s AI, it does AI things. Is it because China can now do the things we do (coding, development, search queries, etc.) just as well as America that it’s a problem?

              • Bronzebeard@lemm.ee · 1 day ago

                It has nothing to do with it being China. They just figured out how to do it more efficiently and with lower-powered chips, meaning Nvidia’s market dominance in high-end chips, which let them charge whatever they wanted, just had its legs cut out from under it. If you don’t need as many chips to run AI, Nvidia won’t sell as many.

                • ikt@aussie.zone · 1 day ago

                  So the idea with this comment:

                  The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.

                  is that people have misplaced their concern: not at the fact that it’s censored, but that the US has lost the technological high ground and won’t get it back for at least a generation?

        • ikt@aussie.zone · 3 days ago

          I’m slow, what’s the point? How does people joking about the fact that China is censoring output explain

          why the US tech sector is absolutely fucked going into the next generation

          • 小莱卡@lemmygrad.ml · 2 days ago

            Because they care more about the model not parroting US State Dept narratives than about the engineering behind it.

      • Smokeydope@lemmy.world · 3 days ago

        Try an abliterated version of the Qwen 14b or 32b R1 distills. I just tried it out; they will give you a real overview.

        Still, even when abliterated it’s just not very knowledgeable about “harmful information”. If you want a truly uncensored model, hit up Mistral Small 22b and its even more uncensored fine-tune, Beepo 22b.

      • Scolding7300@lemmy.world · 3 days ago

        That’s just dumb. It at least doesn’t suppress that when provided with search results, or it refuses to search (at least when integrated in Kagi).

        • Womble@lemmy.world · 3 days ago

          ??? You don’t use training data when running models; that’s what’s used to train them.

            • Womble@lemmy.world · 3 days ago

              Wow ok, you really don’t know what you’re talking about, huh?

              No, I don’t have thousands of almost top of the line graphics cards to retrain an LLM from scratch, nor the millions of dollars to pay for electricity.

              I’m sure someone will, and I’m glad this has been open-sourced; it’s a great boon. But that’s still no excuse to sweep under the rug blatant censorship of topics the CCP don’t want to be talked about.

              • UnderpantsWeevil@lemmy.world · 3 days ago

                No I don’t have thousands of almost top of the line graphics cards to retrain an LLM from scratch

                Fortunately, you don’t need thousands of top of the line cards to train the DeepSeek model. That’s the innovation people are excited about. The model improves on the original LLM design to reduce time to train and time to retrieve information.

                Contrary to common belief, an LLM isn’t just a fancy Wikipedia. It’s a schema for building out a graph of individual pieces of data, attached to a translation tool that turns human-language inputs into graph-search parameters. If you put facts about Tiananmen Square in 1989 into the model, you’ll get them back as results through the front-end.

                You don’t need to be scared of technology just because the team that introduced the original training data didn’t configure this piece of open-source software the way you like it.

                that’s still no excuse to sweep under the rug blatant censorship of topics the CCP don’t want to be talked about.

                Wow ok, you really don’t know what you’re talking about, huh?

                • Womble@lemmy.world · 3 days ago

                  https://www.analyticsvidhya.com/blog/2024/12/deepseek-v3/

                  Huh, I guess 6 million USD is not millions, eh? The innovation is that it’s comparatively cheap to train compared to the billions OpenAI et al. are spending (and that is with the acquisition of thousands of H800s not included in the cost).

                  Edit: just realised that was for the wrong model! But R1 was trained on the same budget: https://x.com/GavinSBaker/status/1883891311473782995?mx=2

                    • UnderpantsWeevil@lemmy.world · 3 days ago

                    The innovation is that it’s comparatively cheap to train compared to the billions

                    Smaller builds with less comprehensive datasets take less time and money. Again, this doesn’t have to be encyclopedic. You can train your model entirely on a small sample of material detailing historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.

                    • JasSmith@sh.itjust.works · 2 days ago

                    Because the parent comment by Womble is about using the Chinese hosted DeepSeek app, not hosting the model themselves. The user above who responded either didn’t read the original comment carefully enough, or provided a very snarky response. Neither is particularly endearing.