• brucethemoose@lemmy.world · 10 hours ago

    Funny thing is ‘Local ML’ tinkerers largely can’t afford GPUs in the US either.

    The 5090 is ludicrously expensive for its VRAM pool. So is the 4090, which is all but out of stock. Nvidia will only sell you a decent-sized pool for $10K. Hence non-techbros here have either been building used RTX 3090 boxes (the last affordable compute GPU Nvidia ever sold) or EPYC homelabs for CPU offloading (rough sketch below), or have been trying to buy back those modded 48GB 4090s.
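
    For the EPYC/CPU-offload route, the rough shape is something like this with a Hugging Face transformers + accelerate stack (the model name and memory caps are made-up placeholders, not a recommendation):

    ```python
    # Hedged sketch: split a big model between one ~24GB GPU and system RAM.
    # Needs transformers + accelerate; accelerate places layers per max_memory.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "some-org/some-70b-model"  # placeholder repo name

    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",                          # let accelerate decide placement
        max_memory={0: "22GiB", "cpu": "200GiB"},   # 3090 minus headroom, plus EPYC RAM
        torch_dtype="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    ```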

    The insane supply chain is something like this:

    • Taiwan GPUs -> China

    • China GPU boards -> US

    • US GPU boards -> smuggled back into China

    • De-neutered GPU boards -> sold back to the US

    All because Nvidia is playing VRAM cartel and AMD, inexplicably, is uninterested in competing with it, even though it could sell 48GB 7900s for barely any extra cost.

    • NuXCOM_90Percent@lemmy.zip · 3 hours ago

      The issue is that nVidia are increasingly marketing their consumer-grade GPUs to “prosumer” users, whether that is small research groups working with “AI”, people farming the latest memecoin, or… the other things you would need REALLY REALLY high-bandwidth linear algebra for, and let’s move on.

      Whereas AMD are actually still targeting that consumer market. I think it was the nVidia 40xx generation where their consumer cards had like no memory at all and AMD were pumping out 16 GB on their cheap(-ish) models? My brain can only remember card generations while I am actively shopping and… yeah.

      And yeah. I would LOVE an AMD card with 32 or even 64 GB of even slower memory. But games are still going to target nVidia because people keep buying it, and that means you just won’t have much use for anything beyond the 8 (or apparently now 16) GB that nVidia are going to let you buy. At which point… why waste money?

      As for the prosumer and enterprise space? nVidia… have a long history of being assholes, and previous GN videos have talked about the behind-the-scenes pressure they allegedly apply to system integrators and the like. And I will leave that there for Reasons.

      But yes, many mid-tier and even high-tier companies could benefit from just buying AMD cards, and there is very much a market for “high end” AMD cards… it is just that they have too few customers to make it worthwhile.

    • interdimensionalmeme@lemmy.ml · 6 hours ago

      It’s no mystery why AMD is not breaking rank on VRAM and vGPU; it’s the same reason as with failing Intel.
      But the reason is not mentionable in polite company, and it relates to why AMD exists at all.
      By all accounts, AMD should have gone under decades ago; instead they’re one of the only x86 platform licensees, and they got that basically because of a fluke of history.

      But here’s the real deal: because of the regulatory environment, monopolies are technically illegal. Of course, since the 1980s enforcement of that has been a total joke, as proven by the failure of the Microsoft anti-trust case. Anti-trust is currently neutered, and even back then it wasn’t really “anti-trust”, merely anti-monopoly. That is the playbook for Intel and Nvidia that allows AMD to continue existing. They exist so that the other two aren’t monopolies. They are kept alive as long as some market segments have some competition, while others are simply not touched by Intel and Nvidia, dynamically decided, kind of like splitting territory, so that AMD can always survive.

      So that Intel and Nvidia don’t become actually illegal monopolies. This is a very conservative playbook, since from the 90s on they could very well have become monopolies and the neolibs wouldn’t have squeaked. They would have cheered!

      That’s why you’re not getting amounts of VRAM and vGPU from AMD that would actually threaten Nvidia, and why they’re dropping support for their 2018 datacenter GPUs the moment those become a little too good a deal on the used market.

      The solution is simple: destroy nvidia. Not a calculated and gentle trust bust, no, break the company so it stops existing as a coherent entity.

      Nvidia is a bunch of software and PNGs on top of TSMC. Break Nvidia, break Intel, break microsoft, break cisco, broadcom, break everything in silicon valley and put it all in a blender. And if anything grows too big again, break it the duck up again.

    • PalmTreeIsBestTree@lemmy.world · 9 hours ago

      You could also buy an Apple Mac Studio with its large amount of unified RAM for a similar price to a 5090. Of course it’s not as fast, but it could run a model that needs more RAM.

      • brucethemoose@lemmy.world · 8 hours ago

        The pricing for memory is still pretty bad: $4K for 96GB, $5.6K for 256GB, $10K for 512GB. One can get 128GB on the M4 Max for $3.5K, at the cost of a narrower bus (so it’s even slower), but generally EPYC + a 3090 or 4090 makes a lot more sense.
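
        To put those numbers in $/GB terms (just the prices quoted above, nothing current):

        ```python
        # Rough $/GB of unified memory for the configs mentioned above.
        # Prices are the ballpark figures from this thread, not current retail.
        options = {
            "Mac Studio 96GB": (4000, 96),
            "Mac Studio 256GB": (5600, 256),
            "Mac Studio 512GB": (10000, 512),
            "M4 Max 128GB": (3500, 128),
        }
        for name, (usd, gb) in options.items():
            print(f"{name}: ~${usd / gb:.0f}/GB")
        ```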

        SOTA quantization for these is mostly DIY. There aren’t many MLX DWQs or trellis-quantized GGUFs floating around.
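
        The DIY route on the MLX side looks roughly like the sketch below. This is going from memory on mlx-lm’s convert API (treat the exact names as an assumption and check the docs), and it’s plain 4-bit quantization, not DWQ, which needs an extra distillation pass on top:

        ```python
        # Rough sketch, assuming mlx-lm's convert() API; verify names against the docs.
        from mlx_lm import convert

        convert(
            "some-org/some-70b-model",   # placeholder HF repo
            mlx_path="model-4bit-mlx",   # output directory
            quantize=True,               # plain affine quantization (not DWQ)
            q_bits=4,
            q_group_size=64,
        )
        ```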

        But if you want to finetune or tinker instead of just running models, you’re at an enormous disadvantage there. AMD’s Strix Halo boards are way more compatible, but not standalone yet and kinda rare at this point.