You know what would be really nice (but nobody is ever going to build)?
Oscilloscope that replaces the ungodly slow USB3/1000BASE-T PC interface port with NVLink.
Forget PCIe and Thunderbolt... 900 GB/s of bandwidth straight from the ADC to my GPU? Sign me up.
-
For comparison... my 16 GHz LeCroy oscilloscope puts out 40 Gsps * 4 channels * 8 bits of raw ADC samples, not counting the flatness corrections done in gateware/firmware.
That's 160 GB/s or 1.28 Tbps of raw samples.
That would even fit in NVLink 2.0 much less the current gen4/5 stuff.
Imagine four channels of 16 GHz bandwidth waveform data straight into a (very large) GPU nonstop... We'd have to do a hell of a lot of optimization to ngscopeclient to keep up and probably add multi-GPU support but it would be so much fun lol.
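The arithmetic above can be sketched in a few lines. The NVLink figures below are nominal aggregate per-GPU bandwidths as commonly quoted for each generation, from memory — treat them as assumptions, not datasheet values:

```python
# Back-of-envelope check: raw ADC output of a 4-channel, 40 Gsps, 8-bit
# scope against nominal aggregate per-GPU NVLink bandwidth (assumed figures).

SAMPLE_RATE = 40e9       # samples/s per channel
CHANNELS = 4
BITS_PER_SAMPLE = 8

bits_per_sec = SAMPLE_RATE * CHANNELS * BITS_PER_SAMPLE
bytes_per_sec = bits_per_sec / 8

print(f"raw stream: {bytes_per_sec / 1e9:.0f} GB/s = {bits_per_sec / 1e12:.2f} Tbps")
# -> raw stream: 160 GB/s = 1.28 Tbps

nvlink_gb_s = {"2.0 (V100)": 300, "3.0 (A100)": 600, "4 (H100)": 900}
for gen, bw in nvlink_gb_s.items():
    verdict = "fits" if bytes_per_sec / 1e9 <= bw else "does not fit"
    print(f"NVLink {gen}: {bw} GB/s -> {verdict}")
```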
-
@azonenberg I'm always kind of wary of silicon manufacturers' proprietary high-speed buses, because they tend to be slightly use-case-specific and don't deal well with edge cases outside that. Anyway, when I saw NVLink my first reaction was "wait, is this HyperTransport, but with expensive modern transceivers?"; it isn't, but my guess is that from a systems perspective you'd be better off going for AMD's Infinity Fabric, which seems to make stronger coherence statements (not sure). But you're mostly only optimizing for NVIDIA GPUs anyway, so that might be a moot point.
-
@funkylab Well, I mean, I would *like* a ludicrously high-bandwidth portable interface, but the vendors aren't building it.
Realistically, I think the best you can do portably today is 100GbE with RoCE.
-
@azonenberg oh, you *can* buy 800 Gb/s interfaces; I don't know what their host sides look like, if any exist for non-network-vendor stuff (this is mostly aggregated-traffic equipment, i.e. linking racks, or DC 1 to DC 2)
-
@funkylab Yeah, exactly. 100G with a normal PCIe interface is available today; I have a 100G pipe to my desk and have saturated it with iperf in benchmarks.
And the NIC has RoCE offload capabilities, although I'm not using it yet.
-
@funkylab You can go all the way up to 800G if you have a host system with PCIe gen6 and sufficiently deep pockets (I do not)
-
@azonenberg so my Tek 11801C, with its ability to connect to an external sampling head array, might present a problem? Oh no wait, it's sampling too slow.
-
@scribblesonnapkins Well, equivalent-time sampling is easy to handle with today's tech because the number of actual samples acquired per second is low.
Equally, a scope that acquires high-speed data and buffers it in memory before processing at a much slower rate is something we can handle today.
But the vision is to be able to do real-time, or at least lower-dead-time, processing at much higher data rates. ThunderScope almost maxes out 10GbE; my vision is to be able to keep up with 25/40/100G eventually.
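As a rough sketch of what "keeping up" means here, this is how many raw 8-bit samples per second fit through each link speed, ignoring Ethernet/IP/RoCE framing overhead (which costs a few percent in practice):

```python
# Rough capacity check: raw 8-bit samples/s that fit through each
# Ethernet link speed, ignoring protocol framing overhead.

BITS_PER_SAMPLE = 8

for gbps in (10, 25, 40, 100):
    gsps = gbps / BITS_PER_SAMPLE  # Gbit/s divided by bits/sample = Gsamples/s
    print(f"{gbps:>3}GbE -> ~{gsps:.2f} Gsps sustained")
```

A ThunderScope-class stream of roughly 1 Gsps at 8 bits is 8 Gbps, which is why it nearly saturates 10GbE; four 40 Gsps channels are far beyond even 100G.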
-
@azonenberg I doubt your pockets will be deep enough for NVLink things involving anything but GPUs
-
@funkylab Oh, I know. NVIDIA doesn't even let you get the NVLink PHY chiplets (the protocol itself is undocumented) unless you have NDAs and a partnership with them, etc.
But I can dream...
-
@azonenberg I was assuming that (given infinite money) you could probably buy an NVIDIA server platform that has network->VRAM piping (I assume this because I presume that's what NVIDIA bought Mellanox for)
-
@funkylab That's where RoCE comes in.
But Ethernet today tops out at 800 Gbps, while the latest NVLink can do 14.4 Tbps.
-
@funkylab NVLink is the fantasy; the actually achievable real-world implementation is to make the scope speak RoCE, put a Mellanox NIC in the client, and RDMA the incoming Ethernet frames straight into VRAM.
But it still has to cross PCIe and gets bottlenecked on that bandwidth.
-
@azonenberg I was trying to make a joke with the first part, "Oh no wait, it's sampling too slow."
But the second part about the ThunderScope is cool.
-
@azonenberg How about CXL4? That claims 242 GB/s and is at least designed for external connectivity.
-
@penguin42 If somebody makes a GPU with CXL, I'll be all over it.
Until then I'm stuck with what I can get my hands on. Realistically, that's PCIe and RoCE.
-
@azonenberg @funkylab I am curious what fields use oscilloscopes at the level you build and test for? I am guessing radio & perhaps medical? I've only ever used them for basic electronics back in the 90s, so the performance of your stuff is just stunning.