A few of the things I've learned in the run-up to taping out our first chip that working with FPGAs had not prepared me for (fortunately, the folks driving the tape-out had done this before and were not surprised):

#1 david_chisnall@infosec.exchange

A few of the things I've learned in the run-up to taping out our first chip that working with FPGAs had not prepared me for (fortunately, the folks driving the tape-out had done this before and were not surprised):

    • There's a lot of analogue stuff on a chip. Voltage regulators, PLLs, and so on all need to be custom-designed for each process. They are expensive to license because they're difficult to design and there are only a handful of companies buying them. The really big companies will design their own in-house, but everyone else needs to buy them. The problem is that 'everyone else' is not actually many people, so the design cost is spread across very few buyers.
    • Design verification (DV) is a massive part of the total cost. This needs people who think about the corner cases in designs. The industry rule of thumb is that you need 2-3 DV engineers per RTL engineer to make sure that the thing you tape out is probably correct. On an FPGA, you can just fix a bug and roll a new bitfile, but with a custom chip you have a long turnaround to fix a bug and a lot of costs. This applies at the block level and at the system level. Things like ISA test suites are a tiny part of this because they're not adversarial. To verify a core, you need to understand the microarchitecture-specific corner cases where things might go wrong and then make sure testing covers them. We aren't using CVA6, but I was talking to someone working on it recently and they had a fun case that DV had missed: if a jump target spanned a page boundary, and one of those pages was not mapped, rather than raising a page fault the core would just fill in 16 random bits and execute a random instruction. ISA tests typically won't cover this; a good DV team would know that anything spanning pages, in all possible configurations of permission and presence (and at all points in speculative execution), is essential for functional coverage (see the sketch after this list).
    • Most of the tools for the backend are proprietary (and expensive, with per-seat, per-year licenses). This includes tools for formal verification. There are open-source tools for formal verification; the proprietary ones are mostly better in their error reporting (if the checks pass, the open-source tools are fine, but if they don't, debugging with them is much harder).
    • A lot of the vendors with bits of IP that you need are really paranoid about it leaking. If you're lucky, you'll end up with things that you can access only from a tightly locked-down chamber system. If not, you'll get a simulator and a basic floorplan and the integration happens later.
    • The back-end layout takes a long time. For FPGAs, you write RTL and you're done. The thing you send to the fab is basically a 3D drawing of what to etch on the chip. The flow from the RTL to that 3D picture is complex and time-consuming.
    • On newer processes, you end up with a load of places where you need to make trade-offs. SRAM isn't just SRAM: there are a bunch of different options with different performance, different leakage current, and so on. These aren't small differences. On 22FDX, the ultra-low-leakage SRAM has 10% of the idle power of the normal one, but is bigger and slower. And this is entirely process-dependent and will change if you move to a new process.
    • A load of things (especially various kinds of non-volatile memory) use additional layers. For small volumes, you put your chip on a shared multi-project wafer with other people's chips. This is nice, but it means that not every kind of layer is available on every run, which restricts which runs you can use.
    • I already knew this from previous projects, but it's worth repeating: the core is the easy bit. There are loads of other places where you can gain or lose 10% performance depending on design decisions (and these add up really quickly: five independent 10% losses compound to 0.9^5, roughly a 40% hit), or where you can accidentally undermine security. The jump from 'we have RTL for a core' to 'we have a working SoC taped out' is smaller than going to that point from a standing start, but it's not much smaller. So don't think 'yay, we have open-source RTL for a RISC-V core!' means 'we can make RISC-V chips easily!'.
    • I really, really, really disapprove of physics. It's just not a good building block for stuff. Digital logic is so much nicer.
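
    To make the page-boundary case concrete, here is a minimal sketch of the kind of directed-test enumeration a DV team might write for jump targets that straddle a page boundary. It's Python rather than a real UVM/SystemVerilog environment, and the names (`StraddleTest`, `expected_outcome`) are hypothetical; the point is that the cross product of page states and split points is what a coverage model has to close, not that this is anyone's actual flow.

    ```python
    from dataclasses import dataclass
    from itertools import product

    # Possible states for each of the two pages a fetch can straddle:
    # executable, mapped-but-not-executable, or not mapped at all.
    PAGE_STATES = ("EXEC", "NO_EXEC", "UNMAPPED")

    @dataclass(frozen=True)
    class StraddleTest:
        first_page: str      # page holding the first bytes of the jump target
        second_page: str     # page holding the remaining bytes
        bytes_on_first: int  # how many bytes of the instruction sit on page one

    def expected_outcome(t: StraddleTest) -> str:
        """Architectural intent (sketch): any non-executable or unmapped page
        touched by the fetch must fault; the core must never 'fill in' the
        missing bits, which is exactly the CVA6 bug described above."""
        if t.first_page != "EXEC":
            return "fetch fault on first page"
        if t.second_page != "EXEC":
            return "fetch fault on second page"
        return "executes normally"

    def generate_tests(insn_bytes: int = 4) -> list[StraddleTest]:
        # Every combination of page states, crossed with every split point.
        return [StraddleTest(a, b, n)
                for a, b, n in product(PAGE_STATES, PAGE_STATES,
                                       range(1, insn_bytes))]

    if __name__ == "__main__":
        for t in generate_tests():
            print(t, "=>", expected_outcome(t))
    ```

    Even this toy cross yields 27 cases for a single instruction length; crossing it with speculation depth and privilege modes is where the 2-3 DV engineers per RTL engineer go.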
#2 dysfun@social.treehouse.systems

      @david_chisnall physics is a royal pain in the arse, actually.

#3 seanwbruno@infosec.exchange

@david_chisnall I really appreciated the two classes in the #UNM Engineering program: VLSI Synthesis (HDL import into Cadence tools) and VLSI Design (Tanner/Siemens L-Edit + SPICE + fab stuff -> tape-out). They gave me my lowest grades in graduate school by a lot, and they were the courses I learned the most from.

        Physics is hard.

#4 aliengasmask@mas.to

@david_chisnall when I learned about metastability at uni I immediately turned my back on hardware. Horrifying to even think about.

I'm very grateful to hardware designers for allowing me to ignore physics.

#5 forestfoxx@infosec.exchange

@david_chisnall Regarding verification, what's your take on hardware fuzzing? It's still a pretty niche topic, but there are a number of groups doing active research in this field and some of the fuzzers are pretty good. They surely won't replace formal verification (although there are also some interesting hybrid approaches), but at least they can find lots of bugs before tape-out in an automated fashion, and without the exorbitant licensing prices of commercial verification software.


#6 trouble@masto.ai

@forestfoxx @david_chisnall I'm unfamiliar with hardware fuzzing, but it is important to try physically sending random invalid commands at a processor to see where it might randomly lock up or literally catch fire. That's from my time working alongside MIPS CPU bringup over 20 years ago.

#7 trouble@masto.ai

@david_chisnall having recently watched the hw team do a very large chip bringup...
* Even when you buy IP, spec sheets can be wrong! In one example, the spec sheet said do a, b, c, then 0 usec later you're done initializing the block. It turned out to be more like 10 usec, and that's why we couldn't bring that interface online for almost a month! (A defensive pattern is sketched below.)
* Debug: what happens when part of your chip doesn't work? You need 2-3 ways to access every sub-block! This is what let us prove the above.
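
As an illustration of the '0 usec vs 10 usec' trap, here is a hedged sketch of the defensive bring-up pattern the first bullet implies: poll a ready bit with a generous timeout instead of trusting the documented delay. `read_reg` and `write_reg` are hypothetical MMIO accessors, and the offsets and values are invented for the example.

```python
import time

READY_BIT = 1 << 0
TIMEOUT_S = 0.01  # vastly longer than any plausible init time

def init_vendor_block(write_reg, read_reg, base: int) -> None:
    """Run the documented a/b/c init sequence, then poll for readiness
    rather than assuming the block is usable '0 usec' later."""
    write_reg(base + 0x00, 0x1)    # step a (offsets/values are invented)
    write_reg(base + 0x04, 0xFF)   # step b
    write_reg(base + 0x08, 0x1)    # step c: kick off initialisation
    deadline = time.monotonic() + TIMEOUT_S
    while not (read_reg(base + 0x0C) & READY_BIT):
        if time.monotonic() > deadline:
            # If this fires, suspect the data sheet before your RTL.
            raise TimeoutError("vendor block never signalled ready")
```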

#8 gtsadmin@wiseowl.club

                  @david_chisnall Congrats on your first tape-out of a chip! That's cool.


#9 david_chisnall@infosec.exchange

                    @forestfoxx

Fuzzing is great, but it needs to be usefully tied to coverage, and that's tricky. For the simple case of fuzzing a CPU, you can fairly trivially generate every 32-bit instruction and feed them all through an RTL simulator, but that will mostly test the same things. You really want to test things like different pipelines, with different timings, with dependent instructions.
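
For example, here is a minimal sketch of what biasing a fuzzer toward dependent instructions might look like; the mnemonics and formats are purely illustrative, not real RISC-V encodings, and this is not any particular tool's approach.

```python
import random

# Toy three-operand mnemonics; formats are illustrative only.
OPS = ("add", "sub", "mul", "xor", "sll")

def gen_sequence(length: int, dep_bias: float = 0.7) -> list[str]:
    """With probability dep_bias, each instruction reads a register that a
    recent predecessor wrote. Uniform random instructions mostly exercise
    independent paths; chains of dependent ops are what stress forwarding,
    stalls, and pipeline-timing corner cases."""
    recent_dests: list[str] = []
    seq = []
    for _ in range(length):
        dst = f"x{random.randrange(1, 32)}"
        if recent_dests and random.random() < dep_bias:
            src = random.choice(recent_dests[-4:])  # prefer a recent writer
        else:
            src = f"x{random.randrange(1, 32)}"
        seq.append(f"{random.choice(OPS)} {dst}, {src}, "
                   f"x{random.randrange(1, 32)}")
        recent_dests.append(dst)
    return seq

if __name__ == "__main__":
    print("\n".join(gen_sequence(8)))
```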

                    Defining a coverage model that you can feed into a fuzzing tool and get useful output is tricky.

                    That said, being able to just throw compute at the problem is a great way of increasing confidence.
