Significant rise in reports (on the Linux Kernel Mailing List) https://lwn.net/Articles/1065620/
Here's something I think we all will have to contend with, whether you're an AIgen enthusiast or not: attacking is easier than defending, and these things don't get tired and they *are* very good at finding exploits. None of us will be able to ignore that, and we will probably have to listen to real genuine reports from them, even if we reject AIgen input.
However, I don't think that's actually the right solution, and I don't think it's sustainable. 🧵
-
The fact of the matter is, most vulnerabilities fall under extremely common patterns, with known solutions:
- Confused deputies: capability security can fix/contain this in many cases, more on that later
- Injection attacks: primarily caused by string templating, using structured templating also fixes this (quasiquote, functional combinators, etc)
- Memory vulnerabilities: solved by memory-safe languages, and yes that includes Rust, but it also includes Python, Scheme/Lisp, etc etc etc
There are other serious vulnerabilities, such as incorrectly written or misused cryptography, but my primary point is: most damage can be either avoided in the first place or contained (especially via capability security for containment)
And... patching AIgen patches is going to get tough and tiring... (cont'd...)
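The structured-templating point is easiest to see with SQL: string interpolation splices untrusted data into the code of the query, while a parameterized query keeps data as data. A minimal sketch using Python's stdlib sqlite3 (the table and input are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"  # classic injection payload

# String templating: the payload becomes part of the query's logic
unsafe = f"SELECT role FROM users WHERE name = '{user_input}'"
print(conn.execute(unsafe).fetchall())  # → [('admin',)] — injection succeeded

# Structured templating: the ? placeholder keeps the payload inert
safe = "SELECT role FROM users WHERE name = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # → [] — matched nothing
```

The same separation of code from data is what quasiquotation and functional combinators give you for HTML, shell commands, and other injection-prone surfaces.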
-
I don't think human reviewers are going to be able to keep up with the number of vulnerabilities we're seeing appear. I really don't. Humans won't be able to review at scale, and I also think that there are serious risks in blindly accepting AIgen patches, which for critical infrastructure could also be a path to *inserting new* vulnerabilities.
We need to attack this systemically.
I have more to say. More later. But that's the gist for now.
-
@cwebber is that the AI that is trained on millions of lines of vulnerabilities
-
@thomasfuchs Yep!
As said, attacking is easier than defending

-
@cwebber - I'd be curious about whether LLMs are better than straight up fuzzing. (My suspicion is not, but people are throwing more resources at the LLM efforts.)
-
@jmax Probably LLMs PLUS fuzzing would be extremely powerful.
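For context on what "straight up fuzzing" means here: at its simplest it's a loop throwing randomized inputs at a target and collecting the ones that crash it (real tools like AFL or libFuzzer add coverage guidance on top). A toy sketch, with a deliberately buggy parser invented for illustration:

```python
import random

def parse_header(data: bytes) -> int:
    """Toy target with a planted bug: trusts a length byte it shouldn't."""
    if len(data) < 2:
        raise ValueError("too short")  # graceful rejection
    declared_len = data[0]
    # IndexError when the body is shorter than the declared length
    return data[1:1 + declared_len][declared_len - 1]

def fuzz(target, rounds=10_000, seed=0):
    rng = random.Random(seed)
    crashes = []
    for _ in range(rounds):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            target(data)
        except ValueError:
            pass  # expected, documented failure mode
        except IndexError:
            crashes.append(data)  # genuine bug surfaced
    return crashes

crashes = fuzz(parse_header)
print(f"found {len(crashes)} crashing inputs")
```

The "LLMs plus fuzzing" idea is roughly: let the model read the code and propose structured, suspicious inputs or harnesses, then let the tireless random loop hammer on them.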
-
@cwebber also: inserting LLMs into everything makes social engineering an attack vector in every place they are inserted!
-
@cwebber My take is that these large models are very good at pattern matching, so let them match patterns. Let them review. *Maybe* let them write fixes as a *suggestion*, with the understanding that these are *tools* which are sometimes wrong.
They need to be trained ethically, built with purpose and with guard rails, but I think it could absolutely be done and done well, enough to outstrip current rules-based review tools, if the grift market wasn't so desperate right now.
-
@cwebber the neat ones fall under langsec; weird machines and the like. More rare and difficult to find/exploit, but I have been wary to see if LLMs can pick up on the patterns that lead to them.
-
@cwebber - Maybe.
-
@cwebber Memory vulnerabilities are also drastically reduced (not entirely precluded, but far less likely) if, when using C, you reject any temptation to have complex object lifetimes and work as much as possible with long-lived, reserved-in-advance storage. The kernel is a horrible offender on getting this wrong.
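The reserved-in-advance idea is language-agnostic even though the advice above is about C: allocate all slots once, recycle them through a free list, and never allocate or free mid-flight. A hypothetical sketch of a fixed-capacity pool (class and method names are invented):

```python
class FixedPool:
    """All storage reserved at construction; acquire/release only recycle slots.

    Nothing is allocated after startup, so there is no mid-flight
    allocation to fail and no ad-hoc free to get wrong -- the moral
    equivalent of long-lived, reserved-in-advance C storage.
    """
    def __init__(self, capacity: int):
        self.slots = [None] * capacity           # storage reserved once
        self.free_list = list(range(capacity))   # indices of unused slots

    def acquire(self, value):
        if not self.free_list:
            raise MemoryError("pool exhausted")  # fail early and predictably
        idx = self.free_list.pop()
        self.slots[idx] = value
        return idx                               # a handle, not a raw pointer

    def release(self, idx):
        self.slots[idx] = None
        self.free_list.append(idx)

pool = FixedPool(2)
a = pool.acquire("request-1")
b = pool.acquire("request-2")
pool.release(a)
c = pool.acquire("request-3")  # recycles slot a; still no new allocation
```

Object lifetimes collapse to "slot in use" vs "slot free", which is exactly the simplification being argued for.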
-
@cwebber If attacking is easier than defending, then the solution is to attack yourself first. Hire an army of bots to attack every surface they can find on your systems, and report them to you before someone else exploits them.
-
@dalias do you know of any book/article/... that would explain or describe how to design a general purpose kernel with that in mind? (I wonder what things like file descriptors, device handlers, etc would look like there!)
-
@cwebber@social.coop we need microkernel-based operating systems with capability-based security enforcement, isolation of components from each other as a baseline assumption, and formal verification of the whole thing at both the code and spec level, and we need all of this quite urgently
-
@cwebber@social.coop things like genode/sculpt are looking more enticing every day that passes by
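The capability-based enforcement these replies point at can be shown in miniature: instead of a service accepting a *name* it resolves with its own ambient privileges (the confused-deputy setup from upthread), the caller hands over an already-opened handle, so designation and permission travel together. A hypothetical sketch (function names invented):

```python
import io

# Ambient-authority style: the service resolves names with ITS privileges.
# A caller can name things it shouldn't reach ("../../etc/shadow") and the
# privileged deputy opens them anyway -- the confused deputy.
def compile_report_ambient(path: str) -> str:
    with open(path) as f:              # deputy's authority, caller's name
        return f.read().upper()

# Capability style: the caller must already HOLD an open, readable stream.
# The service cannot reach anything the caller couldn't open itself.
def compile_report_cap(source: io.TextIOBase) -> str:
    return source.read().upper()       # no ambient lookup anywhere

report = compile_report_cap(io.StringIO("quarterly numbers"))
```

A capability microkernel applies the same discipline system-wide: components get handles to exactly the resources they were granted, and nothing else exists for them.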
-
@dalias For complex objects, the main problem I've identified is the coupling of storage and structure. When storing data in the structure, it's easy to free when you shouldn't and vice-versa. Decoupling storage from structure is the best practice I've learned over the past 15 years, and it's applicable even when you can't reserve your storage in advance.
Storage provisioning is useful, but it's mostly useful with another safety aspect: failing as early as possible and avoiding resource allocation in critical moments.
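One way to read "decouple storage from structure": keep payloads in one flat store with its own free list, and express the structure as indices into that store, so linking and unlinking never decide an object's lifetime on their own. A sketch of an index-linked list over a separate store (all names invented for illustration):

```python
NIL = -1

class Store:
    """Owns the storage: a flat array of payloads plus its own free list."""
    def __init__(self, capacity):
        self.data = [None] * capacity
        self.free = list(range(capacity))
    def alloc(self, value):
        i = self.free.pop()
        self.data[i] = value
        return i
    def dealloc(self, i):
        self.data[i] = None
        self.free.append(i)

class IndexList:
    """Owns only the structure: 'next' links are indices into a Store."""
    def __init__(self, store):
        self.store = store
        self.next = [NIL] * len(store.data)
        self.head = NIL
    def push(self, value):
        i = self.store.alloc(value)
        self.next[i] = self.head
        self.head = i
    def pop(self):
        i = self.head
        self.head = self.next[i]
        value = self.store.data[i]
        self.store.dealloc(i)  # storage release is a separate, explicit step
        return value

store = Store(4)
lst = IndexList(store)
lst.push("a"); lst.push("b")
assert lst.pop() == "b" and lst.pop() == "a"
```

Because unlinking a node and releasing its slot are distinct operations on distinct objects, "freed while still linked" and "linked while already freed" both become visible mistakes instead of silent ones.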
-
@ska @dalias
"decouple storage from structure" is one of the best things but when i first started trying to think about it more, it was hard to wrap my head around how to design things to work like that
i feel like a page of example apis or a book or smth would be very helpful for new folks not familiar with it
-
@cwebber It's a target-rich environment.
I am also seeing cases where maintainers seem to be slamming "accept" on slop PRs and hoping for the best. Could be time pressure or burnout.
-
I'd set aside the formal verification requirement to get the rest of it. I really do think microkernels were the right way to go, it's just that in 1992 or whatever the consumer hardware wasn't up to the task. I think probably around 2005 or so the hardware started to be able to afford to do that. But that's approximately the time that VMs and containers took off. Now we have this giant mess.