i know some people oppose the widespread use of CI on ideological grounds, so i think it's worth thinking about why we value it

dalias@hachyderm.io

@whitequark @wwahammy Well it's going to have to run on the CI side anyway when the PR is updated.

noisytoot@berkeley.edu.pl

@whitequark @dalias @wwahammy if a malicious Nix builder can poison the cache, wouldn't that mean that a multiuser Nix system is insecure as well, since unprivileged users are allowed to build and install packages?

whitequark@social.treehouse.systems

@dalias @wwahammy I know this is a real problem (PyPI and Rubygems have both considered measures against excessive bandwidth use, mostly by CI services) but I don't think this is the solution; if someone says I should use a CI system where git clone and pip install don't work I would simply consider it defective and pick a different one. and as stated, this seems like it would entirely prevent anything that uses HTTPS to talk to the network (so, basically everything) from working unless every individual tool is going to be upgraded with this system in mind which seems unlikely

whitequark@social.treehouse.systems

@iris_meredith @dalias @wwahammy my entire motivation for building OSS (in the particular way that I do it) comes down to "the industry / the incumbents are making this miserable as fuck so I'll fix it"

see: Vivado, Verilog, etc

whitequark@social.treehouse.systems

@dalias @wwahammy it sounds like you actually want the GitHub product "devcontainers", wherein they instantiate a machine remotely where you can work with the project in a predictable environment and get fast feedback (... but with more git commit -m xxx && git push to it)

dalias@hachyderm.io

@whitequark @wwahammy Why would you ever do a git clone of third-party repos as part of CI? You just need the version you're building with, in which case you can request the archive of that, which can then be content-addressed by its hash. You don't need the entire history which is probably a few orders of magnitude larger.
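As a rough illustration of what "content-addressed by its hash" could look like in practice, here is a minimal sketch. It assumes the archive bytes have already been fetched somehow and that the project pins a SHA-256 digest for them. The function names `store_verified` and `lookup` and the cache layout are hypothetical, not taken from any existing CI system.

```python
import hashlib
from pathlib import Path
from typing import Optional


def store_verified(data: bytes, expected_sha256: str, cache_dir: Path) -> Path:
    """Store `data` under its SHA-256 digest, refusing anything that does not match the pin."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / actual
    if not path.exists():
        # Identical content always maps to the same file name, so a later
        # build can reuse it without asking the upstream server again.
        path.write_bytes(data)
    return path


def lookup(expected_sha256: str, cache_dir: Path) -> Optional[bytes]:
    """Return cached bytes for a digest, re-checking the hash on the way out."""
    path = cache_dir / expected_sha256
    if not path.exists():
        return None
    data = path.read_bytes()
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        # Treat a corrupted or tampered entry as a cache miss.
        return None
    return data
```

Re-hashing on lookup is a deliberate choice in this sketch: the cache is only trusted as far as the pinned digest, which bears on the cache-poisoning concern raised elsewhere in the thread.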

valorzard@mastodon.gamedev.place

@dalias @whitequark @wwahammy ok so you're not against CI/CD, you're just against GitHub Actions specifically.

What would you recommend instead?

whitequark@social.treehouse.systems

@noisytoot @dalias @wwahammy I was thinking about "substituters". as far as I'm aware nothing stops you from editing the stuff in the Nix store if you have the right privileges (directly or via a service) and it's pretty hard to detect if it's ever done, therefore I wouldn't rely just on Nix to prevent cache poisoning (especially in light of regularly dropping Linux LPEs)

whitequark@social.treehouse.systems

@dalias @wwahammy most of the time? because it's a submodule. sometimes a recursive submodule.

github's default actions/checkout does a shallow clone (which is just as efficient), but some packages do actually look at their own history in order to give accurate git-describe results or turn git distance numbers into version numbers. your workflow isn't my workflow
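To make the "git distance numbers into version numbers" point concrete, here is a small sketch of how a package might derive its version from `git describe` output, which normally has the form `<tag>-<distance>-g<abbreviated commit>` or just `<tag>` when the checkout sits exactly on a tag. The function name `version_from_describe` and the `.postN+gHASH` output scheme are assumptions made for illustration. Real projects use a variety of schemes. Since the nearest tag is usually not reachable in a depth-1 clone, there would typically be nothing for this to parse there.

```python
import re

# Greedy tag match, so tags that themselves contain hyphens still parse.
_DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<distance>\d+)-g(?P<commit>[0-9a-f]+)$")


def _strip_v(tag: str) -> str:
    """Drop a single leading 'v' from tags like 'v1.2.3'."""
    return tag[1:] if tag.startswith("v") else tag


def version_from_describe(describe: str) -> str:
    """Turn `git describe` output into a version string (hypothetical scheme)."""
    text = describe.strip()
    match = _DESCRIBE.match(text)
    if match is None:
        # Exactly on a tag: git describe prints the tag name alone.
        return _strip_v(text)
    tag = _strip_v(match.group("tag"))
    distance = int(match.group("distance"))
    if distance == 0:
        # `git describe --long` reports distance 0 when on the tag itself.
        return tag
    return f"{tag}.post{distance}+g{match.group('commit')}"
```

For example, `version_from_describe("v1.2.3-14-gabc1234")` returns `"1.2.3.post14+gabc1234"`.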

dalias@hachyderm.io

@valorzard @whitequark @wwahammy Well I'm against a number of standard CI/CD practices that are harmful to parties not even involved in the project using the CI/CD.

I don't have a specific recommendation for something I haven't wanted to use. I don't think the whole purpose of CI/CD is that important because I don't think we should be expecting non-developers to be using a continuous rolling main branch rather than discrete releases the maintainers have confidence in. If other people want to do that, fine, but finding the right tooling to do it without externalities impacting others is on them not me.

whitequark@social.treehouse.systems

@dalias @wwahammy also I'm pretty sure that at least with Forgejo, it takes fewer resources to do a git shallow clone than it takes to download an archive of a commit (because the archive needs to be generated and then stored, and all of them are fully denormalized, while git does some sort of optimization with pack files I think?)

noisytoot@berkeley.edu.pl

@whitequark @dalias @wwahammy LPEs are certainly an issue (although they're also an issue for any CI that doesn't use proper VMs), but Nix doesn't just allow any random unprivileged user to configure a substituter, right?

dalias@hachyderm.io

@whitequark @wwahammy OK, but that's the fault of the CI system doing a shallow clone rather than a fully recursive checkout from already-cloned-and-cached repositories. It's the fault of poor abstraction layers that behave as "just do whatever you want to script in this throwaway container" rather than something more structured.

dalias@hachyderm.io

@whitequark @wwahammy I don't see why the archive would need to be stored. Tarballs are fully streamable and the git-archive command emits them as a stream not with temporary storage.
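A sketch of the streaming idea: the consumer reads the tar stream in chunks and hashes it as it arrives, so nothing has to be written to disk on the reading side. This only illustrates the `git archive` command-line behaviour described above and says nothing about how any particular forge implements archive downloads. The helper names `sha256_of_stream` and `sha256_of_git_archive` are hypothetical.

```python
import hashlib
import subprocess
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Hash a binary stream incrementally, holding one chunk in memory at a time."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_git_archive(repo_path: str, rev: str) -> str:
    """Hash the tar stream that `git archive` writes to stdout for `rev`."""
    proc = subprocess.Popen(
        ["git", "-C", repo_path, "archive", "--format=tar", rev],
        stdout=subprocess.PIPE,
    )
    assert proc.stdout is not None
    try:
        result = sha256_of_stream(proc.stdout)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise RuntimeError(f"git archive exited with status {returncode}")
    return result
```

The same `sha256_of_stream` helper works on any file-like object, for instance an HTTP response body, which is what would let a downloaded archive be checked against a pinned digest without buffering it whole.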

whitequark@social.treehouse.systems

@dalias @valorzard @wwahammy I think if you have significantly varying amounts of confidence in your main branch there's something wrong with your approach to development, even if non-developers only ever use releases. releases are useful to indicate evolution of the support contract, sure; but if your main branch is sometimes especially wonky because you landed a poorly tested change you should probably test your changes better

whitequark@social.treehouse.systems

@noisytoot @dalias @wwahammy nope. but if you're actively trying to cache intermediate products, you'd have to either allow persistent writes to /nix or allow writes to substituters, both of which seem like they'd allow for cache poisoning (or at least, they don't seem robust enough that I can guarantee absence of it)

whitequark@social.treehouse.systems

@dalias @wwahammy that is how Forgejo works today; those are the specific externalities that downloading an archive would have over cloning the repo

dalias@hachyderm.io

@whitequark @wwahammy Gotta love how much of a regression all the fancy forges are versus plain cgi-bin cgit...

bms48@mastodon.social

@dalias @whitequark @wwahammy Do you include SourceHut in that analysis? In some ways it's even more minimalist than cgit.

whitequark@social.treehouse.systems

@dalias @wwahammy so I've been responsible for the operation of something more structured for a few years—in my case, a complex Buildbot CI workflow that was updating and building an LLVM/Clang/ARTIQ on a 10 Mbps link (not a typo). I actually did set up the caching system you're talking about here, which used nginx in a forward proxy mode to intercept and store Conda package requests, and it was one of my most nightmarish technical assignments. if I never have to do that again in my life it will be too soon. the correct amount of state in a CI system is zero, because this actually makes it knowable, instead of a bundle of surprises you never know will work from one build to the next because of changes you couldn't predict or track

this doesn't mean that redownloading the same static files over and over is necessary, but the basic principle of "preserve nothing from run to run" is the only way to stay sane
