i know some people oppose the widespread use of CI on ideological grounds, so i think it's worth thinking about why we value it

noisytoot@berkeley.edu.pl

@whitequark @dalias @wwahammy LPEs are certainly an issue (although they're also an issue for any CI that doesn't use proper VMs), but Nix doesn't just allow any random unprivileged user to configure a substituter, right?

dalias@hachyderm.io

@whitequark @wwahammy OK, but that's the fault of the CI system doing a shallow clone rather than a fully recursive checkout from already-cloned-and-cached repositories. It's the fault of poor abstraction layers that behave as "just do whatever you want to script in this throwaway container" rather than something more structured.

dalias@hachyderm.io

@whitequark @wwahammy I don't see why the archive would need to be stored. Tarballs are fully streamable and the git-archive command emits them as a stream not with temporary storage.
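
(A minimal sketch of the streaming point above, assuming Python's standard `tarfile` and `subprocess` modules. The helper names `iter_tar_members` and `git_archive_members` are hypothetical, and this says nothing about how Forgejo or any other forge actually serves archives. It only shows that a tar stream can be consumed sequentially without writing the archive to disk first.)

```python
import io
import subprocess
import tarfile


def iter_tar_members(stream):
    """Yield (name, size) for each member of a tar stream, read sequentially.

    mode="r|" tells tarfile to treat the input as a non-seekable stream,
    so nothing is buffered to temporary storage by this code.
    """
    with tarfile.open(fileobj=stream, mode="r|") as archive:
        for member in archive:
            yield member.name, member.size


def git_archive_members(repo_path, ref="HEAD"):
    """Run `git archive` and consume its stdout as a stream.

    repo_path and ref are supplied by the caller; this is an illustration,
    not a hardened wrapper (no stderr handling, no exit-code check).
    """
    proc = subprocess.Popen(
        ["git", "-C", repo_path, "archive", "--format=tar", ref],
        stdout=subprocess.PIPE,
    )
    try:
        yield from iter_tar_members(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()
```

(Usage would be something like `for name, size in git_archive_members("/path/to/repo"): ...`, where the path is a placeholder.)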

whitequark@social.treehouse.systems

@dalias @valorzard @wwahammy I think if you have significantly varying amounts of confidence in your main branch there's something wrong with your approach to development, even if non-developers only ever use releases. releases are useful to indicate evolution of the support contract, sure; but if your main branch is sometimes especially wonky because you landed a poorly tested change you should probably test your changes better

whitequark@social.treehouse.systems

@noisytoot @dalias @wwahammy nope. but if you're actively trying to cache intermediate products, you'd have to either allow persistent writes to /nix or allow writes to substituters, both of which seem like they'd allow for cache poisoning (or at least, they don't seem robust enough that I can guarantee absence of it)

whitequark@social.treehouse.systems

@dalias @wwahammy that is how Forgejo works today; the specific externalities that downloading an archive would have over cloning the repo

dalias@hachyderm.io

@whitequark @wwahammy Gotta love how much of a regression all the fancy forges are versus plain cgi-bin cgit...

bms48@mastodon.social

@dalias @whitequark @wwahammy Do you include SourceHut in that analysis? In some ways it's even more minimalist than cgit.

whitequark@social.treehouse.systems

@dalias @wwahammy so I've been responsible for the operation of something more structured for a few years—in my case, a complex Buildbot CI workflow that was updating and building an LLVM/Clang/ARTIQ on a 10 Mbps link (not a typo). I actually did set up the caching system you're talking about here, which used nginx in a forward proxy mode to intercept and store Conda package requests, and it was one of my most nightmarish technical assignments. if I never have to do that again in my life it will be too soon. the correct amount of state in a CI system is zero, because this actually makes it knowable, instead of a bundle of surprises you never know will work from one build to the next because of changes you couldn't predict or track

this doesn't mean that redownloading the same static files over and over is necessary, but the basic principle of "preserve nothing from run to run" is the only way to stay sane

whitequark@social.treehouse.systems

@dalias see, I don't really like talking to you because of your tendency to arrogantly jump to conclusions without ever doing a bare minimum of research

whitequark@social.treehouse.systems

@dalias not "Huh, I wonder why is it that Forgejo does that?" (I don't know but I suspect it has something to do with IO load from repeatedly requested archives), directly to "It's a regression compared to [favorite project]!". it's insufferable

dalias@hachyderm.io

@whitequark @wwahammy TBH if you can't trust your incremental builds to be incremental, that's something I'd want a good CI to test too. 🤪

Like, both preserving artifacts from parent commit, *and* running a new build from scratch, and asserting that the results are byte-for-byte identical.

No, that doesn't sound fun to implement.
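
(A rough sketch of the comparison step described above, assuming both builds have already been run and have left their outputs in two separate directories. The function names and the directory layout are hypothetical. It only covers the "assert byte-for-byte identical" part, not preserving artifacts from the parent commit or driving the builds.)

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_build_outputs(incremental_dir, clean_dir):
    """Return sorted relative paths that are missing on one side or differ in content."""
    incremental = tree_digests(incremental_dir)
    clean = tree_digests(clean_dir)
    return sorted(
        name
        for name in incremental.keys() | clean.keys()
        if incremental.get(name) != clean.get(name)
    )
```

(A CI job could fail when `diff_build_outputs(...)` returns a non-empty list. In practice this check is only meaningful if the build is reproducible to begin with, for example with no embedded timestamps in the outputs.)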

mrdos@hachyderm.io

@whitequark @dalias @wwahammy “10 Mbps link” That's a nice fast UART you've got there!

whitequark@social.treehouse.systems

@MrDOS @dalias @wwahammy this was fiber, believe it or not. the technology caught up with 2010s, the billing... did not

dalias@hachyderm.io

@whitequark If this is a conversation you'd rather I not continue I'm fine with dropping it.

whitequark@social.treehouse.systems

@dalias @wwahammy practically speaking, since most of the traffic is coming from npm/pip/cargo/etc I think you should be able to reduce load on external services without intercepting every network request, but by providing local on-demand caches of popular (thus, expensive to run) repositories. this is unlikely to make much of a difference because the supermajority of the load will continue to come from GitHub, but in a hypothetical world where GitHub implemented this, it would improve things a lot

of course GitHub doesn't care too much because npm traffic should be free for them and I guess they just don't think too much about the rest? gross behavior
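
(A minimal sketch of the "local on-demand cache" idea above, under stated assumptions: the class name `PullThroughCache` and the `fetch` callable are hypothetical, and the cache is only appropriate for immutable artifacts such as versioned package files, not for index pages that change over time. It does not intercept network requests; the caller has to ask the cache explicitly.)

```python
import hashlib
from pathlib import Path


class PullThroughCache:
    """Serve a URL from local disk if present; otherwise fetch it once and store it."""

    def __init__(self, cache_dir, fetch):
        # fetch is any callable taking a URL and returning bytes (an assumption
        # of this sketch; a real deployment would plug in an HTTP client here).
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _path_for(self, url):
        # Key entries by a hash of the URL so any URL maps to a safe file name.
        return self.cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url):
        path = self._path_for(url)
        if path.exists():
            return path.read_bytes()
        data = self.fetch(url)
        # Write to a temporary name and rename, so readers never see a partial file.
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)
        return data
```

(The only thing that persists between runs is the directory of downloaded static files, which can be deleted at any time without changing build results.)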

whitequark@social.treehouse.systems

@dalias no, I would rather like to see you question your assumptions (that other people just don't know how to build software) more often. which I know is a lot more work, but still

dalias@hachyderm.io

@whitequark I mean I feel like it's less of an "assumption" and more of a long history of unpleasant experiences.

whitequark@social.treehouse.systems

@dalias @wwahammy the unfortunate part about being a comparative drop in the bucket is that you could reduce your traffic by 99.9% and nobody on the other end would even notice. in general it doesn't look like a problem that will be solved unless e.g. PyPI starts responding with 429 to requests from Azure's ASN, at which point it will probably be solved quickly

from memory, the latest plan on this was to start charging the biggest bandwidth users, but I'm not sure where that's at. maybe @glyph knows?

whitequark@social.treehouse.systems

@MrDOS @dalias @wwahammy I don't think I have words to adequately describe waiting for Conda to download a build of LLVM you just uploaded there minutes ago... for 90 minutes... then deciding to discard everything it's done and download it again, for some inscrutable dependency solver reasons I could never nail down

I think it may have improved since but it's why I still have a visceral reaction to Conda. it's basically like this

[image: circle with a dot]
