i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original
-
@ireneista @GeoffWozniak based on a discussion with someone who has worked on this problem before we want to try building a diffusion model that captures the whitespace between code tokens and is then able to inject it into a given parsetree, which appears to be a fairly efficient and unproblematic way to do this
@ireneista @GeoffWozniak and everything that is best done on a parsetree (import ordering for example) will be done in the parsetree because it ain't broken
-
@ireneista @GeoffWozniak and everything that is best done on a parsetree (import ordering for example) will be done in the parsetree because it ain't broken
@whitequark @GeoffWozniak yeah this is a recurring research topic for us, we've talked with several of our friends about it over the years. just making a parser/generator that properly round-trip whitespace and comments is already a ton of work, alas...
-
@ireneista @GeoffWozniak and everything that is best done on a parsetree (import ordering for example) will be done in the parsetree because it ain't broken
@whitequark @ireneista This sounds a lot like XSLT (or XSLT-adjacent).
-
@whitequark @GeoffWozniak yeah this is a recurring research topic for us, we've talked with several of our friends about it over the years. just making a parser/generator that properly round-trip whitespace and comments is already a ton of work, alas...
@ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)
-
@ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)
@ireneista @GeoffWozniak my literal first Python project was making a Python parser that fully captures source spans (which wasn't upstream at the time--in 2014 or so), so i'm quite familiar with the topic by now

-
@GeoffWozniak @ireneista I view code as art so I find strongly canonicalizing formatters like
blackto be actively destructive. right now I use Ruff with a 300-line configuration for some of the Python code and I think there's gotta be a better way to approach this that isn't destructive@whitequark @ireneista I very much respect that.
I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.
-
@whitequark @ireneista I very much respect that.
I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.
@GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of contex. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example
-
@GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of contex. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example
@whitequark @ireneista Well, I do have limits.
In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.
-
@whitequark @ireneista Well, I do have limits.
In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.
@GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually
-
@GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually
@GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where
.gotsection got somehow slightly unaligned from_GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again -
@GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually
@whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.
However, I never use it as a style in anything else, though.
-
@GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where
.gotsection got somehow slightly unaligned from_GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again@whitequark @ireneista I was in this wonderousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.
So perhaps I have lost any hope of making art.
-
@whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.
However, I never use it as a style in anything else, though.
@GeoffWozniak @ireneista yeah I mean I've submitted binutils patches while I was employed there, and for all the dislike I have for that code style it was so far down the list of bad things about that job that it didn't even register
-
@whitequark @ireneista I was in this wonderousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.
So perhaps I have lost any hope of making art.
@GeoffWozniak @ireneista yeah I have regretfully seen libbfd
-
i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original
the "ideal" (their choice of words) case is 64.2%
@whitequark do the thing. Science the shit out of it.
-
i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original
the "ideal" (their choice of words) case is 64.2%
@whitequark 64.2% of the time, it works every time!
-
@GeoffWozniak @ireneista yeah I have regretfully seen libbfd
@whitequark @ireneista Sorry, I probably should have put a CW on that.
-
@static one of my motivations for this is that there are linters popular in the Python ecosystem and i really don't like how they work, haha
-
i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original
the "ideal" (their choice of words) case is 64.2%
@whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope. -
@whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope.
@csolisr i did a double take