i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systemsW whitequark@social.treehouse.systems

    @ireneista @GeoffWozniak based on a discussion with someone who has worked on this problem before we want to try building a diffusion model that captures the whitespace between code tokens and is then able to inject it into a given parsetree, which appears to be a fairly efficient and unproblematic way to do this

    whitequark@social.treehouse.systems
    #83

    @ireneista @GeoffWozniak and everything that is best done on a parsetree (import ordering for example) will be done in the parsetree because it ain't broken

      ireneista@adhd.irenes.space
      #84

      @whitequark @GeoffWozniak yeah this is a recurring research topic for us, we've talked with several of our friends about it over the years. just making a parser/generator that properly round-trips whitespace and comments is already a ton of work, alas...
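
A minimal sketch of what "round-tripping whitespace and comments" means at the token level, assuming Python's standard `tokenize` module stands in for a real parser. The helper names `split_tokens_and_gaps` and `rebuild` are hypothetical and are not taken from the thread or from any tool mentioned in it. The idea is to keep the text between tokens as a separate channel, so that it can be stored (or replaced) independently of the tokens and then stitched back in.

```python
import io
import tokenize


def split_tokens_and_gaps(source):
    """Split source into (gap, token_text) pairs plus trailing text.

    gap is whatever sits between the previous token and this one
    (spaces, backslash continuations); comments arrive as tokens.
    """
    # Absolute offset at which each line starts; only "\n" ends a line.
    line_starts = [0]
    for line in io.StringIO(source).readlines():
        line_starts.append(line_starts[-1] + len(line))

    def offset(pos):
        row, col = pos  # tokenize rows are 1-based, columns are 0-based
        return min(line_starts[row - 1] + col, len(source))

    pairs = []
    cursor = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        start = max(offset(tok.start), cursor)
        end = max(offset(tok.end), start)
        pairs.append((source[cursor:start], source[start:end]))
        cursor = end
    return pairs, source[cursor:]


def rebuild(pairs, trailing):
    """Concatenate gaps and token texts back into the original source."""
    return "".join(gap + text for gap, text in pairs) + trailing


source = "def f( a,b ):\n    # keep me\n    return a+  b\n"
pairs, trailing = split_tokens_and_gaps(source)
assert rebuild(pairs, trailing) == source
```

This only covers the lexical level. Attaching the gaps and comments to the right nodes of a parse tree, and keeping them attached while the tree is edited, is the part that takes real work.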

        geoffwozniak@masto.hackers.town
        #85

        @whitequark @ireneista This sounds a lot like XSLT (or XSLT-adjacent).

          whitequark@social.treehouse.systems
          #86

          @ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)

            whitequark@social.treehouse.systems
            #87

            @ireneista @GeoffWozniak my literal first Python project was making a Python parser that fully captures source spans (which wasn't upstream at the time--in 2014 or so), so i'm quite familiar with the topic by now 😛
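
For reference, a small sketch of what source spans look like in the standard library today, assuming Python 3.8 or later, where `ast` nodes carry `end_lineno` and `end_col_offset` and `ast.get_source_segment` exists. The helper `node_spans` is a hypothetical name used only for illustration and has nothing to do with the 2014 project mentioned above.

```python
import ast


def node_spans(source):
    """Return (node type, start line, start col, end line, end col, text)
    for every AST node that carries location information."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        # get_source_segment returns None for nodes without a location
        # (Module, operators such as Add, contexts such as Store).
        text = ast.get_source_segment(source, node)
        if text is not None:
            spans.append((
                type(node).__name__,
                node.lineno, node.col_offset,          # columns are UTF-8 byte offsets
                node.end_lineno, node.end_col_offset,
                text,
            ))
    return spans


source = "total = price * (1 + tax)\n"
for span in node_spans(source):
    print(span)
```

Note that the span of the inner `BinOp` covers `1 + tax` without the surrounding parentheses, which is one reason spans alone are not enough to reproduce the original text.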

            • whitequark@social.treehouse.systemsW whitequark@social.treehouse.systems

              @GeoffWozniak @ireneista I view code as art so I find strongly canonicalizing formatters like black to be actively destructive. right now I use Ruff with a 300-line configuration for some of the Python code and I think there's gotta be a better way to approach this that isn't destructive

              geoffwozniak@masto.hackers.town
              #88

              @whitequark @ireneista I very much respect that.

              I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.

                whitequark@social.treehouse.systems
                #89

                @GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of context. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example

                  geoffwozniak@masto.hackers.town
                  #90

                  @whitequark @ireneista Well, I do have limits.

                  In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.

                    whitequark@social.treehouse.systems
                    #91

                    @GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually

                      whitequark@social.treehouse.systems
                      #92

                      @GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where .got section got somehow slightly unaligned from _GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again

                        geoffwozniak@masto.hackers.town
                        #93

                        @whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.

                        I never use it as a style in anything else, though.

                          geoffwozniak@masto.hackers.town
                          #94

                          @whitequark @ireneista I was in this wondrousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.

                          So perhaps I have lost any hope of making art.

                          sourceware.org Git - binutils-gdb.git/blob - bfd/elf-bfd.h

                            whitequark@social.treehouse.systems
                            #95

                            @GeoffWozniak @ireneista yeah I mean I've submitted binutils patches while I was employed there, and for all the dislike I have for that code style it was so far down the list of bad things about that job that it didn't even register

                              whitequark@social.treehouse.systems
                              #96

                              @GeoffWozniak @ireneista yeah I have regretfully seen libbfd

                              • whitequark@social.treehouse.systemsW whitequark@social.treehouse.systems

                                i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

                                the "ideal" (their choice of words) case is 64.2%
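
For contrast, a hedged sketch of a deterministic check that a whitespace-only reformatter for Python could run instead of relying on a measured rate: parse both versions and compare the trees. This uses only the standard `ast` module, `same_ast` is a hypothetical helper name, and it is not the A_c quantity from the paper, whose definition is not given in this thread.

```python
import ast


def same_ast(original, reformatted):
    """True when both sources parse to the same abstract syntax tree.

    ast.dump omits line and column attributes by default, so pure
    layout changes compare equal while any token change does not.
    """
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


before = "x = [1,2,\n     3]\n"
after = "x = [1, 2, 3]\n"
assert same_ast(before, after)                     # layout only: same tree
assert not same_ast("y = 1 + 2\n", "y = 1 - 2\n")  # an operator changed
```

The check is conservative: equal trees imply the same program, but comments are not part of the tree, so it says nothing about whether they survived.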

                                netraven@hear-me.social
                                #97

                                @whitequark do the thing. Science the shit out of it.

                                  lizardbill@hachyderm.io
                                  #98

                                  @whitequark 64.2% of the time, it works every time!

                                    geoffwozniak@masto.hackers.town
                                    #99

                                    @whitequark @ireneista Sorry, I probably should have put a CW on that.

                                      whitequark@social.treehouse.systems
                                      #100

                                      @static one of my motivations for this is that there are linters popular in the Python ecosystem and i really don't like how they work, haha

                                        csolisr@hub.azkware.net
                                        #101
                                        @whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope.
                                          whitequark@social.treehouse.systems
                                          #102

                                          @csolisr i did a double take
