i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systems

    @ireneista @GeoffWozniak and everything that is best done on a parsetree (import ordering for example) will be done in the parsetree because it ain't broken

    geoffwozniak@masto.hackers.town
    wrote last edited by
    #85

    @whitequark @ireneista This sounds a lot like XSLT (or XSLT-adjacent).

    • ireneista@adhd.irenes.space

      @whitequark @GeoffWozniak yeah this is a recurring research topic for us, we've talked with several of our friends about it over the years. just making a parser/generator that properly round-trips whitespace and comments is already a ton of work, alas...
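
To make the round-trip problem concrete, here is a minimal sketch using only Python's standard `ast` module (Python 3.9 or newer for `ast.unparse`). It is not a tool from this thread, just an illustration that a plain parse-then-generate cycle keeps the meaning but drops comments and the original spacing. The sample source string is made up for the example.

```python
import ast

# Hypothetical input: unusual spacing plus a comment the author wants to keep.
source = "x   =   1  # keep me\n"

# Parse to a syntax tree, then generate source text again from that tree.
regenerated = ast.unparse(ast.parse(source))

# Print both so the lost comment and normalized spacing are visible.
print(repr(source))
print(repr(regenerated))
```

A parser/generator pair that round-trips properly would have to carry the comment and the inter-token whitespace somewhere alongside the tree, which is the extra work described above.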

      whitequark@social.treehouse.systems
      wrote last edited by
      #86

      @ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)

        whitequark@social.treehouse.systems
        wrote last edited by
        #87

        @ireneista @GeoffWozniak my literal first Python project was making a Python parser that fully captures source spans (which wasn't upstream at the time--in 2014 or so), so i'm quite familiar with the topic by now 😛
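
For readers unfamiliar with source spans: since Python 3.8 the standard `ast` module records end positions on nodes and offers `ast.get_source_segment`. The sketch below is only an illustration of what a span is, not the parser mentioned in the post; the two-line sample program and its variable names are invented.

```python
import ast

# Hypothetical two-line program; we look at the expression on line 2.
source = "price = 10\ntotal = price * tax_rate\n"

tree = ast.parse(source)
value = tree.body[1].value  # the BinOp node for `price * tax_rate`

# A span is (start line, start column, end line, end column).
span = (value.lineno, value.col_offset, value.end_lineno, value.end_col_offset)

# Recover the exact source text covered by that span.
segment = ast.get_source_segment(source, value)

print(span)
print(repr(segment))
```

Spans like these let a tool edit or report on one node while leaving the rest of the file's text untouched.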

        • whitequark@social.treehouse.systems

          @GeoffWozniak @ireneista I view code as art so I find strongly canonicalizing formatters like black to be actively destructive. right now I use Ruff with a 300-line configuration for some of the Python code and I think there's gotta be a better way to approach this that isn't destructive

          geoffwozniak@masto.hackers.town
          wrote last edited by
          #88

          @whitequark @ireneista I very much respect that.

          I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.

            whitequark@social.treehouse.systems
            wrote last edited by
            #89

            @GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of context. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example

              geoffwozniak@masto.hackers.town
              wrote last edited by
              #90

              @whitequark @ireneista Well, I do have limits.

              In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.

                whitequark@social.treehouse.systems
                wrote last edited by
                #91

                @GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually

                  whitequark@social.treehouse.systems
                  wrote last edited by
                  #92

                  @GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where .got section got somehow slightly unaligned from _GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again

                    geoffwozniak@masto.hackers.town
                    wrote last edited by
                    #93

                    @whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.

                    I never use it as a style in anything else, though.

                      geoffwozniak@masto.hackers.town
                      wrote last edited by
                      #94

                      @whitequark @ireneista I was in this wondrousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.

                      So perhaps I have lost any hope of making art.

                      sourceware.org Git - binutils-gdb.git/blob - bfd/elf-bfd.h (sourceware.org)

                        whitequark@social.treehouse.systems
                        wrote last edited by
                        #95

                        @GeoffWozniak @ireneista yeah I mean I've submitted binutils patches while I was employed there, and for all the dislike I have for that code style it was so far down the list of bad things about that job that it didn't even register

                          whitequark@social.treehouse.systems
                          wrote last edited by
                          #96

                          @GeoffWozniak @ireneista yeah I have regretfully seen libbfd

                          • whitequark@social.treehouse.systems

                            i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

                            the "ideal" (their choice of words) case is 64.2%
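
For contrast with a statistically measured preservation rate, here is a rough sketch of a deterministic check for the Python case: parse the code before and after reformatting and compare the syntax trees. This is not the paper's method and not any particular formatter's implementation; the function name `same_ast` and the sample strings are invented for illustration. It assumes the code does not inspect its own line numbers or source text.

```python
import ast


def same_ast(original: str, reformatted: str) -> bool:
    """Return True if both sources parse to the same tree, ignoring positions."""
    # ast.dump omits line/column attributes by default, so layout-only
    # changes produce identical dumps.
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


# Layout-only change: the trees match.
print(same_ast("x=1+2\n", "x = 1 + 2\n"))

# A changed operator: the trees differ.
print(same_ast("x = 1 + 2\n", "x = 1 - 2\n"))
```

The check is conservative: matching trees mean only layout changed, while differing trees flag the output for review without proving the behavior differs.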

                            netraven@hear-me.social
                            wrote last edited by
                            #97

                            @whitequark do the thing. Science the shit out of it.

                              lizardbill@hachyderm.io
                              wrote last edited by
                              #98

                              @whitequark 64.2% of the time, it works every time!

                                geoffwozniak@masto.hackers.town
                                wrote last edited by
                                #99

                                @whitequark @ireneista Sorry, I probably should have put a CW on that.

                                  whitequark@social.treehouse.systems
                                  wrote last edited by
                                  #100

                                  @static one of my motivations for this is that there are linters popular in the Python ecosystem and i really don't like how they work, haha

                                    csolisr@hub.azkware.net
                                    wrote last edited by
                                    #101
                                    @whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope.
                                      whitequark@social.treehouse.systems
                                      wrote last edited by
                                      #102

                                      @csolisr i did a double take

                                        aburka@hachyderm.io
                                        wrote last edited by
                                        #103

                                        @whitequark I guess if your code is extruded as a homogenous paste and probably didn't work to begin with, one doesn't care as much...?

                                        • whitequark@social.treehouse.systems

                                          @ireneista @GeoffWozniak based on a discussion with someone who has worked on this problem before we want to try building a diffusion model that captures the whitespace between code tokens and is then able to inject it into a given parsetree, which appears to be a fairly efficient and unproblematic way to do this

                                          kouhai@social.treehouse.systems
                                          wrote last edited by
                                          #104

                                          @whitequark @ireneista @GeoffWozniak ~~ah, so python indentation~~
