i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • ireneista@adhd.irenes.space

    @whitequark @GeoffWozniak yeah this is a recurring research topic for us, we've talked with several of our friends about it over the years. just making a parser/generator that properly round-trip whitespace and comments is already a ton of work, alas...

    whitequark@social.treehouse.systems
    wrote last edited by
    #86

    @ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)

      whitequark@social.treehouse.systems
      wrote last edited by
      #87

      @ireneista @GeoffWozniak my literal first Python project was making a Python parser that fully captures source spans (which wasn't upstream at the time--in 2014 or so), so i'm quite familiar with the topic by now 😛
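For readers following along: the standard `ast` module now records end positions on nodes (Python 3.8 and later), which is roughly the "source spans" capability mentioned above. The sketch below is only an illustration, not the parser from that project; the sample source and the helper name `spans` are made up. It also shows why spans alone do not give a lossless round trip: `ast.unparse` (Python 3.9+) regenerates code without the original comments or spacing.

```python
import ast

# Hypothetical sample input with a comment and unusual spacing.
SOURCE = "total = price * qty  # before tax\nprint( total )\n"

tree = ast.parse(SOURCE)


def spans(source, tree):
    """Return (node type, start, end, exact source text) per top-level statement."""
    out = []
    for node in tree.body:
        start = (node.lineno, node.col_offset)
        end = (node.end_lineno, node.end_col_offset)
        # get_source_segment slices the original text using the node's span,
        # so the author's spacing inside the statement is preserved here.
        out.append((type(node).__name__, start, end,
                    ast.get_source_segment(source, node)))
    return out


statement_spans = spans(SOURCE, tree)

# Regenerating from the tree alone drops the comment and the original spacing.
regenerated = ast.unparse(tree)

for entry in statement_spans:
    print(entry)
print(regenerated)
```

The spans recover each statement's exact text, but the comment sits outside every node, so a tool that wants to preserve it has to track the text between spans separately.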

      • whitequark@social.treehouse.systems

        @GeoffWozniak @ireneista I view code as art so I find strongly canonicalizing formatters like black to be actively destructive. right now I use Ruff with a 300-line configuration for some of the Python code and I think there's gotta be a better way to approach this that isn't destructive

        geoffwozniak@masto.hackers.town
        wrote last edited by
        #88

        @whitequark @ireneista I very much respect that.

        I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.

          whitequark@social.treehouse.systems
          wrote last edited by
          #89

          @GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of context. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example

            geoffwozniak@masto.hackers.town
            wrote last edited by
            #90

            @whitequark @ireneista Well, I do have limits.

            In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.

              whitequark@social.treehouse.systems
              wrote last edited by
              #91

              @GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually

                whitequark@social.treehouse.systems
                wrote last edited by
                #92

                @GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where .got section got somehow slightly unaligned from _GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again

                  geoffwozniak@masto.hackers.town
                  wrote last edited by
                  #93

                  @whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.

                  I never use it as a style in anything else, though.

                    geoffwozniak@masto.hackers.town
                    wrote last edited by
                    #94

                    @whitequark @ireneista I was in this wonderousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.

                    So perhaps I have lost any hope of making art.

                    sourceware.org Git - binutils-gdb.git/blob - bfd/elf-bfd.h

                      whitequark@social.treehouse.systems
                      wrote last edited by
                      #95

                      @GeoffWozniak @ireneista yeah I mean I've submitted binutils patches while I was employed there, and for all the dislike I have for that code style it was so far down the list of bad things about that job that it didn't even register

                        whitequark@social.treehouse.systems
                        wrote last edited by
                        #96

                        @GeoffWozniak @ireneista yeah I have regretfully seen libbfd

                        • whitequark@social.treehouse.systems

                          i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

                          the "ideal" (their choice of words) case is 64.2%

                          netraven@hear-me.social
                          wrote last edited by
                          #97

                          @whitequark do the thing. Science the shit out of it.

                            lizardbill@hachyderm.io
                            wrote last edited by
                            #98

                            @whitequark 64.2% of the time, it works every time!

                              geoffwozniak@masto.hackers.town
                              wrote last edited by
                              #99

                              @whitequark @ireneista Sorry, I probably should have put a CW on that.

                                whitequark@social.treehouse.systems
                                wrote last edited by
                                #100

                                @static one of my motivations for this is that there are linters popular in the Python ecosystem and i really don't like how they work, haha

                                  csolisr@hub.azkware.net
                                  wrote last edited by
                                  #101
                                  @whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope.
                                    whitequark@social.treehouse.systems
                                    wrote last edited by
                                    #102

                                    @csolisr i did a double take

                                      aburka@hachyderm.io
                                      wrote last edited by
                                      #103

                                      @whitequark I guess if your code is extruded as a homogenous paste and probably didn't work to begin with, one doesn't care as much...?

                                      • whitequark@social.treehouse.systems

                                        @ireneista @GeoffWozniak based on a discussion with someone who has worked on this problem before we want to try building a diffusion model that captures the whitespace between code tokens and is then able to inject it into a given parsetree, which appears to be a fairly efficient and unproblematic way to do this

                                        kouhai@social.treehouse.systems
                                        wrote last edited by
                                        #104

                                        @whitequark @ireneista @GeoffWozniak ~~ah, so python indentation~~

                                        • nxskok@cupoftea.social

                                          @whitequark @deborahh @danlyke ie, the sort of thing a linter does?

                                          hennichodernich@radiosocial.de
                                          wrote last edited by
                                          #105

                                          @nxskok @whitequark @deborahh @danlyke to be fair, according to the paper, replacing for with while loops and vice versa and the like was also the goal
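To make the for/while point concrete, here is a rough sketch of what such a rewrite looks like and how one might spot-check that a rewrite still behaves like the original. Everything in it is an assumption for illustration: the functions, the sample inputs, and `agreement_rate` are hypothetical, and this is not the paper's definition of A_c, which the thread does not spell out. Agreeing on a handful of inputs is also only evidence of equivalence, not proof.

```python
def sum_squares_for(n):
    """Original: sum of i*i for i in 0..n-1, written with a for loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_while(n):
    """Faithful rewrite of the loop above as a while loop."""
    total = 0
    i = 0
    while i < n:
        total += i * i
        i += 1
    return total


def sum_squares_while_buggy(n):
    """Plausible-looking rewrite with an off-by-one bound (<= instead of <)."""
    total = 0
    i = 0
    while i <= n:
        total += i * i
        i += 1
    return total


def behaves_the_same(original, rewrite, inputs):
    """True if both functions return equal results on every sample input."""
    return all(original(x) == rewrite(x) for x in inputs)


def agreement_rate(original, rewrites, inputs):
    """Fraction of rewrites that match the original on all sample inputs."""
    same = sum(behaves_the_same(original, r, inputs) for r in rewrites)
    return same / len(rewrites)


INPUTS = range(10)  # hypothetical sample inputs
rate = agreement_rate(
    sum_squares_for,
    [sum_squares_while, sum_squares_while_buggy],
    INPUTS,
)
print(rate)
```

With one faithful and one broken rewrite, the rate comes out at one half, which is the kind of number a deterministic refactoring tool would be expected to keep at 100%.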
