i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systems

    @ireneista @GeoffWozniak there's tree-sitter nowadays which I believe should do that (and I think it should be failure-tolerant considering its fairly wide use in editors: nvim, zed, etc)

    whitequark@social.treehouse.systems
    wrote last edited by
    #87

    @ireneista @GeoffWozniak my literal first Python project was making a Python parser that fully captures source spans (which wasn't upstream at the time--in 2014 or so), so i'm quite familiar with the topic by now 😛

    • whitequark@social.treehouse.systems

      @GeoffWozniak @ireneista I view code as art so I find strongly canonicalizing formatters like black to be actively destructive. right now I use Ruff with a 300-line configuration for some of the Python code and I think there's gotta be a better way to approach this that isn't destructive

      geoffwozniak@masto.hackers.town
      wrote last edited by
      #88

      @whitequark @ireneista I very much respect that.

      I view code like writing and I will tweak structure and form for far too long sometimes. Layout ends up getting less of my attention.

        whitequark@social.treehouse.systems
        wrote last edited by
        #89

        @GeoffWozniak @ireneista I see layout as part of the form, I guess? I write source code files in much the same way as one would write chapters in a book: somewhat self-contained, and intended to make sense when read top-to-bottom linearly and with roughly one full-displayful of context. so if rustfmt decides to blow up a function call into 20 lines out of nowhere it very much messes with that, for example

          geoffwozniak@masto.hackers.town
          wrote last edited by
          #90

          @whitequark @ireneista Well, I do have limits.

          In my case I spend my time in Binutils and GCC. Do I love the GNU style? No. But does consistency help? Yes. So I demur. But I will restructure things so the single line curly braces don't take over.

            whitequark@social.treehouse.systems
            wrote last edited by
            #91

            @GeoffWozniak @ireneista the awful code style is probably #2 in the list of top 5 reasons I contribute to LLVM instead of GNU tools. I should use it as a testcase for the tool I'm working on, actually

              whitequark@social.treehouse.systems
              wrote last edited by
              #92

              @GeoffWozniak @ireneista awful memories of chasing down a bug in or1k binutils where .got section got somehow slightly unaligned from _GLOBAL_OFFSET_TABLE_. I never figured it out; I have since quit the company and I will mercifully never have to think about or1k again

                geoffwozniak@masto.hackers.town
                wrote last edited by
                #93

                @whitequark @ireneista I've grown used to it. That may say something bad about me, but it keeps me employed.

                However, I never use it as a style in anything else.

                  geoffwozniak@masto.hackers.town
                  wrote last edited by
                  #94

                  @whitequark @ireneista I was in this wonderousness today, used in one of those functions that is a few hundred lines long with nested case statements and no attempt at functional abstraction.

                  So perhaps I have lost any hope of making art.

                  sourceware.org Git - binutils-gdb.git/blob - bfd/elf-bfd.h (sourceware.org)

                    whitequark@social.treehouse.systems
                    wrote last edited by
                    #95

                    @GeoffWozniak @ireneista yeah I mean I've submitted binutils patches while I was employed there, and for all the dislike I have for that code style it was so far down the list of bad things about that job that it didn't even register

                      whitequark@social.treehouse.systems
                      wrote last edited by
                      #96

                      @GeoffWozniak @ireneista yeah I have regretfully seen libbfd

                      • whitequark@social.treehouse.systems

                        i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

                        the "ideal" (their choice of words) case is 64.2%
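
The thread does not say how the paper computes A_c. One plausible reading, offered here only as an assumption, is the fraction of (original, rewritten) pairs that give the same result on a fixed set of test inputs. A minimal Python sketch of that reading follows; the names `same_behaviour` and `preserved_fraction` are hypothetical and not taken from the paper.

```python
def same_behaviour(original, rewritten, inputs):
    """True if both callables agree on every input.

    Agreement means equal return values, or the same exception type.
    This is only evidence of equivalence on the inputs tried, not a proof.
    """
    for x in inputs:
        try:
            a = ("ok", original(x))
        except Exception as exc:
            a = ("err", type(exc))
        try:
            b = ("ok", rewritten(x))
        except Exception as exc:
            b = ("err", type(exc))
        if a != b:
            return False
    return True


def preserved_fraction(pairs, inputs):
    """Share of (original, rewritten) pairs that agree on all inputs."""
    agreeing = sum(same_behaviour(f, g, inputs) for f, g in pairs)
    return agreeing / len(pairs)


# Three toy "rewrites"; the second one changes behaviour (// versus /).
pairs = [
    (lambda x: x * 2, lambda x: x + x),
    (lambda x: x // 2, lambda x: x / 2),
    (abs, lambda x: x if x > 0 else -x),
]
score = preserved_fraction(pairs, inputs=[-3, 0, 4, 7])
```

Under this reading, a score of 64.2% would mean roughly one rewrite in three changed what the code does on at least one tested input.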

                        netraven@hear-me.social
                        wrote last edited by
                        #97

                        @whitequark do the thing. Science the shit out of it.

                          lizardbill@hachyderm.io
                          wrote last edited by
                          #98

                          @whitequark 64.2% of the time, it works every time!

                            geoffwozniak@masto.hackers.town
                            wrote last edited by
                            #99

                            @whitequark @ireneista Sorry, I probably should have put a CW on that.

                              whitequark@social.treehouse.systems
                              wrote last edited by
                              #100

                              @static one of my motivations for this is that there are linters popular in the Python ecosystem and i really don't like how they work, haha

                                csolisr@hub.azkware.net
                                wrote last edited by
                                #101
                                @whitequark Seeing somebody trying to implement the service proposed at malus.sh/ and it working just half of the time makes me keep some hope.
                                  whitequark@social.treehouse.systems
                                  wrote last edited by
                                  #102

                                  @csolisr i did a double take

                                    aburka@hachyderm.io
                                    wrote last edited by
                                    #103

                                    @whitequark I guess if your code is extruded as a homogenous paste and probably didn't work to begin with, one doesn't care as much...?

                                    • whitequark@social.treehouse.systems

                                      @ireneista @GeoffWozniak based on a discussion with someone who has worked on this problem before we want to try building a diffusion model that captures the whitespace between code tokens and is then able to inject it into a given parsetree, which appears to be a fairly efficient and unproblematic way to do this
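
The non-learned half of that idea, separating code tokens from the whitespace between them and re-injecting a layout onto the same token sequence, can be sketched with Python's standard `tokenize` module. This is only an illustration under assumptions: a flat token list stands in for the parse tree, no diffusion model is involved, and `split_layout` / `join_layout` are hypothetical names. It has not been checked against f-strings or unusual line endings.

```python
import io
import tokenize

# Token types that carry only layout; their text is folded into the gaps.
LAYOUT = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
          tokenize.DEDENT, tokenize.ENDMARKER}


def split_layout(src):
    """Return (tokens, gaps): gaps[i] is the text before tokens[i],
    and gaps[-1] is whatever follows the last token."""
    # Offset of the start of each line, so (row, col) maps to an index.
    starts = [0]
    for line in io.StringIO(src):
        starts.append(starts[-1] + len(line))
    tokens, gaps, pos = [], [], 0
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type in LAYOUT:
            continue
        begin = starts[tok.start[0] - 1] + tok.start[1]
        end = starts[tok.end[0] - 1] + tok.end[1]
        gaps.append(src[pos:begin])
        tokens.append(src[begin:end])
        pos = end
    gaps.append(src[pos:])
    return tokens, gaps


def join_layout(tokens, gaps):
    """Interleave gaps and tokens back into source text."""
    return "".join(g + t for g, t in zip(gaps, tokens)) + gaps[-1]


sprawling = "x = foo( 1,\n        2 )  # odd spacing\n"
compact = "x = foo(1, 2)  # odd spacing\n"

tokens, gaps = split_layout(sprawling)
other_tokens, other_gaps = split_layout(compact)

# Same tokens, different layout: injecting the other gaps restyles the code
# without touching any token.
restyled = join_layout(tokens, other_gaps)
```

Because the tokens are never edited, a layout produced this way cannot change what the code says, only how it is spaced. For an indentation-sensitive language like Python the gaps still carry meaning, so a generated layout would need to be re-parsed and checked.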

                                      kouhai@social.treehouse.systems
                                      wrote last edited by
                                      #104

                                      @whitequark @ireneista @GeoffWozniak ~~ah, so python indentation~~

                                      • nxskok@cupoftea.social

                                        @whitequark @deborahh @danlyke ie, the sort of thing a linter does?

                                        hennichodernich@radiosocial.de
                                        wrote last edited by
                                        #105

                                        @nxskok @whitequark @deborahh @danlyke to be fair, according to the paper, replacing for with while loops and vice versa and the like was also the goal

                                        • deborahh@cosocial.ca

                                          @whitequark @danlyke so … by "reformatted" I assume you mean aesthetically tidied up, with no change in functionality required?

                                          If I got that right: wtf?

                                          mrkeen@mastodon.social
                                          wrote last edited by
                                          #106

                                          @deborahh @whitequark @danlyke

                                          No.

                                          "there is no existing work that performs full stylization on an arbitrary piece of code. The most common methods are rule-based linters, formatters, which are limited to a few pre-defined style rules"
