
i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systems

    i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

    the "ideal" (their choice of words) case is 64.2%

    theorangetheme@en.osm.town
    wrote last edited by
    #32

    @whitequark That's it, these people lose their computer privileges until they take some undergraduate CS theory classes.

    • whitequark@social.treehouse.systems

      @lu_leipzig I actually really don't like formatters like black or rustfmt which is why I'm collaborating on research into doing it with ML, but there are ways to do it that never produce a different AST
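One way to read "never produce a different AST" is as a guard around whatever proposes the new layout. The following is a minimal sketch in Python, assuming the code being formatted is itself Python; `same_ast` and `accept_reformat` are hypothetical names, not taken from the paper or from any existing tool.

```python
import ast


def same_ast(original: str, reformatted: str) -> bool:
    """Return True when both sources parse to an identical Python AST."""
    # ast.dump omits line/column attributes by default, so pure layout
    # changes (whitespace, line breaks, redundant parentheses) compare equal.
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


def accept_reformat(original: str, candidate: str) -> str:
    """Keep the candidate only if it parses and is AST-identical to the original."""
    try:
        if same_ast(original, candidate):
            return candidate
    except SyntaxError:
        pass  # one of the two sources does not parse: fall through
    return original


# A layout-only change is accepted; a changed operator is rejected.
print(accept_reformat("x = (1 +\n     2)\n", "x = 1 + 2\n"))
print(accept_reformat("j += 1\n", "j -= 1\n"))
```

Comments are not part of the AST, so a check like this would not notice a candidate that drops or rewrites them; a real tool would need a separate comparison for that.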

      lu_leipzig@troet.cafe
      wrote last edited by
      #33

      @whitequark oh, interesting, what do you not like about them? I could imagine a ML model would do a decent job deciding between n equivalent deterministically produced ASTs that vary e.g. w.r.t. indentation on multi-line definitions/calls.
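The idea of choosing among equivalent, deterministically produced layouts could be sketched roughly like this. It is only an illustration: `call_layouts`, `width_score`, and `pick_layout` are hypothetical names, and `width_score` is a hand-written stand-in for wherever a learned model would plug in.

```python
import ast
from typing import Callable, List


def call_layouts(func: str, args: List[str], indent: str = "    ") -> List[str]:
    """Deterministically produce several layouts of one Python call."""
    one_line = f"{func}({', '.join(args)})"
    hanging = f"{func}(\n" + "".join(f"{indent}{a},\n" for a in args) + ")"
    pad = " " * (len(func) + 1)
    aligned = f"{func}(" + f",\n{pad}".join(args) + ")"
    return [one_line, hanging, aligned]


def all_equivalent(candidates: List[str]) -> bool:
    """Check the invariant: every candidate parses to the same AST."""
    return len({ast.dump(ast.parse(c)) for c in candidates}) == 1


def width_score(layout: str, max_width: int = 40) -> float:
    """Hypothetical stand-in for a learned scorer; lower is better."""
    lines = layout.split("\n")
    overflow = sum(max(0, len(line) - max_width) for line in lines)
    return 100.0 * overflow + len(lines)


def pick_layout(candidates: List[str], score: Callable[[str], float]) -> str:
    """The scorer only ranks candidates; it never writes code itself."""
    return min(candidates, key=score)


candidates = call_layouts("configure_widget", ["width=1024", "height=768", "title='demo'"])
assert all_equivalent(candidates)
print(pick_layout(candidates, width_score))
```

Because the scorer can only select from candidates that were generated and checked deterministically, a bad score costs readability at worst, not correctness.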

        whitequark@social.treehouse.systems
        wrote last edited by
        #34

        @theorangetheme both authors are currently full professors i believe

        • whitequark@social.treehouse.systems

          @ireneista starting with "gotofail bad" and ending with making the problem significantly worse, apparently without ever reflecting on this

          ireneista@adhd.irenes.space
          wrote last edited by
          #35

          @whitequark because "the thing we're promoting is incredibly dangerous, and not in fun ways" is not really the thing anyone wants to be cited for

            whitequark@social.treehouse.systems
            wrote last edited by
            #36

            @lu_leipzig I view code as art and so any tool that puts determinism strictly above aesthetics is a net negative to my craft

              argv_minus_one@mastodon.sdf.org
              wrote last edited by
              #37

              @whitequark

              Even if the AST is the same, might a sufficiently bad format mislead humans reading the resulting code?

              I'm reminded of the Obfuscated C Contest…

              @lu_leipzig

                selinica@social.mechsploitation.org
                wrote last edited by
                #38

                @whitequark@social.treehouse.systems I didn't know the ideal number for code to behave differently was over 30% of the time!
                Then again, I like and don't mind working with legacy code and systems so I personally tend to wonder "why even redo a working thing"

                • xgranade@wandering.shop

                  @whitequark @porglezomp I'm spitting out my drink at j++ → j--. Holy shit.

                  robin@gts.icewind.me
                  wrote last edited by
                  #39

                  @xgranade
                  I think the right is the output from running the model on the right code (center being the "desired output"). So it's not changing the semantics of the loop, just not changing the loop order to match their desired outcome.

                  Given that loop order can have behavioral impact (and I would never trust an LLM to be able to tell if it did), that seems like the correct behavior to me though
                  @whitequark @porglezomp
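A small illustration of why loop order is not a purely stylistic choice. This is a made-up Python example, not code from the paper's figure; `find_match` and `total` are hypothetical names.

```python
def find_match(items, target, reverse=False):
    """Return the index of the first match in iteration order, or -1."""
    indices = range(len(items) - 1, -1, -1) if reverse else range(len(items))
    for j in indices:
        if items[j] == target:
            return j  # early exit makes the result depend on the direction
    return -1


def total(values, reverse=False):
    """Sum integers; here the direction of the loop does not matter."""
    acc = 0
    for v in (reversed(values) if reverse else values):
        acc += v
    return acc


data = [3, 7, 3]
print(find_match(data, 3), find_match(data, 3, reverse=True))  # 0 2
print(total(data), total(data, reverse=True))                  # 13 13
```

Telling the two cases apart requires understanding what the loop body does, which is exactly the kind of judgment a formatter should not be making.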

                    slampoud@mastodon.cloud
                    wrote last edited by
                    #40

                    @whitequark The Code Randomizer (TM)

                      eatyourgreens@mastodon.social
                      wrote last edited by
                      #41

                      @whitequark well two out of three ain’t bad. No, wait…

                        theeclecticdyslexic@mstdn.social
                        wrote last edited by
                        #42

                        @whitequark @lu_leipzig Ideally, I think a formatter that learns how I formatted the rest of the buffer would be the goal.

                        Most of the time I like the deterministic formatting. However, I find deterministic formatting fails me around function headers and long function calls / long boolean statements.

                        I want it to do the deterministic formatting once, and then if I undo immediately, don't do it again to that area... and preferably learn what I was trying to do.

                        • krans@mastodon.me.uk

                          @ireneista TIL that my philosophy is the same as the Extreme Programming philosophy

                          @whitequark

                          ireneista@adhd.irenes.space
                          wrote last edited by
                          #43

                          @krans @whitequark it was a nice name for a movement, it did a good job of conveying that the goal was radical change

                          at the time, from what we can tell, none of the people saw it as a labor movement specifically, which is too bad... that might have prevented it from being watered down by successive cycles of consulting and renaming

                            whitequark@social.treehouse.systems
                            wrote last edited by
                            #44

                            @robin @xgranade @porglezomp oh you're right

                              whitequark@social.treehouse.systems
                              wrote last edited by
                              #45

                              @theeclecticdyslexic @lu_leipzig my goal is to be able to run a command on a patch that formats the added lines "more or less like the rest of the file"
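The first step of such a command, finding which lines of the new file a patch actually added, might look like the sketch below. It assumes a single-file unified diff and does nothing about the style matching itself; `added_lines` and `HUNK_HEADER` are hypothetical names.

```python
import re
from typing import List

# Matches hunk headers such as "@@ -10,2 +11,3 @@" and captures the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text: str) -> List[int]:
    """Return 1-based new-file line numbers of lines added by a single-file unified diff."""
    result = []
    new_lineno = None  # None until the first hunk header is seen
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            new_lineno = int(header.group(1))
        elif new_lineno is None:
            continue  # file headers ("---", "+++") before the first hunk
        elif line.startswith("+"):
            result.append(new_lineno)
            new_lineno += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers do not exist in the new file
        else:
            new_lineno += 1  # context line
    return result


patch = "\n".join([
    "--- a/example.py",
    "+++ b/example.py",
    "@@ -1,3 +1,4 @@",
    " import os",
    "+import sys",
    " ",
    " def main():",
    "@@ -10,2 +11,3 @@",
    "     x = 1",
    "+    y = 2",
    "     return x",
])
print(added_lines(patch))  # [2, 12]
```

A formatter restricted to these line numbers leaves the untouched parts of the file alone, so the diff stays limited to what the patch author wrote.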

                                snowyfox@deadinsi.de
                                wrote last edited by
                                #46

                                Figure in question seems to be about "model performing in its ideal conditions"

                                The authors' actual opinion is implied in the Results:

                                "After inspecting the compilation checking module, we found that DUET CS achieves 55.8% computational accuracy, which is a practical metric for a code generation system. This result shows that more than half of the output code are compilable and implement the same function as the input code. The user can use this check as an optional layer of the pipeline to guarantee grammar correctness.
                                ...
                                We found that even the non-compilable outputs display around 60% similarity to the ground truth, which means even if DUET CS cannot always produce grammar-correct code, it can still provide valuable information to help user to transfer code style.
                                ...
                                Notice, that generally the task of generating the exact same code as ground truth is very hard, especially when the code length is rather long (~47 lines)."

                                  snowyfox@deadinsi.de
                                  wrote last edited by
                                  #47

                                  That last one is a funny statement, because it's laughably easy for a human to preserve a function's behavior through a style refactor. You would reprimand a junior who couldn't do that.

                                    theeclecticdyslexic@mstdn.social
                                    wrote last edited by
                                    #48

                                    @whitequark @lu_leipzig that's a pretty reasonable concept I think.

                                    I like the idea at least.

                                    One thing I will say of deterministic formatters is they have changed my habits over time in order to get them to format the way I want. You can take that as both good and bad, but I think most (maybe 60%) of the things they have forced on me have been good.

                                    Edit: I also get stun locked trying to decide how to format 15 lines of code far less often.

                                      whitequark@social.treehouse.systems
                                      wrote last edited by
                                      #49

                                      @theeclecticdyslexic @lu_leipzig yeah if a formatter requires me to do things I don't want I simply quit using the formatter (and sometimes the codebase)

                                        yvandasilva@hachyderm.io
                                        wrote last edited by
                                        #50

                                        @whitequark what the what.

                                          gudenau@hachyderm.io
                                          wrote last edited by
                                          #51

                                          @whitequark Just like, use one of the tools that already exist? It'll be:
                                          - Fast
                                          - Cheap
                                          - Efficient
                                          - Accurate

                                          I don't understand any of this "industry" outside of being a massive destructive boondoggle.
