i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systems

    i'm at a loss for words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

    the "ideal" (their choice of words) case is 64.2%

    mc@mathstodon.xyz
    wrote last edited by
    #112

    @whitequark well the paper speaks of *code style* which is more than just formatting but also, shouldn't we welcome negative results in science?

      whitequark@social.treehouse.systems
      wrote last edited by
      #113

      @mc I feel like if the negative result is obvious given the hypothesis it has a lot less value

        sop@unstable.systems
        wrote last edited by
        #114

        @whitequark I do think that asking for 100.0% equivalency is something that's both necessary to ask of something you'd want to put in a CI _and_ unreasonable to ask of something that tries to solve this problem

        having accidentally gone through this specific kind of exercise a few times in the last couple weeks (turning the kotlin code intellij spits out when converting java into kotlin code I'd be happy to put my name on) I usually reach maybe 98% compatibility, then settle for that because I identify the remaining 2% of behaviours as "hard to replicate in the new shape of the code," "minor enough not to matter" and "not desirable, actually"

        once you're happy to aim somewhere south of 100.0% I guess it's interesting to figure out how close you can get, and then yeah, this approach only gets you to 64%, which is only good as a milestone for future efforts to compare against 🤷‍♀️

        maybe all this ends up being good for is dropping comments on PRs (and, if you recognize me, we both know how we feel about that)

          whitequark@social.treehouse.systems
          wrote last edited by
          #115

          @sop but I'm not doing language translation, input and output are in the same language and should have essentially identical (machine-checkably equivalent) ASTs
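
One way to make "machine-checkably equivalent" concrete, as a minimal sketch. Python is used here only because its standard library ships a parser; the helper name `same_ast` is made up for illustration and is not from the paper or from any project mentioned in the thread. `ast.dump` leaves out line and column attributes by default, so two sources that differ only in layout compare equal.

```python
import ast


def same_ast(original: str, reformatted: str) -> bool:
    """Return True when both sources parse to the same syntax tree."""
    # ast.dump() omits position attributes unless include_attributes=True,
    # so pure layout changes do not affect the comparison.
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(reformatted))


before = "for i in range(3):\n    print(i)\n"
layout_only = "for i in range(3): print(i)\n"
changed_bound = "for i in range(4):\n    print(i)\n"

print(same_ast(before, layout_only))    # True
print(same_ast(before, changed_bound))  # False
```

Note that Python's AST drops comments, so a check like this says nothing about whether comments survived the reformatting.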

            sop@unstable.systems
            wrote last edited by
            #116

            @whitequark (reads https://social.treehouse.systems/@whitequark/116283070331505039) oh no did I just explain something you thought obvious back to you

              whitequark@social.treehouse.systems
              wrote last edited by
              #117

              @sop i guess? basically, you can set up a system around an ML model in two ways: (#1) where the model only gets to alter (lexer) whitespace and nothing else, and (#2) where the model gets to alter arbitrary (lexer) tokens

              the paper goes for #2
              i am collaborating on a project that does #1, which gives 100.0% (with the caveat above) by design, because a formatting tool that sometimes breaks code is a net negative
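
The difference between the two setups can be sketched as a guard around an untrusted formatter: its output is accepted only if the token stream, with layout ignored, matches the input's; otherwise the original source is returned unchanged. This is an illustration built on Python's `tokenize` module, not the design of the project mentioned above; `layout_free_tokens`, `guarded_format`, and the two toy formatters are hypothetical names.

```python
import io
import tokenize

# Token types whose text is pure layout: the type is kept, the text is not.
_LAYOUT_TEXT = {tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT}


def layout_free_tokens(src: str) -> list:
    """Tokenize src and discard everything that only encodes layout."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type == tokenize.NL:
            continue  # non-logical line break (blank line, break inside brackets)
        if tok.type in _LAYOUT_TEXT:
            tokens.append((tok.type, ""))
        else:
            tokens.append((tok.type, tok.string))
    return tokens


def guarded_format(src: str, candidate_formatter) -> str:
    """Apply candidate_formatter, but keep src if any non-layout token changed."""
    out = candidate_formatter(src)
    try:
        unchanged = layout_free_tokens(out) == layout_free_tokens(src)
    except (tokenize.TokenError, SyntaxError):
        unchanged = False  # output (or input) does not even tokenize
    return out if unchanged else src


src = "x = f(1,2)\n"
spaces_only = lambda s: s.replace(",", ", ")   # touches whitespace only
alters_token = lambda s: s.replace("2", "3")   # silently changes a literal

print(repr(guarded_format(src, spaces_only)))   # 'x = f(1, 2)\n'
print(repr(guarded_format(src, alters_token)))  # 'x = f(1,2)\n'
```

With a guard like this, the worst case is that the code is left unformatted; it can never come out with different tokens.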

              • lu_leipzig@troet.cafe

                @whitequark And this is how research money is lit on fire, I guess. Why else conduct research into ML for a task that has had obvious, deterministic, efficient and well-tested solutions for decades?

                srazkvt@tech.lgbt
                wrote last edited by
                #118

                @lu_leipzig @whitequark i would honestly be more interested in a deterministic but very configurable formatter, plus an ml model that writes a config for you from sample code, so you only make minor adjustments to it; generally all code styles fit in just a few hundred switches
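
A toy version of the "write a config from sample code" idea, assuming just two switches. Plain counting stands in for the ml model here, and `StyleConfig` and `infer_config` are hypothetical names rather than the API of any existing formatter.

```python
import io
import tokenize
from collections import Counter
from dataclasses import dataclass


@dataclass
class StyleConfig:
    """Hypothetical formatter switches; a real tool would have many more."""
    indent_width: int = 4
    quote: str = '"'


def infer_config(sample: str) -> StyleConfig:
    """Pick the most common indent step and quote character in sample."""
    indent_steps = Counter()
    quotes = Counter()
    stack = [0]  # widths of the currently open indentation levels
    for tok in tokenize.generate_tokens(io.StringIO(sample).readline):
        if tok.type == tokenize.INDENT:
            width = len(tok.string.expandtabs())
            indent_steps[width - stack[-1]] += 1
            stack.append(width)
        elif tok.type == tokenize.DEDENT:
            stack.pop()
        elif tok.type == tokenize.STRING:
            body = tok.string.lstrip("rbufRBUF")  # skip string prefixes
            if body[:1] in ("'", '"'):
                quotes[body[0]] += 1
    config = StyleConfig()
    if indent_steps:
        config.indent_width = indent_steps.most_common(1)[0][0]
    if quotes:
        config.quote = quotes.most_common(1)[0][0]
    return config


sample = "def f():\n  if True:\n    return 'a'\n"
print(infer_config(sample))
```

The inferred config would then be handed to a deterministic formatter, so the statistical part never touches the code itself.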

                  whitequark@social.treehouse.systems
                  wrote last edited by
                  #119

                  @SRAZKVT @lu_leipzig this would be ~easy to do but convincing people to implement and maintain "a few hundred switches" has been incredibly difficult; my motivation is exactly that rustfmt maintainers have been consistently unwilling to entertain that

                    whitequark@social.treehouse.systems
                    wrote last edited by
                    #120

                    @SRAZKVT @lu_leipzig if every language i cared about (at this point: mainly rust, python, and c++) had highly configurable formatters i would not care to spend as much effort as i'm planning to on ml research

                      srazkvt@tech.lgbt
                      wrote last edited by
                      #121

                      @whitequark @lu_leipzig most tooling devs today seem to believe in a one size fits all with no configurability, kind of sad

                      also i think the problem of "but if every codebase isn't formatted exactly the same" is way overblown, once you start reading the code it really doesn't take long to adapt to a new style, barely a few minutes from my experience

                        whitequark@social.treehouse.systems
                        wrote last edited by
                        #122

                        @SRAZKVT @lu_leipzig there is a more real problem of "some people bounce off contributing if you ask them to fix style"

                        • sabik@rants.au

                          @xgranade @whitequark @porglezomp
                          I think reversing the `j` for loop is actually wanted by them? It's labelled "ground truth", and it is a potential valid optimisation

                          ingalovinde@embracing.space
                          wrote last edited by
                          #123

                          @sabik @xgranade @whitequark @porglezomp but they also changed the boundaries! "Input" checks all values from 2 to i+2 inclusive; but "ground truth" just throws the i+2 iteration out.

                          • hennichodernich@radiosocial.de

                            @nxskok @whitequark @deborahh @danlyke to be fair, according to the paper, replacing for with while loops and vice versa and the like was also the goal

                            illybytes@shrimp.imsofucking.gay
                            wrote last edited by
                            #124
                            @hennichodernich @danlyke @whitequark @deborahh @nxskok but like wouldn't that be easy to implement?
                            like

                            for (init; condition; assignment) { body } would turn into
                            init; while (condition) { body; assignment } // with the assignment also added on every branch of the body that continues the loop
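
A small sketch of why "every possible branch" matters, written in Python with a counting loop standing in for the C-style `for`; both function names are made up for illustration. The step has to be repeated before each `continue`, otherwise the `while` version would never advance past the first even number.

```python
def sum_odds_for(n: int) -> int:
    """Reference version: the loop header performs the step."""
    total = 0
    for i in range(n):
        if i % 2 == 0:
            continue
        total += i
    return total


def sum_odds_while(n: int) -> int:
    """Rewritten version: init hoisted out, step copied onto every path."""
    total = 0
    i = 0                  # init, moved in front of the loop
    while i < n:           # condition
        if i % 2 == 0:
            i += 1         # step duplicated before `continue`
            continue
        total += i
        i += 1             # step at the end of the body
    return total


print(sum_odds_for(10), sum_odds_while(10))  # 25 25
```

In C++ the rewrite also needs an enclosing block, because a variable declared in the `for` initializer is scoped to the loop, while the hoisted declaration would otherwise leak into the surrounding scope.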
                              sabik@rants.au
                              wrote last edited by
                              #125

                              @IngaLovinde @xgranade @whitequark @porglezomp
                              `i` starts from 1 in the "ground truth" version

                                ingalovinde@embracing.space
                                wrote last edited by
                                #126

                                @sabik @xgranade @whitequark @porglezomp ah I see, so the new i is just the old one + 1

                                  benjamineskola@hachyderm.io
                                  wrote last edited by
                                  #127

                                  @mc @whitequark do they actually even recognise it as a negative result though?

                                  They seem to be presenting it as a positive one (looking at the abstract and conclusion) — but I admit I'm not familiar with the norms for writing this sort of paper.

                                    dasgrueneblatt@wien.rocks
                                    wrote last edited by
                                    #128

                                    @whitequark amazing 😱

                                      teilweise@layer8.space
                                      wrote last edited by
                                      #129

                                      @whitequark Looking at https://upload.whitequark.org/1774306843-Duetcs_Code_Style_Transfer_through_Generation_and_Retrieval.pdf, Fig. 6:

                                      Look at `bool ok, count = false;`: this initializes only “count” and leaves “ok” uninitialized.
                                      In any case that should print “YES”, the `ok = false;` line is never executed, so it’s undefined whether it prints “YES” or “NO” (it might even differ between invocations).

                                      Neither the input nor the ground truth had that bug.

                                      It looks like the researchers did not notice it and considered it correct.
                                      (64.2% …)

                                      It was obvious to me, would you have caught it?

                                        srazkvt@tech.lgbt
                                        wrote last edited by
                                        #130

                                        @whitequark @lu_leipzig yea, such as: the code being shit

                                          c0dec0dec0de@hachyderm.io
                                          wrote last edited by
                                          #131

                                          @SRAZKVT @whitequark @lu_leipzig in general, I agree, but I almost wish I could have just told the software teams that I worked with a couple years ago “this is style for this language, just drank with it” instead of having hours of meetings about clang-format settings.
