i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

140 Posts 61 Posters 0 Views
  • whitequark@social.treehouse.systemsW whitequark@social.treehouse.systems

    @SRAZKVT @lu_leipzig this would be ~easy to do but convincing people to implement and maintain "a few hundred switches" has been incredibly difficult; my motivation is exactly that rustfmt maintainers have been consistently unwilling to entertain that

    whitequark@social.treehouse.systemsW This user is from outside of this forum
    whitequark@social.treehouse.systems
    wrote last edited by
    #120

    @SRAZKVT @lu_leipzig if every language i cared about (at this point: mainly rust, python, and c++) had highly configurable formatters i would not care to spend as much effort as i'm planning to on ml research

      srazkvt@tech.lgbt
      wrote last edited by
      #121

      @whitequark @lu_leipzig most tooling devs today seem to believe in a one size fits all with no configurability, kind of sad

      also i think the problem of "but if every codebase isn't formatted exactly the same" is way overblown, once you start reading the code it really doesn't take long to adapt to a new style, barely a few minutes from my experience

        whitequark@social.treehouse.systems
        wrote last edited by
        #122

        @SRAZKVT @lu_leipzig there is a more real problem of "some people bounce off contributing if you ask them to fix style"

        • sabik@rants.auS sabik@rants.au

          @xgranade @whitequark @porglezomp
          I think reversing the `j` for loop is actually wanted by them? It's labelled "ground truth", and it is a potential valid optimisation

          ingalovinde@embracing.space
          wrote last edited by
          #123

          @sabik @xgranade @whitequark @porglezomp but they also changed the boundaries! "Input" checks all values from 2 to i+2 inclusive; but "ground truth" just throws the i+2 iteration out.

          • hennichodernich@radiosocial.deH hennichodernich@radiosocial.de

            @nxskok @whitequark @deborahh @danlyke to be fair, according to the paper, replacing for with while loops and vice versa and the like was also the goal

            illybytes@shrimp.imsofucking.gay
            wrote last edited by
            #124
            @hennichodernich @danlyke @whitequark @deborahh @nxskok but like wouldn't that be easy to implement?
            like

            `for (init; condition; increment) { body }` would turn into
            `init; while (condition) { body; increment; }`, with the increment also added on every branch inside the loop that hits a `continue`
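
A minimal C++ sketch of that rewrite, assuming a loop body that uses `continue`. The function names and the example loop are hypothetical and not taken from the paper. The only subtle part is that the increment has to be repeated in front of every `continue`:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: sum the odd entries of v, skipping even ones.
int sum_odd_for(const std::vector<int>& v) {
    int sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] % 2 == 0) continue;  // the for loop still runs ++i after this
        sum += v[i];
    }
    return sum;
}

// The same loop written as a while loop. The increment is copied in front
// of the `continue` as well as placed at the end of the body.
int sum_odd_while(const std::vector<int>& v) {
    int sum = 0;
    std::size_t i = 0;  // note: i now lives in the enclosing scope
    while (i < v.size()) {
        if (v[i] % 2 == 0) {
            ++i;  // without this, the loop would never advance past an even entry
            continue;
        }
        sum += v[i];
        ++i;
    }
    return sum;
}
```

Both functions return the same value for any input, for example `sum_odd_for({1, 2, 3, 4, 5})` and `sum_odd_while({1, 2, 3, 4, 5})`. A tool doing this in general would also need to handle the loop variable's scope, which widens in the `while` form.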
              sabik@rants.au
              wrote last edited by
              #125

              @IngaLovinde @xgranade @whitequark @porglezomp
              `i` starts from 1 in the "ground truth" version

                ingalovinde@embracing.space
                wrote last edited by
                #126

                @sabik @xgranade @whitequark @porglezomp ah I see, so the new i is just the old one + 1
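
The index shift discussed in the last few posts can be checked with a small C++ sketch. The paper's figure is not reproduced in this thread, so both loop shapes below are assumptions reconstructed from the posts: an "input" whose outer index starts at 0 and whose inner loop runs from 2 to `i + 2` inclusive, and a "ground truth" whose outer index starts at 1 and whose inner loop runs in reverse and stops short of `i + 2`.

```cpp
#include <algorithm>
#include <vector>

// Assumed "input" shape: i from 0, j from 2 to i + 2 inclusive.
std::vector<std::vector<int>> visited_input(int n) {
    std::vector<std::vector<int>> rows;
    for (int i = 0; i < n; ++i) {
        std::vector<int> js;
        for (int j = 2; j <= i + 2; ++j) js.push_back(j);
        rows.push_back(js);
    }
    return rows;
}

// Assumed "ground truth" shape: i from 1, j reversed, upper bound i + 1.
std::vector<std::vector<int>> visited_ground_truth(int n) {
    std::vector<std::vector<int>> rows;
    for (int i = 1; i <= n; ++i) {
        std::vector<int> js;
        for (int j = i + 1; j >= 2; --j) js.push_back(j);
        std::sort(js.begin(), js.end());  // compare as sets, ignoring visit order
        rows.push_back(js);
    }
    return rows;
}
```

With the new `i` equal to the old `i + 1`, the old bound `i + 2` is the new `i + 1`, so `visited_input(n)` and `visited_ground_truth(n)` contain the same values of `j` for every outer iteration. Reversing the inner loop only preserves behavior when the body does not depend on visiting order.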

                • mc@mathstodon.xyzM mc@mathstodon.xyz

                  @whitequark well the paper speaks of *code style* which is more than just formatting but also, shouldn't we welcome negative results in science?

                  benjamineskola@hachyderm.io
                  wrote last edited by
                  #127

                  @mc @whitequark do they actually even recognise it as a negative result though?

                  They seem to be presenting it as a positive one (looking at the abstract and conclusion) — but I admit I'm not familiar with the norms for writing this sort of paper.

                  • whitequark@social.treehouse.systemsW whitequark@social.treehouse.systems

                    i'm at a loss of words after reading a paper about reformatting code using an ML model that has a measured statistical quantity A_c which says how often the reformatted code behaves the same as the original

                    the "ideal" (their choice of words) case is 64.2%
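
The thread does not quote the paper's exact definition of A_c. As a rough C++ sketch, assuming it is the fraction of transformed programs that produce the same output as the original on a fixed set of test inputs (`Program`, `Pair`, and `behavior_preservation_rate` are hypothetical names):

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// A "program" is modelled here as a function from an int input to an int output.
using Program = std::function<int(int)>;

struct Pair {
    Program original;
    Program transformed;
};

// Fraction of pairs whose transformed program matches the original on
// every test input. Returns 0.0 for an empty set of pairs.
double behavior_preservation_rate(const std::vector<Pair>& pairs,
                                  const std::vector<int>& inputs) {
    if (pairs.empty()) return 0.0;
    std::size_t preserved = 0;
    for (const Pair& p : pairs) {
        bool same = true;
        for (int x : inputs) {
            if (p.original(x) != p.transformed(x)) {
                same = false;
                break;
            }
        }
        if (same) ++preserved;
    }
    return static_cast<double>(preserved) / static_cast<double>(pairs.size());
}
```

Under this reading, a value of 64.2% would mean roughly one in three transformed programs fails at least one test. Agreement on sampled inputs only shows that no difference was observed, not that the two programs are equivalent.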

                    dasgrueneblatt@wien.rocks
                    wrote last edited by
                    #128

                    @whitequark amazing 😱

                      teilweise@layer8.space
                      wrote last edited by
                      #129

                      @whitequark Looking at https://upload.whitequark.org/1774306843-Duetcs_Code_Style_Transfer_through_Generation_and_Retrieval.pdf, Fig. 6:

                      Look at `bool ok, count = false;`: this initializes only `count` and leaves `ok` uninitialized.
                      In every case that should print “YES”, the `ok = false;` line is never executed, so it’s undefined whether the program prints “YES” or “NO” (it might even differ between invocations).

                      Neither the input nor the ground truth had that bug.

                      It looks like the researchers did not notice it and considered it correct.
                      (64.2% …)

                      It was obvious to me, would you have caught it?
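
A minimal C++ sketch of the fix, assuming a program of roughly this shape. The code in Fig. 6 is not quoted in the thread, so the function, its logic, and the role of `count` are hypothetical stand-ins. In a declaration with several declarators, an initializer applies only to the declarator it is attached to:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the Fig. 6 program: answers "YES" when at least
// one value was seen and none of them is negative.
std::string all_non_negative(const std::vector<int>& values) {
    // `bool ok, count = false;` would initialize only `count`; `ok` would
    // start with an indeterminate value. Each declarator needs its own
    // initializer:
    bool ok = true, count = false;
    for (int v : values) {
        count = true;           // at least one value was seen
        if (v < 0) ok = false;  // the only place ok is ever assigned
    }
    return (ok && count) ? "YES" : "NO";
}
```

With `ok` initialized, `all_non_negative({1, 2, 3})` returns "YES" on every run. In the buggy form the "YES" path never assigns `ok`, so the result depends on whatever value the variable happens to hold.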

                        srazkvt@tech.lgbt
                        wrote last edited by
                        #130

                        @whitequark @lu_leipzig yea, such as: the code being shit

                          c0dec0dec0de@hachyderm.io
                          wrote last edited by
                          #131

                          @SRAZKVT @whitequark @lu_leipzig in general, I agree, but I almost wish I could have just told the software teams that I worked with a couple years ago “this is the style for this language, just deal with it” instead of having hours of meetings about clang-format settings.

                            whitequark@social.treehouse.systems
                            wrote last edited by
                            #132

                            @c0dec0dec0de @SRAZKVT @lu_leipzig I think it's different for corporate. I don't really care about most corporate code I touch (that isn't already OSS I maintain, that is), it's completely whatever. I care a lot about this in projects whose success I'm invested in

                            • disorderlyf@todon.euD disorderlyf@todon.eu

                              @whitequark So let me get this straight, IEEE thinks you should count it as a win if rewriting your code by vibing it has less than 15% better odds than a literal coinflip of reproducibility?

                              edited for clarity and to fix a typo

                              sammy@cherrykitten.gay
                              wrote last edited by
                              #133

                              @disorderlyf @whitequark i think "ideal" here means "the best case scenario that we encountered under ideal conditions", as opposed to a target for how it should be

                                c0dec0dec0de@hachyderm.io
                                wrote last edited by
                                #134

                                @whitequark @SRAZKVT @lu_leipzig I get that. At the end of it, I was just like “pick something, I don’t care”. This will make your code more readable regardless of what you pick and minimize diffs in some cases.

                                  numerfolt@kirche.social
                                  wrote last edited by
                                  #135

                                  @whitequark Uh, that's crazy O.o

                                  • urixturing@hachyderm.ioU urixturing@hachyderm.io

                                    @disorderlyf @whitequark IEEE and ACM don't do the research, nor do they tell you to do things; they are publishers that own journals and conferences where researchers publish their work

                                    disorderlyf@todon.eu
                                    wrote last edited by
                                    #136

                                    @urixturing @whitequark I initially thought IEEE was like a standards body specifically for networking, like a hardware W3C. Regardless of who did the research, I thought this was their conclusion. It sounds like I was wrong on both counts

                                      disorderlyf@todon.eu
                                      wrote last edited by
                                      #137

                                      @sammy @whitequark I hope you're right

                                        urixturing@hachyderm.io
                                        wrote last edited by
                                        #138

                                        @disorderlyf @whitequark that would be the IETF, who publishes the RFCs (networking standards like email or HTTP)

                                          urixturing@hachyderm.io
                                          wrote last edited by
                                          #139

                                          @disorderlyf @whitequark but honestly I understand why it's very confusing
