A cursed feature of C in 1972: Labels and functions were reassignable (i.e., lvalues)!

  thalia@discuss.systems wrote (#1):

    A cursed feature of C in 1972: Labels and functions were reassignable (i.e., lvalues)!

    For example, this is a clever way to initialize once:

    goto init;
    init:
    ouptr = oubuf;
    init = init1;
    init1:

    which is compiled to:

    jmp *4120
    mov 4136,4144
    mov 4122,4120

    Note the indirect jump and assignment to that address. All gotos used indirect jumps. This apparently would have also worked with functions.
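
For readers more at home in modern C, the same once-only pattern can be approximated with a function pointer that reassigns itself. This is only a sketch of the idea, not the 1972 code: `init_first`, `init_done`, `put_char`, `buffered`, and the 512-byte size of `oubuf` are assumptions made for illustration; only `ouptr`, `oubuf`, and `init` come from the snippet.

```c
#include <stddef.h>

/* Hypothetical buffer size; ouptr and oubuf are the names from the snippet. */
static char oubuf[512];
static char *ouptr;

void init_first(void);
void init_done(void);

/* The "reassignable function": starts out pointing at the one-time setup. */
static void (*init)(void) = init_first;

void init_first(void)
{
    ouptr = oubuf;      /* one-time setup, as in the original */
    init = init_done;   /* analogue of `init = init1;` */
}

void init_done(void)
{
    /* Later calls fall straight through. */
}

/* Hypothetical caller: appends one character, initializing on first use. */
void put_char(int c)
{
    init();             /* always an indirect call, much like `jmp *4120` */
    if (ouptr < oubuf + sizeof oubuf)
        *ouptr++ = (char)c;
}

/* Number of characters buffered so far. */
size_t buffered(void)
{
    return ouptr ? (size_t)(ouptr - oubuf) : 0;
}
```

Calling `put_char` twice runs the setup once: after the first call `init` points at `init_done`, so later calls pay only for the indirect call.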

    #c #unix #retrocomputing #vintagecomputing

    thalia@discuss.systems wrote (#2, replying to #1):

    At the time, C was rapidly evolving into what we recognize today.

    It started as B, an untyped, interpreted language with only one kind of value, the word. This was a perfect fit for the PDP-7 that UNIX started on, with its 18-bit words, but as they transitioned to the PDP-11, with 16-bit words and byte addressing, this became a limitation.

    Soon, types were added to distinguish char, int, and pointers, and it became known as NB ("New B"). But B wasn't particularly fast, as it was interpreted (well, threaded).

    Once it was rewritten to be compiled, the language became known as C (perhaps initially as NC). For a short while, everything in C was an lvalue, as in B, giving the above snippet, but this was dropped a few months later, presumably for efficiency.

    Some UNIX utilities had been written in B/C from the start, but efforts to rewrite the kernel itself in B/C had failed. Finally, once structs were added to C, it was powerful enough to support the kernel and it was rewritten in C over the summer of 1973, culminating in the release of UNIX V4.

    #c #unix #retrocomputing #vintagecomputing

      thalia@discuss.systems wrote (#3, replying to #2):

      This snippet appears in cvft, a compiler for translating Fortran threaded code to machine code from June 1972, which is notably derived from the early C code generator. See putchar (and also getcha) in dmr/cgd/cvft.c.

      The earliest extant C compiler is last1120c from July 1972, the last C version for the PDP-11/20, before they migrated to the PDP-11/45. This version still has the label lvalue behavior of B seen in cvft. Then, it was changed to the modern behavior by the time of prestruct-c from December 1972. That version supports structures, but does not yet use them itself.

      All three can be found in Dennis_Tapes: https://www.tuhs.org/Archive/Applications/Dennis_Tapes

        bms48@mastodon.social wrote (#4, replying to #2):

        @thalia Dealing with BCPL in AmigaOS systems code (inherited from TripOS) in the pre-A3000 era was like kicking dead whales down the beach.

          vk2bea@mastodon.radio wrote (#5, replying to #1):

          @thalia that seems dangerous!

            djl@mastodon.mit.edu wrote (#6, replying to #4):

            @bms48 @thalia

            "like kicking dead whales down the beach."

            Hmm. I used BCPL on Tenex in 1980, and don't remember having problems.

            And my dad had a PDP-7 at work. This one.

            [Image: FAF_PDP7web.jpg by David in Tokyo, hosted on PBase (pbase.com)]

              thalia@discuss.systems wrote (#7, replying to #6):

              @djl @bms48 Lovely. I'd love to get a PDP-7, but they're incredibly rare.

                djl@mastodon.mit.edu wrote (#8, replying to #1):

                @thalia

                I'm reminded of the MIT PDP-6 assembler poem:

                PUSHJ, PUSHJ, POPJ P,
                JRST . + 1203

                  thalia@discuss.systems wrote (#9, replying to #8):

                  @djl I'm afraid I don't speak PDP-6 / PDP-10 assembly (yet?). Could you elucidate?

                    usul@piaille.fr wrote (#10, replying to #1):

                    @thalia what kind of assembly is that?

                      huitema@social.secret-wg.org wrote (#11, replying to #5):

                      @vk2bea @thalia
                      The ASSIGN statement in Fortran IV and the ALTER statement in COBOL supported ways to redirect the target of a GOTO, in much the same way as the "cursed feature" of C that you described. I assume that at the time, it felt important to have parity between C and Fortran (and maybe COBOL).
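
As a rough illustration of the assigned-GOTO idea in standard C, which has no label variables, the variable that Fortran's `ASSIGN ... TO K` would set can be modeled as an enum driving a `switch`. The names `target`, `k`, and `step` are hypothetical, and this is a sketch of the control flow only, not of how any Fortran or COBOL compiler implemented it.

```c
/* Hypothetical jump targets for the sketch. */
enum target { FIRST_TIME, STEADY_STATE };

/* Plays the role of the integer variable K in `ASSIGN n TO K`. */
static enum target k = FIRST_TIME;

/* Returns which branch ran: 1 for the first-time path, 2 afterwards. */
int step(void)
{
    switch (k) {                /* analogue of `GO TO K` */
    case FIRST_TIME:
        k = STEADY_STATE;       /* analogue of `ASSIGN <label> TO K` */
        return 1;
    case STEADY_STATE:
        return 2;
    }
    return 0;                   /* not reached with the values above */
}
```

The first call to `step` takes the first-time branch and redirects `k`; every later call goes to the steady-state branch.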

                        thalia@discuss.systems wrote (#12, replying to #10):

                        @usul PDP-11 assembly with UNIX syntax. Those are octal addresses. Unless an address has a $, it refers to the value at that address. * is a dereference.
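
To make those operand rules concrete, here is a toy C model of the three instructions from the first post. It is a sketch under stated assumptions: memory is a small array of 16-bit words, the helper names (`load`, `store`, `mov`, `jmp_indirect`) are made up, immediate `$` operands are not modeled, and the reading of the addresses (4120 as the `init` cell, 4122 as the `init1` cell, 4136 as `oubuf`, 4144 as `ouptr`) is inferred from lining the assembly up with the C source.

```c
#include <stdint.h>

/* Toy memory: 4096 16-bit words, indexed by byte address / 2. */
enum { MEM_WORDS = 4096 };
static uint16_t mem[MEM_WORDS];

/* Read the word stored at a byte address (wraps within the toy memory). */
uint16_t load(uint16_t addr) { return mem[(addr / 2) % MEM_WORDS]; }

/* Write a word at a byte address. */
void store(uint16_t addr, uint16_t v) { mem[(addr / 2) % MEM_WORDS] = v; }

/* mov src,dst with two bare addresses: copy the word stored at src to dst. */
void mov(uint16_t src, uint16_t dst) { store(dst, load(src)); }

/* jmp *addr: the jump target is the word stored at addr, not addr itself. */
uint16_t jmp_indirect(uint16_t addr) { return load(addr); }
```

With this model, `jmp *4120` is `jmp_indirect(04120)`, and `mov 4122,4120` is `mov(04122, 04120)`: after it runs, the next indirect jump through 4120 lands wherever the cell at 4122 pointed.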

                          thalia@discuss.systems wrote (#13, replying to #3):

                          Another strange pattern from the same program.

                          This one reassigns the address of an array, `int nlist[250]`, in char increments. Arrays are no longer lvalues, so this doesn't work anymore. Also, the address is unaligned every other iteration.

                          lbp;
                          nlist[250];

                          getnam()
                          {
                              extern nlist, lbp;
                              char nlist[], lbp[], c;

                          loop:
                              c = *lbp++;
                              if (c==';' | c=='\n')
                                  goto el;
                              *nlist++ = c;
                              goto loop;
                          el:
                              *nlist++ = '\0';
                          }

                          Somewhat simplified from Dennis_Tapes/dmr/cgd/cg1.c:getnam.
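
Since arrays are no longer lvalues, a present-day version needs a separate cursor into the array's storage. The following is a sketch, not a port of cg1.c: `nlist_cur` is a hypothetical name, the bounds check and return value are additions, and the input is assumed to contain a `;` or newline before it ends.

```c
#include <stddef.h>

int nlist[250];                          /* same storage as in the original */
static char *nlist_cur = (char *)nlist;  /* hypothetical cursor replacing `nlist++` */
const char *lbp;                         /* input position, set by the caller */

/* Copies one name from lbp into nlist's bytes, stopping at ';' or newline.
   Returns 1 on success, 0 if nlist has no room left. */
int getnam(void)
{
    char *end = (char *)nlist + sizeof nlist;
    char c;

    for (;;) {
        if (nlist_cur >= end)
            return 0;                    /* storage exhausted */
        c = *lbp++;
        if (c == ';' || c == '\n')
            break;
        *nlist_cur++ = c;                /* advance one char at a time */
    }
    *nlist_cur++ = '\0';
    return 1;
}
```

Because the cursor moves in char steps through int storage, successive names are packed back to back, just as in the original, but `nlist` itself never changes.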

                            markd@hachyderm.io wrote (#14, replying to #11):

                            @huitema @vk2bea @thalia ALTER-like functionality reflects the fact that programming was still evolving toward subroutines and functions, and concepts like Structured Programming were still considered radical by old-school programmers at the time**.

                            It probably didn't help that this was before formal programming courses, so a lot of programmers were self-taught and developed their craft in isolation (often in assembler). Using "go to"s and ALTERs came pretty naturally.

                            In other words, if you wanted a language that appealed to the masses at the time, you pretty much had to include goto and ALTER.

                            ** https://en.wikipedia.org/wiki/Structured_programming#Debate

                              bms48@mastodon.social wrote (#15, replying to #6):

                              @djl @thalia This was largely because of the constant need to translate longword-pointers to byte ones when interworking between modules with BCPL and C linkage. It only specifically affected AmigaDOS and not other subsystems (exec, graphics, intuition etc.)
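
The translation in question is a shift by two bits, since a BCPL pointer counts 32-bit longwords while a C pointer counts bytes. A minimal sketch, with addresses modeled as plain 32-bit integers and hypothetical helper names (the real AmigaOS headers have their own macros for this, such as `BADDR`):

```c
#include <stdint.h>

/* BCPL pointer (index of a 32-bit longword) to byte address. */
uint32_t bptr_to_addr(uint32_t bptr) { return bptr << 2; }

/* Byte address to BCPL pointer; only exact for longword-aligned addresses. */
uint32_t addr_to_bptr(uint32_t addr) { return addr >> 2; }

/* True when addr can be represented as a BCPL pointer without loss. */
int is_longword_aligned(uint32_t addr) { return (addr & 3u) == 0; }
```

Every structure handed across the BCPL/C boundary needs one of these conversions, and forgetting one yields a pointer that is off by a factor of four.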

                                thalia@discuss.systems wrote (#16, replying to #13):

                                This snippet deliberately triggers a "Bus error -- Core dumped":

                                int o1[];
                                o1 = -3;
                                *o1;

                                From Dennis_Tapes/dmr/cgd/cg1.c:expr.
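
The post does not spell out why this traps. One plausible reading, assuming the machine faults on word accesses to odd addresses, is that -3 wraps to an odd address near the top of the 16-bit address space. The helper names below are hypothetical and the check only models alignment, not the hardware trap itself.

```c
#include <stdint.h>

/* 1 if a 16-bit word access at addr would be misaligned (odd address). */
int is_odd_word_access(uint16_t addr) { return (addr & 1u) != 0; }

/* The address that `o1 = -3` produces in a 16-bit address space. */
uint16_t minus_three_addr(void) { return (uint16_t)-3; }   /* octal 177775 */
```

Under that assumption, `*o1` is a word read at octal 177775, which `is_odd_word_access` flags as misaligned.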

                                  rupertreynolds@hachyderm.io wrote (#17, replying to #1):

                                  @thalia Eeeek! Make it go away!

                                    sigmasternchen@comfy.social wrote (#18, replying to #1):
                                    @thalia@discuss.systems I’m sorry. What.

                                      eniko@mastodon.gamedev.place wrote (#19, replying to #1):

                                      @thalia oh this is cool, basically self-modifying code in C!

                                        djl@mastodon.mit.edu wrote (#20, replying to #9):

                                        @thalia

                                        PUSHJ, PUSHJ, POPJ P,
                                        JRST . + 1203

                                        pushjay, pushjay, popjay pee
                                        Jrst to point plus twelve oh three

                                        PUSHJ is the recursive subroutine call, POPJ is the return therefrom, both require the stack register to be stipulated.

                                        JRST is the unconditional jump instruction, and "." ("point") is the current address.

                                        The point of 1203 being that it's a pretty random place in memory that's really unlikely to have code starting there that makes any sense.
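
For anyone who wants to trace the poem, the three instructions can be mimicked with a toy C model. This is a simplified sketch: the `struct cpu` layout and function names are hypothetical, the saved word holds only a return address, the stack is a fixed 16 entries, and 1203 is treated as octal.

```c
/* Toy model: a program counter and a small return stack. */
struct cpu {
    unsigned pc;
    unsigned sp;            /* number of entries currently on the stack */
    unsigned stack[16];
};

/* PUSHJ P,target: save the return address, then jump. Returns 0 if full. */
int pushj(struct cpu *c, unsigned target)
{
    if (c->sp >= 16)
        return 0;
    c->stack[c->sp++] = c->pc + 1;
    c->pc = target;
    return 1;
}

/* POPJ P,: pop the saved address and jump back to it. Returns 0 if empty. */
int popj(struct cpu *c)
{
    if (c->sp == 0)
        return 0;
    c->pc = c->stack[--c->sp];
    return 1;
}

/* JRST target: unconditional jump. */
void jrst(struct cpu *c, unsigned target)
{
    c->pc = target;
}
```

In this model `JRST . + 1203` is `jrst(&c, c.pc + 01203)`: wherever the program happens to be, it leaps 1203 (octal) words ahead into whatever lies there.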

                                          aap@mastodon.sdf.org wrote (#21, replying to #16):

                                          @thalia Very interesting program too 🙂 Did it ever end up in anything later, or was it more like an experiment?
