I have deeply mixed feelings about #ActivityPub's adoption of JSON-LD, as someone who's spent way too long dealing with it while building #Fedify.

Tags: fedify, jsonld, fedidev, activitypub
82 Posts 19 Posters 1 Views
  • evan@cosocial.ca

    @hongminhee do you use the activitystrea.ms module from npm? It takes a lot of the pain out.

    hongminhee@hollo.social
    wrote last edited by
    #24

    @evan@cosocial.ca I don't remember exactly, but I think I came across it while doing research before developing Fedify. I probably didn't use it because the TypeScript type definitions were missing. In the end, I ended up making something similar in Fedify anyway.

      mariusor@metalhead.club
      wrote last edited by
      #25

      @silverpill I'm sorry, I'm not aware of that and I thought I read the specs pretty thoroughly. Could you point me in the right direction for where you got this information from?

      @hongminhee

      • hongminhee@hollo.social

        I have deeply mixed feelings about #ActivityPub's adoption of JSON-LD, as someone who's spent way too long dealing with it while building #Fedify.

        Part of me wishes it had never happened. A lot of developers jump into ActivityPub development without really understanding JSON-LD, and honestly, can you blame them? The result is a growing number of implementations producing technically invalid JSON-LD. It works, sort of, because everyone's just pattern-matching against what Mastodon does, but it's not correct. And even developers who do take the time to understand JSON-LD often end up hardcoding their documents anyway, because proper JSON-LD processor libraries simply don't exist for many languages. No safety net, no validation, just vibes and hoping you got the @context right. Naturally, mistakes creep in.

        But then the other part of me thinks: well, we're stuck with JSON-LD now. There's no going back. So wouldn't it be nice if people actually used it properly? Process the documents, normalize them, do the compaction and expansion dance the way the spec intended. That's what Fedify does.

        Here's the part that really gets to me, though. Because Fedify actually processes JSON-LD correctly, it's more likely to break when talking to implementations that produce malformed documents. From the end user's perspective, Fedify looks like the fragile one. “Why can't I follow this person?” Well, because their server is emitting garbage JSON-LD that happens to work with implementations that just treat it as a regular JSON blob. Every time I get one of these bug reports, I feel a certain injustice. Like being the only person in the group project who actually read the assignment.

        To be fair, there are real practical reasons why most people don't bother with proper JSON-LD processing. Implementing a full processor is genuinely a lot of work. It leans on the entire Linked Data stack, which is bigger than most people expect going in. And the performance cost isn't trivial either. Fedify uses some tricks to keep things fast, and I'll be honest, that code isn't my proudest work.

        Anyway, none of this is going anywhere. Just me grumbling into the void. If you're building an ActivityPub implementation, maybe consider using a JSON-LD processor if one's available for your language. And if you're not going to, at least test your output against implementations that do.

        #JSONLD #fedidev
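
To illustrate the difference between pattern-matching on plain JSON and actually processing the document, here is a minimal sketch in Python. It is a toy: `TERMS` is a hand-written stand-in for a tiny slice of the ActivityStreams context, and `expand` is a hypothetical helper, not the JSON-LD expansion algorithm and not how Fedify works. It only shows why two documents that look different as plain JSON can mean the same thing once terms are resolved to IRIs and values are normalized to lists.

```python
import json

AS = "https://www.w3.org/ns/activitystreams#"

# Hand-written stand-in for a tiny slice of the ActivityStreams context.
# A real processor would derive this mapping from the document's @context.
TERMS = {"type": "@type", "Note": AS + "Note", "name": AS + "name", "to": AS + "to"}


def expand(doc: dict) -> dict:
    """Toy expansion: resolve known terms to IRIs and wrap every value in a list."""
    out = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        iri = TERMS.get(key, key)
        values = value if isinstance(value, list) else [value]
        if iri == "@type":
            values = [TERMS.get(v, v) for v in values]
        out[iri] = values
    return out


# Two surface forms of the same note: short terms with scalar values...
compact = json.loads('{"type": "Note", "name": "hi", "to": "https://example.com/alice"}')
# ...and full IRIs with array values.
verbose = {
    "@type": [AS + "Note"],
    AS + "name": ["hi"],
    AS + "to": ["https://example.com/alice"],
}

# A plain-JSON consumer keyed on "type" sees two unrelated shapes.
print(compact.get("type"), verbose.get("type"))
# After the toy expansion, both documents compare equal.
print(expand(compact) == expand(verbose))
```

A consumer that only ever checks `doc["type"] == "Note"` handles the first form and silently misses the second, which is the kind of mismatch the post describes.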

        varpie@peculiar.florist
        wrote last edited by
        #26

        @hongminhee I have the same feeling. The idea behind JSON-LD is nice, but it isn't widely available, so developing with it becomes a headache: do I want to create a JSON-LD processor, spending twice the time I wanted to, or do I just consider it as JSON for now and hope someone will make a JSON-LD processor soon? Often, the answer is the latter, because it's a big task that we're not looking for when creating fedi software.

          mariusor@metalhead.club
          wrote last edited by
          #27

          @silverpill aaah, I see. I think we've had this discussion before (or at least I had it with someone else).

          For me "SHOULD" falls in the category of the robustness principle: "be conservative in what you do, be liberal in what you accept from others".

          So for me, if you treat "SHOULD" in a spec as non-mandatory, you haven't really implemented the spec.

          @hongminhee

            mariusor@metalhead.club
            wrote last edited by
            #28

            @silverpill regarding size, ActivityPub is such a verbose protocol that the hundred or so raw bytes you save by omitting the context are most likely negligible once connection compression is taken into account. So to me that's not entirely a "valid reason".

            And as a developer myself, I think that contexts, even in a non-valid JSON-LD implementation, offer enough guidance for building a data vocabulary for them to have plenty of value.

            Do you propose we replace contexts with OpenAPI specifications, or how do we coordinate what's a valid vocabulary data object in a federated network? And how do you propose that others discover these specs?

            @hongminhee

              mariusor@metalhead.club
              wrote last edited by
              #29

              @hongminhee can you point me to the parser you use for Fedify?

              One of my long-term plans for GoActivityPub is to build a code generation tool based on contexts, and I would need some prior art to see what's important in parsing JSON-LD and RDF.

              • julian@activitypub.space

                @hongminhee@hollo.social I'll give you my take on this... which is that my understanding of JSON-LD is that with JSON-LD you can have two disparate apps using the same property, like thread, and avoid namespace collision because one is actually https://example.org/ns/thread and the other's really https://foobar.com/ns/thread.

                Great.

                I posit that this is a premature optimization, and one that fails because of inadequate adoption. There are likely documented cases of implementations using the same property, and those concern the actual ActivityStreams vocabulary, and the solution to that is to communicate and work together so that you don't step on each other's toes.

                I personally feel that it is a technical solution to a problem that can be completely handled by simply talking to one another... but we're coders, we're famously anti-social yes? mmmmm...
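
The collision-avoidance idea described above can be sketched in a few lines of Python. This is only an illustration: the two context dictionaries reuse the example IRIs from the post, and `resolve` is a hypothetical helper standing in for what a JSON-LD processor does with `@context`.

```python
# Each app ships its own context; both use the short name "thread".
# The IRIs are the illustrative ones from the post, not real vocabularies.
APP_A_CONTEXT = {"thread": "https://example.org/ns/thread"}
APP_B_CONTEXT = {"thread": "https://foobar.com/ns/thread"}


def resolve(context: dict, doc: dict) -> dict:
    """Rewrite each key to the IRI its context assigns, leaving unknown keys as-is."""
    return {context.get(key, key): value for key, value in doc.items()}


doc_a = resolve(APP_A_CONTEXT, {"thread": "https://a.example/t/1"})
doc_b = resolve(APP_B_CONTEXT, {"thread": "https://b.example/t/9"})

# As plain JSON both documents use the key "thread"; after resolution the
# keys differ, so a consumer can tell the two properties apart.
print(set(doc_a) & set(doc_b))  # no shared keys after resolution
```

Whether this mechanism is worth its cost compared to simply coordinating on names is exactly the question the post raises; the sketch only shows what the mechanism does.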

                douginamug@mastodon.xyz
                wrote last edited by
                #30

                @pintoch read this thread?

                  Guest
                  wrote last edited by
                  #31

                  @hongminhee @mariusor JSON-LD aware implementations should work around missing @context, probably by assuming ActivityPub context.
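
A minimal sketch of that workaround in Python, assuming the receiver falls back to the ActivityStreams context URL when none is given. `with_default_context` is a hypothetical helper, not part of any library mentioned in this thread.

```python
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def with_default_context(doc: dict) -> dict:
    """Return a copy of doc that is guaranteed to carry an @context.

    Documents that already declare one are returned unchanged, so
    extension contexts supplied by the sender are not overridden.
    """
    if "@context" in doc:
        return dict(doc)
    return {"@context": AS_CONTEXT, **doc}


bare = {"type": "Follow", "actor": "https://a.example/u/1", "object": "https://b.example/u/2"}
patched = with_default_context(bare)
print(patched["@context"])
```

The patched copy can then be handed to whatever JSON-LD processing the receiver already performs, while the original document is left untouched.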

                    mariusor@metalhead.club
                    wrote last edited by
                    #32

                    @silverpill personally, I feel like the various activity/object signing methods that get used in recent FEPs are more egregious from a size point of view, when the in-spec behaviour for obtaining canonical versions of a resource is to fetch them from their server, instead of relying on random object signing that introduces so much more friction.

                    @hongminhee

                      julian@activitypub.space
                      wrote last edited by
                      #33

                      @mariusor@metalhead.club I thought the whole point of signing objects or attaching proofs (none of which I do, mind you) is precisely to save the need to make a new request, which comes with its own overhead.

                      The good thing is fetching from the canonical source will never go out of style.

                      cc @silverpill@mitra.social

                      Aside, it seems like I'm only getting Marius's posts, not silverpill's. Makes for an interesting one-sided exchange 😛

                        hongminhee@hollo.social
                        wrote last edited by
                        #34

                        @mariusor@metalhead.club It's barely documented, but has worked well so far!

                        fedify/packages/vocab-tools at main · fedify-dev/fedify
                        ActivityPub server framework in TypeScript.
                        GitHub (github.com)

                          mariusor@metalhead.club
                          wrote last edited by
                          #35

                          > to save the need to make a new request

                          @julian probably. But then there's Mastodon, which treats so many activities as transient, and therefore unfetchable, which I think is what made object signing an actual necessity. And outside of the happy path where the actor that generated the object is already known to the server that receives it, there's still the need to fetch their key, so there are no savings for 10-20% (number out of my butt) of activities... As you can probably tell, to me the friction introduced by signatures is not a good enough tradeoff compared to making one more request.

                          @silverpill

                            mariusor@metalhead.club
                            wrote last edited by
                            #36

                            @silverpill lol, that's simply madness to me. See the sibling reply to Julian for why I think signatures, which is what I imagine you mean by "authenticated", are an unnecessary contrivance.

                            I meant "data object" in this context as the end-result binary data type that your application deals with, which for my preference, needs to match the structure of the incoming payload as closely as possible.

                            @hongminhee

                              Guest
                              wrote last edited by
                              #37

                              @mariusor Signatures increase message size but reduce the number of network requests. They are optional, too.

                              @hongminhee

                                douginamug@mastodon.xyz
                                wrote last edited by
                                #38

                                @hongminhee I'm reading this thread as a relative noob, but what I see again and again: almost no one "properly" implements #ActivityPub, largely because #JSONLD is hard but also because the spec itself is unclear. Most people who get stuff done have to go off-spec to actually ship.

                                This seems a fundamental weakness of the #fediverse, and that's before considering the limitations coming from the base architecture. It seems to pose a mid/long-term existential threat.

                                What can we do to help improve things?

                                  Guest
                                  wrote last edited by
                                  #39

                                  @julian I noticed that your inbox endpoint returns 404s (my activities are delivered to personal inbox, not shared).

                                  @mariusor

                                    kopper@not-brain.d.on-t.work
                                    wrote last edited by
                                    #40
                                    @hongminhee from the point of view of someone who is "maintaining" a JSON-LD processing fedi software and has implemented their own JSON-LD processing library (which is, to my knowledge, the fastest in its programming language), JSON-LD is pure overhead. there is nothing it allows for that can't be done with

                                    1. making fields which take multiple values explicit
                                    2. always using namespaces and letting HTTP compression take care of minimizing the transfer

                                    without JSON-LD, fedi software could use zero-ish-copy deserialization for a majority of their objects (when strings aren't escaped) through tools like serde_json and Cow<str>, or System.Text.Json.JsonDocument. JSON-LD processing effectively mandates a JSON node DOM (in the algorithms standardized, you may be able to get rid of it with Clever Programming)

                                    additionally, due to JSON-LD 1.1 features like @type:@json, you cannot even fetch contexts ahead of time of running JSON DOM transformations, meaning all JSON-LD code has to be async (in the languages which have the concept), potentially losing out on significant optimizations that can't be done in coroutines due to various reasons (e.g. C# async methods can't have ref structs, Rust async functions usually require thread safety due to tokio's prevalence, even if they're run in a single-threaded runtime)

                                    this is after context processing introducing network dependency to the deserialization of data, wasting time and data on non-server cases (e.g. activitypub C2S). sure you can cache individual contexts, but then the context can change underneath you, desynchronizing your cached context and, in the worst case, opening you up to security vulnerabilities

                                    json-ld is not my favorite part of this protocol
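
As a rough illustration of the alternative outlined in points 1 and 2 above, here is what consuming such a format could look like in Python. The wire format below is hypothetical (it is not ActivityPub, and the exact key names are assumptions): every multi-valued field is always an array and every key is a full IRI, so a consumer needs no context fetch and no single-value-or-array branching.

```python
import json
from dataclasses import dataclass
from typing import List

NS = "https://www.w3.org/ns/activitystreams#"


@dataclass
class Note:
    content: str
    to: List[str]


def parse_note(raw: str) -> Note:
    """Parse the hypothetical format with plain JSON decoding only."""
    data = json.loads(raw)
    # Keys are full IRIs, so no term lookup is needed, and "to" is
    # always a list by convention, so no scalar case has to be handled.
    return Note(content=data[NS + "content"], to=data[NS + "to"])


raw = json.dumps({NS + "content": "hello", NS + "to": ["https://a.example/u/1"]})
note = parse_note(raw)
print(note.to)
```

Python's `json` module still builds an intermediate dictionary here; the zero-copy aspect mentioned in the post applies to tools like serde_json and is not demonstrated by this sketch.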
                                      kopper@not-brain.d.on-t.work
                                      wrote last edited by
                                      #41
                                      @hongminhee take this part with a grain of salt because my benchmarks for it are with dotNetRdf which is the slowest C# implementation i know of (hence my replacement library), but JSON-LD is slower than RSA validation, which is one of the pain points around authorized fetch scalability

                                      wetdry.world/@kopper/114678924693500011
                                        liaizon@social.wake.st
                                        wrote last edited by
                                        #42

                                        reposting so @julian sees this

                                        "I noticed that your inbox endpoint returns 404s (my activities are delivered to personal inbox, not shared)." says @silverpill

                                          fentiger@mastodon.social
                                          wrote last edited by
                                          #43

                                          @kopper @hongminhee I'm glad I'm not the only one who noticed this.
