I have deeply mixed feelings about #ActivityPub's adoption of JSON-LD, as someone who's spent way too long dealing with it while building #Fedify.
-
@julian @mcc @hongminhee the downside is that you now need a central registry of allowed terms and what they mean.
the way to avoid that is to always use "expanded" form, i.e. use full IRIs as property keys (and types) and {"id": "foo"} over "foo". in effect, you treat the http(s) authority as the social entity defining the term.
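To make the compacted-vs-expanded distinction concrete, here is a small illustration (the IRIs are the real AS2 ones; the Note itself is made up):

```python
import json

# The same Note in compacted form (short terms defined by the @context)
# and in expanded form (full IRIs as keys, values always in arrays).
compacted = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Note",
    "content": "Hello",
    "attributedTo": "https://example.com/users/alice",
}

expanded = {
    "@type": ["https://www.w3.org/ns/activitystreams#Note"],
    "https://www.w3.org/ns/activitystreams#content": [{"@value": "Hello"}],
    "https://www.w3.org/ns/activitystreams#attributedTo": [
        {"@id": "https://example.com/users/alice"}
    ],
}

# In expanded form there is nothing left to look up: every key is a full
# IRI, so two implementations can compare terms without first agreeing
# on a shared @context.
assert "@context" not in expanded
print(json.dumps(expanded, indent=2))
```

The expanded form is more verbose, but the http(s) authority in each IRI is exactly the "social entity defining the term" described above.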
-
@mcc @hongminhee Don't really support it, but discard activities without @context anyway. I suspect JSON-LD was a way to have extensibility and escape XMPP's XEP hell, with servers and clients not supporting or disabling features in an infinite matrix.
But it seems the community favors FEPs describing JSON schemas and hardcoding them over fetching them from a server and mapping the object at runtime.
-
@cochise @mcc @hongminhee mastodon is one of the "better" ones in that regard, but famously requires you to have the same context as it (instead of expanding shorthand terms to the full IRIs and comparing those...)
-
@hongminhee if i can give one piece of advice to devs who want to process JSON-LD: don't bother compacting. you already know the schema you output (or you're just passing through what the user gives and it doesn't matter to you), so serialize directly to the compacted representation, and only run expansion on incoming data
expansion is the cheapest JSON-LD operation (since all other operations depend on it and run it internally anyhow), and this will get you all the compatibility benefits of JSON-LD with few downsides (beyond more annoying deserialization code, as you have to map the expanded representation to your internal structure, which will likely be modeled after the compacted one)
-
generally agreed except
> you have to map the expanded representation to your internal structure which will likely be modeled after the compacted one
this is compaction but manual instead of using a jsonld processor to do it. maybe the more precise argument is "don't bother with auto/native compaction"?
with that said: you also lose out on flattening and framing, which are pretty cool features for transforming the serialization. if you don't care about those, ok fine
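A minimal sketch of that "manual compaction" step, assuming an internal model shaped like the compacted form (the `to_internal` and `first` helpers are hypothetical names, not from any real library):

```python
AS = "https://www.w3.org/ns/activitystreams#"

def first(values):
    """Expanded form always uses arrays; our internal model wants one value."""
    return values[0] if values else None

def to_internal(expanded_obj):
    node = {}
    if "@type" in expanded_obj:
        # strip the vocabulary prefix -- manual "compaction" of the type
        node["type"] = first(expanded_obj["@type"]).removeprefix(AS)
    content = expanded_obj.get(AS + "content")
    if content:
        node["content"] = first(content).get("@value")
    author = expanded_obj.get(AS + "attributedTo")
    if author:
        node["attributedTo"] = first(author).get("@id")
    return node

note = {
    "@type": [AS + "Note"],
    AS + "content": [{"@value": "Hello"}],
    AS + "attributedTo": [{"@id": "https://example.com/users/alice"}],
}
print(to_internal(note))
# {'type': 'Note', 'content': 'Hello', 'attributedTo': 'https://example.com/users/alice'}
```

This is compaction by hand for a known shape, rather than the general-purpose algorithm from a JSON-LD processor.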
-
@julian @mat atproto lets you section things off by "app" roughly, which is something that could be done with "plain old http" using content-types and well-known uris.
json-ld makes it so that you don't have to use those -- the uris can be anything you'd like, including more natural names.
the problem is that people can and will disagree. "talk it out" is not a complete solution. the "talk it out" solution is things like central registries managed by the IANA which most treat as consensus.
-
The problem is rarely in parsing as2 context, but dealing with how different implementations decide to create projections from the data.
Take a simple poll. The 3 different servers I saw were generating the text, the choices, and the replies collection in completely different ways. Without JSON-LD, each separate system would be fighting to figure out how to present the data.
@raphael @silverpill @hongminhee @mariusor most often the trouble i see is with ignoring the fact that everyone is using the same terms with different meanings, and pretending that we all agree when we actually do not.
the second most common issue i see is with the complete lack of any guarantees beyond "this thing is probably an activity" (which even that small bit is often discarded!)
json-ld is so far down the list of pain points, and the pain comes from ignoring it or misusing it.
-
@raphael @silverpill @hongminhee @mariusor example: if you took lemmy's use of as:Group you might assume that every as:Group is a lemmy-style "community" and that it always produces 1b12-style Announce activities, and that "Announce" means how they use it and not its actual definition.
now if lemmy had used their own vocabulary, it might be easier to understand that "this is a lemmy-style community".
the activity processing model shouldn't care what lemmy properties are used.
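A sketch of that distinction over expanded type IRIs (the lemmy IRI here is hypothetical, purely for illustration; only the as: IRI is real):

```python
AS_GROUP = "https://www.w3.org/ns/activitystreams#Group"
# hypothetical vendor type IRI -- illustrative only, not Lemmy's real vocabulary
LEMMY_COMMUNITY = "https://join-lemmy.org/ns#Community"

def is_lemmy_style_community(expanded_types):
    # Don't infer vendor-specific behavior from the generic as:Group type;
    # require the vendor's own (hypothetical) term to be present.
    return LEMMY_COMMUNITY in expanded_types

assert not is_lemmy_style_community([AS_GROUP])
assert is_lemmy_style_community([AS_GROUP, LEMMY_COMMUNITY])
```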
-
@raphael @silverpill @hongminhee @mariusor what is far more powerful is drawing *equivalences* between values. you might say every lemmy:Community is also always as:Group, but not every as:Group is always a lemmy:Community. in this case we are basically saying lemmy:Community is rdfs:subClassOf as:Group.
separately we might say every as:Group is also a vcard:Group, and vice versa -- that might make them owl:equivalentClass, but that doesn't mean the "activity model" and "vcard model" are equal!
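A toy sketch of that subclass/equivalence reasoning over type IRIs (the lemmy IRI is hypothetical; the as: and vcard: namespaces are real vocabularies):

```python
# rdfs:subClassOf edges; owl:equivalentClass modeled as subclass both ways.
SUBCLASS_OF = {
    "https://join-lemmy.org/ns#Community":  # hypothetical vendor term
        {"https://www.w3.org/ns/activitystreams#Group"},
    "https://www.w3.org/ns/activitystreams#Group":
        {"http://www.w3.org/2006/vcard/ns#Group"},
    "http://www.w3.org/2006/vcard/ns#Group":
        {"https://www.w3.org/ns/activitystreams#Group"},
}

def is_a(type_iri, target, seen=None):
    """True if type_iri is target or a (transitive) subclass of it."""
    seen = seen or set()
    if type_iri == target:
        return True
    if type_iri in seen:  # guard against equivalence cycles
        return False
    seen.add(type_iri)
    return any(is_a(sup, target, seen) for sup in SUBCLASS_OF.get(type_iri, ()))

# every lemmy:Community is an as:Group, but not the other way around
assert is_a("https://join-lemmy.org/ns#Community",
            "https://www.w3.org/ns/activitystreams#Group")
assert not is_a("https://www.w3.org/ns/activitystreams#Group",
                "https://join-lemmy.org/ns#Community")
```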
-
@hongminhee what i have found necessary (sadly) is to sometimes ignore what @context a software produces and simply inject a corrected @context describing what they *actually* meant instead of what they said they meant. x_x
the "incorrect" mastodon context in use right now (or equivalent), which can be swapped out for the "correct" mastodon context to be more compatible with generic json-ld (and more semantically correct) - mastodon-context-correct.jsonld
it would be an exercise to sit down and map out the actual contexts of software like mastodon 4.5, mastodon 4.4, misskey 2025.12, akkoma 3.10.2, and so on...
for all else, there's shacl i guess, if you want to beat things into the correct shapes.
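A hedged sketch of that context-injection workaround: swap the producer's `@context` for a corrected one before handing the document to a JSON-LD processor. The corrected-context URL and the `inject_context` helper are placeholders for illustration, not a real published document or library:

```python
CORRECTED_CONTEXTS = {
    # hypothetical mapping: software name -> corrected @context to inject
    "mastodon": ["https://www.w3.org/ns/activitystreams",
                 "https://example.com/mastodon-context-correct.jsonld"],
}

def inject_context(doc, software):
    fixed = dict(doc)  # shallow copy; don't mutate the caller's object
    corrected = CORRECTED_CONTEXTS.get(software)
    if corrected:
        # ignore what the producer said and state what they actually meant
        fixed["@context"] = corrected
    return fixed

doc = {"@context": "https://www.w3.org/ns/activitystreams", "type": "Note"}
fixed = inject_context(doc, "mastodon")
assert fixed["@context"][1].endswith("mastodon-context-correct.jsonld")
```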
-
@trwnh@mastodon.social it's not an exercise, not anymore, with the Fediverse Observatory!
-
@trwnh @hongminhee i'm not entirely sure what you mean (it's about 3am here) but compaction isn't that cheap.
flattening and especially framing are the most expensive, and expansion is the cheapest especially since all the other algorithms depend on it (though if you do expand manually before it'll take a fast path out)
my argument here is that, if you know the structure you're serializing to (i.e. if you're a contemporary AP implementation that isn't doing anything too fancy), you can directly serialize in compacted form and skip constructing a tree of JSON objects in your library and running the compaction algorithm over it. depending on how clever you(r libraries) get, you may be able to write the JSON string directly, even.
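A minimal sketch of that idea: since the compacted shape is known ahead of time, build it directly and dump it, with no JSON-LD compaction pass on the outgoing side (the `serialize_note` helper is a made-up name; the field names follow AS2):

```python
import json

def serialize_note(note_id, content, author):
    # Emit the compacted representation directly -- no JSON-LD
    # processor involved on the serialization path.
    return json.dumps({
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": note_id,
        "type": "Note",
        "content": content,
        "attributedTo": author,
    })

out = serialize_note("https://example.com/notes/1", "hi",
                     "https://example.com/users/alice")
assert json.loads(out)["type"] == "Note"
```

A real implementation would go further and stream the string out without building the intermediate dict, but the principle is the same: expansion only on input, never compaction on output.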
from some brief profiling i've done, this does show up as a hot code path in iceshrimp.net. one of my goals with eventually replacing dotNetRdf with my own impl mentioned above is, given i'm gonna have to mess with serialization anyhow, to remove the JSON-LD bits there and serialize directly to compacted form, which should help with large boosts and other bursts
-
@hongminhee from the point of view of someone who is "maintaining" a JSON-LD processing fedi software and has implemented their own JSON-LD processing library (which is, to my knowledge, the fastest in its programming language), JSON-LD is pure overhead. there is nothing it allows for that can't be done with
1. making fields which take multiple values explicit
2. always using namespaces and letting HTTP compression take care of minimizing the transfer
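As a sketch of those two rules (an illustrative document only, parsed with nothing but the stdlib JSON parser):

```python
import json

# Multi-valued fields are *always* arrays, and keys are full IRIs,
# so a plain JSON parser suffices -- no context processing needed.
doc = json.loads("""{
  "@type": ["https://www.w3.org/ns/activitystreams#Note"],
  "https://www.w3.org/ns/activitystreams#to": [
    "https://example.com/users/bob",
    "https://www.w3.org/ns/activitystreams#Public"
  ]
}""")

# No "single value vs array of values" ambiguity to normalize away:
for recipient in doc["https://www.w3.org/ns/activitystreams#to"]:
    print(recipient)
```

The verbose keys compress well over the wire, which is the point of rule 2.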
without JSON-LD, fedi software could use zero-ish-copy deserialization for the majority of their objects (when strings aren't escaped) through tools like serde_json and Cow<str>, or System.Text.Json.JsonDocument. JSON-LD processing effectively mandates a JSON node DOM (in the algorithms as standardized; you may be able to get rid of it with Clever Programming)
additionally, due to JSON-LD 1.1 features like @type: @json, you cannot even fetch contexts ahead of running JSON DOM transformations, meaning all JSON-LD code has to be async (in the languages that have the concept), potentially losing out on significant optimizations that can't be done in coroutines for various reasons (e.g. C# async methods can't have ref structs, Rust async functions usually require thread safety due to tokio's prevalence, even if they're run in a single-threaded runtime)
this is on top of context processing introducing a network dependency into the deserialization of data, wasting time and bandwidth in non-server cases (e.g. activitypub C2S). sure, you can cache individual contexts, but then the context can change underneath you, desynchronizing your cached context and, in the worst case, opening you up to security vulnerabilities
json-ld is not my favorite part of this protocol
-
@kopper @hongminhee As the person probably most responsible for making sure json-ld stayed in the spec (two reasons: because it was the only extensibility answer we had, and because we were trying hard to retain interoperability with the linked data people, which ultimately did not matter), I agree with you. I do ultimately regret not having a simpler solution than json-ld, especially because it greatly hurt our ability to sign messages, which has had a considerable effect on the ecosystem.
Mea culpa

I do think it's fixable. I'd be interested in joining a conversation about how to fix it.
-
I don't remember it that way.
We started the WG off with AS2 being based on JSON-LD, and I don't think we ever considered removing it.
I don't think it was a decision you made on your own. I'm not sure how you would, since you edited AP and not AS2 Core or Vocabulary.
-
I would be strongly opposed to any effort to remove JSON-LD from AS2. We use it for a lot of extensions. Every AP server uses the Security vocabulary for public keys.
-
@cwebber @kopper @hongminhee It would be a huge backwards-incompatible change for almost zero benefit. People would still make mistakes in their ActivityPub implementations (sorry, Minhee, but that's life on an open network). We'd need to adopt another mechanism for defining extensions, and guess what? People are going to make mistakes with that, too.
-
@cwebber @kopper @hongminhee The biggest downside to JSON-LD, it seems, is that it lets most developers treat AS2 as if it's plain old JSON. That was by design. People sometimes mess it up, but most JSON-LD parsers are pretty tolerant.
-
@evan @cwebber @kopper @hongminhee Couldn’t we agree to standardize on expanded json-ld? We would not need any json-ld processor, we would not need to fetch or cache any context. There would be no way to shadow properties.
-
@gugurumbe @cwebber @kopper @hongminhee AS2 requires compacted JSON-LD.
-
@evan @gugurumbe @cwebber @kopper @hongminhee only for terms defined in AS2, though?
if the activitystreams context is missing in an application/activity+json document, then you MUST assume/inject it. this means you can't redefine "actor" to mean "actor in a movie".
otherwise, you don't have to augment the context with anything else. "https://w3id.org/security#publicKey" is a valid property name. the proposal is to not augment the normative context where possible. no parsing context if there is no context
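A minimal sketch of that rule, assuming a hypothetical `normalize_context` helper: if an application/activity+json document arrives without an `@context`, inject the normative AS2 context before doing anything else:

```python
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"

def normalize_context(doc):
    # Per the rule above: a missing @context MUST be assumed to be
    # the normative AS2 context, so "actor" always means as:actor.
    if "@context" not in doc:
        doc = {"@context": AS2_CONTEXT, **doc}
    return doc

assert normalize_context({"type": "Note"})["@context"] == AS2_CONTEXT
# documents that already carry a context are left alone
assert normalize_context({"@context": AS2_CONTEXT})["@context"] == AS2_CONTEXT
```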
-
@trwnh i was replying to a post that wanted all expanded terms.
-
@evan @trwnh @cwebber @kopper @hongminhee I think it would be great to have everything expanded besides the required as2 context.
The results of the compaction algorithm would change if new things migrate into schema.org, so technically a document could become invalid or break without being modified, but this would be a lot better otherwise, I guess.
-
@gugurumbe @evan @cwebber @kopper @hongminhee yup, using full IRIs also has the advantage that ld-unaware processors only need to recognize 1 form instead of infinitely many.
the thing is, we have semantics imported from the content type (activity+json) which can also change. which is why i think versioning the context document is also important -- it freezes the semantics at the time of publishing, like pinning your dependencies.
without that, we might well have a simpler profile...