I've started my exploration of using @timbray's Quamina project for saving some compute time in the filters module of #GoActivityPub
Currently the GoAP storage backends iterate over resources (usually stored as raw JSON bytes), unmarshal them into GoActivityPub object structs, and *only* then apply the custom filtering logic on those objects. Since the majority of the objects generally fail the filtering logic, all that JSON decoding is wasted compute time and makes things slower.
Ideally quamina will allow me to check the raw JSON payloads directly against the filters, streamlining the execution and speeding things up.
-
Sadly, adding quamina didn't bring any meaningful change to the run time of the integration test suite I'm using for my federated server, probably because the amount of data the tests handle is way too low and the overhead of running the application and test suite is way too high.
It looks like I need to build some artificial benchmarks handling strictly the storage fetches.
-
Well, benchmarking doesn't help either: the measurements are so noisy that I can't draw any inferences from them.
I'm not sure how I can isolate the tests even more.
Perhaps the issue is that all the tests rely on the actual filesystem.
Maybe I need to find a memory backed filesystem mock...
-
I realized I already run on a memory backed filesystem, as the default testing.T.TempDir() returns a path in /tmp which is tmpfs for the machine where I'm testing.
Gaaah!!!
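A small sketch for checking where test temp directories end up, assuming a Unix-like system. `t.TempDir()` creates its directory under the root reported by `os.TempDir()`, which honors `$TMPDIR` and falls back to `/tmp`. `tempRootWith` is a hypothetical helper and `/var/tmp` is only an example of a location that is often disk backed.

```go
package main

import (
	"fmt"
	"os"
)

// tempRootWith sets TMPDIR and returns the temp root Go will use afterwards.
func tempRootWith(tmpdir string) (string, error) {
	if err := os.Setenv("TMPDIR", tmpdir); err != nil {
		return "", err
	}
	return os.TempDir(), nil
}

func main() {
	fmt.Println("default temp root:", os.TempDir())
	root, err := tempRootWith("/var/tmp")
	if err != nil {
		panic(err)
	}
	fmt.Println("temp root after override:", root)
}
```

Running the benchmarks once with the default root and once with `TMPDIR` pointing at a disk-backed directory would show how much of the measurement depends on the filesystem.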

-
I have finally made significant progress on this.
The final code looks like this:
https://github.com/go-ap/filters/blob/master/bytes_filter.go#L199
For a list of filters that #GoActivityPub uses, we generate two patterns for Quamina: one for a denormalized raw document, and one for a normalized raw document (documents are usually stored in a normalized form, where an Activity's object/actor properties are flattened to their IRIs, but we can't know which form a document is in unless we unmarshal it, which we want to avoid).
Another improvement since my complaint from 4 days ago: I had been regenerating the patterns and initializing quamina for every new document, instead of once per collection load.
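To illustrate the two-pattern idea, here is a sketch that builds one pattern for each document shape from a single filter. The `iriFilter` type and its `patterns` method are hypothetical, and the real patterns generated in `bytes_filter.go` may well look different. The output only follows Quamina's general pattern shape, where leaf values are arrays of allowed values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// iriFilter is a hypothetical filter: keep documents whose Property
// refers to one of the given IRIs.
type iriFilter struct {
	Property string
	IRIs     []string
}

// patterns returns two pattern documents for the same filter:
// one for the normalized form, where the property is a bare IRI string,
// and one for the denormalized form, where it is an embedded object with an "id".
func (f iriFilter) patterns() (normalized, denormalized string, err error) {
	n, err := json.Marshal(map[string]any{f.Property: f.IRIs})
	if err != nil {
		return "", "", err
	}
	d, err := json.Marshal(map[string]any{
		f.Property: map[string]any{"id": f.IRIs},
	})
	if err != nil {
		return "", "", err
	}
	return string(n), string(d), nil
}

func main() {
	f := iriFilter{
		Property: "actor",
		IRIs:     []string{"https://example.com/actors/alice"},
	}
	n, d, err := f.patterns()
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // {"actor":["https://example.com/actors/alice"]}
	fmt.Println(d) // {"actor":{"id":["https://example.com/actors/alice"]}}
}
```

Both patterns would be registered with the matcher once per collection load, so each raw document is only matched against them and nothing is rebuilt per document.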
-
@mariusor BTW I think Spencer Nelson just fixed Quamina's /v2 problem. Now I'll look at what you did (somehow I missed this post back in February)
-
@timbray no worries, there isn't much that I did on my side.
I have some "filter" types that I use for filtering ActivityPub collections and objects, and now they get transformed into quamina patterns and applied on the raw JSON that we store the ActivityPub objects as, before unmarshaling them into the structs that the filters themselves get applied against.
Hopefully it saves some computation, but the benchmarks I created weren't very conclusive... perhaps because there's so much other functionality in the pipeline...
However, the feature has made its way into production in the various servers that use this library...
-
@timbray erm... apologies for the verbosity, on a second look my first paragraph doesn't read very well at all. Hopefully it makes at least a little sense.
Anyway, the permalink to the function where I do all that has changed a little since the previous post: https://github.com/go-ap/filters/blob/master/bytes_filter.go#L201
-
@mariusor OK, feel free to yell at me if anything goes wrong.
-
@timbray so far there isn't a v2 in the Go proxy ecosystem. I don't know if it's because there's a lag before the version gets picked up and disseminated or because there's still something missing. I'll keep an eye on it and let you know tomorrow if it's still not there.
-
@mariusor Please do let me know, I thought we had that sorted
-
@timbray @mariusor I think this is up and running at https://pkg.go.dev/quamina.net/go/quamina/v2.
You won't see it on https://pkg.go.dev/quamina.net/go/quamina?tab=versions because a new major version constitutes a new module path. https://pkg.go.dev/quamina.net/go/quamina has a tiny callout at the top of the page that says "The highest tagged major version is v2."