I've been thinking about adding federation health monitoring to #Fedify—not as a separate data store or custom API, but by extending the existing #OpenTelemetry integration.

Tags: fedify, fedidev, activitypub, opentelemetry
  • fedify@hollo.social wrote (#1):

    I've been thinking about adding federation health monitoring to #Fedify—not as a separate data store or custom API, but by extending the existing #OpenTelemetry integration. The idea is to expose delivery outcomes, signature verification failures, and per-remote-host error rates as OpenTelemetry metrics alongside the spans Fedify already emits. If you already have a Prometheus or Grafana setup, you'd get federation observability basically for free. Circuit breaker behavior (temporarily skipping a remote server that's been consistently unreachable) could surface as OpenTelemetry events, keeping everything in the same trace context rather than scattered across separate logs.
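
To make the idea of per-remote-host delivery metrics a little more concrete, here is a minimal sketch. It is not Fedify's API: `recordDelivery`, `errorRate`, the `InMemoryCounter` class, and the attribute names `remote.host` and `outcome` are all hypothetical. The `CounterLike` interface only mirrors the `add(value, attributes)` shape of an OpenTelemetry counter; in a real setup one would presumably obtain the counter from `metrics.getMeter(...).createCounter(...)` in `@opentelemetry/api` and let the backend (Prometheus, Grafana) compute the rates.

```typescript
// Hypothetical sketch: none of these names come from Fedify.
type Attributes = { [key: string]: string };

// Mirrors the add(value, attributes) shape of an OpenTelemetry Counter.
interface CounterLike {
  add(value: number, attributes?: Attributes): void;
}

// Builds a stable key for one attribute set, independent of key order.
function seriesKey(attributes: Attributes): string {
  return Object.keys(attributes)
    .sort()
    .map((k) => `${k}=${attributes[k]}`)
    .join(",");
}

// In-memory stand-in for a real counter, so the sketch runs on its own.
class InMemoryCounter implements CounterLike {
  private readonly series: { [key: string]: number | undefined } = {};

  add(value: number, attributes: Attributes = {}): void {
    const key = seriesKey(attributes);
    this.series[key] = (this.series[key] ?? 0) + value;
  }

  get(attributes: Attributes): number {
    return this.series[seriesKey(attributes)] ?? 0;
  }
}

type DeliveryOutcome = "success" | "failure";

// Records one delivery attempt, labeled by remote host and outcome.
function recordDelivery(
  counter: CounterLike,
  host: string,
  outcome: DeliveryOutcome,
): void {
  counter.add(1, { "remote.host": host, outcome });
}

// Derives a per-host error rate from the two outcome series.
function errorRate(counter: InMemoryCounter, host: string): number {
  const ok = counter.get({ "remote.host": host, outcome: "success" });
  const failed = counter.get({ "remote.host": host, outcome: "failure" });
  const total = ok + failed;
  return total === 0 ? 0 : failed / total;
}
```

One design note on this sketch: keeping the outcome as an attribute on a single counter, rather than using separate counters per outcome, lets a dashboard derive error rates per host with one query. The cost is one time series per host and outcome pair, which may matter on servers that talk to many remote hosts.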

    Does this sound useful to you? I'm curious whether people building on Fedify—or running federated servers in general—would actually reach for this, and what kinds of things you'd most want to observe. Happy to hear any thoughts.

    #fedidev #ActivityPub

      julian@fietkau.social wrote (#2):

      @fedify As a Mastodon server admin and user, I look at the Sidekiq diagnostic interface whenever I notice something is off – for example when I'm not seeing a post which I know should exist. I don't monitor connection health proactively. Maybe people who are admins for larger servers do that.

      For Fedify, I might use something like what you describe on rare occasions, so I would see it as a nice-to-have feature, but a lower priority.

        thisismissem@activitypub.space wrote (#3):

        I think you may want circuit breakers to be independent of observability, but this level of detail for observability would still be good. Keep in mind that at scale you can usually only sample x% of events in OTel.
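
One way to read the suggestion above is that the circuit breaker should keep its own state and treat telemetry as an optional listener, so that skipping a host never depends on whether a particular span or event happened to be sampled. The following is a minimal sketch under that assumption. `HostCircuitBreaker`, its options, and the `onStateChange` hook are hypothetical names, not part of Fedify or OpenTelemetry.

```typescript
// Hypothetical sketch: a per-host circuit breaker with no telemetry dependency.
type BreakerState = "closed" | "open" | "half-open";

interface BreakerOptions {
  failureThreshold: number; // consecutive failures before the breaker opens
  cooldownMs: number; // how long a host is skipped once the breaker is open
  now?: () => number; // injectable clock, defaults to Date.now
  // Optional hook; an observability layer could emit an event from here.
  onStateChange?: (host: string, state: BreakerState) => void;
}

class HostCircuitBreaker {
  private readonly options: BreakerOptions;
  private readonly failures: { [host: string]: number | undefined } = {};
  private readonly openedAt: { [host: string]: number | undefined } = {};

  constructor(options: BreakerOptions) {
    this.options = options;
  }

  private now(): number {
    return (this.options.now ?? Date.now)();
  }

  state(host: string): BreakerState {
    const openedAt = this.openedAt[host];
    if (openedAt === undefined) return "closed";
    // After the cooldown, allow a trial delivery ("half-open").
    return this.now() - openedAt >= this.options.cooldownMs
      ? "half-open"
      : "open";
  }

  canDeliver(host: string): boolean {
    return this.state(host) !== "open";
  }

  recordSuccess(host: string): void {
    const wasTripped = this.openedAt[host] !== undefined;
    delete this.failures[host];
    delete this.openedAt[host];
    if (wasTripped) this.options.onStateChange?.(host, "closed");
  }

  recordFailure(host: string): void {
    const count = (this.failures[host] ?? 0) + 1;
    this.failures[host] = count;
    if (count >= this.options.failureThreshold) {
      // Open (or re-open after a failed trial) and restart the cooldown.
      this.openedAt[host] = this.now();
      this.options.onStateChange?.(host, "open");
    }
  }
}
```

In this sketch the delivery path only calls `canDeliver`, `recordSuccess`, and `recordFailure`. Whatever is attached to `onStateChange` can be sampled, dropped, or left out entirely without changing which hosts get skipped.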
