RIP burner accounts
-
RIP burner accounts
LLMs can unmask pseudonymous users at scale with surprising accuracy
Pseudonymity has never been perfect for preserving privacy. Soon it may be pointless.
Ars Technica (arstechnica.com)
@dangoodin yeah pair that with this https://adbleed.eu/ 🥶
-
@dangoodin The article mentions 68% recall and 90% precision. Another way to state these numbers is 32% false negatives and 10% false matches among the accounts it flags. If that second number carried over as a 10% false positive rate across the general population, applying it to, for example, 1 billion Facebook accounts would get you 100 million users wrongly flagged. That could be a problem!
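To make the arithmetic in this post easier to check, here is a minimal Python sketch. The function names are made up for illustration, and the 1,000 "true targets" figure in the usage lines is a hypothetical number, not something from the article. One caveat worth keeping in mind: precision is measured over the accounts that get flagged, so the 100 million figure only follows if the 10% is treated as a population-wide false positive rate, which is an assumption.

```python
def error_rates(precision, recall):
    """Return the complements of recall and precision as fractions.

    1 - recall    = share of real targets that are missed (false negative rate)
    1 - precision = share of flagged accounts that are wrong (false discovery rate)
    """
    return {
        "false_negative_rate": round(1 - recall, 4),
        "false_discovery_rate": round(1 - precision, 4),
    }


def false_matches(true_targets, precision, recall):
    """Expected number of wrong matches, given a count of real targets."""
    true_positives = recall * true_targets   # targets correctly found
    flagged = true_positives / precision     # everything the system flags
    return flagged - true_positives          # flagged but wrong


def wrongly_flagged(population, false_positive_rate):
    """Wrongly flagged users IF the rate applied to the whole population (assumption)."""
    return round(population * false_positive_rate)


if __name__ == "__main__":
    print(error_rates(precision=0.90, recall=0.68))
    # Hypothetical: 1,000 people the attacker is actually looking for.
    print(false_matches(true_targets=1000, precision=0.90, recall=0.68))
    # The post's scenario: 10% applied to 1 billion accounts.
    print(wrongly_flagged(1_000_000_000, 0.10))
```

The gap between the last two printed numbers is the point: how many people end up wrongly flagged depends heavily on whether the 10% applies to the flagged set or to everyone.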
-
@dangoodin lol so you basically have to run your text through an LLM to anonymize your style first
-
@dangoodin jus spel thigs rong an tipe difrent then nrml.
-
@Ostrobothnia@toot.community @dangoodin
Good recommendation. I had read something similar in privacy guides. I use the following setup: I’ve compartmentalized my browser setup for daily browsing — I use Brave. As a second browser I use Mullvad, and my tertiary browser is a Tails + Tor combo; in that combo there are no country-specific flags anyway.
Still, your recommendation is also good. Maybe there is a technical solution — a friend said that in theory it would be easy to find a solution.
-
@dangoodin
I wonder if something like this helps any
https://gibberifier.com/
-
@huitema @dangoodin I’d also like to point out that the paper has a member of Anthropic listed as one of the authors. Anthropic has previously played up the effectiveness of their products in papers, before backtracking and providing more realistic details after the news has made its rounds. I’m skeptical of this paper at best
-
@dangoodin so to disguise your true identity from an LLM, everything you send to it must be anonymised in another LLM before that? I’ll put down some notes here.
@rpsu technically it just has to be written in a completely different style from your own, but yeah, an LLM is the fastest way to do that
Like, old-school human criminalists, given enough examples of text, could accurately estimate whether they were written by the same person
And LLMs are literally designed to encode all the nuances of text as comparable mathematical vectors, so they can do that even more accurately, and at scale
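For anyone curious what "comparable mathematical vectors" can look like in practice, here is a rough Python sketch. It is not what the paper does and it does not use an LLM: it swaps in a much simpler classic stylometry trick, counting character trigrams and comparing the counts with cosine similarity. The function names and sample strings are invented for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and
    collapsing whitespace. The counts act as a crude style vector."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Invented sample posts; real attribution needs far more text per author.
    known = char_ngrams("tbh I reckon the whole thing is overblown, innit")
    burner = char_ngrams("tbh I reckon nobody read the paper, innit")
    other = char_ngrams("In my considered opinion, the methodology is sound.")
    print(cosine_similarity(known, burner))
    print(cosine_similarity(known, other))
```

A higher score only means the two samples share more character patterns; with texts this short it is a toy, and an embedding model would presumably pick up far subtler signals than trigram counts.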
-
I wonder whether the fact that different forums have different unspoken rules about the language they use might make cross-identification more difficult. There are forums where one has to be a bit…mean, almost, to survive trolls, and others where the moderators take care of the trolls. It changes the language a lot.
I imagine even someone who writes LinkedIn poetry wouldn’t carry that style over to another forum.
But there are still other identifiers, such as preferences for outliers, etc. (“I hate chocolate, Cara oranges, and The Godfather”).
Either way, while I’ve always figured it would one day be possible to advance in that direction with more automation (since humans can already kind of do it, too), it is very creepy and deeply unwanted.
-
Based on that one experiment they did where they identified 7% of users, I bet it would be used more as an initial attempt to identify someone, then see if those guesses include someone you’re looking for, etc.
It would lower the barrier for humans trying to unmask other humans, or to go for low-hanging fruit among the pseudonyms. Maybe we should have a regular Talk Like a Pirate Day to spike the data with some “argh matey”.
Edit: to make it clear, 7% of users is very, very little lol.
-
But yeah, I agree that they do like to oversell. It feels like these models are a bit like hammers in search of nails.