generative so-called "AI" is now being used to transcribe and translate Latin manuscripts.

dalias@hachyderm.io

@elilla "Signal-shaped noise" is an utterly brilliant characterization of what "gen AI" produces.

jmelesky@tinylad.social

@elilla@transmom.love I'm a data engineer. I've been saying for years if not decades: "Bad data is worse than no data". And, generally, when people hear that, they agree with me.

When I point out that genAI produces bad data, the turnaround to "oh, but, so useful", "early days", etc, is quick and disheartening.

lorxus@yiff.life

@elilla @jacel As someone who likes using (but not remotely relying on) automated transcription and notetaking that way... as far as I'm concerned, if anyone's *training* on that stuff, then they deserve exactly what they'll get. And if whatever big corporation is *putting that stuff in training sets*, then they need to quit shitting where they eat.

jacel@m.prettyshiny.org

@lorxus @elilla yeah, but.

You know whose training set all these things that are being indiscriminately vomited into the informational substrate of humanity /do/ end up in?

The people's c.c

rose_alibi@post.lurk.org

@elilla SIGNAL-SHAPED NOISE

gloriouscow@oldbytes.space

@elilla I experimented with using ChatGPT to do OCR on old scanned assembly code listings.

Columnar text has always been a huge challenge for OCR, and I had already tried Tesseract and given up on it.

At first I thought the results from ChatGPT were a revolutionary leap in the state of the art.

Then I looked closer - it had reworded the comments and headers. It even changed the code in places, swapping out entire mnemonics and parameters.

Like any good sloperator I tried to prompt my way around this, which was met by effusive apologies and assurances that it would, going forward, be sure to never do that again.

Which of course, it immediately did.

I suspect there's only the most tenuous thread of context between a "multi-modal" LLM's text and image capabilities - they're basically just two models duct-taped together.

I find this particularly disturbing, as someone simply doing an editorial pass for spelling or grammar errors may not notice that content which appears fundamentally correct was actually altered.

I would rather wade through a sea of Tesseract's obvious typos than have to take on the much higher cognitive burden of making sure grammatically correct sentences weren't invented wholesale.

moonhouse@social.tchncs.de

@elilla Earlier today I reflected on how AI-generated closed captions on local news here in Sweden are too exact. When a human does them, they remove filler words and repeated words. When those are suddenly left in, it takes more cognitive effort to read what people are saying.

707kat@mastodon.art

@elilla Wrong information is so not better than nothing.

weekend_editor@mathstodon.xyz

@elilla

Thing is, our myths and literature have been telling us this for millennia!

*All* the oracle stories involve an oracle saying something ambiguous, which the protagonist dangerously misinterprets. It will always be mushy, you'll always choose the wrong interpretation, and it will always be your fault. In that sense, saying "you have to check the AI result" is a threat, meaning the AI is free to make mistakes, but you will be held liable.

This is not positive information; it is almost *negative* information in that we still don't know the truth, but are tempted into dangerous fantasies of misinterpretation.

We've even turned the whole mess into a cautionary tale with the "ibis redibis" story of the oracle at Dodona, a caution heeded nowadays by almost nobody:

Ibis redibis nunquam per bella peribis - Wikipedia

(en.wikipedia.org)

rogerparkinson@mastodon.nz

@elilla I have direct experience of this. There's a handwritten letter from my grandfather dated around 1914 that turned up in a box of stuff. It's in cursive and younger people are less familiar with cursive so a family member put it through ChatGPT. The result was, as you'd expect, vaguely similar to what was written, with some alarming inaccuracies. And it missed the actual point he was writing about.
I'm old enough to read cursive and I've had some recent experience making out other old writing in much worse hand, so I could read it quite well. A couple of words were hard to decipher but not impossible.
So my conclusion was that the AI transcription was worse than useless.

chemicaleyeguy@mstdn.science

@elilla #AI is #clankers all the way down.

#Resist #broligarchs and #wankers and their #AIslop.

lorxus@yiff.life

@jacel @elilla As in, people will read those and uncritically accept it? Or something else?

jupiter@mastodon.gamedev.place

@elilla
Plane recte dicis! (You are quite right!)

Dan Conway (@magisterconway.bsky.social)