d_olex@mastodon.social

@pwinn @mistakenotmy ... and to put some serious nails into the coffin of the "LLMs are dumb and can't solve puzzles" take: here's the Hack The Box CTF profile of my Sonnet 4.5/4.6-based AI bot. It can solve Insane-difficulty tasks and performs on par with the top 0.5% of human players. Most of these tasks are recent, so there are no writeups or solutions for them in its training data. So yeah: trust no one and run your own experiments

d_olex@mastodon.social

@pwinn @mistakenotmy … and a completely random (presumably not hardcoded) question, just in case

d_olex@mastodon.social

@pwinn @mistakenotmy All models available under the free subscription are able to produce the correct answer for different random words
(haven’t tested with Opus 4.7 since I don’t have a Pro plan account at hand)

d_olex@mastodon.social

@pwinn @mistakenotmy an older model with a different random word

d_olex@mastodon.social

@mistakenotmy lol, fake news
