@pwinn @mistakenotmy ... and to put some serious nails into the coffin of the "LLMs are dumb and can't solve puzzles" take -- here's the Hack The Box CTF profile of my Sonnet 4.5/4.6-based AI bot: it can solve Insane-difficulty tasks and performs on par with the top 0.5% of human players. Most of these tasks are recent, so it has no writeups or solutions for them in its training data. So yeah: trust no one and conduct your own experiments






