<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[&quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension.]]></title><description><![CDATA[<p>"A recent 2026 empirical study titled "Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering" (published on arXiv/ResearchGate) explicitly tested LLMs on codebase comprehension. The researchers concluded that high performance often <strong>"results from verbatim reproduction of Stack Overflow answers rather than genuine reasoning."</strong> " <a href="https://www.researchgate.net/publication/403262523_Beyond_Code_Snippets_Benchmarking_LLMs_on_Repository-Level_Question_Answering" rel="nofollow noopener"><span>https://www.</span><span>researchgate.net/publication/4</span><span>03262523_Beyond_Code_Snippets_Benchmarking_LLMs_on_Repository-Level_Question_Answering</span></a></p>]]></description><link>https://board.circlewithadot.net/topic/e568d96b-66c0-48b6-8967-e7ec04a72e1f/a-recent-2026-empirical-study-titled-beyond-code-snippets-benchmarking-llms-on-repository-level-question-answering-published-on-arxiv-researchgate-explicitly-tested-llms-on-codebase-comprehension.</link><generator>RSS for Node</generator><lastBuildDate>Fri, 15 May 2026 00:33:23 GMT</lastBuildDate><atom:link href="https://board.circlewithadot.net/topic/e568d96b-66c0-48b6-8967-e7ec04a72e1f.rss" rel="self" type="application/rss+xml"/><pubDate>Tue, 21 Apr 2026 13:05:53 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly 
tested LLMs on codebase comprehension. on Wed, 22 Apr 2026 10:08:08 GMT]]></title><description><![CDATA[<p><span><a href="/user/slyecho%40mdon.ee" rel="nofollow noopener">@<span>slyecho</span></a></span> feel free to evaluate yourself using whatever tools you prefer</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116447804697185176</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116447804697185176</guid><dc:creator><![CDATA[codinghorror@infosec.exchange]]></dc:creator><pubDate>Wed, 22 Apr 2026 10:08:08 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Wed, 22 Apr 2026 08:39:06 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> 0 surprise there.</p>]]></description><link>https://board.circlewithadot.net/post/https://mastodon.sdf.org/users/doragasu/statuses/116447454574817162</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mastodon.sdf.org/users/doragasu/statuses/116447454574817162</guid><dc:creator><![CDATA[doragasu@mastodon.sdf.org]]></dc:creator><pubDate>Wed, 22 Apr 2026 08:39:06 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Wed, 22 Apr 2026 02:49:14 GMT]]></title><description><![CDATA[<p><span><a href="/user/dalias%40hachyderm.io" rel="nofollow noopener">@<span>dalias</span></a></span> <span><a href="/user/brianowen%40fosstodon.org" rel="nofollow noopener">@<span>brianowen</span></a></span> <span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> The number of billion dollar valuation security industry products that amount to a shiny web UI over a few FOSS tools ...</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/JessTheUnstill/statuses/116446078868655823</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/JessTheUnstill/statuses/116446078868655823</guid><dc:creator><![CDATA[jesstheunstill@infosec.exchange]]></dc:creator><pubDate>Wed, 22 Apr 2026 02:49:14 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Wed, 22 Apr 2026 01:58:35 GMT]]></title><description><![CDATA[<p><span><a href="/user/rjohnston%40techhub.social" rel="nofollow noopener">@<span>rjohnston</span></a></span> I've never had that happen to me, personally, but I have pretty good resting bitch face to be fair.</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116445879716276003</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116445879716276003</guid><dc:creator><![CDATA[codinghorror@infosec.exchange]]></dc:creator><pubDate>Wed, 22 Apr 2026 01:58:35 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 22:44:29 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange" rel="nofollow noreferrer noopener">@<span>codinghorror</span></a></span> <span><a href="/user/bms48%40mastodon.social" rel="nofollow noreferrer noopener">@<span>bms48</span></a></span> change incentives to be for long term not quarterly. Give people doing work more autonomy to set their own standards. 
Possibly UBI will enable this shift in perspective from eking out a paycheck to professional/citizen/human responsibility/opportunity.</p>]]></description><link>https://board.circlewithadot.net/post/https://social.lane-jayasinha.com/users/chris/statuses/01KPS3ETBAMD2JWNC86G1E1SYJ</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://social.lane-jayasinha.com/users/chris/statuses/01KPS3ETBAMD2JWNC86G1E1SYJ</guid><dc:creator><![CDATA[chris@social.lane-jayasinha.com]]></dc:creator><pubDate>Tue, 21 Apr 2026 22:44:29 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 21:47:54 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> I have yet to have an LLM tell me to RTFM and then end the conversation.</p>]]></description><link>https://board.circlewithadot.net/post/https://techhub.social/ap/users/115726408341208805/statuses/116444893956395105</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://techhub.social/ap/users/115726408341208805/statuses/116444893956395105</guid><dc:creator><![CDATA[rjohnston@techhub.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 21:47:54 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 21:09:44 GMT]]></title><description><![CDATA[<p><span><a href="/user/joe%40f.duriansoftware.com">@<span>joe</span></a></span> Agreed! I’m genuinely always in favor of repeating research like this given how fast the models are moving. 
Even the non-reasoning models are dramatically better today so I’d love to run an experiment on them too, it’s just concerning to me when 1-2 year old outdated material becomes considered a source of truth.</p>]]></description><link>https://board.circlewithadot.net/post/https://macaw.social/users/mergesort/statuses/116444743911361644</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://macaw.social/users/mergesort/statuses/116444743911361644</guid><dc:creator><![CDATA[mergesort@macaw.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 21:09:44 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 20:56:36 GMT]]></title><description><![CDATA[<p><span><a href="/user/bms48%40mastodon.social" rel="nofollow noopener">@<span>bms48</span></a></span> turns out far too many humans are pretty goddamned lazy and will ship the prototype. How do we change this?</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116444692263468512</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116444692263468512</guid><dc:creator><![CDATA[codinghorror@infosec.exchange]]></dc:creator><pubDate>Tue, 21 Apr 2026 20:56:36 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Tue, 21 Apr 2026 18:28:28 GMT]]></title><description><![CDATA[<p><span><a href="/user/mergesort%40macaw.social">@<span>mergesort</span></a></span> sounds like a good opportunity for a one-up paper to try it again with the newer models. would be interesting to see what difference the "reasoning" really makes</p>]]></description><link>https://board.circlewithadot.net/post/https://f.duriansoftware.com/users/joe/statuses/116444109779942454</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://f.duriansoftware.com/users/joe/statuses/116444109779942454</guid><dc:creator><![CDATA[joe@f.duriansoftware.com]]></dc:creator><pubDate>Tue, 21 Apr 2026 18:28:28 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 18:15:03 GMT]]></title><description><![CDATA[<p><span><a href="/user/joe%40f.duriansoftware.com">@<span>joe</span></a></span> More of an FYI for this repost in case you’re curious. (It’s mentioned in the abstract.) 
<a href="https://macaw.social/@mergesort/116444049426350678" rel="nofollow noopener"><span>https://</span><span>macaw.social/@mergesort/116444</span><span>049426350678</span></a></p>]]></description><link>https://board.circlewithadot.net/post/https://macaw.social/users/mergesort/statuses/116444056999514889</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://macaw.social/users/mergesort/statuses/116444056999514889</guid><dc:creator><![CDATA[mergesort@macaw.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 18:15:03 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 17:40:45 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> theft en masse as a business model</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/OvertonDoors/statuses/116443922146805758</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/OvertonDoors/statuses/116443922146805758</guid><dc:creator><![CDATA[overtondoors@infosec.exchange]]></dc:creator><pubDate>Tue, 21 Apr 2026 17:40:45 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 16:48:45 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange" rel="nofollow noopener">@<span>codinghorror@infosec.exchange</span></a></span></p><p></p><div class="card col-md-9 col-lg-6 position-relative link-preview p-0">
<a href="https://tenor.com/view/shocker-shocked-futurama-im-shocked-gif-5296191" title="I'M Shocked! - Futurama GIF - Shocker Shocked Futurama - Discover &amp; Share GIFs"><img src="https://media1.tenor.com/m/ZyFZFTpTLfgAAAAC/shocker-shocked.gif" class="card-img-top not-responsive" style="max-height:15rem" alt="Link Preview Image" /></a>
<div class="card-body"><h5 class="card-title"><a href="https://tenor.com/view/shocker-shocked-futurama-im-shocked-gif-5296191">I'M Shocked! - Futurama GIF - Shocker Shocked Futurama - Discover &amp; Share GIFs</a></h5><p class="card-text line-clamp-3">The perfect Shocker Shocked Futurama Animated GIF for your conversation. Discover and Share the best GIFs on Tenor.</p></div>
<a href="https://tenor.com/view/shocker-shocked-futurama-im-shocked-gif-5296191" class="card-footer text-body-secondary small d-flex gap-2 align-items-center lh-2"><img src="https://tenor.com/assets/img/favicon/favicon-16x16.png" alt="favicon" class="not-responsive overflow-hiddden" style="max-width:21px;max-height:21px" /><p class="d-inline-block text-truncate mb-0">Tenor <span class="text-secondary">(tenor.com)</span></p></a>
</div><p></p>]]></description><link>https://board.circlewithadot.net/post/https://app.wafrn.net/fediverse/post/5c53ab46-850d-4deb-bb07-271b5f7c93b0</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://app.wafrn.net/fediverse/post/5c53ab46-850d-4deb-bb07-271b5f7c93b0</guid><dc:creator><![CDATA[dogiedog64@app.wafrn.net]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:48:45 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 16:46:32 GMT]]></title><description><![CDATA[<p><span><a href="/user/dalias%40hachyderm.io">@<span>dalias</span></a></span> <span><a href="/user/brianowen%40fosstodon.org">@<span>brianowen</span></a></span> <span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> (and i want to be very clear this is not my code, it's really codex/openai, i barely glanced at some of it, let alone write anything other than prompts, i'm a dev with years of experience, not the best in the world for sure, but i know how to code, here i've been doing a PM's job, and not a very competent one, the tool had to cater to my whims and half ideas, and did quite well at that)</p>]]></description><link>https://board.circlewithadot.net/post/https://mas.to/users/tshirtman/statuses/116443708947693422</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mas.to/users/tshirtman/statuses/116443708947693422</guid><dc:creator><![CDATA[tshirtman@mas.to]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:46:32 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Tue, 21 Apr 2026 16:41:57 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> <br />So, they're all like the AI on LinkedIn that will do a "smart" search for me that takes 100 times longer to give me the exact same list as the normal search option does? Because it probably just runs the search and fails a thousand process calls before just giving me the search?</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/RnDanger/statuses/116443690894044622</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/RnDanger/statuses/116443690894044622</guid><dc:creator><![CDATA[rndanger@infosec.exchange]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:41:57 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Tue, 21 Apr 2026 16:36:14 GMT]]></title><description><![CDATA[<p><span><a href="/user/dalias%40hachyderm.io">@<span>dalias</span></a></span> <span><a href="/user/brianowen%40fosstodon.org">@<span>brianowen</span></a></span> <span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> there are certainly many doing just that, but i'm probably not alone in doing something completely different, my vibe coded app for my own personal use is an openXR remote display for my quest3, in rust, with a desktop (linux/macos) agent capturing, encoding and streaming to it, using rust on linux, swift on macos, and python to wrap things.</p><p>it was done in weeks what would have taken me months/years, assuming i would have found the time/motivation to even try.</p>]]></description><link>https://board.circlewithadot.net/post/https://mas.to/users/tshirtman/statuses/116443668425955497</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mas.to/users/tshirtman/statuses/116443668425955497</guid><dc:creator><![CDATA[tshirtman@mas.to]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:36:14 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 16:12:37 GMT]]></title><description><![CDATA[<p><span><a href="/user/brianowen%40fosstodon.org">@<span>brianowen</span></a></span> <span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> This is exactly what it is. This is exactly what the web dev industry has been for decades. 
Millions of LoC of garbage to justify prices for what should be an easy in-house job using an existing CMS with minimal or no code and should be as easy as using Excel.</p>]]></description><link>https://board.circlewithadot.net/post/https://hachyderm.io/users/dalias/statuses/116443575607421807</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://hachyderm.io/users/dalias/statuses/116443575607421807</guid><dc:creator><![CDATA[dalias@hachyderm.io]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:12:37 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 16:07:07 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> I made a code  going against "the recommended way of doing it" in several ways for certain task. It was a very conscious decision, these violations are needed to make something new I want. Recently I handed the code to my student and asked her to read and understand it. She handed me what she said was a written summary of her notes. 
It was a summary of the average recommended solutions on stack overflow, absolutely no mention of my very specific anti standard choices or code.</p>]]></description><link>https://board.circlewithadot.net/post/https://mastodon.gal/users/elrohir/statuses/116443553933190076</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mastodon.gal/users/elrohir/statuses/116443553933190076</guid><dc:creator><![CDATA[elrohir@mastodon.gal]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:07:07 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 16:02:19 GMT]]></title><description><![CDATA[<p><span><a href="/user/henryk%40chaos.social" rel="nofollow noopener">@<span>henryk</span></a></span> <span><a href="/user/codinghorror%40infosec.exchange" rel="nofollow noopener">@<span>codinghorror</span></a></span> I used to think the "just copying from Stack Overflow" jokes developers made were, you know, jokes. Then I watched all my coworkers embrace LLMs, and I was forced to conclude that they weren't joking.</p><p>So, depressing agreement.</p>]]></description><link>https://board.circlewithadot.net/post/https://cyberpunk.lol/users/Azuaron/statuses/116443535059699643</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://cyberpunk.lol/users/Azuaron/statuses/116443535059699643</guid><dc:creator><![CDATA[azuaron@cyberpunk.lol]]></dc:creator><pubDate>Tue, 21 Apr 2026 16:02:19 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Tue, 21 Apr 2026 13:53:52 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> I'm genuinely curious how much of the hype is an army of mid level engineers all building the same five web apps as the other guy.</p>]]></description><link>https://board.circlewithadot.net/post/https://fosstodon.org/users/brianowen/statuses/116443029976265871</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://fosstodon.org/users/brianowen/statuses/116443029976265871</guid><dc:creator><![CDATA[brianowen@fosstodon.org]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:53:52 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 13:16:40 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> I gots no problem with da one-shotting da boilerplate! But the actual useful application is a far cry from what Jensen, who pretends to be everyone's friend, wants you to do the "tokenmaxxing" for.</p>]]></description><link>https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442883709546749</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442883709546749</guid><dc:creator><![CDATA[bms48@mastodon.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:16:40 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. 
on Tue, 21 Apr 2026 13:15:37 GMT]]></title><description><![CDATA[<p><span><a href="/user/bms48%40mastodon.social" rel="nofollow noopener">@<span>bms48</span></a></span> it's better for blank page ideation, and mashups / galactic brain fuzz testing in my opinion, and should always be double-checked by a human</p>]]></description><link>https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116442879592117289</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://infosec.exchange/users/codinghorror/statuses/116442879592117289</guid><dc:creator><![CDATA[codinghorror@infosec.exchange]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:15:37 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 13:12:45 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> The really sad part of this was that the LLMs were caught directly in the act of fabricating erroneous output. I had a Git sparse-checkout of xnu directly in front of me. 
The technical matter related to TCP interactive behaviour.</p>]]></description><link>https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442868311900184</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442868311900184</guid><dc:creator><![CDATA[bms48@mastodon.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:12:45 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 13:09:56 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> I have witnessed cloud based LLMs giving me authoritative-sound answers, when asked questions about macOS xnu kernel code, which came from Linux constructions, and were completely irrelevant. Generative AI is often useless and not fit for its advertised purpose. The limitations are structural and well known to ML researchers, and Noam Chomsky called it already years ago. 
I wish they'd just jog on and stop bothering real people with what is essentially a shell game.</p>]]></description><link>https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442857251320680</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://mastodon.social/ap/users/116175731239673526/statuses/116442857251320680</guid><dc:creator><![CDATA[bms48@mastodon.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:09:56 GMT</pubDate></item><item><title><![CDATA[Reply to &quot;A recent 2026 empirical study titled &quot;Beyond Code Snippets: Benchmarking LLMs on Repository-Level Question Answering&quot; (published on arXiv&#x2F;ResearchGate) explicitly tested LLMs on codebase comprehension. on Tue, 21 Apr 2026 13:09:54 GMT]]></title><description><![CDATA[<p><span><a href="/user/codinghorror%40infosec.exchange">@<span>codinghorror</span></a></span> So ... roughly on par with normal developers then? <img src="https://board.circlewithadot.net/assets/plugins/nodebb-plugin-emoji/emoji/android/1f609.png?v=28325c671da" class="not-responsive emoji emoji-android emoji--wink" style="height:23px;width:auto;vertical-align:middle" title=";-)" alt="😉" /></p>]]></description><link>https://board.circlewithadot.net/post/https://chaos.social/users/henryk/statuses/116442857085014670</link><guid isPermaLink="true">https://board.circlewithadot.net/post/https://chaos.social/users/henryk/statuses/116442857085014670</guid><dc:creator><![CDATA[henryk@chaos.social]]></dc:creator><pubDate>Tue, 21 Apr 2026 13:09:54 GMT</pubDate></item></channel></rss>