@niekvdpas To give an even vaguer answer: a fairly limited LLM runs reasonably well on my four-year-old laptop, and a much more capable model can run on a beefier machine such as a Mac Studio. This does consume energy, but not very much, especially since typical LLM use doesn't involve sustained full-load operation.
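To make "not very much" a bit more concrete, here's a rough back-of-envelope sketch. The wattage, per-query time, and query count are assumptions for illustration, not measurements of any particular machine or model:

```python
# Back-of-envelope energy estimate for local LLM inference.
# All numbers below are assumptions, not measurements.
LAPTOP_POWER_W = 60        # assumed laptop draw under inference load
SECONDS_PER_QUERY = 30     # assumed generation time for one response
QUERIES_PER_DAY = 50       # assumed heavy personal usage

joules_per_query = LAPTOP_POWER_W * SECONDS_PER_QUERY
kwh_per_day = joules_per_query * QUERIES_PER_DAY / 3.6e6  # J -> kWh

print(f"{joules_per_query} J per query")   # 1800 J
print(f"{kwh_per_day:.3f} kWh per day")    # 0.025 kWh
```

Under those assumptions, a day of heavy use lands around 0.025 kWh, roughly a couple of hours of a 10 W LED bulb.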
Compared to these examples, large-scale deployments, with optimized hardware and software and better utilization, will be more efficient still.