GPU power waste, LLM nonsense 

one other thing I've been meaning to research is more on how GPU architecture… actually works. like, really works. I understand CPU architecture to a very very simplified degree, but something I'd particularly like to understand is how exactly power profiling works on GPUs, how it could work on GPUs, and whether that's reasonable or not

like, CPUs can effectively treat parts of the pipeline that aren't used as nonexistent, at least as far as power is concerned. but GPUs have these pipelines multiplied hundreds to thousands of times… do all of these pipelines have to draw power the whole time the GPU is doing work, or does this not affect power consumption at all?

like, CPU power consumption is incredibly variable due to all the conditional logic in CPU execution. even with stuff like branch prediction, if you haven't done any FPU instructions in a while, the FPU is off; it's not drawing power. but the GPU explicitly does not like conditional logic so that it can have a lot more reliability in its pipelines
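for what it's worth, that CPU-side variability is something you can at least watch from userspace on Linux through the RAPL energy counters. a rough sketch — assuming the counters show up under /sys/class/powercap/intel-rapl:0 and you're allowed to read them (recent kernels often restrict that to root), with the load generator being a stand-in rather than a serious benchmark:

```python
# rough sketch: watching CPU package power move with the workload via the
# Linux RAPL counters. assumes /sys/class/powercap/intel-rapl:0 exists and
# is readable (recent kernels often restrict this to root).
import threading
import time

RAPL = "/sys/class/powercap/intel-rapl:0/energy_uj"  # cumulative microjoules

def read_uj():
    with open(RAPL) as f:
        return int(f.read())

def package_watts(seconds=1.0):
    """Average CPU package power over a short window, in watts."""
    before = read_uj()
    time.sleep(seconds)
    after = read_uj()
    return (after - before) / 1e6 / seconds   # uJ -> J, then J per second

def burn_fpu(stop):
    # not a serious benchmark, just enough float math to wake things up
    x = 1.0001
    while not stop.is_set():
        x = (x * 1.0000001) % 10.0

if __name__ == "__main__":
    print(f"idle-ish: {package_watts():.1f} W")
    stop = threading.Event()
    t = threading.Thread(target=burn_fpu, args=(stop,))
    t.start()
    print(f"fpu-busy: {package_watts():.1f} W")
    stop.set()
    t.join()
```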

like, basically, there are some computations which are power-constant, and some which are not, and it's worth investigating which ones those are for GPUs specifically, given how heavily they're being used

like, for example, on the CPU, things mostly boil down to doing your work more quickly: the quicker your work can get done, the sooner the CPU can idle and stop drawing power. but GPUs just do all their work in batches; does the size of the batch determine the power consumption, or the number of batches? is it also relative to time (and thus, number of operations) like CPU work is, or is the entire GPU just on until all the work is done?
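the board-level side of that is at least measurable from outside: NVIDIA exposes the card's current draw through NVML (the same counter nvidia-smi shows), so a rough sketch of poking at the "is it power × time?" question could look something like this — assuming an NVIDIA card, the pynvml package, and workload() standing in for whatever kernels you actually launch:

```python
# rough sketch of board-level GPU power profiling via NVML (the library
# nvidia-smi uses). assumes an NVIDIA card and the pynvml package;
# workload() is a hypothetical stand-in for whatever you actually run.
import threading
import time

import pynvml

def measure_energy(workload, interval_s=0.05):
    """Sample board power while workload() runs, integrate to joules."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    samples = []          # (timestamp, watts)
    stop = threading.Event()

    def sampler():
        while not stop.is_set():
            mw = pynvml.nvmlDeviceGetPowerUsage(handle)  # milliwatts
            samples.append((time.monotonic(), mw / 1000.0))
            time.sleep(interval_s)

    t = threading.Thread(target=sampler)
    t.start()
    workload()            # launch kernels / run the model / etc.
    stop.set()
    t.join()
    pynvml.nvmlShutdown()

    # trapezoidal integration of power over time -> energy in joules
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        energy += (p0 + p1) / 2.0 * (t1 - t0)
    return energy
```

this only gives you fairly coarse board-level power, so it can tell you how total energy scales with batch size or runtime, but not which units inside the chip are actually gated off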

these are questions I could not possibly hope to answer right now, and I'm honestly not even sure if I'll be able to answer them without directly asking someone who's worked on making GPUs, and even then, they'll likely be under NDA


re: GPU power waste, LLM nonsense 

@clarfonthey as a general rule when using GPUs for general compute (instead of just graphics) you can do parallel computation with more energy efficiency than a CPU. It is just that the minimum power draw to reach that efficiency is significantly higher than a CPU's, and the GPU can only be used efficiently for certain types of compute (notably nothing involving significant branching).
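to put toy numbers on that (made up purely for illustration, not measurements of any real chip):

```python
# toy numbers, purely for illustration — not measurements of any real chip
cpu_power_w, cpu_flops = 65, 0.5e12     # ~65 W at ~0.5 TFLOP/s
gpu_power_w, gpu_flops = 300, 30e12     # ~300 W at ~30 TFLOP/s

# energy per operation is just power / throughput
cpu_pj_per_flop = cpu_power_w / cpu_flops * 1e12   # 130 pJ/FLOP
gpu_pj_per_flop = gpu_power_w / gpu_flops * 1e12   #  10 pJ/FLOP

# the GPU is ~13x cheaper per operation here, but only if you can keep
# it fed — once utilisation drops, that ~300 W floor dominates and the
# CPU wins on total energy for the same work.
print(f"CPU: {cpu_pj_per_flop:.0f} pJ/FLOP, GPU: {gpu_pj_per_flop:.0f} pJ/FLOP")
```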

I've always thought the more pertinent point is that power is being wasted on entertaining this nonsense at all.

re: GPU power waste, LLM nonsense 

@thufie it is more pertinent, but if you want to actually consider interpolation methods, it's worth talking about their power consumption profiles beyond just LLMs and into ML more broadly
