We have some actual numbers on tokens! In their commentary on the latest earnings report, Microsoft say:

We processed over 100 trillion tokens this quarter, up 5X year-over-year – including a record 50 trillion tokens last month alone

So we’ve got a baseline, and can estimate a growth rate. Assuming exponential growth within the quarter, 50 trillion of the 100 trillion tokens landing in the final month implies a monthly split of roughly 19/31/50, i.e. about 62% month-on-month growth. If that keeps going, quarterly totals run 100 T → 424 T → 1.8 P → 7.6 P – a much faster rate than the 5x yoy growth MS quote.
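
For anyone who wants to check the arithmetic, here’s a minimal sketch of that derivation in Python. The only inputs are the 100 T / 50 T figures from the quote; the smooth exponential growth within the quarter is my assumption:

```python
from math import sqrt

# Months in the quarter scale as 1 : g : g^2. The last month being half the
# quarter's total means g^2 = 0.5 * (1 + g + g^2), i.e. g^2 - g - 1 = 0,
# so g is the golden ratio.
g = (1 + sqrt(5)) / 2                  # ~1.618, i.e. ~62% month-on-month

split = [x / (1 + g + g**2) for x in (1, g, g**2)]
print([round(s, 2) for s in split])    # -> [0.19, 0.31, 0.5]

# Extrapolate quarterly totals (trillions of tokens) at the same rate.
quarterly_factor = g**3                # ~4.24x per quarter
quarters = [round(100 * quarterly_factor**n) for n in range(4)]
print(quarters)                        # -> [100, 424, 1794, 7601]
```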

If this growth continues for a year, we get to 9.9 quadrillion tokens over the next 12 months and about 1.10 TWh of energy 1. Again, this is ignoring everything except pure token compute: you could add in factors for PUE, networking, storage and so on. We’re also only looking at inference and ignoring training, so it’s about as rough as an estimate can get.
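
Here’s a sketch of that sum, again assuming the same smooth growth and the 0.4 J/token figure from footnote 1 – pure token compute only:

```python
# Tokens over the next 12 months (this quarter plus three more at ~4.24x per
# quarter), then energy at 0.4 J/token. No PUE, networking, storage or training.
g = (1 + 5 ** 0.5) / 2                       # ~62% month-on-month growth
quarterly_factor = g ** 3                    # ~4.24x per quarter

tokens_trillions = sum(100 * quarterly_factor**n for n in range(4))
print(f"{tokens_trillions / 1000:.1f} quadrillion tokens")   # -> 9.9

joules = tokens_trillions * 1e12 * 0.4       # 0.4 J per token
print(f"{joules / 3.6e15:.2f} TWh")          # -> 1.10  (1 TWh = 3.6e15 J)
```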

This is a lot of power, but it’s spread across Microsoft’s worldwide data centres. They don’t give a figure for data centre energy use, but they do provide figures for total energy use across the organisation - 23.6 TWh in FY2023, with most of that likely to be Azure according to analysts. It’s also growing rapidly: 11 TWh in 2020, then 14, 18 and 24. Since that growth started before the LLM take-off, it feels safe to assume that most of it is due to general growth in cloud, not just AI.
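
As a rough sanity check, here’s the same kind of back-of-envelope comparison, using the reported totals above. The ~5% line compares against the inference estimate from earlier, so it inherits all of that estimate’s assumptions:

```python
# Microsoft total energy use by fiscal year (TWh), as quoted above.
energy_twh = {2020: 11, 2021: 14, 2022: 18, 2023: 23.6}

years = sorted(energy_twh)
growth = (energy_twh[years[-1]] / energy_twh[years[0]]) ** (1 / (len(years) - 1)) - 1
print(f"~{growth:.0%} a year growth in total energy use")        # -> ~29%

# The ~1.10 TWh inference estimate against the FY2023 total.
print(f"inference estimate vs FY2023 total: {1.10 / 23.6:.0%}")  # -> 5%
```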

So we keep the same conclusion as last time - everything depends on whether that growth rate holds or increases, and on how it combines with gains in efficiency - and we don’t have the public data to work that out.


  1. In my previous post I worked with 0.4 J/token. That still seems a good estimate for current use. NVIDIA in benchmark testing get numbers that equate to 0.2 J/token, given an H100 power draw of 700 W. That is likely a best case on that hardware, and not all of the fleet will be H100s yet. But it’ll go lower: IBM Research get another order of magnitude lower using custom ASICs. ↩︎
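
For completeness, the conversion behind those per-token figures is just power divided by throughput. The throughput below is back-derived from the quoted 0.2 J/token and 700 W, not a published NVIDIA number:

```python
# Energy per token = average power / tokens per second.
power_w = 700.0                     # H100 board power quoted above
joules_per_token = 0.2              # figure quoted in the footnote
print(f"{power_w / joules_per_token:.0f} tokens/s implied")   # -> 3500
```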