He seems to be writing about floating point operations. Considering instruction ...

KaeseEs · on Feb 20, 2012

Think of how complex merely fetching and decoding an instruction is on a current x86 machine.

Now consider how the pipeline, multiple issue, hazard detection and mitigation, and out-of-order execution are implemented.

Ten million is probably an underestimate by a fair margin.

wmf · on Feb 20, 2012

According to Wikipedia, the Alpha 21264 had 6M transistors not counting caches, and the Pentium Pro had 5.5M total. 10M doesn't sound far off for today's processors. Since these were some of the earliest out-of-order superscalar processors, they may give an idea of the minimum cost of those features.

KaeseEs · on Feb 20, 2012

A quad-core i7 has 1.16 billion transistors, for reference. The question posed was whether each instruction might touch 10 million, or 0.86%, of these.
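
For anyone who wants to check that percentage, here is a minimal sketch of the arithmetic. The constant and function names are made up for illustration, and both transistor counts are just the rough figures quoted in this thread.

```python
# Rough figures quoted in the thread, not measured values.
TOTAL_TRANSISTORS = 1_160_000_000     # quad-core i7
TOUCHED_PER_INSTRUCTION = 10_000_000  # the estimate under discussion


def percent_of_chip(touched: int, total: int) -> float:
    """Return touched/total expressed as a percentage."""
    return 100.0 * touched / total


share = percent_of_chip(TOUCHED_PER_INSTRUCTION, TOTAL_TRANSISTORS)
print(f"{share:.2f}% of the chip")  # prints "0.86% of the chip"
```

The same function applied to the 5.5M Pentium Pro figure mentioned earlier gives a feel for how small an early out-of-order core is next to a current die.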