Inside Intel’s Core architecture
Intel’s Core architecture now underlies mobile, desktop and server chips, and is a major departure from the Pentium 4’s NetBurst design.
It is hard to overstate the importance of the Core micro-architecture to Intel, and thus to the rest of the industry. The product of a major debate within Intel (what Pat Gelsinger, General Manager of the Digital Enterprise Group, called ‘speed freaks versus brainiacs’), it marks the victory of those who felt that extra performance was best achieved not by constantly upping the processor’s clock speed, but by going for ever more parallel systems with much finer control of performance versus power consumption.
With the Core 2 Duo and Core 2 Extreme, the Core architecture is being applied to mobile (Merom) and desktop (Conroe) processors, which now join last month’s Xeon 5100 (Woodcrest) to put basically the same chip in everything from notebooks to servers.
All modern processors work by reading a stream of instructions that tell them where data is in memory and what to do with it. Originally, processors took in one instruction from memory and took as long as it needed to fully execute it before starting on the next. Each clock tick — of which there are a million a second with a 1MHz processor, a billion with 1GHz — moved the instruction one stage further through the processor. Some instructions could take four or more ticks.
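To put numbers on those clock ticks, here is a minimal sketch in Python of the arithmetic for an old-style processor that finishes one instruction before starting the next. The function names are illustrative, and the four-tick instruction is simply the example figure used above.

```python
def instruction_time_seconds(ticks: int, clock_hz: float) -> float:
    """Time taken by one instruction that needs `ticks` clock ticks."""
    return ticks / clock_hz


def instructions_per_second(ticks: int, clock_hz: float) -> float:
    """Throughput when every instruction takes `ticks` ticks, one at a time."""
    return clock_hz / ticks


MHZ = 1_000_000       # a million ticks a second
GHZ = 1_000_000_000   # a billion ticks a second

# A four-tick instruction on a 1MHz part versus a 1GHz part.
slow = instruction_time_seconds(4, 1 * MHZ)
fast = instruction_time_seconds(4, 1 * GHZ)
```

On these assumptions the same four-tick instruction takes about four microseconds at 1MHz and about four nanoseconds at 1GHz, which is why raising the clock speed was for so long the obvious route to more performance.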
Now, processors move blocks of hundreds or thousands of instructions into cache before executing them in blocks of four or more at a time, trying to execute even the most complex instruction in one tick. To this end, each generation of processor uses all the best tricks from previous designs and adds some of its own: the Core architecture is mostly a mix of Pentium M (Banias and Dothan) ideas with those from the Pentium Pro/P6. It’s a major departure from the Pentium 4’s NetBurst design.
Core details
Core is a pipelined architecture, where instructions move through a number of internal stages between entering the processor and being completed – ‘retired’, in the jargon. As an instruction exits a stage another can enter, minimising the idle time for each internal component. Core has around fourteen stages in its pipeline: as with most modern architectures, there are a number of complications, such as early completion and out-of-order execution, which make it hard to define exactly how many stages there are.
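The benefit of pipelining can be sketched with a deliberately idealised model. The Python below assumes one instruction enters the pipeline per tick, every stage takes exactly one tick, and nothing ever stalls; it ignores early completion, out-of-order execution and the fact that real chips handle several instructions at once. The function names are illustrative, and the 14-stage figure is the approximate one quoted above.

```python
def unpipelined_ticks(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next begins."""
    return n_instructions * n_stages


def pipelined_ticks(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: the first instruction needs n_stages ticks to retire,
    then one further instruction retires on every following tick."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


STAGES = 14  # approximate pipeline depth given in the article

before = unpipelined_ticks(1000, STAGES)  # 14000 ticks
after = pipelined_ticks(1000, STAGES)     # 1013 ticks
speedup = before / after
```

Under these assumptions, a thousand instructions take 1,013 ticks rather than 14,000, so throughput approaches one instruction per tick as the stream gets longer. Real code falls short of this whenever the pipeline has to wait or be refilled.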

