Intel's Sandy Bridge Core processors
by Gary Garland with Phoenix Cloud and IT Solutions, Inc
Finally, Intel is showing one of the most anticipated
bits of microarchitecture we've seen in years: Sandy Bridge. We've known the architectural details of the processor code-named
Sandy Bridge for months (they are very detailed, new, and different), but we haven't known exactly how the changes
would translate into performance and power efficiency, which is the big question about any product played up this heavily.
Sandy Bridge is also referred to as "the second-generation Core microprocessors," and its performance results are
ready for your perusal.
Sandy Bridge is, basically, a next-generation
replacement for Intel's primary CPUs for desktops and laptops, including those based on quad-core Lynnfield and dual-core
silicon. Because so much data about Sandy Bridge has been available for a while, we're going to gloss over the architectural aspects in this
review, give you a quick overview of Sandy's key features, and then focus on our test results.
Keep in mind that even a quick
overview of this new chip will take some time, simply because so very much has changed.
At the heart of Sandy
Bridge is an essentially new processor microarchitecture, the most sweeping architectural transition from Intel since the
introduction of the star-crossed Pentium 4. Nearly everything has changed, from the branch predictors through the out-of-order
execution engine and into the memory subsystem.
The goal: to achieve higher performance and power efficiency,
even on single-threaded tasks, where the integration of multiple CPU cores hasn't been much help. Additionally, each of those
cores holds a revamped floating-point unit that supports a new instruction set called AVX.
These instructions allow the processing of vectors up
to 256 bits in width, and the hardware supports them quite fully.
The result should be much higher sustained rates of
throughput for floating-point math, giving new life to media processing applications and other sorts of data-parallel computation.
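To make the 256-bit figure concrete, here is a small sketch that counts how many floating-point values fit in one vector. The 128-bit comparison width (that of the older SSE registers) and the helper name `lanes` are assumptions added for illustration; they do not come from the article.

```python
def lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements of a given size that fit in one vector register."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector width")
    return vector_bits // element_bits


AVX_BITS = 256  # vector width cited in the article
SSE_BITS = 128  # assumption: width of the older SSE registers, for comparison

for name, bits in (("single precision", 32), ("double precision", 64)):
    print(f"{name}: {lanes(SSE_BITS, bits)} lanes at 128 bits, "
          f"{lanes(AVX_BITS, bits)} lanes at 256 bits")
```

Doubling the vector width doubles the number of values a single instruction can touch, which is where the hoped-for gain in sustained floating-point throughput would come from.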
Beyond its strong new cores, Sandy Bridge incorporates more of a PC's basic functions on a single square of silicon than any prior CPU
in its class. Not only does it have the memory controller and PCIe links (in addition to two to four CPU cores), but it also
brings a graphics processor (GPU) onboard.
This creeping integration of system components has resulted in higher
performance, lower platform power consumption, and more compact packaging, which is why both Intel and AMD are moving deliberately
toward further integration.
At the same time, integrated graphics processors (IGPs) are growing more capable.
Sandy Bridge's IGP bears little resemblance to Intel's past attempts
at graphics; its execution units are capable of substantially more work per instruction and per clock cycle.
Also, the IGP's video processing block can both decode and, somewhat distinctively (if you don't count, say, the iPhone 4), encode
H.264 high-definition video streams, opening up the possibility of fully hardware-accelerated video transcoding that barely
touches the CPU cores.
To facilitate better integration, Intel's architects gave Sandy Bridge a high-bandwidth,
ring-style interconnect between the cores, with their associated L3 cache partitions, and the IGP. This fast (up to 384 GB/s
in a 3GHz quad-core chip) interconnect has a number of purported benefits, including easing data sharing between cores, providing
the throughput needed for the processor's revamped floating-point units, and allowing the onboard graphics component to expand
its available bandwidth by making use of the L3 cache.
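One way to arrive at the 384 GB/s figure is a simple product of ring stops, bytes moved per clock, and clock speed. The sketch below assumes four ring stops (one per core and L3 partition), each moving 32 bytes per clock; that per-stop width is an assumption chosen because it reproduces the quoted number, not a detail given in the article.

```python
def ring_bandwidth_gb_s(stops: int, bytes_per_clock: int, clock_ghz: float) -> float:
    """Aggregate bandwidth in GB/s: stops x bytes per clock x clock rate in GHz."""
    return stops * bytes_per_clock * clock_ghz


# Assumed parameters: 4 stops and 32 bytes per clock are illustrative guesses;
# only the 3 GHz clock and the 384 GB/s total come from the article.
total = ring_bandwidth_gb_s(stops=4, bytes_per_clock=32, clock_ghz=3.0)
per_stop = total / 4

print(f"aggregate: {total:.0f} GB/s, per stop: {per_stop:.0f} GB/s")
```

Because the total scales with both the stop count and the clock, a dual-core part or a lower-clocked chip would land proportionally lower under the same assumptions.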
The quad-core Sandy Bridge die with
major sections labeled.
Better integration has created new possibilities for power management, as well.
Sandy Bridge extends in several ways Intel's Turbo Boost feature, which takes advantage of available headroom in the CPU power delivery
and cooling mechanisms to deliver higher clock frequencies at lower load levels. The first change is simply more clock speed.
Although Turbo behavior varies from model to model, Sandy Bridge reaches higher clock speeds and ramps up more aggressively
than older processors. The revised Turbo algorithm also does something that may seem a little counterintuitive at first, allowing
the CPU to ramp beyond its maximum rated power use (thermal design power or TDP) for brief periods of time.
As I understand
it, Intel is taking advantage of the lag between when a relatively cool idle chip begins to warm up its environment and when
temperatures have risen to levels where full cooling capacity is needed.
During this span of time, the chip may opportunistically
push beyond its rated thermal peak by running at higher-than-usual frequencies within its Turbo Boost range. Once the surrounding
system has warmed up or enough time has passed (the algorithm is complex, and Intel hasn't shared all of the details with
us), the chip will drop back to operating within its TDP max.
Intel claims this feature has an important usability benefit for common usage patterns, where periods of high utilization are "bursty"
by nature (think of opening a program or running a Photoshop filter). Furthermore, Sandy Bridge's Turbo Boost algorithm
incorporates not just the CPU cores but the IGP, as well; it can raise the operating frequency of the graphics processor when
the CPU cores aren't at full utilization.
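The burst-then-settle behavior described above can be pictured with a toy energy-budget model. This is only a sketch: Intel has not published the real algorithm, so the function `simulate_turbo`, the wattages, and the budget size are all hypothetical values chosen for illustration.

```python
def simulate_turbo(demand, tdp=95.0, burst=120.0, budget_joules=100.0, dt=1.0):
    """Return the power (watts) granted at each time step.

    Hypothetical model, not Intel's algorithm: while stored budget remains,
    the chip may draw up to `burst` watts; once the budget is spent, it is
    held to `tdp`. Running below TDP refills the budget, capped at its
    starting size.
    """
    budget = budget_joules
    granted = []
    for want in demand:
        limit = burst if budget > 0 else tdp
        power = min(want, limit)
        # Energy above TDP drains the budget; energy below TDP refills it.
        budget = min(budget_joules, budget - (power - tdp) * dt)
        budget = max(budget, 0.0)
        granted.append(power)
    return granted


# Two idle steps followed by a sustained heavy load (all values illustrative).
trace = simulate_turbo([30.0] * 2 + [130.0] * 8)
print(trace)
```

In this model a short burst of work finishes entirely inside the above-TDP window, while a long-running load settles back to the rated limit, which matches the "bursty" usage pattern the article describes.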
Table: key specs of Sandy Bridge and other recent chips
Columns: Last-level cache size | Estimated transistors (Millions) | Die area (mm²)
Core 2 Duo
Core i5, i7
Core i3, i5
Core i5, i7
Core i3, i5
Athlon II X4/X3 | 512 KB x 4
Athlon II X2 | 1 MB x 2
Phenom II X6
The table above shows the key specs for the quad- and dual-core versions of Sandy Bridge alongside other recent chips. Thanks
to Intel's 32-nm, high-k metal gate fabrication process, the nearly one billion transistors in the quad-core version
of Sandy Bridge fit into a die area smaller than either the Lynnfield chip it replaces or the "Deneb" Phenom II
with which it competes, and neither of those other chips has integrated graphics.
If you're counting
along at home, Intel tells us each CPU core is made up of roughly 55 million transistors, while the graphics core is 114 million.
We suspect a great many of the remaining transistors are packed tightly into the chip's 8MB of L3 cache.
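For those counting along, the arithmetic can be sketched as follows. The per-core and graphics figures are the ones quoted above; the total of 1,000 million is an assumption standing in for "nearly one billion," so the remainder is an upper-bound estimate rather than an exact count.

```python
CORE_MTRANSISTORS = 55      # per CPU core, in millions (from the article)
GPU_MTRANSISTORS = 114      # graphics core, in millions (from the article)
CORES = 4                   # quad-core version
TOTAL_MTRANSISTORS = 1000   # assumption: rounds "nearly one billion" upward

accounted = CORES * CORE_MTRANSISTORS + GPU_MTRANSISTORS
remaining = TOTAL_MTRANSISTORS - accounted

print(f"cores + graphics: {accounted} million")
print(f"left for cache, ring, memory controller, and I/O: about {remaining} million")
```

Under these assumptions, roughly two thirds of the budget sits outside the cores and graphics, which is consistent with the suspicion that the L3 cache accounts for a large share.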
All this means higher throughput to work with higher and faster bandwidth. Cloud pros will profit and gain traction, too.