Starting with Bulldozer, we should note that this is not the full, final product for sale - it's the core microarchitecture, and there are other codenames for desktop and server products. Interlagos is the server variant, with Zambezi for the desktop.
The purpose of the Bulldozer module is to improve multi-threaded compute performance and utilization. As a basic building block, the Bulldozer module is very different from the x86 processor cores we see today. A Bulldozer module is two integer threads plus one floating-point thread, with shared fetch, decode and L2 cache. Inside each of the integer cores are four pipelines, and the floating-point unit dual-issues to two 128-bit Fused Multiply-Add (FMA) units, which can be broken down into four 64-bit FMAs or combined into a single 256-bit unit. Bulldozer also adds SSE1, SSE2, SSE3, SSSE3, and Secure Virtualization instruction set extensions. AVX is also supported, courtesy of the 256-bit FMA.
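To make the widths above concrete, here is a minimal sketch that treats the shared floating-point unit as two 128-bit FMA units and counts how many operations of a given width fill it. The class and method names are hypothetical, and this is a counting model built only from the figures quoted above, not a description of how AMD's hardware actually schedules work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BulldozerModule:
    """Toy model of one module, using only the figures quoted in the text."""
    integer_cores: int = 2        # two integer threads per module
    pipelines_per_core: int = 4   # four pipelines in each integer core
    fma_units: int = 2            # two 128-bit FMA units
    fma_width_bits: int = 128

    @property
    def fp_width_bits(self) -> int:
        # Total floating-point datapath width shared by both threads.
        return self.fma_units * self.fma_width_bits

    def fp_ops_per_issue(self, op_width_bits: int) -> int:
        """How many operations of the given width fill the shared FP datapath."""
        if op_width_bits <= 0 or self.fp_width_bits % op_width_bits != 0:
            raise ValueError("operation width must evenly divide the FP datapath")
        return self.fp_width_bits // op_width_bits


module = BulldozerModule()
for width in (64, 128, 256):
    # Prints the operation width and how many such operations fit side by side.
    print(width, module.fp_ops_per_issue(width))
```

Under these assumptions the 256 bits of FMA hardware hold four 64-bit operations, two 128-bit operations, or one 256-bit operation, which matches the breakdown described in the text.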
Bulldozer Shared Components
Each Bulldozer module is frequency and power gated, meaning that in a product with multiple Bulldozer modules, individual modules can be turned down, or off, to save power - and power consumption should already be good, as it's manufactured on the latest 32nm SOI (Silicon on Insulator) process technology. Operating systems see each BD module as two cores, with the shared floating-point thread transparent to the OS. This is bad news for the enterprise VMware customer, who has a choice of paying more per server for Intel's Nehalem two-socket, or more per year for each VMware license for AMD's Opteron four-socket.
Bulldozer will be introduced first in the server market, under the codenames Interlagos and Valencia. It will be a drop-in upgrade for existing Magny-Cours based platforms, using the G34 or C32 sockets. Bulldozer is the heavy lifter, designed to scale up and down to suit a customer's needs - AMD is now completing its transition to being a design company, offering flexibility in product choice for partners to build upon.
AMD's current top server product is an eight- or twelve-core processor with a quad-channel memory controller, based on the Istanbul processor foundation. Bulldozer-based Interlagos promises to use the same power envelope with 33% more cores to deliver 50% more performance. That means sixteen cores (8 BD modules) in the same 105W ACP as Magny-Cours, giving 50% more performance in ... what? Best guess is the SPECint benchmark. Based on that, we can estimate a 12.5% integer performance increase per core against Magny-Cours (1.5x the throughput from 1.33x the cores) for products with a similar power envelope. We still don't know the clock rates of final shipping products (due 2011), meaning the increase in integer performance per clock is still unknown.
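The per-core estimate is simple enough to check by hand, but a short sketch makes the working explicit. It assumes the comparison is a twelve-core Magny-Cours against a sixteen-core Interlagos and that the 50% claim refers to total throughput. The variable names are ours, and the benchmark behind AMD's figure is still a guess.

```python
from fractions import Fraction

# Figures quoted in the text; the 50% claim is AMD's, benchmark unspecified.
magny_cours_cores = 12
interlagos_cores = 16               # 8 Bulldozer modules x 2 integer cores
throughput_ratio = Fraction(3, 2)   # "50% more performance"

# Ratio of core counts, then throughput per core relative to Magny-Cours.
core_ratio = Fraction(interlagos_cores, magny_cours_cores)
per_core_ratio = throughput_ratio / core_ratio

print(f"core count increase: {float(core_ratio - 1):.1%}")
print(f"per-core increase:   {float(per_core_ratio - 1):.1%}")
```

Using exact fractions, the per-core ratio comes out to 9/8, i.e. the 12.5% quoted above. Without clock speeds, this cannot be converted into a per-clock figure.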
Simultaneous Two Strong Thread Execution
Bulldozer's modular approach with two full integer cores and shared floating-point is AMD's combination of Simultaneous Multi-Threading (SMT) and Chip Multi-Processing (CMP). You'll be most familiar with SMT through Intel's Hyper-Threading implementation of superscalar design with multi-issue instructions. CMP is the more traditional method, where complete cores are added to scale performance - i.e. AMD multi-cores to date. Bulldozer bridges the gap, with shared x86 fetch and decode but separate dual integer processing and a shared floating-point (FP) unit. Each thread can be assisted with a scheduled 128-bit FP thread, or one core could use the whole 256-bit FP capability.
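The FP sharing described here can be pictured with a toy allocation rule: if both threads want floating-point work they get 128 bits each, and if only one does it gets all 256. The function below is a hypothetical illustration of that rule only. AMD has not described its actual scheduling policy in this detail, so the even split is an assumption.

```python
from typing import Tuple


def fp_bits_per_thread(wants_fp: Tuple[bool, bool],
                       total_bits: int = 256) -> Tuple[int, int]:
    """Split the module's shared FP width between its two threads.

    Assumed policy (not confirmed by AMD): threads that request FP work
    share the full width evenly; idle threads get nothing.
    """
    active = sum(wants_fp)
    if active == 0:
        return (0, 0)
    share = total_bits // active
    first, second = (share if wants else 0 for wants in wants_fp)
    return (first, second)


print(fp_bits_per_thread((True, True)))    # both threads use FP
print(fp_bits_per_thread((True, False)))   # only the first thread uses FP
```

Under this assumed policy the first call gives each thread 128 bits and the second gives the single active thread the full 256 bits, which are the two cases the text describes.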
Building Interlagos From Bulldozer Modules
Bulldozer and SIMD
There is no real reason that an APU utilizing Bulldozer cores can't become reality. In fact, it's probably a good bet that a future generation of compute processors will be Bulldozer cores with big fat SIMD arrays on them. Shared memory is a real advantage for compute applications that mix integer-bound workloads with massively parallel operations. Heuristic anti-virus, intrusion detection, storage controllers, high-bandwidth network switches, virtualization servers and more could all leverage the massive throughput and flexibility that servers based on Bulldozer APUs could offer. FirePro and FireStream cards could be given a new HyperTransport direct-connect slot and include Bobcat APU(s) to provide more compute scaling.
Bulldozer on the Desktop
For the desktop, the Zambezi processor is good news and bad news. The good news is it's an eight-core product; the bad news is it needs a new socket - AM3r, or AM3+. This is an electrical upgrade of the AM3 platform, to provide the power phases and planes/states required by the power gating features of Zambezi. As you might have guessed from the name, this socket is backwards compatible with existing AM3 processors, so you'll be able to upgrade piecemeal - motherboard for your birthday, CPU for Christmas, or however your upgrade cycle works.
Desktop products based on the Bulldozer module will be introduced later than the server variants, so if you're waiting for BD to refresh your desktop there is no end in sight yet. AMD gave no indication of a Turbo mode, but it is expected to be offered on both platforms, as a new implementation compared to the current Turbo Mode used by the Phenom II X6 processors.