Eric Demers and GCN, Part II

Company: AMD
Author: James Prior
Date: December 22nd, 2011

Crossfire Design Considerations

R3D: When you're sitting down to design, is it in your mind to think about compatibility with previous generations, in terms of Crossfire, or is that an afterthought?

Tom: No, that's absolutely a consideration.

Eric: It's actually part of the requirements specification.

Tom: There's Crossfire, then there's also dual graphics configurations, with the APU core, so yeah, it's very important.

R3D: Is it going to be tough for you guys to get the VLIW4 in Trinity working with GCN, or is that already working?

Eric: I don't know if we've made an announcement, check with Devon [Nekechuk, 6970 product manager] for the Crossfire compatibility we've promised for this part. The problem we have, for example, if we made it Crossfire compatible with the 6970 - which is not impossible - is we would have to turn off any difference between the cards.

R3D: DirectX 11.1 vs. 11.0, all the filtering changes?

Tom: Yeah, exactly.

Eric: All the changes would have to be brought back down, and we have put in some of those capabilities to be able to turn off some of these enhancements, but then you sacrifice those enhancements. Is that really what a customer would want to do, if they buy a new card like that? Yeah, you'd get higher performance if you pair it with your old card, but you lose the new features; it's a tough one. It's not really an engineering decision, it's more of a business decision to see what makes it happen.

Eric: Yes, we definitely have backwards compatibility as a high priority for ourselves so we can potentially offer it. On these high end cards it's usually about pairing them together, because that is the ultimate Crossfire performance, when you have two paired cards. You have the best, we're getting 1.7x-1.8x scaling with two cards - it is the very best experience.

R3D: The performance uplift offered here, you're saying 40% better than previous single GPU on the market?

Eric: Actually you're gonna have to measure it yourself! In the reviewer's guide that we'll ship with the board, we'll give you a bazillion different benchmarks which you'll ignore, because you shouldn't trust anybody. That number is in my head right now, I know there are applications that are well beyond that and I know there are applications that are below that; is that the right number, vs. 6970? I'd have to go check, it's going to vary significantly, per app. 1.3x-1.5x isn't a bad range. If you're running in the higher resolutions you're gonna get the higher end of that, if you're running in the lower resolutions you're gonna get diminishing returns. If you're running 1280 and no AA; no offense, but maybe you should buy a different card. ;-)

R3D: How far down from your expectations or your performance target is that, and what caused that to come about? Is it a pull in a different direction that caused a design decision where graphics performance and compute had different needs or is it the process didn't work out right?

Eric: No, I think it is actually pretty close to what we wanted the target to be, if you look at the process improvements - I forget the exact scaling, 20-30% maybe - and then the new architecture is better so in some applications, like the tessellation bound ones, you saw a number of 136% [improvement] on the Unigine so we're getting well over 2x performance improvements there. It does vary significantly, and I think it's in the right ballpark.

Tom: Process-wise we're seeing very good clocking, you saw some claims on overclocking, we're seeing faster rates in general for both the memory and the core clock. As well as the scaling benefits, the performance/watt benefits. We're getting close to what we were expecting.

Eric: We're actually very close to target.

R3D: The last time I was in this facility was for the Zambezi tech day, for the Bulldozer briefing. I'm hearing a similar sort of message for how the design was brought together here, and obviously this architecture is going to be with us for a while as the previous one (VLIW/Terascale) was. What's the difference here in execution for GCN vs. Bulldozer?

Eric: I wasn't part of the Bulldozer briefing so I'm not 100% sure of what was said. From our perspective this is like when we introduced Northern Islands, or when we introduced Evergreen or other parts, there is no magic sauce. Don't take our word for it on any of our numbers, go do your own benchmarks. We feel very confident you will get good numbers, if you're not getting good numbers then come back and see us. You'll see the benchmarks do vary a lot, there's a lot of 2x and there are some 1.3x or whatever the minimum bar is for just the clock improvement. On average 1.5x is very good, we could quote you 2x and we'd probably be right half the time as well. We aren't hiding anything, and we aren't saying it's a server-only part or anything like that. It's a great part, it's an awesome part.

Eric: Its architecture design was created by a completely separate team [from Bulldozer], but there is a lot of overlap between the tech managers, and what they're using for boosting their performance is the same thing we're using in PowerTune. There is some crossbreeding there. We talk a lot with them about the shader design core and leveraging technology from Bulldozer; Bulldozer is an x86 architecture, it is completely different from a GPU architecture. We're wide and, relatively speaking, slow compared to theirs: what, 4.5GHz rates? We're running 925 [MHz] and a completely different process technology. They're running at GLOBALFOUNDRIES, we're at TSMC - it's very hard to compare those two things; different business units, different PR people, different engineers, different everything.

Eric: I actually like the Bulldozer design, I think particularly the revisions that are upcoming are going to be pretty good for it. It's not a bad CPU, it's just that the competition is very good there. We [Graphics division] have the advantage with the competition being somewhat on-par with our current designs, in performance/$$ we're probably still ahead of them. This part is another salvo in that continuous war. We don't have alien process technology like Intel does [laughing], thankfully we're not competing directly with them. We're actually competing with guys that have exactly the same process technology as us, so we feel really comfortable about going for it. In fact, right now, I wish we had more time (of course) with it before we introduced it but this is looking to be a rock solid product. Everybody has met their expectations and everybody is happy with their performance.