Rage3D R580 Technology Review
By Ryan "MrB" Ku & Mark "Ratchet" Thorne - mrb@rage3d.com, ratchet@rage3d.com
January 31st, 2006


Introduction

It’s only been three months since ATI launched the X1800 line of cards, yet the company is already launching refresher parts for its high-end line. That makes the X1800 one of the shortest-lived product lines for ATI and for the 3D graphics industry in general. In recent memory only the failed GeForce FX 5800 series had a shorter life, and even that is arguable, as the product never really saw retail availability.

The X1800 is a solid product that unfortunately ran into one too many delays, which shortened its life considerably. It offered good performance combined with a slew of new technologies that finally brought ATI level with, or ahead of, Nvidia. Most significantly, ATI products were finally capable of handling Shader Model 3.0. ATI had preached that Shader Model 2.0 was good enough, and they were more or less correct, as games didn’t really take considerable advantage of SM3.0 features. Even so, it made fully recommending ATI’s earlier products difficult, as they weren’t as future proof as some of their competitors. The X1800 line also introduced ATI’s all-new ring-bus memory architecture, which offered considerable potential for performance improvements. Another significant introduction was AVIVO, which has only recently shown its muscle in video reproduction. All of this made the X1800 a great chip that was available in decent quantities during its short life.

Why replace the X1800 so quickly when it’s a pretty good product? Product planning occurs well before launch and rests on certain assumptions. The X1800 was supposed to launch around May, and this was the timeframe used in planning the refresher part. Had all gone according to plan, the launch of the refresher part would have closely followed the traditional 6-month product refresh cycle. As we know, the X1800 didn’t go according to plan and was delayed significantly. Still, competition and product cycles don’t wait for anyone, and it’s better to move forward and cut your losses than to stand still in a fast-paced environment.

With the plan in hand and no hiccups in sight, ATI is launching the X1900 line, which consists of four different cards: X1900XTX, X1900XT, X1900 Crossfire, and the AIW X1900. It’s the first time ATI has introduced an AIW edition at the same time as a new GPU launch. The X1900 isn’t just a speed bump over the X1800; it has a few considerable improvements that we’ll get into soon. It’s meant to engage, entertain, and break through: engage your reflexes, entertain your senses, and break through performance barriers. Let’s find out how well it accomplishes this.

What's New

The X1900 series doesn’t bring any significant new technologies to the table but makes up for it by adding a whole lot of punch. Pixel shading punch. As speculated, ATI has boosted the number of pixel shader processors to 48, three times the number found in the X1800 line. This is the largest change from the X1800 and also accounts for the increase in transistor count. The X1900 has over 380 million transistors, an increase of 20% over the X1800, and draws approximately 150W of power! The maturation of the 90nm process technology allowed ATI to pack more transistors into the same area.

The significance of pixel shader power to performance is still relatively vague compared with the number of pipelines (ROPs), the traditional indicator. A threefold jump in ROPs would’ve meant a substantial performance improvement. What does a comparable increase in pixel shader power mean?

The use of pixel shaders in games has steadily increased over the past few years. According to ATI’s research, all new titles in 2006 will use pixel shaders in some way. Not only are shaders used more, they are also becoming increasingly complex, so pixel shading is fast becoming the main performance bottleneck. This observation easily explains ATI’s decision to increase the number of pixel shader processors. Old games already run plenty fast on today’s high-end cards. Recently released games that use fairly simple pixel shader programs will see a noticeable improvement, and future games that lean even more heavily on pixel shaders will see substantial gains.

The X1900 uses the same pixel shader core as the X1800. Each one is capable of performing up to two 3-component vector operations and two scalar operations each clock cycle at full FP32 precision. Each core is equipped with a dedicated Branch Execution Unit that allows flow control to occur seamlessly. Of course ATI’s Ultra-Threaded Pixel Shader Engine has carried over to ensure that each PS core is always working. (To read more about the Ultra-Threaded Pixel Shader Engine, check out our X1800 review.)

The next improvement again deals with shading, but this time on the texturing side. With the X1900 ATI has added Fetch4 capability to the texture units. Fetch4 actually isn’t something new, as the X1300 and X1600 have had the capability ever since launch, but it hasn’t been widely publicized because the X1800 lacks the feature. Fetch4 is a texture sampling method that is useful for shadow map acceleration. Normally texture units sample a colour texture where one colour value consists of four components (red, green, blue and alpha), and each texture unit is only able to sample one value at a time. A shadow map (depth texture), on the other hand, is made up of only one component per value. Fetch4 takes advantage of this and allows a texture unit to sample four values from adjacent addresses at a time, increasing the texture sampling rate by a factor of four. Taking advantage of this feature requires additional programming on the application’s part, but ATI noted that it is fairly easy to implement.
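As a rough illustration of the idea, the Python sketch below models a single-component depth texture as a 2D list and compares one-value-per-fetch sampling with a Fetch4-style fetch that returns four adjacent values at once. The function names and the 2x2 neighbourhood layout are assumptions made for illustration only; this is a toy model, not ATI's hardware behaviour or any real API.

```python
# Toy model of Fetch4. Function names and the 2x2 footprint are
# illustrative assumptions, not ATI's actual implementation.

def fetch1(depth_map, x, y):
    """Conventional fetch: one single-component value per texture fetch."""
    return depth_map[y][x]


def fetch4(depth_map, x, y):
    """Fetch4-style fetch: four adjacent single-component values in one
    fetch, packed where the R, G, B and A channels would normally go."""
    return (
        depth_map[y][x],
        depth_map[y][x + 1],
        depth_map[y + 1][x],
        depth_map[y + 1][x + 1],
    )


def shadow_fraction(depth_map, x, y, fragment_depth):
    """Fraction of the 2x2 block in which the fragment is lit,
    computed from a single Fetch4-style fetch."""
    samples = fetch4(depth_map, x, y)
    lit = sum(1 for stored_depth in samples if fragment_depth <= stored_depth)
    return lit / len(samples)


depth_map = [
    [0.2, 0.4, 0.9],
    [0.6, 0.8, 0.9],
    [0.9, 0.9, 0.9],
]

# Four separate fetches and one Fetch4-style fetch return the same data.
separate = (
    fetch1(depth_map, 0, 0), fetch1(depth_map, 1, 0),
    fetch1(depth_map, 0, 1), fetch1(depth_map, 1, 1),
)
combined = fetch4(depth_map, 0, 0)
print(separate == combined)
print(shadow_fraction(depth_map, 0, 0, 0.5))
```

The point of the sketch is only the fetch count: a shadow filter that needs four depth samples issues one fetch instead of four.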

Fetch4 Chart

High-resolution users will be happy with the final improvement, which benefits them considerably. ATI increased the Hierarchical Z on-chip memory by 50% to handle resolutions from 1920x1200 all the way up to 2560x1600. Previously there would be a large performance hit if this memory wasn’t large enough for the resolution in use, but that has been eliminated.

Everything else about the X1900 is carried over from the X1800, including its advanced 512-bit ring-bus memory controller and industry-leading AVIVO technology. On paper the X1900 fills in the one weak point the X1800 had, which was fewer pixel shader processors than the competition. The X1900 turns that weakness into one of its strengths.

The X1900 Family

As previously mentioned, ATI is releasing four cards at once: X1900XTX, X1900XT, X1900 Crossfire, and the AIW X1900. Each board has the same base configuration of 16 pipelines (ROPs), 16 texture units, 48 pixel shaders and 8 vertex shaders. The R5xx series was designed to be flexible in configuration compared to previous architectures. ATI is able to scale not only in number of pipelines, but also in PS units, texture units and VS units. The X1900 uses a ratio of 3 PS (ALU) for every 1 TU which is the same ratio used in the X1600.

Why not also increase the number of texture units? ATI has analyzed shaders used in 3D games and observed that shaders are increasingly using more arithmetic (mathematical) operations than texture operations. Early shaders had a roughly equal number of instructions of each type, but new shaders are leveraging more and more mathematical operations, approaching a ratio of 5:1. ATI felt the 3:1 ratio of ALU:TU was the ideal balance for current and future games.

Here's a nice chart outlining the new lineup:

| Model | Memory Amount | Core Speed | Memory Speed (effective) | Interface | ROP | TU | PS | VS | SM | MSRP ($US) |
|---|---|---|---|---|---|---|---|---|---|---|
| X1900 XTX | 512 MB | 650 MHz | 1.55 GHz | 256-bit | 16 | 16 | 48 | 8 | 3.0 | $649 |
| X1900 XT | 512 MB | 625 MHz | 1.45 GHz | 256-bit | 16 | 16 | 48 | 8 | 3.0 | $549 |
| X1900 CrossFire | 512 MB | 625 MHz | 1.45 GHz | 256-bit | 16 | 16 | 48 | 8 | 3.0 | $599 |
| AIW X1900 | 256 MB | 500 MHz | 960 MHz | 256-bit | 16 | 16 | 48 | 8 | 3.0 | $499 |
| X1800XT | 512 MB | 625 MHz | 1.5 GHz | 256-bit | 16 | 16 | 16 | 8 | 3.0 | $599 |
| X1800XL | 256 MB | 500 MHz | 1.0 GHz | 256-bit | 16 | 16 | 16 | 8 | 3.0 | $399 |

ROP = Render Output, TU = Texture Unit, PS = Pixel Shader Processors, VS = Vertex Shader Processors, SM = DirectX Shader Model Version

The X1900 XTX is ATI’s extreme high-end card, and it sports an equally extreme price of $649. It’s clocked at 650MHz core and 775MHz memory, an increase of 25MHz on each over the X1800 XT. It’s the first time ATI has used the XTX nomenclature. Frankly it’s ugly and contains too many Xs: XT Extreme. They could’ve used Platinum Edition but chose not to, as that name carried a poor image of unavailability. The XTX will be highly available for purchase through all channels. The X1900 XT is clocked similarly to the X1800 XT except for its memory, which is 25MHz lower. It comes in at a more “reasonable” price of $549. The CrossFire edition of the X1900 is clocked at the lower speed of the XT, the opposite of the X1800 CrossFire, which was clocked to match the higher model. Finally there is the AIW X1900, which is clocked similarly to the AIW X1800XL at 500MHz/480MHz. All four cards will be available at launch.

The Card

The only visual difference between the X1900 and X1800 is the number of these power regulators
In most respects the X1900 reference board is identical to the X1800 board. The only noticeable difference in fact is in the number of voltage regulators aligned along the back of the card; the X1900 has 7 such voltage regulators whereas the X1800 only has 5. These extra components are, of course, required by the X1900 because of the extra power requirements the new flagship has over the X1800. In all other respects the X1900 and X1800 appear the same.

The cooler is still the big, heavy dual-slot design that ATI originally developed for the X1800XT and, unfortunately, there don’t appear to be any improvements made to it for the X1900. It’s relatively loud, certainly louder than the 7800 GTX fan, but when the card is in 2D mode it’s fairly quiet and doesn’t annoy all that much. In 3D mode, however, it spins up fairly quickly under load and, given the right circumstances, can get extremely loud and annoying.

The cooler certainly does seem to do the job it was designed for though. It pumps a ton of heat out the back of your case, which is good considering how hot the exhaust air feels on the palm of my hand. You’d really hate to have all that dumped in your case.

Connections
On the connections side of things the X1900XTX and X1900XT both offer Dual Link Dual-DVI with an S-Video TV-out connection that doubles as the Video-In/Video-Out connector. VIVO functionality is provided by a Rage Theater chip mounted on the front surface of the card.

Naked X1900XTX!
With the cooler removed you can see that the memory is arranged in the same semi-circular pattern as that employed on the X1800XT. The memory chips themselves, at least on the XTX cards sent to reviewers, are Samsung K4J52324QC-BJ11, rated for 1.1ns, or 900MHz (1.8GHz). Very potent chips for a card whose memory clock is set to 775MHz.
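The relationship between the chips' cycle-time rating and their rated clock is simple arithmetic, worked through in the short Python sketch below. The helper names are hypothetical, and the nominal 900MHz figure is the review's number; the raw calculation lands slightly above it.

```python
def rated_clock_mhz(cycle_time_ns):
    """Clock frequency in MHz for a given cycle time in nanoseconds."""
    return 1000.0 / cycle_time_ns


def headroom_percent(rated_mhz, shipped_mhz):
    """How far the rated clock sits above the shipped clock, in percent."""
    return (rated_mhz - shipped_mhz) / shipped_mhz * 100.0


rated = rated_clock_mhz(1.1)   # a 1.1ns part works out to roughly 909 MHz
nominal = 900.0                # figure quoted in the review
shipped = 775.0                # X1900 XTX memory clock

print(f"1.1ns -> {rated:.0f} MHz")
print(f"Headroom over {shipped:.0f} MHz: {headroom_percent(nominal, shipped):.1f}%")
```

On these numbers the memory is rated roughly 16% above the clock the card ships at, which is why the chips look generous for a 775MHz setting.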

ATI's R580 Core
In its naked state the focal point of the card is the brand new R580 GPU, a 382 million transistor, 342mm² (approx) beast of a chip built on TSMC’s 90nm low-k process.

The back of the card is bereft of any notable features save a small springy retention bracket used to keep the ginormous cooler mounted firmly to the surface of the card. On a sidenote, if you want to burn your finger, touch this when the card has been running benchmarks for a couple hours.

Samsung 1.1ns GDDR3
Rage Theater
X1900XTX Backside
Test Setup

Resolutions

Image Quality Settings

Test System Specs

| | Radeon X1900 XTX | Radeon X1900 XT Crossfire | Radeon X1800 XT | GeForce 7800 GTX 512 | GeForce 7800 GTX |
|---|---|---|---|---|---|
| Core | R580 | R580 | R520 | G70 | G70 |
| Silicon Process | 90nm low-k | 90nm low-k | 90nm low-k | 110nm | 110nm |
| Transistor Count (millions) | 384 | 384 | 321 | 302 | 302 |
| Core Speed (MHz) | 650 | 625 | 625 | 550 | 430 |
| Memory Speed, MHz (Effective) | 775 (1.55 GHz) | 725 (1.45 GHz) | 750 (1.50 GHz) | 850 (1.70 GHz) | 600 (1.20 GHz) |
| Memory Size | 512 MB | 512 MB | 512 MB | 512 MB | 256 MB |
| Bus Standard | PEG x16 | PEG x16 | PEG x16 | PEG x16 | PEG x16 |
| Bus Width | 256bit | 256bit | 256bit | 256bit | 256bit |
| Pixel Pipelines | 16 | 16 | 16 | 24 | 24 |
| Pixel Shaders | 48 | 48 | 16 | 16 | 16 |
| Vertex Shaders | 8 | 8 | 8 | 8 | 8 |
| Peak Memory Bandwidth (GB/s) | 49.6 | 46.4 | 48.0 | 54.4 | 38.4 |
| Pixel Fillrate (million pixels/sec) | 10,400 | 10,000 | 10,000 | 13,200 | 10,320 |
| Texel Fillrate (million texels/sec) | 10,400 | 10,000 | 10,000 | 8,800 | 6,880 |
| API Compliancy | DX 9.0c, Shader Model 3.0 | DX 9.0c, Shader Model 3.0 | DX 9.0c, Shader Model 3.0 | DX 9.0c, Shader Model 3.0 | DX 9.0c, Shader Model 3.0 |
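The bandwidth and pixel fillrate rows in the table above follow directly from the clocks, bus width and unit counts listed with them. The Python sketch below reproduces that arithmetic; the dictionary layout and function names are ours, and the pipeline counts are simply the table's own "Pixel Pipelines" row taken at face value.

```python
# Clocks and unit counts copied from the specification table.
CARDS = {
    "Radeon X1900 XTX":          {"core_mhz": 650, "mem_mhz": 775, "bus_bits": 256, "pipelines": 16},
    "Radeon X1900 XT Crossfire": {"core_mhz": 625, "mem_mhz": 725, "bus_bits": 256, "pipelines": 16},
    "Radeon X1800 XT":           {"core_mhz": 625, "mem_mhz": 750, "bus_bits": 256, "pipelines": 16},
    "GeForce 7800 GTX 512":      {"core_mhz": 550, "mem_mhz": 850, "bus_bits": 256, "pipelines": 24},
    "GeForce 7800 GTX":          {"core_mhz": 430, "mem_mhz": 600, "bus_bits": 256, "pipelines": 24},
}


def peak_bandwidth_gbs(mem_mhz, bus_bits):
    """Peak memory bandwidth in GB/s for double-data-rate memory."""
    effective_mhz = mem_mhz * 2            # GDDR3 transfers twice per clock
    bytes_per_transfer = bus_bits / 8      # bus width in bytes
    return effective_mhz * bytes_per_transfer / 1000


def pixel_fillrate_mpixels(core_mhz, pipelines):
    """Peak pixel fillrate in million pixels per second."""
    return core_mhz * pipelines


for name, card in CARDS.items():
    bandwidth = peak_bandwidth_gbs(card["mem_mhz"], card["bus_bits"])
    fillrate = pixel_fillrate_mpixels(card["core_mhz"], card["pipelines"])
    print(f"{name}: {bandwidth:.1f} GB/s, {fillrate:,} Mpixels/s")
```

These are theoretical peaks only; they say nothing about how much of that bandwidth or fillrate a given game can actually use.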

Note that, in lieu of an actual X1900XT, we are using an X1900 Crossfire master card, which has exactly the same clock speeds and specifications as the X1900XT.

 

Games Benchmarks

The Windows XP desktop was set to 1280x960 with a 32-bit color depth and 85Hz refresh rate for all tests. Refresh rate locks for 3D graphics modes, as supported by both the NVIDIA and ATI graphics control panels, were not enabled. V-Sync was forced off via the graphics card control panel as well. All other graphics card control panel settings were left at their defaults unless otherwise noted.

Anti-Aliasing and Anisotropy were applied in the game engine where the options existed. For games that did not support those options natively, the graphics card control panel was used.

Custom batch files were used when possible for automated benchmarking (the details of the commands used are outlined for each test). When manual benchmarking was necessary Fraps was used.

Benchmarking was done with Windows set to the "Adjust for best performance" profile, and all unnecessary Windows services and hardware devices were disabled. The latest drivers for each necessary hardware component were installed prior to testing and kept consistent throughout.

Sound was disabled for all tests unless otherwise noted.

To set up the test machine I installed Windows XP, patched and tweaked it, and installed all the games, apps, utilities, and hardware drivers needed for the testing procedure except for the graphics drivers. Using Norton Ghost, I then cloned the drive onto a second, identical hard drive. After that I installed the ATI drivers on one hard drive and the NVIDIA drivers on the other. Testing the video cards was then a simple matter of swapping video cards and hard drives when required.

Benchmarks

F.E.A.R.

Benchmarking FEAR was simply a matter of running the in-game Performance Test. Some of the action in the sequence is random, but for the most part it produces reliable, repeatable results. It doesn't reflect actual game play but it should give us an idea of how these cards will perform relative to each other.



Quake 4

I benchmarked Quake 4 using 3 custom timedemos then averaged the score from each demo to get the final score for that resolution and setting. The game was set to the "High" setting for all cards, with Anti-aliasing and Anisotropy set in the game.

I used a custom batch file which automatically runs each demo, resolution, and setting. The commandline used in the batch file is detailed below.

"quake4.exe" +set logFile 1 +set com_showFPS 1 +set r_multiSamples <anti-aliasing> +set r_mode <resolution mode> +set image_anisotropy <anisotropy> +set timescale 7 +playdemo demo1 +wait 1000 +timedemoquit demo1

The Quake 4 results are below.
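For readers who want to reproduce this kind of automated run, here is a minimal Python sketch of the procedure described above: build one command line per demo, resolution and setting, then average the three demo scores. The switch names come from the command line shown earlier; the demo names other than demo1, the setting values and the scores are placeholders we made up, and the sketch only builds the strings rather than launching the game.

```python
from itertools import product
from statistics import mean


def quake4_command(demo, anti_aliasing, resolution_mode, anisotropy):
    """Fill in the placeholders of the Quake 4 command line from the review."""
    return (
        '"quake4.exe" +set logFile 1 +set com_showFPS 1 '
        f"+set r_multiSamples {anti_aliasing} +set r_mode {resolution_mode} "
        f"+set image_anisotropy {anisotropy} +set timescale 7 "
        f"+playdemo {demo} +wait 1000 +timedemoquit {demo}"
    )


def final_score(fps_per_demo):
    """Average the per-demo results into one score for a resolution/setting."""
    return mean(fps_per_demo)


# Placeholder values for illustration only.
DEMOS = ["demo1", "demo2", "demo3"]   # demo2 and demo3 are hypothetical names
RESOLUTION_MODES = [5]                # hypothetical r_mode value
AA_LEVELS = [0, 4]
ANISO_LEVELS = [1, 16]

commands = [
    quake4_command(demo, aa, mode, aniso)
    for mode, aa, aniso, demo in product(RESOLUTION_MODES, AA_LEVELS, ANISO_LEVELS, DEMOS)
]

print(len(commands), "runs queued")
print(commands[0])
print(final_score([60.0, 70.0, 80.0]))   # made-up FPS values, not measurements
```

In a real batch run each command would be executed in turn and the FPS read back from the game's log before averaging.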



Serious Sam 2

Serious Sam 2 was benchmarked using the three timedemos that come with the game. The results of each demo were averaged to get the final result for the resolution and setting. A custom batch file was used to automate the process.

The first set of charts was run with HDR disabled and the second set with HDR enabled. All other settings stayed the same. Since only ATI's X1000 family of cards supports HDR with Antialiasing, only those cards were tested with AA in the second set of charts.

The commandline used to launch Serious Sam 2 is below:

bin\Sam2.exe +demo "content/serioussam2/demos/<demo>" +bmk_bAutoQuit 1 +bmk_bBenchmarkDemos 1 +sam_demo 1 +sam_bBootSequence 0 +fullscreen 1 +aspect <aspect ratio> +gfx_iAntiAliasing <anti-aliasing level> +tex_iAnisotropy <anisotropic level> +width <width> +height <height>



Battlefield 2

Battlefield 2 benchmarking is a little tricky. It has a built-in timedemo feature, but the results it produces can be very unreliable because it starts logging frame rate during the loading sequence and on the menu screen, before the actual demo starts, which greatly skews the results. The most reliable method is to take the frame rate log the timedemo produces (a .csv file) and sample only the last few thousand frames (in our case, the last 7000). I used a custom timedemo to get the results below.
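The tail-sampling approach can be sketched in a few lines of Python. This is only an illustration under a loud assumption: we treat the log as one frame time in milliseconds per row, which may not match the real layout of Battlefield 2's .csv file, and the data below is synthetic.

```python
import csv
import io


def average_fps_of_tail(csv_text, tail_frames, column=0):
    """Average FPS over the last `tail_frames` rows of a frame-time log.

    Assumes each row holds one frame time in milliseconds in `column`;
    the real Battlefield 2 log layout may differ.
    """
    rows = csv.reader(io.StringIO(csv_text))
    frame_times_ms = [float(row[column]) for row in rows if row]
    tail = frame_times_ms[-tail_frames:]
    total_seconds = sum(tail) / 1000.0
    return len(tail) / total_seconds


# Synthetic log: 5 slow "loading" frames at 100 ms, then 10 frames at 20 ms.
log = "\n".join(["100.0"] * 5 + ["20.0"] * 10)

whole_log_fps = average_fps_of_tail(log, 15)   # skewed by the loading frames
tail_only_fps = average_fps_of_tail(log, 10)   # gameplay frames only

print(f"Whole log: {whole_log_fps:.1f} FPS")
print(f"Tail only: {tail_only_fps:.1f} FPS")
```

With a real log, `tail_frames` would be 7000, as in our testing, and the text would be read from the .csv file instead of built in memory.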



Splinter Cell Chaos Theory

Splinter Cell Chaos Theory was also benchmarked using a custom batch file. Anti-aliasing and anisotropy were set within the game. The first chart shows all three cards using the same Shader Model 1.1 path, which is the setting the game defaults to.

In the second chart the Shader Model 3.0 path was chosen and all the SM3.0 features were enabled. Even though the X1000 family supports HDR with AA, Splinter Cell Chaos Theory needs to be patched for it to work. As yet there is no such patch, so we only have 0/0 and 0/16 results there.



Half-Life 2: Lost Coast

To get results from Lost Coast we recorded a custom timedemo, which actually ran for the whole length of the level, and manually ran each resolution and setting.



Benchmarks (cont.)

Half-Life 2

Testing of Half-Life 2 was done using 4 custom Source Engine 7 time demos from various sections of the game. Because there is some frame rate variance during the Half-Life 2 benchmark process, we ran each timedemo for each resolution and AA/AF setting 3 times then averaged the results to get the final score. Anti-aliasing and Anisotropy were set on the command line.

A batch file was used to automate testing; the command line is below for reference. This batch file was used for each card that was tested. The settings surrounded by < > change for each pass:

"hl2.exe" +r_fastzreject 1 +r_waterforcereflectentities 1 -novid -nosound -width <resolution width> -height <resolution height> +mat_antialias <anti-aliasing> +mat_forceaniso <anisotropy> +mat_trilinear 1 +timedemoquit <timedemo>


 



Doom 3

Doom 3 was tested with 3 custom timedemos, and the results of each timedemo were averaged to get the final score for that resolution and setting. We benchmarked combinations of Anti-Aliasing and Anisotropy over the resolutions shown in the chart below. Anti-aliasing and Anisotropy were set on the command line.

Another batch file was used to automate Doom 3 testing as well. The command line is below for reference. This batch file was used for each card that was tested. The settings surrounded by < > change for each pass:

"doom3.exe" +set logFile 1 +set com_showFPS 1 +set r_multiSamples <anti-aliasing> +set r_mode <resolution mode> +set image_anisotropy <anisotropy> +set timescale 7 +playdemo <demoname> +wait 1000 +timedemoquit <demoname>


Chronicles of Riddick: Escape from Butcher Bay

Chronicles of Riddick was benchmarked using another custom batch file that ran each of the 3 built in timedemos included with the 1.01 patch. The Shader Model 2.0 path was used for all cards and Anti-aliasing and Anisotropy were set in the graphics card control panel.



Far Cry

Again a custom batch file was used to benchmark Far Cry. We used 3 demos included with the newer patches to test performance (from the Cooler, Training, and Volcano levels). Because Far Cry benchmark frame rates can vary between each subsequent pass (sometimes fairly significantly), I ran each of the 3 demos 3 times, then averaged the 9 results to get the final score for the detail level.
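The averaging step is easy to sketch in Python. The FPS values below are made-up placeholders, not measurements from this review, and the dictionary layout and helper names are ours; the sketch simply shows three demos, three passes each, collapsed into one score, along with the run-to-run spread that motivates repeating the passes.

```python
from statistics import mean

# Placeholder FPS values for three passes of each demo (not real results).
runs = {
    "Cooler":   [50.0, 52.0, 54.0],
    "Training": [60.0, 62.0, 64.0],
    "Volcano":  [70.0, 72.0, 74.0],
}


def final_score(runs_by_demo):
    """Average all passes of all demos into a single score."""
    all_results = [fps for passes in runs_by_demo.values() for fps in passes]
    return mean(all_results)


def spread(passes):
    """Difference between the fastest and slowest pass of one demo."""
    return max(passes) - min(passes)


for demo, passes in runs.items():
    print(f"{demo}: mean {mean(passes):.1f} FPS, spread {spread(passes):.1f} FPS")

print(f"Final score: {final_score(runs):.1f} FPS")
```

Because every demo has the same number of passes, averaging the nine results directly gives the same number as averaging the three per-demo means.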

The command line for the batch file I used to automate Far Cry benchmarking is below. This batch file was used for each card that was tested. The settings surrounded by < > change for each pass:

"farcry.exe" -DEVMODE "s_soundEnable 0" "r_width <horizontal res>" "r_height <vertical res>" "demo_num_runs 2" "map <mapname>" "demo <mapname>" "demo_quit 1" "r_FSAA <1|0>" "r_FSAA_samples <AA Samples>" "r_Texture_Anisotropic_Level <AF Level>"


Pacific Fighters

Pacific Fighters was benchmarked by loading the included "N1K1 vs BeauFighter.ntrk" track and logging framerates using Fraps from the beginning of the track for 90 seconds. All the in game details were set to their maximum levels, including "Landscape Detail" which was set to "Perfect", enabling Pixel Shaded water. Video was set to the "ATI Radeon X800/9800/9700/9600/9500" profile for the ATI cards and "NVIDIA GeForce 6800/6600/FX/4/3" profile for the NVIDIA cards. Anti-aliasing and Anisotropy were set via the graphics card control panel.

Highest Image Quality Benchmarks
Battlefield 2 X1900XTX
Battlefield 2 7800GTX
Splinter Cell Chaos Theory X1900XTX
Splinter Cell Chaos Theory 7800GTX
Doom 3 X1900XTX
Doom 3 7800GTX
Far Cry X1900XTX
Far Cry 7800GTX

Overclocking

To test overclocking capability we used ATI's OverDrive.

OverDrive, at least in its current state, has pretty shallow limits on clocks when it comes to the X1900XTX: just 690MHz maximum for the core and 800MHz maximum for the memory. We hit these limits without trouble almost immediately, which suggests that the card can go a lot higher. Unfortunately the only third-party tools currently available that can overclock the X1900XTX past these speeds are not as stable as we'd like (I did manage to get the memory speed up to 855MHz with ATI Tool, but it would crash when trying to modify the core speed). For now we're going to test the performance increase with the maximum OverDrive overclock and hopefully revisit X1900XTX overclocking when better tools become available.

So, in summary, we are testing the card at an overclocked core speed of 690MHz (which represents a 6.2% increase over the default speed of 650MHz), and with the memory set to 800MHz (which is a tiny 3.2% increase over the default 775MHz). These clocks are very likely far below the capabilities of the X1900XTX.
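The percentages quoted here are straightforward to check; the Python snippet below does the arithmetic with a small helper whose name is ours.

```python
def percent_increase(base, new):
    """Percentage increase of `new` over `base`."""
    return (new - base) / base * 100.0


core_gain = percent_increase(650, 690)     # default vs. OverDrive core clock (MHz)
memory_gain = percent_increase(775, 800)   # default vs. OverDrive memory clock (MHz)

print(f"Core:   {core_gain:.1f}%")
print(f"Memory: {memory_gain:.1f}%")
```

Any frame rate gain from this overclock should therefore be expected to stay in the low single digits, since neither clock moves by more than about 6%.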

The following charts represent the performance increase in Doom 3 and Splinter Cell: Chaos Theory with the card running at maximum OverDrive.

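Doom 3 (chart settings: No AA / No AF, No AA / 16x AF, 4x AA / No AF, 4x AA / 16x AF)

| Resolution | 1280x1024 | 1600x1200 | 1920x1200 | 2048x1536 |
|---|---|---|---|---|
| % Increase | 1.0% | 3.0% | 3.8% | 4.3% |

Splinter Cell: Chaos Theory (chart settings: No AA / No AF, No AA / 16x AF)

| Resolution | 1280x1024 | 1600x1200 | 1920x1200 | 2048x1536 |
|---|---|---|---|---|
| % Increase | 1.4% | 3.1% | 3.9% | 4.3% |

Conclusion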

The X1900 series represents a nice boost over the short-lived X1800 series. The X1900 builds on the strong foundation set by the X1800 series but provides a more balanced design overall (thanks to the increase in pixel shaders), accelerating today’s games while leaving room for greater performance in future games. The best part about the launch is availability. ATI is so confident in availability that they are making the whole line of X1900s available at once: XTX, XT, Crossfire and AIW!

So how does the new line stack up? On average the XTX performs 15.5% faster than the X1800XT, and the X1900XT 11.8% faster. We can see the impact of increasing the pixel shaders, as the X1900XT actually has a lower memory speed than the X1800XT. I wouldn’t get an X1800XT unless you can find it cheap on the market. The X1900XTX is definitely not worth the $100 premium, as it’s only 3.2% faster than the X1900XT.
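To put that premium in perspective, the Python sketch below compares the extra cost of the XTX with its extra performance, using the MSRPs and averages quoted in this review. The helper name is ours, and deriving the XTX-over-XT gap from the two X1800XT-relative averages is only an approximation, which is why it lands slightly off the 3.2% measured figure.

```python
def percent_more(base, new):
    """How much larger `new` is than `base`, in percent."""
    return (new - base) / base * 100.0


XT_PRICE, XTX_PRICE = 549, 649   # MSRP in US dollars
XTX_VS_X1800XT = 15.5            # average % faster than the X1800XT
XT_VS_X1800XT = 11.8             # average % faster than the X1800XT

price_premium = percent_more(XT_PRICE, XTX_PRICE)

# Approximate XTX-over-XT gap from the two X1800XT-relative averages.
derived_gap = percent_more(100.0 + XT_VS_X1800XT, 100.0 + XTX_VS_X1800XT)

print(f"Price premium:   {price_premium:.1f}%")
print(f"Performance gap: {derived_gap:.1f}% (derived)")
```

Either way you compute it, the XTX costs roughly 18% more for a performance gain of a little over 3%.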

Compared to the competition, the GTX 512 holds up fairly well, and if you only play games based on the Doom 3 engine then that should be your choice. Really, though, the GTX 512 is basically not available for purchase, which makes the X1900 that much better. The X1900 really flexes its muscles with all the bells and whistles on and leaves the GTX 512 behind. The X1900 clearly surpasses the GTX 256.

The X1900 is an all-around solid product from ATI that really is bringing them back on track. It can only get better: the architecture is still relatively new, which leaves room for improvement through driver enhancements. ATI continues to put a tremendous amount of resources into drivers, which has given us solid Catalyst drivers each and every month. The X1900 provides performance, stability, CrossFire capability, high resolution gaming, HDR with AA, high quality video processing technology and crisp display output. We couldn’t ask for more. Well, except for a lower price ;)