Can CPU be used for tasks the gfx card is too slow for (AntiAliasing etc.) ?

flexy

New member
I was thinking about that:

nowadays a fast gfx card runs at 300MHz, while a fast CPU runs somewhere from 1600MHz to over 2GHz!

There should be some way to split tasks (like antialiasing) off to the CPU, which is so much faster. Anyway, I don't know if my thought is totally off... but maybe some developers get an idea out of it :)

greets
 
Re: Can CPU be used for tasks the gfx card is too slow for (AntiAliasing etc.) ?

flexy said:
I was thinking about that:

nowadays a fast gfx card runs at 300MHz, while a fast CPU runs somewhere from 1600MHz to over 2GHz!

There should be some way to split tasks (like antialiasing) off to the CPU, which is so much faster. Anyway, I don't know if my thought is totally off... but maybe some developers get an idea out of it :)

greets

Well, the problem with that is that the CPU is a generic processing device (for miscellaneous calculations), while the GPU is a special-purpose processor designed specifically for graphics work (like AA). If you tried to use the CPU for something like AA it would be VERY slow, because it simply isn't designed for it. Heck, even the most powerful consumer processors nowadays still can't do software graphics quickly (software OpenGL, for instance).

Per pixel operations on a CPU are in general VERY slow (AA is a per pixel op).
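Just to make that concrete, here's a rough sketch (plain C, the buffer layout and sizes are made up purely for illustration) of what a simple 2x2 supersample resolve looks like in software. Every output pixel costs several memory reads plus a dozen or so ALU ops, and the CPU has to grind through them one at a time:

Code:
/* Sketch only: a 2x2 "render big, then average down" AA resolve in plain C.
   32-bit ARGB pixels; sizes are just assumptions for illustration. */
#include <stdint.h>
#include <stdlib.h>

/* src is (2*w) x (2*h) pixels, dst is w x h pixels */
static void resolve_2x2(const uint32_t *src, uint32_t *dst, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            /* four source samples per output pixel */
            uint32_t p0 = src[(2*y    ) * (2*w) + 2*x    ];
            uint32_t p1 = src[(2*y    ) * (2*w) + 2*x + 1];
            uint32_t p2 = src[(2*y + 1) * (2*w) + 2*x    ];
            uint32_t p3 = src[(2*y + 1) * (2*w) + 2*x + 1];

            uint32_t out = 0;
            for (int shift = 0; shift < 32; shift += 8) {
                /* unpack one 8-bit channel from each sample, average, repack */
                uint32_t c = ((p0 >> shift & 0xFF) + (p1 >> shift & 0xFF) +
                              (p2 >> shift & 0xFF) + (p3 >> shift & 0xFF)) / 4;
                out |= c << shift;
            }
            dst[y * w + x] = out;
        }
    }
}

int main(void)
{
    int w = 1024, h = 768;
    uint32_t *src = calloc((size_t)(2*w) * (2*h), sizeof *src);
    uint32_t *dst = calloc((size_t)w * h, sizeof *dst);
    resolve_2x2(src, dst, w, h);   /* ~786,000 output pixels, dozens of ops each */
    free(src);
    free(dst);
    return 0;
}

A GPU does the equivalent of that inner work for several pixels at once, every clock, in dedicated hardware.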

Hope that answers your question...
 
Re: Re: Can CPU be used for tasks the gfx card is too slow for (AntiAliasing etc.) ?

NitroGL said:


Per pixel operations on a CPU are in general VERY slow (AA is a per pixel op).

Hope that answers your question...


Why would a per pixel op on a CPU differ from a per pixel op on a GPU? I think the basics should be the same, whether it's a memory region in system memory you work on or a region in video memory (which, simplified, is then shown on screen as the image): a memory region of longwords, say, with each 32-bit word == 1 pixel?

The real problem/bottleneck would probably be how the CPU accesses the video memory. For the GPU that's no problem, but the CPU would depend on the PCI/AGP bus (or whatever)... (correct me if I'm wrong)

They could implement a special SSE/3DNow! instruction in the CPU that does something similar to what antialiasing does (if it's not there already :) ?
 
Well, for one thing, CPUs are still CISC based (for the most part), while GPUs are all RISC (as far as I know, anyway). There are a lot of things that could factor into the CPU being slower at that kind of work. Like I said, the GPU is designed to do that kind of stuff and nothing more, while a CPU is designed to do a lot more, but it does it slower, because it has to stay adaptable.

Yeah, the bus has a lot to do with it too. For instance, the GPU doesn't run its memory at some multiple of a slower bus clock (like a CPU does); the bus between the GPU and the VRAM runs at the speed the core runs at (in a synchronous design anyway - that's why an asynchronous design can cause problems, since core speed <> mem speed can conflict), so there's a direct path for data to flow freely. Now, if you had that kind of path between your PC's system memory and the CPU, it would be a LOT faster, but the CPU is still too generic to do that kind of stuff fast, because it simply wasn't designed for it.
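As for the SSE idea: a packed-byte averaging instruction does exist (PAVGB, exposed as the _mm_avg_epu8 intrinsic in SSE2), so blending two rows of samples can be done 16 bytes at a time. Here's a rough sketch (buffer names and the alignment/size assumptions are mine), but note it only speeds up the arithmetic; it does nothing about getting the pixels across the bus:

Code:
/* Sketch only: average two rows of 32-bit pixels with SSE2's packed byte average.
   Assumes 16-byte aligned buffers and a pixel count that is a multiple of 4. */
#include <emmintrin.h>
#include <stdint.h>

static void average_rows_sse2(const uint32_t *row0, const uint32_t *row1,
                              uint32_t *out, int count)
{
    for (int i = 0; i < count; i += 4) {                  /* 4 pixels = 16 bytes per step */
        __m128i a = _mm_load_si128((const __m128i *)(row0 + i));
        __m128i b = _mm_load_si128((const __m128i *)(row1 + i));
        _mm_store_si128((__m128i *)(out + i), _mm_avg_epu8(a, b));
    }
}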
 
I see this kind of question quite often. All I can do is frown. How can you compress years of learning into a paragraph to explain why certain things just don't work like that (to what one would consider a daft Q)? :bleh:

The simple answer is: if CPUs could do such a job, why would you have a graphics card that is more complex than a Pentium 4?
 
Also, GPUs aren't even RISC. They are practically pure hardware.

The GPU is written to perform a fixed function (generally). It is pure logic. And when I say "written" I mean the logic is designed in a high-level language that compiles down from code to gates to transistors to slivers of silicon. Usually VHDL, or Verilog.

Just for the fun of it, here's how you make the lights on an audio graphic equaliser:

Code:
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- 3-bit binary level in, 7-LED "bar" out
entity BarGraphDecoder is
	port (
			BinInput	:in  STD_LOGIC_VECTOR (2 downto 0);
			LinOutput	:out STD_LOGIC_VECTOR (6 downto 0);
			Enable		:in  STD_LOGIC
		 );
end BarGraphDecoder;

architecture BarGraphDecoder of BarGraphDecoder is
begin
	-- purely combinational: every signal the process reads is in the sensitivity list
	process (BinInput, Enable)
	begin
		if Enable = '1' then
			case BinInput is
				when "000" => LinOutput <= "0000000";
				when "001" => LinOutput <= "0000001";
				when "010" => LinOutput <= "0000011";
				when "011" => LinOutput <= "0000111";
				when "100" => LinOutput <= "0001111";
				when "101" => LinOutput <= "0011111";
				when "110" => LinOutput <= "0111111";
				when "111" => LinOutput <= "1111111";
				when others => LinOutput <= "0000000";
			end case;
		else
			LinOutput <= "0000000";
		end if;
	end process;
end BarGraphDecoder;

Could have got a nice job doing VHDL - dead easy too, very good money... but I just had to go into Visual Basic and electronics. Doh!
 
Take a simple case: pure Gouraud shading, no texturing. Even in such a simple case, a CPU will take several cycles to produce every single pixel, while a GPU will produce several pixels every cycle.

A graphics card at 300MHz with 4 pipelines can produce 1.2 Gpixels/s. A 2GHz P4 with very optimized code may be able to produce around 0.1 Gpixels/s - pure estimation. Now add texturing: point sampling for simplicity, small texture. The graphics card may go down to 1.0 Gpixels/s, while the software is now at 0.01 Gpixels/s. Add mipmapping, filtering, projection etc. and the graphics card is still doing around 1.0 Gpixels/s, while the software is now at 0.001 Gpixels/s. And so on...
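To translate those rates into frame rates (rough arithmetic only; the resolution and overdraw factor below are my own assumptions):

Code:
/* Back-of-the-envelope frame rates at the pixel rates quoted above.
   1024x768 with 3x overdraw - both figures are just assumptions. */
#include <stdio.h>

int main(void)
{
    const double pixels_per_frame = 1024.0 * 768.0 * 3.0;     /* ~2.36 Mpixels */
    const double rates[] = { 1.2e9, 0.1e9, 0.01e9, 0.001e9 }; /* pixels/s, from above */
    const char *label[] = { "GPU", "CPU, flat shaded", "CPU, textured", "CPU, filtered" };

    for (int i = 0; i < 4; i++)
        printf("%-18s %8.1f fps\n", label[i], rates[i] / pixels_per_frame);
    return 0;
}

Even with those generous assumptions the filtered software path ends up well under one frame per second.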

As an example, my dynamic per-pixel lighting demo on my homepage runs at around 60-70 fps on a Radeon. On a GF2 Pro I tested it on, which doesn't support 3D texturing in hardware (and thus dropped into software mode), I had to wait 10-15 seconds for the first frame to appear.
 
Well, for one CPU's are still CISC based (for the most part), while GPU's are all RISC (as far as I know anyway).

Actually, I just wanted to say that for the most part, today's CPUs are more RISC-like internally than they are CISC.

I think most of it centers on the fact that the GPU is designed to do a select few functions as fast as possible while the CPU is designed to do as MANY functions as possible.
 
CLxyz said:


Actually, I just wanted to say that for the most part, today's CPUs are more RISC-like internally than they are CISC.

That's only half true.

The pre-Athlon AMD and Cyrix CPUs were RISC, using a translation layer to convert the CISC instructions to RISC ones. Pentium 1's were pure x86. Then AMD designed the K7, which was their first pure x86 CISC CPU. However, just before that, Intel created the Pentium Pro (and P2 etc.), which used a RISC core like the old AMDs and Cyrixes. Weird, eh!
 
Then AMD designed the K7 which was their first pure x86 CISC CPU.

I'm pretty sure the K7 is a RISC-like core in the same way the P6 and P7 architectures are RISC-like. And anyway, the K7 was more of an Alpha chip than an AMD chip ;) But this is really off topic.
 
Re: Can CPU be used for tasks the gfx card is too slow for (AntiAliasing etc.) ?

flexy said:

There should be some way to split tasks (like antialiasing) off to the CPU, which is so much faster. Anyway, I don't know if my thought is totally off... but maybe some developers get an idea out of it :)

If you do antialiasing on the CPU, that means you have to do everything on the CPU.

If you do everything on the CPU, then your frame buffer is in system RAM. AGP 4X is what, 1GB/s worth of bandwidth? Compare that with the 6-8GB/s the video card has natively. Of course, the CPU has to read and write system RAM as well, so you'll need DDR to actually see anywhere close to 1GB/s sustained. You have to write every pixel of the frame buffer, and that has to happen at 1GB/s if you want to see it transferred at that rate across the AGP bus. Of course, writing random pixels to system memory isn't very interesting, so you want to read data from memory as well; 3GB/s of system RAM bandwidth ought to be enough to give you 1GB/s worth of AGP bandwidth for a frame buffer. And of course the game code might actually want to run too, and all those CPU RAM accesses don't leave a lot of time for that. :)
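Some rough numbers to back that up (the resolution, colour depth, frame rate and overdraw factor here are just assumptions):

Code:
/* Back-of-the-envelope traffic for a software-rendered frame buffer.
   1024x768, 32-bit colour, 60 fps, 5x overdraw - all assumptions. */
#include <stdio.h>

int main(void)
{
    const double bytes_per_frame = 1024.0 * 768.0 * 4.0;   /* ~3 MB per finished frame */
    const double frames_per_sec  = 60.0;
    const double agp4x_bw        = 1.0e9;                  /* ~1 GB/s */

    double frame_bw = bytes_per_frame * frames_per_sec;    /* ~189 MB/s */
    printf("finished frames only: %.0f MB/s (%.0f%% of AGP 4X)\n",
           frame_bw / 1e6, 100.0 * frame_bw / agp4x_bw);

    /* with 5x overdraw plus texture reads, the CPU touches far more than that
       in system RAM before anything even crosses the bus */
    printf("5x overdraw writes:   %.0f MB/s of system RAM traffic\n",
           5.0 * frame_bw / 1e6);
    return 0;
}

So just shipping the finished frames across AGP is survivable; it's all the system RAM reads and writes behind them that eat the bandwidth the game code also needs.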
 
CLxyz said:


I'm pretty sure the K7 is a RISC-like core in the same way the P6 and P7 architectures are RISC-like. And anyway, the K7 was more of an Alpha chip than an AMD chip ;) But this is really off topic.

The only thing Alpha in the Athlon is the CPU-memory bus.
 
A GPU does operations on a pixel in a single clock cycle that would take 10-50 or more cycles on a generic CPU. GPUs also have multiple pixel pipes, which parallelize the work further, and some pixel operations are implemented directly as circuits on the ASIC rather than by any kind of code - that's why they are so much faster. Also, if you use the GPU to manipulate video memory, you don't have to pump pixels back and forth over the system bus, which is a good thing because the bus stays free for other devices. The number of operations a GPU performs per on-screen pixel nowadays can easily bring even the fastest available general-purpose processor to its knees.

Vahur
 