Tiny ATI_fragment_shader + Radeon 9700 Pro issue. . .

Ostsol

It's a strange issue. . . one that didn't pop up on my Radeon 8500. My little OpenGL program's bump mapping implementation through fragment shaders does not work on my Radeon 9700. It was working perfectly until I upgraded. The strange part is that every other OpenGL demo that uses fragment shaders still works perfectly. My own program shows up as black, though.

Here's the fragment shader program:

Code:
glSampleMapATI (GL_REG_0_ATI, GL_TEXTURE0_ARB, GL_SWIZZLE_STQ_ATI);
glSampleMapATI (GL_REG_2_ATI, GL_TEXTURE1_ARB, GL_SWIZZLE_STR_ATI);
glSampleMapATI (GL_REG_3_ATI, GL_TEXTURE3_ARB, GL_SWIZZLE_STR_ATI);
// LIGHT 1
// N.L
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI);

// add colour
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_CON_1_ATI, GL_NONE, GL_NONE);

// LIGHT 2
// N.L
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI);

// add colour
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_CON_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_3_ATI, GL_NONE, GL_NONE);

// reg1 = base map
// reg2 = light1 coloured N.L
// reg3 = light2 coloured N.L
// reg4 = light1 attenuation map
// reg5 = light2 attenuation map
glSampleMapATI (GL_REG_1_ATI, GL_TEXTURE0_ARB, GL_SWIZZLE_STQ_ATI);
glPassTexCoordATI (GL_REG_2_ATI, GL_REG_2_ATI, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_3_ATI, GL_REG_3_ATI, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_4_ATI, GL_TEXTURE2_ARB, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_5_ATI, GL_TEXTURE4_ARB, GL_SWIZZLE_STR_ATI);

// LIGHT 1
// generate the light map via attenuation
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_4_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_4_ATI, GL_NONE, GL_NONE,
			   GL_REG_4_ATI, GL_NONE, GL_NONE);

// apply attenuation to N.L
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_4_ATI, GL_NONE, GL_COMP_BIT_ATI);

// LIGHT 2
// generate the light map via attenuation
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_5_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_5_ATI, GL_NONE, GL_NONE,
			   GL_REG_5_ATI, GL_NONE, GL_NONE);

// apply attenuation to N.L and add to light 1
glColorFragmentOp3ATI (GL_MAD_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_REG_5_ATI, GL_NONE, GL_COMP_BIT_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE);

// add ambient light
glColorFragmentOp2ATI (GL_ADD_ATI,
			   GL_REG_0_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_CON_0_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE);

// modulate with the base texture
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_0_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_1_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_NONE);

Upon experimenting, I found that the shader itself does execute. If I comment out most of it and write the result of the first couple of instructions to register 0, I get visible output. However, as soon as I restore the rest, the whole program stops working again.
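For anyone who wants to check what the fragment should look like instead of black, here is a rough CPU-side sketch of the arithmetic the listing above performs, for a single colour channel. The `fragment_inputs` struct and `shade_reference` name are made up for illustration; the sketch also ignores the hardware's internal register range and precision, so treat it as an approximation rather than a bit-exact reference.

```c
#include <assert.h>

typedef struct { float x, y, z; } vec3;

/* Clamp to [0, 1], as GL_SATURATE_BIT_ATI does on a destination. */
static float saturate(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

static float dot3(vec3 a, vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

/* GL_BIAS_BIT_ATI | GL_2X_BIT_ATI on a source operand: (v - 0.5) * 2. */
static vec3 expand(vec3 v)
{
    vec3 r = { 2.0f * (v.x - 0.5f), 2.0f * (v.y - 0.5f), 2.0f * (v.z - 0.5f) };
    return r;
}

/* Hypothetical container for one fragment's inputs (one colour channel). */
typedef struct {
    vec3  normal;   /* r0: normal map texel, stored in [0, 1]     */
    vec3  light1;   /* r2: light 1 vector texel, stored in [0, 1] */
    vec3  light2;   /* r3: light 2 vector texel, stored in [0, 1] */
    vec3  atten1;   /* r4: light 1 attenuation texcoord           */
    vec3  atten2;   /* r5: light 2 attenuation texcoord           */
    float base;     /* r1: base map texel                         */
    float ambient;  /* GL_CON_0_ATI */
    float colour1;  /* GL_CON_1_ATI */
    float colour2;  /* GL_CON_2_ATI */
} fragment_inputs;

static float shade_reference(const fragment_inputs *in)
{
    assert(in != 0);

    /* First pass: coloured N.L for each light (not saturated). */
    float r2 = dot3(expand(in->normal), expand(in->light1)) * in->colour1;
    float r3 = dot3(expand(in->normal), expand(in->light2)) * in->colour2;

    /* Second pass: attenuation is 1 - saturate(d.d); GL_COMP_BIT_ATI is 1 - x. */
    float r4 = saturate(dot3(in->atten1, in->atten1));
    r2 = r2 * (1.0f - r4);
    float r5 = saturate(dot3(in->atten2, in->atten2));
    r2 = saturate(r3 * (1.0f - r5) + r2);   /* GL_MAD_ATI, sums both lights */

    float r0 = saturate(in->ambient + r2);  /* add ambient light */
    return saturate(in->base * r0);         /* modulate with base texture */
}
```

With a normal and light 1 vector both pointing straight out of the surface, light 1 at zero distance and light 2 fully attenuated, this gives the base texel at full brightness, which is a quick sanity value to compare against the framebuffer.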

Upon further experimentation I got to this point:

Code:
glSampleMapATI (GL_REG_0_ATI, GL_TEXTURE0_ARB, GL_SWIZZLE_STQ_ATI);
glSampleMapATI (GL_REG_2_ATI, GL_TEXTURE1_ARB, GL_SWIZZLE_STR_ATI);
glSampleMapATI (GL_REG_3_ATI, GL_TEXTURE3_ARB, GL_SWIZZLE_STR_ATI);

// workaround: no-op "mov r0, r0" so that the first instruction writes to r0
glColorFragmentOp1ATI (GL_MOV_ATI,
			   GL_REG_0_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_NONE);

// LIGHT 1
// N.L
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI);

// add colour
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_CON_1_ATI, GL_NONE, GL_NONE);

// LIGHT 2
// N.L
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_BIAS_BIT_ATI|GL_2X_BIT_ATI);

// add colour
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_CON_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_3_ATI, GL_NONE, GL_NONE);

// reg1 = base map
// reg2 = light1 coloured N.L
// reg3 = light2 coloured N.L
// reg4 = light1 attenuation map
// reg5 = light2 attenuation map
glSampleMapATI (GL_REG_1_ATI, GL_TEXTURE0_ARB, GL_SWIZZLE_STQ_ATI);
glPassTexCoordATI (GL_REG_2_ATI, GL_REG_2_ATI, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_3_ATI, GL_REG_3_ATI, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_4_ATI, GL_TEXTURE2_ARB, GL_SWIZZLE_STR_ATI);
glPassTexCoordATI (GL_REG_5_ATI, GL_TEXTURE4_ARB, GL_SWIZZLE_STR_ATI);

// LIGHT 1
// generate the light map via attenuation
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_4_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_4_ATI, GL_NONE, GL_NONE,
			   GL_REG_4_ATI, GL_NONE, GL_NONE);

// apply attenuation to N.L
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE,
			   GL_REG_4_ATI, GL_NONE, GL_COMP_BIT_ATI);

// LIGHT 2
// generate the light map via attenuation
glColorFragmentOp2ATI (GL_DOT3_ATI,
			   GL_REG_5_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_5_ATI, GL_NONE, GL_NONE,
			   GL_REG_5_ATI, GL_NONE, GL_NONE);

// apply attenuation to N.L and add to light 1
glColorFragmentOp3ATI (GL_MAD_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_3_ATI, GL_NONE, GL_NONE,
			   GL_REG_5_ATI, GL_NONE, GL_COMP_BIT_ATI,
			   GL_REG_2_ATI, GL_NONE, GL_NONE);

// add ambient light
glColorFragmentOp2ATI (GL_ADD_ATI,
			   GL_REG_0_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_CON_0_ATI, GL_NONE, GL_NONE,
			   GL_REG_2_ATI, GL_NONE, GL_NONE);

// modulate with the base texture
glColorFragmentOp2ATI (GL_MUL_ATI,
			   GL_REG_0_ATI, GL_NONE, GL_SATURATE_BIT_ATI,
			   GL_REG_1_ATI, GL_NONE, GL_NONE,
			   GL_REG_0_ATI, GL_NONE, GL_NONE);

The fragment shader has to use register zero in the first phase, or else the thing won't work. It also has to be doing this early in the phase, so I can't simply output to register zero for the second light. That seemingly useless "mov r0, r0" is the only thing that is allowing this damn thing to work. As I said before, though: no other demo that uses ATI_fragment_shader that I've tested has this problem.

Any clue as to what's going on?

Update: This is strange. I put the "mov r0, r0" as the second instruction and the thing doesn't work.

There was another alternative I used where I had "mov r1, r0" instead, and output light 1 to r0 and then used r1 for light 2's normal map (the second phase modified accordingly) and it worked.

In another test I commented out everything except for light 1's first two instructions. I set it so instruction 2 would output to r0. The damn thing refused to work. If I set the first instruction to output to r0 and modified instruction two accordingly, the thing works.

There's no pattern. I'm almost about to say that the first instruction has to output to r0, but one of my alternate methods (mov r1, r0) proves that wrong.

:hmm:
 
Humus said:
Seems like a driver bug to me. I suggest you send your app to [email protected] with your comments.

Hmm. . . that's possible. . . It's just strange that this problem appears localized specifically to my program and -only- my program. Hmm. . . Do you have a Radeon 9700? If you do, perhaps I could send the thing to you to test on your system. . . (Or anyone else reading this thread.)
 
No, I'm still on my 8500 and waiting for my 9700 to arrive. I ordered one directly from ATi with the 40% developer discount. I kinda thought they would prioritize developers a little higher regardless of how insignificant they are. It's been a month and a half now, and I have reminded them a few times, but it still hasn't arrived and probably hasn't been shipped yet either. I was told a week or two ago that it was supposed to ship that day ... I suppose it didn't :(
 
Humus said:
No, I'm still on my 8500 and waiting for my 9700 to arrive. I ordered one directly from ATi with the 40% developer discount. I kinda thought they would prioritize developers a little higher regardless of how insignificant they are. It's been a month and a half now, and I have reminded them a few times, but it still hasn't arrived and probably hasn't been shipped yet either. I was told a week or two ago that it was supposed to ship that day ... I suppose it didn't :(

Eck. . . That sucks. . .

Well, I'll let this post stew and collect more views for today. If no one else replies, I'll email ATI.
 
Humus said:
I got my card a few days ago.

Humus, is the developer discount any better than the employee purchase plan? I worked at ATI for the past 2 summers (finished just before you started, I think), and have a few contacts who could get me some 9700s through the EPP.

I have since joined a software company and could probably get in touch with ATI's devrel. We could sure use a few 9700s. The question is whether I should ask my former colleagues to order some for me or go through devrel. Does devrel have a limit?
 
The devrel discount is 40% off the retail price; comes out to be about $240 with shipping. The only problem is that the waiting list is very long and if you're in a hurry to get a card I'd recommend just buying it through normal channels.
 
This "waiting list", why is that? It can't possibly be that they don't have enough cards.
 
Humus said:
This "waiting list", why is that? It can't possibly be that they don't have enough cards.

I wasn't expecting there to be a waiting list either; but when I emailed asking them what was taking so long to ship my cards that is what they told me.
 
DarkVamp said:
Hi,

Hey Humus, what magic are you planning to do with this great piece of hardware?

Hmm, let's see ... I plan to do .... stuff! :)

I hope there will be a driver with support for GL_ARB_fragment_program soon, and some extension for float buffers. Until then I'm working on other parts of my engine, like collision detection.
 
Hey, Humus. . . That issue I had is fixed in the 6200 drivers. What a bizarre bug. . .

BTW: Is EXT_vertex_shader compatible with ARB_fragment_program?
 
Sure, it doesn't matter to the fragment program how it gets fed its inputs, so EXT_vertex_shader should work. However, I would recommend using ARB_vertex_program instead.
 
Ok, another question about ARB_fragment_program:

Is it just me or are there no operand modifiers? It was so convenient in ATI_fragment_shader to have GL_BIAS_BIT_ATI|GL_2X_BIT_ATI to scale and bias a fragment. . .
 
Yeah, I was kind of disappointed about that (there's only a negate modifier). You can scale and bias after a TEX instruction though ( MAD texreg, texreg, 2.0, -1.0; ). It seems to me that doing it once as a pre-process would be faster than doing a scale and bias in every instruction that needs it. It could be a pain if you need the same texture both unscaled/unbiased and scaled/biased, though.
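As a quick sanity check that the MAD really is a drop-in replacement for GL_BIAS_BIT_ATI|GL_2X_BIT_ATI, here is a small C sketch that evaluates both forms on the CPU. The function names are made up for illustration, and it assumes the ATI modifier applies the bias before the 2x scale, i.e. (x - 0.5) * 2.

```c
#include <assert.h>

/* ATI_fragment_shader source modifier (assumed order: bias, then 2x). */
static float bias_2x(float x) { return (x - 0.5f) * 2.0f; }

/* ARB_fragment_program equivalent: MAD t, t, 2.0, -1.0; */
static float mad_expand(float x) { return x * 2.0f + -1.0f; }

/* Returns 1 if both forms agree to within eps for every 8-bit texel value. */
static int forms_agree(float eps)
{
    assert(eps >= 0.0f);
    for (int i = 0; i <= 255; ++i) {
        float x = (float)i / 255.0f;
        float d = bias_2x(x) - mad_expand(x);
        if (d < 0.0f)
            d = -d;
        if (d > eps)
            return 0;
    }
    return 1;
}
```

The two forms are algebraically the same, so any difference is floating-point rounding only. The cost is one extra ALU instruction per texture that needs expanding, paid once, rather than a modifier on every operand.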
 
My concern was that it uses up instructions. While for a simple 8 instruction bump-mapping demo this isn't a huge issue (when the max is 128), it could get to be with really large programs. *shrugs* Perhaps they'll be back in a future revision. I think D3D's PS2.0 has 'em, as does ATI_text_fragment_shader, so the R300 certainly is capable of having them.
 
I wouldn't worry about it too much. I don't think PS 2.0 has them though, at least I don't remember seeing that in the spec (I could be wrong of course :)).
 