I have 4 render targets in R10G10B10A2 format, each used to store a color and a normal packed together in a compact double RGB 555 format. When I'm doing the lighting pass I'm obliged to use extra render targets, because while I can read (extract) the normal/color, I can't write the lit color back directly to these surfaces.
So I moved to a compute shader (DX11 CS 5.0) to achieve this goal, since a CS can read and write the same resource through a UAV. I have found that I'm limited to R32_UINT for typed read/write in CS 5.0. I have thus adapted my encoding/decoding to make things work, although the final colors are a little bit ugly with my encoding.
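For context, here is a minimal C++ sketch of the kind of packing this implies: two 15-bit RGB555 values (say, the normal in the high half and the color in the low half) squeezed into one 32-bit value so they fit the R32_UINT restriction. `PackRGB555`/`UnpackRGB555` are illustrative names and the exact bit layout is an assumption, not my actual code:

```cpp
#include <cstdint>

// Hypothetical sketch: quantize an RGB triple in [0,1] to 5 bits per channel
// (15 bits total), so two such values fit in a single R32_UINT texel.
uint32_t PackRGB555(float r, float g, float b)
{
    uint32_t ri = (uint32_t)(r * 31.0f + 0.5f) & 0x1F;
    uint32_t gi = (uint32_t)(g * 31.0f + 0.5f) & 0x1F;
    uint32_t bi = (uint32_t)(b * 31.0f + 0.5f) & 0x1F;
    return (ri << 10) | (gi << 5) | bi; // 15 bits used
}

void UnpackRGB555(uint32_t v, float& r, float& g, float& b)
{
    r = ((v >> 10) & 0x1F) / 31.0f;
    g = ((v >> 5)  & 0x1F) / 31.0f;
    b = ( v        & 0x1F) / 31.0f;
}

// Two 15-bit fields per texel, e.g.:
// uint32_t texel = (PackRGB555(nx, ny, nz) << 16) | PackRGB555(r, g, b);
```

The 5-bit quantization is also why the final colors look worse than with the R10G10B10A2 path (10 bits per channel).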
Unfortunately the final frame rate is lower than with the regular pixel shader, even though the previous method requires 4 CopyResource calls from the intermediate render targets.
I have implemented the CS starting from those found in the legacy DX11 samples. My code is like this:
```cpp
// some CS constant buffer and shader resource view settings
// (e.g. shadow maps, depth textures)
gpDC11->CSSetShader(pCS, 0, 0);
// I'm working on 4 UAVs at the same time in the CS shader.
// With only one it is even slower.
// Note: the last argument is pUAVInitialCounts; it should be null
// (or an array of UINTs) rather than a cast of the UAV pointer array.
gpDC11->CSSetUnorderedAccessViews(0, 4, gSRDeffered.ppUAV, nullptr);
gpDC11->Dispatch(960, 540, 1); // the size of my screen
```
GPU CS shader:
```hlsl
[numthreads(1, 1, 1)]
void CS_PostDeferred( uint3 nGid  : SV_GroupID,
                      uint3 nDTid : SV_DispatchThreadID,
                      uint3 nGTid : SV_GroupThreadID ) // only nDTid is used in fact here
{
    uint Output;
    float2 Tex = float2(nDTid.x / 960.0, nDTid.y / 540.0);

    float Depth1 = txDepth1.SampleLevel(samPoint, Tex, 0).r;
    float Depth2 = txDepth2.SampleLevel(samPoint, Tex, 0).r;
    float Depth3 = txDepth3.SampleLevel(samPoint, Tex, 0).r;
    float Depth4 = txDepth4.SampleLevel(samPoint, Tex, 0).r;

    Output = UAVDiffuse0[nDTid.xy];
    if (Depth1 < 1) Output = GetLColorUnPackPack_CS(Depth1, Tex, Output);
    UAVDiffuse0[nDTid.xy] = Output;

    Output = UAVDiffuse1[nDTid.xy];
    if (Depth2 < 1) Output = GetLColorUnPackPack_CS(Depth2, Tex, Output);
    UAVDiffuse1[nDTid.xy] = Output;

    Output = UAVDiffuse2[nDTid.xy];
    if (Depth3 < 1) Output = GetLColorUnPackPack_CS(Depth3, Tex, Output);
    UAVDiffuse2[nDTid.xy] = Output;

    Output = UAVDiffuse3[nDTid.xy];
    if (Depth4 < 1) Output = GetLColorUnPackPack_CS(Depth4, Tex, Output);
    UAVDiffuse3[nDTid.xy] = Output;
}
```
The function uint GetLColorUnPackPack_CS(float Depth, float2 UV, uint Data) extracts the color and normal as two float3s from the R32_UINT Data parameter, computes the usual lighting and shadows for the pixel, and then repacks the final color into the R32_UINT (the normal is not changed). The Depth and UV parameters are used to recover the view-space position needed for the point lights I'm using.
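To make the unpack/relight/repack flow concrete, here is a hypothetical C++ sketch of that function's structure (not my actual HLSL). The field layout (normal in the high 16 bits, color in the low 15) and the trivial N·L term standing in for the real point-light/shadow evaluation are assumptions for illustration:

```cpp
#include <algorithm>
#include <cstdint>

// Extract one 5-bit channel and normalize to [0,1].
static float Chan(uint32_t v, int shift) { return ((v >> shift) & 0x1F) / 31.0f; }

// Sketch of the unpack -> light -> repack idea: relight the packed color
// while leaving the normal bits in the high half of the texel untouched.
uint32_t GetLColorUnPackPack(uint32_t data, float ndotl)
{
    // Unpack the color from the low 15 bits.
    float r = Chan(data, 10), g = Chan(data, 5), b = Chan(data, 0);

    // Stand-in for the real lighting/shadow computation.
    float k = std::max(0.0f, std::min(1.0f, ndotl));
    r *= k; g *= k; b *= k;

    // Repack the lit color.
    uint32_t lit = ((uint32_t)(r * 31.0f + 0.5f) << 10)
                 | ((uint32_t)(g * 31.0f + 0.5f) << 5)
                 |  (uint32_t)(b * 31.0f + 0.5f);

    return (data & 0xFFFF0000u) | lit; // keep the normal bits as-is
}
```

The masked merge at the end is what lets the same R32_UINT texel serve as both input and output of the lighting pass.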
Any advice welcome