Looking for advice on how to speed up my compute shader


I have 4 render targets in R10G10B10A2 format, each used to store a color and a normal in a compact double RGB555 layout. During the lighting pass I'm obliged to use extra render targets: I can read (extract) the normal and color from these surfaces, but I can't write the lit color directly back to them.
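
To make the "double RGB555" idea concrete, here is a minimal sketch of one way two 5-bit values can share a 10-bit channel. The bit layout (color in the low 5 bits, normal in the high 5 bits) and all the names are assumptions for illustration; my actual encoding is not shown in this post.

```cpp
#include <cstdint>

// Hypothetical layout: in each 10-bit channel, bits 0-4 hold one color
// component and bits 5-9 hold one normal component.

// Quantize a float in [0, 1] to a 5-bit integer in [0, 31].
inline uint8_t Quantize5(float v) {
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<uint8_t>(v * 31.0f + 0.5f);
}

// Combine a 5-bit color component and a 5-bit normal component
// into a single 10-bit channel value.
inline uint16_t PackChannel(uint8_t color5, uint8_t normal5) {
    return static_cast<uint16_t>(((normal5 & 0x1Fu) << 5) | (color5 & 0x1Fu));
}

// Extract the 5-bit color component from a 10-bit channel value.
inline uint8_t UnpackColor(uint16_t channel10) {
    return static_cast<uint8_t>(channel10 & 0x1Fu);
}

// Extract the 5-bit normal component from a 10-bit channel value.
inline uint8_t UnpackNormal(uint16_t channel10) {
    return static_cast<uint8_t>((channel10 >> 5) & 0x1Fu);
}
```

With only 32 levels per component, some banding in the output is to be expected from this kind of packing.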

So I moved to a compute shader to achieve this (DX11 CS5). I found that I'm limited to R32_UINT for read/write in DX11 CS5, so I adapted my encoding/decoding to make things work, although the final colors look a little ugly with my encoding.

Unfortunately, the final frame rate is lower than with the regular pixel shader, even though my previous method has to do 4 CopyResource calls from the intermediate render targets.

I implemented the CS starting from the ones found in the legacy DX11 samples. My code looks like this:

CPU calls:

 // some CS constant buffer and CS shader resource view settings (e.g. shadow maps, depth textures)
 gpDC11->CSSetShader(pCS, 0, 0);
 // I'm working on 4 UAVs at the same time in the CS shader. With only one it is even slower
 gpDC11->CSSetUnorderedAccessViews(0, 4, gSRDeffered.ppUAV, (UINT*)(&gSRDeffered.ppUAV));
 gpDC11->Dispatch(960, 540, 1);//the size of my screen
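
For reference, Dispatch takes the number of thread groups, and each group runs the number of threads declared in numthreads, so the call above launches 960 x 540 groups of a single thread each. Below is a minimal sketch of how the group counts would be computed for a larger group size. The helper names and the 8x8 example are hypothetical and not part of my code, and I have not measured whether this changes the frame rate.

```cpp
#include <cstdint>

// Thread groups needed to cover `pixels` when each group handles
// `groupSize` of them along one axis (ceiling division).
constexpr uint32_t GroupCount(uint32_t pixels, uint32_t groupSize) {
    return (pixels + groupSize - 1) / groupSize;
}

// Arguments for ID3D11DeviceContext::Dispatch(x, y, z).
struct DispatchDims {
    uint32_t x, y, z;
};

// Group counts for a 2D full-screen pass with the given group size,
// which must match the [numthreads(X, Y, 1)] declared in the shader.
constexpr DispatchDims DispatchDimsFor(uint32_t width, uint32_t height,
                                       uint32_t groupSizeX, uint32_t groupSizeY) {
    return { GroupCount(width, groupSizeX), GroupCount(height, groupSizeY), 1 };
}
```

For a 960 x 540 screen, an 8x8 group size gives 120 x 68 groups. Since 68 * 8 = 544, the shader would have to skip threads whose nDTid.y is 540 or more.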

GPU CS shader:

[numthreads(1,1,1)]
void CS_PostDeferred( uint3 nGid : SV_GroupID, uint3 nDTid : SV_DispatchThreadID, uint3 nGTid : SV_GroupThreadID ) // only nDTid is actually used here
{
    uint Output;
    // Normalized texture coordinates of this pixel (the screen is 960x540)
    float2 Tex = float2(nDTid.x/960.0, nDTid.y/540.0);
    float Depth1 = txDepth1.SampleLevel(samPoint, Tex, 0).r;
    float Depth2 = txDepth2.SampleLevel(samPoint, Tex, 0).r;
    float Depth3 = txDepth3.SampleLevel(samPoint, Tex, 0).r;
    float Depth4 = txDepth4.SampleLevel(samPoint, Tex, 0).r;

    // For each of the 4 UAVs: read the packed value, light it if something was drawn there, write it back
    Output = UAVDiffuse0[nDTid.xy];
    if ( Depth1 < 1 ) Output = GetLColorUnPackPack_CS(Depth1, Tex, Output);
    UAVDiffuse0[nDTid.xy] = Output;
    Output = UAVDiffuse1[nDTid.xy];
    if ( Depth2 < 1 ) Output = GetLColorUnPackPack_CS(Depth2, Tex, Output);
    UAVDiffuse1[nDTid.xy] = Output;
    Output = UAVDiffuse2[nDTid.xy];
    if ( Depth3 < 1 ) Output = GetLColorUnPackPack_CS(Depth3, Tex, Output);
    UAVDiffuse2[nDTid.xy] = Output;
    Output = UAVDiffuse3[nDTid.xy];
    if ( Depth4 < 1 ) Output = GetLColorUnPackPack_CS(Depth4, Tex, Output);
    UAVDiffuse3[nDTid.xy] = Output;
}

The function uint GetLColorUnPackPack_CS(float Depth, float2 UV, uint Data) extracts the color and the normal as two float3 values from the R32_UINT Data parameter, computes lighting and shadows for the pixel as usual, and then repacks the final color into the R32_UINT (the normal is left unchanged). The Depth and UV parameters are used to recover the view-space position needed for the point lights I'm using.

Any advice is welcome.
