
Ambient Occlusion in URP: Algorithm and Implementation

1 Alchemy SSAO

1.1 Intro

This chapter dissects the SSAO built into the URP pipeline, which is based on the Alchemy method. In the scene this post corresponds to, SSAO is used with the following configuration:

  • Deferred Rendering
  • Interleaved Gradient Noise
  • Rely on Depth+Normal
  • Bilateral Blur
  • Down-Sample Render Textures
  • AO in Lighting Calculations

With this configuration, the SSAO workflow looks like this:

  1. Render GBuffer
  2. Copy Depth
  3. SSAO
    1. Compute AO
    2. Horizontal blur
    3. Vertical blur
    4. Final blur
  4. Deferred Lighting

Next, let's walk through the shader passes used by SSAO's four draw calls, one by one.

1.2 Computing AO

The SSAO function outputs a half4 that packs the computed AO value together with the normal vector.
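
For reference, the packing helpers look roughly like this (paraphrased from URP's SSAO.hlsl; the exact channel layout there may differ): the AO value occupies one channel, and the normal, remapped from [-1, 1] to [0, 1], fills the remaining three so that the blur passes can read both back.

half4 PackAONormal(half ao, half3 n)
{
    // ao in r, normal remapped to [0, 1] in gba
    return half4(ao, n * 0.5 + 0.5);
}

half GetPackedAO(half4 p)
{
    return p.r;
}

half3 GetPackedNormal(half4 p)
{
    return p.gba * 2.0 - 1.0;
}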

First of all, SSAO is a depth-based post-processing effect, so we skip fragments whose depth value is 0:

// early out for sky
// -----------------
float currentRawDepth = SampleDepth(uv);
if (currentRawDepth < SKY_DEPTH_VALUE) return PackAONormal(HALF_ZERO, HALF_ZERO);

URP's SSAO implementation adds a maximum distance from the camera within which AO is visible, and this is also something we handle up front:

// early out for Falloff
// ---------------------
float currentLinearDepth = GetLinearEyeDepth(currentRawDepth);
half halfCurrentLinearDepth = half(currentLinearDepth);
if (halfCurrentLinearDepth > FALLOFF) return PackAONormal(HALF_ZERO, HALF_ZERO);

Next, fetch the normal of the current fragment. Since we implement SSAO in deferred rendering, Unity provides the _CameraNormalsTexture texture for us, so we can simply sample it with the screen-space coordinates:

// get normal for current fragment
// -------------------------------
half3 normalWS = half3(SampleSceneNormals(uv));

We perform the SSAO computation in view space, so we need to reconstruct, from the current fragment's screen-space coordinates and depth, the view-space vector pointing from the camera to the fragment. The reconstruction itself was covered in a previous post, so I won't repeat it here. The code is as follows:

// get camera->current fragment vector in view space
// -------------------------------------------------
float3 currentPosVS = ReconstructViewPos(uv, currentLinearDepth);
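
For completeness, here is a minimal sketch of the idea behind ReconstructViewPos for a perspective camera. The _InvProjScale uniform is hypothetical (it would hold 1/P[0][0] and 1/P[1][1] of the projection matrix), and the real URP helper also handles orthographic cameras and builds its frustum vectors on the C# side, so treat this as conceptual rather than the actual implementation:

float2 _InvProjScale; // hypothetical uniform: (1/P[0][0], 1/P[1][1]) of the projection matrix

float3 ReconstructViewPosSketch(float2 uv, float linearEyeDepth)
{
    // [0, 1] screen UV -> [-1, 1] NDC
    float2 ndc = uv * 2.0 - 1.0;
    // undo the projection scale to get the view-space xy of a point at unit depth,
    // then push the ray out to the fragment's linear eye depth
    float2 ray = ndc * _InvProjScale;
    return float3(ray * linearEyeDepth, -linearEyeDepth); // the camera looks down -Z in view space
}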

The core of SSAO is to analyze the geometric occlusion relationship of multiple sample points around each fragment and derive the ambient occlusion strength from them. This means the computation happens inside a loop: in each iteration we first pick a sample point, put it through a few space transforms, and then evaluate whether it occludes the fragment.

So before entering the loop, we need a little preparation for those transforms. Specifically, on the C# side we prepare a matrix that transforms world-space coordinates to clip space:

Matrix4x4 view = renderingData.cameraData.GetViewMatrix();
Matrix4x4 proj = renderingData.cameraData.GetProjectionMatrix();
mCameraViewProjection = proj * view; 

In the shader, we split _CameraViewProjection into two sets of parameters:

half3 camTransform000102 = half3(_CameraViewProjection._m00, _CameraViewProjection._m01, _CameraViewProjection._m02);
half3 camTransform101112 = half3(_CameraViewProjection._m10, _CameraViewProjection._m11, _CameraViewProjection._m12);

This extracts the first three elements of the first two rows of the view-projection matrix, which are exactly what we need to compute the X and Y components of the clip-space position.
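
In other words, for a position $p$ these two rows give the clip-space X and Y directly (the translation column is omitted, matching how the shader uses them below):

\[x_{CS}=m_{00}p_x+m_{01}p_y+m_{02}p_z,\qquad y_{CS}=m_{10}p_x+m_{11}p_y+m_{12}p_z\]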

Now we can move on to the body of the loop:

const half rcpSampleCount = half(rcp(SAMPLE_COUNT));
half ao = HALF_ZERO;
half indexHalf = HALF_MINUS_ONE;
UNITY_UNROLL
for (int sampleIndex = 0; sampleIndex < SAMPLE_COUNT; sampleIndex++)
{
    ...
}

In each iteration, the first thing we need is a sample point around the current fragment. Concretely, we start by generating a random vector inside the hemisphere oriented along the normal:

half3 PickSamplePointFromHemiSphere(float2 uv, int index, half indexHalf, half rcpSampleCount, half3 normal)
{
    // generate random noise
    // ---------------------
    const float2 positionSS = GetScreenSpacePosition(uv);
    const half noise = half(InterleavedGradientNoise(positionSS, index));

    // generate uniformly distributed spherical coordinates
    // ----------------------------------------------------
    const half cosPhi = frac(GetRandomVal(HALF_ZERO, index) + noise) * HALF_TWO - HALF_ONE;
    const half theta = (GetRandomVal(HALF_ONE, index) + noise) * HALF_TWO_PI;

    // generate cartesian coordinates on unit sphere from spherical coordinates
    // ------------------------------------------------------------------------
    const half sinPhi = half(sqrt(HALF_ONE - cosPhi * cosPhi));
    half3 v = half3(sinPhi * cos(theta), sinPhi * sin(theta), cosPhi);

    // stratify radial distribution for AO sampling (denser near, sparser far)
    // -----------------------------------------------------------------------
    // Note: sampleIndexHalf starts from 0.0
    v *= sqrt((indexHalf + HALF_ONE) * rcpSampleCount);

    // align the sample point with normal
    // ----------------------------------
    v = faceforward(v, -normal, v);

    // apply sample radius, and return
    // -------------------------------
    return v * RADIUS;
}
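
One detail worth highlighting is the radial stratification step: the sample radius grows with the square root of the (zero-based) sample index,

\[r_i = R\,\sqrt{\frac{i+1}{N}},\qquad i = 0,1,\dots,N-1\]

so early samples stay close to the shaded point while later ones approach the full radius $R$.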

Using the resulting random vector as an offset, we obtain the view-space position of the sample point:

// get sample point in view space
// ------------------------------
half3 randomVector = PickSamplePointFromHemiSphere(uv, sampleIndex, indexHalf, rcpSampleCount, normalWS);
half3 samplePosVS = half3(currentPosVS + randomVector);

Now that we have the view-space position of the random sample point, we can directly compute its theoretical depth in view space:

half theoreticalSampleDepthVS = half(-dot(UNITY_MATRIX_V[2].xyz, samplePosVS));

However, this theoretical depth knows nothing about the actual geometry in the scene. The relationship between the theoretical depth and the actual depth is exactly what the SSAO algorithm uses to measure occlusion. So how do we obtain the actual depth? Simple: compute the sample point's screen-space UV, sample the depth texture, and convert the result to linear depth. Concretely, the following space transforms are needed:

  1. Compute the sample point's absolute depth in view space (this is theoreticalSampleDepthVS)
  2. Compute the sample point's XY components in clip space (without the perspective divide)
  3. Apply the perspective divide and remap the coordinates to get the screen UV

The code for this process is as follows:

// get theoretical depth and sample screen-space uv
// ------------------------------------------------
half theoreticalSampleDepth = half(-dot(UNITY_MATRIX_V[2].xyz, samplePosVS));
half2 samplePosXYCS = half2
    (
        camTransform000102.x * samplePosVS.x + camTransform000102.y * samplePosVS.y + camTransform000102.z * samplePosVS.z,
        camTransform101112.x * samplePosVS.x + camTransform101112.y * samplePosVS.y + camTransform101112.z * samplePosVS.z
    );
half2 sampleUV = saturate(half2(samplePosXYCS * rcp(theoreticalSampleDepth) + HALF_ONE) * HALF_HALF);
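
The last line is the usual perspective divide followed by the NDC-to-UV remap; since the clip-space w of a perspective projection equals the (positive) view-space depth, this amounts to

\[uv_{sample}=\operatorname{saturate}\!\left(\left(\frac{xy_{CS}}{d_{theoretical}}+1\right)\times 0.5\right)\]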

With that, we can fetch the actual depth of the sample point:

// get raw/linear depth of sample point
// ------------------------------------
float rawSampleDepth = SampleDepth(sampleUV);
float actualSampleDepth = GetLinearEyeDepth(rawSampleDepth);

Next, we need to verify that the sample point is valid, which requires two conditions:

  • Depth-difference constraint: ensures the sample point lies within the effective radius. Specifically, SSAO exposes a radius parameter that defines the range of influence of the AO effect; only geometry within this radius should contribute to the fragment's occlusion. Constraining the depth difference keeps distant, unrelated geometry from interfering.
  • Sky culling: ensures the sample point is not the skybox
// We should not use the AO value if the sample point is outside the radius or if it is the sky...
// sample point should only contribute to AO if:
// --------------------------------------------
half halfActualSampleDepth = half(actualSampleDepth);
// 1. the sample point is inside the radius
half validAO = abs(theoreticalSampleDepth - halfActualSampleDepth) < RADIUS ? 1.0 : 0.0;
// 2. the sample point is not the sky
validAO *= rawSampleDepth > SKY_DEPTH_VALUE ? 1.0 : 0.0;

Now we can finally compute the occlusion contribution. The obscurance estimator used by Alchemy is:

\[AO=\frac{\max(v\cdot n-\beta\cdot d,\ 0)}{\|v\|^{2}+\epsilon}\times validAO\]

First we construct the vector $v$, pointing from the current fragment's view-space position to the view-space position of the actual geometry found at the sample point:

// get relative position of the sample point
// -----------------------------------------
half3 currentPosToSample = half3(ReconstructViewPos(sampleUV, actualSampleDepth) - currentPosVS);

What does this vector mean? Writing it as $v$, the dot product between $v$ and the normal measures the sample point's occlusion weight: the larger the angle between them, the smaller the occlusion contribution; the smaller the angle, the larger the contribution.

half dotVal = dot(currentPosToSample, normalWS);

Additionally, if the geometry is farther from the camera, its occlusion should be weakened accordingly, so we subtract a depth-dependent bias term:

dotVal -= kBeta * halfCurrentLinearDepth;

Next, the denominator. Here we use squared-distance attenuation, so nearby geometry has a stronger occlusion influence, which is closer to the physically correct result. The $\epsilon$ term prevents issues when $v$ is nearly the zero vector (the sample point almost coincides with the current fragment):

// estimate the obscurance value
// -----------------------------
half dotVal = dot(currentPosToSample, normalWS);
dotVal -= kBeta * halfCurrentLinearDepth;
half a1 = max(dotVal, HALF_ZERO);
half a2 = dot(currentPosToSample, currentPosToSample) + kEpsilon;
ao += a1 * rcp(a2) * validAO;

Once the loop is done, we apply the final processing to the AO value and pack it with the normal for output:

// intensity normalization
// -----------------------
ao *= RADIUS;

// calculate falloff
// -----------------
half falloff = HALF_ONE - halfCurrentLinearDepth * half(rcp(FALLOFF));
falloff = falloff * falloff;

// apply contrast + intensity + falloff^2
// --------------------------------------
ao = PositivePow(saturate(ao * INTENSITY * falloff * rcpSampleCount), kContrast);

// return the packed ao + normals
// ------------------------------
return PackAONormal(ao, normalWS);
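
Putting the loop body and this post-processing together, the value written out before packing is

\[AO=\left(\operatorname{saturate}\!\left(\frac{R\cdot I\cdot f^{2}}{N}\sum_{i=1}^{N}\frac{\max(v_i\cdot n-\beta d,\,0)}{\|v_i\|^{2}+\epsilon}\, validAO_i\right)\right)^{k_{contrast}},\qquad f=1-\frac{d}{d_{falloff}}\]

where $R$ is RADIUS, $I$ is INTENSITY, $N$ is SAMPLE_COUNT, $d$ is the current fragment's linear depth and $d_{falloff}$ is FALLOFF.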

1.3 Bilateral Blur

When computing ambient occlusion, SSAO must preserve depth edges (such as object silhouettes); otherwise blurring makes the AO leak across edges or produce artifacts. A bilateral blur separates edges using normal-based weights, so the occlusion only spreads within smooth regions while edges stay sharp. By comparison, a Gaussian blur destroys edges, and Kawase blur, although efficient, cannot satisfy the edge-preserving requirement.

The core logic of the bilateral blur in SSAO is to compare how similar two normal vectors are and turn that difference into a weighting factor:

half CompareNormal(half3 d1, half3 d2)
{
    return smoothstep(kGeometryCoeff, HALF_ONE, dot(d1, d2));
}

Here, kGeometryCoeff sets the geometry-awareness sensitivity of the bilateral filter: the higher the value, the more sensitive it is. The default is 0.8.
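
As a quick illustration of how sharply this cuts off across geometric edges with the default kGeometryCoeff of 0.8 (the normals below are made up for the example):

// dot(n0, n1) <= 0.8 maps to 0, dot(n0, n1) = 1 maps to 1, with a smooth ramp in between
half wSame  = CompareNormal(half3(0, 0, 1), half3(0, 0, 1));                  // dot = 1.0 -> weight 1.0
half wClose = CompareNormal(half3(0, 0, 1), normalize(half3(0.44, 0, 0.9)));  // dot ~ 0.9 -> weight ~ 0.5
half wEdge  = CompareNormal(half3(0, 0, 1), half3(1, 0, 0));                  // dot = 0.0 -> weight 0.0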

With that, the first two blur passes look like this:

half4 Blur(const float2 uv, const float2 offset) : SV_Target
{
    half4 p0 =  SAMPLE_BASEMAP(uv                       );
    half4 p1a = SAMPLE_BASEMAP(uv - offset * 1.3846153846);
    half4 p1b = SAMPLE_BASEMAP(uv + offset * 1.3846153846);
    half4 p2a = SAMPLE_BASEMAP(uv - offset * 3.2307692308);
    half4 p2b = SAMPLE_BASEMAP(uv + offset * 3.2307692308);

    half3 n0 = GetPackedNormal(p0);

    half w0  =                                           half(0.2270270270);
    half w1a = CompareNormal(n0, GetPackedNormal(p1a)) * half(0.3162162162);
    half w1b = CompareNormal(n0, GetPackedNormal(p1b)) * half(0.3162162162);
    half w2a = CompareNormal(n0, GetPackedNormal(p2a)) * half(0.0702702703);
    half w2b = CompareNormal(n0, GetPackedNormal(p2b)) * half(0.0702702703);

    half s = half(0.0);
    s += GetPackedAO(p0)  * w0;
    s += GetPackedAO(p1a) * w1a;
    s += GetPackedAO(p1b) * w1b;
    s += GetPackedAO(p2a) * w2a;
    s += GetPackedAO(p2b) * w2b;
    s *= rcp(w0 + w1a + w1b + w2a + w2b);

    return PackAONormal(s, n0);
}

half4 HorizontalBlur(Varyings input) : SV_Target
{
    const float2 uv = input.texcoord;
    const float2 offset = float2(_SourceSize.z * rcp(DOWNSAMPLE), 0.0);
    return Blur(uv, offset);
}

half4 VerticalBlur(Varyings input) : SV_Target
{
    const float2 uv = input.texcoord;
    const float2 offset = float2(0.0, _SourceSize.w * rcp(DOWNSAMPLE));
    return Blur(uv, offset);
}
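
The magic constants in Blur are not arbitrary: they are the classic 9-tap Gaussian kernel collapsed into five bilinear taps (the linear-sampling trick). Starting from the discrete weights 0.2270270270, 0.1945945946, 0.1216216216, 0.0540540541 and 0.0162162162, each pair of adjacent off-center weights is merged into a single sample placed at the weight-averaged offset:

\[w_{1}=0.1945945946+0.1216216216=0.3162162162,\qquad o_{1}=\frac{1\times 0.1945945946+2\times 0.1216216216}{0.3162162162}=1.3846153846\]

\[w_{2}=0.0540540541+0.0162162162=0.0702702703,\qquad o_{2}=\frac{3\times 0.0540540541+4\times 0.0162162162}{0.0702702703}=3.2307692308\]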

After the horizontal and vertical passes, the noise level has already dropped significantly. In the final blur we therefore sample along diagonal offsets to clean up the oblique artifacts left by the first two passes, and we use a smaller sampling range so that we do not introduce too much extra blur:

half BlurSmall(const float2 uv, const float2 offset)
{
    half4 p0 = SAMPLE_BASEMAP(uv                            );
    half4 p1 = SAMPLE_BASEMAP(uv + float2(-offset.x, -offset.y));
    half4 p2 = SAMPLE_BASEMAP(uv + float2( offset.x, -offset.y));
    half4 p3 = SAMPLE_BASEMAP(uv + float2(-offset.x,  offset.y));
    half4 p4 = SAMPLE_BASEMAP(uv + float2( offset.x,  offset.y));

    half3 n0 = GetPackedNormal(p0);

    half w0 = HALF_ONE;
    half w1 = CompareNormal(n0, GetPackedNormal(p1));
    half w2 = CompareNormal(n0, GetPackedNormal(p2));
    half w3 = CompareNormal(n0, GetPackedNormal(p3));
    half w4 = CompareNormal(n0, GetPackedNormal(p4));

    half s = HALF_ZERO;
    s += GetPackedAO(p0) * w0;
    s += GetPackedAO(p1) * w1;
    s += GetPackedAO(p2) * w2;
    s += GetPackedAO(p3) * w3;
    s += GetPackedAO(p4) * w4;

    return s *= rcp(w0 + w1 + w2 + w3 + w4);
}

half4 FinalBlur(Varyings input) : SV_Target
{
    const float2 uv = input.texcoord;
    const float2 offset = _SourceSize.zw * rcp(DOWNSAMPLE);
    return HALF_ONE - BlurSmall(uv, offset);
}

Finally, since we want the output to feed directly into the lighting calculation, we still need to convert the AO value into "visibility", which is exactly what the HALF_ONE - BlurSmall(...) in FinalBlur does.
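
As a rough sketch of how the deferred lighting side can consume the result once the ScreenSpaceOcclusion keyword is enabled (paraphrased, and the sampler choice here is an assumption; URP wraps this in its own helper), the x channel of the final texture is read as visibility, and _AmbientOcclusionParam.w, set to DirectLightingStrength in the next section, controls how much of it is applied to direct lighting:

half visibility = SAMPLE_TEXTURE2D_X(_ScreenSpaceOcclusionTexture, sampler_PointClamp, normalizedScreenUV).x;
half indirectAO = visibility;                                           // always applied to ambient / GI
half directAO   = lerp(HALF_ONE, visibility, _AmbientOcclusionParam.w); // w = DirectLightingStrength
// indirect lighting is then multiplied by indirectAO, and direct lighting by directAO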

1.4 Configuring the Render Feature

This step is relatively straightforward. The only thing to keep in mind is that, because the AO will be applied during the deferred lighting pass, the Render Feature has to do a bit of preparation for it:

CoreUtils.SetKeyword(cmd, ShaderKeywordStrings.ScreenSpaceOcclusion, true);
cmd.SetGlobalVector(_AmbientOcclusionParam, new Vector4(1f, 0f, 0f, mSettings.m_DirectLightingStrength));
cmd.SetGlobalTexture(_ScreenSpaceOcclusionTexture, mSSAOTextures[kFinalTextureIndex]);

1.5 References

McGuire et al., The Alchemy Screen-Space Ambient Obscurance Algorithm (HPG 2011)

A Comparative Study of Screen-Space Ambient Occlusion Methods


This post is licensed under CC BY 4.0 by the author.