I just added compute shader support to Crystal Engine.
It’s fully integrated with the FrameGraph, the render pipeline, and the asset pipeline, and the FrameGraph automatically generates all the pipeline barriers and dependencies for you!
Here’s what you need to do to add compute shaders to your render pipeline:
Create an asset at Engine/Assets/Sandbox/TestCompute.compute (the name has to match the path passed to LoadAssetAtPath in the next step) with the following code:
#pragma kernel CSMain

// SRG includes
#include "Core/Macros.hlsli"

RWTexture2D<float4> _TextureTest : SRG_PerPass(u0);

[numthreads(8, 8, 1)]
void CSMain(uint3 tid : SV_DispatchThreadID)
{
    const uint2 uv = tid.xy;
    _TextureTest[uv] = float4(_TextureTest[uv].r, _TextureTest[uv].g * 0.1, 0, 1);
}
Currently, only compile-time asset processing is supported, which is why the shader has to live inside the Engine/Assets directory.
Next, create a pass for it in MainRenderPipeline.cpp, at the very end of the pipeline, as shown below:
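To make it easier to see what the kernel does to each pixel, here is a minimal CPU-side sketch of the same math in plain C++. Float4, ApplyKernel and RunOnCpu are hypothetical names made up for this illustration and are not part of Crystal Engine; the loop simply stands in for the one-thread-per-texel dispatch the GPU performs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU-side stand-in for a float4 texel; not an engine type.
struct Float4 { float r, g, b, a; };

// Mirrors the body of CSMain for a single texel:
// keep red, scale green by 0.1, zero out blue, force alpha to 1.
inline Float4 ApplyKernel(const Float4& in)
{
    return Float4{ in.r, in.g * 0.1f, 0.0f, 1.0f };
}

// Applies the kernel in place to a width x height image stored row by row.
// On the GPU every texel gets its own thread; a nested loop plays that role here.
inline void RunOnCpu(std::vector<Float4>& image, std::size_t width, std::size_t height)
{
    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            image[y * width + x] = ApplyKernel(image[y * width + x]);
        }
    }
}
```

In other words, the pass keeps the red channel, darkens green to a tenth of its value, and removes blue entirely, so the final image should come out with a strong red tint.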
auto assetManager = AssetManager::Get();
Ref<CE::ComputeShader> computeShader = assetManager->LoadAssetAtPath<CE::ComputeShader>("/Engine/Assets/Sandbox/TestCompute");

auto computePass = CreateObject<RPI::ComputePass>(this, "TestComputePass");
{
    // This will return (8, 8, 1) in our case!
    Vec3i invocationSize = computeShader->GetReflection().invocationSize;

    computePass->SetShader(computeShader->GetRpiShader(0));

    // This automatically picks the number of work groups to dispatch from the output image resolution.
    computePass->dispatchSizeSource.source = pipelineOutput->name;
    computePass->dispatchSizeSource.sizeMultipliers = Vec3(1.0f / invocationSize.x, 1.0f / invocationSize.y, 1.0f / invocationSize.z);

    // _TextureTest
    {
        RPI::PassSlot textureSlot{};
        textureSlot.name = "Texture";
        textureSlot.slotType = RPI::PassSlotType::InputOutput;
        textureSlot.shaderInputName = "_TextureTest";
        textureSlot.attachmentUsage = ScopeAttachmentUsage::Shader;
        textureSlot.loadStoreAction.loadAction = AttachmentLoadAction::Load;
        textureSlot.loadStoreAction.storeAction = AttachmentStoreAction::Store;

        computePass->AddSlot(textureSlot);

        // Bind the output of resolve pass to the input of our TestComputePass
        RPI::PassAttachmentBinding textureBinding{};
        textureBinding.name = "Texture";
        textureBinding.slotType = RPI::PassSlotType::InputOutput;
        textureBinding.attachmentUsage = ScopeAttachmentUsage::Shader;
        textureBinding.connectedBinding = resolvePass->FindOutputBinding("Resolve");
        textureBinding.fallbackBinding = nullptr;

        computePass->AddAttachmentBinding(textureBinding);
    }

    rootPass->AddChild(computePass);
}
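The dispatchSizeSource setup above boils down to "output resolution divided by threads per group". Here is a small standalone sketch of that arithmetic. Extent3, GroupsForAxis and ComputeGroupCounts are hypothetical helpers written for this post, not engine API, and the round-up step is an assumption: I'm not claiming here that the engine rounds the same way internally.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not a Crystal Engine type.
struct Extent3 { std::uint32_t x, y, z; };

// Number of work groups needed along one axis so that every texel is covered.
// Rounding up matters when the image size is not a multiple of the group size.
constexpr std::uint32_t GroupsForAxis(std::uint32_t texels, std::uint32_t threadsPerGroup)
{
    return (texels + threadsPerGroup - 1) / threadsPerGroup;
}

// Work group counts for a whole image, given the [numthreads(...)] values
// reported by shader reflection.
constexpr Extent3 ComputeGroupCounts(Extent3 imageSize, Extent3 threadsPerGroup)
{
    return Extent3{
        GroupsForAxis(imageSize.x, threadsPerGroup.x),
        GroupsForAxis(imageSize.y, threadsPerGroup.y),
        GroupsForAxis(imageSize.z, threadsPerGroup.z)
    };
}
```

For a 1920 x 1080 output and the (8, 8, 1) invocation size from the shader, this gives 240 x 135 x 1 work groups. If the resolution isn't a multiple of 8, the last row or column of groups contains threads that fall outside the image, so a bounds check in the kernel is a reasonable safeguard.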
And here’s what the final render pipeline looks like (inside the dotted rectangle):

And here’s the final output:

Feel free to check it out here:
https://github.com/neilmewada/CrystalEngine