So, I've been talking about probes a lot of late, and most recently, a fair bit about Spherical Harmonics. I linked to a couple of things, but I figure that for posterity, and clarity, it'd be worth it to just jot down what it is and how it do, if for nothing else except to have a good breakdown of the idea, theory and application for games.
So, the concept of utilizing Spherical Harmonics comes out of the need for fast, cheap ambient lighting that is also directional. As in, if we say 'Hey, renderer, what's the lighting to our right look like', we can get particular colors rather than a blanket color like the flat ambient we get off of our sun lights.
There are a lot of different ways to get all that indirect lighting information, and I'll cover a few of them, but it all stems from the idea that when light is cast into a scene, unless a surface is completely absorptive, some light will reflect off that surface, picking up some coloration from it, and bounce onto other surfaces nearby.
The Cornell box is a very standard representation of this idea:
As you can see, red wall on the left, green wall on the right. When the light on the ceiling is cast into the box, it hits various surfaces, including those walls, and will bounce light around. When it bounces, color from those surfaces is picked up and bounced to other nearby surfaces, which can be seen on the 2 small boxes in the room. The red and green bounce onto them from their respective sides.
Now, the CORRECT way to simulate all this is by raytracing thousands of photons per frame from lightsources, which hit surfaces and bounce to hit more surfaces and the like, until the photon runs out of energy. This is, however, very. VERY. slow. It's why render times for scenes in offline renderers take minutes to hours to days.
So obviously that's right out for game rendering, we need to be WAY faster than that. Which is where all our various methods developed over the years come into play. One very common method that's still used today but has limitations is lightmapping.
That's where we do the raytracing as per our offline renderer, but we then save the results into a texture that can be very cheaply looked up and applied onto our objects so during runtime it's very fast while still getting those excellent offline render results. However, that's at the sacrifice of objects not being able to move if they want that fancy lighting. Dynamic objects such as players, cars, etc don't get that lighting information at all!
So work was done and a few other ways were found to convey that fancy lighting where light bounces and transfers colors and stuff via photons - henceforth referred to as 'indirect lighting'. With modern hardware, we can do some approximate methods that get the gist of the raytracing method, but can run in realtime(though this doesn't leave much room for other stuff to render fast if you don't have a VERY expensive graphics card) such as SVOTI or VXGI.
SVOTI, or Sparse-Voxel Octree Total Illumination takes the geometry of the scene, voxelizes it, and then with the much simpler voxel scene, raymarches from the camera to pixels, and samples nearby voxels to get the bounce information. It's not super accurate, but it's pretty accurate, rather fast and it looks good. Dynamic objects can get the bounced lighting info from the static objects around it, but dynamic objects don't contribute bounced lighting themselves. So the greenery of a forest will bounce green light onto your soldier guy, but your soldier guy won't bounce lighting onto the greenery. The voxels are calculated on the CPU asynchronously, so it doesn't drag the rendering down, but it's still not super fast(it can have problems keeping up if you're fast moving, for example) and has higher memory requirements if we don't want to keep recomputing the same voxels.
VXGI, or Voxel Global Illumination, is similar, but the voxelization happens each frame purely on the GPU. This lets everything bounce lighting, so it's comparatively more accurate, but it's also a lot more expensive to do the voxelization each frame. Even with dedicated hardware support, it's still basically too expensive to actually use it. But the results are very nice:
So it's pretty accurate, but it's rather slow still.
So a middle-ground between lightmaps and voxel tracing methods that has seen a lot of use in realtime rendering is Spherical Harmonics.
Spherical Harmonics is the idea where we want to encode the irradiance of a scene into as compact - but decodable - a form as possible while still being able to get that indirect lighting like we would expect. So what's irradiance? Good question.
Irradiance is the concept that, for any given pixel of a surface, pragmatically, that surface can "see" a 180 degree hemisphere around it. So if we were to take a ball, any given point on that ball can 'see' 180 degrees away from that surface. If you were to shoot a laser at that point, the point could be hit by that laser anywhere from that 180 degree hemisphere. This means that, when we're talking about indirect lighting, any given pixel, principally, will be receiving light from ALL directions inside that hemisphere it can 'see'.
In our offline raycast method, this is done by just firing an obscene number of photon rays away from each pixel and sampling them, basically brute forcing what the given pixel can 'see'. The voxel methods use 'cone tracing', which is a rougher approximation requiring fewer samples. For spherical harmonics, though, we have a cubemap.
A cubemap, you say? Yep, a cubemap. See, when the probes are baked to do reflections, we take 6 renders from the probe's position, Positive along the X axis, Negative along the X axis, Positive Y, Negative Y, Positive Z, Negative Z. This lets us know what a reflection would look like from literally any direction around the object.
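To make the 'any direction' part concrete, here's a minimal sketch of how a direction gets turned into a cubemap lookup: pick the face along the axis with the largest magnitude, then project the other two components onto it. The function name is made up for illustration, and the per-face UV orientation follows the OpenGL-style layout as an assumption. An engine's own convention may flip some of these.

```python
def cubemap_face(direction):
    """Return (face, u, v) for a non-zero direction; u and v are in [0, 1]."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    # The face is the axis the direction points along most strongly.
    if ax >= ay and ax >= az:
        face, major, sc, tc = ("+X", ax, -z, -y) if x > 0 else ("-X", ax, z, -y)
    elif ay >= az:
        face, major, sc, tc = ("+Y", ay, x, z) if y > 0 else ("-Y", ay, x, -z)
    else:
        face, major, sc, tc = ("+Z", az, x, -y) if z > 0 else ("-Z", az, -x, -y)
    # Project onto the face, then remap from [-1, 1] to [0, 1].
    u = 0.5 * (sc / major + 1.0)
    v = 0.5 * (tc / major + 1.0)
    return face, u, v
```

A direction pointing straight down +X lands in the middle of the +X face, i.e. (0.5, 0.5). In a real shader the hardware does all of this for you when you sample a cube texture.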
When we do the renders for our cubemap, we're also rendering with lighting enabled. This is so reflections are actually accurate, but it ALSO means that we know what the scene looks like from a lighting perspective. If there's little light in the scene, say from a single flashlight, the cubemap is going to be pretty dark. If there's a lot of light, from the sun, we're going to see a LOT of light.
So that's cool, since the cubemap also represents our lighting around the scene, we can just take the pixel, sample from the cubemap, and we're good, right? Well...not quite.
See, the problem goes back to irradiance, with the full hemisphere thing I mentioned above. When we sample the cubemap, we can sample one specific point on it based on a direction. When sampling for reflections, this is great because it gives us as sharp or soft of reflections as we need, but when it comes to irradiance, that's not accurate because we're not getting the full hemisphere of lighting info that pixel can 'see'.
An example of a cubemap, a blurred version, and an irradiance map. Irradiance is different from just running a blur filter on a cubemap, because it's specifically biasing towards bright, lit pixels in the cubemap. You can see in the example there that the lit spots from the windows are much better defined with irradiance because it's biased towards lights. This is important for our lighting information.
So we need to store it. There's 2 ways to do this. Irradiance Mapping, or Spherical Harmonics. Irradiance mapping is very accurate, as I've said before, but it requires an entire second cubemap. Even if the cubemap is low resolution, that's a fair bit of additional cost per probe. To calculate irradiance, we pretty much just mathematically take a pixel on an imaginary sphere, and then sample every pixel in the cubemap that pixel on our sphere can see, and use some math to average it out. It's pretty much brute forcing it, but it works. When we finish that, we know what the irradiance info for every pixel on our sphere is.
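As a rough sketch of that brute-force pass: for one surface normal, walk over every direction on the sphere, skip the ones behind the surface, and add up the incoming light weighted by the cosine of the angle to the normal and by how much of the sphere each sample covers. To keep it short, the 'cubemap' here is just a hypothetical function from direction to RGB, and the sampling is a simple latitude/longitude grid rather than actual cubemap texels.

```python
import math

def sphere_samples(rows=64):
    """Yield (direction, solid_angle) pairs covering the full sphere."""
    cols = 2 * rows
    d_theta = math.pi / rows
    d_phi = 2.0 * math.pi / cols
    for i in range(rows):
        theta = (i + 0.5) * d_theta
        for j in range(cols):
            phi = (j + 0.5) * d_phi
            direction = (math.sin(theta) * math.cos(phi),
                         math.sin(theta) * math.sin(phi),
                         math.cos(theta))
            yield direction, math.sin(theta) * d_theta * d_phi

def irradiance(normal, radiance, rows=64):
    """Cosine-weighted sum of radiance(direction) over the visible hemisphere."""
    total = [0.0, 0.0, 0.0]
    for d, solid_angle in sphere_samples(rows):
        cos_theta = normal[0] * d[0] + normal[1] * d[1] + normal[2] * d[2]
        if cos_theta <= 0.0:
            continue  # behind the surface, so this pixel cannot 'see' it
        color = radiance(d)
        for c in range(3):
            total[c] += color[c] * cos_theta * solid_angle
    return total

def sky_and_floor(d):
    """Stand-in environment: white light from above, red bounce from below."""
    return (1.0, 1.0, 1.0) if d[2] > 0.0 else (1.0, 0.0, 0.0)
```

With that stand-in environment, an up-facing normal receives only white and a down-facing normal receives only red, which is the directional behavior we're after. Doing this for every pixel of an output cubemap is what makes an irradiance map.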
We could then save it to another cubemap, making an irradiance map, so that when we have our rendered pixel in the scene, when we sample from the cubemap, we basically precomputed the full hemisphere of lighting information that pixel can 'see', and we're done. But as said, second cubemap, more overhead, etc.
So the other way is Spherical Harmonics. We do mostly the same work with calculating irradiance as above, sampling the hemisphere of light for each pixel, but instead of saving it to a cubemap, we use some voodoo math to 'encode' it. Using some very particular math formulas, we can encode all our 360 degrees of irradiance information in just 9 RGB colors - our Spherical Harmonics Terms. At 3 orders(9 colors), Spherical Harmonics has around a 90-95% accuracy.
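For the curious, the 'voodoo math' is less scary than it sounds. Here's a minimal sketch of the encode step, assuming the usual 9 real spherical harmonic basis functions and a hypothetical `signal` function that returns the RGB value we want to store for a given direction (the irradiance, in the setup described here). Each of the 9 terms is just the signal multiplied by one basis function and summed over the whole sphere. The function names and the Fibonacci-spiral sampling are my own choices for the sketch, not anything from the engine.

```python
import math

def fibonacci_sphere(count):
    """Roughly evenly spaced unit directions covering the whole sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(count):
        z = 1.0 - 2.0 * (i + 0.5) / count
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = i * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def sh_basis(d):
    """The 9 real SH basis functions evaluated for unit direction d."""
    x, y, z = d
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def sh_project(signal, count=4096):
    """Encode signal(direction) -> RGB into 9 RGB terms."""
    coeffs = [[0.0, 0.0, 0.0] for _ in range(9)]
    weight = 4.0 * math.pi / count  # each sample covers an equal slice of the sphere
    for d in fibonacci_sphere(count):
        color = signal(d)
        basis = sh_basis(d)
        for i in range(9):
            for c in range(3):
                coeffs[i][c] += color[c] * basis[i] * weight
    return coeffs
```

The first term ends up holding the overall average color, and the other 8 hold progressively finer directional variation. That's 27 floats total per probe, versus a whole extra cubemap.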
So to use it, we pass those 9 colors to the shader, and when we render, we take the pixel's normal(which informs the direction the pixel faces) and run it through a decoding function, which uses some particular maths to manipulate the 9 terms we have to end up with a single, final RGB color that represents the irradiance that pixel can see.
So when we do a bake of a probe in our Cornell box we can get that indirect lighting information happening by calculating the irradiance and encoding to SH terms for that probe. It lets us do this for any pixel that is inside the probe's radius, working for dynamic or static objects. The memory footprint is low because it's only 9 colors + our reflection cubemap we were going to have anyways. The only limiter is that updating requires re-baking the probe, doing the 6 directional renders again, so doing it realtime is rather costly and would be used VERY discretionally. But it's a good middle-ground technique between lightmaps and full voxel raymarching GI.
An example scene using spherical harmonics. You can clearly see that we get lighting information directionally. The floor-facing pixels get the darker browns, the right-facing pixels get muted tans from the walls and the window-facing pixels are brightly lit from the direct light exposure. This is decoded off the 9 terms we made by encoding the irradiance data.
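A minimal sketch of what that decoding function boils down to, written in Python rather than shader code: evaluate the same 9 basis functions for the pixel's normal and take the weighted sum of the 9 colors. This assumes the terms already hold irradiance, as described here, and that the normal is unit length. The function names are made up for the sketch. Some implementations store raw radiance instead and apply extra per-band scale factors at this point, which this sketch doesn't do.

```python
def sh_basis(n):
    """The 9 real SH basis functions evaluated for unit normal n."""
    x, y, z = n
    return [
        0.282095,
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def sh_decode(coeffs, normal):
    """Turn 9 RGB terms plus a normal into one RGB color."""
    basis = sh_basis(normal)
    return [sum(coeffs[i][c] * basis[i] for i in range(9)) for c in range(3)]
```

So if only the third term (the one tied to Z) has any red in it, pixels facing +Z get red added and pixels facing -Z get it subtracted. All the directionality comes from the normal changing the weights.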
Once we have our reflections and our irradiance, we need to apply both to the scene. We're still hashing out the best way to do this(do we write both to the same buffer, should irradiance be applied to the direct lighting buffer instead, etc, etc) but the data is largely there now, so we just need to decide the best way to apply it. I'll probably add a few more images later when I make them to better illustrate some parts, but hopefully this better explains what SH is and why we'd even want to use it.
Boy howdy, that was a busy weekend. It was a holiday weekend, so I did get Monday off, which was naturally largely squirreled away for gamedev, but I think I spent about half of it helping family move and build furniture!
But no one cares about furniture, so let's get to the good stuff!
You guys may or may not have noticed on the repo that there's now a PR to kill off D3D9. She's served us and many others incredibly well, but it's actively holding back being able to expand the GFX layer to better support modern APIs, so out she goes. If you can grab it and give it a whirl to help test, that'd be awesome. Following on the back of that, I *should* have a few other PRs going up across the next couple of days. A biggie is sRGB support for images, which standardizes formatting across the engine, and simplifies the linear pipeline stuff we were doing before(because sRGB supports the behavior in actual hardware, so less back-and-forthing on the encoding, which is awesome!).
Also got some various GFX API improvements for both GL and D3D11, which should help performance and clean stuff up. I'm also going to start pulling out bits from my RnDBuild repo and PR'ing them. Once the above is sorted, we pretty much just gotta lay out the last few bits for probes and PBR'll be ready to go in, which is a pretty big deal.
Other things that have been getting worked on include me putting a bit more time into the new editor layout. I'd mentioned it here before, but I'm looking at a refactor of the editor's window handling, with the ability to dock, tab, etc the various windows. The groundwork for that is mostly done, mostly just gotta iron out bugs, and then standardize an API for creating tools and hook-ins for the main editor interface to make it super simple to customize the editor.
Another bit towards that end is I modified(expanded, really) the guiVariableInspectorCtrl which wasn't used until now. Now, it'll be an inspector, just like the regular one in the editor, only this allows setup of completely arbitrary fields, and binding them to any given object or global variable. Idea being a much cleaner, more standard way of doing inspector-like UIs, like ones I use in the statemachine and shadergraph editors, asset import options, and editor/project settings. Common control, lots of flexibility in its use. I'm even going to have a setup in place to allow 100% unique, script-generated field controls, so you can get really fancy with it if you need to and the regular inspector fields just don't cut it. It's based off the code I'm already doing for the components, such as the material field on the mesh component, so it definitely works pretty slick.
I mentioned asset importing, and that got the bulk of the attention done tonight.
So, a video related:
Here, you see me doing our friend the drag-and-drop to trigger asset importing, but you'll likely spot some neat new bits. First: I dropped in an fbx file, and it appropriately loaded it in as a model file. What? How! you may ask - shocked to your core.
Well, the answer is Assimp. I worked on getting that sucker integrated today. It's got some bugs yet, but the majority is there, and it's far enough along I can test the frontend side of things. This'll allow us to load in a bunch of different model file types. The full list of POSSIBLE types is on their site/repo, so feel free to look if you want, but we'll be having a more...reserved list of formats. No need to go crazy with it. But an expanded list such as obj, 3ds, fbx and blend files seems like a good baseline. Yeah, you will be able to drag and drop the entire blend file if you want when this goes in. That'll be cooooool.
From there, we see the asset list. If the icon to the right is pressed, we get a new Import Option window. This uses that guiVariableInspector control I mentioned before. Each asset has a configuration object to contain all the sweet details we may want to configure when loading in. I'll probably add a lot more data to these windows going forward(where a given file is coming from/going to, the ability to pre-emptively rename assets, files, being able to selectively pick which module an asset goes to, etc) but you should get an idea of how much control you'll have over the import process.
Then we hit it, it loads everything in and I poke through. You'll probably notice we have an extra image asset, and a material asset. These were created off the model asset we imported. We parse through the model, get the materials and textures it uses, and then if the Import Materials option on the import configuration window is picked, we'll automatically pull those files in with the mesh, generate the appropriate assets and a fledgling material asset. Interestingly, Assimp lets textures in the model file be flagged for type, so Diffuse, Normal, Roughness, Specularity, etc. So we could potentially pre-load in all associated files, type them correctly and smartly populate the end material definition without any input required. Obviously the import options will let you refine as needed, but it should streamline away a LOT of the manual dicking around to get stuff hooked in, especially the materials side.
I do need to get the 'brought in with' assets to display in our 'inbound assets' list with some kind of indication of that, but yeah. Pretty sweet deal. One thing that came up when I was talking to az was the ability to automate out parts of this kinda thing: removing/adding/renaming nodes, materials, etc. Having certain import settings done up a certain way for different assets. Basically saving configurations for your personalized workflow. So one thing I'll be looking into is the ability to have scripts that can be selected when importing that act as a configuration for all that kinda stuff. If you have a particular setup for your import settings for players, and it's different to vehicles, you could have 2 configuration settings scripts, and pick the relevant one and it'll pre-configure most of those settings for you so you don't have to do a ton of clicking around each and every time.
Small creature comforts that can really save time over the course of a project.
Another creature-comfort bit I started on was better editor settings controls and, more importantly, project settings controls. The ability to adjust fairly minor project-specific settings, such as default pref settings, the splash/icon paths, and an interface to quickly build out the keybind maps so you can just worry about the actual logic they implement, can likewise save time with less hopping around and trying to track down where's-what. Nothing fancy yet, but I think when it's done, it's going to be a lot bigger a help than people may anticipate.
On the probes stuff, we've mostly been trying to hash out if we can get some parallax correction on the reflections so they're not so wildly incorrect. It's not a major thing, as like 95% of materials don't reflect sharply enough to matter, but often those subtle details are what really sells the scene. Az also started looking into a long-standing thorn in the side of PostEffects and a few other spots, where they register a named texture target, and nothing else gets to use it. It makes having several post effects modify the same data, or post effects read and write to the same target, a real problem.
This is relevant to the probes because we want to get ScreenSpace Reflections added to supplement the reflections of the probes, but that requires pulling the indirectLighting target that the probes write into, and then writing data into it again. When we get that limitation fixed, not only will we be able to do more stuff, there's the possibility we can streamline a few postEffects as well.
Anywho. Once PBR goes in, a majority of attention then becomes wrapping up the asset and e/c stuff to get that stuff locked into place, at which point we should have quite a little monster on hand and a good chunk of the 4.0 target knocked out.
Jason Campbell Yep
Plan is to shift to a more artist-friendly setup that uses the nodegraph UI so it's visually based. We're also mulling on maintaining 'legacy mode' with the inspector-field style setup that the current material editor has for people more comfortable with that if they don't wanna do up anything outside the standard, but my plan is to pretty much standardize out custom and regular materials into just 'Materials' with the visual editor being the primary frontend for rigging up what the material does. The advantage there would be that you could then draft up completely custom shaders/materials without needing to fuss on a divergent subsystem, so your effects can be as simple or as advanced as you want them to be.
However, one of the big advantages that the inspector-style material editor we have now offers is it's very quick to slam out standard materials. So the current thinking is the engine having a spread of common materials to act as templates you can easily duplicate and expand to fill a similar role. Also having nodes for the nodegraph editor that make common features stupid easy to add in if you're fine with the standard/default approach, such as having a node you can just throw on there to enable forest-wind behavior on the verts, adding parallax, or adding a wave animation and so on.
We're also looking at shifting to a single shader language that's compiled into the appropriate platform's shader language as needed. This would simplify things for those that want to manually write shaders, would cut down on duplication of common shaders we use for lighting and other baseline render behaviors, and would be easier to maintain on the shadergen side of things.
Mostly just a question of the best approach for it. We can use macros in the shaders to pretty much translate the language-specific elements(floats vs vecs, mix vs lerp, yada yada) auto-magically. There's also transpiler options, like one that the Khronos Group is working on that's coming along nicely where you pass it a shader type, and it'll compile it into SPIR-V, which can be stored as-is, but then also compiled into HLSL, GLSL and MetalSL without any manual work. So we've got options there that are looking promising as well. But the end-goal is to standardize and simplify how much work the shader backend is to maintain(and for the end users to have to work with) so that's all looking good.
Simplification sounds great. In my current project I am having trouble maintaining decent performance even with modest settings. It seems weird that a decent computer struggles. What is the realistic outlook on performance gains?
Quite a few things we're looking to that should help on the performance side.
A number of small improvements - cleanup of extraneous codebits, such as needlessly rebinding shader uniforms or render targets, can lead to small savings. We've also got some cbuffer implementation that will get PR'd after I get the sRGB stuff PR'd which helps some too.
One of the big ones is that I'm implementing Intel's Software Occlusion library, which will let objects dynamically occlude without always needing to set up occlusion volumes, meaning we can skip out on a lot of crap we can't see, saving a lot of performance per frame.
The other biggie is threading out the renderer into its own thread. As-is, the renderer is in the main game thread, which isn't great for performance, because the main thread has to wait on the renderer after it throws stuff to the GPU, and the GPU has to wait on the main thread to get the next batch of stuff to draw.
The good news is, the existing renderbin system already utilizes render info being handled via render proxies (MeshRenderInst and the like), so the plan is to properly compartmentalize GPU resources - GFXTextureHandles, Shader data, etc, copy it to the render thread, and then when objects inform the renderer they should be rendered, the proxy is handed off to the render thread. When the main thread's out of stuff it needs rendered, the render thread then runs full-tilt to render everything as fast as possible while the main thread loads up on the next batch of stuff.
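To illustrate the hand-off idea(and only the idea, this isn't engine code), here's a tiny sketch in Python where the main thread pushes a frame's worth of proxies into a queue and a separate render thread drains it. The `RenderThread` class and its methods are hypothetical names for the sketch, and the 'draw' is just appending to a list.

```python
import queue
import threading

class RenderThread:
    """Hypothetical render thread that consumes batches of render proxies."""

    def __init__(self):
        self._inbox = queue.Queue()
        self.drawn = []  # stand-in for work actually submitted to the GPU
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit_frame(self, proxies):
        # Copy the batch so the main thread can keep mutating its own list.
        self._inbox.put(list(proxies))

    def shutdown(self):
        self._inbox.put(None)  # sentinel: no more frames are coming
        self._thread.join()

    def _run(self):
        while True:
            frame = self._inbox.get()
            if frame is None:
                break
            for proxy in frame:
                self.drawn.append(proxy)  # the real thing would issue draw calls here
```

The main thread calls `submit_frame` and immediately carries on with the next tick instead of waiting on the draw, which is the whole point of the split.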
So yeah, that should lead to a very good perf gain as well.
Beyond that, I'm looking towards implementing small preference-oriented things, like adding draw distance values to lights with fade, so that if you're far away from a bunch of small lights, you can just have them fade out at a distance and not render, unlike now, where we render everything that's in-view, even if it's too far away to pragmatically matter.
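Something along these lines, as a rough sketch. The function and parameter names are made up, and a simple linear fade is just one possible choice:

```python
def light_fade(distance, fade_start, fade_end):
    """Brightness scale for a light: 1.0 is full, 0.0 means skip rendering it."""
    if distance <= fade_start:
        return 1.0
    if distance >= fade_end:
        return 0.0
    # Linear falloff between the two distances.
    return (fade_end - distance) / (fade_end - fade_start)
```

Any light that comes back as 0.0 can be dropped from the frame entirely, which is where the savings would come from.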
Won't know the ultimate result until we start getting all that locked in, but gut-estimate holds that it should be a pretty hefty gain.
This GI algorithm works 100% as a post process without requiring any additional passes or data storage. The algorithm does respect occlusion, and if the screenspace data is correct (no hidden occluders or off-screen light sources) then it converges to the exact path traced solution, so it's unbiased.
There is no article known to me describing this kind of technique. It's based on semi-analytical GI integral estimation based on horizon levels. It produces very accurate near-field GI similar to HBAO and somewhat accurate far-field GI, however some thin remote occluders might be skipped.
+ occlusion accuracy is enough to produce indirect shadows and light+shadows from area lights (emissive objects arbitrary shapes) that are computed basically for free.
+ designed to compute 2nd bounce indirect light, but since it also computes HBAO as a byproduct, I use it as approximation of 3rd bounce.
+ framerate is high enough to be used in real-time in fullres -- around 50fps on my laptop's GTX850M, independent of screen geometry
+ unbiased, pretty accurate
- exploits both spatial and temporal coherence that might produce artifacts on rapid movement and some static-like noise
- being a screenspace technique, it completely disrespects everything that's out of camera field of view