Code for this part can be found here.

## Go Wild!

In Part II, we saw how we could utilize the algorithm for rendering 3D objects without perspective-divide with aid of a depth buffer. We left off saying that next is vertex attributes and perspective-correct interpolation so let’s get to it!

If you are wondering what the heck vertex attributes are to begin with: A vertex attribute is simply any **user-defined** value that is attached to each vertex of an object, such as color, normal, texture coordinates. It’s simply up to you how you define them and put them to use by passing between shaders. For instance, using per-fragment normals and texture coordinates, one could implement normal mapping to improve objects’ appearances. Remember, they are completely optional and not required for rasterization algorithm itself.

As we have seen in earlier posts, by rasterizing objects, we transform them from 3D object-space to camera space first (applying perspective projection in the process) and then all the way down to 2D screen-space. As such, above-mentioned *vertex* attributes need to be interpolated across a triangle in screen-space so that we can have these values available at all fragments inside a triangle as we shade them.

Thankfully, to interpolate these values in screen-space, we don’t need to start anew: It’s all the same process & properties that we used to set up edge functions in Part I that we are going to extend here. Put another way, edge functions are just another application of the same algorithm that is used to interpolate parameters (which must not come as a shock to anyone who’s read the paper I’ve linked) so let’s revisit the original paper once again:

That last sentence is the catch here — the coefficients *a, b,* and *c* are the same in all three equations. Which means that, once we calculate those coefficients a, b and *c*, we can compute *u*/*w* value in screen-space using (3).

In Part I, that’s exactly how we set up our edge functions. First, we said that we have three edge functions defined at each vertex of a triangle:

And then went onto calculating those coefficients by using what we called *vertex matrix* and *interpolation constraints*, which is simply that each edge function is constant and equal to 1 on one edge while 0 at opposite vertex:

And lastly, by inverting vertex matrix on the left and multiplying it by our interpolation constraints on the right, we ended up with what we were after: the coefficients of our edge functions, which we could feed into eq. (3) above now (i.e. what we do as we loop over all pixels and compute edge functions).

Notice that for this to work, division by *w* (aka *homogeneous division*) is applied to eq. (2). *That* is the property that gives us the ability to interpolate parameters across a triangle in screen-space!

If it’s not clear: there is nothing *special* about our edge functions that makes this all work: **Any** three values defined at each vertex of a triangle will do. One very important fact to note here is that, after applying homogeneous division to eq. (2), we end up with *functions of pixel coordinates* (X, Y) and the values (or results/outputs of functions) themselves are multiples of 1/*w*. You may wonder, how did this work in previous parts then? The answer is simple: We simply didn’t care because all we were interested in was that the edge functions tell us whether a point is inside or outside a triangle by comparing *signs* and not *values* themselves! To recover the true parameter values (e.g. *u* and not just *u/w*), we are going to make use of what we just made up and dubbed *constant function* in Part II. How, you say? As I alluded to it, we can make up and attach any value to vertices of triangles that we want to rasterize. So, let’s make one and assign constant value 1 to every vertex:

By using the same mechanism of applying homogeneous division to the parameter function, we end up with:

Oh, that’s great! It’s a function of pixel coordinates again so we can just interpolate it at every fragment (as for all other parameters) as we rasterize triangles and divide our parameters by *1/w* value to recover the values themselves:

Note how we interpolate both parts — *u/w* and *1/w* separately to interpolate function *u(x/w, y/w)* correctly.

To convert this theoretical knowledge into practice, we are going to render a more complex scene this time, the well-known Sponza scene, which has about ~260k textured triangles with texture coordinates and per-vertex normals so let’s get into calculating those interpolation coefficients and functions we’ve just discussed:

// Interpolate 1/w at current fragment float oneOverW = (C.x * sample.x) + (C.y * sample.y) + C.z; // w = 1/(1/w) float w = 1.f / oneOverW;

This time around, we also use *z-buffer* instead of *1/w* to perform depth test in order to resolve visibility so let’s interpolate and recover it, too:

// Interpolate z that will be used for depth test float zOverW = (Z.x * sample.x) + (Z.y * sample.y) + Z.z; float z = zOverW * w;

Handling of *normals*:

// Calculate normal interpolation vector glm::vec3 PNX = M * glm::vec3(fi0.Normal.x, fi1.Normal.x, fi2.Normal.x); glm::vec3 PNY = M * glm::vec3(fi0.Normal.y, fi1.Normal.y, fi2.Normal.y); glm::vec3 PNZ = M * glm::vec3(fi0.Normal.z, fi1.Normal.z, fi2.Normal.z);

// Interpolate normal float nxOverW = (PNX.x * sample.x) + (PNX.y * sample.y) + PNX.z; float nyOverW = (PNY.x * sample.x) + (PNY.y * sample.y) + PNY.z; float nzOverW = (PNZ.x * sample.x) + (PNZ.y * sample.y) + PNZ.z; glm::vec3 normal = glm::vec3(nxOverW, nyOverW, nzOverW) * w; // {nx/w, ny/w, nz/w} * w -> { nx, ny, nz}

Handling of *texture coordinates* (or *UV*s):

// Calculate UV interpolation vector glm::vec3 PUVS = M * glm::vec3(fi0.TexCoords.s, fi1.TexCoords.s, fi2.TexCoords.s); glm::vec3 PUVT = M * glm::vec3(fi0.TexCoords.t, fi1.TexCoords.t, fi2.TexCoords.t);

// Interpolate texture coordinates float uOverW = (PUVS.x * sample.x) + (PUVS.y * sample.y) + PUVS.z; float vOverW = (PUVT.x * sample.x) + (PUVT.y * sample.y) + PUVT.z; glm::vec2 texCoords = glm::vec2(uOverW, vOverW) * w; // {u/w, v/w} * w -> {u, v}

As we did in Part II, we will feed all the vertex data that we fetch from the vertex buffer as input to Vertex Shader and return from it a *vec4* position which will be used to transform a triangle:

// Vertex Shader to apply perspective projections and also pass vertex attributes to Fragment Shader glm::vec4 VS(const VertexInput& input, const glm::mat4& MVP, FragmentInput& output) { // Simply pass normal and texture coordinates directly to FS output.Normal = input.Normal; output.TexCoords = input.TexCoords; // Output a clip-space vec4 that will be used to rasterize parent triangle return (MVP * glm::vec4(input.Pos, 1.0f)); }

Now that we are done handling all parameters, we can feed all that data output from *Vertex Shader* into what usually is called *Fragment Shader* (or Pixel Shader in DX parlance) as input, execute the fragment shader and write the new color value at given fragment:

// Pass interpolated normal & texture coordinates to FS FragmentInput fsInput = { normal, texCoords }; // Invoke fragment shader to output a color for each fragment glm::vec3 outputColor = FS(fsInput, pTexture); // Write new color at this fragment frameBuffer[x + y * g_scWidth] = outputColor;

Our Fragment Shader will consist of code to fetch texel (i.e. *tex*ture *el*ements) values from the loaded texture of a primitive using its texture coordinates, which we pass to it from our Vertex Shader after interpolating correctly (as we just did above):

// Fragment Shader that will be run at every visible pixel on triangles to shade fragments glm::vec3 FS(const FragmentInput& input, Texture* pTexture) { #if 1 // Render textured polygons // By using fractional part of texture coordinates only, we will REPEAT (or WRAP) the same texture multiple times uint32_t idxS = static_cast<uint32_t>((input.TexCoords.s - static_cast<int64_t>(input.TexCoords.s)) * pTexture->m_Width - 0.5f); uint32_t idxT = static_cast<uint32_t>((input.TexCoords.t - static_cast<int64_t>(input.TexCoords.t)) * pTexture->m_Height - 0.5f); uint32_t idx = (idxT * pTexture->m_Width + idxS) * pTexture->m_NumChannels; float r = static_cast<float>(pTexture->m_Data[idx++] * (1.f / 255)); float g = static_cast<float>(pTexture->m_Data[idx++] * (1.f / 255)); float b = static_cast<float>(pTexture->m_Data[idx++] * (1.f / 255)); return glm::vec3(r, g, b); #else // Render interpolated normals return (input.Normals) * glm::vec3(0.5) + glm::vec3(0.5); // transform normal values [-1, 1] -> [0, 1] to visualize better #endif }

Texturing alone is a complex topic with all its tricky details like address control modes, filtering, aliasing and whatnot so I don’t want to get into this and get lost. Put simply, it’s the process of mapping provided texture coordinates to image coordinates and looking up texels from image resource to texture polygons as they get shaded. Many resources can be found created by far more knowledgeable people on the topic than me, so I hope that you’ll excuse my severe hand-waving here!

If you followed along without any issues, you must be able to render a frame as follows:

If you run the project right away to render *one frame* of Part III, you might be surprised how utterly *slow* it turned out to be, so let’s quickly mention some optimization opportunities. In the meanwhile, you may want to profile the source code and see where bottlenecks lie, maybe.

One of the biggest performance killers in our little rasterizer is that, even if many triangles cover a relatively small region on screen, we blindly loop over **all** pixels to check which ones overlap a given triangle. This could be optimized heavily by what’s called *binning*. What the process of binning does is simple: Partition the whole frame buffer into sub-regions and discard those sub-regions that do *not* overlap triangles all at once.

Second optimization is elegantly simple, *once* you discover it: Instead of evaluating edge functions at every fragment, we could compute it once and rely on the following property, as noted by Juan Pineda in **A Parallel Algorithm for Polygon Rasterization** in order to evaluate edge functions incrementally:

Third point is that if you look at DrawIndexed() function, you should see that all the computations for one triangle are independent from one another, with two exceptions: performing depth test on a *global* depth buffer and *if* it passes, writing new color value to *global* frame buffer, which should give you enough clue for how you could go about drawing many triangles (be it of same primitive or different one) in parallel. Of course, as with many things, in real-world rasterization, some restrictions, one of which is primitive order would apply.

Fourth point of optimization is detailed in **Incremental and Hierarchical Hilbert Order Edge Equation Polygon Rasterization.**

## Conclusion

Phew, that was quite a journey! Starting from the foundation of the algorithm in Part I, we saw how we could render a *single triangle*. In Part II, we examined how the algorithm could be extended to 3D. And finally in Part III, we discussed how to utilize the same algorithm to handle vertex attributes & interpolation and finished off rendering a non-trivial scene with many polygons correctly. Thanks & congratulations to all who made it to the end with me!