
Yeah. It essentially ends up as the Playstation GPU, but with quads and forwards texturing.

It's basically inevitable. Any proposal for fixing the Saturn that slaps on an "optimise for BoM" requirement will be pushed towards unified video memory. And the only contemporary example of such a GPU was the Playstation. Not to mention, the design of the Playstation GPU appears to be pretty close to optimal for its era and price point.

And if you were to also switch your hypothetical Saturn design to triangles and inverse texturing, the resulting design looks even more like the Playstation GPU.

> but the bus is only saturated when drawing 2D elements or untextured polygons

The Playstation's bus will also be saturated when loading the texture cache, or when scanning out the previous frame (only on the later SGRAM versions of the GPU). Even with textured triangles, framebuffer writes are limited to 16-bit bandwidth; less time is wasted on those other tasks, which periodically interrupt rasterisation.

Though, now that I think about it, inverse texturing probably gives the Playstation a massive performance advantage over forwards texturing. Inverse texturing allows triangles to be rasterized in strict left-to-right then top-to-bottom order, which is also the order that the framebuffer is laid out in memory. This order will minimise the amount of time spent waiting for the next DRAM row to be prefetched.

The order of framebuffer writes with forwards texturing is somewhat chaotic and probably wastes a bunch more time closing and opening DRAM rows.
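To put a rough shape on that, here is a toy model in Python. It is only a sketch under my own assumptions, not a description of either chip: a linear 1024-pixel-wide framebuffer with exactly one line per DRAM row, a square sprite rotated about its corner, and made-up helper names (`dram_row`, `row_switches`, `forward_writes`, `inverse_writes`). It counts how often consecutive framebuffer writes land in a different DRAM row.

```python
import math

FB_WIDTH = 1024    # framebuffer line length in pixels (assumed)
ROW_PIXELS = 1024  # pixels per DRAM row (assumed: one line per row)

def dram_row(x, y):
    """DRAM row holding pixel (x, y) in a linear framebuffer."""
    return (y * FB_WIDTH + x) // ROW_PIXELS

def row_switches(writes):
    """Count row openings: the first write, plus every change of row."""
    switches, current = 0, None
    for x, y in writes:
        row = dram_row(x, y)
        if row != current:
            switches += 1
            current = row
    return switches

def forward_writes(size, angle_deg, origin=(200, 200)):
    """Forwards texturing: walk texels in order, write wherever they land."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = []
    for v in range(size):
        for u in range(size):
            x = origin[0] + round(u * c - v * s)
            y = origin[1] + round(u * s + v * c)
            out.append((x, y))
    return out

def inverse_writes(size, angle_deg):
    """Inverse texturing: same covered pixels, top-to-bottom, left-to-right."""
    covered = set(forward_writes(size, angle_deg))
    return sorted(covered, key=lambda p: (p[1], p[0]))

if __name__ == "__main__":
    for angle in (0, 15, 45):
        fwd = row_switches(forward_writes(64, angle))
        inv = row_switches(inverse_writes(64, angle))
        print(f"angle={angle:2d}  forward={fwd}  inverse={inv}")
```

In this model an unrotated sprite costs the same either way, since each texture line maps to one screen line. Once the sprite is rotated, the forwards path crosses screen lines within every texture line, while the raster-order path opens each row once. Real DRAM timing, page sizes and the Saturn's actual framebuffer layout are all ignored here.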



The memory access pattern of inverse texturing is just as chaotic when it comes to reading texels, which is why the Playstation's texture cache plays an important role in its performance characteristics.

Saturn doesn't have a texture cache, since texel reads should be nice and sequential. But then it pays a penalty when writing to the framebuffer.


I guess since forwards rendering doesn't need a proper texture cache, you could use the savings to implement a write combining buffer.

But I'm only saying that because I know write combining is a thing, and I'm not sure a hardware engineer in 1993 would have gone down that path.
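For anyone unfamiliar with the idea, a minimal Python sketch of a single-entry write combining buffer follows. Everything in it is hypothetical (the `WriteCombiner` class, the 8-pixel burst size, the `bursts` log); it only shows the general technique of merging neighbouring pixel writes into one burst, not anything the Saturn or Playstation actually did.

```python
class WriteCombiner:
    """Single-entry write combining buffer (illustrative model only)."""

    def __init__(self, burst_pixels=8):
        self.burst_pixels = burst_pixels  # pixels per aligned burst (assumed)
        self.tag = None                   # which aligned burst is buffered
        self.pending = {}                 # offset within burst -> pixel value
        self.bursts = []                  # log of (tag, pixels) sent to memory

    def write(self, addr, value):
        """Buffer one pixel write; flush first if it falls in another burst."""
        tag = addr // self.burst_pixels
        if tag != self.tag:
            self.flush()
            self.tag = tag
        # A later write to the same pixel simply replaces the earlier one.
        self.pending[addr % self.burst_pixels] = value

    def flush(self):
        """Send the buffered pixels to memory as a single burst."""
        if self.pending:
            self.bursts.append((self.tag, dict(self.pending)))
        self.pending = {}
        self.tag = None


if __name__ == "__main__":
    wc = WriteCombiner()
    for addr in (0, 1, 2, 3, 8, 9):
        wc.write(addr, 0x7FFF)
    wc.flush()
    print(len(wc.bursts), "bursts for 6 pixel writes")
```

A single entry like this only helps while consecutive writes stay within one aligned burst, so how much it would recover for forwards texturing depends on how far the write pattern jumps between pixels.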



