Topic : S-Buffer FAQ
Author : Paul Nettle
Page : << Previous 4  Next >>
Go to page :


         // Remove Cur from the list
            // Call IntersectSplit for Cur and New
            // Return
         }
      }
   }

   // Just add it to the end
   // Return



This routine makes use of a few linked-list management routines and a routine I called IntersectSplit().

IntersectSplit() simply finds the intersection of two lines that start at the same x coordinate, end at the same x coordinate and returns the order in which the two segments appear from left->right, unless one segment is totally behind the other, in which case it simply returns the single segment.

Note that your splitting code must be very accurate. Not making an accurate splitting routine or an accurate IntersectSplit() routine may cause artifacting in the foreground polygons caused by polygons hidden behind them.

Besides, if your intersection and splitting routines are accurate, then you can expect more accurate results than a 16-bit z-Buffer.

For more information on inserting segments into the list, see the Q titled: "What are some ways to improve the Insertion routine?"

Q: "What is...KEY ELEMENT #3: The segment manager"

The segment manager is simply a reservoir of pre-allocated segment structures. These segment structures hold the actual segments that are linked together in the s-Buffer's linked lists.

If you were to malloc() and free() segments as you needed them, you run the risk of fragmenting memory, and running out of larger chunks of memory for other parts of your application. Not to mention the extra overhead of the malloc() and free() routines.

To solve this problem, a segment manager is needed. It keeps a reservoir of segments, and gives you free segments as you ask for them. This may be done in any number of ways. It's also very fast.

A simple data example would be an array of SEG pointers that is "grown" as needed, in blocks of...say...512 segments at a time. When you run out of available segments, you simply allocate another block, and give one up.

Here's some psudo-code for the GetSegment() routine:


SEG *GetSegment( void )
{
   if (there are no available segments)
   {
      GrowReservoir()
   }

   Find an available segment
   Decrease the number of available segments in the reservoir
   Return the segment pointer
}



The segment manager might consist of these routines:

InitSManager()
Allocates the default base number of segments for the reservoir probably by calling GrowReservoir()

UninitSManager()
Frees all memory in use

ResetSManager()
Marks all segments in reservoir as "available"

GetSegment()
Find an available segment (call GrowReservoir() if none are available), mark it as used, and return it's pointer.

CopySegment()
A simple memcpy() will do

GrowReservoir()
realloc() the reservoir's array of segments in increments of some pre-determined number. Use a #define for this number for fine-tuning memory usage.

Q: "What are some ways to improve the Insertion routine?"

Now, it's your turn. Let's have some optimizations. I'd love to see any code/psudo-code submissions to the FAQ! Here are some things to consider:

Using a Binary-searchable tree instead of a linked list
Checking entire New segments for fully behind or in-front before adding or quitting.
[this section not complete]
Q: "What about accuracy?"

Considering the accuracy of splitting two 2D line segments, the accuracy is perfectly pixel accurate where as a 16-bit z-Buffer is only partially accurate for a 32-bit world coordinate system.

Q: "What kind of performance can I expect?"

I can't say, at this time, exactly what sort of performance improvements to expect over z-Buffer. It all depends on the application, the code used for the segment manager, and especially the insertion routine. When I have some numbers, I'll include them here.

However, let me explain why I believe you can expect better results out of s-Buffering than z-Buffering in hardware. No this dude ain't nuts, that's right, software that's faster than hardware. Keep on readin'. :)

When you're rendering to the s-Buffer, if you're clever about it in your insertion routine, you can eliminate many segments before they get into the s-Buffer, so you've just eliminated a segment that your code would have spent VERY valuable time drawing. Consider this example:


           +----------+    +----------+
           |4444444444|    |2222222222|
           |4444444444|    |2222222222|
        +------------------|2222222222|--+
        |111111111111111111|2222222222|11|
        |111111111111111111|2222222222|11|
        |111111111111111111|2222222222|11|
        |111111111111111111|2222222222|11|
        |111111111111111111|2222222222|11|
        +------------------|2222222222|--+
           |4444444444|    |2222222222|
           |4444444444|    |2222222222|
           |4444444444|    |2222222222|
           |4444444444|    |2222222222|
        +--|4444444444|------------------+
        |33|4444444444|333333333333333333|
        |33|4444444444|333333333333333333|
        |33|4444444444|333333333333333333|
        |33|4444444444|333333333333333333|
        |33|4444444444|333333333333333333|
        +--|4444444444|------------------+
           |4444444444|    |2222222222|
           |4444444444|    |2222222222|
           +----------+    +----------+

[You wouldn't believe how long it took to draw that!]



In the above example, the polygons are numbered. Notice, I even used the age-old problem of recursively overlapping polygons. s-Buffers handle it intuitively.

There are (roughly) 300 pixels in the above example. 200 of which overlap. Using z-Buffering in hardware, you will have to draw 500 pixels, keep in mind that you'll also have to track the z-coords for each pixel for use with the z-buffer hardware. But we'll ignore that fact for now.

So you draw blast out 500 pixels. This is your lowest-level code. This is your fastest stuff.

You've just rendered a full-screen of polygons, and 40% of it was for nothing. Jeesh! 40%! What a waste!

Consider s-Buffers. They immediately find that where you have an overlap in the example, there is no extra drawing to be done. Granted this takes a little time (not much, mind you). But the savings is 40% of screen-space coverage! You're simply NOT drawing those pixels!

There are other advantages, too.

s-Buffers allow you to cut your drawing to the significant pixels ONLY. You draw 300 pixels (without tracking the z-coord along the way).

Unless the hardware is drawing the stuff for you (which s-Buffers should be nicely adaptable to -- if the hardware supports scanline drawing of textures, shades, etc. rather than entire polygons) then you're still doing the drawing. The z-Buffer hardware is only helping you out with your low-level code. It's only doing something for you. It's not reducing the number of pixels you draw, it's just doing some work for you.

s-Buffers not only reduce your workload by reducing the amount of screen space coverage you're drawing to, but it does the z-Buffering for you. Just like hardware does. Granted there is overhead for inserting segments, but nothing compared to the rendering of them to the screen only to find that they're not viewed.

So, in this example, you see about a 5/3 speed improvement. It gets much better as you have more hidden polygons. But, with z-Buffering, your performance just plummets with more hidden polys. It's an inverse non-linear scale to your disadvantage.

Q: "Yeah, so like, what about memory?"

Since you've written a segment manager (which is, by definition -- a memory manager) you've got control. You decide how much memory to allocate at any one point in time. The smarter your segment manager is the better off you'll be.

But there is still the issue of how much memory to store the required segments in the list.

For simple gouraud shading, here's what I would expect a segment structure to look like:


typedef struct segment
{

   short  start_x, end_x;       // starting and ending Xs
   short  start_z, end_z;       // starting and ending Zs
   char   start_s, end_s;       // starting and ending shade values

} SEG;



I count 14 bytes per segment structure. This means that

Page : << Previous 4  Next >>