2010-05-31

Fencing - Problem Solved

4 days of brain bashing against a very complex error to track it down and slay it.

At first, the program would crash if run outside of debug mode.

"Okay, so it must be a memory overrun, and I'm writing past the end of the array, because the debugger puts padding for detection against clobbering memory"; which looks like this:

unsigned char myarray[ 10 ]; //Valid positions are 0 through 9.
myarray[ 10 ] = 60; //WHOA, myarray holds 10 ELEMENTS, so position 10 is the 11th element.

That padding is often called a fence, and I couldn't find where I was crossing it. What was weirder, std::string was throwing errors, std::map was having kittens, and ntdll seemed to vomit all over the floor at this party from hell.

As I burrowed into the code, seething rage ignited; was I being burned by out of date compilers? Was my code wrong? What had I done? What had THEY done wrong?

This project had it all: nested complex templates, deep inheritance hierarchies, large blocks of data and nasty C algorithms. Luckily, most of my (good) code has __debug_regression defined, so I was able to quickly rule out very large segments of code with a single look back at passed test dates.

But my mind was going, Dave.

I ignored the fact that if my code worked in debugging, then only the heap allocation method would cause 0xc0000005 errors, thanks to ntdll's nice memory bounding protection (mainly protects it from 0xBAADF00D, but yeah).

So, I eventually upgraded my compiler, my debugger, brought in the help of my custom designed memory heap manager (igtl_MMHeapSystem; I love you, I LOVE YOU! OH GOD YOU'RE SO SEXY MMMMM*codepronz*) and tracked down the offending bug. Hours of poking and prodding landed me with a really surprisingly simple conclusion.

I was off by one.

I'd been accused of being off kilter, off base, and not even on this world, but one? Off by one causes these insanely weird errors? Why did the debugger not detect the stack corruption? Why did the heap manager skip telling me I was writing one integer past my array? Why did absolutely none of the tools find this till I made my own?

Here's why:

When my code wrote past the end of the array, it wrote (coincidentally) over the bad food word. But as the program ran, that error PROPAGATED until some time later, when the error detection actually scanned memory, because no one in their right mind scans every single memory allocation (super slowness). By the time a scan actually occurred, it was too late and bizarro values had multiplied, but only in a small, well-contained region: inside of a system DLL I don't have debug points in. Ironically, since the error affected nothing else in MY code, it really screwed up ntdll and caused it to throw weird errors all over the place. So, my code was just missing a +1, but the stack was being blown to pieces thanks to trying to protect itself.
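To make the delayed detection concrete, here's a minimal sketch of a fence check. GuardedBuffer and everything in it is a hypothetical name I made up for illustration; it is not igtl_MMHeapSystem and it is not how ntdll actually lays out its heap. The overrun is kept inside the struct's own storage so the demo itself stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical illustration only: a 10-byte payload followed by a guard word.
// Real debug heaps use the same idea with their own layouts and patterns.
struct GuardedBuffer {
    static const std::size_t kSize = 10;
    static const std::uint32_t kFence = 0xBAADF00Du;

    // Payload bytes first, then the fence bytes, in one block of storage.
    unsigned char data[kSize + sizeof(std::uint32_t)];

    GuardedBuffer() {
        std::memset(data, 0, kSize);
        const std::uint32_t fence = kFence;
        std::memcpy(data + kSize, &fence, sizeof(fence));
    }

    // Deliberately does NOT check against kSize: index 10 lands on the fence,
    // exactly like the off-by-one in the post. Nothing complains here.
    void write(std::size_t index, unsigned char value) {
        assert(index < sizeof(data)); // only keeps the demo inside its own storage
        data[index] = value;
    }

    // The "scan": damage only shows up when somebody calls this.
    bool fenceIntact() const {
        std::uint32_t fence;
        std::memcpy(&fence, data + kSize, sizeof(fence));
        return fence == kFence;
    }
};
```

Writing to position 9 leaves fenceIntact() true; writing to position 10 succeeds silently and fenceIntact() only reports the damage whenever it next gets called, which can be long after the bad write.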

Moral of the story?

Make sure to take a nice drink if you run into a hard problem. Usually, it's just something stupid. I use High Gravity Steel Reserve.

-Z

Here's some preliminary results with a nifty 3 instruction toon shader I made: (<3 Mecha Dragon)

2010-05-28

CURSES!

I've been fighting some errors for a while now with my importer code;

Eventually (2 days of brain bashing) I figured this out:

0xC0000005 (access violation) errors are common, but they're a good hint that you're over-walking memory.


Turns out, you can't do this anymore (used to be able to; I'll figure out the fix):

struct myvec{ float x,y,z; }; //Nicely packed struct takes up 12 bytes, 3 floats in order.

float myfloats[9] = { 0, 0, 0, 10, 15, 10, -5, -10, 0 };  //Nicely packed float coordinates.

myvec somevecs[3]; //Make some vectors

memcpy( &somevecs[0].x,  &myfloats[0], sizeof(float) * 9 );  //Copy them in!


Before you scream at this, note that this works perfectly lots of times. Apparently, there are some rare cases where you can cause this to break, and GDB is helpless in figuring out you just walked past the safe/writable zone in an array. It's really hard to get this kind of thing detected, especially when using pointer casts.
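For what it's worth, here's a minimal sketch of how the copy can at least be made to fail loudly instead of silently. It assumes the struct really is three tightly packed floats, and copyFloatsToVecs is a hypothetical helper name, not something from my importer. The static_asserts stop the build if the compiler ever pads the struct, and the count check refuses a copy that would walk past either array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct myvec { float x, y, z; };

// Compile-time guards: the raw copy is only sensible if myvec is plain data
// with no padding between or after its three floats.
static_assert(std::is_standard_layout<myvec>::value, "myvec must be standard layout");
static_assert(sizeof(myvec) == 3 * sizeof(float), "myvec must be exactly 3 packed floats");

// Hypothetical helper: copies floatCount floats into vecCount vectors.
// Returns false (and copies nothing) if the two sizes do not line up.
inline bool copyFloatsToVecs(myvec* dst, std::size_t vecCount,
                             const float* src, std::size_t floatCount) {
    assert(dst != nullptr && src != nullptr);
    if (floatCount != vecCount * 3) {
        return false; // would read or write past one of the arrays
    }
    std::memcpy(dst, src, floatCount * sizeof(float));
    return true;
}
```

With the arrays from above, copyFloatsToVecs( somevecs, 3, myfloats, 9 ) does the same copy as the bare memcpy, while a mismatched count is rejected instead of clobbering memory.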

Basically, if you run into a situation in which you find yourself doing this, please stop and redesign your code so you do not have to. Unfortunately, there is no "fast" solution without redesigning some kickass wrappers and C-style code; but if you OOP this, you run into massive overhead and ultra slowdown from all these specialized copy operations. So, what can you do?

In my testing, caching blocks of floats seems to work best, then just forcing your user to slowly copy them into their structures. Although, I don't do that, because I redesign my structure to WORK with a float * internally so this can be as fast as possible and still lexically coherent.
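A rough sketch of that last idea, assuming one flat block of floats with named accessors on top. VecArray and its members are hypothetical names for illustration, not my actual structure:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: the vectors live in one flat float block, so bulk
// copies are a single contiguous copy and no struct layout is ever assumed.
class VecArray {
public:
    explicit VecArray(std::size_t count) : floats_(count * 3, 0.0f) {}

    // Number of 3-float vectors stored.
    std::size_t size() const { return floats_.size() / 3; }

    // Raw access for fast C-style code.
    float* data() { return floats_.data(); }
    const float* data() const { return floats_.data(); }

    // Bulk load; refuses input whose length does not match the storage.
    bool assign(const float* src, std::size_t floatCount) {
        if (src == nullptr || floatCount != floats_.size()) {
            return false;
        }
        std::copy(src, src + floatCount, floats_.begin());
        return true;
    }

    // Named accessors keep call sites readable.
    float& x(std::size_t i) { return at(i, 0); }
    float& y(std::size_t i) { return at(i, 1); }
    float& z(std::size_t i) { return at(i, 2); }

private:
    float& at(std::size_t i, std::size_t axis) {
        assert(i < size());
        return floats_[i * 3 + axis];
    }

    std::vector<float> floats_;
};
```

The trade-off is that callers write v.y(1) instead of v[1].y, but there is no cast anywhere and the only bounds to get wrong are in one place.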

What a $(#@^& problem.

-Z

Here's a picture. .TED files contain a lot of information; once I clean up this stupid problem I'll post more goodies. By the way, these are models from the PC game "I of the Dragon" because I don't want to show my models until they look somewhat decent. Obviously, again, I'll never use these; they're only good test sets to play with.

2010-05-14

TED file format; Restricted XML

Ever notice how awful XML documents are?

Well, I upgraded my model exporter from blender to 2.49b; so now, instead of the annoying variety of formats I used to use (obm, arf, brf, arf2.0, brf3.0, grf, iRF) I decided to spec out a more logical, easier to understand but far more byte/size inefficient DXML format.

I call it "TED" (Text Extracted Data).

The goal of TED files is to export the maximum usable, useful information out of Blender 2.49b+.
  • This includes Meshes, Armatures, Animations, Objects, Materials, Textures, and some scripting and logic elements. Meshes should contain enough data to be GLSL renderable (IE normals, tangents, uv layers, parameters).
  • TED files must also conform to a very strict file standard, so there will be NO allowed coordinate systems other than the default universal correct system (RHR, +X forward, +Y left, +Z up; This is universal for ALL transforms. Matrices also have a standard definition of Xx Xy Xz Yx Yy Yz Zx Zy Zz where X is the forward axis, Y is the left axis, Z is the up axis. Quaternions follow these definitions as well.) 
  • TED files must be easy to read in; Though they are not (byte) efficient, they should contain "hints" for a loader so it may allocate elements beforehand; for example declaring the number of elements in an array before reading it in. You might notice .TED files can be
  • TED files must maintain a logical and render-system friendly data model; And must encapsulate all data elements; No element can have differing types of data; ergo a tag can store an array of strings, an array of ints, an array of floats, or other tags ONLY.
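The last two rules (count hints, and one kind of data per tag) can be sketched as an in-memory model. This is only a guess at a loader-side structure: TedTag, its Kind enum, and countHint are all hypothetical names, and nothing here reflects the actual TED syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical loader-side model of one tag. A tag holds exactly one kind of
// data: strings, ints, floats, or child tags, never a mix.
struct TedTag {
    enum Kind { Strings, Ints, Floats, Tags };

    std::string name;
    Kind kind;
    std::vector<std::string> strings;
    std::vector<int> ints;
    std::vector<float> floats;
    // A vector of the enclosing type is formally guaranteed from C++17 on;
    // older compilers generally accept it too.
    std::vector<TedTag> children;

    // countHint stands in for the element count declared ahead of the data,
    // so storage can be reserved once before reading.
    TedTag(const std::string& tagName, Kind tagKind, std::size_t countHint)
        : name(tagName), kind(tagKind) {
        switch (kind) {
            case Strings: strings.reserve(countHint); break;
            case Ints:    ints.reserve(countHint); break;
            case Floats:  floats.reserve(countHint); break;
            case Tags:    children.reserve(countHint); break;
        }
    }

    // Each add refuses data that does not match the tag's declared kind.
    bool addString(const std::string& v) {
        if (kind != Strings) return false;
        strings.push_back(v);
        return true;
    }
    bool addInt(int v) {
        if (kind != Ints) return false;
        ints.push_back(v);
        return true;
    }
    bool addFloat(float v) {
        if (kind != Floats) return false;
        floats.push_back(v);
        return true;
    }
    bool addChild(const TedTag& child) {
        if (kind != Tags) return false;
        children.push_back(child);
        return true;
    }
};
```

A float tag built with a hint of 6 has room for 6 floats before the first one is read, and trying to push an int into it simply returns false.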
Why TED files? Because my old exporters did not generate shader code or export tangent vectors. Also, the old exporters somewhat obfuscated the data, and did not present it in a hardware-correct manner. There are fixed graphical paradigms in working with 3D polygonal data; TED files will adhere to them strictly and yet maintain enough data to be edit-friendly.

What TED files should NOT be used for:
  • Final game data (you should convert to your own or local host format)
  • Final Level descriptions (TED files do not provide hashing or optimizations or portals of scenes)
  • Replacing something you already understand (unless it has something you need; just make a converter to convert TED to whatever)
TED files could possibly be placed in direct competition with other formats, such as "OGRE XML", "COLLADA", "Alias Wavefront .obj", "Direct X 8.0 .x" and other such lexical, human readable formats. None of those formats contained all of the data I wanted to use, therefore I made .TED files. It would be a trivial exercise to convert FROM TED files to those formats, but the other way around is much more difficult due to those other formats' inefficiencies and shortfalls in dealing with hardware level data. Yes, I've used all those formats before, and yes I have used and loaded .3ds, .mshp, .md2, .md5 and many many others. So no, the format you are thinking of also sucks.

I have not made the exporter for 2.5.2 yet because I have been burned many times by blender changing specifications mid-stream on me. Luckily, the ted exporter is written to avoid this as much as possible; but when more alpha versions come out I will snap one in easily. Until then, 2.49b it is.


If I could show you a .TED file I would; but HTML can't understand < > signs.

Have a dragon instead (All credit goes to DragonBlade from the Wii; Obviously I'll never use this model, it just happens to be the first example I picked)


Blender on the left, mingl without GLSL on the right (in-game)

-Z