Posted On: 2020-03-23
Recently, I encountered (and resolved) a rather surprising application crash. Although it turned out to have a mundane cause, it was particularly difficult to diagnose, as it managed to bypass nearly every crash-handling mechanism available. For the benefit of those who may face similar situations (or those just looking for a tale of things going wrong in development), this post will walk through how I stumbled into the bug, as well as how I diagnosed and resolved it.
Leading up to the error, I was working on a "patience" system, which is designed to limit consecutive interactions with a particular character (so they don't run out of things to say). I worked through a simple implementation without too much trouble, but, after some time away from it, I realized that, with some (significant) changes, I could vastly simplify the experience for a writer using the system. As such, I took the working system and rewrote large chunks of it, until it reached the point that the writer could simply flag a particular choice as "finite" and the system would handle everything else*.
The implementation touched a number of important systems, so the first test run was quite problematic. There were several significant cosmetic issues*, but the most severe issue that I faced was the entire Unity editor suddenly closing. Unlike an "ordinary" crash, this spontaneous closure did not display Unity's crash reporting tools, (mis)leading me to believe that this was not actually a crash. Perhaps most frustratingly, the process for the game would terminate abruptly, causing my debugger to detach and leaving me with no information about what went wrong.
I started by trying to force the debugger to break on any exception, but, sadly, no actual exception was raised before the application closed. From there, I dug into Unity's error logs, searching for any clue about what exception was occurring. Alas, there were no logs for this event either. Then, I dug into the computer's application error logs and, finally, found an "application error" event.
While this was a step forward, I still had no information about what the actual cause was, so I started to place breakpoints in the application, starting somewhere I knew was working and then carefully stepping through to identify which parts did and did not run. As I worked through each part of the new code, I noticed something odd: the code responsible for reading the remaining "patience" was running multiple times, even though it should only need to run once in this frame.
As I walked the debugger down the stack, I found, to my dismay, that the previous read of "patience" was in the same call stack! The application could never figure out what the value of "patience" was*, because any time it tried, it first had to figure out the value of "patience" (which required figuring out the value of "patience", and so on, repeating infinitely until it crashed). At this point, it was clear: I was dealing with a stack overflow.
The underlying cause of this crash was quite simple: I had accidentally created infinite recursion. The code to read the current value of "patience" from Yarn would call a static method, and that static method would then try to read the value of "patience" from Yarn. As soon as I realized my mistake, it was trivial to fix* - though this whole thing did leave me scratching my head, wondering why it didn't cause a StackOverflowException.
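To make the shape of that cycle concrete, here is a minimal sketch. It is written in Python rather than the project's actual Unity/C# code (which isn't shown in this post), and every name in it (VariableStore, register, read, remaining_patience) is hypothetical. It also adds a re-entrancy guard, which is not something the original fix is known to have used; it is just one way a cycle like this could be made to fail loudly on the second read instead of running until the stack is exhausted.

```python
class VariableStore:
    """Hypothetical stand-in for a dialogue variable store with computed values."""

    def __init__(self):
        self._resolvers = {}     # name -> function(store) that computes the value
        self._in_progress = []   # names currently being resolved, outermost first

    def register(self, name, resolver):
        self._resolvers[name] = resolver

    def read(self, name):
        # Guard: if this name is already being resolved further up the call
        # stack, we are in a cycle, so fail now with a readable chain.
        if name in self._in_progress:
            chain = " -> ".join(self._in_progress + [name])
            raise RuntimeError(f"re-entrant read: {chain}")
        self._in_progress.append(name)
        try:
            return self._resolvers[name](self)
        finally:
            # Always unwind the bookkeeping, even when the resolver raises.
            self._in_progress.pop()


def remaining_patience(store):
    # The bug in miniature: computing "patience" reads "patience" again.
    return store.read("patience")


store = VariableStore()
store.register("patience", remaining_patience)

try:
    store.read("patience")
except RuntimeError as error:
    print(error)  # prints "re-entrant read: patience -> patience"
```

Without the guard, plain Python would eventually raise a RecursionError here, which is much friendlier than what happened in my case. The useful part of the sketch is the chain in the error message, which names the offending read directly instead of leaving it to be discovered by stepping through the debugger.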
So, the lesson for today's post is: if it looks like a crash, and sounds like a crash, but doesn't log like a crash, it might just be a stack overflow. Of course, it also might be something else entirely - but hopefully seeing my process for going through it can give you a leg up on any problems you might face.
After some more investigation, it turns out this was a bug in the particular Unity version I was using: upgrading to a newer version resolved the editor crash (instead providing the expected exception).