How to crash ‘gracefully’ – a Windows CE story
After 12 hours of frustration, digging and trying incredibly stupid things like recompiling WinCE images dozens of times, I finally managed to identify an unexpected behavior that no sane person would ever imagine. But insanity is a daily part of your life when you work with embedded devices (small-scale computers, in fact).
The problem: I have a WinCE application that is deployed and runs in an emulator instance. To make sure that it’s as inefficient as possible, it’s a managed application. Then you have this absolutely strange behavior: when running it, I get the following message: “The device-side executable could not be shut down”.
This happens even when there is no device-side executable to be shut down. Which is really, really strange even for Windows CE, well known for wtf-grade behavior. Luckily, my project was on a very tight schedule, so such an illustrious error was nothing more than entertaining, in a deadly, insanity-edging way.
There are a few traces on the Holy Interwebz on how to fix this issue. The most prominent one blames a lack of memory on the device. Of course, this is laughable, since we’re talking about a small number of bytes and there are roughly 64 MB of memory available. But since desperation makes you do wonderful things, I stripped down the emulator image to the point where there was nothing left to remove, yet the application behaved exactly as before.
After a while, through magical desperation and after thousands of ‘prayers’ to the gods of embedded devices, the shut-down error simply disappeared (I am unable to understand why) – and a new behavior took its place: the application was deployed correctly, and apparently it wanted to start running – but didn’t run at all. Instead, the application caused the connection to the device to be lost!!!
Sour desperation settled in my veins; I reinstalled Virtual PC 2007 and uninstalled all other virtual machines from my workstation. I even reinstalled the SDK for the device I was working on. No success: the automagical reboot/reinstall routine did not work for my issue.
Until one inspired suggestion made me place a breakpoint on the very first line of code of the program. And it turned out that the program DID run – and it modestly crashed on the very first statement. I shall not insist on how crappy an environment is when it misguides the developer like that. I shall not lower myself to performing various acts with the other developers who were involved in the development of WinCE.
I shall just tell you what the problem really was: a stack overflow. Hidden recursion caused by uninspired singleton usage overflowed my stack, and instead of giving me a message from the system like “stack overflow, you stupid bastard!”, WinCE decided it would be far better to say nothing and explain to me that the device was no longer connected (even if it’s a damn emulator).
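For the curious, the trap looks roughly like this. It is a minimal sketch in Python, not the actual managed code from the project; the `Settings` class and the `try_start` helper are made-up names for illustration. Note that Python is polite enough to raise a `RecursionError`, which is exactly the courtesy the emulator did not extend.

```python
class Settings:
    """Hypothetical singleton, standing in for the real (managed) class."""

    _instance = None

    @classmethod
    def instance(cls):
        if cls._instance is None:
            # The assignment only happens AFTER the constructor returns...
            cls._instance = cls()
        return cls._instance

    def __init__(self):
        # ...so here _instance is still None, and this call re-enters
        # instance(), which calls __init__ again, and so on until the
        # stack runs out.
        self.defaults = Settings.instance()


def try_start():
    """Pretend 'program start': the first statement touches the singleton."""
    try:
        Settings.instance()
    except RecursionError as exc:
        return "crashed: " + type(exc).__name__
    return "started"


result = try_start()
```

The recursion is "hidden" because no function visibly calls itself: the loop goes through the accessor and the constructor, and it fires on the first access, which is why the program dies on its very first statement.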
So the solution: check the code for runaway recursion or anything else that can overflow the stack. Microsoft’s WinCE thinks that stack overflows are personal offenses and has decided that the best behavior is to cause a mindfuck fest for the hesitant but idiotic developer who thought he had a good idea when he referenced an instance of a singleton during its own initialization. GG, WinCE!
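One cheap way to make this kind of mistake loud instead of silent is a re-entrancy guard in the singleton accessor. The sketch below is again Python and purely illustrative; `GuardedSingleton`, `BadConfig` and `GoodConfig` are hypothetical names, and this is one possible approach, not what the original project ended up doing.

```python
class GuardedSingleton:
    """Illustrative base class: fails fast if instance() is re-entered."""

    _instance = None
    _creating = False

    @classmethod
    def instance(cls):
        if cls._instance is None:
            if cls._creating:
                # We are already inside the constructor of this class:
                # complain immediately instead of recursing until the
                # stack is gone.
                raise RuntimeError(
                    cls.__name__ + ".instance() re-entered during construction"
                )
            cls._creating = True
            try:
                cls._instance = cls()
            finally:
                cls._creating = False
        return cls._instance


class BadConfig(GuardedSingleton):
    def __init__(self):
        # The mistake from the story: asking for the singleton while
        # it is still being built.
        self.defaults = BadConfig.instance()


class GoodConfig(GuardedSingleton):
    def __init__(self):
        # No reference back to the singleton during initialization.
        self.defaults = {"retries": 3}
```

With the guard in place, `BadConfig.instance()` raises a `RuntimeError` on the second entry, while `GoodConfig.instance()` keeps returning the same object. The sketch is not thread-safe; a real implementation would need a lock around the check.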