Friday, September 13, 2019

Can your ALM do that?

Support for runtime crash reporting and analysis should be a part of Application Lifecycle Management (ALM).

Now, my exposure to the ALM landscape out there is limited. TFS/AzDevOps - I know it inside and out. GitHub (including the corporate edition) - moderate experience. GoCD, TeamCity, Jenkins - I have an idea of what they do.

Here's what a good crash reporting framework should be doing:
  1. Gather crash reports from user devices or production servers
  2. Interpret them in the best possible way - reconstruct the call stack, the variables
  3. Identify repeated crashes, group them together
  4. Bring up the source line where the crash happened, and the lines that led up to it
  5. Human triage - is this crash a software bug, or the fault of the platform?

I'm proud to say that in my two-person software shop, this is (partially) a reality, on Android more so than on iOS. There's a nice Web UI where I can see the list of recent crash reports. Once inside a report, I see a call stack with function names and source lines. When I click on a function name in the stack, the source file pops up with the offending line highlighted, and side by side there's a disassembly of the function, with the crash line highlighted. No local variables though; that'd be cool, but I'm not that far yet. (EDIT: I am that far now, parameters and locals are parsed out of crash reports too, where possible.)
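
To make step 3 of the list concrete, here is a minimal Python sketch of one common way to group repeated crashes: fingerprint each report by its topmost stack frames. The report layout (`id`, `frames`, `module`, `function`), the depth of three frames, and the sample data are all assumptions made for illustration, not a description of the system above.

```python
import hashlib
from collections import defaultdict


def crash_signature(frames, depth=3):
    """Fingerprint a crash by its topmost `depth` frames (module + function).

    Raw addresses and line numbers are deliberately left out, because they
    shift from build to build even when the underlying bug is the same.
    """
    top = frames[:depth]
    key = "|".join(f"{frame['module']}!{frame['function']}" for frame in top)
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def group_crashes(reports):
    """Map each signature to the list of report ids that share it."""
    groups = defaultdict(list)
    for report in reports:
        groups[crash_signature(report["frames"])].append(report["id"])
    return dict(groups)


# Made-up reports: 1 and 2 share a stack, 3 crashed somewhere else.
reports = [
    {"id": 1, "frames": [{"module": "libapp.so", "function": "parse_header"},
                         {"module": "libapp.so", "function": "load_file"}]},
    {"id": 2, "frames": [{"module": "libapp.so", "function": "parse_header"},
                         {"module": "libapp.so", "function": "load_file"}]},
    {"id": 3, "frames": [{"module": "libc.so", "function": "abort"},
                         {"module": "libapp.so", "function": "render"}]},
]
groups = group_crashes(reports)
```

How many frames to hash is a judgment call: too few and unrelated crashes merge, too many and one bug splits into several groups.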

This amazing capability, though, requires rich cooperation from other sides of the application lifecycle. In order to recover the call stack from the crash report, one needs a copy of the unstripped SO file that shipped with that particular version. So the build system has to be aware of this, and store a copy of the SO where the crash reporter can find it. Bringing up the source, in turn, requires tapping into source control.
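
As a rough sketch of that handshake between the build system and the crash reporter, the Python below archives an unstripped library under a predictable path and looks it up again later. The directory layout (version, then ABI, then library name) and the function names are hypothetical; a real setup might key on the ELF build ID instead.

```python
import shutil
from pathlib import Path


def symbol_path(store_root, app_version, abi, lib_name):
    """Where the unstripped library for a given release is expected to live."""
    return Path(store_root) / app_version / abi / lib_name


def archive_unstripped(store_root, app_version, abi, built_lib):
    """Build-side step: copy the unstripped library into the symbol store."""
    dest = symbol_path(store_root, app_version, abi, Path(built_lib).name)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(built_lib, dest)
    return dest


def find_unstripped(store_root, app_version, abi, lib_name):
    """Reporter-side step: return the archived library, or None if missing."""
    candidate = symbol_path(store_root, app_version, abi, lib_name)
    return candidate if candidate.is_file() else None
```

The build would call `archive_unstripped` once per release and ABI; the crash reporter would call `find_unstripped` with the version and ABI it reads out of the crash report, and fall back to a raw, unsymbolicated stack when it gets `None`.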

Finally, once I decide the crash is mine to fix, a bug is created in the bug tracker. Once there's a bug for a crash, subsequent crashes in the same location are automatically assigned to the same bug. It's a circle of life, and I'm quite surprised big-name vendors are not offering something like this.
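
The crash-to-bug link can be sketched in a few lines of Python. This is only an illustration under assumptions: the `CrashTriage` class, the in-memory dictionary standing in for the bug tracker, and the idea of a "location" being a (source file, function) pair are all hypothetical.

```python
class CrashTriage:
    """Remembers which bug, if any, covers each crash location."""

    def __init__(self):
        self._bug_for_location = {}  # location -> bug id
        self._next_bug_id = 1
        self.untriaged = []          # crash ids still waiting for a human

    def file_bug(self, location):
        """Called once a human decides the crash is ours to fix."""
        bug_id = self._bug_for_location.get(location)
        if bug_id is None:
            bug_id = self._next_bug_id
            self._next_bug_id += 1
            self._bug_for_location[location] = bug_id
        return bug_id

    def on_crash(self, crash_id, location):
        """Attach a new crash to an existing bug, or queue it for triage."""
        bug_id = self._bug_for_location.get(location)
        if bug_id is None:
            self.untriaged.append(crash_id)
        return bug_id


triage = CrashTriage()
location = ("net/http.cpp", "read_response")  # made-up location
triage.on_crash("crash-1", location)          # no bug yet, goes to triage
bug = triage.file_bug(location)               # human says: ours to fix
linked = triage.on_crash("crash-2", location) # lands on the same bug
```

The human decision stays in the loop: nothing is filed automatically, and only crashes at an already-triaged location skip the queue.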

