Husband. Father. Software engineer. Ubuntu Linux user.
I love stories about hunting down interesting bugs. I find things like left-pad breaking the internet, cloudbleed, and the sleep bug in Eve Online extraordinarily interesting.
Recently, I went on my own bug hunt. I’d seen some strange behavior in our PHPUnit test suite. The tests were passing when run normally, but certain tests would consistently fail with process isolation enabled. The behavior was similar to what is shown below, which is a from a simple test suite I made to reproduce the problem:
mkasberg$ phpunit .
PHPUnit 5.7.15 by Sebastian Bergmann and contributors.
... 3 / 3 (100%)
Time: 124 ms, Memory: 8.00MB
OK (3 tests, 3 assertions)
mkasberg$ phpunit --process-isolation .
PHPUnit 5.7.15 by Sebastian Bergmann and contributors.
..E 3 / 3 (100%)
Time: 421 ms, Memory: 8.00MB
There was 1 error:
1) ThingTest::testTheThing
array_keys() expects parameter 1 to be array, integer given
/home/mkasberg/Desktop/php-test/ThingTest.php:7
ERRORS!
Tests: 3, Assertions: 2, Errors: 1.
Uh-oh. That’s not good. I mean, the failure alone is bad enough, but the fact that it only shows up with process isolation turned on is particularly troubling. Process isolation starts a new PHP process for each unit test. If tests are passing without process isolation and failing with process isolation, it implies that the tests are changing global state. These type of bugs are notoriously hard to track down.
I started by simply debugging one of the failing tests. And I found rather quickly that the error (shown with process isolation enabled) was correct. Very simply, a function was being called with incorrect parameters. Now, that’s easy to fix, but I didn’t want to fix it yet. Before I could fix it, I had to understand why it appeared to pass when run as part of a larger test suite. A test suite that hides failures is a really scary thing.
The scale of the problem I faced was overwhelming. The results shown above are from a simple test suite designed to reproduce the problem. The suite where I originally discovered the problem had more than 1000 different tests, more than 20 of which would fail with process isolation enabled. Figuring out which one of the 1000+ tests was affecting the global state was a daunting task (because it could literally be any of them).
My first step was to reduce the size of the problem. Disable as many test cases as possible while still producing the behavior I originally saw. Basically, I turned off different parts of the test suite until I figured out exactly which combination of tests was causing the problem. This is a bit strange - imagine, I have a test suite with more than 1000 tests that should be producing roughly 20 errors, but isn’t. And I’m turning off entire chunks of the test suite, hoping that I still don’t see any errors even though I should. Eventually, I was able to reduce the problem to 3 tests.
One of the tests was the test that is supposed to fail. The other two tests were required to run before the failing test to reproduce the behavior that hides the error. If I turned off either of the other two tests, the error would be shown. These two tests were somehow affecting the global state of the PHP process. I created a gist on Github that shows my own (simplified) reproduction of the problem.
At this point, although I knew which tests were causing the problem, I still didn’t know why. In the codebase where I first discovered the problem, the problematic function call was buried 15 layers down the call stack. To find it, I took the same approach that I took to finding the 3 problematic tests. Comment out large chunks of code ’til something changes. If I comment out a large chunk of code and the error shows up because of it, I know that something in that chunk is causing the problem. I didn’t care if I broke any of the other tests along the way - I just needed to find out which line of code was preventing PHPUnit from showing the correct error message.
After a bit of searching, I found it. And after finding it, it made perfect
sense. Our code was calling set_error_handler()
to do some custom logging with
errors. In a production environment, our custom error handler would be
configured to send error messages to certain places, or act on them in other
ways. In PHPUnit, none of this config was loaded. So it would eat error messages
without reporting them anywhere.
If I was writing all of this from scratch, I’d probably solve this problem with
dependency injection. In production, set_error_handler()
would actually be
called. And in unit tests, I’d use a mock that would never actually call the
set_error_handler()
function. But I wasn’t writing this from scratch. This
error handler was in a very old part of a very large codebase. I needed to be
sure that set_error_handler()
would be called 100% of the time in
production. And, for fear of running into the same problem again, I needed to be
sure that it would never be called inside a unit test. The solution I came up
with is ugly. But it always works.
if (!isset($GLOBALS['argv'][0]) || strpos($GLOBALS['argv'][0], 'phpunit') === false) {
set_error_handler(...);
}
The biggest lesson I learned from this experience is really just reinforcing
something I already knew. PHP has several functions like ini_set()
,
set_error_handler()
, etc. that can affect the global state of a PHP process.
If you can’t avoid using these functions entirely, at least isolate them in your
code base so they’re not inter-mingled with other logic in your code.
Run your PHPUnit test suite with --process-isolation
occasionally. It’s
slower, but it might show you problems you didn’t otherwise know about.
👋 Hi, I'm Mike! I'm a husband, I'm a father, and I'm a senior software engineer at Strava. I use Ubuntu Linux daily at work and at home. And I enjoy writing about Linux, open source, programming, 3D printing, tech, and other random topics. I'd love to have you follow me on X or LinkedIn to show your support and see when I write new content!
I run this blog in my spare time. There's no need to pay to access any of the content on this site, but if you find my content useful and would like to show your support, buying me a coffee is a small gesture to let me know what you like and encourage me to write more great content!
You can also support me by visiting LinuxLaptopPrices.com, a website I run as a side project.