Summary: This is another one of those “What’s new in verification land” posts. It talks about the verification aspects of the recent Oscars fiasco.
I was on vacation, and thus (mostly) missed this earth-shaking event. Luckily my spies are everywhere, and my friend Elliot Cohen sent me the following email with all the gory details:
This event should be very interesting to people interested in your field, especially people who don’t understand how the concept of a bug can be 100% technology independent. The bug report is simple:
Almost four hours into the event, the time had come for the final Oscar to be presented – Best Picture. Presenting Actor was handed an envelope. He opened the envelope, hesitated a bit, looked confused, announced “… Best Picture..” and handed it to Presenting Actress, who simply read the name written on the card, which was that of the winner of a previously announced award.
Only after the wrong winners came up and made three acceptance speeches was the workaround released and the correct Best Picture winner named.
https://www.youtube.com/watch?v=8KeOxeuiZjs
Root cause was like others you have mentioned – a combination of highly unlikely events. But when a process is repeated for ~90 years, “expect the unexpected”. Root causes:
- Turns out that the presenters enter the stage from two directions. On each side there is an accountant (!) with the results. Each one has the full set of envelopes, so a duplicate of every envelope exists – and the envelope handed to the presenters was for an award that had already been given.
- The coloring of the envelope / text was changed this year, and the text was difficult to read under the stage lighting.
- The accountant was busy tweeting about the movie stars – possibly to his colleagues stuck in the office auditing multi-national widget manufacturers.
- The award category on the cards was at the bottom and tiny, and the rest of the text was huge.
- The presenters were ~80 years old and may have been confused. Turns out they don’t really get along with each other and were not able to break out of their acting personas. Their confusion and hesitation were mistaken for fake suspense and humor. A decent verification engineer would have stopped right here and called for help. But then, a QA engineer would never be nominated for Best Actor.
- The bug was discovered immediately – even before the “winners” reached the stage. And the people who knew that were right there. There was no mechanism to elegantly correct the error. Fast.
Anyway, as a result the accounting firm may have a hard time finding anyone who wants to even do a simple tax form with them, the accountant will be fired, and the actor and actress will now be remembered for this snafu instead of their amazing ~50 year careers. And no one will blame the system.
Count on next year seeing the same yellow font on red envelopes.
On a positive note – experts in every field (not just verification) can now toot their horns: typography experts, lawyers, conspiracy theorists, political pundits, comedians and even wedding planners.
And a recursive thought – perhaps this will spawn a movie about the whole incident. And perhaps that movie will be nominated for an Academy Award …..
Elliot’s email reminded me that the world is full of human-process / human-communication bugs. Most occur in areas not deemed worthy of “real” verification. Others have pretty serious consequences. For instance, consider the Avianca flight 52 story, which I mentioned here:
The bug was in the plane-crew-to-air-traffic-controller human procedure: There were long delays for landing at JFK, the pilot had little English, the co-pilot did not challenge the authority of the air-traffic controller (and also said “priority” rather than “emergency”), and eventually the airplane ran out of fuel and crashed.
Notes
I’d like to thank Kerstin Eder, Amiram Yehudai, Yaron Kashai, Gil Amid, Benjamin Maytal and Sandeep Desai for commenting on a previous version of this post (which was originally part of another post).
Verification is about predicting future failures, but failure investigations have perfect hindsight.
How on Earth could forward-looking verification predict the Avianca disaster or the (really not comparable) Oscar snafu?
Those are systems verified by thousands of landings and by 90 years of low-tech ceremony, respectively. Verified by long-standing success, invalidated by a single accident?
Not really sure about the Oscars bug: I found Elliot’s email entertaining, but I don’t know how much verification effort this really deserves.
However, the Avianca bug (and similar) is really serious stuff, and I think we _can_ come up with techniques for finding such bugs before they happen (essentially by simulating human interactions and using CDV-like techniques to go to corners).
This is clearly non-trivial. I talked about this in https://blog.foretellix.com/2015/07/28/its-the-spec-bugs-that-kill-you/
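To make the "simulate human interactions and go to corners" idea a bit more concrete, here is a toy sketch applied to the Oscars handoff from Elliot's email. Everything in it is an assumption made for illustration: the names `run_ceremony`, `count_failing_ceremonies` and `p_distracted`, the rule that an accountant always hands over the top envelope of their stack, and the distraction probability. It is not a model of the real backstage procedure, just a randomized run with a checker that compares what was handed over against what should have been.

```python
import random

def run_ceremony(categories, p_distracted, rng):
    """Simulate one ceremony; return a list of (expected, handed) mismatches.

    Assumed toy model: two accountants each start with a full, ordered stack
    of envelopes. For each award, the presenter enters from a random side and
    that side's accountant hands over the top envelope. The other accountant
    should discard the duplicate, but skips that step when distracted.
    """
    stacks = {"left": list(categories), "right": list(categories)}
    mismatches = []
    for category in categories:
        side = rng.choice(["left", "right"])
        other = "right" if side == "left" else "left"
        handed = stacks[side].pop(0)
        # Checker: the envelope should match the award being presented.
        if handed != category:
            mismatches.append((category, handed))
        # Off-side accountant discards the duplicate unless distracted.
        if rng.random() >= p_distracted and category in stacks[other]:
            stacks[other].remove(category)
    return mismatches

def count_failing_ceremonies(n_runs, categories, p_distracted, seed):
    """Run many random ceremonies and count those with at least one mismatch."""
    rng = random.Random(seed)
    return sum(bool(run_ceremony(categories, p_distracted, rng))
               for _ in range(n_runs))

if __name__ == "__main__":
    awards = ["Sound", "Director", "Actor", "Actress", "Picture"]
    for p in (0.0, 0.01, 0.2):
        failures = count_failing_ceremonies(10_000, awards, p, seed=1)
        print(f"p_distracted={p}: {failures} failing ceremonies out of 10000")
```

In this sketch, raising `p_distracted` plays the role of biasing the random generation toward the corner: a failure that is rare at realistic settings shows up quickly once the distraction knob is turned up.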
Simply funny – and thought-provoking – at least until the Avianca bug mention, then it became sobering (and still thought-provoking). Capturing the human interaction _within_ the “design under test” – not as an adjunct thing outside of the design – is what needs to happen, IMO. Modeling humans effectively “inside the design”, without a strongly formed, or certainly automatically constrained, interaction protocol, means you need to strongly capture this internal interface and, as pointed out, have what seems to be an over-the-top, strong, redundant correction system in place.
Seriously though: tweeting while handing out _THOSE_ envelopes, OMG, even my 14-year-old would not have done that, she would have been petrified about screwing it up.
Indeed, capturing human behavior is one of the toughest parts. There have been several attempts at doing that: BDI (which I covered in previous posts), system dynamics, Markov chains and more.
Even when there is a defined interaction protocol, humans weave in and out of it, and that needs to be modeled somehow.
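As a very rough illustration of the Markov-chain flavor of this, the sketch below models a person who drifts in and out of a defined protocol. The state names, the transition probabilities in `TRANSITIONS`, and the helper names are all invented for the example; a real model would need probabilities grounded in data or expert judgment.

```python
import random

# Hypothetical per-step transition probabilities between behavior states.
TRANSITIONS = {
    "on_protocol": {"on_protocol": 0.90, "distracted": 0.08, "improvising": 0.02},
    "distracted":  {"on_protocol": 0.60, "distracted": 0.35, "improvising": 0.05},
    "improvising": {"on_protocol": 0.50, "distracted": 0.10, "improvising": 0.40},
}

def next_state(state, rng, transitions=TRANSITIONS):
    """Draw the next behavior state given only the current one (Markov property)."""
    options = transitions[state]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]

def simulate(steps, rng, start="on_protocol", transitions=TRANSITIONS):
    """Return the sequence of states visited, including the start state."""
    trace = [start]
    for _ in range(steps):
        trace.append(next_state(trace[-1], rng, transitions))
    return trace

def off_protocol_fraction(trace):
    """Fraction of steps the simulated human spent outside the protocol."""
    return sum(state != "on_protocol" for state in trace) / len(trace)

if __name__ == "__main__":
    rng = random.Random(42)
    trace = simulate(1000, rng)
    print(f"off-protocol fraction: {off_protocol_fraction(trace):.3f}")
```

A trace like this could drive the human side of a simulated interaction, with the off-protocol states being the points where the rest of the system has to cope with unexpected behavior.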
Probably correct that the Oscar “bug” has very little to do with verification. But a few thoughts anyway – just to keep this intersection of IEEE Transactions on Computer-Aided Design and People Magazine alive.
1. A process repeated for a long time is considered bug-free and may actually be. But a modification of the process may either go unnoticed or be verified with some hand waving. In short – rev 2.0 requires as rigorous a verification effort as rev 1.0. I think.
2. I am sure they did a dry run – actors love rehearsals (I am told). But how accurate were the conditions? Did they use 80-year-old actors? Was the lighting as dim? Did a movie star walk by so the accountant could tweet? Was anyone there at the level of “Panu” asking why Mr. High Priced Well Dressed Accountant was tweeting?
That’s it for me. Go and solve AV verification.