Misc stuff: The Oscars bug

Summary: This is another one of those “What’s new in verification land” posts. It talks about the verification aspects of the recent Oscars fiasco.

I was on vacation, and thus (mostly) missed this earth-shaking event. Luckily my spies are everywhere, and my friend Elliot Cohen sent me the following email with all the gory details:

This event should be very interesting to people interested in your field, especially people who don’t understand how the concept of a bug can be 100% technology independent. The bug report is simple:

Almost four hours into the event, the time had come for the final Oscar to be presented – Best Picture. Presenting Actor was handed an envelope. He opened the envelope, hesitated a bit, looked confused, announced “… Best Picture..” and handed the card to Presenting Actress, who simply read the name written on it, which was the winner of a previously announced award.

Only after the wrong winners came up and made three acceptance speeches was the workaround released and the correct Best Picture winner named.

https://www.youtube.com/watch?v=8KeOxeuiZjs

Root cause was like others you have mentioned – a combination of highly unlikely events. But when a process is repeated for ~90 years, “expect the unexpected”. Root causes:

Turns out that the presenters enter the stage from two directions. On each side there is an accountant (!) with the results. Each one has the full set of envelopes – so the envelope handed to the presenters was for an award that had already been given.
The coloring of the envelope / text was changed this year and difficult to read under the stage lighting.
The accountant was busy tweeting about the movie stars – possibly to his colleagues stuck in the office auditing multi-national widget manufacturers.
The award category on the cards was printed at the bottom in tiny type, while the winner’s name was huge.
The presenters were ~80 years old and may have been confused. Turns out they don’t really get along with each other and were not able to break out of their acting personas. Their confusion and hesitation were mistaken for fake suspense and humor. A decent verification engineer would have stopped right here and called for help. But then, a QA engineer would never be nominated for Best Actor.
The bug was discovered immediately – even before the “winners” reached the stage. And the people who knew that were right there. There was no mechanism to elegantly correct the error. Fast.

Anyway, as a result the accounting firm may have a hard time finding anyone who wants to even do a simple tax form with them, the accountant will be fired, the actor and actress will now be remembered for this snafu, instead of their amazing ~50 year careers. And no one will blame the system.

Count on next year seeing the same yellow font on red envelopes.

On a positive note – experts in every field (not just verification) can now toot their horns: typography experts, lawyers, conspiracy theorists, political pundits, comedians and even wedding planners.

And a recursive thought – perhaps this will spawn a movie about the whole incident. And perhaps that movie will be nominated for an Academy Award …..

Elliot’s email reminded me that the world is full of human-process / human-communication bugs. Most occur in areas not deemed worthy of “real” verification. Others have pretty serious consequences. For instance, consider the Avianca flight 52 story, which I mentioned here:

The bug was in the plane-crew-to-air-traffic-controller human procedure: There were long delays for landing at JFK, the pilot had little English, the co-pilot did not challenge the authority of the air-traffic controller (and also said “priority” rather than “emergency”), and eventually the airplane ran out of fuel and crashed.
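As a rough illustration of how a human-procedure bug of this kind could be exposed by simulation, here is a minimal Python sketch of the envelope handoff described in the email above. Everything in it is an assumption made for illustration: the shortened running order in `AWARDS`, the rule that an accountant always hands over the top envelope of the pile, and the failure model in which a distracted accountant in the idle wing forgets to discard the duplicate. It is not a description of the real backstage procedure.

```python
import random

# Hypothetical, shortened running order (assumption for illustration only).
AWARDS = ["Best Actor", "Best Actress", "Best Picture"]


def run_ceremony(sides, idle_distracted):
    """Replay one ceremony and return the (announced, envelope) mismatches.

    sides[i] is the wing ("left" or "right") used by the presenters of
    AWARDS[i]; idle_distracted[i] says whether the accountant in the other
    wing forgets to discard the duplicate envelope for that award.
    """
    # Each wing starts with its own full, ordered pile of envelopes.
    stacks = {"left": list(AWARDS), "right": list(AWARDS)}
    mismatches = []
    for award, side, distracted in zip(AWARDS, sides, idle_distracted):
        other = "right" if side == "left" else "left"
        envelope = stacks[side].pop(0)   # hand over the top envelope of the pile
        if envelope != award:            # checker: card must match the award
            mismatches.append((award, envelope))
        if not distracted:
            stacks[other].remove(award)  # idle accountant discards the duplicate
    return mismatches


def random_scenario(rng, p_distracted):
    """Generate random wings and random distraction events for one ceremony."""
    sides = [rng.choice(["left", "right"]) for _ in AWARDS]
    distracted = [rng.random() < p_distracted for _ in AWARDS]
    return sides, distracted


def count_failures(runs, p_distracted, seed=0):
    """Count how many randomly generated ceremonies trip the checker."""
    rng = random.Random(seed)
    return sum(
        bool(run_ceremony(*random_scenario(rng, p_distracted)))
        for _ in range(runs)
    )
```

Calling `run_ceremony(["left", "left", "right"], [False, True, False])` replays one specific corner: a single missed discard in the idle wing, followed by the final presenters entering from that wing, is enough for the checker to flag a stale card. Running `count_failures` with a nonzero `p_distracted` then gives a feel for how often random scenarios reach that corner in this toy model.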

Notes

I’d like to thank Kerstin Eder, Amiram Yehudai, Yaron Kashai, Gil Amid, Benjamin Maytal and Sandeep Desai for commenting on a previous version of this post (which was originally part of another post).

5 thoughts on “Misc stuff: The Oscars bug”

Sergey Tozik says:

March 13, 2017 at 10:07 am

Verification is about predicting future failures, but failure investigations have perfect hindsight.
How on Earth could forward-looking verification predict the Avianca disaster or (really not comparable) the Oscar snafu?
Those are systems verified by thousands of landings and 90 years of low-tech ceremony, respectively. Verified by long-standing success, invalidated by a single accident?

1. Yoav Hollander says:
  
  March 13, 2017 at 10:22 am
  
  Not really sure about the Oscars bug: I found Elliot’s email entertaining, but I don’t know how much verification effort this really deserves.
  
  However, the Avianca bug (and similar) is really serious stuff, and I think we _can_ come up with techniques for finding such bugs before they happen (essentially by simulating human interactions and using CDV-like techniques to go to corners).
  
  This is clearly non-trivial. I talked about this in https://blog.foretellix.com/2015/07/28/its-the-spec-bugs-that-kill-you/
  
PANU says:

March 14, 2017 at 2:06 pm

Simply funny – and thought-provoking – at least until the Avianca bug mention, then it became sobering (and still thought-provoking). Capturing the human interaction _within_ the “design under test” – not as an adjunct thing outside of the design – is what needs to happen, IMO. Modeling humans effectively “inside the design”, without a strongly formed, or certainly automatically constrained, interaction protocol means you need to strongly capture this internal interface and, as pointed out, have what seems to be an over-the-top, strong, redundant correction system in place.

Seriously though: tweeting while handing out _THOSE_ envelopes, OMG, even my 14 year old would not have done that, she would have been petrified about screwing it up.

1. Yoav Hollander says:
  
  March 16, 2017 at 9:45 am
  
  Indeed, capturing human behavior is one of the toughest parts. There have been several attempts at doing that: BDI (which I covered in previous posts), system dynamics, Markov chains and more.
  
  Even when there is a defined interaction protocol, humans weave in and out of it, and that needs to be modeled somehow.
  
elliot cohen says:

March 27, 2017 at 3:16 pm

Probably correct that the Oscar “bug” has very little to do with verification. But a few thoughts anyway – just to keep this intersection of IEEE Transactions and Computer Aided Design and People Magazine alive.

1. A process repeated for a long time is considered bug-free and may actually be. But a modification of the process may either go unnoticed or be verified with some hand waving. In short – rev 2.0 requires as rigorous a verification effort as rev 1.0. I think.

2. I am sure they did a dry run – actors love rehearsals (I am told). But how accurate were the conditions? Did they use 80-year-old actors? Was the lighting as dim? Did a movie star walk by so the accountant could tweet? Was anyone there at the level of “Panu” asking why Mr. High Priced Well Dressed Accountant was tweeting?

That’s it for me. Go and solve AV verification.

The Foretellix CTO Blog – AI safety

Now focusing on AI safety (autonomy-related posts go to the company blog)