Summary: This post talks about the somewhat-unpopular topic of how Autonomous Vehicles should behave during (and directly after) unavoidable accidents, and especially how to verify that.
AV accidents are clearly going to happen: Even the best driver in the world is not guaranteed to never have accidents. This is largely due to, ahem, all the other idiots out there.
I discussed that in my 2016 Stuttgart symposium report, where I said:
Let me start with a weird thing I noticed in casual conversations during the symposium: Whenever I said “let’s consider the implications of the first few AV-related fatalities”, most people reacted with “Oh, we hope that does not happen. This could really set back the whole field”.
But this is clearly an unreasonable thing to say, right? The US currently has around 30,000 vehicle-related fatalities/year. Say we replaced 5% of those cars with AVs, and say AVs are 5 times safer. That’s 300 AV-related fatalities/year. Scale that down further if you like: That still gives you several times/month when a lawyer could stand in front of a jury and say something like “My client’s child was killed because this AV manufacturer was negligent”.
That’s going to happen, right? And it is going to be front-page news. And an endless parade of experts will be called to the witness stand. But AVs will not be banned, because 5X fewer fatalities is still an incredible thing.
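The back-of-the-envelope arithmetic in that quote is easy to check. Here is a minimal sketch, assuming (as the quote implicitly does) that fatalities scale in proportion to the share of cars replaced. The function name and parameters are mine, for illustration only:

```python
def av_fatalities_per_year(total_fatalities, av_share, safety_factor):
    """Rough estimate: fatalities attributable to AVs per year.

    Assumes fatalities are proportional to the share of cars replaced,
    and that AVs are `safety_factor` times safer than the cars they replace.
    """
    return total_fatalities * av_share / safety_factor


# Figures from the quote: ~30,000 fatalities/year, 5% AVs, 5 times safer.
per_year = av_fatalities_per_year(30_000, 0.05, 5)
print(f"{per_year:.0f} per year, about {per_year / 12:.0f} per month")
# prints: 300 per year, about 25 per month
```

So even after scaling the estimate down by a large factor, "several times/month" remains plausible.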
That comment was written before the famous Tesla crash. By now, I assume AV companies think seriously about verifying AV behavior during accidents. Quite possibly they always did – they just (understandably) don’t like to talk publicly about that part of their verification effort.
Next, let’s consider what AVs can do during, and directly after, an accident:
Let’s walk through one specific scenario: Our AV notices a car in the opposite lane drifting into its path. The SW does everything it can: Evasive maneuvers and emergency braking and good old honking. Still, it now has high confidence that an accident will happen in, say, 500ms (albeit at a lower speed). What can it do?
Quite a bit, it turns out. I recently met David Hynd, who is head of Biomechanics for TRL. The discussion was mainly about how to describe safety-related scenarios, but then it turned to the topic above, and David mentioned some ideas he has been trying to push for a while:
During that pre-accident brief second, an AV has a lot of information it can use. For instance, if the AV knows you are going to have a collision at just 35 km/h, it could adjust the stiffness of the seat-belt load limiter to a lower level. This would greatly improve the protection of older passengers, who are the biggest group of seriously injured casualties in frontal impacts in Europe.
Somewhat more speculatively, it may be possible to choose the type of collision: For instance, if the AV knows that only the front left seat is occupied, it could do a last-second maneuver, sending the brunt of the collision energy to the right side of the vehicle. Other factors (surrounding traffic, alignments of energy-absorbing structures etc.) can also be considered.
David claims this could make a large difference. And he should know – he’s been studying collisions and their consequences for nearly 20 years.
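To make the two ideas a bit more concrete, here is a toy Python sketch of a pre-crash planner. It is purely illustrative: the names (`ImpactPrediction`, `plan_precrash_actions`, the action strings) are hypothetical, the 35 km/h threshold is just the figure from the example above rather than a validated value, and the "steer the impact away from the occupied side" rule is my own generalization of the front-left-seat example. A real system would also have to weigh surrounding traffic, structural alignment and much more.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Illustrative threshold taken from the example in the text (assumption,
# not a validated engineering value).
LOW_SPEED_IMPACT_KMH = 35.0


@dataclass(frozen=True)
class ImpactPrediction:
    time_to_impact_ms: float         # e.g. 500 in the scenario above
    impact_speed_kmh: float          # predicted speed at impact
    occupied_seats: FrozenSet[str]   # hypothetical names, e.g. "front_left"


def plan_precrash_actions(p: ImpactPrediction) -> List[str]:
    """Return a list of (hypothetical) pre-crash actions, in priority order."""
    actions = []

    # Idea 1: at lower impact speeds, a softer seat-belt load limiter
    # may protect (especially older) passengers better.
    if p.impact_speed_kmh <= LOW_SPEED_IMPACT_KMH:
        actions.append("soften_load_limiter")

    # Idea 2: if only one side of the car is occupied, try to send the
    # brunt of the collision energy to the other side.
    left = any(seat.endswith("_left") for seat in p.occupied_seats)
    right = any(seat.endswith("_right") for seat in p.occupied_seats)
    if left and not right:
        actions.append("bias_impact_right")
    elif right and not left:
        actions.append("bias_impact_left")

    return actions


if __name__ == "__main__":
    prediction = ImpactPrediction(500, 35.0, frozenset({"front_left"}))
    print(plan_precrash_actions(prediction))
```

Even a toy like this hints at the verification problem: each new input (occupancy, speed, time left) multiplies the number of cases somebody has to spec and check.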
These are just some examples where AV behavior during an accident matters, and thus should be designed and verified. More on that below. But first:
Not the trolleys again, please: This topic (of last-minute AV decisions) may bring to your mind the somewhat-related trolley problem (i.e. those moral discussions about who-should-the-AV-kill-if-it-must-kill-somebody). This seems to be an endlessly-fascinating philosophical question.
In the current context, I do have a proposal regarding the trolley problem, which is: Please don’t go there. As I said before, I agree with Brad Templeton, who warns of a potentially-worse “meta-trolley” problem:
If people (especially lawyers advising companies or lawmakers) start expressing the view that “we can’t deploy this technology until we have a satisfactory answer to this quandary” then they face the reality that if the technology is indeed life-saving, then people will die through their advised inaction who could have been saved, in order to be sure to save the right people in very rare, complex situations.
The figure below represents my thoughts on the issue: AVs will be much safer than people. Among the accidents they will be involved in, a small (but very visible) part will be “AV-only” accidents, i.e. accidents which will not happen to human drivers. Of those, a tiny, tiny percentage will be “fatalities caused / influenced by wrong trolley-related decisions”.
Just to be clear, I am not suggesting people should ignore the less-important things (in fact, this whole post is about behaving-during-accidents, which is clearly less important than preventing-accidents). I am just talking about (my own intuition of) a sense of proportion: From a practical, number-of-lives-saved point of view, the trolley discussion (which I hear about weekly) may be less important than the proposals above for minimizing collision damage (which I never heard about before I met David). And yes, I know the two are not completely disjoint.
Finally, I realize that behind the shrill headlines (“Your AV may decide to kill you!”), those trolley discussions do serve as a proxy for a really important topic: Namely, that technology has brought us to this strange place where we may need to reconcile the writings of philosophers, lawmakers and C++ programmers.
OK, let me climb off that soap box – I feel much better already. Back to verification:
Verifying accident behavior
So what’s involved in verifying behavior during / after accidents? A few things come to mind (again, please remember I am no expert in AVs):
Accident-time rules are different: AVs should obey rules, but also human conventions, which is tougher. That’s why we’ll probably see highway autonomous trucking before we see it in cities (which are more complex and less rule-based). I talked about the difficulties of verifying the interactions between AVs and people here.
Behavior during, and directly after, accidents is an even more extreme variant of this. This is often a confused time, with lots of variants, when many of the normal rules don’t apply. Thus, verifying it involves thinking through lots of very diverse scenarios. Examples:
- While AVs should not e.g. run red lights, it may be legit to run a red light in order to avoid / lessen an accident. This is just a more extreme example of the general issue (mentioned in the post above) that it is sometimes safer for the AV not to “go by the book”.
- Most AV accidents will be small (they already are). Still, there is a need to somehow interact with the other party before moving on.
- Somebody needs to carefully think through, and spec, how to move a post-accident, partially-disabled AV out of harm’s way. This is really hard (even accurately assessing the AV’s post-accident capabilities may be impossible), but ignoring it is clearly worse. Remember: it’s the spec bugs that kill you, and those spec bugs are often related to some secondary functionality (e.g. consider the example in that spec-bugs post, where nobody thought through the potentially-fatal consequences of replacing the battery of a targeting device).
- Even driving through the scene of another car’s accident may be complex and confusing.
- Perhaps some AVs (e.g. autonomous taxis) will revert to a mostly-autonomous mode, i.e. they will be partially remotely-controlled at such confusing times. Mostly-autonomous systems are both easier and harder to verify, as I explained here.
Just about all of this verification will be done in simulation: This is sort of obvious – there are many possible cases here, and very limited appetite for destroying expensive AVs on test tracks.
Organizations like Euro-NCAP may still want to see some physical tests, but even they may consider mostly-virtual-testing (see discussion in this post).
This needs a different verification mindset: AV teams should probably create many parameterized near-accident scenarios, and then run each of them many times with different random seeds. They should then sort the resulting runs into the four obvious clusters:
- AV avoided an accident
- AV should have avoided it but did not
- AV had an accident and behaved correctly
- AV had an accident but should have behaved better during / after it
Crucially, they should put enough emphasis on the last two clusters. This is not easy technically (e.g. you need a flexible way to change the checking and coverage criteria), and not easy psychologically, for people whose main job is to avoid accidents (e.g. you have to consciously turn the knobs so there are enough diverse runs in those clusters).
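Here is a minimal Python sketch of that bookkeeping, under loud assumptions: `simulate_run` is a toy stand-in for a real simulator (its probabilities are made up), the `avoidable` and `behaved_correctly` verdicts would in practice come from real checkers, and `warning_time_s` is a hypothetical scenario parameter playing the role of a "knob". Note also that the classifier arbitrarily puts an avoidable accident in the second cluster regardless of how the AV behaved afterwards.

```python
import random
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Cluster(Enum):
    AVOIDED = "AV avoided an accident"
    SHOULD_HAVE_AVOIDED = "AV should have avoided it but did not"
    ACCIDENT_OK = "AV had an accident and behaved correctly"
    ACCIDENT_NOT_OK = "AV had an accident but should have behaved better"


@dataclass(frozen=True)
class RunResult:
    collided: bool
    avoidable: bool           # verdict of a (hypothetical) reference checker
    behaved_correctly: bool   # verdict of during/after-accident checks


def simulate_run(warning_time_s: float, seed: int) -> RunResult:
    """Toy stand-in for a simulator: shorter warnings mean more collisions."""
    rng = random.Random(seed)  # same seed gives the same run
    p_collision = min(1.0, max(0.0, 1.0 - warning_time_s / 2.0))
    collided = rng.random() < p_collision
    avoidable = rng.random() < min(1.0, warning_time_s / 2.0)
    behaved_correctly = rng.random() < 0.8
    return RunResult(collided, avoidable, behaved_correctly)


def classify(result: RunResult) -> Cluster:
    if not result.collided:
        return Cluster.AVOIDED
    if result.avoidable:
        return Cluster.SHOULD_HAVE_AVOIDED
    if result.behaved_correctly:
        return Cluster.ACCIDENT_OK
    return Cluster.ACCIDENT_NOT_OK


def sweep(warning_times, seeds) -> Counter:
    """Run every (parameter, seed) combination and count runs per cluster."""
    counts = Counter()
    for warning_time_s in warning_times:
        for seed in seeds:
            counts[classify(simulate_run(warning_time_s, seed))] += 1
    return counts


if __name__ == "__main__":
    seeds = range(200)
    for label, times in [("long warnings", [1.5, 2.0]),
                         ("short warnings", [0.0, 0.3])]:
        counts = sweep(times, seeds)
        print(label)
        for cluster in Cluster:
            print(f"  {cluster.value}: {counts[cluster]}")
```

Comparing the two sweeps shows what "turning the knobs" means here: with long warning times almost every run lands in the first cluster, so you have to deliberately push the parameters toward short warnings to get enough diverse runs in the last two.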
Good verification engineers, of course, have that contrarian streak in them – they need to break the system so the design guys can fix it. But even they may find it hard (psychologically) to dream up lots of accident scenarios, knowing there is no fix for most of them.
I’d like to thank Gil Amid, Brad Templeton, Yaron Kashai, Sankalpo Ghose, Amiram Yehudai, David Hynd, Thomas (Blake) French and Yael Feldman for commenting on earlier drafts of this post.