Summary: This post talks about some implications of the Tesla autopilot crash, claims that some autonomous vehicle fatalities are unavoidable, and suggests some ways to minimize them.
By now you have probably heard about the fatal Tesla autopilot accident, which caused the tragic death of a Tesla enthusiast. There is a lot of discussion about what it will mean for autonomous vehicles, or AVs (and about whether the Tesla autopilot should even be considered an AV). But I think we should assume AV accidents will continue to happen. We should try to minimize them (e.g. by better verification – see below), but they will happen.
A month ago, in my post about finding bugs in AVs, I said I was surprised that AV people were assuming fatal AV accidents will not happen:
But this is clearly an unreasonable thing to say, right? The US currently has around 30,000 vehicle-related fatalities/year. Say we replaced 5% of those cars with AVs, and say AVs are 5 times safer. That’s 300 AV-related fatalities/year. …
That’s going to happen, right? And it is going to be front-page news. And an endless parade of experts will be called to the witness stand. But AVs will not be banned, because 5X less fatalities is still an incredible thing. And by the time it is all over, we’ll have requirements for stricter verification. The AV companies will help push that process, because they need to know where they stand. And because this stricter verification will be a lot of work, many of the inefficiencies in the current process will be removed.
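The arithmetic behind that 300 figure can be spelled out in a few lines. This is only a back-of-the-envelope sketch: the three inputs are the illustrative numbers from the quote, and the function name and the assumption that fatalities scale linearly with fleet share are mine.

```python
# Back-of-the-envelope estimate using the illustrative figures quoted above.
# None of these inputs are measured data.
US_FATALITIES_PER_YEAR = 30_000  # approximate vehicle-related fatalities/year
AV_SHARE = 0.05                  # fraction of cars replaced by AVs
AV_SAFETY_FACTOR = 5             # AVs assumed to be 5 times safer per car


def expected_av_fatalities(total, av_share, safety_factor):
    """AV-related fatalities, assuming fatalities scale linearly with fleet share."""
    return total * av_share / safety_factor


av_fatalities = expected_av_fatalities(
    US_FATALITIES_PER_YEAR, AV_SHARE, AV_SAFETY_FACTOR
)
print(round(av_fatalities))  # 300
```

In other words, the 5% of cars that became AVs would otherwise account for about 1,500 fatalities, and dividing by the safety factor of 5 leaves about 300.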
A second question on many people’s minds is whether the autopilot design could/should have avoided this accident (note that I don’t use the term “root cause” – this is a very loaded term as I explained here).
The answer will take a while to appear (and much of it will be in the legal, not technical, domain), so I will certainly not try to answer it myself. But it currently seems possible that the design could have avoided it. For instance, this analysis says “Our understanding here is that the high ride height of the trailer confused the radar into thinking it is an overhead road sign. It’s obviously not ideal and the system should be refined to have a greater detection threshold for overhead road signs, …”.
Whether or not the Tesla designers considered the scenario of “reaching a very high trailer from the side”, there are bound to be scenarios which they did not consider. Which leads me to:
Templeton on Tsunamis
About 3 months ago, Brad Templeton had a blog post called How would a robocar handle an oncoming tsunami?. The (somewhat famous) GIF at the top shows a driver in Japan managing to turn his car around in front of an oncoming Tsunami, with under a second to spare.
Would an AV know to turn like this? I like Brad’s answer:
The best reason the car might handle this however, is the very existence of this video, and the posts about it – including this blog post here. The reason is that the developers of robocars, in order to test them, are busy building simulators. In these simulators they are programming every crazy situation they can think of, even impossible situations, just to see what each revision of the car software will do. … If you can think of it without a major effort, and it seems like it could happen, they will put it in.
I think Brad is right (BTW, he has good posts about AV regulation, AV simulators and many others). But I want to emphasize a different point. It has often been said that AVs will have many fewer accidents, but some of those will be AV-specific (i.e. would not happen to a human driver). I hope this does not sound cold-hearted, but I think we have to agree to that tradeoff: I certainly did not consider Tsunamis in my list of AV hazards until I read that post.
I don’t mean we should just accept those AV bugs: we should do our best to improve our verification methods and thinking to catch more and more of those. In fact, this whole blog series is (largely) about that. All I am saying is that it is a long process, and some AV-specific bugs will still remain even when AVs are 100 times safer than human drivers (a long, long time from now). It would be immoral to wait until then.
BTW, one such AV-specific risk is that people might assume an AV can do more than it really can: One reason for that Tesla crash may have been that, because autopilot was so good on the highway, the driver forgot that it was not (yet) meant for roads with cross traffic.
Koopman et al. on locusts
I recently discovered Challenges in Autonomous Vehicle Testing and Validation (pdf – corresponding slides are here) – a really good article by Philip Koopman and Michael Wagner of CMU (and also of Edge Case Research).
These guys seem to have a lot of experience. Philip Koopman also has a blog, Better Embedded System SW, which is a joy to read: His informal explanations of embedded SW concepts and best practices (how to handle interrupts and watchdog timers, how to look out for safety / security problems and so on) are really good.
I’ll get back to these guys in subsequent posts (lots to talk about in this “Challenges” article). For now, let me just quote what it says about the handling of “exceptions”:
… There are likely to be very many different types of these, from bad weather (flooding, fog, snow, smoke, tornados), to traffic rule violations (wrong-direction cars on a divided highway, other drivers running red lights, stolen traffic signs), to local driving conventions (parking chairs, the “Pittsburgh Left”), to animal hazards (deer, armadillos, and the occasional plague of locusts). … Thus, it seems unlikely that a classical V process that starts with a document that enumerates all system requirements will be scalable to autonomous vehicle exception handling in a rigorous way, at least in the immediate future.
I agree with this: A good requirements document, while necessary, is not enough. There are bound to be both omissions and spec bugs (I have written about spec bugs before).
We need more. And here is one thing that could really help:
We need a big scenario catalog
The more I think about this, the more I agree with Brad Templeton above (and with myself in the above-mentioned post): Part of the solution should be an (ideally industry-wide) ever-growing catalog of AV scenarios (and their related coverage definitions / parameter values).
That catalog can have several uses:
- AV manufacturers will use it to test each SW release (mainly in virtual mode, but some subset also in various physical execution platforms)
- Regulators will use (some representative subset of) it to assess / certify cars
- AV / equipment manufacturers will use it to evaluate prototypes / ideas
- AV manufacturers who use some Machine Learning could perhaps use it to train their ML, as I described towards the end of this post
- And so on.
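To make the idea of a catalog entry a bit more concrete, here is a minimal sketch of one possible shape for it: each scenario carries a name, some parameter ranges and a set of tags, and different users pull different subsets by tag. Everything here (the `Scenario` class, the `select` helper, the scenario names, ranges and tags) is hypothetical and only for illustration; a real catalog would need far richer descriptions.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """One hypothetical catalog entry."""
    name: str
    # Parameter name -> (min, max) range; the values are made up.
    parameters: dict
    tags: set = field(default_factory=set)


CATALOG = [
    Scenario(
        "high_trailer_crossing",
        {"trailer_height_m": (1.0, 1.6), "ego_speed_kph": (60, 120)},
        {"cross_traffic", "regulatory"},
    ),
    Scenario("oncoming_tsunami", {"wave_distance_m": (50, 500)}, {"weather", "rare"}),
    Scenario(
        "wrong_direction_car",
        {"closing_speed_kph": (80, 220)},
        {"traffic_violation", "regulatory"},
    ),
]


def select(catalog, tag):
    """Return the entries carrying a given tag, in catalog order."""
    return [scenario for scenario in catalog if tag in scenario.tags]


# A regulator might only look at a representative, tagged subset.
regulatory_subset = select(CATALOG, "regulatory")
print([scenario.name for scenario in regulatory_subset])
# ['high_trailer_crossing', 'wrong_direction_car']
```

The same catalog could serve a manufacturer's full regression (all entries) and a regulator's certification run (a tagged subset) without duplicating scenario definitions.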
But where will all the scenarios / parameters come from?
- The main source will be people dreaming up the basic blocks (to be further randomized together – see below)
- Another possible source is (semi) automatic extraction from recorded (actual driving) data. This will probably not get you the Tsunami, but will get a lot of other stuff.
- Hopefully we can find other ways, e.g. (semi) automatic instantiation of generic scenarios / hazard patterns
Finally, that scenario catalog should run on top of a system that supports CDV (coverage-driven verification):
- That system should be able to execute runs forever, producing more and more random scenario variants (and scenario mixes)
- It should be self-checking and collect functional coverage
- Ideally, it should also have a way to auto-fill coverage (e.g. via MBT, i.e. model-based testing)
- Most runs should ideally be repeatable, for debugging, easy regressions etc.
- It should ideally be written in some declarative, extensible language, for multi-vendor deployment and easier maintenance
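A toy sketch may help show how the first few of these properties fit together: seeded random generation of scenario mixes, a self-check, functional coverage collection, and repeatability via the seed. All names here (`BLOCKS`, `generate_run`, `toy_av_under_test`, `check`, `regress`) are hypothetical, the "AV" is a stand-in with a deliberately planted bug, and nothing here is declarative or fills coverage automatically, so please read it as an illustration rather than a design.

```python
import random

# Hypothetical scenario building blocks: name -> parameter ranges (made up).
BLOCKS = {
    "high_trailer_crossing": {"trailer_height_m": (1.0, 1.6)},
    "locust_swarm": {"visibility_m": (5.0, 200.0)},
    "wrong_direction_car": {"closing_speed_kph": (80.0, 220.0)},
}


def generate_run(seed):
    """Pick a random mix of 1-2 blocks and sample their parameters.

    The same seed always yields the same run, which keeps runs repeatable.
    """
    rng = random.Random(seed)
    names = rng.sample(sorted(BLOCKS), k=rng.randint(1, 2))
    return {
        name: {p: rng.uniform(lo, hi) for p, (lo, hi) in BLOCKS[name].items()}
        for name in names
    }


def toy_av_under_test(run):
    """Stand-in for the simulated AV; returns True if it brakes.

    Planted bug: trailers higher than 1.4 m are ignored.
    """
    trailer = run.get("high_trailer_crossing")
    if trailer is not None and trailer["trailer_height_m"] > 1.4:
        return False
    return True


def check(run, braked):
    """Self-check: in this toy model every hazard requires braking."""
    return braked


def regress(seeds):
    """Run all seeds; return the scenario mixes covered and the failing seeds."""
    coverage = set()
    failures = []
    for seed in seeds:
        run = generate_run(seed)
        coverage.add(frozenset(run))  # functional coverage: which mixes ran
        if not check(run, toy_av_under_test(run)):
            failures.append(seed)
    return coverage, failures


coverage, failures = regress(range(200))
print(f"{len(coverage)} of 6 scenario mixes covered; failing seeds: {failures[:5]}")
```

Any failing seed can be fed back into `generate_run` to reproduce exactly the same scenario for debugging, and the coverage set shows which of the six possible mixes (three single blocks plus three pairs) have been exercised so far.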
I’d like to thank Amiram Yehudai, Brad Templeton, Sandeep Desai and Kerstin Eder for reading previous versions of this post.
4 thoughts on “The Tesla crash, Tsunamis and spec errors”
If I was responsible for Autonomous Vehicle Testing, I would utilize an open-source model for the catalog of AV scenarios: share what I have with the rest of the industry, encourage (and reward) contributions from anywhere and everywhere, “democratize” the process to the extent possible. I see two main reasons for doing this:
(1) Nobody has a monopoly on wisdom, others will undoubtedly come up with scenarios that I haven’t thought of (and vice versa)
(2) It provides the best possible legal defense when, despite my best efforts, my AV is involved in a Tesla-like accident
I don’t however expect the automobile industry to act this way. Every manufacturer is going to think that their AV offers a competitive advantage, and will market it that way.
Well, perhaps the catalog (including all scenarios, coverage definitions etc.) can be open-source as you describe, while AV manufacturers keep their competitive advantage via implementation: For instance, if a manufacturer can get the same “success” rate (for some agreed-upon scenario coverage) while using a cheaper sensor set (e.g. no LIDAR), then good for them.
That way, the open-source scenario catalog simply becomes an agreed-upon way to assess safety, much like the NCAP five-star safety rating is assessed via crash tests etc.
A big scenario catalog sounds good for first order testing, but am interpreting catalog entries to be rather singular in nature. Also interpreting that they could not easily be combined to make more complex scenarios. Example: While fleeing a plague of locusts, following a trailer with an unusual ride height, a tsunami hits. In my awesome SVE, the ability to overlap ‘events’ on a range of topologies/directions will be needed. Example: While heading dir(north) on topo(foo), you encounter events(A+B) at coordinates (x,y,z) then event C while still experiencing B. To enable, would need to combine simulators thread with scenario thread. Each scenario in catalog has to be broken down into base components, and expressed at a level of abstraction such that they can be consumed (by a simulator) in different contexts and combined with different scenarios. Will also need a constraints engine and config mechanism to help build up ever more complex scenarios from base.
PS. Greetings Bob!!
I very much agree. What we need is indeed a language / methodology for defining abstract scenarios, describing their parts (and variability), and describing how they combine in various ways. And the word “catalog” brings to mind a much more simplistic view. I promise a future post about this topic.