Summary: This post will take an initial look at the US Autonomous Vehicles policy announcement, and claim it is a big deal. It will then examine the verification implications, and claim they are mainly positive.
Yesterday, the US Department of Transportation (DOT) and the National Highway Traffic Safety Administration (NHTSA) came out with an announcement regarding AV regulation (and AV direction in general): recording of the event is here, the full 116-page policy paper (pdf) is here, and President Obama’s op-ed piece about it is here.
It talks a lot about safety, and says that manufacturers should provide regulators with a safety assessment covering “the 15 areas”. These areas are shown in the following picture: 11 in light blue (left), 3 in red (top), and one – testing and validation – in green (bottom). Note that testing and validation should be performed in simulation, on-track, and on-road environments.
After scanning the documents and looking at what the world has to say, here are my initial opinions (mainly about the verification side of things):
It’s a big deal
This sounds like a really important announcement, and I see it as a pretty positive thing (but see some critique below). Here is why I like it:
- This is phrased not as “Oh, those AV manufacturers are pushing us so we need to do something”, but rather as the Right Thing to do (see that Obama piece and the opening minutes of the event’s recording). I like that, and I think that the message (“The US now says that AVs are the right way to go, but don’t forget safety”) will help the AV field.
- This will make AV regulation a federal matter, avoiding state-by-state differences. And because the US is such an important market for everybody, this will probably influence non-US manufacturers as well. This is pretty good.
- They mainly talk about the (more ambitious) full autonomy “without any expectation of involvement by a human driver”. Good.
They talk a lot about trying to balance the need for regulation with the flexibility required by new technology. Nevertheless, some people still worry that heavy-handed, premature regulation will deter innovation – see for instance this piece on reason.com.
Brad Templeton came out with an initial critique of the policy (mentioning the above worry among others), and promised more. In general, I really like Brad’s posts (I mentioned them in the section “Templeton on Tsunamis” here), and I use him as one of my main sources about AVs, but I am much more positive than him on the overall effect of the announcement, for the reasons I listed above.
I do agree with Brad, for instance, that waiting until all ethical problems are resolved is itself unethical. He discusses here the so-called “trolley problems” (e.g. “If it must choose, should the AV kill one person who jumped into the road, or swerve and risk killing two on the curb”). He then says (correctly, in my view) that spending too much time on them introduces a (much worse) meta-trolley problem:
If people (especially lawyers advising companies or lawmakers) start expressing the view that “we can’t deploy this technology until we have a satisfactory answer to this quandary” then they face the reality that if the technology is indeed life-saving, then people will die through their advised inaction who could have been saved, in order to be sure to save the right people in very rare, complex situations.
So I agree with Brad that AV ethical considerations are important in principle, but it would be immoral to be obsessed with them. I guess they had no choice but to mention those considerations in a policy paper (“What, you don’t care about ethics?”), but they need to be very careful about priorities moving forward.
Perhaps somewhat more controversially, I also feel the same about AV cybersecurity. As I said before:
I think security research is important (I have written about it e.g. here), but in the context of AV verification, safety is much more important. Yes, somebody will be able to remotely take over a car and drive it off a cliff. That’s willful murder, and I bet willful murders account for a tiny percent of those 30,000 US vehicle-related fatalities/year. As the inimitable James Mickens said of a (different) cyber threat, it “can be used against the 0.002% of the population that has both a pacemaker and bitter enemies in the electronics hobbyist community”.
And then there is terrorism. Believe me, I can invent movie-plot scenarios with the best of them, and in fact I think some bad guys will eventually perform remote terrorism-via-cars. So AV security is important and I am happy some people are working on it. It is just that I expect many, many more people to be hurt due to safety bugs than due to security bugs, and I worry that the theater-like character of terrorism may cause research efforts to be misaligned with that fact.
As you can tell (and it should come as no surprise for regular readers of this blog), I think functional and safety verification is one of the most deserving items on the list, and the rest of this post will talk exclusively about that. Note that I use the term verification in the widest sense (covering what the policy paper calls verification, validation and testing).
So what does the policy paper say about it, and what are some of the implications?
What the paper says about verification
What follows are the main interesting pieces I found in that policy paper. Notes:
- This is not a comprehensive, thought-out summary. Still, I hope it will help you get a feeling for what this (important but long) document says about verification.
- They use a lot of acronyms, but for reading this section you only need HAV (Highly Automated Vehicle), ODD (Operational Design Domain) and OEDR (Object and Event Detection and Response).
- In general, I thought they covered verification reasonably well (for such a wide, all-encompassing document), though much remains open.
Validation methods: Here is the entire section F.4, which describes the “Validation Methods” part of the safety assessment:
Given that the scope, technology, and capabilities vary widely for different automation functions, manufacturers and other entities should develop tests and validation methods to ensure a high level of safety in the operation of their HAVs.
Tests should demonstrate the performance of the behavioral competencies that the HAV system would be expected to demonstrate during normal operation; the HAV system’s performance during crash avoidance situations, and performance of fall back strategies relevant to the HAV’s ODD.
To demonstrate the expected performance of an HAV system, test approaches should include a combination of simulation, test track, and on-road testing. Manufacturers and other entities should determine and document the mix of methods that are appropriate for their HAV system(s). Testing may be performed by manufacturers and suppliers but could also be performed by an independent third party.
Manufacturers and other entities are encouraged to work with NHTSA and other standards organizations (SAE, NIST, etc.) to develop and update tests that use innovative methods as well as criteria for necessary test facility capabilities.
Much of the complexity of AV verification stems from the fact that an AV is a heterogeneous beast, consisting of SW, digital HW, machine learning components, sensors, actuators and so on. I talked about the current state of AV verification here.
Sub-system verification: As expected, they imply that tests should exercise both individual sub-systems and the full system:
All design decisions should be tested, validated, and verified as individual subsystems and as part of the entire vehicle architecture. The entire process should be fully documented and all changes, design choices, analyses, associated testing and data should be fully traceable.
Verifying SW, machine learning etc.: Here is a discussion of SW testing, and a reference to machine learning and its safety:
Thorough and measurable software testing should complement a structured and documented software development process. The automotive industry should monitor the evolution, implementation, and safety assessment of Artificial Intelligence (AI), machine learning, and other relevant software technologies and algorithms to improve the effectiveness and safety of HAVs.
Re-verifying updates: They clarify that a SW or HW update may need a new safety assessment (including a new description of the validation methods):
… if there is a change to the set of normal driving scenarios (behavioral competencies) or pre-crash scenarios that the HAV system has the capability to address as a result of a software or hardware update, then this should also be summarized in a revised Safety Assessment.
Working with other countries: They talk about the possible unification of testing / analysis with other countries:
Ideally, this work would be done in conjunction with other countries so that similar testing and analyses would enable NHTSA and other regulatory authorities to avoid duplication of research, collect and analyze similar data, compare results obtained and lessons learned, and lay the foundation for compatible regulatory approaches.
Complex and special-case checking: They talk about the need to sometimes override rules, and the need to test that:
In certain safety-critical situations (e.g., having to cross double lines on the roadway to travel safely past a broken-down vehicle on the road, other road hazard avoidance, etc.) human drivers currently have the ability to temporarily violate certain State motor vehicle driving laws. It is expected that HAVs have the capability of handling such foreseeable events safely. Also, manufacturers or other entities should have a documented process for independent assessment, testing, and validation of these plausible cases.
This just hints at the whole area of complex and probabilistic checking, which I have discussed here.
Possible test unification: They hint at a possible, future unification of test scenarios by NHTSA:
Manufacturers and other entities should develop tests and verification methods to assess their HAV systems’ capabilities to ensure a high level of safety. In the future, as DOT develops more experience and expertise with HAV systems, NHTSA may promulgate specific performance tests and standards. Presently, manufacturers and other entities should develop and apply tests and standards to establish the safe ODD for each HAV system.
The need for variable, possibly random scenarios: Under tools, they talk about “Variable Test Procedures to Ensure Behavioral Competence and Avoid the Gaming of Tests”. They explain:
… to ensure that automated vehicles are capable of driving safely in complex, busy environments full of other vehicles, bicycles and pedestrians, the Agency must have the ability to create test environments representative of those real-world environments. Due to their complexity and variability, it would not be feasible for one such test environment to fully and identically duplicate another such test environment.
… if NHTSA issued a standard whose test procedure called for an HAV to be driven on a standardized path through a testing track simulating a particular urban or suburban driving environment and to avoid colliding with surrogate vehicles and pedestrians that would always appear in the same sequence at the same locations and at the same time intervals, the manufacturer of an HAV could program the vehicle to “perform to the test.”
Thus, NHTSA needs the ability to vary the tests. I think the best way to do that is using an ever-growing set of constrained-random scenarios, ideally shared between manufacturers, as I described in the section “We need a big scenario catalog” here. Those scenarios will be run (in different ways) in simulated, track and on-road environments.
To summarize:
- I hope this summary has been useful
- I am pretty positive about the whole thing
- The verification stuff seems in the right direction, but much remains open
I’d like to thank Amiram Yehudai for commenting on a previous version of this post.
The Hacker News discussion of the original DOT announcement is here.