Autonomy markets and their potential bugs

Summary: Autonomous Vehicles will enter various markets (robotaxis, delivery bots, autonomous mining and so on) at different rates (and the current pandemic will probably accelerate some while slowing others). This post is about the verification needs of these markets, and how a single, extensible, scenario-based verification language can be used to express both the common scenarios (cut-in, braking-failure etc.) and the market-specific ones. I’ll also give a short update about M-SDL, the Measurable Scenario Description Language.

The various autonomy markets

Autonomous Vehicles (AVs) are obviously not a single market with a single deployment date. There are many sub-markets, such as:

  • Urban robotaxis
  • Autonomous trucks in confined areas (mines, harbors etc.)
  • Autonomous delivery bots
  • Fixed-route full-speed autonomous shuttles
  • Low-speed autonomous shuttles (e.g. for retirement communities)
  • Highway autonomous trucking
  • Various “highway pilot” solutions (ranging from “ADAS+” to L3)

There are many more sub-markets. Ever heard of autonomous yard trucking? Perhaps not, but it is old news to me (i.e. I learned about it earlier this week).

On top of this functional split (what will the AV do), there is also a geographical split (where will it do that). Thus, there is a fairly big set of sub-markets, each with its own contenders, waiting for the right moment to leap across the chasm.

Those leaps will obviously occur at very different times: Autonomous trucking in Australian mines is already operational. Robotaxis in downtown Calcutta will take a while to materialize.

The bulk of this post is not pandemic-related. Nevertheless, let’s look briefly at:

The Coronavirus effect

So how will the current pandemic influence all that?

<insert here the obligatory nobody-knows-where-this-will-go disclaimer>

BTW, my intuition is broadly with this guy.

Clearly, much of the tech sector is slowing down, and thus many autonomy companies will also suffer. Autonomy is hugely expensive to develop and verify, and companies without deep pockets may not make it.

On the other hand, there are indications that several autonomy markets will become more important, especially if people assume that repeated pandemics are fairly likely. Some examples are delivery bots, autonomous trucks in harbors and other confined areas, and perhaps even robotaxis.

See for instance the VentureBeat article Despite setbacks, coronavirus could hasten the adoption of autonomous vehicles and delivery robots. Or take a look at the article The effect of COVID-19 on labor availability, which talks about the need for autonomy to battle supply-chain disruptions. Quote:

According to the port authorities, during February the city of Ningbo only had 800 truck drivers working, when usually 24,000 container truck drivers are on duty. This shortage was caused by truck drivers who were not able to get to work due to the quarantine set upon them, while some of the drivers feared being contaminated.

So there will be fairly strong pressure to deploy in at least some of these markets fairly soon. One huge barrier to that is verification (and the related regulation and public trust).

Note also that (at least currently) physical testing has been halted, and there is a shift to more virtual testing. I expect this shift to continue: You need both, but virtual testing is much more productive (for both AVs and ADAS) if you really want to test all the required scenarios and their permutations – see the next section.

Per-market (and common) scenarios

Suppose your company is building an AV for one of these markets. Then you (and your suppliers) must do all of the following:

  • Develop the common functionality (e.g. driving)
  • Develop the market-specific capabilities (e.g. loading containers)
  • Verify the common functionality
  • Verify the market-specific capabilities

The verification part can be huge, with many different scenarios to consider. Take a look at the picture below, which shows some AV markets (and some of their market-specific scenarios), as well as some of the common scenarios (center):


To get a feeling for this, assume you were tasked with verifying autonomous trucking in various confined areas:

  • You must consider special scenarios like dumping loads or lifting containers
  • You must also consider the huge number of common scenarios like vehicle-ahead-braking, bad-weather, braking-failure and so on
  • Life may be somewhat easier because there are normally no “civilians” there – only trained personnel (but you must still consider the rare uninvited person sneaking in)
  • If your truck also goes on a public highway, you must consider all its scenarios
  • “Trucking for confined areas” is really several different sub-markets, each with somewhat-different verification needs (but a lot of commonality)

It is pretty hard to anticipate all possible market-specific bugs. Consider this story about a delivery bot in Pittsburgh: It did not do anything “rash” – it was just standing still at a crosswalk entry, thus blocking the writer (a wheelchair user) from escaping the street into the safety of the sidewalk.

Note that this bug (which has since been fixed) has nothing to do with the usual kinds of risks, like avoid-running-into-things, that all AVs (including delivery bots) have to worry about.

It is fairly specific to delivery bots, and was probably a spec bug (i.e. nobody thought of that requirement). However, once you know about it, you can (and should) generalize it: For instance, create a whole bunch of scenarios related to “our AV blocking the path of another participant”.
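
To make that generalization a bit more concrete, here is a minimal Python sketch of a “blocking” check that could run on every scenario involving a stationary AV. It is only an illustration, not how any particular tool does it: the function names, the circular AV footprint and the width threshold are all assumptions made for this example.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b (2D, meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def blocks_path(av_xy, av_radius_m, path_xy, needed_width_m):
    """True if a stationary AV (assumed circular footprint) intrudes into the
    corridor of width needed_width_m around another participant's path."""
    limit = av_radius_m + needed_width_m / 2.0
    return any(
        point_segment_distance(av_xy, a, b) < limit
        for a, b in zip(path_xy, path_xy[1:])
    )

# Hypothetical numbers: a bot parked at a curb ramp, a wheelchair path past it.
wheelchair_path = [(-5.0, 0.2), (5.0, 0.2)]
print(blocks_path((0.0, 0.0), 0.4, wheelchair_path, needed_width_m=0.8))
print(blocks_path((0.0, 3.0), 0.4, wheelchair_path, needed_width_m=0.8))
```

The point of such a check is that it is independent of the market: the same function applies whether the “other participant” is a wheelchair user, a cyclist or a forklift.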

Making verification efficient

So how does one do efficient, thorough verification for a specific kind of AV? One well-known method is Coverage Driven Verification:

  • Enumerate the various risk dimensions (in some verification plan)
  • Write abstract (generalized) scenarios for each dimension
  • Use automation to expand each scenario into many different runs (on multiple test platforms)
  • Use coverage and performance metrics to analyze how well the scenario space was covered, and how well the AV performed in that space
  • Refine / repeat as needed
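
The steps above can be sketched in a few lines of Python. This is a toy illustration of the flow (abstract scenario, random expansion into runs, coverage analysis), not M-SDL and not a real tool: the dimension names, the bucket values and the helper functions are all made up for this example.

```python
import random
from itertools import product

# Hypothetical risk dimensions for one abstract scenario (illustrative values).
DIMENSIONS = {
    "relative_speed": ["low", "medium", "high"],
    "weather": ["clear", "rain", "fog"],
    "side": ["left", "right"],
}

def expand(dimensions, num_runs, seed):
    """Expand one abstract scenario into concrete runs by random sampling."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in dimensions.items()}
        for _ in range(num_runs)
    ]

def coverage(runs, dimensions, crossed):
    """Return (fraction of cross-coverage buckets hit, list of unhit buckets)."""
    all_buckets = set(product(*(dimensions[name] for name in crossed)))
    hit = {tuple(run[name] for name in crossed) for run in runs}
    return len(hit & all_buckets) / len(all_buckets), sorted(all_buckets - hit)

runs = expand(DIMENSIONS, num_runs=20, seed=1)
ratio, holes = coverage(runs, DIMENSIONS, crossed=("relative_speed", "weather"))
print(f"coverage: {ratio:.0%}, holes: {holes}")
```

The list of holes is what drives the “refine / repeat” step: you either run more seeds or add constraints that steer generation toward the buckets not yet hit.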

Writing scenarios for all the risk dimensions of a specific AV market is hard. And then you also have to mix them (e.g. see what happens when your autonomous harbor truck encounters a mechanical error while reversing into a container loading area).

In principle, each AV market (or even individual company) could develop its own verification methodology, scenario description language, notation for coverage and checks, and so on. But this will be very inefficient, because it creates separate “islands” and does not allow for reuse.

Consider: If you are verifying autonomous trucking in mines, ideally you would like to use a single language for your vehicle_ahead_braking() scenario, your dump_truck_cant_align_with_target() scenario, your heavy_rain() scenario and your braking_system_fault() scenario (and perhaps mix them).

And you might prefer to get at least the non-market-specific scenarios from some other source. And you may want to hire people who are already familiar with the language / methodology. And regulators may prefer a common language. And so on.

One language to bind them all

So it would be nice to have a single, open, reuse-oriented, scenario-based verification language in which you could express all of these scenarios (as well as their coverage, KPIs, checks etc.).

And indeed, we (Foretellix) designed M-SDL to be exactly such a language.

And then we opened it up. Here is a brief timeline of that:

  • In September we opened M-SDL (v0.9) to the world (under the Apache license)
  • In January we published an M-SDL FAQ, talking (among other things) about some enhancements needed for generalizing to multiple markets
  • Last week the ASAM standardization body published the OpenSCENARIO 2.0 concept document, which is very aligned with M-SDL. This is no accident – we have been working intensively with the concept project team over the last year, getting inputs and adapting the language accordingly. We intend to keep M-SDL aligned with OpenSCENARIO 2.0 as it evolves.
  • We have also been talking to people in many AV markets (including all the markets shown in the picture above), trying to make sure the language will work for them. Expect a new version of the manual soon.

A major goal for M-SDL is to let you reuse scenarios along many dimensions:

  • Reuse by sharing: A common, open language lets you reuse scenario libraries written by other groups / organizations. Perhaps some standard libraries will emerge.
  • Reuse via mixing: You can simply mix braking_system_fault() with dangerous_cut_in() rather than creating a new scenario from scratch.
  • Reuse by aspect-oriented extension: You can take an existing scenario like dangerous_cut_in(), and add a constraint to it (like relative_speed > 10kph) without touching the original scenario.
  • Reuse across maps: Scenarios get executed in a random “matching” location on the specified map.
  • Reuse across test platforms / simulator configurations: Note that verifying your fixed-route-shuttle may involve using multiple simulators at the same time (e.g. a world-simulator, a shuttle simulator and a supervision-center-plus-comm-network simulator).
  • Reuse across verification goals: The same scenarios can be used for developer testing, regression testing, residual risk estimation etc.
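
To illustrate two of these ideas (mixing and aspect-oriented extension), here is a small Python stand-in. It is deliberately not M-SDL syntax: the Scenario class, its methods and the parameter ranges are assumptions invented for this sketch, and only the scenario names and the relative_speed constraint come from the text above.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Params = Dict[str, float]
Constraint = Callable[[Params], bool]

@dataclass(frozen=True)
class Scenario:
    name: str
    ranges: Dict[str, Tuple[float, float]]  # parameter -> (min, max)
    constraints: Tuple[Constraint, ...] = ()

    def extend(self, constraint: Constraint) -> "Scenario":
        """Return a copy with one more constraint; the original is untouched."""
        return Scenario(self.name, self.ranges, self.constraints + (constraint,))

    def mix(self, other: "Scenario") -> "Scenario":
        """Combine two scenarios: union of parameters, all constraints apply."""
        return Scenario(
            f"{self.name}+{other.name}",
            {**self.ranges, **other.ranges},
            self.constraints + other.constraints,
        )

    def sample(self, rng: random.Random, max_tries: int = 1000) -> Params:
        """Draw one concrete run by rejection sampling against the constraints."""
        for _ in range(max_tries):
            params = {k: rng.uniform(lo, hi) for k, (lo, hi) in self.ranges.items()}
            if all(check(params) for check in self.constraints):
                return params
        raise ValueError(f"no run satisfying the constraints of {self.name}")

# Hypothetical parameter ranges, chosen only for the example.
dangerous_cut_in = Scenario("dangerous_cut_in", {"relative_speed_kph": (0.0, 40.0)})
braking_system_fault = Scenario("braking_system_fault", {"fault_time_s": (0.0, 10.0)})

# Extension: add a constraint without editing dangerous_cut_in itself.
fast_cut_in = dangerous_cut_in.extend(lambda p: p["relative_speed_kph"] > 10.0)
# Mixing: both scenarios in one run.
mixed = fast_cut_in.mix(braking_system_fault)

run = mixed.sample(random.Random(0))
print(mixed.name, run)
```

Rejection sampling is used here only because it is short; a real tool would more likely use a constraint solver so that narrow constraints do not make generation slow.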

If you have time, perhaps scan the FAQ (above) to see how well the language meets those goals (for your areas of interest). Comments are very welcome – either write them below, or send them to

BTW, we are planning a webinar about these topics in a few weeks – watch this space.


I’d like to thank Thomas (Blake) French, Yaron Kashai, Rahul Razdan, Ziv Binyamini, Gil Amid and Amiram Yehudai for commenting on earlier versions of this post.


2 thoughts on “Autonomy markets and their potential bugs”

  1. Good to see you folks making strides on M-SDL. Can you add if there are any youtube video links? Also – How are you folks doing the evaluation process / framework ?
