Here is another one of those multi-topic summaries:
Dipping my toes into autonomous robot verification
As I discussed here, I am exploring working with Kerstin Eder et al. (of Bristol U) regarding autonomous robot verification.
So I started looking into this. I will not take you into the gory details, gentle reader (especially since I don’t really know them all) ;-), but here are the basics:
Sounds like a big chunk of the robotics world uses the ROS framework for telling the robot what to do. You write your robotics tasks (called “nodes” in ROS-speak), usually in C++ or Python, and then use these nodes (running on top of ROS) to control the robot. This can be either the actual robot, or its simulation (using Gazebo). Gazebo even has a visual component, so you see it all moving on your Linux screen.
So suppose you are trying to verify that a robot hands something correctly to a human, as described in this paper (pdf) from the Bristol gang. You may want to use the following three main execution platforms, ideally using the same verification infrastructure:
1. Fully-simulated: Simulate both the robot and (somehow) the human using Gazebo
2. Hardware In the Loop (HIL): Use the actual, physical robot, and somehow fake the human
3. Person In the Loop (PIL): Have an actual person interact with the actual robot (this also implies HIL).
Clearly (3) is the most realistic and most expensive (people don’t like running as part of an overnight regression). (1) is the most flexible, and also the only one which could be really repeatable.
Alas, it is not. Turns out that currently, runs executed with ROS+Gazebo are not repeatable, mainly because nodes are run as separate Linux processes and thus can shift in time relative to each other, at the whim of the Linux scheduler and HW events.
One might argue (wrongly) that non-repeatability is actually a feature, because it helps you get more testing out of a single test script (because it runs differently every time). But in reality, this is a terrible idea in any kind of serious verification.
This is because without repeatability (also called reproducibility), it is extremely hard:
- To debug (a timing-related bug may appear only once-per-many-runs)
- To check that a new release did not break something (an apparently-new bug may have existed in the old version too)
- To check that a new release indeed fixed some bug (an apparently clean run may be the result of slightly different timing)
- To reliably measure coverage (run it again and you’ll get different coverage)
- To demonstrate things with confidence (oops – let me run this again – hopefully it will show the effect I mean)
- And so on.
So it is much, much, much better to carefully remove all causes of non-repeatability, and then re-introduce the ability to run a test multiple times in different ways, by using a (controllable) randomization governed by a runtime seed.
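To make the seed idea concrete, here is a minimal sketch in Python. It is not taken from ROS, Gazebo, or the Bristol work: the function name and the two stimulus parameters (`human_delay_s`, `grip_force_n`) are hypothetical, chosen only to echo the handover example. The point is just that all randomness flows from one explicit seed, so the same seed replays the same test and a new seed gives a new variation.

```python
import random


def generate_handover_stimulus(seed, steps=5):
    """Return a pseudo-random stimulus sequence fully determined by `seed`.

    The parameter names are hypothetical; they stand in for whatever
    a real handover test would randomize.
    """
    # A private generator avoids hidden global state: nothing else in
    # the process can perturb the sequence this test sees.
    rng = random.Random(seed)
    stimulus = []
    for _ in range(steps):
        stimulus.append({
            "human_delay_s": round(rng.uniform(0.0, 2.0), 3),
            "grip_force_n": round(rng.uniform(1.0, 10.0), 3),
        })
    return stimulus


if __name__ == "__main__":
    first = generate_handover_stimulus(seed=42)
    replay = generate_handover_stimulus(seed=42)
    other = generate_handover_stimulus(seed=43)
    print("same seed replays identically:", first == replay)
    print("different seed gives a different run:", first != other)
```

With this structure, a failing overnight run only needs to log its seed to be reproducible the next morning, assuming everything else in the environment has already been made repeatable.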
Now, some execution platforms (e.g. most Hardware-In-the-Loop setups, especially if they involve the moving of actual mechanical parts) are inherently non-repeatable, but that just makes the case for repeatability in the simulator stronger – at least we can use the simulator as much as possible to accomplish the above tasks.
So hopefully this issue will be fixed. There is actually a ROS 2 coming up, and this page says, tantalizingly: “In ROS 2 more granular execution models are available (e.g. across multiple nodes) and custom executors can be implemented easily”. So perhaps somebody will implement a repeatable ROS+Gazebo executor.
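For what "repeatable executor" could mean in principle, here is a toy sketch in plain Python. It does not use any ROS or Gazebo API, and the `Node` class and `run_lockstep` function are made up for illustration. The assumption is simply that all nodes are stepped by one loop, in a fixed order, against simulated time, instead of running as separate processes at the mercy of the scheduler.

```python
class Node:
    """A hypothetical stand-in for a robotics node (not the ROS class)."""

    def __init__(self, name, period):
        self.name = name
        self.period = period  # run once every `period` simulated ticks

    def step(self, sim_time, trace):
        # A real node would read sensors and publish commands here;
        # this one only records that it ran, and when.
        trace.append((sim_time, self.name))


def run_lockstep(nodes, ticks):
    """Step all nodes in a fixed order against simulated time."""
    trace = []
    for sim_time in range(ticks):
        for node in nodes:  # fixed iteration order, so no scheduling races
            if sim_time % node.period == 0:
                node.step(sim_time, trace)
    return trace


if __name__ == "__main__":
    def make_nodes():
        return [Node("planner", 2), Node("gripper", 1)]

    run_a = run_lockstep(make_nodes(), ticks=4)
    run_b = run_lockstep(make_nodes(), ticks=4)
    print(run_a)
    print("runs identical:", run_a == run_b)
```

The price of this kind of design is that it no longer runs in real time by default, which matters less for a fully-simulated regression than for the hardware setups.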
System simulations conference
About a month ago I attended the third Israeli system simulation conference (in Hebrew). It was sort-of interesting (some of it was a repeat of the previous two conferences, which I also attended).
One thing I noticed is that these people (mainly defense folks) use simulations for many things, with verification being just one (just somewhat-important) item on the list. The main uses seem to be training and operational research. They also (interestingly) use simulation to try and debug real-life observed bugs (e.g. tweaking simulation parameters, trying to reconstruct a case where a missile went astray in real life).
Here are some other comments about what these guys seem to do:
- The simulation setups are really complex. They often involve multiple, heterogeneous simulation machines, connected via a framework called HLA.
- These setups often put together real and simulated people and equipment.
- They almost always run in real time (second-per-second). To allow for that, models often have to work at multiple resolutions (going to a lower resolution to work faster).
- They often have complex HIL setups: e.g. communications equipment will first be simulated, then run on the actual electronics cards but without the actual antennas (using recorded radio data), then with real antennas on a bench (with synthetic radio input), and finally, say, in the real aircraft with real inputs.
AI dangers strike again
I have talked here (offline from the main blog) about this somewhat futuristic topic of the danger of smart AI, and about verifying the solution (once we have one).
Well, about a month ago I gave a presentation about this topic at the Tel-Aviv LessWrong forum (in the Tel-Aviv Google building), which was followed by a lively discussion. Here are the presentation and the subsequent online discussion. There is even a video (two hours, Hebrew only). Sounds like there will be follow-up – stay tuned.
I’d like to thank Amiram Yehudai and Sandeep Desai for reviewing earlier versions of this post.