HVC’2015 trip report

I attended HVC’2015 last week. Here is a partial, very subjective trip report.

HVC’2015 was good. I attended the tutorial day and the first two conference days, though I was absent from most of the hard-core formal-verification sessions. Here are the main things I learned:

The Bristol folks

Kerstin Eder, Dejanira Araiza-Illan and David Western of Bristol U gave an interesting talk describing their work (pdf), which is all about applying coverage-driven verification (CDV) to the verification of autonomous robots.

One of the reactions I got (from Matan Vax of Cadence) was that it demonstrates (again) why CDV alone is not enough, and why you need some sort of Model-Based Testing (MBT) to reach the corners. Indeed, David’s presentation demonstrated the classical “CDV+MBT” style:

  • Decide what it means to verify this DUT (Design Under Test)
  • Create a CDV environment for it
  • Run many tests, observe coverage. Oops – some cross-coverage holes are never filled.
  • Create an abstracted model-based environment to generate scenarios which guide you to the holes
  • Take those scenarios and (ideally) run them many times, randomizing the other parameters
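The flow above can be sketched in a few lines of Python. Everything here is invented for illustration (the two coverage knobs, the biased random phase standing in for a real-world coverage gap, the directed phase standing in for the model-based environment) — it shows only the shape of the loop, not any real tool:

```python
import itertools
import random

# Hypothetical cross-coverage model over two DUT parameters.
SPEEDS = ["slow", "medium", "fast"]
OBSTACLES = ["none", "static", "moving"]

def run_test(speed, obstacle):
    """Stand-in for driving the DUT with one scenario; returns the
    cross-coverage bucket that this run hit."""
    return (speed, obstacle)

# Plain CDV phase: randomize heavily, run many tests, collect coverage.
coverage = set()
for _ in range(200):
    # The bias below (never picking "fast") mimics a randomized
    # environment that simply never wanders into some corner.
    speed = random.choice(SPEEDS[:-1])
    obstacle = random.choice(OBSTACLES)
    coverage.add(run_test(speed, obstacle))

# Observe coverage: some cross-coverage holes are never filled.
all_buckets = set(itertools.product(SPEEDS, OBSTACLES))
holes = all_buckets - coverage

# Model-based phase: generate a directed scenario per hole, then run
# each one many times while randomizing the other parameters.
for speed, obstacle in holes:
    for _ in range(10):
        coverage.add(run_test(speed, obstacle))
```

After the directed phase, the cross-coverage holes left by the purely random phase are filled — which is the whole point of layering MBT on top of CDV.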

I also got to spend some quality time with that Bristol gang: They seem to have some excellent ideas about testing autonomous robots (can’t talk about those yet, sorry), and I am hoping to collaborate with them in the future.

Fuzzers everywhere

There were several presentations about employing fuzzers for finding security vulnerabilities. This was especially interesting to me in the context of my search for that elusive “Universal Verification Framework (UVF)”, since I am trying to find common ground for “normal” verification and security verification.

Patrice Godefroid of Microsoft Research gave a keynote in which he described, among other things, SAGE, an apparently-pretty-successful Microsoft fuzzer which uses dynamic symbolic execution (also called “concolic testing”). Note that SAGE is also sort-of-an-example of the “CDV+MBT” style: randomize heavily, but also use a model-based heuristic (in this case, dynamic symbolic execution) to reach all code blocks. Dynamic symbolic execution does not always work (i.e. for some code blocks it fails to find the recipe to reach them), but with enough luck and randomness that does not matter much.
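As a toy illustration of why the symbolic step helps (this is not SAGE itself, just the underlying idea): random inputs are very unlikely to satisfy a narrow path condition, while a constraint solver can produce a witness for it directly. The `dut` function and the hand-solved constraint below are invented for the example; a real tool would hand the path constraints to a solver such as Z3:

```python
import random

def dut(x):
    """Toy program under test; the nested branch is hard to hit at random."""
    path = []
    if x % 7 == 0:
        path.append("A")
        if x // 7 == 1234:
            path.append("B")  # the "hard" block: one input in 2**32 reaches it
    return path

# Pure random fuzzing: block B is hit with probability ~2**-32 per try,
# so 10,000 random inputs will almost certainly miss it.
covered = set()
for _ in range(10_000):
    covered.update(dut(random.randrange(2**32)))

# "Symbolic" step: from the path constraints (x % 7 == 0, x // 7 == 1234)
# a solver can compute a witness. Here we solve the two constraints by
# hand instead of calling a real solver.
witness = 1234 * 7
covered.update(dut(witness))
```

After the symbolic step, `covered` includes block "B", which the random phase (almost surely) never reached.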

Another presentation about FLINDER-SCA (pdf) was also pretty interesting. The coverage they are trying to fill is not every code block, but rather a set of “alarms” (potential security issues) found via static analysis (essentially, security-guided linting) of the code. They then use slicing and taint analysis to (try to) reach those alarms, and thus check whether the potential vulnerability is indeed a real one.

I talked offline with both presenters about whether these techniques / tools could potentially be used for “normal” (non-security) verification, and both had an intuition that they could.

I also think that AFL (which I have talked about previously) could be useful for “normal” verification. Note that AFL represents the “simpler but faster” end of the spectrum: it instruments your code to track coverage of block-to-block executions (so if block B was executed directly after block A 3 times in a given execution, the “A->B” coverage bucket gets a count of 3). It then mutates the input using 6 mutation operators, and tracks which mutations are best at raising that block-to-block coverage. Thus, without any smart constraint solving or slicing, it is able to do a really good job of covering the input space.

What would it take to use these (different) fuzzing techniques for “normal” verification? This is a topic for another post. Clearly, one would have to go from negative to mostly-positive testing, and from a system which knows (almost) nothing about the functionality of the DUT to one that does.

A side note about re-finding the Heartbleed security bug: this has now become an official Olympic sport, with just about every security tool trying to demonstrate how it could have discovered Heartbleed (which was originally found independently by Google, using visual inspection, and by Codenomicon, using a fuzzer). For instance, the above-mentioned FLINDER article talks about that.

This is not as silly as it sounds: Heartbleed was a really well-concealed bug (which is why it took so long to discover it), and it is interesting to see how various techniques can be used to find it (assuming they are not carefully drawing the bullseye around the already-found arrow). Take a look at “How to prevent the next Heartbleed” by David Wheeler – it is longish and somewhat repetitive, but it is an excellent summary of the various available tools, their overlaps and the holes between them, which is also interesting from a “normal” verification perspective.

My presentation

I gave this presentation, which talked about the various verification tribes and about the possibility of creating a Universal Verification Framework (UVF).

In general, I think it was well-received. Several people commented on my skills as a stand-up comedian, which was gratifying: My previous attempts at performance art have failed (and my son Yuval once remarked, “You could make a lot of money by having people pay you not to sing”).

People indeed seemed to agree that there are those “verification tribes” which don’t talk enough to each other. BTW, during the presentation I asked “does anybody here know what ‘HIL’ means in the context of verification, e.g. in automotive or Mil/Aero?” Just 3 people raised their hands, out of a big roomful of fairly savvy verification people. Note that if you talked to an automotive verification engineer and said you didn’t know what Hardware-In-the-Loop was, he would probably start talking very slowly, assuming that you were either very new to verification, or just plain dumb.

But the main kind of feedback I was looking for was regarding the feasibility of a UVF, and I got plenty of that. Here are some of the comments I got:

  • Somebody suggested this will never work, because (e.g. in security) there are new needs all the time, which are incompatible with everything else. Two examples are:
    • Security now involves a lot of self-modifying code. Model checkers can’t handle that.
    • Security now involves ROP, which is weird and unique and does not have parallels in “normal” verification.
  • My (not so confident) answer to the above was “Yes, security indeed seems to spawn new concepts and techniques at an alarming rate, so it will always need its own unique things, but still much can be shared”.
  • Regarding my proposal, which essentially amounts to a GVL (Generic Verification Language, to mirror Hardware Verification Language), Moshe Vardi said “But the world can’t even unify on any single programming language like C++/Java/Python, so what chance does a GVL have?”. To which my (excellent, I thought) answer was: “Well, C++ etc. are all good general programming languages you can choose from. My issue is that we currently have exactly zero GVLs to choose from”.
  • Several people asked about the “right” notation for modeling the DUT/VE/scenarios in a UVF. I answered that (1) I am not sure, and (2) there should be multiple notations – certainly we can’t rely just (or mainly) on UML/SysML.
  • Within the general topic of notations, the topic of “how to define scenarios” came up multiple times (more on that topic in a subsequent post):
    • Several people talked about the need to make all scenarios “soft”, in the sense that they can be influenced (or even partially overridden) from the outside
    • People also said that if I really believe in this “CDV+MBT” style, then scenario definition must be very flexible, since there are so many kinds of MBT.

And that’s all for today.


I’d like to thank Yaron Kashai and Amiram Yehudai for reading earlier drafts of this post.
