FV has much better PR than CDV

The topic of Formal Verification (FV) and how it relates to Coverage Driven Verification (CDV) came up in some previous posts and in a comment thread. So allow me to briefly discuss it – I promise to get back to the main course soon.

Some definitions

Before I continue, let me define what I mean by CDV (expect a more comprehensive post “soon”). I am using CDV here loosely, to mean a verification technique which has all (or most) of the following attributes:

It consists of performing many runs with lots of randomness in them
- Normally using a constraint solver to produce random values which obey constraints
It aims at going to corner cases of the design
- And thus random distributions do not always follow expected real-life distributions
It is guided by a test plan or coverage plan, and involves collecting of functional coverage
- Though paradoxically, the part of CDV most often skipped is coverage collection and analysis
It runs autonomously, day and night
- Thus no “human in the loop”, and with automatic checking and recording of failures and coverage
It normally runs in simulation
- But can run with HW-in-the-loop, various kinds of emulation etc.
Recorded failures can be reconstructed (for debugging) using the random seed
- Though some environments are just “mostly repeatable”, a funny (but useful) concept reminding me of “mostly dead”

This is in stark contrast to FV: In model checking (the main kind of FV for our purpose here), one checks statically (without running) that some set of assertions always holds for some model (where the model can be an abstract description of the DUT written in some notation, or actual pieces of the DUT like C or Verilog code, or a combination of these).

It is also important to note that while I have done CDV forever, I am not an expert on FV.

My main point, finally

If you are intimately familiar with hardware verification, then you know that many more HW bugs (taking together bugs found at the module, subsystem and chip level) are found via CDV (as defined above) than are found via FV.

However, if you are coming from outside that pressure cooker, then quite possibly you never heard of CDV, and you are sure FV is king. This is because FV has much better public relations, especially in the academic world, where it is considered more, well, formal, and hence the only “verification” worthy of the name. In fact, many academic people call non-formal verification “testing”, with some disdain.

Now don’t get me wrong here:

FV is really important, in fact indispensable, for certain things
Some areas of hardware verification (e.g. floating point engines) are verified almost exclusively using FV
Where it can be applied, FV will find every place where an assertion does not hold
FV capacity is growing all the time (but so are the designs-to-be-verified)
The terminology / methodology that has grown up around FV is helpful for all verification
Some of my best friends are formal-heads (really)

However, usually the price-performance (in Important-Bugs-found-per-Person-Hour, alias IB/PH) of doing CDV is simply much, much better, especially for large, complex, fuzzy, exploratory areas.

So: I am certainly not advocating less FV research, or anything like that. In fact, most of the advances in verification coming out of academia are made by “formal” people. But one reason for that (and for the fact that FV has come as far as it has) is that it has such a bunch of smart, dedicated people hammering on it.

All I am saying is that I can’t help but imagine how much good these folks would do if more of them went into the relatively-neglected CDV field, or “dynamic, simulation-based verification” or whatever you want to call it. There are certainly a lot of excellent people there as well, but the FV people outnumber them by a huge margin.

And if you think that CDV, unlike FV, has no interesting problems to research, than you are (and I say this with the greatest respect) stark raving mad. For instance, just look at all the big, conceptual problems in applying CDV to big-system verification, the kind of problems we’ll be discussing in this blog.

There, I said it. I feel much better now.

A few more comments

Obviously, one needs a mix of verification techniques – for realistic projects, one needs FV, CDV and a whole bunch of other things. Also, there is a lot of good work done in combining these techniques (e.g. look at Concolic testing).

Some areas are, perhaps, better suited for FV. For instance, one could argue that security is one, because in security (unlike functional verification) once a bug (vulnerability) is known, the bad guys will jump on it, so the concept of “a very rare bug” does not exist, so techniques (like CDV) which (somewhat) rely on the frequency of bug occurrence are (somewhat) less useful.

But even in security, most of the security holes which are found “by machine” (by either the good guys or the bad guys) are found via a technique called fuzzing. And “fuzzing” is roughly “the name security people gave to CDV”. For instance, check out the excellent, free AFL.

Another area where formal verification may be more appropriate is that futuristic / all-important / slightly-weird area of friendly-AI verification (I actually wrote a whole post about that area, but then decided it would be a distraction for the normal readership of this blog. You can still read it here if you’d like to be distracted).

FV is used extensively in safety-critical domains such as railroad switching. And a lot of safety critical software is automatically generated from formal models that have been model checked (Lustre etc). And like I said, this is all good: I don’t want less FV research, just more CDV research.

One final thing to remember about formal techniques: The assertions and models used in FV are also code (or textual / graphical notations – same thing) and as such also have bugs in them. Similarly when formally deriving SW from formal models: When somebody says “X is correct by construction”, all they mean is “X was automatically generated from Y, which is hopefully shorter and easier to understand”. And this is sometimes a vain hope: Quite often those formal notations are indeed shorter, but not easier to understand (for the intended audience). So they will have bugs in them, and you’d better do CDV-style verification of the result.

Notes

I’d like to thank Laurent Fournier and Yaron Kashai for reading earlier versions of this post.

Edited 15-July to add: See my comments to this post, containing some feedback I got which sheds a more positive light on current FV practice.

8 thoughts on “FV has much better PR than CDV”

When I wrote the above post, I assumed it could be somewhat controversial,
and that there was even a chance (gasp) that I would be proved wrong on some
things.

I was not expecting answers to come so fast, though. I had two phone calls
this morning from people very familiar with the current state of the
art in HW verification.

One told me that while formal indeed still finds only about 10% of the bugs, to
his knowledge the price-performance per bug is actually
_better_ in formal, mainly because the reported bugs are easier to
reconstruct and more of them end up as “real bugs”. It is hard to grow the
percentage beyond 10% because not everybody can use those tools.

Both reminded me that formal technology has also made great strides in
separating verification aspects. So now they can formally verify some
full-chip aspects (like connectivity security security) very efficiently.

BTW, that latter topic (of when and how one can separate verification aspects)
is hugely important in big-system verification. Most of the bugs which kill
people are actually spec bugs (not implementation bugs), and quite a few of
those result from (wrongly) assuming that two seemingly-orthogonal aspects will
never interact.

tudortimi says:

July 15, 2015 at 1:34 pm

I think we’re on the right way of getting rid of this FV > (or <) CDV mentality. Both have their own uses and they can complement each other nicely. Choosing what approach to use for a certain task shouldn't be a Coca-Cola vs. Pepsi discussion.

Loading...

Reply

“However, usually the price-performance (in Important-Bugs-found-per-Person-Hour, alias IB/PH) of doing CDV is simply much, much better, especially for large, complex, fuzzy, exploratory areas.”

If you want good performance, measured as IB/PH, then have very very buggy designs.

If your objective is to have high assurance of the absence of whole classes of bugs, then FV wins every time.

….. I see that this forum accepts html tags. I hope that you screen for alert (“danger!”) and similar vulnerabilities.

Hi Martyn

I agree with you that very buggy designs will get you excellent bug-finding
performance ;-).

The point I was trying to make, though (perhaps not very clearly) is about
verification experience in the chip-design industry. Bugs in that industry are
considered pretty dangerous to a project (because you can’t patch them). So
50% or more (depending on how you count) of those very expensive engineers
are doing verification.

So chip-design verification practices are by now fairly “industrial-grade”:
there are a lot of chip companies using a lot of FV and CDV tools day in
and day out. And those tools come from quite a few companies (or are inhouse
tools).

And it is this professional context that CDV finds many more bugs than FV
(though as I said in my previous comment, where FV _can_ work, it is
pretty efficient).

Just what this implies for bigger, fuzzier systems like AVs and UAVs etc.
is unclear. My intuition (and it is just that) is that FV will not necessarily
play a much bigger role there.

Yoav

I think there is a confusion here which I want to emphasis. When one says FV he means to different things:
1. running on a relatively small model (module level testing)
2. using a formal engine tool

I think that part of the reason FV give such a good ROI (Return on Investment) is due to the first item above, regardless on the verification method used.
The smaller the block you test, the better ROI you get. This is true for both first bugs (and ease of debug) and for finding the last bugs.

I think the correct methodology is finding the bugs in the right level, as low level as possible:
1. in the module level find all bugs
2. in the full chip find only few “integration bugs” that couldn’t have been found in the module level

I call this method “zero bug delivery”, each model/verification activity is responsible for delivering zero bugs to the next level/activity.
Due to management and ROI, sometimes we may decide to skip some of the testing in one of the levels/features. This may be due to low risk for bug, difficulties in verification at this level, importance of the feature (and the expected bugs), etc.

p.s: I think the reason why FV is loved by the academic is that it is measurable. You can do a competition of proving an FV property, it is much harder in CDV

Thanks Hillel

I think I mostly agree.

Kerstin Eder (of Bristol Univ.) suggested a related reason for why CDV is less popular with academics:

————-
“The “problem” with CDV and (lack of) uptake in academia from my experience is that, for it to be interesting to a researcher, you need a non-trivial system to verify, one that you understand (so you can have expectations of what the results may be, introduce errors, define cross product coverage etc) and a testbench and associated automation (tools).

This takes time and effort to learn and develop, i.e. it takes a long time for you to get to the starting point. There is no point doing this for systems that don’t have interesting / complex behavior – for these you don’t need CDV in the first place.

Also, remember, we are rewarded for papers and citations, and the time/effort to get to the starting point of CDV is time/effort lost in paper writing.”
————-

Yoav

Very interesting blog, thanks Yoav. I think the statement
—-
If you are intimately familiar with hardware verification, then you know that many more HW bugs (taking together bugs found at the module, subsystem and chip level) are found via CDV (as defined above) than are found via FV.
—-
is very misleading. I would agree that most bugs at chip-level are found by simulation and emulation, not formal. However, in many system-level tests there is very little left to random. Portable stimulus may change that, I think it should, but we are still at an early stage.

So let’s not forget that CDV and simulation are not the same thing. A lot of simulation (the majority perhaps, depending of what metrics you pick), is still coming from directed tests. Those have little to do with CDV.

Yoav Hollander says:

July 15, 2015 at 7:53 am

When I wrote the above post, I assumed it could be somewhat controversial,
and that there was even a chance (gasp) that I would be proved wrong on some
things.

I was not expecting answers to come so fast, though. I had two phone calls
this morning from people very familiar with the current state of the
art in HW verification.

One told me that while formal indeed still finds only about 10% of the bugs, to
his knowledge the price-performance per bug is actually
_better_ in formal, mainly because the reported bugs are easier to
reconstruct and more of them end up as “real bugs”. It is hard to grow the
percentage beyond 10% because not everybody can use those tools.

Both reminded me that formal technology has also made great strides in
separating verification aspects. So now they can formally verify some
full-chip aspects (like connectivity security security) very efficiently.

BTW, that latter topic (of when and how one can separate verification aspects)
is hugely important in big-system verification. Most of the bugs which kill
people are actually spec bugs (not implementation bugs), and quite a few of
those result from (wrongly) assuming that two seemingly-orthogonal aspects will
never interact.

Loading...

1. tudortimi says:
  
  July 15, 2015 at 1:34 pm
  
  I think we’re on the right way of getting rid of this FV > (or <) CDV mentality. Both have their own uses and they can complement each other nicely. Choosing what approach to use for a certain task shouldn't be a Coca-Cola vs. Pepsi discussion.
  
  Loading...
  
Martyn Thomas says:

July 16, 2015 at 9:25 am

“However, usually the price-performance (in Important-Bugs-found-per-Person-Hour, alias IB/PH) of doing CDV is simply much, much better, especially for large, complex, fuzzy, exploratory areas.”

If you want good performance, measured as IB/PH, then have very very buggy designs.

If your objective is to have high assurance of the absence of whole classes of bugs, then FV wins every time.

Loading...

Martyn Thomas says:

July 16, 2015 at 9:31 am

….. I see that this forum accepts html tags. I hope that you screen for alert (“danger!”) and similar vulnerabilities.

Loading...

Yoav Hollander says:

July 16, 2015 at 7:54 pm

Hi Martyn

I agree with you that very buggy designs will get you excellent bug-finding
performance ;-).

The point I was trying to make, though (perhaps not very clearly) is about
verification experience in the chip-design industry. Bugs in that industry are
considered pretty dangerous to a project (because you can’t patch them). So
50% or more (depending on how you count) of those very expensive engineers
are doing verification.

So chip-design verification practices are by now fairly “industrial-grade”:
there are a lot of chip companies using a lot of FV and CDV tools day in
and day out. And those tools come from quite a few companies (or are inhouse
tools).

And it is this professional context that CDV finds many more bugs than FV
(though as I said in my previous comment, where FV _can_ work, it is
pretty efficient).

Just what this implies for bigger, fuzzier systems like AVs and UAVs etc.
is unclear. My intuition (and it is just that) is that FV will not necessarily
play a much bigger role there.

Yoav

Loading...

Hillel Chapman says:

July 19, 2015 at 6:03 am

I think there is a confusion here which I want to emphasis. When one says FV he means to different things:
1. running on a relatively small model (module level testing)
2. using a formal engine tool

I think that part of the reason FV give such a good ROI (Return on Investment) is due to the first item above, regardless on the verification method used.
The smaller the block you test, the better ROI you get. This is true for both first bugs (and ease of debug) and for finding the last bugs.

I think the correct methodology is finding the bugs in the right level, as low level as possible:
1. in the module level find all bugs
2. in the full chip find only few “integration bugs” that couldn’t have been found in the module level

I call this method “zero bug delivery”, each model/verification activity is responsible for delivering zero bugs to the next level/activity.
Due to management and ROI, sometimes we may decide to skip some of the testing in one of the levels/features. This may be due to low risk for bug, difficulties in verification at this level, importance of the feature (and the expected bugs), etc.

p.s: I think the reason why FV is loved by the academic is that it is measurable. You can do a competition of proving an FV property, it is much harder in CDV

Loading...

Yoav Hollander says:

July 20, 2015 at 4:52 am

Thanks Hillel

I think I mostly agree.

Kerstin Eder (of Bristol Univ.) suggested a related reason for why CDV is less popular with academics:

————-
“The “problem” with CDV and (lack of) uptake in academia from my experience is that, for it to be interesting to a researcher, you need a non-trivial system to verify, one that you understand (so you can have expectations of what the results may be, introduce errors, define cross product coverage etc) and a testbench and associated automation (tools).

This takes time and effort to learn and develop, i.e. it takes a long time for you to get to the starting point. There is no point doing this for systems that don’t have interesting / complex behavior – for these you don’t need CDV in the first place.

Also, remember, we are rewarded for papers and citations, and the time/effort to get to the starting point of CDV is time/effort lost in paper writing.”
————-

Yoav

Loading...

SERGIO MARCHESE says:

January 25, 2019 at 6:16 am

Very interesting blog, thanks Yoav. I think the statement
—-
If you are intimately familiar with hardware verification, then you know that many more HW bugs (taking together bugs found at the module, subsystem and chip level) are found via CDV (as defined above) than are found via FV.
—-
is very misleading. I would agree that most bugs at chip-level are found by simulation and emulation, not formal. However, in many system-level tests there is very little left to random. Portable stimulus may change that, I think it should, but we are still at an early stage.

So let’s not forget that CDV and simulation are not the same thing. A lot of simulation (the majority perhaps, depending of what metrics you pick), is still coming from directed tests. Those have little to do with CDV.

Loading...

	https://otomotif71.w… on Stuttgart impressions: Scenari…
	Daan van der Keur on About “The coming AI hackers”…
	Mariah Jackson on M-SDL, the autonomous vehicles…
	sakhokhar on Machine Learning for Coverage…
	hongseoklee on How to write AV scenarios (and…
	Erik Panu on GPT-3 and verification
	Yoav Hollander on Autonomy markets and their pot…
	Nakkeeran Kumaraswam… on Autonomy markets and their pot…
	Aman on DeepXplore and new ideas for v…

The Foretellix CTO Blog – AI safety

Now focusing on AI safety (autonomy-related posts go to the company blog)