Program Slicing
Caesar supports program slicing on the HeyVL intermediate verification language. It is the basis for Caesar's localized error reporting (e.g. "this loop invariant may not be inductive"), but we support more generalized slicing criteria as well.
Caesar has support for slicing for errors, which aims at a sliced program that fails with a counter-example. Assertion slicing, whose purpose is to localize error messages, is a kind of slicing for errors. Slicing for correctness aims at finding a sliced program that verifies. Assumption slicing is one kind of slicing for correctness.
Fun fact: Caesar's slicing feature was developed with the project name Brutus — alluding to Caesar's adopted son, who also famously participated in a very different way of slicing...
Overview
Assertion slicing is enabled by default on every verification query to help interpret a counterexample.
It does this by finding a minimal set of assert-like statements that must be in the code to obtain a verification error. Caesar's built-in proof rules have annotations that provide interpretations for every assert.
For example, there is an assertion in the encoding of induction whose presence in a minimal counterexample slice means that the invariant might not be inductive.
Assumption slicing must be enabled with the --slice-verify command-line flag. It can be used to find a minimal set of assume-like statements with which the program verifies. This may help with optimizing or understanding a HeyVL proof.
For example, the encoding of induction inside a verifying HeyVL program contains an assume. If slicing shows that this assume can be removed, then the while loop could just as well be a simple if statement for the purposes of verification. This could indicate a problem in the specification to be proved if the user knows that the while loop should be necessary.
Slicing for errors generalizes assertion slicing to statements that are not necessarily assert-like, and slicing for correctness does the same for statements that are not necessarily assume-like.
We discuss these generalized applications in the respective sections General Slicing for Errors and General Slicing for Correctness.
At a very high level, Caesar's program slicing works by trying different combinations of removing candidate statements from the slice selection. Our theory of slicing for HeyVL formalizes the ideas of assert-like and assume-like statements via the definitions of discordant and concordant statements, and specifies soundness of slicing using these terms. Caesar's annotations for slicing can be used to add slicing information, such as error messages for assertions, to statements.
In the last section of this document, we describe implementation details of the slicing algorithm in Caesar.
Interestingly, Caesar's slicing is in essence not very different from slicing-like approaches in classical deductive verifiers. We just generalized the ideas to the quantitative setting of our intermediate verification language HeyVL, but the ideas are equally applicable to the qualitative setting.
Since our approach generalizes the qualitative setting, and to simplify the presentation here, we will mostly give qualitative examples in the following to illustrate the ideas. But keep in mind that everything works in the same way for quantitative verification tasks. To our knowledge, Caesar's implementation of slicing is the first that combines slicing for errors and slicing for correctness on an IVL in the quantitative setting.
Our slicing approach is still a work-in-progress which we hope to present in an upcoming academic publication.
The following flowchart illustrates Caesar's slicing algorithm.
Slicing for Errors
Slicing for errors takes a HeyVL program and tries to remove as many statements from the slice selection as possible such that the resulting program fails verification with a counter-example.
Assertion Slicing
Assertion slicing is program slicing with the specific purpose of interpreting counter-examples to verification from the SMT solver. The slice selection consists of all assert-like statements (cf. A Theory of Slicing for HeyVL for details).
Consider the following HeyVL proc which does not verify:
proc two_asserts(x: UInt) -> ()
{
    assert ?(x >= 0)
    assert ?(x >= 1)
}
Clearly, the first assertion is not a problem since all UInt values are non-negative. The second one fails when the input is x = 0. The SMT solver will indeed give us that counter-example, but we still need to associate this counter-example with a specific assertion. While we can simply try to evaluate all assertions in the state x = 0 for this example, this becomes impossible when we do not know what the exact state will be at a later statement.
This can happen after nondeterministic and probabilistic choices, or after unbounded loops.
Caesar instead looks at different counterfactuals for the current verification problem.
- Does a verification failure still occur with the first assert removed? Yes.
- Does a verification failure still occur with the second assert removed? No.
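Conceptually, these two counterfactuals correspond to the following sliced variants of the program (a sketch for illustration only; Caesar constructs and checks these variants internally rather than as separate procs):

// Counterfactual 1: first assert removed. A counter-example (x = 0) remains.
proc two_asserts_without_first(x: UInt) -> ()
{
    assert ?(x >= 1)
}

// Counterfactual 2: second assert removed. The program verifies.
proc two_asserts_without_second(x: UInt) -> ()
{
    assert ?(x >= 0)
}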
From this we can conclude that the second assert is at fault and responsible for the verification failure. In general, Caesar will try to find a minimal set of assertions that will lead to a verification error. The result is an error message like this:
program.heyvl::two_asserts: Counter-example to verification found!
input variables:
x: 0 (program.heyvl:1:12)
the pre-quantity evaluated to:
0
program slice:
❌ assertion might not hold (program.heyvl:4:5)
This message's "program slice" section points to the second assertion in line 4.
Note that the minimal sets of assertions from counter-examples are not necessarily unique! There might be even more verification failures after the assertions from slicing are removed. The program slice from slicing for errors only constitutes a set from which no assertion can be removed while still getting an error (hence, minimal).
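As an illustration of this non-uniqueness (a sketch with a made-up proc name), both the set containing only the first assertion and the set containing only the second assertion are minimal error slices of the following proc, since each assertion alone already fails for the input x = 0:

proc non_unique(x: UInt) -> ()
{
    assert ?(x >= 1)
    assert ?(x >= 2)
}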
In theory, we could also look at assertions which are not present in any minimal slice for errors to determine assertions which are never a problem. However, we have not found this useful and have not looked at this further.
General Slicing for Errors
Only trying to remove assert-like statements may be too restrictive for certain applications.
Consider the following simple example:
proc assign_error() -> (x: UInt) {
    x = 2
    x = x * 2
    x = x * 2
    x = x * 2
    assert ?(x < 8)
}
This example will fail to verify since x = 16 holds at the assert.
Caesar will also report the assertion as the culprit, but which assignments are problematic?
We can annotate the assignments to instruct Caesar to also try removing them while still obtaining a counterexample (cf. the slicing annotations documentation).
proc assign_error() -> (x: UInt) {
    x = 2
    @slice_error {
        @error_msg("first needed") x = x * 2
        @error_msg("second needed") x = x * 2
        @error_msg("third needed") x = x * 2
    }
    assert ?(x < 8)
}
The result is something like this:
program.heyvl::assign_error: Counter-example to verification found!
the pre-quantity evaluated to:
0
program slice:
❌ first needed (program.heyvl:4:36)
❌ third needed (program.heyvl:6:36)
❌ assertion might not hold (program.heyvl:8:5)
Here the slice tells us that two of the three doublings (together with the assertion) already suffice for the verification failure: with two doublings, x = 8, which still violates x < 8, whereas a single doubling would make the assertion hold. While this example is somewhat silly, one can imagine a more complicated example where removing assignments helps to narrow down the problem.
You may ask: why do we not try to slice assignments by default? The answer is that assignments are not discordant according to our theory of slicing for HeyVL, i.e. removing them does not allow us to conclude anything about the original program directly. In general, if a HeyVL program with a removed assignment has a counterexample, this might imply that the original program also has a counterexample (as is the case here), but it does not have to; the original program might even verify. The converse can fail as well: consider a program like x = true; x = !x; assert ?(x), where removing the assignment x = !x yields a verifying program even though the original fails.
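Spelled out as complete procs (with made-up names, purely for illustration), the two variants look like this:

// The original program fails to verify, since x is false at the assert.
proc flip_original() -> (x: Bool) {
    x = true
    x = !x
    assert ?(x)
}

// The same program with the assignment x = !x removed verifies.
proc flip_sliced() -> (x: Bool) {
    x = true
    assert ?(x)
}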
Slicing for Correctness
Slicing for correctness takes a HeyVL program and tries to remove as many statements from the slice selection as possible such that the resulting program verifies.
Assumption Slicing
Assumption slicing selects all assume-like statements as candidates to remove from the program and slices so that the resulting program verifies.
After this process, we can get a minimal set of assumptions which are needed for the program to still verify.
From the set of assumptions which are not included in the minimal necessary slice to verify, we can generate warnings which might indicate a problem to the user.
For example, we might find out that a pre of a proc is not necessary.
Consider the following program:
proc assumes(x: UInt) -> ()
    pre ?(x == 42)
{
    assume 0
    assert ?(x >= 1)
}
Caesar will emit an output similar to this:
program.heyvl::assumes: Verified.
program slice:
🤷 assumption is not necessary (program.heyvl:4:5)
As is the case in assertion slicing, the minimal set of assumptions is not necessarily unique. In the example above, keeping only the pre or keeping only the assume 0 each suffices for verification. See the Solving for Correctness Slices section for the different algorithms that Caesar supports and which guarantees are given about the optimality of slices. The default algorithm is core, which is fast, but does not guarantee a minimal slice. To find minimal slices, we recommend using mus (command-line option --slice-verify-via mus). The smallest slice can be searched for via the sus algorithm.
General Slicing for Correctness
Consider the following example, in which we sample two bits with fair coin flips.
The specification says that the probability that the sampled value satisfies r >= 2 is at least 1/2.
The specification holds, but only one assignment is actually necessary!
proc bits() -> (r: EUReal)
    pre 1/2
    post [r >= 2]
{
    var b0: UInt; var b1: UInt

    @slice_verify() {
        var choice: Bool = flip(0.5)
        if choice { b0 = 0 } else { b0 = 1 }
        choice = flip(0.5)
        if choice { b1 = 0 } else { b1 = 1 }
        r = b0 + 2 * b1
    }
}
Running Caesar with the --slice-verify option on this program, we get the following output:
program.heyvl::bits: Verified.
program slice:
🤷 statement in line 9 is not necessary (program.heyvl:9:21)
🤷 statement in line 9 is not necessary (program.heyvl:9:37)
🤷 statement in line 11 is not necessary (program.heyvl:11:21)
From this we can conclude that all but the b1 = 1 assignment are actually not necessary to satisfy the specification. Indeed, r >= 2 holds whenever the else-branch of the second coin flip assigns b1 = 1, which happens with probability 1/2, regardless of the value of b0.
Analogous to the case in general slicing for errors, we need to explicitly instruct Caesar to try to remove assignments during slicing for correctness using the @slice_verify annotation (cf. slice selection annotations).
If Caesar tried to remove all assignments by default, we would not necessarily gain any knowledge about the original program.
More details can be found in the theory of slicing for HeyVL.
Slicing Annotations
For both slicing for errors and correctness, Caesar has some built-in defaults. This includes a default slice selection and messages for the statements (such as "assertion might not hold"). These defaults can be overwritten by annotations on statements. Internally, Caesar also uses these annotations, e.g. in the generated encodings for proof rules, to attach messages to statements such as the assertion for the inductiveness check of a loop.
The annotations can be added to blocks and compositional statements such as if-statements and will be inherited by all sub-statements.
Slicing Message Annotations
The @error_msg and @success_msg annotations can be used to attach messages in case a statement is sliced for errors or for correctness, respectively.
For example:
proc two_asserts(x: UInt) -> ()
{
    @error_msg("x at least zero does not hold") assert ?(x >= 0)
    @error_msg("x at least one does not hold") assert ?(x >= 1)
}
This will result in the following section in the counter-example:
[...]
program slice:
❌ x at least one does not hold (program.heyvl:4:48)
The @success_msg annotation is used in the same way.
Tip: You can inspect the internally generated slicing message annotations from built-in proof rules by passing the --print-core-procs command-line option to Caesar.
Slicing Selections
By default, Caesar selects assert-like statements ("discordant") for slicing for errors and assume-like statements ("concordant") for slicing for correctness.
The terms discordant and concordant are defined in our theory of slicing for HeyVL.
In addition to these defaults, Caesar has annotations to add other statements to the slice selection:
- @slice_verify: Also slice this statement during slicing for correctness.
- @slice_error: Also slice this statement during slicing for errors.
Note that these annotations will enable slicing for individual sub-statements, not whole blocks.
A Theory of Slicing for HeyVL
When is it "correct" to slice a statement? What does it mean if we know a statement has been removed from the program? Our theory of slicing for HeyVL aims to give clear correctness criteria for slicing.
The theory is based on the verification condition semantics of HeyVL. You can find some information in the statements documentation and all relevant formal background in our OOPSLA '23 paper. We are working on an academic publication about these ideas that goes into more detail.
Slice Effects
At the core of our theory of slicing for HeyVL are slice effects, which categorize each HeyVL statement into one of three categories: concordant, discordant, and ambiguous.
You can think of concordant statements as assumption-like and of discordant statements as assertion-like.
Slice effects depend on the current direction in which we verify.
In a proc, our direction is down (lower bounds), and in a coproc, the direction is up (upper bounds).
Formally, we define concordant and discordant as follows:
- In procs:
  - A statement S is concordant when for all post it holds that post = vc[skip](post) <= vc[S](post).
  - A statement S is discordant when for all post it holds that post = vc[skip](post) >= vc[S](post).
- In coprocs, the definitions are reversed:
  - A statement S is concordant when for all post it holds that post = vc[skip](post) >= vc[S](post).
  - A statement S is discordant when for all post it holds that post = vc[skip](post) <= vc[S](post).
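For example, using the vc semantics from the OOPSLA '23 paper in a proc, vc[assume f](post) = f ==> post, where ==> is the Gödel implication that yields the top element whenever f <= post and yields post otherwise; this is always at least post = vc[skip](post), so assume is concordant. Conversely, vc[assert f](post) is the minimum of f and post and therefore at most post, so assert is discordant. This matches the table below.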
Informally, introducing concordant statements only makes programs verify more, i.e. the vc semantics can only increase in procs (or only decrease in coprocs). This means that if we remove concordant statements and the sliced program verifies, we can re-insert them and the program will still verify. On the other hand, discordant statements make programs verify less, i.e. the vc semantics can only decrease in procs (or only increase in coprocs). That means that if we remove discordant statements and the sliced program fails to verify, re-inserting them cannot help, so the original program will also fail to verify.
Below is a table of HeyVL statements and their slice effects.
It should be pretty intuitive: assume statements are concordant with respect to lower bounds, and assert, havoc, and validate are discordant. When they are used in upper-bound contexts, their effects are reversed. For the co-statements, the situation is also exactly reversed.
| Statement | proc | coproc |
|-----------|------|--------|
| assume | Concordant | Discordant |
| assert | Discordant | Concordant |
| havoc | Discordant | Concordant |
| validate | Discordant | Concordant |

| Statement | proc | coproc |
|-----------|------|--------|
| coassume | Discordant | Concordant |
| coassert | Concordant | Discordant |
| cohavoc | Concordant | Discordant |
| covalidate | Concordant | Discordant |
| tick * | Concordant | Discordant |
tick * | Concordant | Discordant |
Note that Caesar only tries to slice (co)assume and (co)assert statements, but not (co)havoc nor (co)validate.
*: tick/reward statements are currently not sliced by default. This must be enabled with the --slice-ticks command-line option.
Implementation Details
Caesar's implementation of slicing is a two-stage approach. It first does a program transformation to prepare the input program for slicing. Then we use the SMT solver to minimize the number of enabled statements in the slice in the solving for slices stage.
Program Transformation
To prepare a HeyVL program for slicing, Caesar starts with a slice selection of potential statements to slice.
Usually, that's discordant statements and those enabled with explicit slice selection annotations.
If the option --slice-verify is set, then we also transform concordant statements.
For each potentially sliceable statement S_i, we add a new Boolean input variable slice_i to the program. The statement S_i is then (logically) replaced by if slice_i { S_i } else {}.
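As an illustration (a conceptual sketch only; Caesar performs this transformation internally on the level of verification conditions and does not literally rewrite the source file), the two_asserts example from above would conceptually become:

proc two_asserts(x: UInt, slice_1: Bool, slice_2: Bool) -> ()
{
    // slice_1 and slice_2 are fresh Boolean inputs controlled by the solver
    if slice_1 { assert ?(x >= 0) } else {}
    if slice_2 { assert ?(x >= 1) } else {}
}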
In the implementation, we have specialized transformations for almost all kinds of statements to avoid exponential blow-up of the generated verification conditions.
Consider an assumption assume f. If we just wrapped it into an if statement, then we would get verification conditions that contain the post twice:
vc[if slice_1 { assume f } else {}](post) = ite(slice_1, f ==> post, post)
However, we do an equivalent transformation so that the post is not duplicated:
vc[assume ite(slice_1, f, \infty)](post) = (ite(slice_1, f, \infty) ==> post)
Here, \infty is the top element, so the assume has no effect when slice_1 is false.
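An analogous rewriting is possible for assert-like statements: for example, an assert f in the slice selection can be replaced by assert ite(slice_1, f, \infty), whose vc is the minimum of ite(slice_1, f, \infty) and post, which again coincides with the vc of the if-wrapped version without duplicating the post.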
Solving for Slices
After this program transformation, every potentially sliceable statement is associated with a Boolean variable that we can use to turn it on or off. That means we can just set constraints on the number of enabled statements in the SMT solver to query for a new slice. We do not need to re-generate verification conditions, nor do we re-do our optimizations. This allows us to take advantage of the abilities of modern SMT solvers to try a lot of Boolean combinations very quickly.
Solving for Error Slices
When we slice for errors, we can generate a nice exists-exists query that looks for a counter-example to verification with an assignment to the enabled variables. It is of the form
exists slice_1,...,slice_n: exists initial_state: vc[S](\infty) != \infty
After building these new verification queries, we minimize the number of enabled statements by iteratively asking the SMT solver for a counter-example with a smaller number of enabled slice variables. We do a kind of binary search that takes into account the possibility of an "unknown" result from the SMT solver.
Solving for Correctness Slices
Slicing for correctness is a bit more complicated, since we are theoretically asking for an assignment to the slice variables such that for all initial states the program verifies:
exists slice_1,...,slice_n: forall initial_state: vc[S](\infty) == \infty
We have implemented a couple of different algorithms to tackle this problem.
You can switch between them using the --slice-verify-via [METHOD] option.
- Unsat Core (core): Use the result from Z3's unsat core. The result is not minimized, and is therefore often not optimal. But this method is fast. It is the default method when slicing for verification is enabled.
- Minimal Unsat Subset (mus): Uses an algorithm to find a minimal unsatisfiable subset of the slice variables, i.e. one where no variable can be removed without the problem becoming SAT (or unknown).
- Smallest Unsat Subset (sus): Iterates over all minimal unsat subsets to find the globally smallest one. This is usually slow, but it finds the optimal solution.
- Exists-Forall Encoding (exists-forall): The last encoding is a very direct one. We directly translate the problem to SMT via the exists-forall formulation from above and then let Z3 solve the problem on its own. This usually does not work in practice.
Note that the mus and sus implementations continue operation when an "unknown" result is obtained from the SMT solver.
This means that we only solve for optimal solutions modulo unknown.
There might be better solutions that the SMT solver just cannot prove.
As far as we could tell, using e.g. Z3's built-in optimizer for these tasks is often infeasible as it is restricted to a (not particularly well-defined) fragment of input formulas and may return unsound results on other inputs. It also seems to be designed to find actually optimal results, whereas we are also often happy with just a "good" result.