OOPSLA '23 Artifact Evaluation Guide
Welcome to the artifact for our OOPSLA '23 submission "A Deductive Verification Infrastructure for Probabilistic Programs".
Contained within the artifact:
- Our tool Caesar, which parses HeyVL programs and tries to verify them. Caesar constitutes our main implementation contribution and is the focus of this artifact.
- A script to reproduce our benchmarks (Table 2).
- We also include our prototypical tool pgcl2heyvl, which takes pGCL programs with annotations and produces a HeyVL file that encodes the required proof obligations.
- Our full source code is contained within the artifact as well.
Contents
- Getting Started
- Piece by Piece: How Our Artifact Supports the Paper's Claims
- Running Our Benchmarks
- The Caesar Tool and Its Source Code
- pgcl2heyvl: Generating HeyVL Files From pGCL
- A Guide to Custom Examples
- Appendix: Accepted pGCL Syntax by the pglc2heyvl Tool
1. Getting Started
Requirements.
- We use Docker, and provide images for both x86 and ARM architectures.
- 16 GB of RAM, enough disk space for the artifact.
- Our benchmark set should terminate in under 10 minutes.
- Note: We provide an x86 Docker image. On ARM machines, Docker will run it in a virtual machine and will print a warning. In these setups, we have observed a slowdown of about 10x.
Downloading the artifact.
Either download from Zenodo and then run docker image load -i caesar.tar.gz
.
Alternatively, via Github packages:
docker pull ghcr.io/philipp15b/caesar:oopsla23-aec --platform linux/amd64
Entering the artifact environment.
Simply run the caesar
image with Docker.
This will open a bash
shell in the /root/caesar
directory with the caesar
and pgcl2heyvl
commands available.
docker run -it ghcr.io/philipp15b/caesar:oopsla23-aec
The image is based on on Debian Bullseye (slim), so the apt
package manager is available.
The editors vim
and nano
are installed already.
Running the benchmarks. To reproduce our benchmarks (Table 2), execute
fish run-benchmarks.fish
The script will run the list of benchmarks specified in benchmarks.txt
in sequence (usually in < 10min).
After completion, the results will be printed as an ASCII table to the terminal as well to the CSV file benchmark-results.csv
.
Documentation.
- We provide more detailed usage and syntax instructions in our documentation. It is available online and the source code can be found in
docs/src
(Markdown files). - Caesar has Rustdoc documentation, but we do not include the generated files or the Rust compiler in this artifact.
2. Piece by Piece: How Our Artifact Supports the Paper's Claims
Section 5.2 of our paper states our key claims with respect to this artifact.
In this document,
- Section 2.1. explains how to run our benchmarks (Table 2).
- Section 2.3. explains how to automatically generate HeyVL files using our prototypical frontend pgcl2heyvl.
2.1. Running Our Benchmarks
To reproduce our benchmarks (Table 2), execute
fish run-benchmarks.fish
The script will run the list of benchmarks specified in benchmarks.txt
in sequence.
After completion (usually in < 10min), the results will be printed as an ASCII table to the terminal as well to the CSV file benchmark-results.csv
.
Note: To allow reproducing results on slower machines and virtualized environments, we increased the timeout for each benchmark from 10s to 60s.
2.2. The Caesar Tool and Its Source Code
Our tool is available as the caesar
command.
The source code is in the entry directory (/root/caesar
).
In particular, the src
and z3rro
directories are relevant.
Both are documented using Rust doc comments.
Running Caesar directly.
It can be executed with caesar [filename]
where the file contains a HeyVL program.
Our benchmarks can be found in the following directories:
pgcl/examples-heyvl
: This directory contains the 85% of our benchmarks which were automatically generated using pgcl2heyvl.tests
: This directory contains remaining 15% of the benchmarks which are hand-written HeyVL files (cf. Section 5.2 in our paper).
The --timeout [SECONDS]
and --mem [MEGABYTES]
command-line options are useful to set runtime and memory limits.
HeyVL Syntax.
Caesar accepts a modified version of our syntax from the paper.
We refer to our documentation section on the language: It is online, and contained in the artifact in docs/src/heyvl
.
Example HeyVL files in the pgcl/examples
and tests/
directories.
VSCode extension for syntax highlighting.
We have a VSCode extension for HeyVL syntax highlighting.
See vscode-ext/README.md
for installation instructions.
You might need to do this outside of your Docker container, so run docker cp CONTAINER:/root/caesar/vscode-ext $PWD
to copy the directory out of the container and then run the installation.
We recommend using this extension with a VSCode Remote connection to the Docker container when editing HeyVL files for convenience.
2.3. pgcl2heyvl: Generating HeyVL Files From pGCL
This pGCL frontend is available as the pgcl2heyvl
command.
Its source code is provided in the pgcl/pgcl2heyvl
directory as a Poetry package.
Automatically generated benchmarks.
In our paper, we claim that 85% of our benchmarks are automatically generated from pGCL code.
These pGCL examples are located in the pgcl/examples
directory.
The corresponding generated HeyVL files are located in pgcl/examples-heyvl
.
We already generated all these HeyVL files with pgcl2heyvl.
In order to reproduce the process of automatically generating the HeyVL files, run
rm pgcl/examples-heyvl/*.heyvl # delete existing files
fish pgcl/examples-heyvl/generate.fish # generate files
See Section 2.2. for how to run the individual HeyVL files with Caesar.
Details. pgcl2heyvl parses pGCL programs in the syntax accepted by the probably Python package. See the last section of this file for the grammar of pGCL programs.
At the top of every file must be a comment of the style
// ARGS: --encoding ... --pre ... --post ...
according to the CLI documentation of the pgcl2heyvl tool (run pgcl2heyvl --help
).
3. A Guide To Custom Examples
We will look at our geo1
example to see how you can create your own examples for pgcl2heyvl and Caesar.
3.1. From pGCL to HeyVL
There is a pGCL file for geo1
in pgcl/examples/geo1.pgcl
:
// ARGS: --encoding "encode-k-induction" --calculus "wp" --post c --pre "c+1" --k 2
nat c;
nat f;
while(f=1){
{f := 0}[0.5]{c := c+1}
}
pgcl2heyvl can be used to create a corresponding HeyVL file that uses the annotations in the first line to prove that wp[geo1](c) <= c+1
holds using k
-induction with k=2
.
pgcl2heyvl pgcl/examples/geo1.pgcl > mygeo.heyvl
This will create a mygeo.heyvl
file.
Things to Try:
- Set the
pre
toc+2
. This is still a valid upper bound (and the HeyVL file should verify.) - Set the
pre
toc
. This is not a valid upper bound (and Caesar should give a counter-example later).
Note: pgcl2heyvl has a slighly different syntax for programs than Caesar accepts. It also does not accept domain declarations (including user-defined functions and axioms). We are working on a more convenient implementation.
3.2. Verifying HeyVL Files With Caesar
Simply run caesar
with your new file:
caesar mygeo.heyvl
and it prints that the main generated coproc
either verifies or not (depending on your modifications).
You can of course create HeyVL files directly. To encode loops, use the encodings detailed in the appendix of our paper.
4. Appendix: Accepted Syntax by the pgcl2heyvl Tool
An excerpt from the Lark grammar for pGCL programs used in the probably library:
declaration: "bool" var -> bool
| "nat" var bounds? -> nat
| "const" var ":=" expression -> const
bounds: "[" expression "," expression "]"
instruction: "skip" -> skip
| "while" "(" expression ")" block -> while
| "if" "(" expression ")" block "else"? block -> if
| var ":=" rvalue -> assign
| block "[" expression "]" block -> choice
| "tick" "(" expression ")" -> tick
rvalue: "unif" "(" expression "," expression ")" -> uniform
| expression
literal: "true" -> true
| "false" -> false
| INT -> nat
| FLOAT -> float
| "∞" -> infinity
| "\infty" -> infinity
Expressions in programs and expectations can be built from the following operators, grouped by precedence:
||
,&
<=
,<
,=
+
,-
*
,:
/
not
,( ... )
,[ ... ]
,literal
,var
Whitespace is generally ignored.