Dragana Milovancevic
Biography
I'm a PhD student in the LARA group at EPFL, under the supervision of Viktor Kuncak. My research interests are in the field of formal verification and include equivalence checking and automated grading.
Publications
Infoscience publications
Formal Autograding in a Classroom
We report our experience in enhancing automated grading in an undergraduate programming course using formal verification. In our experiment, we deploy a program verifier to check the equivalence between student submissions and our reference solutions, alongside the existing testing-based grading infrastructure. We were able to use program equivalence to differentiate student submissions according to their high-level program structure, in particular their recursion pattern, even when their input-output behaviour is identical. Consequently, we achieve (1) higher confidence in the correctness of idiomatic solutions and (2) a more thorough assessment of the solution landscape, revealing solutions beyond those envisioned by instructors.
2025. European Symposium on Programming (ESOP 2025).
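The recursion-pattern distinction described above can be illustrated with a small, hypothetical example (not taken from the paper or the course): two Scala functions with identical input-output behaviour, one using direct structural recursion and one using an accumulator. Testing alone cannot tell them apart, but structural, equivalence-based analysis can.
```scala
// Hypothetical illustration: two submissions computing the same function
// with different recursion patterns.
object RecursionPatterns {
  // Reference-style solution: direct structural recursion.
  def sum(xs: List[Int]): Int = xs match {
    case Nil       => 0
    case x :: rest => x + sum(rest)
  }

  // Alternative submission: tail recursion with an accumulator.
  // Identical input-output behaviour, different recursion pattern.
  def sumAcc(xs: List[Int], acc: Int = 0): Int = xs match {
    case Nil       => acc
    case x :: rest => sumAcc(rest, acc + x)
  }

  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3, 4)
    assert(sum(xs) == sumAcc(xs)) // same observable behaviour on this input
  }
}
```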
Automated Induction for Proving Program Equivalence
Formal verification has made great advances in recent decades, resulting in tools that can verify strong correctness properties of important pieces of software infrastructure. In this thesis, we study a particular instance of formal verification: equivalence checking, and its applications in program verification. We propose a fully automated approach to equivalence checking that takes as input candidate programs and reference programs and uses functional induction to automatically prove or disprove the correctness of each candidate program. We next extend our approach to support equivalence checking of underspecified programs, and propose a new application of equivalence checking for measure transfer in termination proving. We implement our approach on top of the Stainless verifier to support equivalence checking of Scala programs. To evaluate our approach, we assemble data sets comprising a total of around 5000 Scala programs, drawn from equivalence checking benchmarks and programming assignments. Our evaluation and a case study show that our system outperforms state-of-the-art tools and is applicable to automated grading in undergraduate programming courses.
Lausanne, EPFL, 2025. DOI: 10.5075/epfl-thesis-10552.
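As a rough illustration of functional induction in this setting, here is a minimal Stainless-style sketch (function and lemma names are invented; this is not code from the thesis): the lemma's recursive call on the tail of the list plays the role of the induction hypothesis when proving the reference and candidate equal.
```scala
import stainless.lang._
import stainless.collection._

object EquivSketch {
  // Reference program.
  def ref(xs: List[BigInt]): BigInt = xs match {
    case Nil()         => BigInt(0)
    case Cons(x, rest) => x + ref(rest)
  }

  // Candidate program with the same behaviour, written differently.
  def cand(xs: List[BigInt]): BigInt = xs match {
    case Nil()         => BigInt(0)
    case Cons(x, rest) => cand(rest) + x
  }

  // Equivalence lemma proved by functional induction: the recursive call
  // lemma(rest) provides the induction hypothesis for the tail.
  def lemma(xs: List[BigInt]): Unit = {
    xs match {
      case Nil()         => ()
      case Cons(_, rest) => lemma(rest)
    }
  }.ensuring(_ => ref(xs) == cand(xs))
}
```
In the actual system such lemmas are generated automatically rather than written by hand; the sketch only shows the shape of the induction.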
Proving Termination via Measure Transfer in Equivalence Checking (Extended Version)
Program verification can benefit from proofs with varied induction schemas. A natural class of induction schemas, functional induction, consists of those derived from definitions of functions. For such inductive proofs to be sound, it is necessary to establish that the functions terminate, which is a challenging problem on its own. In this paper, we consider termination in the context of equivalence checking of a candidate program against a provably terminating reference program annotated with termination measures. Using equivalence checking, our approach automatically matches function calls in the reference and candidate programs and proves termination by transferring measures from a measure-annotated program to one without annotations. We evaluate this approach on existing and newly written termination benchmarks, as well as on exercises in programming courses. Our evaluation corpus comprises around 10K lines of code. We show empirically that the termination measures of reference programs often successfully prove the termination of equivalent candidate programs, ensuring the soundness of inductive reasoning in a fully automated manner.
2024
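A hedged sketch of the measure-transfer idea (identifiers invented; these are not the paper's benchmarks): the reference carries an explicit decreasing measure, the candidate does not, and matching their recursive calls lets the same measure justify the candidate's termination.
```scala
import stainless.lang._
import stainless.collection._

object MeasureTransferSketch {
  // Reference program, annotated with an explicit termination measure.
  def refSum(xs: List[BigInt]): BigInt = {
    decreases(xs.size)
    xs match {
      case Nil()         => BigInt(0)
      case Cons(x, rest) => x + refSum(rest)
    }
  }

  // Candidate program written without any measure. Once its recursive call is
  // matched against refSum's, the measure xs.size can be transferred to it.
  def candSum(xs: List[BigInt]): BigInt = xs match {
    case Nil()         => BigInt(0)
    case Cons(x, rest) => candSum(rest) + x
  }
}
```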
Proving Termination via Measure Transfer in Equivalence Checking
Program verification can benefit from proofs with varied induction schemas. A natural class of induction schemas, functional induction, consists of those derived from definitions of functions. For such inductive proofs to be sound, it is necessary to establish that the functions terminate, which is a challenging problem on its own. In this paper, we consider termination in the context of equivalence checking of a candidate program against a provably terminating reference program annotated with termination measures. Using equivalence checking, our approach automatically matches function calls in the reference and candidate programs and proves termination via measure transfer. We evaluate this approach on existing and newly written termination benchmarks, as well as on exercises in programming courses. Our evaluation corpus comprises around 10K lines of code. We show empirically that the termination measures of reference programs often successfully prove the termination of equivalent candidate programs, ensuring the soundness of inductive reasoning in a fully automated manner.
2024
Formula Normalizations in Verification
We propose a new approach for normalization and simplification of logical formulas. Our approach is based on algorithms for lattice-like structures. Specifically, we present two efficient algorithms for computing a normal form and deciding the word problem for two subtheories of Boolean algebra, giving a sound procedure for propositional logical equivalence that is incomplete in general but complete with respect to a subset of Boolean algebra axioms. We first show a new algorithm to produce a normal form for expressions in the theory of ortholattices (OL) in time O(n^2). We also consider an algorithm, recently presented but never evaluated in practice, producing a normal form for a slightly weaker theory, orthocomplemented bisemilattices (OCBSL), in time O(n log(n)^2). For both algorithms, we present an implementation and show efficiency in two domains. First, we evaluate the algorithms on large propositional expressions, specifically combinatorial circuits from a benchmark suite, as well as on large random formulas. Second, we implement and evaluate the algorithms in the Stainless verifier, a tool for verifying the correctness of Scala programs. We used these algorithms as a basis for a new formula simplifier, which is applied before valid verification conditions are saved into a persistent cache. The results show that normalization substantially increases cache hit ratio in large benchmarks.
2023. 35th International Conference on Computer Aided Verification, Paris, France.
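To give a flavour of the kind of simplification involved, the following toy Scala normalizer (not the OL or OCBSL algorithms from the paper, and without their complexity or completeness guarantees) applies a few lattice laws (double negation, idempotence, absorption) bottom-up to a small propositional AST.
```scala
// Toy normalizer, for illustration only.
object ToyNormalizer {
  sealed trait Formula
  case class Var(name: String) extends Formula
  case class Not(f: Formula) extends Formula
  case class And(l: Formula, r: Formula) extends Formula
  case class Or(l: Formula, r: Formula) extends Formula

  def simplify(f: Formula): Formula = f match {
    case Not(Not(g)) => simplify(g)                          // double negation
    case Not(g)      => Not(simplify(g))
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (a, b) if a == b        => a                    // idempotence: a /\ a = a
        case (a, Or(b, _)) if a == b => a                    // absorption: a /\ (a \/ c) = a
        case (a, b)                  => And(a, b)
      }
    case Or(l, r) =>
      (simplify(l), simplify(r)) match {
        case (a, b) if a == b         => a                   // idempotence: a \/ a = a
        case (a, And(b, _)) if a == b => a                   // absorption: a \/ (a /\ c) = a
        case (a, b)                   => Or(a, b)
      }
    case v => v
  }

  def main(args: Array[String]): Unit = {
    val (p, q) = (Var("p"), Var("q"))
    println(simplify(And(p, Or(p, q))))    // Var(p)
    println(simplify(Not(Not(Or(q, q)))))  // Var(q)
  }
}
```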
Proving and Disproving Equivalence of Functional Programming Assignments
We present an automated approach to verify the correctness of programming assignments, such as the ones that arise in a functional programming course. Our approach takes as input student submissions and reference solutions, and uses equivalence checking to automatically prove or disprove correctness of each submission. To be effective in the context of a real-world programming course, an automated grading system must be both robust, to support programs written in a variety of styles, and scalable, to handle hundreds of submissions at once. We achieve robustness by handling recursion using functional induction and by handling auxiliary functions using function call matching. We achieve scalability using a clustering algorithm that leverages the transitivity of equivalence to discover intermediate reference solutions among student submissions. We implement our approach on top of the Stainless verification system to support equivalence checking of Scala programs. We evaluate our system and its components on over 4000 programs drawn from a functional programming course and from the program equivalence checking literature; this is the largest such evaluation to date. We show that our system is capable of proving program correctness by generating inductive equivalence proofs, and providing counterexamples for incorrect programs, with a high success rate.
2023. p. 928-951. DOI: 10.1145/3591258.
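As a hypothetical sketch of how transitivity of equivalence can drive clustering (this is not the paper's algorithm, and the prover interface is invented), each cluster keeps one representative: a new submission joins the first cluster whose representative it is proved equivalent to, so members of the same cluster never need to be compared pairwise.
```scala
// Hypothetical clustering sketch based on transitivity of equivalence.
object ClusteringSketch {
  // Stand-in for the actual prover: Some(true) means proved equivalent,
  // Some(false) means a counterexample was found, None means inconclusive.
  type Prover = (String, String) => Option[Boolean]

  // Each cluster is keyed by one of its members, acting as its representative.
  def cluster(submissions: List[String], reference: String, prove: Prover): Map[String, List[String]] =
    submissions.foldLeft(Map(reference -> List(reference))) { (clusters, sub) =>
      clusters.keys.find(rep => prove(rep, sub).contains(true)) match {
        case Some(rep) => clusters.updated(rep, sub :: clusters(rep)) // equivalent to an existing representative
        case None      => clusters.updated(sub, List(sub))            // new representative: a new style, or an incorrect submission
      }
    }
}
```
With representative-based clustering, each submission is checked against at most one member per existing cluster, rather than against every other submission.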