7 The Perses Family

Perses answered one question cleanly: can a language-agnostic reducer use grammar structure? That success immediately surfaced the next five questions, which the papers in the Perses family now address.

The core Perses idea is that a reducer can use grammar and syntax structure to remove or transform parts of a program while preserving validity more often than plain text-level reduction (Sun et al. 2018). Later work asks what else can be improved: minimality, transformation power, speed, generality, and guidance from learned models. Perses is distributed under the GNU General Public License 3, with its implementation at https://github.com/uw-pluverse/perses.

Treat the family as a map of recurring research pressures, not as a catalog to memorize.

7.1 A Research Family

Most of the family below originates in Chengnian Sun’s group at the University of Waterloo, with notable exceptions: HDD (Misherghi and Su at UC Davis), Picireny (Hodován, Kiss, and Gyimóthy at Szeged), and C-Reduce (Regehr et al. at Utah). Treating the family as one PI’s research program is honest about its provenance; treating it as the field’s only view of program reduction is not. Pardis (Gharachorlu and Sumner 2019), ProbDD (Wang et al. 2021), J-Reduce (Kalhauge and Palsberg 2019), and Herfert et al.’s tree reducer (Herfert et al. 2017) are concurrent threads outside this family that the book references where relevant.

The Perses repository also contains implementations of techniques from a broader family of papers. A useful way to read this family is not as a list of disconnected tools, but as a sequence of answers to recurring reduction problems.

Reduction problem	Representative work	Main idea
Plain deletion creates many invalid programs	Perses (Sun et al. 2018)	Use grammar and syntax to guide candidate generation
The final result is valid but still larger than necessary	Vulcan (Xu et al. 2023)	Push language-agnostic reduction closer to one-minimality
The failure depends on relationships between program elements	PPR (Zhang et al. 2023)	Preserve pairwise relationships during reduction
Full grammar setup is too heavy for some users	Ad Hoc Syntax-Guided Program Reduction (J. L. Tian et al. 2023)	Lower the cost of adopting syntax guidance
The same expensive oracle work is repeated	Caching schemes (Y. Tian et al. 2023)	Avoid redundant tests and speed up reduction
The reducer has little guidance about promising edits	LPR (Zhang et al. 2024)	Use large language models to help prioritize reduction
Coarse syntax units miss useful smaller edits	T-Rec (Xu, Tian, Zhang, J. Zhang, et al. 2025)	Use lexical syntax for finer-grained language-agnostic reduction
Uniform search treats all chunks as equally promising	WDD (Zhou et al. 2025)	Use weights to guide Delta Debugging decisions
Probabilistic search needs clearer interpretation	CDD study (Zhang et al. 2025)	Study probabilistic Delta Debugging behavior
Deletion alone cannot express enough useful changes	SFC (Xu, Tian, Zhang, and Sun 2025)	Identify missing syntax-guided transformations
Transformation design is hard to reuse across languages	Latra (Xu, Wang, et al. 2025)	Use templates for language-agnostic transformations

7.2 From Limitation to Research Question

Each paper in the family starts from a practical pressure point. Perses makes syntax central, but syntax guidance alone does not solve every reduction problem. A reduced program may still contain removable structure. A failure may depend on two declarations appearing together. A slow oracle may dominate runtime. A grammar may be unavailable, incomplete, or hard to engineer. A syntax-aware reducer may still lack the transformation needed to expose a smaller explanation.

Each paper turns a limitation into a design question. When you compare reducers, ask:

What unit is being reduced: text, tokens, tree nodes, declarations, or relationships?
What transformations are allowed: deletion only, replacement, rewriting, or template-based edits?
What does the oracle preserve: a crash, diagnostic, output difference, timeout, or semantic property?
What cost dominates the run: invalid candidates, oracle calls, search order, or transformation design?
What kind of minimality or usefulness does the result actually claim?

7.3 Which Tool for Which Situation

The table above is organized by research problem. This one is organized by your situation — what you have, what is going wrong, and which Perses-family technique is most likely to help.

Your situation	Recommended technique	Why
Starting fresh with a structured language	Perses (baseline)	Good general starting point; built-in grammars for C, Java, Go, Python, and others
Result is valid but still larger than necessary	Vulcan	Pushes harder toward 1-minimality than baseline Perses
Bug only appears when two elements co-exist	PPR	Handles pairwise dependencies that independent deletion misses
No ANTLR grammar available for your language	Ad Hoc or T-Rec	Ad Hoc lowers the grammar setup cost; T-Rec works at token level without a full grammar
Oracle is slow (takes minutes per call)	Caching schemes	Avoids re-running the oracle on candidates the reducer has already seen
Large search space, want smarter candidate ordering	LPR or WDD	LPR uses an LLM to prioritize likely-useful edits; WDD weights chunks by prior removal success
Reduction stalls — nothing can be deleted further	SFC or Latra	SFC identifies missing transformations; Latra provides reusable template-based rewrites
Reducing XML, JSON, or another non-C format	Perses with the right grammar	Language-agnostic design handles structured non-code inputs with the appropriate grammar

A practical sequence. For a new compiler bug, a reasonable starting point is:

Run Perses with the simple one-liner oracle. If the result is good enough, stop.
If the result is still large, strengthen the oracle (see Chapter 14) and run again.
If oracle calls dominate the runtime, add caching.
If the result is valid but not minimal, try Vulcan.
If reduction stalls at a non-minimal point, try SFC or Latra for transformation-based cleanup.

No single tool wins on every benchmark. The best tool for a given case depends on the oracle cost, the input structure, and whether the failure involves element interactions. Chapter 9 walks through concrete evaluation numbers from the WDD paper on the gcc-59903 example.

The paper names matter less than the design space they sketch. When you encounter a new Perses-family technique, read it through three questions: what limitation of earlier reduction does it address, what is the key idea, and how would it change the way someone uses, evaluates, or extends a reducer? The same questions apply when deciding whether a future tool belongs in your workflow at all.

Gharachorlu, Golnaz, and Nick Sumner. 2019. “Pardis: Priority-Aware Test-Case Reduction.” Fundamental Approaches to Software Engineering — 22nd International Conference, FASE 2019, 409–26. https://doi.org/10.1007/978-3-030-16722-6_24.

Herfert, Satia, Jibesh Patra, and Michael Pradel. 2017. “Automatically Reducing Tree-Structured Test Inputs.” Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering, 861–71. https://doi.org/10.1109/ASE.2017.8115697.

Kalhauge, Christian Gram, and Jens Palsberg. 2019. “Binary Reduction of Dependency Graphs.” Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 556–66. https://doi.org/10.1145/3338906.3338956.

Sun, Chengnian, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. “Perses: Syntax-Guided Program Reduction.” Proceedings of the 40th International Conference on Software Engineering, 361–71. https://doi.org/10.1145/3180155.3180236.

Tian, Jia Le, Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Yiwen Dong, and Chengnian Sun. 2023. “Ad Hoc Syntax-Guided Program Reduction.” Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2137–41. https://doi.org/10.1145/3611643.3613101.

Tian, Yongqiang, Xueyan Zhang, Yiwen Dong, et al. 2023. “On the Caching Schemes to Speed up Program Reduction.” ACM Transactions on Software Engineering and Methodology 33 (1): 1–30.

Wang, Guancheng, Ralf-Dieter Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang. 2021. “Probabilistic Delta Debugging.” Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 881–92. https://doi.org/10.1145/3468264.3468625.

Xu, Zhenyang, Yongqiang Tian, Mengxiao Zhang, Jiarui Zhang, et al. 2025. “T-Rec: Fine-Grained Language-Agnostic Program Reduction Guided by Lexical Syntax.” ACM Transactions on Software Engineering and Methodology 34 (2). https://doi.org/10.1145/3690631.

Xu, Zhenyang, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. “Boosting Program Reduction with the Missing Piece of Syntax-Guided Transformations.” Proceedings of the ACM on Programming Languages 9 (OOPSLA2): 86–112. https://doi.org/10.1145/3763053.

Xu, Zhenyang, Yongqiang Tian, Mengxiao Zhang, Gaosen Zhao, Yu Jiang, and Chengnian Sun. 2023. “Pushing the Limit of 1-Minimality of Language-Agnostic Program Reduction.” Proceedings of the ACM on Programming Languages 7 (OOPSLA1): 636–64.

Xu, Zhenyang, Yiran Wang, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. “Latra: A Template-Based Language-Agnostic Transformation Framework for Effective Program Reduction.” Proceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering.

Zhang, Mengxiao, Yongqiang Tian, Zhenyang Xu, Yiwen Dong, Shin Hwei Tan, and Chengnian Sun. 2024. “LPR: Large Language Models-Aided Program Reduction.” Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 261–73.

Zhang, Mengxiao, Zhenyang Xu, Yongqiang Tian, Xinru Cheng, and Chengnian Sun. 2025. “Toward a Better Understanding of Probabilistic Delta Debugging.” Proceedings of the 47th International Conference on Software Engineering, 2024–35. https://doi.org/10.1109/ICSE55347.2025.00117.

Zhang, Mengxiao, Zhenyang Xu, Yongqiang Tian, Yu Jiang, and Chengnian Sun. 2023. “PPR: Pairwise Program Reduction.” Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 338–49.

Zhou, Xintong, Zhenyang Xu, Mengxiao Zhang, Yongqiang Tian, and Chengnian Sun. 2025. “WDD: Weighted Delta Debugging.” Proceedings of the 47th International Conference on Software Engineering, 1592–603. https://doi.org/10.1109/ICSE55347.2025.00071.