Subdomain-Based Generality-Aware Debloating

Published in Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2020

Recommended citation: Xin, Q., Kim, M., Zhang, Q., & Orso, A. (2020, December). Subdomain-based generality-aware debloating. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (pp. 224-236). https://codingsoo.github.io/files/ASE2020.pdf

Programs are becoming increasingly complex and typically contain an abundance of unneeded features, which can degrade the performance and security of the software. Recently, we have witnessed a surge of debloating techniques that aim to create a reduced version of a program by eliminating the unneeded features therein. To debloat a program, most existing techniques require a usage profile of the program, typically provided as a set of inputs 𝐼. Unfortunately, these techniques tend to generate a reduced program that is overfitted to 𝐼 and thus fails to behave correctly for other inputs. To address this limitation, we propose DomGad, which has two main advantages over existing debloating approaches. First, it produces a reduced program that is guaranteed to work for subdomains, rather than for specific inputs. Second, it uses stochastic optimization to generate reduced programs that achieve a close-to-optimal tradeoff between reduction and generality (i.e., the extent to which the reduced program is able to correctly handle inputs in its whole domain). To assess the effectiveness of DomGad, we applied our approach to a benchmark of ten Unix utility programs. Our results are promising, as they show that DomGad could produce debloated programs that achieve, on average, 50% code reduction and 95% generality. Our results also show that DomGad performs well when compared with two state-of-the-art debloating approaches.
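The tradeoff between reduction and generality described in the abstract can be illustrated with a small sketch. This is not DomGad's implementation. The subdomain names, statement sets, probabilities, the weighted-sum objective with weight `k`, and the simple stochastic hill climbing used as the optimizer are all assumptions made for illustration. The sketch treats a candidate reduced program as a set of retained subdomains, measures reduction as the fraction of statements removed, and measures generality as the share of the input domain still handled.

```python
import random

def reduction(kept, subdomain_stmts, total_stmts):
    """Fraction of statements removed when only `kept` subdomains are supported."""
    used = set()
    for name in kept:
        used |= subdomain_stmts[name]
    return 1.0 - len(used) / total_stmts

def generality(kept, subdomain_prob):
    """Share of the input domain (by assumed probability) still handled."""
    return sum(subdomain_prob[name] for name in kept)

def score(kept, subdomain_stmts, subdomain_prob, total_stmts, k=0.5):
    """Hypothetical objective: weighted sum of reduction and generality."""
    r = reduction(kept, subdomain_stmts, total_stmts)
    g = generality(kept, subdomain_prob)
    return (1 - k) * r + k * g

def stochastic_search(subdomain_stmts, subdomain_prob, total_stmts,
                      k=0.5, steps=200, seed=0):
    """Stochastic hill climbing (a stand-in optimizer, chosen for brevity):
    flip one randomly chosen subdomain in or out, keep the change if it
    improves the score."""
    rng = random.Random(seed)
    names = sorted(subdomain_stmts)
    current = frozenset()
    best = score(current, subdomain_stmts, subdomain_prob, total_stmts, k)
    for _ in range(steps):
        name = rng.choice(names)
        candidate = current ^ {name}  # toggle membership of one subdomain
        s = score(candidate, subdomain_stmts, subdomain_prob, total_stmts, k)
        if s > best:
            current, best = candidate, s
    return current, best

# Toy data (entirely made up): a 10-statement program with three subdomains.
TOTAL_STMTS = 10
SUBDOMAIN_STMTS = {
    "plain": {1, 2, 3},
    "verbose": {1, 2, 4, 5},
    "rare": {6, 7, 8, 9, 10},
}
SUBDOMAIN_PROB = {"plain": 0.6, "verbose": 0.3, "rare": 0.1}

kept, best = stochastic_search(SUBDOMAIN_STMTS, SUBDOMAIN_PROB, TOTAL_STMTS)
print(sorted(kept), round(best, 3))
```

On this toy data the search should settle on retaining the two frequent subdomains and dropping the rarely used one, which removes half of the statements while still covering most of the assumed input distribution. Raising `k` shifts the preference toward generality, and lowering it favors reduction.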