Program debloating via stochastic optimization

Published in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), 2020

Recommended citation: Xin, Q., Kim, M., Zhang, Q., & Orso, A. (2020). Program debloating via stochastic optimization. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: New Ideas and Emerging Results (pp. 65-68). https://codingsoo.github.io/files/ICSE-NIER2020.pdf

Programs typically provide a broad range of features. Because different typologies of users tend to use only a subset of these features, and unnecessary features can harm performance and security, program debloating techniques, which can reduce the size of a program by eliminating (possibly) unneeded features, are becoming increasingly popular. Most existing debloating techniques tend to focus on program-size reduction alone and, although effective, ignore other important aspects of debloating. We believe that program debloating is a multifaceted problem that must be addressed in a more general way. In this spirit, we propose a general approach that allows for formulating program debloating as a multi-objective optimization problem. Given a program to be debloated, our approach lets users specify (1) a usage profile for the program (i.e., a set of inputs with associated usage probabilities), (2) the factors of interest for debloating, and (3) the relative importance of these factors. Based on this information, the approach defines a suitable objective function for associating a score to every possible reduced program and aims to generate an optimal solution that maximizes the objective function. We also present and evaluate Debop, a specific instance of our approach that considers three objectives: size reduction, attack-surface reduction, and generality (i.e., the extent to which the reduced program handles inputs in the usage profile provided). Our results, albeit still preliminary, are promising and show that our approach can be effective at generating debloated programs that achieve a good trade-off between the different debloating objectives considered. Our results also provide insights on the performance of our general approach when compared to a specialized single-goal technique.