Authors: Shan Lu (University of Illinois), Soyeon Park (University of Illinois), Chongfeng Hu (University of Illinois), Xiao Ma (University of Illinois), Weihang Jiang (University of Illinois), Zhenmin Li (University of Illinois), Raluca A. Popa (MIT), and Yuanyuan Zhou (University of Illinois)
Paper: http://www.sosp2007.org/papers/sosp061-lu.pdf
(SOSP presentation)
Many variables in real programs are correlated (e.g., "when I access X, I should be accessing Y as well"), and it's likely a bug if you don't access them together. The tool looks for exactly this. Their metric is "distance" in the source code: find variables that are usually accessed together and rarely accessed separately, using "itemset" mining techniques from data mining.
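To make the idea concrete, here is a minimal sketch of the "usually together, rarely apart" check. It is not the authors' implementation: it assumes each function is one "transaction" (a crude stand-in for source-code distance), only considers pairs rather than general itemsets, and the names `mine_correlated_pairs`, `find_violations`, the thresholds, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_correlated_pairs(accesses, min_support=3, min_confidence=0.8):
    """Return {(x, y): confidence} for variable pairs usually accessed together.

    `accesses` maps a function name to the set of variables it touches.
    Treating a whole function as the unit of "closeness" is an assumption.
    """
    single = Counter()  # how many functions touch each variable
    pair = Counter()    # how many functions touch both variables of a pair
    for variables in accesses.values():
        single.update(variables)
        pair.update(combinations(sorted(variables), 2))

    correlated = {}
    for (x, y), together in pair.items():
        # Fraction of the more frequently used variable's accesses
        # that also involve its partner.
        confidence = together / max(single[x], single[y])
        if together >= min_support and confidence >= min_confidence:
            correlated[(x, y)] = confidence
    return correlated


def find_violations(accesses, correlated):
    """Report functions that touch exactly one variable of a correlated pair."""
    violations = []
    for func, variables in accesses.items():
        for x, y in correlated:
            if (x in variables) != (y in variables):
                violations.append((func, x, y))
    return sorted(violations)


# Hypothetical access data: `buf` and `len` travel together almost everywhere.
accesses = {
    "init": {"buf", "len"},
    "append": {"buf", "len"},
    "clear": {"buf", "len", "flags"},
    "resize": {"buf", "len"},
    "truncate": {"buf"},  # touches buf without len: suspicious
}

correlated = mine_correlated_pairs(accesses)
violations = find_violations(accesses, correlated)
print(correlated)
print(violations)
```

With this sample data only the `buf`/`len` pair clears both thresholds, and `truncate` is flagged because it touches `buf` alone. A flagged function is a candidate for review, not a confirmed bug, which is consistent with the false positives mentioned in the talk.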
They looked at codebases from Mozilla, MySQL, Linux, and PostgreSQL, and found about a thousand correlations, with a false positive rate of about 15% (macros, coincidences). They can't detect conditional correlations (that seems like a weird programming paradigm anyway).
Q: How difficult is it to resolve the false positives, especially in relation to things like RacerX?
A: ??? Hard to compare.
Q: Can it handle more complicated examples? (ed. didn't she say she couldn't deal with these cases?)
A: ??? Branching shouldn't be an issue.
Q: What about more complex correlations, e.g., c is a sum of a and b?
A: It boils down to considering the pair as a unit. There is room for improvement, though.