Debugging program failure exhibited by voluminous data

Document Type

Article

Publication Date

1-1-1998

Abstract

It is difficult to debug a program when the data set that causes it to fail is large (voluminous). The cues that may help in locating the fault are obscured by the large amount of information generated from processing the data set. Clearly, a smaller data set that exhibits the same failure should lead to a diagnosis of the fault more quickly than the initial, large data set. We term such a smaller data set a data slice, and the process of creating it data slicing. The problem of creating a data slice is undecidable. In this paper, we investigate four generate-and-test heuristics for deriving a smaller data set that reproduces the failure exhibited by a large data set: invariance analysis, origin tracking, random elimination, and program-specific heuristics. We also provide a classification of programs based upon a certain relationship between their input and output; this classification may be used to choose an appropriate heuristic in a given debugging scenario. As evidenced by a database of debugging anecdotes at the Open University, U.K., debugging failures exhibited by large data sets requires inordinate amounts of time. Our data slicing techniques would significantly reduce the effort required in such scenarios. © 1998 John Wiley & Sons, Ltd.
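
To make the generate-and-test idea concrete, the following is a minimal sketch of the random elimination heuristic, the simplest of the four. It is not the authors' implementation: the fails() oracle stands in for re-running the program under test and checking whether the original failure reappears, and all names here are illustrative.

    import random

    def fails(data):
        # Toy oracle for illustration only: pretend the program under
        # test crashes whenever its input contains the record "BAD".
        # In a real debugging session this would re-run the program on
        # `data` and check whether the original failure reappears.
        return "BAD" in data

    def random_elimination(data, rounds=200, seed=0):
        # Generate-and-test data slicing by random elimination:
        # repeatedly drop a randomly chosen contiguous chunk of the
        # input, keeping the smaller candidate whenever it still
        # exhibits the failure.
        rng = random.Random(seed)
        current = list(data)
        for _ in range(rounds):
            if len(current) <= 1:
                break
            i = rng.randrange(len(current))             # chunk start
            j = rng.randrange(i + 1, len(current) + 1)  # chunk end (exclusive)
            candidate = current[:i] + current[j:]
            if candidate and fails(candidate):
                current = candidate  # smaller data slice, same failure
        return current

    if __name__ == "__main__":
        big_input = ["rec%d" % k for k in range(1000)] + ["BAD"]
        random.shuffle(big_input)
        print(random_elimination(big_input))  # typically shrinks to ["BAD"]

Random elimination makes no minimality guarantee; it simply trades the cost of repeated re-execution for a much smaller failure-reproducing input.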
