Refactoring in the Real World


One of the common themes in my work in industry has been to promote refactoring on real-world programs. When working on the Scheme refactoring tool, I was always annoyed that I couldn't use it on the Lisp code used to develop the tool. Since then, I've been hoping to use a decent refactoring tool for the languages I use -- typically C-like languages.

I've had a couple tries to create refactoring tools for C-like languages. At Apple, I implemented refactoring for C and Objective C in the Xcode 3.0 IDE; the Refactoring Tools workshop paper discusses the tradeoffs necessary to make commercial tools. I also implemented a prototype refactoring tool in the IBM VisualAge for C++ 4.0 IDE (also known by its research project name of "Montana"). One interesting lesson from the Montana work was that data structures intended for a compiler often weren't rich enough or were too abstract to help refactoring.

For real-world experience, I also tried refactoring the gcc compiler's tree structures. In order to partition the different kinds of tree nodes, I tried relying on runtime information to understand which fields of the tree structure were used for which kinds of tree nodes. The Workshop on Dynamic Analysis paper describes this exercise.

Star Diagrams

My research in graduate school involved supporting program restructuring through program representations other than the source code. Specifically, I've created one visualization, called the star diagram, that helps a program encapsulate data structures into a new abstract data type. The star diagram helps by presenting a concise view of how the data structure is used throughout the program, hiding the irrelevant portions of the program and presenting the code directly related to the data structure. The view groups similar expressions together in order to identify common abstract operations that should become the functions forming the interface of the ADT. The view is also manipulable, so that manipulations to the star diagram cause meaning-preserving restructuring transformations to be applied to the underlying program. The TOSEM paper provides the clearest and deepest description of the star diagram concept.
Screenshot of the Scheme star diagram interface

Star Diagrams for C Programs

The restructuring tool and star diagram manipulate programs written in the imperative language Scheme; I've also created tools for drawing star diagrams for C programs. While my implementation of the C star diagram generator acts as a proof-of-concept, Morison Chen will be creating a robust C star diagram generator for understanding and restructuring C programs. His star diagrams will use David Morgenthaler's techniques for efficiently restructuring C programs to actually restructure a program's C source code.
The dissertation describes the creation of C star diagrams, and evaluates the scalability of star diagrams for programs from 100,000 to 500,000 lines.

Programmer Studies

I'm also studying how programmers actually use the star diagram during restructuring. I'm using verbal protocol methods to analyze programmer interaction with the tool during a program enhancement task. Such studies give me feedback on the design and implementation of the current tool, and also give me insight into how the tool affects the software maintenance task.
The paper "How Software Tools Organize Programmer Behavior..." describes the first published results from these studies. We observed that the star diagram and restructuring tools provide explicit and implicit support of ``bookkeeping'' tasks such as measuring progress in a modification and maintaining current state. Programmers often moved thro kugh a program linearly to guarantee that they applied a change to every instance of a variable or data structure; while the approach guaranteed performing the change globally, it forced them to perform similar changes at different times, and led to a number of recall errors.
We also saw that the star diagram and restructuring tools encouraged programmers to perform less planning up-front, instead using the tool to explore possible structures. While the restructuring tool allowed programmers to explore, they sometimes used inappropriate cues to determine whether a transformation brought them to a better design. Some programmers used cues in the star diagram (such as size of the star diagram) to indicate the "goodness" of a given transformation.

Structure Diagram

The star diagram was not the first approach we tried; we also implemented the structure diagram, a program representation that showed the named objects of the program (functions, variables, and modules), represented their relationships, and restructured the source code as the structure diagram was manipulated. This representation is discussed in Program Restructuring via Design-Level Manipulation", and is further critiqued in the SIGSOFT '94 paper and dissertation.

Multimedia and Operating Systems Research

I'm also interested in operating systems, and have participated in work connected with multimedia file systems.

Publications


Robert W. Bowdidge,
"Performance Trade-offs Implementing Refactoring Support for Objective-C", 3rd Workshop on Refactoring Tools, October 2009.
Implementing refactoring support for Objective-C was an interesting experience for me because all the challenging bits were the parts not discussed in academic literature: scale, performance, and deciding which customers we were trying to please. This paper discusses all the trade-offs we had to choose in order to build a commercial refactoring tool.

Robert W. Bowdidge,
"Refactoring gcc Using Structure Field Access Traces and Concept Analysis", 3rd International Workshop on Dynamic Analysis, 2005.
As part of speeding up gcc, I tested a common assumption in our group that the size of gcc's data structures caused many of its speed problems. I tried shrinking the size of the data structures used for certain kinds of declarations. Although I found that the changes didn't have much effect on gcc, the act of trying to refactor gcc was interesting in its own right.
This paper describes the case study of trying to refactor gcc using runtime trace data to identify safe refactorings.

Robert W. Bowdidge,
"Collating Results of Syntactic Searches by Context", (IBM Research Technical Report RC21423.)
The star diagram's idea of grouping expressions according to similarity can also be useful for general purpose program understanding tasks. This paper describes the syntactic search facility I implemented for the IBM VisualAge C++ compiler that groups matches that occur in similar expressions.

Robert W. Bowdidge and William G. Griswold,
"Supporting the Restructuring of Data Abstractions through Manipulation of a Program Visualization"
, ACM Transactions on Software Engineering Methodology, 7(2) (1998).
(This provides a nuts-and-bolts description of the star diagram, and contains an expanded section on the scalability of the technique.)

W. G. Griswold, M. I. Chen, R. W. Bowdidge, J. L. Cabaniss, V. B. Nguyen, and J. D. Morgenthaler,
"Tool Support for Planning the Restructuring of Data Abstractions in Large Systems", IEEE Transactions on Software Engineering (accepted).
(Describes work continuing at U.C. San Diego on star diagrams, and contains many details about the design and implementation, and evaluation of a C star diagram tool.)

Robert W. Bowdidge and William G. Griswold,
"How Software Engineering Tools Organize Programmer Behavior During the Task of Data Encapsulation", Empirical Software Engineering 2(3) (1997), p.221-268.
So if programmers had a real refactoring tool, how would it change how they modify their programs? This is a significantly improved version of the programmer study paper previously released as a technical report and as part of my dissertation.

Robert W. Bowdidge,
" Star diagrams: designing abstractions out of existing code", Position paper for OOPSLA '96 Workshop on Transforming Legacy Applications into Object-Oriented Applications.

W. G. Griswold, M. I. Chen, R. W. Bowdidge, and J. D. Morgenthaler,
Tool Support for Planning the Restructuring of Data Abstractions in Large Systems," In Proceedings of the ACM SIGSOFT Conference on the Foundations of Software Engineering (FSE-4), October 1996.

Robert W. Bowdidge,
"Supporting the Restructuring of Data Abstractions through Manipulation of a Program Visualization" (705KB compressed, 16MB uncompressed, ~200 pages),
Ph.D. Thesis, Computer Science and Engineering Department, University of California, San Diego, November 1995.

R. W. Bowdidge, W. G. Griswold,
"How Software Tools Organize Programmer Behavior During the Task of Data Encapsulation",
Technical Report CS95-443, Department of Computer Science and Engineering, University of California, San Diego, August 1995. (Revised Feb. 1996)

Robert W. Bowdidge and William G. Griswold,
Automated Support for Encapsulating Abstract Data Types (6 MB)
SIGSOFT '94: Foundations of Software Engineering, New Orleans, December 1994.

William G. Griswold and Robert W. Bowdidge,
"Program Restructuring via Design-Level Manipulation" (500 KB),
Workshop on Studies of Software Design, Baltimore MD, May 17-18, 1993. To appear in Springer Verlag Lecture Notes in Computer Science.

P. Venkat Rangan, Dr. Walter A. Burkhard, Robert W. Bowdidge, Harrick M. Vin, John W. Lindwall, Kashun Chan, Ingvar A. Aaberg, Linda M. Yamamoto, and Ian G. Harris,
"A Testbed for Managing Digital Video and Audio Storage" (100 KB),
USENIX Summer 1991 Conference, Nashville TN, June 10-14, 1991.
(One screendump missing in on-line version of paper.)

Thomas Schwarz, Robert Bowdidge, and Walter Burkhard,
"Low Cost Comparison of File Copies,"
10th International Conference on Distributed Computing Systems, Paris, 1990.

Research Interests


* Refactoring
* Software design
* Maintenance of legacy systems
* Program visualization
* Program analysis techniques (cohesion, slicing, dataflow)
* Operating systems design