Features

XXIX.5 September - October 2022

Where’s my jetpack? Waiting for the revolution in statistical analysis software


Authors:
Philip Kortum


In the 1950s, science writers envisioned a future in which individual citizens would own and use jetpacks as a primary mode of transportation. The vision was that the jetpack would free highways of congestion and allow personal mobility on a scale never before seen. Importantly, it became a metaphor for how technological advances by scientists and engineers would provide us with enormous freedom to pursue work and pleasure in new ways.

Insights

The introduction of line-edited, syntax-based statistical analysis software like R has set us back 50 years, making statistical analysis harder than ever for the user.
Interfaces for statistical analysis need to be revolutionized so that the tools work for the user, rather than the other way around.

Sadly, after much development and breathless and tantalizing writings extolling its imminent availability to the public, the jetpack is still just a fantasy for the average citizen, but its metaphorical mythology remains.

The story of statistical analysis software follows a similar arc. In 1962, Wil Dixon of UCLA released the first version of BMD to the public [1]. This program relieved researchers of having to perform tedious, error-prone calculations by hand and revolutionized our ability to process and analyze large datasets effectively. To specify a statistical analysis, a user would fill in codes in specific columns of control cards that would be submitted to a mainframe computer. Subsequent programs, such as SAS, introduced in the mid-1970s, used command-based interfaces that allowed data manipulation as well as statistical analyses. With the introduction of these sophisticated statistical packages, the promise was that statistical analysis would be transformed from a laborious mathematical exercise into a tool that would allow not only scientists and engineers but also everyday people to ask questions of data and receive answers with mathematical backing.


The early statistical software packages were tedious indeed. Although they freed the user from having to perform copious calculations, the user's focus turned from struggling to perform the mathematical calculations to trying to figure out how to instruct the computer to do their bidding.

The release of the Macintosh computer in 1984 made it possible for developers to write statistical analysis software packages with graphical user interfaces (GUIs). As a result, statistical software made the leap into a more user-friendly realm, employing interfaces that allowed nontechnical users to more easily perform at least moderately complex statistical analyses. These early programs were written from the ground up to use a GUI and were very easy to use. Among the more notable programs were DataDesk, StatView, StatWorks, and JMP [2].

In the 1990s, major software packages including SAS and SPSS incorporated GUIs, although early attempts were somewhat unwieldy because they relied on taking the architecture of the line-editor software and bending it into a bolt-on GUI.

As every good human factors engineer knows, trying to slap a user interface onto an already designed piece of technology almost always leads to suboptimal interfaces. The resulting interface is nearly always constrained by the underlying architecture, and this architecture is rarely designed to conform to human mental models of how the process should work. The user interface must be considered during the fundamental stages of designing the software so that the actions of the core technology can easily match the desired actions of the user in the final user interface. Most of the current generation of statistical analysis software packages follow this model, with some admittedly more successful than others in their final, user-facing interface. The difficulty is that the underlying DNA of the code and the thought processes of the original builders run deep, in everything from the limitations placed on variable names to the cryptic discussions in teaching materials of how it all works.

Although programs built from the ground up to be user-centered had better usability, the sheer power and "good enough" interfaces of the major software packages eventually edged these original Mac packages out. A contributing factor was the sharp decline in Macintosh market share in the 1990s, which did not rebound until fairly recently. The exception to the demise of the original GUI statistics software is JMP, perhaps because JMP developers have access to the wide variety of routines from SAS, and because it became available on Windows.

While there is much to criticize in the woefully inadequate GUIs of many of the serious statistical packages on the marketplace today, our biggest concern should be reserved for the increasingly popular R statistical package. Its meteoric rise can likely be attributed to its open-source nature, the wide variety of powerful packages, and the (over) pricing policies of the established players that have forced cash-strapped researchers and students to find and adopt a more cost-effective solution. That said, its release takes us back to the earliest days of statistics software, and reintroduces us to the "tyranny of the blank screen." It is said that R has a steep learning curve [3], and without detailed knowledge of the syntax, the user can do nothing with the program save stare at a blank monitor. The R interface has more in common with the first version of BMD released in 1962 than it does with any modern, GUI-based stats interface.

The base code of R was developed completely as an open-source line-code interface, and this is the way that many researchers learn and use R. Subsequent researchers, however, have tried to build more workable interfaces that hide some of the complexity of that base code. Of course, the results are imperfect, and the ubiquitous code window lingers in the interface, reminding the user that they are driving a Model T interface to a Ferrari. While hardcore R programmers might claim that is a feature, not a bug, for everyone else who just wants to do statistics, it is most certainly a serious disadvantage.

Is it possible that R was designed to be hard? By creating a difficult-to-use tool, only the "experts" and those able to devote sufficient time and resources can effectively use or even identify all the possible, mystical functionality of the software. The argument here might be that we don't want just anybody to be able to easily use this powerful tool. They might remediate the data inappropriately, select the wrong analysis, violate the analysis assumptions, or interpret the output incorrectly. Of course, if this were true, the implications of such a strategy in terms of the disenfranchisement of large portions of the population would be staggering.

It might be argued that the books, tutorials, and training sessions that R has spawned represent an important economic contribution in their own right. Ludicrous. We don't make unreliable cars to keep mechanics in business, and we shouldn't make hard-to-use software to support self-help entrepreneurs.

In this "make R hard to use" worldview, hard is good, since hard is how you protect the integrity of the data analysis.

It doesn't have to be this way. In fact, it shouldn't be this way. I would argue that this is precisely why we need easy-to-use software. Other software products have taken notably different paths, and their success with the general public is a testament to their hard work in trying to build interfaces that work for the user, rather than for the programmers who built them.

For example, at the same time as the first statistics programs were emerging, the first word processors were also being developed. These early word-processing programs shared many of the same characteristics with their statistical software brethren. If you wanted a new paragraph, you had to type in the code to tell the computer that you wanted a new paragraph inserted. It didn't take long before some, but not all, word-processing developers realized these interfaces were cumbersome to use, and that they prevented nonexperts from enjoying the benefits of the emerging technologies.

It's here that statistical analysis packages made their first mistake. Word-processing engineers developed entirely new concepts, and WYSIWYG (what you see is what you get) architectures were developed. The underlying code was built around the premise that the user would manipulate a screen that looked just like the paper. Voilà, the modern word processor was born. Conversely, statistical package developers concentrated on the underlying power of the software for its computational effectiveness, perhaps rightly so. In the process, however, the interfaces they developed were simply facades that maintained the jargon, the program sequencing, and the underlying architecture of the fundamental code.

Let's not forget that, for many users, statistics is about answering a couple of very simple questions: 1) Are the means of these datasets significantly different from each other? and 2) How are these variables related to each other? For example, we ask simple questions like Does this training regimen provide better outcomes than the other training regimen? and Given all the historical data, what is the most likely value of our ice cream sales if we know what the temperature is? Unfortunately, the mechanics of performing the analysis obscure the simple task we are trying to do, and users must be trained to perform these simple tasks with complex tools.
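
To make the gap concrete, here is a minimal sketch of what those two "simple" questions look like once they are translated into code. Python and SciPy stand in here for any syntax-driven tool (the article prescribes none), and the data values are hypothetical placeholders.

    # Question 1: Are the means of these two datasets significantly different?
    # (e.g., outcomes under two training regimens; numbers are hypothetical)
    from scipy import stats

    regimen_a = [72, 75, 69, 80, 77, 74]
    regimen_b = [68, 71, 65, 70, 73, 66]
    t_stat, p_value = stats.ttest_ind(regimen_a, regimen_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

    # Question 2: How are these two variables related?
    # (e.g., ice cream sales as a function of temperature)
    temperature = [18, 21, 24, 27, 30, 33]
    sales = [120, 150, 190, 240, 310, 380]
    fit = stats.linregress(temperature, sales)
    print(f"predicted sales at 25 degrees: {fit.intercept + fit.slope * 25:.0f}")

Even this stripped-down version asks the user to know about t statistics, p-values, slopes, and intercepts before the plain-language question can be posed at all, which is exactly the training burden at issue.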

This concept of needing extensive training to perform basic calculations is antithetical to good user interface design. Remember when you got your first iPhone and went to a 40-hour session on how to utilize this sophisticated new minicomputer/phone? Of course not! The user interface for the iPhone, whose power exceeds that of the computers that put us on the moon [4], has been developed in such a way as to make the user experience nearly seamless. We see five-year-olds using these tools with ease. How much time did you spend reading the manual for your new phone? Probably none. You may not even know there's a user manual for the iPhone, since one isn't included with the phone. In contrast, one excellent book that I have used to learn SPSS is more than 1,100 pages. Why should statistical analysis, a tool that could be useful on so many levels, require this, when other technologies don't?

The evolution of the ease of use of statistical analysis software is actually tracing an inverted-U function. We started with hard-to-use line-edited software (e.g., BMD, SAS, Minitab). We then moved to make statistics more accessible through more-sophisticated graphical user interfaces (JMP, StatView) or by integrating them into well-established user interfaces (e.g., XLSTAT). Subsequently, we started to make things harder again, building GUI-based interfaces architected on top of the underlying line-code editors (e.g., SPSS, Stata), and finally reverted to hard-to-use line-edited programs (e.g., R). Now we're beginning the process all over again, trying to bolt user interfaces onto the base R code so that your average nonprogrammer can actually use the program (e.g., RStudio). Predictably, these bolt-on interfaces, while valiant in their efforts, can never reach the potential of truly integrated, easy-to-use interfaces because they haven't started with that premise. Until we refocus our efforts from a code-driven perspective to a user experience perspective, statistical software packages will continue to be difficult to use.

How might we do better? A fictitious but tantalizing view of how we might approach these kinds of problems was already envisioned in the 1960s by the late Gene Roddenberry in his classic Star Trek TV series. When the characters interact with a computer to solve a problem, they are focused entirely on the creative, higher-order problem-solving questions at hand. Scotty never says to Captain Kirk, "I cannot change the laws of physics until I get my computer program detailing the statistics to compile." Instead, they form natural language queries that the computer then translates into the kinds of simple answers that allow more-creative problem-solving to take place. Spock casually asks the computer, "Computer, is the concentration of dilithium on the planet greater than what we would expect?" And the computer responds, "Yes, it is." These queries are not so different from the ones we ask of our own data in the real world.

These kinds of voice user interfaces are not fantastical in the modern world. I dictated this entire article on my iPhone while taking my morning walks. Why shouldn't we expect the same of our statistical analysis software? Obviously, it's not just a matter of recognizing speech commands—that would bring us no further than we are today. Rather, there would need to be some interpretation of the user's intent and then the ability to act on those intentions. It might not be as far-fetched as it sounds. Today, you can ask Google's general search engine to give you the sum, difference, product, or quotient, and it will do so without any mathematical formula entry. It even knows the precedence of operations! Imagine if we could interact with all of our data in this way. Freeing ourselves from the mechanics would allow us to focus on the answers to our questions.

Of course, we need not wait until AI has advanced enough to parse our intentions via speech. Simply creating statistical software that is built from the ground up with user tasks in mind, shedding all the old ideas about the need to write syntax, could be a boon for students trying to learn statistical concepts and researchers trying to focus on the data, not the tool. If you have ever watched students in an introductory statistics class trying to learn how to analyze data using a statistical software package, you've observed that the effort has nothing to do with understanding the underlying statistical concepts; it is a programming exercise, an attempt to get the computer to produce some output, any output. That's not the way to learn statistics, and it shouldn't be the way for professionals to conduct their analyses either.

Intriguingly, there has been some progress on this front, as a few researchers have attempted to move statistical analysis away from the task of selecting analysis methods and toward the goal of answering questions (e.g., [5,6]). Good software would help you choose the correct analysis, maybe even going so far as to pick the right analysis based on the question you're trying to answer. One could easily imagine, for example, software that is designed around a user's goal, rather than around selecting from a large number of statistical tests. The user asks, "Based on the data, does drug A work better than drug B in treating this disease?" and after selecting and performing an analysis and checking all the assumptions, the computer answers in an equally understandable way, "Drug A is definitely better." Sadly, while these tools aim to aid the user with the data analysis, the coding they require to make them work isn't significantly different from the code required by R; it differs in form and function, but not in the burden it places on the user.
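
As a thought experiment, such a goal-driven tool might look something like the following sketch: a hypothetical illustration in Python with SciPy, not the design of the tools cited in [5,6], with the function name, decision thresholds, and data chosen purely for illustration.

    # Hypothetical sketch of a goal-driven interface: the user states the goal,
    # the software checks assumptions, picks a test, and answers in plain language.
    from scipy import stats

    def does_a_beat_b(outcomes_a, outcomes_b, alpha=0.05):
        # Check each group for approximate normality (Shapiro-Wilk).
        _, p_norm_a = stats.shapiro(outcomes_a)
        _, p_norm_b = stats.shapiro(outcomes_b)
        if p_norm_a > alpha and p_norm_b > alpha:
            # Use Levene's test to decide between Student's and Welch's t-test.
            _, p_levene = stats.levene(outcomes_a, outcomes_b)
            _, p = stats.ttest_ind(outcomes_a, outcomes_b, equal_var=p_levene > alpha)
        else:
            # Fall back to a rank-based test when normality is doubtful.
            _, p = stats.mannwhitneyu(outcomes_a, outcomes_b, alternative="two-sided")
        a_higher = sum(outcomes_a) / len(outcomes_a) > sum(outcomes_b) / len(outcomes_b)
        if p < alpha:
            return "Drug A appears to work better." if a_higher else "Drug B appears to work better."
        return "The data do not show a reliable difference between the drugs."

    # Hypothetical usage with placeholder outcome scores:
    print(does_a_beat_b([8, 9, 7, 9, 8, 10], [6, 7, 5, 6, 7, 6]))

The point is not this particular decision tree, which a statistician might well dispute, but that the burden of choosing and checking moves from the user to the software, and the answer comes back in the language of the question.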

Without a doubt, R is an incredibly powerful, versatile, and useful statistical analysis tool. However, in this era of intuitive software interfaces, we can surely do better. The reemergence of syntax-based statistical analysis software has taken us back 50 years and stopped the (glacially) slow but steady progress toward making statistics about statistics, not about the tool. We should all advocate for tools that work for us, instead of the other way around.

Now, where did I leave my jetpack?

References

1. Johnson, L. Oral History of Wilfrid J. (Wil) Dixon and Linda Glassner (recorded 1986). Computer History Museum, 2007; http://archive.computerhistory.org/resources/access/text/2012/04/102658169-05-01-acc.pdf

2. Lehman, R.S. Macintosh statistical packages. Behavior Research Methods, Instruments, & Computers 18, 2 (1986), 177–187.

3. Donovan, T., Brown, M., and Katz, J. R for Fledglings. Univ. of Vermont Press, 2015.

4. Kendall, G. Your mobile phone vs. Apollo 11's guidance computer. RealClear Science, 2019; https://www.realclearscience.com/articles/2019/07/02/your_mobile_phone_vs_apollo_11s_guidance_computer_111026.html

5. Jun, E., Daum, M., Roesch, J., Chasins, S., Berger, E., Just, R., and Reinecke, K. Tea: A high-level language and runtime system for automating statistical analysis. Proc. of the 32nd Annual ACM Symposium on User Interface Software and Technology. ACM, New York, 2019, 591–603.

6. Kenny, D.A. DataToText: A consumer-oriented approach to data analysis. Multivariate Behavioral Research 45, 6 (2010), 1000–1019.

Author

Philip Kortum is a faculty member in the Department of Psychological Sciences at Rice University in Houston. His primary interests are in the research and development of highly usable systems in the voting and mobile computing domains and in the characterization of measures of usability and usable systems. [email protected]


Copyright held by author. Publication rights licensed to ACM.

