From 1983 to ~2013: complete list of HCIL papers.
From 2013 to present: only a few papers are included.
For more recent publications, please ALSO consult individual HCIL faculty pages.
Human-Centered Artificial Intelligence: Trusted, Reliable & Safe
Abstract: Well-designed technologies that offer high levels of human control and high levels of computer automation can amplify human abilities, leading to wider adoption. The Human-Controlled Automation (HCA) model clarifies how to (1) design for high levels of human control and high levels of computer automation so as to amplify human abilities, (2) understand the situations in which full human control or full computer automation are necessary, and (3) avoid the dangers of excessive human control or excessive computer automation. The new goal of HCA is more likely to produce designs that are Trusted, Reliable & Safe (TRS). Achieving these goals will dramatically increase human performance, while supporting human self-efficacy, mastery, creativity, and responsibility.
The Twin-Win Model: A Human-centered approach to research success
Proceedings of the National Academy of Sciences, 115, 50 (2018) 12590-12594 [Published Version]
Abstract: A 70-year-old simmering debate has erupted into vigorous battles over the most effective ways to conduct research. Well-established beliefs are being forcefully challenged by advocates of new research models. While there can be no final resolution to this battle, this paper offers the Twin-Win Model to guide teams of researchers, academic leaders, business managers, and government funding policymakers. The Twin-Win Model favors a problem-oriented approach to research, which encourages formation of teams to pursue the dual goals of breakthrough theories in published papers and validated solutions that are ready for widespread dissemination. The raised expectations of simultaneously pursuing foundational discoveries and powerful innovations are a step beyond traditional approaches that advocate basic research first. Evidence from citation analysis and researcher interviews suggests that simultaneous pursuit of both goals raises the chance of twin-win success.
Interactive visual event analytics: Opportunities and challenges
Shneiderman, B., Plaisant, C.
IEEE Computer (2018, to appear)
Explainable Recommendation for Event Sequences: A Visual Analytics Approach
Ph.D. Dissertation from the Department of Computer Science (2018)
Abstract: People use recommender systems to improve their decisions, for example, item recommender systems help them find films to watch or books to buy. Despite the ubiquity of item recommender systems, they can be improved by giving users greater transparency and control. This dissertation develops and assesses interactive strategies for transparency and control, as applied to event sequence recommender systems, which provide guidance in critical life choices such as medical treatments, career decisions, and educational course selections. Event sequence recommender systems use archives of similar event sequences, such as patient histories or student academic records, to give users insight into the order and timing of choices, which are more likely to lead to their desired outcomes. This dissertation’s main contribution is the use of both record attributes and temporal event information as features to identify similar records and provide appropriate recommendations. While traditional item recommendations are generated based on choices by people with similar attributes, such as those who looked at this product or watched this movie, the event sequence recommendation approach allows users to select records that share similar attribute values and start with a similar event sequence, and then see how different choices of actions and the orders and times between them might lead to users’ desired outcomes. This dissertation applies a visual analytics approach to present and explain recommendations of event sequences. It presents a workflow for event sequence recommendation that is implemented in EventAction. Results from empirical studies show that these prototypes can assist users in making action plans and raise users’ confidence in following their plans. It presents case studies in three domains to demonstrate the effectiveness and safety of generating event sequence recommendations based on personal histories.
It also offers design guidelines for the construction of user interfaces for event sequence recommendation and discusses ethical issues in dealing with personal histories. This dissertation contributes an analytical workflow, an interactive system, and design guidelines identified in empirical studies and case studies, opening new avenues of research in explainable event sequence recommendations based on personal histories. It enables people to make better decisions for critical life choices with higher confidence.
A Task-based Taxonomy of Cognitive Biases for Information Visualization
Dimara, E., Franconeri, S., Plaisant, C., Bezerianos, A., Dragicevic, P.
IEEE Transactions on Visualization and Computer Graphics (2018) [Published Version]
Abstract: Information visualization designers strive to design data displays that allow for efficient exploration, analysis, and communication of patterns in data, leading to informed decisions. Unfortunately, human judgment and decision making are imperfect and often plagued by cognitive biases. There is limited empirical research documenting how these biases affect visual data analysis activities. Existing taxonomies are organized by cognitive theories that are hard to associate with visualization tasks. Based on a survey of the literature, we propose a task-based taxonomy of 154 cognitive biases organized in 7 main categories. We hope the taxonomy will help visualization researchers relate their design to the corresponding possible biases, and lead to new research that detects and addresses biased judgment and decision making in data visualization.
Visual Interfaces for Recommendation Systems: Finding Similar and Dissimilar Peers
Du, F., Plaisant, C., Spring, N., Shneiderman, B.
Transactions on Intelligent Systems and Technology, 38, 3 (2018) 21-29 [Published Version]
Abstract: Recommendation applications can guide users in making important life choices by referring to the activities of similar peers. For example, students making academic plans may learn from the data of similar students, while patients and their physicians may explore data from similar patients to select the best treatment. Selecting an appropriate peer group has a strong impact on the value of the guidance that can result from analyzing the peer group data. In this article, we describe a visual interface that helps users review the similarities and differences between a seed record and a group of similar records and refine the selection. We introduce the LikeMeDonuts, Ranking Glyph, and History Heatmap visualizations. The interface was refined through three rounds of formative usability evaluation with 12 target users, and its usefulness was evaluated by a case study with a student review manager using real student data. We describe three analytic workflows observed during use and summarize how users’ input shaped the final design.
Virtual memory palaces: immersion aids recall
Krokos, E., Varshney, A., Plaisant, C.
Virtual Reality (2018) [Published Version]
Abstract: Virtual reality displays, such as head-mounted displays (HMD), afford us a superior spatial awareness by leveraging our vestibular and proprioceptive senses, as compared to traditional desktop displays. Since classical times, people have used memory palaces as a spatial mnemonic to help remember information by organizing it spatially and associating it with salient features in that environment. In this paper, we explore whether using virtual memory palaces in a head-mounted display with head-tracking (HMD condition) would allow a user to better recall information than when using a traditional desktop display with a mouse-based interaction (desktop condition). We found that virtual memory palaces in the HMD condition provide a superior memory recall ability compared to the desktop condition. We believe this is a first step in using virtual environments for creating more memorable experiences that enhance productivity through better recall of large amounts of information organized using the idea of virtual memory palaces.
Observations and Reflections on Visualization Literacy at the Elementary School Level
Chevalier, F., Henry Riche, N., Boy, G., Alper, B., Plaisant, C., Elmqvist, N.
IEEE Computer Graphics & Applications Magazine (Visualization Viewpoints) 38, 3 (2018) 21-29 [Published Version]
Abstract: In this article, we share our reflections on visualization literacy and how it might be better developed in early education. We base this on lessons we learned while studying how teachers instruct, and how students acquire basic visualization principles and skills in elementary school. We use these findings to propose directions for future research on visualization literacy.
Designing a Medication Timeline for Patients and Physicians
Belden, J., Wegier, P., Patel, J., Hutson, A., Plaisant, C., Moore, J. L., Lowrance, N. J., Boren, S., Koopman, R.
Journal of the American Medical Informatics Association (2018) to appear
Abstract: Objective: Most electronic health records display historical medication information only in a data table or in clinician notes. We designed a medication timeline data visualization intended to improve ease of use, speed, accuracy, and safety in the ambulatory care of chronic disease. Materials and Methods: We identified information needs for understanding a medication history when managing chronic disease in primary care, then applied human factors and interaction design principles to the tools that support that process. Our methodology started with the research and analysis of existing medication lists and timelines, which guided initial requirements. Next, we hosted design workshops with multidisciplinary stakeholders from industry and academic disciplines to expand on our initial concepts. Subsequent weekly meetings of the core team used an iterative user-centered design approach to refine our prototype. Results: We propose an open source online prototype that incorporates user feedback from initial design workshops, subsequent target audience reader reviews, subject matter expert focus groups, and a target audience user survey. We describe the applicable design principles associated with each of the prototype’s key features. Discussion: There is industry interest in developing medication timelines based on the example prototype concepts. An open, standards-based technology platform could enable developers to create a medication timeline that could be deployable across any compatible health IT application. Conclusion: The design goal was to improve physician understanding of a patient’s complex medication history, using a medication timeline visualization. Such a design could reduce the temporal and cognitive load on physicians for improved and safer care.
Mining clinical big data for drug safety: Detecting inadequate treatment with a DNA sequence alignment algorithm
Ledieu, T., Bouzille, ., Plaisant, C., Thiessard, F., Polard, E., Cuggia, M.
Proc. American Medical Informatics Association annual symposium 2018
Abstract: Health data mining can bring valuable information for drug safety activities. We developed a visual analytics tool to find specific clinical event sequences within the data contained in a clinical data warehouse. To this aim, we adapted the Smith-Waterman DNA sequence alignment algorithm to retrieve clinical event sequences with a temporal pattern from the electronic health records included in a clinical data warehouse. A web interface facilitates interactive query specification and result visualization. We describe the adaptation of the Smith-Waterman algorithm, and the implemented user interface. The evaluation with pharmacovigilance use cases involved the detection of inadequate treatment decisions in patient sequences. The precision and recall results (F-measure = 0.87) suggest that our adaptation of the Smith-Waterman-based algorithm is well-suited for this type of pharmacovigilance activity. The user interface allowed the rapid identification of cases of inadequate treatment.
Visualization of temporal patterns in patient record data
Fundamental & Clinical Pharmacology, 32, 1 (2018) 85-87. [Published Version]
Abstract: Visualization contributes to a variety of tasks, from reviewing individual patient records to helping researchers assess data quality, find patients of interest, review temporal patterns and anomalies, or understand differences between cohorts. We review some of the visualization techniques developed at the University of Maryland.
Taking Big Paper and Sticky Notes to the 360th Degree
Golub, E., Agarwal, R., Carroll, D., Mendelsohn, A., Walters, M., Yue, C.
Abstract: The use of low-fidelity prototyping approaches has been a part of user-centered design and participatory/co-design for many years, dating back to at least the 1980s. However, the display experiences for which these were created (first desktops, then laptops, and later adding tablets and smartphones) are flat. The rise in interest about virtual reality (VR) headsets and other technologies that support the viewing of 360° spaces, as well as an increase in their availability, calls for updated low-fidelity prototyping approaches that still support co-design with diverse user populations. We present and discuss how to support collaboration between technical and non-technical design partners using supplies such as a consumer-grade 360° camera and tripod, along with common materials such as foam-core boards, basic metal easels, a standard color printer, paper, tape, and a variety of types of sticky note. The co-design is accomplished by creating, and then annotating during a design session, a basic representation of a 360° scene or experience using low-fidelity techniques, specifically a hybrid of the "big paper" and "sticky note" approaches, but taking them to the 360th degree.
SMIDGen: An Approach for Scalable, Mixed-Initiative Dataset Generation from Online Social Networks
Mauriello, M., Buntain, C., McNally, B., Bagalkotkar, S., Kushnir, S., Froehlich, J.
Abstract: Recent qualitative studies have begun using large amounts of Online Social Network (OSN) data to study how users interact with technologies. However, current approaches to dataset generation are manual, time-consuming, and can be difficult to reproduce. To address these issues, we introduce SMIDGen: a hybrid manual + computational approach for enhancing the replicability and scalability of data collection from OSNs to support qualitative research. We demonstrate how the SMIDGen approach integrates information retrieval (IR) and machine learning (ML) with human expertise through a case study focused on the collection of YouTube videos. Our findings show how SMIDGen surfaces data that manual searches might otherwise miss, increases the overall proportion of relevant data collected, and is robust against IR/ML algorithm selection.
Event analytics for innovation trajectories: Understanding inputs and outcomes for entrepreneurial success
Dempwolf, S., Shneiderman, B.
Technology and Innovation 19 (2017), 397-413 [Published Version]
Abstract: New analysis tools are expanding the options for innovation researchers. While previous researchers often speculated on the relationship between innovation inputs, such as patents or funding, and innovation outcomes, such as product releases or initial public offerings, new software tools enable researchers to analyze innovation event data more efficiently. Tools such as EventFlow make it possible to rapidly scan visual displays, algorithmically search for patterns, and study an aggregated view that shows common and rare patterns. This paper presents initial examples, using data from 34,331 drugs or medical devices, of how event analytic software tools, such as EventFlow, could be applied to innovation research.
Increasing Recognition of Wrong-Patient Errors through Improved Interface Design of a Computerized Provider Order Entry System
Taieb-Maimon, M., Plaisant, C., Hettinger, A., Shneiderman, B.
International Journal of Human–Computer Interaction, 34, 5 (2017) 383-398 [Published Version]
Abstract: Wrong-patient errors from inadvertent menu selections while using computerized provider order entry (CPOE) systems could have fatal consequences. This study investigated whether the manipulation of CPOE interface design could improve healthcare providers’ ability to recognize patient selection errors and also decrease the time to error recognition. Using a 2 × 2 design, 120 participants were randomly assigned to one of four groups, interacting with different versions of a simulated CPOE: (1) control – standard version; (2) highlighted selection – the selected patient row was highlighted for 2 s, by blanking the rest of the screen; (3) photo – photographs of patients’ faces were displayed in all screens; (4) combined – with photo and highlighted selection. Each participant navigated through five order scenarios. On the last scenario, an error was simulated by directing the participant to a wrong patient. Recognition rates of the wrong-patient error and times to error recognition were significantly improved for the highlighted selection, photo, and combined groups, relative to the control group. These results suggest that the addition of patient photos and highlighted selection could substantially reduce errors in CPOE systems and other applications.
Coping with Volume and Variety in Temporal Event Sequences: Strategies for Sharpening Analytic Focus
Du, F., Shneiderman, B., Plaisant, C., Malik, S., Perer, A.
IEEE Transactions on Visualization and Computer Graphics 23, 6 (2017) 1636-1649 [Published Version]
Abstract: The growing volume and variety of data presents both opportunities and challenges for visual analytics. Addressing these challenges is needed for big data to provide valuable insights and novel solutions for business, security, social media, and healthcare. In the case of temporal event sequence analytics it is the number of events in the data and variety of temporal sequence patterns that challenges users of visual analytic tools. This paper describes 15 strategies for sharpening analytic focus that analysts can use to reduce the data volume and pattern variety. Four groups of strategies are proposed: (1) extraction strategies, (2) temporal folding, (3) pattern simplification strategies, and (4) iterative strategies. For each strategy, we provide examples of the use and impact of this strategy on volume and/or variety. Examples are selected from 20 case studies gathered from either our own work, the literature, or based on email interviews with individuals who conducted the analyses and developers who observed analysts using the tools. Finally, we discuss how these strategies might be combined and report on the feedback from 10 senior event sequence analysts.
Understanding the Use of the Vistorian: Complementing Logs with Context Mini-Questionnaires
Molinero, V.S., Bach, B., Plaisant, C., Dufournaud, N., Fekete, J.
Proc. of the Workshop on Visualization for the Digital Humanities (2017) 1-5. [Published Version]
Abstract: The Vistorian is a web-based visual analytics tool including four different interactive visualizations. It allows digital humanists to analyze complex geolocated and temporal networks of individuals. A research prototype is now available to researchers. The challenge we try to address is: could we improve our understanding of how digital humanities research prototypes are being used “in the wild”? Standard usage logs are insufficient since they do not capture users’ intent or the reasons why they might struggle with a prototype. Here, we designed a novel lightweight combination of usage logs and mini-questionnaires attempting to consistently capture user intent and usage context. The paper first describes the Vistorian, then introduces our combined log and questionnaire methodology—with design principles and screen mockups. The technique will be pilot tested this summer, and deployed in the fall for evaluation with historians and their students.
The valuation of privacy premium features for smartphone apps: The influence of defaults and experts
Dogruel, L., Joeckel, S., Vitak, J.
Abstract: This study examines the impact of privacy defaults and expert recommendations on smartphone users' willingness to pay for "privacy-enhanced" features on paid applications using a 2 (privacy premium default/no privacy premium default) x 2 (privacy expert recommendation/non-privacy expert recommendation) experimental design. Participants (N = 309) configured four paid apps with respect to privacy features. Selecting premium privacy features was associated with an increased cost, while removing premium privacy features reduced the cost of the application. Replicating findings from behavioral economics on default modes in decision-making, we found that participants presented with apps with privacy premium default features were more likely to retain the more expensive privacy features. However, the recommendation source did not have a significant effect on this relationship. We discuss how these findings extend existing work on users' decision-making process around privacy and suggest potential avenues for nudging users' privacy behaviors on mobile devices.
Sharing Automatically Tracked Activity Data: Implications for Therapists and People with Mobility Impairments
Golub, E., Malu, M., Findlater, L.
Proceedings of PervasiveHealth 2017, 10 pages
Abstract: The ability to share automatically tracked health and fitness behaviors has yielded benefits ranging from increasing user motivation to providing therapists with greater insight into their patients' progress. While past work on sharing this data has primarily focused on users with typical motor abilities, features are now emerging in mainstream tracking technologies to extend to people with mobility impairments (e.g., tracking wheelchair rolling). This paper explores opportunities specifically for users with mobility impairments to share this automatically tracked data both with peers and with physical, occupational or recreational therapists. We conducted semi-structured interviews with 10 therapists and 10 people with mobility impairments. The interviews focused on current and desired activity-tracking and sharing practices, and included a design probe activity to more concretely assess the perceived utility of sharing tracked fitness data. We report on attitudes and concerns toward sharing fitness data from the perspective of therapists and people with mobility impairments as well as outline design opportunities to explore in future work.
Privacy Policies and Their Lack of Clear Disclosure Regarding the Life Cycle of User Information
In Technical Report FS-16: AAAI Fall Symposium Series on Privacy and Language Technologies. Arlington, Virginia. November 17-19, 2016.
Abstract: Companies, particularly those in the information and communications technology sector, collect, aggregate, and store immense amounts of information about billions of people around the world. Privacy policies represent the primary means through which companies articulate to the public how they manage this user information. Extensive research has documented the problems with such policies, including that they are difficult to understand. This paper presents an analysis of 23 policies from 16 of the world's largest internet and telecommunications companies and shows the specific ways that vague or unclear language hinders comprehension of company practice. It argues that the lack of clarity in such policies presents a significant barrier toward empowering people to make informed choices about which products or services to use. The incoherent language in privacy policies can also hinder the widespread adoption of machine learning or other techniques to analyze such policies. Clearer disclosure from companies about how they use, share, and retain all types of information they collect will shed light on what the life cycle of user information looks like.
Acknowledgement to Ranking Digital Rights for making this paper possible.
I Want to Believe: Journalists and Crowdsourced Accuracy Assessments in Twitter
Buntain, C., Golbeck, J.
Abstract: Evaluating information accuracy in social media is an increasingly important and well-studied area, but limited research has compared journalist-sourced accuracy assessments to their crowdsourced counterparts. This paper demonstrates the differences between these two populations by comparing the features used to predict accuracy assessments in two Twitter data sets: CREDBANK and PHEME. While our findings are consistent with existing results on feature importance, we develop models that outperform past research. We also show limited overlap exists between the features used by journalists and crowdsourced assessors, and the resulting models poorly predict each other but produce statistically correlated results. This correlation suggests crowdsourced workers are assessing a different aspect of these stories than their journalist counterparts, but these two aspects are linked in a significant way. These differences may be explained by contrasting factual with perceived accuracy as assessed by expert journalists and non-experts respectively. Following this outcome, we also show preliminary results that models trained from crowdsourced workers outperform journalist-trained models in identifying highly shared "fake news" stories.
A Preliminary Investigation of #a11y Tweets to Understand Accessibility Trends and Concerns
Zhang, J., Findlater, L.
Abstract: Building on recent work analyzing online content to identify accessibility trends and challenges, this poster paper presents preliminary analysis of one month of tweets using the #a11y hashtag. Our analysis of ~4000 tweets suggests that the most active users of this hashtag are accessibility professionals, with less representation from end users in creating new tweets. By far the most common mention is of visual accessibility concerns, although other types of accessibility are represented. Qualitative assessment of the tweets reveals that #a11y is used primarily for design and development tips or resources, self-promotion tweets, and comments on the accessibility of virtual and physical experiences. Finally, we highlight open questions and plans for future work with this type of data.
Finding Similar People to Guide Life Choices: Challenge, Design, and Evaluation
Du, F., Plaisant, C., Spring, N., Shneiderman, B.
To appear in Proc. of ACM CHI 2017
Abstract: People often seek examples of similar individuals to guide their own life choices. For example, students making academic plans refer to friends, patients refer to acquaintances with similar conditions, and physicians mention past cases seen in their practice. How would they want to search for similar people in databases? We discuss the challenge of finding similar people to guide life choices and report on a needs analysis based on 13 interviews. Our PeerFinder prototype enables users to find records that are similar to a seed record, using both record attributes and temporal events found in the records. A user study with 18 participants and four experts shows that users are more engaged and more confident about the value of the results to provide useful evidence to guide life choices when provided with more control over the search process and more context for the results, even at the cost of added complexity.
Toward Accessible Health and Fitness Tracking for People with Mobility Impairments
Malu, M., Findlater, L.
Proceedings of the 10th EAI International Conference on Pervasive Computing Technologies for Healthcare, May 2016.
Abstract: Electronic health and fitness trackers have received substantial attention over the past decade, from new mobile and wearable technologies to evaluations of potential health impacts. These trackers, however, may not be accessible to people with mobility impairments, for whom activities such as running, walking, or climbing stairs can be difficult or impossible. To investigate the accessibility of wearable tracking devices and mobile apps, we conducted a study with 14 participants with a range of mobility impairments. The study included an in-person interview, evaluation of two off-the-shelf wearable devices, and a participatory design activity, followed by an optional week-long field evaluation of a mobile fitness app (to which 8 participants opted in). Our findings highlight widespread accessibility challenges with existing tracking technologies and provide implications for designing more inclusive solutions.
Assessing the Necessary Skill Profiles for Playing Video Games
Norman, K., Wang, C., Barnet, J., Mahmud, R.
Abstract: It seems clear that different video games require different skills. However, there has been no systematic way of assessing what these skills are or for assessing the extent to which particular skills are required by a particular game. This study used a psychometric approach to help identify these skills and profile particular games and genres of video games. Experienced gamers generated a list of 32 skills and then a diverse sample of participants rated a number of games on the extent to which they required the skills. Factor analysis revealed seven general components: perceptual-motor, role-playing, numerical reasoning, problem solving, focus-persistence, acceptance of uncertainty, and player interaction. Different genres of games differed significantly on a number of these components. The resulting instrument can be used by the game industry to profile games for review and evaluation.
Gatherplots: Extended Scatterplots for Categorical Data
Park, D., Kim, S., Elmqvist, N.
Journal of Latex Class Files, Vol. 14, No. 8, August 2015
Abstract: Scatterplots are a common tool for exploring multidimensional datasets, especially in the form of scatterplot matrices (SPLOMs). However, scatterplots suffer from overplotting when categorical variables are mapped to one or two axes, or the same continuous variables are used for both axes. Previous methods such as histograms or violin plots for these cases aggregate marks, which makes brushing and linking difficult. To improve this, we propose gatherplots, an extension of scatterplots to manage overplotting for categorical data, while keeping individual object identities. In gatherplots, every data point that maps to the same position coalesces to form a stacked entity, thereby making it easier to see the overview of data groupings. The size and aspect ratio of data points can also be changed dynamically to make it easier to compare the composition of different groups. In the case of a categorical variable vs. a categorical variable, we propose a heuristic to decide bin sizes for optimal space usage. This means that gatherplots make better use of visual space to show the overall distribution. To validate our work, we conducted a crowdsourced user study that shows that gatherplots enable users to judge the relative portion of subgroups more quickly and more correctly than when using jittered scatterplots.
EventAction: Visual Analytics for Temporal Event Sequence Recommendation
Du, F., Plaisant, C., Spring, N., Shneiderman, B.
To appear in Proceedings of the IEEE Visual Analytics Science and Technology (2016)
Abstract: Recommender systems are being widely used to assist people in making decisions, for example, recommending films to watch or books to buy. Despite the ubiquity of recommender systems, the problem of presenting the recommendations of temporal event sequences has not been studied. We propose EventAction, which, to our knowledge, is the first attempt at a prescriptive analytics interface designed to present and explain recommendations of temporal event sequences. EventAction provides a visual analytics approach to (1) identify similar records, (2) explore potential outcomes, (3) review recommended temporal event sequences that might help achieve the users' goals, and (4) interactively assist users as they define a personalized action plan associated with a probability of success. Following the design study framework, we designed and deployed EventAction in the context of student advising and reported on the evaluation with a student review manager and three graduate students.
A Visual Analytics Approach to Comparing Cohorts of Event Sequences
Ph.D. Dissertation from the Department of Computer Science
Abstract: Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon.
To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes about the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome.
Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but are limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often require that researchers have a concrete question and do not facilitate more general exploration of the data.
Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security.
My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.
Related research: Coco: A Visual Analytics Tool for Comparing Cohorts of Event Sequences
Animations 25 Years Later: New Roles and Opportunities
Chevalier, F., Riche, N., Plaisant, C., Chalbi, A., Hurter, C.
To appear in ACM Proc. of Advanced Visual Interfaces (2016)
Abstract: Animations are commonplace in today's user interfaces. From bouncing icons that catch attention, to transitions helping with orientation, to tutorials, animations can serve numerous purposes. We revisit Baecker and Small's pioneering work Animation at the Interface, 25 years later. We reviewed academic publications and commercial systems, and interviewed 20 professionals of various backgrounds. Our insights led to an expanded set of roles played by animation in interfaces today for keeping in context, teaching, improving user experience, data encoding and visual discourse. We illustrate each role with examples from practice and research, discuss evaluation methods and point to opportunities for future research. This expanded description of roles aims at inspiring the HCI research community to find novel uses of animation, guide them towards evaluation and spark further research.
High-volume hypothesis testing for large-scale web log analysis
Malik, S., Koh, E.
Malik, S. and Koh, E., High-volume hypothesis testing for large-scale web log analysis. Extended Abstracts on Human Factors in Computing Systems, CHI '16, 2016 (to appear)
Abstract: Time-stamped event sequence data is being generated across many domains: shopping transactions, web traffic logs, medical histories, etc. Oftentimes, analysts are interested in comparing the similarities and differences between two or more groups of event sequences to better understand processes that lead to different outcomes (e.g., a customer did or did not make a purchase). CoCo is a visual analytics tool for Cohort Comparison that combines automated high-volume hypothesis testing (HVHT) with an interactive visualization and user interface for improved exploratory data analysis. This paper covers the first case study of CoCo for large-scale web log analysis and the challenges that arise when scaling a visual analytics tool to large datasets. The direct contributions of this paper are: (1) solutions to 7 challenges of scaling a visual analytics tool to larger datasets, and (2) a case study with three real-world analysts with these solutions implemented.
The Future Role of Thermography in Human-Building Interaction.
Mauriello, M., Dahlhausen, M., Brown, E., Saha, M., Froehlich, J.
Mauriello, M., Dahlhausen, M., Brown, E., Saha, M., & Froehlich, J. (2016) The Future Role of Thermography in Human-Building Interaction. CHI 2016 Workshop: Future of Human-Building Interaction (To Appear).
Abstract: With recent sensor improvements and falling costs, energy auditors are increasingly using thermography--infrared (IR) cameras--to detect thermal defects and analyze building efficiency. In this workshop paper, we view thermographic energy auditing as a Human-Building Interaction (HBI). We provide an overview of emerging thermal data collection techniques in research and industry. We also reflect on our own work in this area and present our vision of citizen science/DIY thermography (Figure 1), which has the potential to engage the public in new HBIs by expanding their ability to: perform energy audits, survey public infrastructure, and contribute to urban energy analysis.
Simplifying Overviews of Temporal Event Sequences
Mauriello, M., Shneiderman, B., Du, F., Malik, S., Plaisant, C.
Mauriello, M. L., Shneiderman, B., Du, F., Malik, S., Plaisant, C., Simplifying Overviews of Temporal Event Sequences, Extended Abstracts on Human Factors in Computing Systems, CHI '16 (2016) to appear
Abstract: Beginning the analysis of new data is often difficult as modern datasets can be overwhelmingly large. With visual analytics in particular, displays of large datasets quickly become crowded and unclear. Through observing the practices of analysts working with the event sequence visualization tool EventFlow, we identified three techniques to reduce initial visual complexity by reducing the number of event categories resulting in a simplified overview. For novice users, we suggest an initial pair of event categories to display. For advanced users, we provide six ranking metrics and display all pairs in a ranked list. Finally, we present the Event Category Matrix (ECM), which simultaneously displays overviews of every event category pair. In this work, we report on the development of these techniques through two formative usability studies and the improvements made as a result. The goal of our work is to investigate strategies that help users overcome the challenges associated with initial visual complexity and to motivate the use of simplified overviews in temporal event sequence analysis.
Understanding the Role of Thermography in Energy Auditing: Current Practices and the Potential for Automated Solutions
Mauriello, M., Norooz, L., Froehlich, J.
In CHI 2015 Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 1993-2002. DOI: 10.1145/2702123.2702528
Abstract: The building sector accounts for 41% of primary energy consumption in the US, contributing an increasing portion of the country's carbon dioxide emissions. With recent sensor improvements and falling costs, auditors are increasingly using thermography -- infrared (IR) cameras -- to detect thermal defects and analyze building efficiency. Research in automated thermography has grown commensurately, aimed at reducing manual labor and improving thermal models. Though promising, we could find no prior work exploring the professional auditor's perspectives of thermography or reactions to emerging automation. To address this gap, we present results from two studies: a semi-structured interview with 10 professional energy auditors, which includes design probes of five automated thermography scenarios, and an observational case study of a residential audit. We report on common perspectives, concerns, and benefits related to thermography and summarize reactions to our automated scenarios. Our findings have implications for thermography tool designers as well as researchers working on automated solutions in robotics, computer science, and engineering.
Data Visualization Tools for Investigating Health Services Utilization Among Cancer Patients
Onukwugha, E., Plaisant, C., Shneiderman, B.
In Hesse, B., Ahern, D., and Beckjord, E. (Eds.) Oncology Informatics, Elsevier (2016 to appear)
Abstract: The era of "big data" promises more information for health practitioners, patients, researchers, and policy makers. For big data resources to be more than larger haystacks in which to find precious needles, stakeholders will have to aim higher than increasing computing power and producing faster, nimbler machines. We will have to develop tools for visualizing information; generating insight; and creating actionable, on-demand knowledge for clinical decision making. This chapter has three objectives: 1) to review the data visualization tools that are currently available and their use in oncology; 2) to discuss implications for research, practice, and decision making in oncology; and 3) to illustrate the possibilities for generating insight and actionable evidence using targeted case studies. A few innovative applications of data visualization are available from the clinical and research settings. We highlight some of these applications and discuss the implications for evidence generation and clinical practice. In addition, we develop two case studies to illustrate the possibilities for generating insight from the strategic application of data visualization tools where the interoperability problem is solved. Using linked cancer registry and Medicare claims data available from the National Cancer Institute, we illustrate how data visualization tools unlock insights from temporal event sequences represented in large, population-based datasets. We show that the information gained from the application of visualization tools such as EventFlow can define questions, refine measures, and formulate testable hypotheses for the investigation of cancer-related clinical and process outcomes.
VisHive: Creating Ad-hoc Computational Clusters using Mobile Devices in Web-based Visualization
Sen, S., Badam, S., Elmqvist, N.
RBI: A New Approach to Rapid Generation of Big Ideas When Working in Intergenerational Design Teams
Golub, E., McNally, B., Druin, A.
Abstract: In an ideal world, there is time for all members of an intergenerational design team of children and adults to present, aggregate, and evaluate the suggestions that come out of work done with a design target concurrently by sub-groups during a session. However, when presented with either a relatively large set of features or not enough copies of prototypes to distribute, time or resource constraints mean this is not always realistic in practice. For those design experiences when time is short and quick design ideas are needed, a rapid evaluation of designs and big ideas generation can be utilized to provide feedback on numerous designs and/or features.
Understanding Adherence and Prescription Patterns Using Large Scale Claims Data
Bjarnadottir, M., Malik, S., Onukwugha, E., Gooden, T., Plaisant, C.
To appear in PharmacoEconomics
Purpose: Advanced computing capabilities and novel visual analytics tools now allow us to move beyond the traditional cross-sectional summaries to analyze longitudinal prescription patterns and the impact of study design decisions. For example, design decisions regarding gaps and overlaps in prescription fill data are necessary for measuring adherence using prescription claims data. However, little is known regarding the impact of these decisions on measures of medication possession (e.g., medication possession ratio). The goal of the study is to demonstrate the use of visualization tools for pattern discovery, hypothesis generation and study design.
Method: We utilize EventFlow, a novel discrete event sequence visualization software, to investigate patterns of prescription fills, including gaps and overlaps, utilizing large scale healthcare claims data. The study analyzes data of individuals who had at least two prescriptions for one of five hypertension medication classes: ACE inhibitors (ACE-I), Angiotensin II receptor blockers (ARB), Beta blockers (Beta), Calcium channel blockers (CCB) and Diuretics (Diur).
We focus on those members initiating therapy with Diuretics (19.2%) who may concurrently or subsequently take drugs in other classes as well. We identify longitudinal patterns in prescription fills for antihypertensive medications, investigate the implications of decisions regarding gap length and overlaps, and examine the impact on the average cost and adherence of the initial treatment episode.
Results: A total of 790,609 individuals are included in the study sample, 19.2% (N=151,566) of whom started on diuretics first during the study period. The average age is 52.4 years and 53.1% of the population is female. When the allowable gap is zero, 34% of the population has continuous coverage and the average length of continuous coverage is 2 months. In contrast, when the allowable gap is 30 days, 69% of the population shows a single continuous prescription period with an average length of 5 months. The average prescription cost of the period of continuous coverage ranges from $3.44 (when the maximum gap is 0 days) to $9.08 (when the maximum gap is 30 days). Results were less impactful when considering overlaps.
Conclusions: This proof-of-concept study illustrates the use of visual analytics tools in characterizing longitudinal medication possession. We find that prescription patterns and associated prescription costs are more influenced by allowable gap lengths than by definitions and treatment of overlap. Research using medication gaps and overlaps to define medication possession in prescription claims data should pay particular attention to the definition and use of gap lengths.
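The role of the allowable gap in this abstract can be made concrete with a small sketch. It is an illustration under stated assumptions, not the procedure used in the study or in EventFlow: the function name `coverage_episodes`, the `(fill_date, days_supply)` record shape, the rule for overlapping fills (the later end date wins, with no carry-over of surplus supply), and the sample fills are all hypothetical.

```python
from datetime import date, timedelta

def coverage_episodes(fills, allowable_gap_days=0):
    """Merge prescription fills into continuous-coverage episodes.

    fills: iterable of (fill_date, days_supply) tuples.
    A fill joins the current episode when it starts no more than
    `allowable_gap_days` after the previous supply runs out.
    Overlapping supply is not carried forward (an assumption).
    """
    episodes = []
    for start, days_supply in sorted(fills):
        end = start + timedelta(days=days_supply)  # first day without supply
        if episodes and (start - episodes[-1][1]).days <= allowable_gap_days:
            episodes[-1][1] = max(episodes[-1][1], end)
        else:
            episodes.append([start, end])
    return [(s, e) for s, e in episodes]

# Hypothetical fills: a 10-day gap between the first and second fill.
fills = [
    (date(2015, 1, 1), 30),
    (date(2015, 2, 10), 30),
    (date(2015, 3, 12), 30),
]
strict = coverage_episodes(fills, allowable_gap_days=0)
lenient = coverage_episodes(fills, allowable_gap_days=30)
```

With these sample fills, a zero-day allowable gap splits the history into two episodes, while a 30-day allowable gap yields a single continuous episode. This mirrors the direction of the effect reported in the abstract without reproducing its figures.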
Evaluating Multi-Column Bar Charts and Treemaps for Dense Visualization of Sorted Numeric Data
Yalcin, A., Elmqvist, N., Bederson, B.
Abstract: A single column bar chart can effectively visualize a sorted and labeled list of numeric records, such as salaries per employee. However, its height limits the number of visible records. As the number of records increases, scrolling requires interaction to see an overview, and using shorter bars hinders observing individual records. For dense visualization of sorted numeric data, we consider two multi-column bar chart designs, wrapped bars and piled bars, in addition to treemaps, a space-filling design that is commonly used to scale in the number of records. We evaluate their design characteristics and graphical perception performance by crowdsourcing under comparison, ranking and overview tasks. Our results suggest that multi-column designs can outperform the space-filling treemap design to show more records for comparison and overview tasks with training.
Data Comics: Sequential Art for Data-Driven Storytelling
Zhao, Z., Marr, R., Elmqvist, N.
Abstract: We present Data Comics, a novel method for storytelling using sequential art---also known as comics---constructed from data-driven visualizations. This allows for building narratives using comic layouts of panels containing both snapshots and live visualizations. Each panel in a comic layout can be decorated with visual comic symbols---such as captions, speech and thought bubbles, directional arrows, and motion lines---to augment the narrative. To validate our method, we implemented a web-based Data Comics application that consists of (1) a Clipper for capturing data-driven content from the web, (2) a Decorator for creating panels and adding comic symbols, (3) a Composer for arranging clips into comic strips, and (4) a Presenter for viewing a finished comic. We compared the method to PowerPoint slideshows in a qualitative study, and participants found Data Comics more engaging, efficient, and enjoyable.
Improving Public Transit Accessibility for Blind Riders by Crowdsourcing Bus Stop Landmark Locations with Google Street View: An Extended Analysis
Hara, K., Azenkot, S., Campbell, M., Bennett, C., Le, V., Pannella, S., Moore, R., Minckler, K., Ng, R., Froehlich, J.
To appear in ACM Transactions on Accessibility.
Head-Mounted Display Visualizations to Support Sound Awareness for the Deaf and Hard of Hearing
Jain, D., Findlater, L., Gilkeson, J., Holland, B., Duraiswami, R., Zotkin, D., Vogler, C., Froehlich, J.
Proceedings of CHI 2015, 10 pages
High-Volume Hypothesis Testing: Systematic Exploration of Event Sequence Comparisons
Malik, S., Du, F., Plaisant, C., Bjarnadottir, M., Shneiderman, B.
To appear in ACM Transactions on Interactive Intelligent Systems (2015)
Abstract: Cohort comparison studies have been traditionally hypothesis-driven and conducted in carefully controlled environments (such as clinical trials). Given two groups of event sequence data, researchers test a single hypothesis (e.g., does the group taking Medication A exhibit more deaths and earlier deaths than the group taking Medication B?). However, researchers are now moving towards more exploratory methods and retrospective analysis of existing data. High-Volume Hypothesis Testing (HVHT) becomes useful to compare datasets. Focusing on event sequences, we propose new techniques that provide context, effect, and flexibility during HVHT, and aid researchers in understanding HVHT results (how significant they are, why they are meaningful, and whether the entire dataset has been exhaustively explored). Using interviews and case studies with domain experts, we iteratively designed and implemented techniques dealing with prevalence, time, and frequency in a visual analytics tool, CoCo. These interaction techniques allow users to systematically and flexibly parse large result sets through filtering, searching, and journaling. We illustrate the utility of the method with a case study in the medical domain.
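As a rough illustration of what testing many hypotheses at once can look like for the prevalence dimension mentioned above, the sketch below runs one test per event type across two cohorts. The choice of a two-proportion z-test and a Bonferroni threshold is an assumption made for this example, not a description of the tests used in CoCo, and the function names and sample cohorts are hypothetical.

```python
import math

def two_proportion_p(k1, n1, k2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # identical, degenerate proportions: no evidence of a difference
    z = (k1 / n1 - k2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def prevalence_tests(cohort_a, cohort_b, alpha=0.05):
    """Test, for every event type, whether its prevalence differs by cohort.

    Each cohort is a list of event sequences (lists of event names).
    Returns (event, p_value, significant) tuples sorted by p-value,
    using a Bonferroni-corrected threshold (an assumed correction).
    """
    event_types = sorted({e for seq in cohort_a + cohort_b for e in seq})
    if not event_types:
        return []
    threshold = alpha / len(event_types)
    results = []
    for event in event_types:
        k1 = sum(event in seq for seq in cohort_a)
        k2 = sum(event in seq for seq in cohort_b)
        p = two_proportion_p(k1, len(cohort_a), k2, len(cohort_b))
        results.append((event, p, p < threshold))
    return sorted(results, key=lambda r: r[1])

# Hypothetical cohorts of 50 records each.
cohort_a = [["admit", "icu", "discharge"]] * 40 + [["admit", "discharge"]] * 10
cohort_b = [["admit", "icu", "discharge"]] * 5 + [["admit", "discharge"]] * 45
results = prevalence_tests(cohort_a, cohort_b)
```

Sorting by p-value gives a ranked result list of the kind that filtering and searching would then operate on. Time and frequency metrics would need their own tests and are not sketched here.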
Simplified Overviews for Temporal Event Sequences: Designs for Novice and Expert Analysts
Mauriello, M., Shneiderman, B., Du, F., Malik, S., Plaisant, C.
Contact Catherine Plaisant for a copy.
Abstract: Simplified overviews enable novices to more easily begin data analysis and enable experts to see common and surprising patterns. Simplified overviews have been used in research and commercial software for multi-variate data by choosing two dimensions to show on a scatterplot. We bring this idea to temporal event sequences, by facilitating the selection of two event categories. This simple strategy was inspired by observations of our case study partners and appreciated by pilot study users. The design was extended to provide six metrics for selecting categories that simplified the overview to display. To address the need of expert users, we also present simplified overviews using a lower triangular matrix of small overviews with all pairs of event categories. Along with single event category overviews shown on the diagonal they provide a revealing overview of the dataset. We believe these simplified overviews help novice and expert analysts to more rapidly and successfully extract insights. The design is implemented in the EventFlow software and refined based on two usability studies with 5 and 6 users. As a result of our work, guidelines for the design of simplified overviews are proposed.
Coping with Volume and Variety in Temporal Event Sequences: Strategies for Sharpening Analytic Focus
Du, F., Shneiderman, B., Plaisant, C., Malik, S., Perer, A.
To appear in IEEE Transactions on Visualization and Computer Graphics (2016) [Published Version]
Abstract: The growing volume and variety of data presents both opportunities and challenges for visual analytics. Addressing these challenges is needed for big data to provide valuable insights and novel solutions for business, security, social media, and healthcare. In the case of temporal event sequence analytics it is the number of events in the data and variety of temporal sequence patterns that challenges users of visual analytic tools. This paper describes 14 strategies for sharpening analytic focus that analysts can use to reduce the data volume and pattern variety. Four groups of strategies are proposed: (1) extraction strategies, (2) temporal folding, (3) pattern simplification strategies, and (4) iterative strategies. For each strategy we provide examples of use and of the impact of this strategy on volume and/or variety. Examples are selected from 18 case studies gathered from either our own work, the literature, or based on email interviews with application developers and analysts. Finally, we discuss how these strategies might be combined and opportunities for new technologies and user interfaces.
BodyVis: A New Approach to Body Learning Through Wearable Sensing and Visualization
Norooz, L., Mauriello, M., Jorgensen, A., McNally, B., Froehlich, J.
In CHI 2015 Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 1025-1034. DOI: 10.1145/2702123.2702299 [Published Version]
Abstract: Internal organs are invisible and untouchable, making it difficult for children to learn their size, position, and function. Traditionally, human anatomy (body form) and physiology (body function) are taught using techniques ranging from worksheets to three-dimensional models. We present a new approach called BodyVis, an e-textile shirt that combines biometric sensing and wearable visualizations to reveal otherwise invisible body parts and functions. We describe our 15-month iterative design process including lessons learned through the development of three prototypes using participatory design and two evaluations of the final prototype: a design probe interview with seven elementary school teachers and three single-session deployments in after-school programs. Our findings have implications for the growing area of wearables and tangibles for learning.
Social media affordances and their relationship to social capital processes
Ellison, N., Vitak, J.
Published in S. Sundar (Ed.), The handbook of psychology of communication technology (pp. 205-227). Boston: Wiley-Blackwell, (2015). [Published Version]
Abstract: This chapter considers the mechanisms by which social network site (SNS) use is associated with social capital processes, such as supporting beneficial interactions, information exchanges, and relationship maintenance. In doing so, we consider both the high-level affordances of SNSs, such as the persistence and visibility of content, as well as specific features of these sites, such as the profile. The chapter will proceed as follows: First, it will provide a review of research on social media and social network sites, highlighting the primary features and affordances of these sites. It will then synthesize the social capital literature, which is helpful for understanding how we access important human resources such as social and informational support from our social connections, before linking the two streams of research on SNSs and social capital by highlighting some of the key findings in recent years. In the next section, we turn to Ellison and boyd's (2013) revised definition of SNSs to consider the role played by the profile, the articulated network, and the broadcasted stream of content in social capital formation and development. To conclude the chapter, we draw from multiple research streams to examine social grooming practices in SNSs, focusing on the role of visible micro-transactions such as "liking" a comment on Facebook.
Korean mothers' KakaoStory use and its relationships to psychological well-being
Kim, J., Ahn, J., Vitak, J.
Published in First Monday, 20(3), (2015). [Published Version]
Abstract: This study investigates the relationship between life contexts, SNS use, and psychological well-being, by focusing on Korean mothers' interactions on a popular social network site (SNS), KakaoStory. Through analysis of survey and interview data, we find (1) a positive relationship between KakaoStory use and mothers' perceptions of positive relations with others (a construct of psychological well-being), but no relationship with overall life satisfaction; (2) employment status is an important contextual factor that influences Korean mothers' social connections, KakaoStory use, and psychological well-being; and, (3) working mothers lack opportunities for socialization and report lower levels of positive relations with others compared to stay-at-home mothers, when controlling for reported self-esteem. By analyzing these relationships, this study sheds light on the important role contextual factors play in determining women's use of social media and unpacks the effect of social media use on different dimensions of psychological well-being.
Balancing audience and privacy tensions on social network sites
Vitak, J., Blasiola, S., Patil, S., Litt, E.
Published in International Journal of Communication (2015).
Abstract: As social network sites grow and diversify in both users and content, tensions between users' audience composition and their disclosure practices become more prevalent. Users must navigate these spaces carefully to reap relational benefits while ensuring content is not shared with unintended audiences. Through a qualitative study of highly engaged Facebook users, this study provides insight into how people conceptualize "friendship" online, as well as how perceived audience affects privacy concerns and privacy management strategies. Findings suggest an increasingly complex relationship between these variables, fueled by collapsing contexts and invisible audiences. While a diverse range of strategies are available to manage privacy, most participants in this sample still engaged in some degree of self-censorship.
Cohort Comparison of Event Sequences with Balanced Integration of Visual Analytics and Statistics
Malik, S., Du, F., Monroe, M., Onukwugha, E., Plaisant, C., Shneiderman, B.
In ACM Intelligent User Interfaces (IUI) 2015. Atlanta, GA, USA, 38-49. (2015)
DOI: 10.1145/2678025.2701407 [Published Version]
Abstract: Finding the differences and similarities between two datasets is a common analytics task. With temporal event sequence data, this task is complex because of the many ways single events and event sequences can differ between the two datasets (or cohorts) of records: the structure of the event sequences (e.g., event order, co-occurring events, or event frequencies), the attributes of events and records (e.g., patient gender), or metrics about the timestamps themselves (e.g., event duration). In exploratory analyses, running statistical tests to cover all cases is time-consuming and determining which results are significant becomes cumbersome. Current analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. This paper presents a taxonomy of metrics for comparing cohorts of temporal event sequences, showing that the problem-space is bounded. We also present a visual analytics tool, CoCo (for "Cohort Comparison"), which implements balanced integration of automated statistics with an intelligent user interface to guide users to significant, distinguishing features between the cohorts. Lastly, we describe two early case studies: the first with a research team studying medical team performance in the emergency department and the second with pharmacy researchers.
Discovering temporal changes in hierarchical transportation data: Visual analytics & text reporting tools
Guerra Gomez, J., Pack, M., Plaisant, C., Shneiderman, B.
To be published in the Transportation Research Part C: Emerging Technologies, Volume 51, February 2015 Pages 167-179
Analyzing important changes to massive transportation datasets like national bottleneck statistics, passenger data for domestic flights, airline maintenance budgets, or even publication data from the Transportation Research Record can be extremely complex. These types of datasets are often grouped by attributes in a tree structure hierarchy. The parent-child relationships of these hierarchical datasets allow for unique analytical opportunities, including the ability to track changes in the dataset at different levels of granularity, over time or between versions. For example, analysts can use hierarchies to uncover changes in the patterns of passengers flying in the United States over the last ten years, breaking down the data by states, cities, airports, and number of passengers. Exploring changes in travel patterns over time can help carriers make better decisions regarding their operations and long-range planning.
This paper describes TreeVersity2, a web-based data comparison tool that provides users with information visualization techniques to find what has changed in a dataset over time. TreeVersity2 enables users to explore data that can be inherently hierarchical or not (by categorizing them by their attributes). An interactive textual reporting tool complements the visual exploration when the amount of data is very large. The results of two case studies conducted with transportation domain experts along with the results of an exit questionnaire are also described. TreeVersity2 preloaded with several demo datasets can be found at http://treeversity.cattlab.umd.edu along with several example videos.