Table of Contents
Forschungsseminar Master Medieninformatik / Research Seminar Master Media Informatics
WS 2018/19
In this research seminar, students conduct medium-sized research projects around generic and specific research questions in human-computer interaction.
Organizers / Advisors: Raphael Wimmer, Andreas Schmid, Jürgen Hahn, Florian Bockes
Groups and Topics
A: Evaluation of the Regensburger Usability Platform (RUP)
Members: Dominik Deller, Julia Grötsch, Philipp Schuhbauer
The Regensburger Usability Platform is a synchronous remote usability testing tool. It is the first iteration in an ongoing development process that so far lacks a proper evaluation. In order to build a roadmap for further iterations and implementations, the tool was evaluated in two steps. First, a market analysis and expert interviews were conducted to gather information about functions and attributes that should be considered for future updates, and about how well regarded remote testing tools are overall. (more...)
B: Tasks and metrics for the comparison of user interfaces
Members: Daniel Schmaderer, Anna-Maria Auer
Poor usability and performance limit the use of a user interface. In order to improve both, evaluating a user interface is essential. In this paper we present a set of tasks and metrics that can be used for the comparison and analysis of photo storage applications. To build the tasks and select the metrics, we combined a literature review with interviews and questionnaires. This ensures that both scientific findings and actual user needs are represented. To evaluate the task corpus's completeness and adequacy, we evaluated the photo storage application Google Photos in a main study with 20 participants. The results show that our set of tasks and metrics allows a comparison of the photo storage application Google Photos: it enables a deeper insight into users' opinions and reveals the weaknesses of the software. Based on this set, researchers can evaluate photo storage applications with various user interfaces and compare them. The study, and therefore the paper, was motivated by the fact that there are hardly any standardized criteria that allow a neutral comparison of UIs in general. We focused on the UIs of photo storage applications and examined what such a task corpus could look like and which metrics are necessary to determine the aforementioned performance. (more...)
C: Influence of latency and latency variance on UX and performance
Members: Vitus Maierhöfer, Felix Riedl, Andrea Fischer
The influence of base latency on task performance and user experience while interacting with a computer system is a well-researched topic. Yet most components of a system do not exhibit a stable but a fluctuating latency. How this variance affects performance and user experience has not been researched. We therefore explore how latency and latency variance influence performance and experience in the context of computer gaming. Before designing the study, we identified an adequate method for measuring the latency characteristics of our study prototypes. Next, a preliminary study with 30 participants was conducted using a custom low-latency HID and a custom game prototype in order to explore valid parameter ranges for base latency and latency variance. Based on the insights of the preliminary study, a more extensive main study with 33 participants tested the influence of base latency and latency variance in a Fitts'-Law-style game. We could establish the influence of base latency on player performance and the influence of high base latency on pleasure while playing. The results for different latency variances showed no differences, but may be influenced by limitations of our prototype: a setup based on open-source, off-the-shelf components and software was chosen for higher replicability of the study, which came with a high end-to-end latency and a reduced range of absolute latency variances. (more...)
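As background (standard HCI material, not taken from the group's paper): in a Fitts'-Law-style pointing task, the expected movement time MT is commonly modeled with the Shannon formulation, where D is the distance to the target, W its width, and a, b are empirically fitted constants:

```latex
% Shannon formulation of Fitts' law (standard in the HCI literature):
% MT = movement time, D = distance to target, W = target width,
% a and b are fitted per device/condition.
MT = a + b \cdot \log_2\!\left(\frac{D}{W} + 1\right)
```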
D: Comparing Ranking Methods and Likert-type scales
Members: Julia Sageder, Ariane Demleitner, Oliver Irlbacher
Likert-type scales are a popular tool in questionnaires and evaluations for gathering reviews and opinions from participants in the evaluation process. The need for alternatives to Likert-type scales has arisen from a set of scientifically ambiguous guidelines for their creation on the one hand, and from various criticisms of incorrect statistical analysis of Likert-type scales on the other. Within the evaluation process, several objects of investigation can be ranked and thus rated via their assigned rank. But do ranking methods yield the same results and winners as Likert-type scales? A preliminary study (18 German and Colombian participants) and a main study (24 participants) were conducted to investigate this question. The results of the explorative pre-study led us to the design of the ranking method for the main study. Our findings from the main study show that multiple ranking scales (evaluated with the Schulze method) determine the same ranking winner as Likert-type scales. For this reason, it can be assumed that multiple ranking scales are an alternative to the commonly used Likert-type scales. Due to a mistake in our study design, we cannot prove that participants are faster with multiple ranking scales than with Likert-type scales; further studies have to answer this question. (more...)
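For readers unfamiliar with the Schulze method mentioned above, the following is a minimal sketch in Python (our own illustration, not the group's evaluation code): it counts pairwise preferences from ranked ballots and ranks candidates by strongest-path strengths.

```python
def schulze(candidates, ballots):
    # Pairwise preference counts: d[a][b] = number of voters preferring a over b.
    # Each ballot is a strict ranking of all candidates, best first.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        for i, a in enumerate(ballot):
            for b in ballot[i + 1:]:
                d[a][b] += 1
    # Strongest path strengths via a Floyd-Warshall-style update.
    p = {a: {b: d[a][b] if d[a][b] > d[b][a] else 0 for b in candidates}
         for a in candidates}
    for i in candidates:
        for j in candidates:
            if j == i:
                continue
            for k in candidates:
                if k == i or k == j:
                    continue
                p[j][k] = max(p[j][k], min(p[j][i], p[i][k]))
    # Order candidates by how many rivals they beat on strongest paths.
    wins = {a: sum(p[a][b] >= p[b][a] for b in candidates if b != a)
            for a in candidates}
    return sorted(candidates, key=wins.get, reverse=True)

# Example: three ballots over candidates A, B, C -> ['A', 'B', 'C']
print(schulze(["A", "B", "C"], [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]))
```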
E: A non-invasive approach for measuring input device latency
Members: Paul Winderl, Johannes Dengler, Thomas Oswald
Research in latency measurement is conducted since decades, but has ongoing relevance to users of input devices. Therefore we investigate a new approach to this topic. Bockes, Schmid and Wimmer [1] describe an invasive way to measure input latency for different types of devices. We pick up their idea and conduct experiments with the same devices but using a non-invasive approach. This means, that we focus on pressing a button of an input-device that is connected to a Raspberry Pi 2 via USB to trigger an input-event and measure the time it takes to arrive at the system. The button-press is achieved by a robot that physically triggers said button through the movement of a measuring head. (more...)
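The measurement idea can be sketched in a few lines of Python (an illustration under assumed details, not the group's actual setup: we assume the robot signals the moment of contact on GPIO pin 17 and the tested device appears as /dev/input/event0):

```python
# Minimal sketch: estimate end-to-end input latency on a Raspberry Pi.
# Assumptions (not from the paper): the robot's contact is signalled on
# GPIO pin 17, and the device under test is /dev/input/event0.
import time
import RPi.GPIO as GPIO
from evdev import InputDevice, ecodes

GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.IN, pull_up_down=GPIO.PUD_UP)

device = InputDevice("/dev/input/event0")

GPIO.wait_for_edge(17, GPIO.FALLING)   # robot touches the button
t_contact = time.time()

for event in device.read_loop():       # first key-down event after contact
    if event.type == ecodes.EV_KEY and event.value == 1:
        latency = event.timestamp() - t_contact
        print(f"input latency: {latency * 1000:.2f} ms")
        break

GPIO.cleanup()
```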
F: Comparing File Managers
Members: Philipp Weber, Khang Ho
Managing one's files is one of the most basic functions a computer offers. Considering the time one spends with file management, it would be desirable to do so as efficiently as possible. There seems to be a lack of research on how the choice of file manager influences the file managing process and users' task completion times. Because of that, we compare three different types of file managers on different operating systems (OS) to find out whether certain ones are better suited for specific tasks and whether the same type of file manager performs differently depending on the OS. We compare each OS's standard file manager, the cross-platform dual-pane file manager Double Commander, and each OS's terminal on Windows 10, Ubuntu 18.10 and macOS 10.4 (as this was the only version available to us through our university). The results of our study with 30 participants show that the default file manager performs best overall, while the terminal performs worst. However, this does not hold true for all tasks. The difference in performance suggests that certain file managers are better suited for certain tasks than others. We also found no significant difference between the OSs used when performing the tasks. (more...)
G: Why Augmented Reality is Awesome and Nobody Uses It
Members: Marlena Wolfes, Christine Schikora, Sümeyye Reyyan Yildiran
The emergence of Augmented Reality (AR) in the last decades was characterized by a shift from exclusively industrial and governmental use to commercial accessibility. This includes an expansion and advancement of the underlying technologies and therefore an increasing complexity and loss of overview of available methods. For this reason, it is difficult for developers to find suitable technologies and methods for evaluating AR applications that offer high validity and reliability. To overcome this issue, a taxonomy could provide an overview of all available options as well as illustrate the links between single elements within a larger hierarchy. However, few classifications of AR technologies and evaluation methods can be found in the literature. In our paper we propose taxonomies of AR technologies and evaluation methods and show their applicability, generalizability and expandability by analyzing state-of-the-art AR publications and the technologies and evaluation methods they use. By applying different selection criteria, we reduced the corpus of publications covering the years 2015 to 2017 from 405 to 135. We analyzed usage frequencies, drew comparisons to results from the literature, and identified limitations, problems and trends in current AR research. The most preferred technologies in our analysis are visual displays with head-mounted or handheld positioning, optical tracking sensors, and touch and body motion as input types. Regarding evaluation-related results, user performance and system technology were the most preferred evaluation areas, and task-based application tests the most preferred evaluation method. (more...)
H: Replicating Studies
Members: Marlene Bauer, Florian Habler
Replication of research should be seen as one of the cornerstones of good and honest science (Wilson et al., 2013). Replications confirm results, strengthen research, and ensure that findings rest on solid foundations. However, studies indicate that an alarming number of published study results cannot be reproduced or are never replicated. In the field of human-computer interaction (HCI) and computer science, replication is performed even less frequently than in other fields. We discuss two main questions: (i) What are the reasons for this undesirable effect? (ii) How can it be remedied? We focus on the impact of replication on science, show why it is important, describe the challenges, and give guidelines that should be followed to enable valuable scientific replication. We replicated relevant parts of a study by Harrison et al. (Harrison et al., 2010) on the subjective perception of the duration of various progress bars, applying our guidelines to the conduct of the study. We compared five different behaviors of progress bars, with colors alternating at different frequencies from light blue to blue. In study 1, the slow increasing behavior has a statistically significant effect relative to the fast increasing behavior. The three behaviors slow increasing, fast increasing and constant are generally perceived as faster than the slow decreasing and fast decreasing behaviors. Study 2 indicates that lower progress bars are perceived as faster than taller ones. When comparing progress bars and progress circles, participants more often perceived the progress bar as faster. The challenges and implementation problems are discussed and the results of both studies are analyzed in detail. For replication, it is important that authors provide as much accurate data as possible from the original study. (more...)
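The five behaviors can be thought of as monotone mappings from actual progress to displayed progress, where "increasing" and "decreasing" refer to the rate of displayed progress. The sketch below is a purely hypothetical illustration; the exponents are illustrative guesses, not the functions used by Harrison et al. or by this replication:

```python
# Hypothetical illustration only: progress-bar behaviors modeled as monotone
# mappings from actual progress x in [0, 1] to displayed progress f(x).
# The exponents are illustrative, not taken from the study.
behaviors = {
    "constant":        lambda x: x,              # steady rate
    "slow increasing": lambda x: x ** 1.5,       # rate grows slowly
    "fast increasing": lambda x: x ** 3,         # rate grows quickly
    "slow decreasing": lambda x: x ** (1 / 1.5), # rate shrinks slowly
    "fast decreasing": lambda x: x ** (1 / 3),   # rate shrinks quickly
}

# Print the displayed progress at 25%, 50% and 75% actual progress.
for name, f in behaviors.items():
    samples = ", ".join(f"{f(x):.2f}" for x in (0.25, 0.5, 0.75))
    print(f"{name:16s} -> {samples}")
```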
I: User performance with mobile and desktop applications
Members: Michael Hebeisen, Jonathan Seibold
The two main contexts of human-computer interaction are stationary and mobile. It is not clear how the different usage behaviors and form factors of stationary and mobile computers lead to different interaction styles and could affect user performance. In this paper, we therefore researched the differences in user task performance between mobile and desktop devices. As preparation, we identified common tasks for both system types in focus groups. From these results, we constructed three tasks for our main study, which we conducted with 34 participants. We conclude that, for our subset of comparable tasks, neither system performed better overall across all three tasks. Only for one task involving writing did the desktop system perform significantly better with regard to task completion time and error rate. The NASA-TLX scores collected for the respective tasks also showed no significant difference in perceived mental load between mobile and desktop devices. (more...)
J: Single vs. Multiple Tangibles
Members: Matthias Schenk, Fabian Zürcher
Tasks of a tangible user interface built for a multi-touch table are examined to answer the question whether users prefer one single tangible with dynamic functionality or multiple specific tangibles. (more...)
Latest Blog Posts
Group C: Impact of latency and latency variance on the uni life balance (2019-03-31)
This blog entry is a summary of the last semester and our work. (more...)
Group G: Over and out - Final Reflection and Goodbye (2019-03-27)
In this blog article we reflect on our project progress and methods from the last months and describe how we wrote our rebuttal. (more...)
Group A: Discussion & Conclusion (2019-03-22)
This entry covers the discussion regarding our evaluation and a final conclusion. (more...)
Group A: Evaluation (2019-03-21)
Evaluation of the Regensburger Usability Platform (more...)
Group C: Writing the paper (2019-03-18)
This blog entry is about our paper and includes some excerpts. Furthermore, our rebuttal letter and review approach are covered. (more...)
Group D: Writing of the Rebuttal (2019-03-08)
To help improve our paper, seven people took the time to write a review. (more...)
Group F: Evaluation of the study (2019-03-08)
In this blog entry we present the results that we use in our submission. Not every evaluation is included; only the combinations of variables that we deemed important for our research. Different statistical tests were used. (more...)
Group G: Analysis and Discussion of the Literature Research Results (2019-03-07)
This blog article will show you the most important results, limitations and conclusions of our research. (more...)
Group B: Results and current work (2019-03-01)
First results and current work (more...)
Group E: Results and Analysis (2019-02-28)
We evaluated and analyzed our data. (more...)
Group G: The first results of our publication analysis (2019-02-21)
First results are presented. (more...)
Group D: Writing of the paper (2019-02-19)
After conducting the user tests (pre-study and main study), we discussed central thoughts and findings and began writing the paper. (more...)
Group D: Results of the main study (2019-02-18)
Evaluation of the main study and analysis of the data to see if ranking is an alternative to Likert scales. (more...)
Group C: Related work (2019-02-18)
During our study, we read a lot of literature about latency. This blog entry provides a summary of our related work, with the goal of easing the process of writing our paper. (more...)