Discussion & Conclusion (2019-03-22)

Tagged as: discussion, conclusion
Group: A

This entry covers the discussion of our evaluation and a final conclusion.

Discussion

Both systems received similar and rather positive feedback. Every participant was able to finish the given tasks, sometimes with a small hint; only once, during the evaluation of the RUP, did a participant need so much guidance that we had to mark the task as failed. Nevertheless, some key features showed weaknesses in both systems. While the RUP was commended for its simplicity and straightforwardness, Morae was perceived as overwhelming because of its plethora of configuration options and functions. What has to be improved in the RUP is the missing feedback for nearly all user interactions. The user interface design of both tools also leaves room for improvement: in both cases, participants missed a clearer call to action for some functions while working on their tasks.

In addition, both tools showed communication problems between test leader and test subject. Restricting communication to a text-based chat drew many negative remarks. The best solution would be to integrate a voice chat directly into the RUP; running a third-party voice chat program alongside the test would be a workable alternative (a minimal sketch of such an integration follows at the end of this section). Our study showed that the communication between test leader and test subject should not be limited in any way, because without the ability to interact with each other properly, neither side can ensure the best possible results.

While performing our main study, we also encountered several limitations. One inherent limitation of remote testing tools is the variance in hardware setups and network connections, which makes the gathered data hard to compare and cannot easily be controlled. In our case the hardware and the network connection were always the same, yet problems still occurred that were out of our hands. In one test, for instance, the network connection went down and we had to pause the test for several minutes. In another, Morae froze and we had to restart the whole system, including reloading the study configuration and restarting the entire session. If such problems occur in a real remote usability study, they may lead to a complete loss of data, because the spatial separation may make them impossible to resolve. Weak or outdated hardware and a slow or unstable network connection are equally critical, since they may cause delays in the stream massive enough to make testing impossible.

Another limitation was our own task design. Some tasks were described in too much detail, which led to very short completion times, and the scope of the individual tasks varied too widely; both made the tasks hard to compare with each other. Combined with the small number of test participants, this made it impossible to derive reliable findings from the task completion times.

Finally, the interviewed expert did not mention the advantage of synchronous remote testing tools for hard-to-reach test subjects. This may be due to the niche character of the research field; to gain further insight, additional interviews with scientists in this area would have to be conducted.
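To illustrate the voice chat suggestion, here is a minimal sketch of how the test leader's side could be wired up with the browser's WebRTC API. It assumes the RUP already maintains a messaging channel between the two parties that can carry the signalling data; the endpoint wss://rup.example.org/signaling and the message shapes are illustrative assumptions, not part of the actual RUP.

    // Sketch only: offering side of a WebRTC voice chat for the RUP.
    // The signalling endpoint and message format are assumptions.
    const signaling = new WebSocket("wss://rup.example.org/signaling"); // hypothetical
    const peer = new RTCPeerConnection({
      iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
    });

    // Capture the microphone and send an offer to the other side.
    async function startVoiceChat(): Promise<void> {
      const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
      stream.getTracks().forEach((track) => peer.addTrack(track, stream));
      const offer = await peer.createOffer();
      await peer.setLocalDescription(offer);
      signaling.send(JSON.stringify({ type: "offer", sdp: offer.sdp }));
    }

    // Play the remote side's audio as soon as it arrives.
    peer.ontrack = (event) => {
      const audio = new Audio();
      audio.srcObject = event.streams[0];
      void audio.play();
    };

    // Forward ICE candidates over the existing channel.
    peer.onicecandidate = (event) => {
      if (event.candidate) {
        signaling.send(JSON.stringify({ type: "ice", candidate: event.candidate }));
      }
    };

    // Handle the answer and remote candidates.
    signaling.onmessage = async (msg) => {
      const data = JSON.parse(msg.data);
      if (data.type === "answer") {
        await peer.setRemoteDescription({ type: "answer", sdp: data.sdp });
      } else if (data.type === "ice") {
        await peer.addIceCandidate(data.candidate);
      }
    };

The answering side would mirror this with createAnswer(); error handling and reconnection are omitted. The point is that the RUP's existing chat channel could double as the signalling channel, so no extra infrastructure beyond a STUN/TURN server would be needed.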

Conclusion

The market analysis showed that remote testing tools are often bundled with the service of conducting and evaluating the study. Having to install and set up the test themselves can affect participants negatively and lead to worse results than in a lab setup, so allowing easy participation in a study should be a primary concern. We therefore suggest a redesign of the software: the application should be hosted on a server, which reduces the overhead for the participant and at the same time solves the problem of operating system compatibility (a minimal sketch of this follows at the end of this entry). Both the market analysis and the interviews raised the issues of requiring a stable internet connection, transmission delays, and impaired communication. These aspects need to be addressed in another study and should be considered key factors for the quality and usability of such a system. Another drawback is the dependency on the tool and on the participants' hardware, both for recording and for conducting the test.

In conclusion, it is safe to state that the RUP will need further updates and improvements to become a fully valuable tool for remote usability testing. The core feature set is in place and functioning; nonetheless, all of these features must be improved with regard to their usability, and new features must be added to broaden the range of use cases. Compared to Morae, one of its main contenders, the RUP shows clear potential. New features might increase its value, but it is important that they are added with the user in mind. We propose adding them one at a time, each with its own market analysis, expert interviews, and usability tests.

Admittedly, as our expert interviews have shown, remote testing currently seems to play a minor role both in the academic and in the economic world. Neither our related work nor our experts gave reasons why remote testing should be used more or why it would be the best way to do usability testing; rather the opposite was the case. Even our test participants stated they would prefer to test, or be tested, face to face.
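As a closing illustration of the proposed redesign, here is a minimal sketch of serving the participant client from a server, so that participants only need a browser and no local installation. It uses Node's built-in http module; the directory layout, port, and URL are illustrative assumptions.

    // Sketch only: serve a (hypothetical) web build of the RUP client,
    // so participants join with a browser instead of installing anything.
    import * as http from "http";
    import * as fs from "fs";
    import * as path from "path";

    const CLIENT_DIR = path.join(__dirname, "client"); // assumed build output

    const server = http.createServer((req, res) => {
      // Map "/" to the entry point, everything else to static files.
      const requested = req.url === "/" ? "index.html" : (req.url ?? "").slice(1);
      const filePath = path.join(CLIENT_DIR, path.normalize(requested));

      // Refuse paths that would escape the client directory.
      if (!filePath.startsWith(CLIENT_DIR)) {
        res.writeHead(403);
        res.end();
        return;
      }

      fs.readFile(filePath, (err, data) => {
        if (err) {
          res.writeHead(404);
          res.end("Not found");
        } else {
          res.writeHead(200); // Content-Type handling omitted for brevity
          res.end(data);
        }
      });
    });

    // A participant would join a study by opening a link such as
    // https://rup.example.org/?study=42 -- no installation, no OS dependency.
    server.listen(8080, () => console.log("RUP participant client on port 8080"));

In a real deployment the same server could also host the signalling channel sketched in the discussion above, keeping the whole setup down to a single link the participant has to open.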