Evaluation Methods and Replicability of Software Architecture Research Objects


Context: Software architecture (SA) as a research area has experienced an increase in empirical research, as identified by Galster and Weyns in 2016 [1]. Empirical research builds a sound foundation for the validity and comparability of research. A current overview of the evaluation and replicability of SA research objects could help the community discuss its empirical standards. However, no such overview exists.

Objective: We aim to assess the current state of practice in evaluating SA research objects and providing replication artifacts in full technical conference papers from 2017 to 2021.

Method: We first create a categorization of papers regarding their evaluation and provision of replication artifacts. In a systematic literature review (SLR) of 153 papers, we then investigate how SA research objects are evaluated and how artifacts are made available.

Results: We found that technical experiments (28%) and case studies (29%) are the most frequently used evaluation methods across all research objects. Functional suitability (46% of evaluated properties) and performance (29%) are the most frequently evaluated properties. 17 papers (11%) provide replication packages, and 97 papers (63%) explicitly state threats to validity. 17% of papers reference guidelines for evaluations, and 14% reference guidelines for threats to validity.

Conclusions: Our results indicate that the generalizability and repeatability of evaluations could be improved to enhance the maturity of the field, although there are valid reasons for contributions not to publish their data. From our findings we derive a set of four proposals for improving the state of practice in evaluating software architecture research objects. Researchers can use our results to find recommendations on relevant properties to evaluate and evaluation methods to use, and to identify reusable evaluation artifacts for comparing their novel ideas with other research. Reviewers can use our results to compare the evaluation and replicability of submissions with the state of practice.


author={Konersmann, Marco and Kaplan, Angelika and K{\"u}hn, Thomas and Heinrich, Robert and Koziolek, Anne and Reussner, Ralf and J{\"u}rjens, Jan and al-Doori, Mahmood and Boltz, Nicolas and Ehl, Marco and Fuch{\ss}, Dominik and Gro{\ss}er, Katharina and Hahner, Sebastian and Keim, Jan and Lohr, Matthias and Sa\u{g}lam, Timur and Schulz, Sophie and T{\"o}berg, Jan-Philipp},
booktitle={2022 IEEE 19th International Conference on Software Architecture (ICSA)},
title={Evaluation Methods and Replicability of Software Architecture Research Objects},