Establishing a Benchmark Dataset for Traceability Link Recovery Between Software Architecture Documentation and Models


In research, evaluation plays a key role in assessing the performance of an approach. Many types of studies can be used for evaluation, each with different properties. Benchmarks have the benefit of establishing clearly defined standards and baselines. However, when creating new benchmarks, researchers face various problems in identifying potential data, mining it, and creating baselines. As a result, some research domains do not have any benchmarks at all. This is the case for traceability link recovery between software architecture documentation and software architecture models. In this paper, we create and describe an open-source benchmark dataset for this research domain. With this benchmark, we define a baseline using a simple approach based on information retrieval techniques. This way, we provide other researchers with a way to evaluate and compare their approaches.


author="Fuch{\ss}, Dominik and Corallo, Sophie and Keim, Jan and Speit, Janek and Koziolek, Anne",
editor="Batista, Thais and Bure{\v{s}}, Tom{\'a}{\v{s}} and Raibulet, Claudia and Muccini, Henry",
title="Establishing a Benchmark Dataset for Traceability Link Recovery Between Software Architecture Documentation and Models",
booktitle="Software Architecture. ECSA 2022 Tracks and Workshops",
publisher="Springer International Publishing",