Accepted papers for SAM 2025
The papers for SAM 2025 will be published in the MODELS 2025 Companion proceedings by IEEE.
-
Richard Qualis.
Mitigating Hallucinations in SysML v2 Generation Using LLMs and a Tri-Layered Knowledge Graph Reasoning Framework
This paper presents a structured reasoning pipeline that integrates Large Language Models (LLMs) with a tri-layered knowledge graph (KG) framework to automate the generation of SysML v2 modeling artifacts from structured requirements. The approach addresses the persistent challenge of LLM hallucinations in model-based systems engineering (MBSE) by grounding generative outputs in curated knowledge sources. Two core KGs are constructed using a custom Python-based Reasoning Engine: one encodes reusable SysML patterns annotated across nine diagram types, and the other captures domain-specific system models (e.g., aerospace, automotive). A third, system-specific KG is automatically synthesized by parsing capability-linked requirements and aligning them with the foundational KGs and structured prompt templates. These KGs enable context-aware, hierarchical system model generation and inform a dual-parameter prompting strategy: one set is automatically injected after KG construction to preserve domain integrity, and another is derived through our Reasoning Engine over system data prior to LLM invocation. The resulting prompts are validated and refined to produce accurate, high-fidelity SysML v2 representations. This framework offers a scalable path for integrating generative AI into digital engineering (DE) workflows. Ongoing work includes semantic alignment, ontology integration, traceability, prompt optimization, and the addition of a knowledge graph-based memory and planner to enable our reasoner to efficiently execute step-by-step plans, further advancing AI-assisted MBSE.
-
Hamza Haoui, Bianca Wiesmayr, David Hästbacka and Kari Systä.
Service-oriented Modeling of Mixed-Fleet Systems in SysML v2 in a Harbor Logistics Scenario
Modern logistics systems aim to leverage digital technologies and may integrate autonomous components to increase efficiency and flexibility. In so-called mixed-fleet systems, human workers, manually operated machines, and autonomous machines collaboratively work towards a common goal. The subsystems are loosely coupled and can be reconfigured flexibly, leading to a change in behavior. Addressing this behavior requires flexible architectures for model-driven systems engineering. Service-oriented architectures help focus on the expected behavior, independent of the involved actors. Additionally, event-based communication mechanisms can describe expected interactions between subsystems. This paper explores the use of SysML v2 for modeling a service-oriented architecture of mixed-fleet systems. Based on an available set of requirements, suitable SysML v2 modeling elements are identified that can describe services, events, and service choreographies. We use the described concepts to create a SysML v2 model of a mixed-fleet harbor logistics use case. Based on this model, we demonstrate how business processes can be composed from reusable services and how requirements and verification can be integrated to ensure correctness of behavior. The results show that SysML v2 fulfills key modeling requirements for service-oriented architectures, separating service definitions, actors, and verification. Reusable modeling patterns were applied to support scalability and traceability across actors and services within the model. Furthermore, domain-specific constraints and requirements were composed into modeling elements using formal mechanisms to ensure that they are not only documented but actively connected to the model.
-
Mihal Brumbulli and Emmanuel Gaudin.
Optimizing Industrial Operations through Business Process Formalization
Large-scale industrial projects often experience significant delays and cost overruns due to the inherent complexity of modern industrial operations. These challenges necessitate robust process control mechanisms to optimize lead times and expenditures. To enhance operational efficiency and ensure compliance with international standards such as ISO, the formalization of business processes has become imperative. This formalization serves as a fundamental step toward effective activity monitoring within complex organizations. When properly modeled, business processes can be further optimized through verification and simulation techniques. This study examines the research conducted by PragmaDev in collaboration with Airbus under the OneWay project framework. The primary objectives of the project included the verification and simulation of business processes to assess cost implications and lead-time efficiency. Additionally, the study explored methodologies for managing model variability and developing digital twin architectures to enhance industrial process optimization.
-
Zaki Pauzi and Andrea Capiluppi.
Using Concept Traceability to Investigate UML Class Diagram Evolution in Long-Existing FOSS Projects
Traceability of design artifacts is key to software maintainability, yet little empirical evidence exists on UML model evolution in long-existing FOSS projects. We refreshed a 2017 dataset (251 class diagrams across 81 projects) with 735 diagrams from December 2024. Despite a nearly threefold increase in raw diagram count, our concept-tracing analysis found virtually no substantive maintenance: early results show most diagrams remained unchanged or experienced only cosmetic updates. Only one diagram update corresponded to a security design proposal. These findings suggest UML dormancy, highlighting a disconnect between model intent and practice in FOSS.
-
Antonio Bucchiarone, Benoit Combemale, Alfonso Pierantonio, Nelly Bencomo, Mark van den Brand, Jean-Michel Bruel, Antonio Cicchetti, Juri Di Rocco, Leen Lambers, Judith Michael, Bernhard Rumpe, Mikael Sjödin, Gabriele Taentzer, Matthias Tichy, Hans Vangheluwe, Manuel Wimmer and Steffen Zschaler.
Modeling: The Heart and Soul of Engineering Smart Ecosystems
The pervasive digitalization of our world has ushered in a new era marked by increased complexity and diversity in the development, optimization, and maintenance of modern software-intensive systems. These systems, often characterized by intricate socio-technical components and AI integration, pose challenges for conventional systems engineering approaches and require an alliance of different disciplines. In this vision paper, we argue that they demand a paradigm shift towards integrative modeling across systems engineering, software engineering, data science, and simulation engineering. We highlight the key challenges faced in the development of modern complex systems that need to be addressed by this paradigm shift. We argue that achieving this shift requires new research, tools, and education.
-
Sebastian Bergemann, Andreas Bayha, Derui Zhu, Mohammad Sadeghi, Colin Atkinson and Alexander Pretschner.
Mind the Leak: Formalizing Confidentiality Preservation Assessment of Multi-Model Consistency Checking Systems
Ensuring confidentiality during multi-model consistency checking is a critical challenge in collaborative systems engineering. However, it is not yet clear how to assess and compare current and future solutions for multi-model consistency checking with regard to confidentiality. Therefore, this paper introduces a formalized system model for confidentiality-preserving consistency checking. A formalization of the confidentiality preservation capability of such a system model is proposed to assess whether a given consistency checking system prevents unauthorized information leakage under specific assumptions. Based on these definitions and formalizations, we present an assessment method where an abstract system model is derived from either an implemented or conceptualized consistency checking system, and our confidentiality formalization is applied to assess the system's guarantees for model data confidentiality. Our approach provides system and software engineers with a structured method to assess the confidentiality preservation capability regarding model data in their multi-model consistency checking systems, helping them to identify potential weaknesses and guiding improvements to enhance confidentiality where needed. To demonstrate the applicability of our framework, we apply it to an existing prototype of a partly confidentiality-preserving consistency checking system, as well as two improved versions, assessing their confidentiality preservation strengths and areas for improvement.
-
Maged Elaasar, Abdelwahab Hamou-Lhadj, Bentley Oakes and Mohammad Hamdaqa.
Model-Based Systems Engineering Perspectives: A Survey of Practitioner Experiences and Challenges
Model-based systems engineering (MBSE) is an emerging field, aiming to bring traceability and complexity management to the systems engineering process. However, multiple conceptual, technical, and organizational challenges continue to impede the effective deployment of MBSE in practice. This paper reports the results of a survey completed by 76 MBSE researchers and/or practitioners, on their organization's use of MBSE. Our analysis indicates that many organizations have yet to fully leverage MBSE. Several have not completely transitioned to MBSE in their systems engineering processes, or do not adhere to any specific method, indicating a lack of a comprehensive and organization-wide MBSE approach. We find that challenges such as change management, cross-team collaboration, and tool customization persist. We report on these challenges and provide recommendations as potential solutions.
-
Jawher Jerray, Bastien Sultan and Ludovic Apvrille.
Fine-Grained Confidentiality and Authenticity Modeling and Verification for Embedded Systems
Handling cybersecurity during system design is mandatory for (critical and) connected embedded systems. Numerous contributions, including standards like ISO 26262, emphasize the need to address cybersecurity as early as possible in the design process. Design space exploration, typically performed early in system design, before software or hardware development, offers an opportunity for early cybersecurity integration. SysML-Sec has demonstrated how cybersecurity concepts can be incorporated into design space exploration. However, its security mechanisms have significant limitations in addressing some modern threats. This paper introduces a new security modeling and verification approach. Our method enables multi-pattern security channels, allowing multiple security patterns to coexist within a single communication channel. It also supports fine-grained verification of individual write and read operations, ensuring that confidentiality and authenticity are independently validated for each data exchange. Additionally, our approach generates traceable counterexamples for unverified properties, helping engineers identify and address security vulnerabilities. We implemented this technique in TTool/DIPLODOCUS, a UML/SysML-based framework for hardware/software co-design, demonstrating how its enhanced version can now support more advanced security mechanisms, and evaluated it on an automotive case study.
-
Christian Seifert, Christian Steger and Tiberio Fanti.
Bridging the V-Model: Early Pre-Verification of Digital System Architectures via Estimation and Back-Annotation
In modern system design, substantial engineering effort is invested in architectural solutions that later prove suboptimal, if not impractical. This is often the case because the modelling abstraction level adopted during the concept evolution of a system does not allow for capturing early implementation aspects that have a relevant impact on its resource demand, as well as on its performance. This paper presents a model-based methodology, supported by an open-source software platform (DESIRE), for early-stage pre-verification of architectural alternatives, aiming to reduce costly late-stage redesigns. Leveraging formally defined and easily composable node abstractions, the proposed framework enables design space exploration (DSE) at a component level without requiring full RTL implementation. Each node of the network can be adjusted by capturing externally observable behaviour, such as instruction invocation patterns or service delays, while remaining agnostic to internal execution logic. To improve the fidelity of the estimation models, empirical trace data from detailed simulations or physical systems is back-annotated into node semantics. This process enriches high-level estimation with statistically grounded timing and control-flow characteristics, enabling performance analysis on abstract yet comparable scales. Additionally, we address the reuse of back-annotated data across parameter variants and configurations to support scalable system estimation modelling. The methodology is demonstrated on a RISC-V-based NFC data handler, exploring how node-based estimation enables comparative evaluation of early architectural decisions, such as instruction distribution, communication bottlenecks, and contention profiles, before committing to RTL synthesis. While the case study is modest in scope, the framework is designed to scale to complex multi-core architectures and heterogeneous systems. Future extensions may include SDL support, potentially enabling protocol-level behaviours to be integrated within the trace-calibrated framework.
-
Zakaria Hachm, Théo Le Calvar, Hugo Bruneliere and Massimo Tisi.
Towards LLM Agents for Model-Based Engineering: A Case in Transformation Selection
In Model-Based Engineering (MBE), practitioners frequently face the challenge of selecting appropriate tools from a large number of options. This requires both deep domain-specific knowledge and technical expertise. LLM-based agents are software components that depend on Large Language Models (LLMs) to autonomously select and apply software tools to perform specific tasks. Although LLMs have already been applied to support various MBE activities, considering LLM-based agents to autonomously assist users of MBE tools remains underexplored. This is particularly challenging in industrial MBE environments where only medium-size on-premise LLMs can be used due to company policies related, for instance, to security or data privacy. To investigate the potential of LLM-based agents for MBE, we start with model-to-model transformation as a core MBE technique. Currently, off-the-shelf agents such as Microsoft Copilot can invoke a transformation engine (e.g. ATL) when the task is explicitly described. However, these agents struggle to select the correct transformation when they only have limited contextual information, especially when coupled with medium-size LLMs. To overcome this, we propose an approach based on complementary elements. First, we build a model transformation server and an LLM agent with dedicated tools for each transformation available on the server. Second, to enable the agent to efficiently select transformations, we rely on a tool retrieval technique based on a tool relevance score computed by an LLM. We evaluate our LLM agent on a model-to-model transformation dataset that we also contribute to the community. Our comparative study shows that the newly proposed LLM agent responds more accurately to user instructions.
-
Øystein Haugen, Stefan Klikovits, Martin Arthur Andersen, Jonathan Beaulieu, Francis Bordeleau, Joachim Denil and Joost Mertens.
DarTwin made precise by SysML v2 – An Experiment
The new SysML v2 adds mechanisms for the built-in specification of domain-specific concepts and language extensions. This feature promises to facilitate the creation of Domain-Specific Languages (DSLs) and interfacing with existing system descriptions and technical designs. In this paper, we review these features and evaluate SysML v2’s capabilities using concrete use cases. We develop DarTwin DSL, a DSL that formalizes the existing DarTwin notation for Digital Twin (DT) evolution, through SysML v2, thereby supposedly enabling the wide application of DarTwin’s evolution templates using any SysML v2 tool. We demonstrate DarTwin DSL, but also point out limitations in the currently available tooling of SysML v2 in terms of graphical notation capabilities. This work contributes to the growing field of Model-Driven Engineering (MDE) for DTs and combines it with the release of SysML v2, thus integrating a systematic approach with DT evolution management in systems engineering.
-
Yaxin Zou, Zhibin Yang, Hao Liu, Jiawei Liang, Zonghua Gu and Yong Zhou.
Automated AADL Architecture Modeling: Leveraging Large Language Models for Safety-Critical Software
Architecture modeling is an essential part of model-driven development of safety-critical software. AADL is a modeling language standard for designing and analyzing safety-critical software. However, there is usually a large gap between software requirements and architectural design. Effectively transforming requirements into formal software architecture models relies on a lot of manual experience and iterative exploration. In order to address these challenges, we conduct an exploratory study of using LLMs for fully automated AADL modeling. We assess three powerful LLMs: GPT-4o, DeepSeek-V3 and GLM-4-Plus. First, we decompose and refine high-level software requirements and design constraints, mapping them to the proposed prompt framework, RNL-Prompt, which significantly improves the accuracy of LLMs in generating different modeling elements. Our findings reveal that GPT-4o and DeepSeek-V3 perform on par with each other, and both outperform GLM-4-Plus on complex modeling elements, such as modes and the behavior annex. To enhance the potential of GLM-4-Plus, we optimize its performance using N-shot prompting and retrieval-augmented generation (RAG). The results indicate that N-shot prompting performs more effectively. Finally, we demonstrate the effectiveness of our proposed approach in generating AADL architecture models on five examples from safety-critical domains. In addition, we implement an LLM-based modeling tool on top of OSATE, the AADL open-source environment, which supports GPT-4o, DeepSeek-V3 and GLM-4-Plus. The tool has been successfully applied to the modeling of an avionics control system in industry.
-
Farzaneh Kargozari and Sanaa Alwidian.
A Real-Time Multi-modal Framework for Human-Centric Requirements Engineering in Autonomous Vehicles
Trustworthy and human-centric adaptation remains a central challenge for autonomous vehicles (AVs), which operate in dynamic and uncertain environments. This paper proposes a real-time, multi-modal, self-adaptive framework that operationalizes Human-Centric Requirements Engineering (HCRE) by treating contextual signals as non-functional requirements (NFRs). The proposed framework integrates driver emotions, behaviors, traffic conditions, and vehicle dynamics within an interpretable neural architecture to deliver proactive behavior recommendations aligned with drivers’ needs. Unlike prior approaches that rely on static rules or thresholds, our framework continuously elicits, monitors, and fulfills latent human-centric goals, such as cognitive comfort, trust, and perceived safety, through transparent and context-aware adaptation. Trained and evaluated on the AIDE dataset, the system achieves high accuracy across perception modules (83–93%) and 89.32% exact match accuracy for integrated behavior recommendations. It satisfies real-time constraints with an average inference latency of 106.84 ms and maintains interpretability through explicit mappings from multi-modal input to adaptive output. The results demonstrate the feasibility of embedding HCRE principles, particularly dynamic NFR fulfillment, into the core of AV control architectures, enabling emotionally responsive and stakeholder-aligned autonomous systems.
-
Emmanuel Charleson Dapaah and Jens Grabowski.
Model-Driven Root Cause Analysis for Trustworthy AI: A Data-and-Model-Centric Explanation Framework
Building trust in AI systems requires not only accurate models but also mechanisms to diagnose why machine learning (ML) pipelines succeed or fail. In this work, we propose a model-driven Root Cause Analysis (RCA) framework that attributes pipeline performance to interpretable factors spanning both data properties and model configurations. Unlike post-hoc explainers that approximate black-box behavior, our approach learns a faithful, inherently interpretable meta-model using Explainable Boosting Machines (EBM) to capture the mapping from data complexity and hyperparameters to predictive accuracy. To evaluate this framework, we curated a large-scale meta-dataset comprising 81,000 Decision Tree pipeline runs generated from 270 OpenML datasets combined with 300 hyperparameter configurations. The RCA meta-model achieved high predictive fidelity (R^2 = 0.90, MAE = 0.030), far outperforming a mean-regressor baseline (R^2 ≈ 0). This fidelity ensures that feature attributions reflect genuine performance determinants rather than artifacts of an ill-fitting surrogate. Beyond predictive accuracy, we assessed the attribution validity of our RCA framework through observational analysis of ten representative pipelines (five high-performing and five low-performing) drawn from the test set. Results show that the attributions are concise, with typically fewer than three dominant contributors per case, making them easy to interpret. In success cases, low class overlap and balanced distributions dominate attributions, while failure cases are driven by severe imbalance, harmful interactions, and, in some cases, context-dependent effects such as redundant dense features. Hyperparameter effects emerge as secondary but aggravating under challenging conditions. These findings demonstrate that our RCA framework provides theoretically grounded yet empirically adaptive explanations, enabling robust root cause analysis for trustworthy AI.