Integrated Data as a Foundation for Systems Engineering
This is a continuation of part 3a of this blog series. In the first section, I discussed the SE work products and underlying data and information that are generated as part of each of the SE lifecycle activities. I then addressed the questions: “What is a model?” and “What is model-based SE (MBSE)?”
In this section I discuss the concept of Integrated Data as the foundation of SE and provide alternative definitions of Systems Engineering from a data-centric perspective.
Figure 1: Integrated Data as the foundation for SE
As shown at the top of Figure 1, work products such as budgets, schedules, requirements, designs, diagrams, drawings, SysML or other language-based models, processes, and plans are created as part of the SE lifecycle process activities. The underlying data and information representing these work products are stored and managed electronically, either in databases or as files and documents. This data and information can be combined into an integrated dataset that represents those work products.
Guides, standards, policies, and procedures can also be included in the integrated dataset. The integrated dataset is managed via the enterprise and project data governance, information management, records management, and DB administration requirements and processes. In order for the data and information to be considered the “ground truth” as discussed earlier, the integrated dataset must be maintained and managed under strict configuration control.
To form an integrated dataset, the organization first needs to define and document a project ontology. The ontology formally names and defines the terms, entities, data types, and properties that are fundamental to the project and to the organization the project is part of, and it defines the relationships among them. Each project within the organization needs to define and document an ontology that is consistent with the organization’s ontology as well as its customer’s ontology. Having a documented ontology for an organization and the projects within it helps ensure consistent use of this information across all lifecycle stage activities and across the various groups within and external to the organization.
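As a rough illustration of what such an ontology might look like in machine-readable form, here is a minimal Python sketch. The class names (EntityType, RelationType, Ontology), the example terms, and their definitions are all hypothetical and chosen only for illustration. The consistency rule used here (a term shared with the parent ontology must carry the same definition) is an assumption about what "consistent" could mean, not a prescribed method.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    """A named, defined term in the ontology, with its typed properties."""
    name: str
    definition: str
    properties: tuple = ()  # (property name, data type) pairs


@dataclass(frozen=True)
class RelationType:
    """A named relationship allowed between two entity types."""
    name: str
    source: str
    target: str


@dataclass
class Ontology:
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_entity(self, entity):
        self.entities[entity.name] = entity

    def add_relation(self, relation):
        # A relationship may only connect terms the ontology already defines.
        for end in (relation.source, relation.target):
            if end not in self.entities:
                raise KeyError(f"undefined entity type: {end}")
        self.relations.append(relation)

    def is_consistent_with(self, parent):
        """True if every term shared with `parent` has the same definition."""
        return all(
            parent.entities[name].definition == entity.definition
            for name, entity in self.entities.items()
            if name in parent.entities
        )


# Hypothetical organization-level and project-level ontologies.
REQ_DEF = "A statement of what the system must do or be."
REQ_PROPS = (("id", "str"), ("text", "str"))

org = Ontology()
org.add_entity(EntityType("Requirement", REQ_DEF, REQ_PROPS))

project = Ontology()
project.add_entity(EntityType("Requirement", REQ_DEF, REQ_PROPS))
project.add_entity(EntityType(
    "VerificationActivity",
    "An activity that produces evidence that a requirement is met.",
    (("id", "str"), ("method", "str")),
))
project.add_relation(
    RelationType("verified_by", "Requirement", "VerificationActivity"))

print(project.is_consistent_with(org))  # True
```

In this sketch the project ontology extends the organization's ontology with its own terms, while any term the two share must keep the same definition.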
Based on this ontology, the project needs to define a master schema for the project’s integrated dataset. The schema is a description, in a formal language, of the database structure: it defines the objects in the databases, shows how real-world entities are modelled in the database, and specifies the integrity constraints that ensure compatibility between parts of the schema.
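To make the idea of a schema with integrity constraints concrete, here is a minimal sketch using SQL as the formal language and SQLite (through Python's standard sqlite3 module) as a stand-in database. The table and column names are hypothetical, and a real project's master schema would be far larger. The point is only that the constraints let the database itself reject data that would break the links between work products.

```python
import sqlite3

# Hypothetical fragment of a master schema: requirements, design elements,
# and the allocation relationship that ties them together.
MASTER_SCHEMA = """
CREATE TABLE requirement (
    req_id TEXT PRIMARY KEY,
    text   TEXT NOT NULL
);
CREATE TABLE design_element (
    element_id TEXT PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE allocation (
    req_id     TEXT NOT NULL REFERENCES requirement(req_id),
    element_id TEXT NOT NULL REFERENCES design_element(element_id),
    PRIMARY KEY (req_id, element_id)
);
"""


def open_dataset(path=":memory:"):
    """Create a database that follows the master schema."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(MASTER_SCHEMA)
    return conn


conn = open_dataset()
conn.execute("INSERT INTO requirement VALUES ('R-1', 'The system shall log all faults.')")
conn.execute("INSERT INTO design_element VALUES ('E-1', 'Fault logger')")
conn.execute("INSERT INTO allocation VALUES ('R-1', 'E-1')")

try:
    # 'E-404' is not a known design element, so the constraint rejects the row.
    conn.execute("INSERT INTO allocation VALUES ('R-1', 'E-404')")
except sqlite3.IntegrityError as err:
    print("rejected:", err)
```

The foreign-key constraints are what keep an allocation from pointing at a requirement or design element that does not exist in the dataset.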
All databases and file management repositories in the project’s integrated dataset need to have a schema consistent with the project’s master schema to ensure compatibility of the data and information, allowing the data and information to be shared among the various SE tools in the project’s toolset, and allowing data and information to be shared across various groups within and external to the organization.
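One way a project might check a tool's repository against the master schema is sketched below, again using SQLite as a stand-in. The helper names (table_columns, schema_mismatches) are hypothetical. The rule applied is an assumption about what "consistent" means: every table in the repository must exist in the master schema, and it must carry the master's columns with the same declared types.

```python
import sqlite3


def table_columns(conn):
    """Return {table name: {column name: declared type}} for a connection."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {
        # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
        t: {col[1]: col[2].upper()
            for col in conn.execute(f'PRAGMA table_info("{t}")')}
        for t in tables
    }


def schema_mismatches(master, repo):
    """List the ways the `repo` schema departs from the `master` schema."""
    problems = []
    master_cols, repo_cols = table_columns(master), table_columns(repo)
    for table, cols in repo_cols.items():
        if table not in master_cols:
            problems.append(f"{table}: not defined in master schema")
            continue
        for name, ctype in master_cols[table].items():
            if name not in cols:
                problems.append(f"{table}.{name}: missing")
            elif cols[name] != ctype:
                problems.append(
                    f"{table}.{name}: type {cols[name]}, expected {ctype}")
    return problems


# Hypothetical master schema and one tool repository that has drifted from it.
master = sqlite3.connect(":memory:")
master.execute(
    "CREATE TABLE requirement (req_id TEXT PRIMARY KEY, text TEXT NOT NULL)")

tool_repo = sqlite3.connect(":memory:")
tool_repo.execute("CREATE TABLE requirement (req_id INTEGER PRIMARY KEY)")

for problem in schema_mismatches(master, tool_repo):
    print(problem)
```

A check like this could run whenever a new tool or repository is added to the project's toolset, so that incompatibilities surface before data needs to be exchanged.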
As shown at the bottom of Figure 1, once the integrated dataset has been populated, it becomes the source for subsequent system lifecycle activities, resulting work products, and their underlying data and information. The database management tools allow the project’s SE toolset to access data and information from the integrated dataset, which becomes the foundation of all the project’s SE lifecycle activities. This view of SE from a data-centric perspective is essential to managing the system development efforts across all lifecycle stages and to addressing the challenges of increasingly complex systems of the future.
Different organizations or parts of an organization are often involved in the various SE processes for each lifecycle stage, using a variety of project management and SE tools to produce and manage the various work products and their underlying data and information. Without a view of SE from a data-centric perspective, it is common for each organizational element to use its own SE tools, databases, and file repositories containing this underlying data and information. This can result in stovepipes, where data and information generated by one group or organizational element using a specific tool are either not made available to those involved in the other lifecycle processes, or the tools used and the resulting data and information are incompatible, preventing the data and information from being shared.
For increasingly complex systems of the future, these issues need to be addressed. Organizations need to be able to manage the large number of work products and their underlying data and information electronically rather than in printed documents, diagrams, or drawings. The datasets representing these work products and underlying data and information need to be shareable. Organizations need to establish and maintain the “ground truth” as well as remove stovepipes and integrate work products and their underlying data and information across disciplines and system development lifecycle process activities into a common, integrated dataset.
SE from a Data-Centric Perspective Defined
To help emphasize the concept that data is the foundation of SE, the authors propose the following modified definitions of SE from a data-centric perspective:
“SE, from a data-centric perspective, involves the formalized use of a common, integrated dataset to support concept maturation, requirements development, design, analysis, verification and validation activities beginning in the conceptual design phase and continuing throughout the later life cycle phases. The integrated dataset represents the SE work products and their underlying data and information generated during each lifecycle phase.”
Or, in a slightly shorter, single-sentence version: “SE, from a data-centric perspective, involves the formalized application of a common, integrated dataset to represent the SE work products and underlying data and information generated to support concept maturation, requirements development, design, analysis, and verification and validation activities throughout the system life cycle, from conceptual design to retirement.”
Or more briefly still (although perhaps too brief): “SE, from a data-centric perspective, involves the formalized application of a common, integrated dataset to represent the SE work products and underlying data and information generated throughout the system life cycle.”
In the Vitech paper (DLONG 2016), One Model to Coordinate Them All, David Long discusses the concept of an overall model that coordinates all other models. He states: “Requirements models, activity models, interface models, parametric models, reliability models, thermal models, power models, finite element models, … the list goes on and on. In this drive towards model-based systems engineering (MBSE) – and ultimately model-based engineering to connect the product lifecycle – how can we make sense of this vast portfolio of models? How can we effectively manage the models and use them to gain leverage over the problem at hand so that we engineer the system rather than becoming distracted by our models? The models for these analytic dimensions are not new. These are the models that engineering disciplines have developed over the years. Which we choose differs based upon the system of interest, and the set of analytic models chosen bring rigor, effectiveness, and efficiency to the systems engineering.”
“Within the INCOSE community, we often focus on a second type of model – what many call the descriptive systems model, what I often term the architectural systems model. This covers the space from concept of operations through requirements, behavior, physical architecture, and verification & validation.”
“There is one and only one architectural model – broad in scope, fundamentally interconnected in nature – and that architectural model connects and coordinates the diverse analytic models. Done well, the architectural model addresses both the problem and solution, reflecting and integrating the key dimensions of both in a manner that clearly reflects the interconnected nature of the system. Done well, the architectural model aligns and maps key terminology across disciplines and concerns, connecting the various perspectives and analytical considerations. In addressing needs, logical solution, physical solution, and V&V, the descriptive model is highly connected.”
From a data-centric perspective of SE, this one architectural model is represented by the common, integrated dataset. Building on this concept, and taking a broader, data-centric view, the integrated dataset represents not only a model of the system under development (the architectural model) but also a model of all the SE lifecycle process activities, resulting work products, and their underlying data and information.
With a data-centric perspective of SE, the capability to capture, manage, and access data, and to manage the interrelationships between SE work products, can be provided through a variety of approaches. These range from a single relational database to a virtually integrated but distributed database built as a federation (or data map/index) of disparate data sources (as shown in Figure 3). As stated in NASA’s Expanded Guidance for SE (NASA 2016):
“In all cases, the interrelationships (both within and between data sources) among the various data items are captured. Establishment of a “master map” or ontology (i.e., a common vocabulary for the types and attributes of the data items and their associated interrelationships) up front, for all these data items and their associated interrelationships, facilitates the establishment of this capability.”
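The federated end of that range can be sketched in a few lines of Python. This is a hypothetical illustration, not NASA's or any tool vendor's design: the names MasterMap and ItemRef, the two example sources, and their contents are all invented. The sketch assumes each data source can be wrapped in a function that fetches an item by its identifier, and that the map itself stores only references and relationships, never the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemRef:
    """Pointer to a data item held in one of the federated sources."""
    source: str   # which repository holds the item
    item_id: str  # the item's identifier inside that repository


class MasterMap:
    """Index of data items and their interrelationships across sources."""

    def __init__(self):
        self._sources = {}  # source name -> callable(item_id) returning the item
        self._links = []    # (from ItemRef, relationship name, to ItemRef)

    def register_source(self, name, fetch):
        self._sources[name] = fetch

    def link(self, src, relationship, dst):
        # Both ends of a relationship must live in a registered source.
        for ref in (src, dst):
            if ref.source not in self._sources:
                raise KeyError(f"unregistered source: {ref.source}")
        self._links.append((src, relationship, dst))

    def resolve(self, ref):
        """Fetch the item from whichever source holds it."""
        return self._sources[ref.source](ref.item_id)

    def related(self, ref, relationship):
        """Items reachable from `ref` via `relationship`, fetched on demand."""
        return [self.resolve(dst) for src, rel, dst in self._links
                if src == ref and rel == relationship]


# Two hypothetical, separately managed sources, stood in for by dictionaries.
requirements_db = {"R-1": "The pump shall deliver 5 L/min."}
test_reports = {"T-7": "Flow test report"}

master_map = MasterMap()
master_map.register_source("requirements", requirements_db.get)
master_map.register_source("tests", test_reports.get)
master_map.link(ItemRef("requirements", "R-1"), "verified_by",
                ItemRef("tests", "T-7"))

print(master_map.related(ItemRef("requirements", "R-1"), "verified_by"))
# ['Flow test report']
```

Because the data stays in its original repository, each group can keep its own tools while the relationships between work products are still captured in one place.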
In Part 4 of this blog series, Practicing SE from a Data-Centric Perspective, I go into more detail on what it means to practice SE from a data-centric perspective, providing guidance that can be used to understand, create, and manage the integrated dataset within an organization.
Comments are welcome.
If you have any other topics you would like addressed in our blog, feel free to let us know via our “Ask the Experts” page and we will do our best to provide a timely response.