Introduction
Trust, interoperability and data sovereignty are the objectives and values for secure and sustainable peer-to-peer data exchange between organizations and companies. The central claim is data sovereignty: whoever makes data available retains control and decides individually who is involved in the data exchange, how, when, where and under what conditions.
A corresponding concept was developed in the context of Gaia-X and the International Data Spaces Association (IDSA). The essential software component is the connector.
Extended functionality through the Eclipse Dataspace Connector (EDC)
Catena-X has expanded the concept in terms of data throughput and control of sovereignty. With the Eclipse Dataspace Connector (EDC), a new central communication component for Catena-X was created, which implements the following architectural principles:
- Simple, maintaining a small and efficient core with as few external dependencies as possible
- Interoperable, independent of platforms and ecosystems
- Decentralized: software components with the necessary capabilities for participating in a data space are located on the partners' side, and data is exchanged only between the agreed points.
- Data protection is more important than data sharing: data to be transmitted is fundamentally linked to policies via contracts, and a transfer without a contract is not possible.
- Separation of metadata and data enables high throughput rates for the actual data transfer.
- Consistent semantics for the data is the basis for the consistency of digital value creation.
- As far as possible, all processes are automated, from determining the identity, through enforcing the contractually agreed rules, to data transmission.
- Existing standards and protocols (Gaia-X and IDSA) are used as far as possible.
The EDC as a connector implements a framework for sovereign, cross-organizational data exchange. It implements the International Data Spaces Standard (IDS) and relevant principles in connection with Gaia-X. The connector is designed to be extensible so that it can support alternative protocols and be integrated into different ecosystems.
The objective is to set up a decentralized software component on the side of each partner, which bundles the capabilities required to participate in a data space and enables peer-to-peer connections between participants. The focus here is particularly on the data sovereignty of the independent companies. The functionality required for this is bundled in the open-source project "Eclipse Dataspace Connector", to which the Catena-X partners contribute as part of the Eclipse Foundation.
The main difference between the EDC and the previous connectors of the IDSA is the separation of the communication into a channel for the metadata and one for the actual data exchange. The channel for the data supports various transmission protocols via so-called data plane extensions. The metadata is transmitted directly via the EDC interface, while the actual data exchange then takes place via the appropriate channel extension. In this way, a highly scalable data exchange is made possible.
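The separation of the two channels can be illustrated with a minimal sketch. It is not taken from the EDC code base: the names `TransferRequest`, `DataPlane`, `https_extension` and `s3_extension` are hypothetical, and the "transfer" only returns a string. The sketch assumes that the metadata names a protocol and that the data plane dispatches the actual transfer to whichever extension is registered for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TransferRequest:
    """Metadata exchanged on the metadata channel (hypothetical shape)."""
    asset_id: str
    protocol: str       # selects the data plane extension, e.g. "https" or "s3"
    destination: str


class DataPlane:
    """Dispatches the actual data transfer to a registered extension."""

    def __init__(self) -> None:
        self._extensions: Dict[str, Callable[[TransferRequest], str]] = {}

    def register(self, protocol: str,
                 handler: Callable[[TransferRequest], str]) -> None:
        # A new protocol is added without touching the dispatch logic.
        self._extensions[protocol] = handler

    def transfer(self, request: TransferRequest) -> str:
        handler = self._extensions.get(request.protocol)
        if handler is None:
            raise ValueError(f"no data plane extension for {request.protocol!r}")
        return handler(request)


def https_extension(request: TransferRequest) -> str:
    # Placeholder for a real HTTPS transfer.
    return f"https: sent {request.asset_id} to {request.destination}"


def s3_extension(request: TransferRequest) -> str:
    # Placeholder for a real S3 copy.
    return f"s3: copied {request.asset_id} to {request.destination}"


data_plane = DataPlane()
data_plane.register("https", https_extension)
data_plane.register("s3", s3_extension)
```

In this sketch, adding a further protocol only requires one more `register` call, which mirrors the idea that the data channel can be extended independently of the metadata channel.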
The architecture of the EDC combines various services that are necessary for the above principles:
- An interface to the Identity Provider service, currently IDSA's Dynamic Attribute Provisioning System (DAPS). This central service provides the identity and the corresponding authentication of the participants in the data exchange. (There is no authorization at this point.) Decentralized solutions will also be supported in the future.
- The provision of possible offers (contract offering), which specify the data offered and the associated terms of use (policies) in corresponding contracts.
- An interface for manual selection of data and associated contract offers.
- The actual data transfer via the data plane extension
- Interfaces for using other services such as a broker service or a registration service
- The connection of software systems on the customer and provider side
The following figure gives a rough overview of services already available for testing and further planning.
Regarding the software implementation, the EDC module consists of two parts, the "Control Plane" and the "Data Plane". Different communication protocols such as HTTPS, S3 file transfer and REST are integrated into the Data Plane via the "Data Plane Extension". The Control Plane consists of components for handling contracts and the "EDC Standard API", which is the interface for any interaction with the connector, i.e. with backend data services and data transfer. References to all aspects of contracts are stored in four registers (indices):
- The "Contract Definition Index" lists all available contract drafts as templates, through which the terms of use (policies) are expressed.
- The "Asset Index" lists the available data and information, the actual assets for the desired data exchange.
- The "Contract Offer Index" contains pointers to specific contract offers that are derived from the templates and refer to the corresponding assets and terms of use for the exchange.
- Once contracts are finally agreed, the references to these contracts are stored in the "Contract Index".
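How the four indices relate to each other can be sketched as a small data structure. This is an illustrative assumption, not the EDC data model: the class and field names (`ContractDefinition`, `ContractOffer`, `Indices`, `derive_offers`, `agree`) are hypothetical, and a policy is reduced to a plain string.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ContractDefinition:
    """Template that ties a policy to a set of assets (hypothetical)."""
    definition_id: str
    policy: str                  # terms of use, simplified to a string
    asset_ids: Tuple[str, ...]


@dataclass(frozen=True)
class ContractOffer:
    """Concrete offer derived from a template for one asset."""
    offer_id: str
    definition_id: str
    asset_id: str
    policy: str


@dataclass
class Indices:
    contract_definitions: Dict[str, ContractDefinition] = field(default_factory=dict)
    assets: Dict[str, str] = field(default_factory=dict)   # asset id -> description
    contract_offers: Dict[str, ContractOffer] = field(default_factory=dict)
    contracts: Dict[str, ContractOffer] = field(default_factory=dict)

    def derive_offers(self) -> List[ContractOffer]:
        """Derive one offer per template and known asset."""
        derived = []
        for definition in self.contract_definitions.values():
            for asset_id in definition.asset_ids:
                if asset_id not in self.assets:
                    continue  # a template may only point to listed assets
                offer = ContractOffer(
                    offer_id=f"{definition.definition_id}:{asset_id}",
                    definition_id=definition.definition_id,
                    asset_id=asset_id,
                    policy=definition.policy,
                )
                self.contract_offers[offer.offer_id] = offer
                derived.append(offer)
        return derived

    def agree(self, offer_id: str) -> ContractOffer:
        """Record an agreed contract in the contract index."""
        offer = self.contract_offers[offer_id]
        self.contracts[offer_id] = offer
        return offer
```

The point of the sketch is the direction of the references: offers point back to a template and an asset, and only agreed offers end up in the contract index.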
Functionality
How does data exchange between two partners via EDC work?
Provided that the identities have been queried and verified, that is, the participants have been identified and authenticated, the customer asks the potential provider for the available contract offers, which combine the data (assets) with their terms of use (policies), and ultimately selects one or more of them. The provider confirms the contract and sends it to the customer, so that an agreement is in place. Both the customer and the provider store the valid contracts in their contract index, which authorizes the data transfer. Based on the valid contracts, tokens are then generated that are used to enforce the terms and conditions of the contracts during data transfer.
The EDC is not only used for communication between two participants but can be used for any data exchange within the data space. The federated services are also connected via the EDC.
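The exchange between two partners described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the EDC protocol: the `Provider` class and its methods are hypothetical, identity verification is assumed to have happened already, and the token is modelled as an HMAC over the agreement identifier, which is only one possible way to bind a transfer to a contract.

```python
import hashlib
import hmac
from typing import Dict, Tuple


class Provider:
    """Hypothetical provider side of a contract negotiation."""

    def __init__(self, secret: bytes, offers: Dict[str, Tuple[str, str]]) -> None:
        self._secret = secret
        self._offers = offers                      # offer id -> (asset id, policy)
        self.contract_index: Dict[str, dict] = {}  # agreed contracts

    def list_offers(self) -> Dict[str, Tuple[str, str]]:
        return dict(self._offers)

    def confirm(self, offer_id: str) -> dict:
        """Confirm a selected offer and store the resulting agreement."""
        asset_id, policy = self._offers[offer_id]
        agreement = {
            "agreement_id": f"agreement-{offer_id}",
            "asset_id": asset_id,
            "policy": policy,
        }
        self.contract_index[agreement["agreement_id"]] = agreement
        return agreement

    def issue_token(self, agreement_id: str) -> str:
        # Without a stored contract there is no token and thus no transfer.
        if agreement_id not in self.contract_index:
            raise PermissionError("no valid contract for this transfer")
        return hmac.new(self._secret, agreement_id.encode(),
                        hashlib.sha256).hexdigest()

    def transfer(self, agreement_id: str, token: str) -> str:
        expected = self.issue_token(agreement_id)
        if not hmac.compare_digest(expected, token):
            raise PermissionError("invalid token")
        return f"data for {self.contract_index[agreement_id]['asset_id']}"


# Customer side: query offers, select one, store the agreement, fetch the data.
provider = Provider(b"demo-secret", {"offer-1": ("asset-1", "use within the data space only")})
customer_contract_index: Dict[str, dict] = {}

offers = provider.list_offers()
agreement = provider.confirm("offer-1")
customer_contract_index[agreement["agreement_id"]] = agreement
token = provider.issue_token(agreement["agreement_id"])
payload = provider.transfer(agreement["agreement_id"], token)
```

The sketch makes one property from the text explicit: a transfer is rejected unless an agreement exists in the contract index and the presented token matches it.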
State of Development
The EDC is under continuous development within the framework of the Catena-X Consortium and the Eclipse Foundation. The latest versions can be downloaded from GitHub (eclipse-dataspaceconnector/DataSpaceConnector). The current quality is sufficient for demonstration purposes.
Future planned developments:
1. Support for decentralized identification solutions is planned. A promising concept is based on Self-Sovereign Identity (SSI), which is being developed in the European Self-Sovereign Identity Framework (ESSIF) and in Gaia-X. In addition, the concept of a business partner data management system (BPDM) is being developed within Catena-X. It brings together a company's distributed information from a wide variety of data sources via the identity of a BPN (Business Partner Number), and uses it for the self-description and then for synchronizing the original data sources.
2. More flexibility in general data transfer: further data plane extensions are planned, especially for streaming protocols such as Kafka and MQTT. In addition, closer integration of the Industry 4.0 Asset Administration Shell (AAS) is on the roadmap. Work has already started on an AAS API wrapper, which transparently maps the communication via the EDC so that an Industry 4.0 application sees it as data access via the AAS. The integration of the AAS as an extension for the backend data services is still open.
3. Overall, the connection via further channels in the data plane extension for other backend data services (ERP, MES, PPS, ...) is a key activity of the consortium.
Stefan Ettl
Product Owner
Johannes Diemer
Transfer & Communication