CDI: How to contribute?
The CDI system connects the SeaDataNet portal with the databases of the SeaDataNet distributed data centres. It enables registered users to search for data sets, submit requests for data sets and, if access is permitted, download the data sets from the distributed data centres via a single interface at the SeaDataNet portal. In practice, users download unrestricted data sets directly from the SeaDataNet data cloud, which is maintained by the SeaDataNet data centres. Restricted data sets, if access is granted, are downloaded directly from the relevant data centre. In both cases (unrestricted and restricted data), all communication is facilitated through the SeaDataNet portal.
Governance
The Common Data Index (CDI) system is maintained by the distributed data centres that are connected to the SeaDataNet infrastructure. They maintain their CDI metadata and associated data files as part of the SeaDataNet exchange and distribution. To populate and update data entries, and to make these entries available to users for discovery and downloading, the data centres operate local and central components of the CDI system. The core partners in the SeaDataNet network are National Oceanographic Data Centres and Marine Data Centres. They can give guidance and assistance to other marine data centres in their country for installing and configuring the CDI connection and for generating and submitting CDI and data entries. The pan-European CDI system is managed and operated by the SeaDataNet partner MARIS, which also runs a CDI support desk together with IFREMER. Questions can also be submitted to the SeaDataNet helpdesk.
Formats
For the CDI, a content model has been defined, based upon the ISO 19115 content model. Considerable effort has gone into establishing the XML coding of CDI on the basis of ISO 19139 and making it fully INSPIRE compliant. Detailed information about the formats and XML schemas can be found here.
Maintenance modality
Download the user manual for compiling the CDI metadata, coupling table and associated data
The CDI directory is maintained by an XML export from the connected data centres to the pan-European directory:
- Using the MIKADO software, provided by SeaDataNet, to generate new and updated CDI XML files from locally maintained databases
XML files generated using the latest MIKADO software will be valid and should validate against the associated schema. Partners who do not use MIKADO but generate CDI XML entries by other means should validate their files before preparing and submitting regular contributions. The CDI schema includes Schematron rules, which make it possible to validate both the syntax and the semantics of CDI XML files using an XML editor (e.g. Oxygen, XMLSpy) and the related schema. The schema can be found at the SeaDataNet portal in the Standards and Formats section. If you are online, the editor should find the schema automatically at the SeaDataNet namespace.
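Before running the full schema and Schematron validation in an XML editor, a quick well-formedness check can catch basic errors in XML files generated without MIKADO. The following is a minimal sketch using only Python's standard library. It does not validate against the CDI schema or its Schematron rules, and the function name and sample content are hypothetical stand-ins, not real CDI records.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, root_tag) if xml_text parses, else (False, error message).

    This only checks XML well-formedness. It does NOT validate against the
    CDI schema or its Schematron rules; an XML editor with the schema is
    still needed for that.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, root.tag


# Hypothetical, heavily simplified stand-ins for CDI XML content.
GOOD = "<record><title>Example CTD cast</title></record>"
BAD = "<record><title>Example CTD cast</record>"  # <title> is never closed

for sample in (GOOD, BAD):
    ok, info = check_well_formed(sample)
    print("well-formed" if ok else "not well-formed", "-", info)
```

A file that fails this check will also fail schema validation, so it is a cheap first filter; passing it says nothing about compliance with the CDI content model.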
Including CSR references in CDI entries
Data centres are encouraged to include Cruise Summary Report (CSR) references in existing and new CDIs in order to establish a direct relation between the CSR and CDI resources of SeaDataNet. The document below gives guidance on how to do this efficiently in practice:
System components and configuration
The CDI system consists of a number of components, centrally at the portal and locally at each connected data centre. The central components are:
- CDI Data Discovery and Access service: for searching and browsing metadata of data sets and requesting access to data sets via a shopping basket; operated at the portal. Users can log in to their MySeaDataNet space to submit data requests, store previous queries, oversee the processing of shopping requests, and access data sets when they are ready for download;
- Central User Register: contains details of users, their organisations and addresses, IDs and passwords, and their SeaDataNet roles; operated at the portal;
- Shopping Basket: part of the user interface for preparing a user request of multiple data sets from multiple data centres in one go, handling login validation of users, and routing requests to the Request Status Manager; operated at the portal;
- Request Status Manager (RSM): for processing and administration of all requests and data deliveries (downloads); for users to handle the communication with data centres and to interact with them in case of restricted data requests; for data centres to oversee all transactions; operated at the portal;
- CDI Import Manager: for controlling the import of new or updated CDI metadata entries to the central CDI catalogue and associated data sets to the central data cloud (the latter in case of unrestricted data sets, while restricted data sets are managed only locally at the data centres);
- SeaDataNet central data cloud: provides a central data cache for all unrestricted data sets, which has major benefits for quality control and for providing access services to users. The data cloud is hosted at EUDAT.
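The access rules for unrestricted and restricted data sets can be summarised in a few lines of code. The sketch below is illustrative only and is not part of any SeaDataNet software: the class, field and function names are hypothetical, and it models just the rule that unrestricted data sets are downloaded from the central data cloud, while restricted data sets, if granted, are downloaded from the relevant data centre.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSetRequest:
    """Hypothetical model of one requested data set."""
    data_centre: str        # data centre that holds the data set
    restricted: bool        # True if access restrictions apply
    granted: bool = False   # decision of the data centre (restricted only)


def download_source(req: DataSetRequest) -> Optional[str]:
    """Return where the user downloads from, or None if nothing is available."""
    if not req.restricted:
        # Unrestricted data sets are served from the central data cloud.
        return "central data cloud"
    if req.granted:
        # Restricted data sets, once granted, come from the data centre itself.
        return f"data centre: {req.data_centre}"
    # Restricted and not granted: no download.
    return None


print(download_source(DataSetRequest("Centre A", restricted=False)))
print(download_source(DataSetRequest("Centre B", restricted=True, granted=True)))
print(download_source(DataSetRequest("Centre C", restricted=True)))
```

In both branches the request itself still passes through the portal, where the Request Status Manager records it; only the download location differs.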
Data centres can connect to the CDI system in two ways, both of which require local system arrangements:
- Connecting as a full data centre: this means that a data centre installs and configures a local Java component, Replication Manager, which communicates with the CDI Import Manager and Central data cloud for self-maintaining their metadata and data contributions to the CDI catalogue, and with the Request Status Manager to process data delivery in case requests for restricted data sets are accepted by the data centre.
- Connecting via the interim solution: the data centre does not install the Replication Manager. Instead, data centre staff process all data set requests registered in the Request Status Manager. In this case, the CDI metadata and data entries are loaded into the CDI catalogue and central data cloud in cooperation with the CDI support desk at MARIS. The SeaDataNet cloud arranges delivery of unrestricted data to users; restricted data sets are delivered on a bilateral basis between data centre and user. This solution is intended to be temporary, because the aim is for all data centres to become fully connected data centres, unless this is not possible due to legislation or security policies.
The data sets might be managed locally as files in a file management system, possibly supported by a local metadatabase, or in a relational database management system. To participate in the CDI service, data sets must be delivered to users via the CDI system in standard SeaDataNet Data Transport Formats. This implies that a data centre with a file management system configuration must ensure that the data files are also available in the SeaDataNet formats. This might require pre-processing via a conversion routine. SeaDataNet provides useful software tools for that purpose:
- NEMO software for converting from any kind of ASCII format to the SeaDataNet ODV4 ASCII format and new SeaDataNet NetCDF (CF) format
- OCTOPUS software for splitting and conversion of SeaDataNet files
With a relational database management system, it might not be necessary to prepare pre-processed files in the SeaDataNet formats, because the Replication Manager software includes functionality to generate the requested data files in the SeaDataNet data formats, while MIKADO is used to generate the CDI metadata. However, a data centre that combines the interim solution with a relational DBMS has to set up its own solution, making use of NEMO, OCTOPUS and its local system.
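The combinations of connection modality and local storage described above can be laid out as a small decision function. This is only a sketch of the text's reasoning: the labels "full", "interim", "files" and "rdbms" and the function name are hypothetical, and the returned strings merely summarise the options rather than describe an actual configuration.

```python
def data_file_preparation(connection: str, storage: str) -> str:
    """Summarise how data files in SeaDataNet formats are obtained.

    connection: "full" (Replication Manager installed) or "interim"
    storage:    "files" (file management system) or "rdbms"
    """
    if connection not in ("full", "interim"):
        raise ValueError(f"unknown connection modality: {connection!r}")
    if storage not in ("files", "rdbms"):
        raise ValueError(f"unknown storage type: {storage!r}")

    if storage == "rdbms" and connection == "full":
        # The Replication Manager can generate the requested data files.
        return "Replication Manager generates the data files"
    if storage == "rdbms":
        # Interim solution: the data centre builds its own export route.
        return "own solution using NEMO, OCTOPUS and the local system"
    # File-based storage: files must already exist in SeaDataNet formats.
    return "pre-processed files in SeaDataNet formats (NEMO/OCTOPUS if needed)"


for connection in ("full", "interim"):
    for storage in ("files", "rdbms"):
        print(connection, storage, "->", data_file_preparation(connection, storage))
```

In all four cases the CDI metadata itself is a separate concern, generated with MIKADO or an equivalent validated XML export.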
In both situations, full and interim, a number of configuration parameters have to be agreed and set in the system of the data centre and at the SeaDataNet portal. New data centres are therefore advised to contact the CDI system coordinator MARIS for detailed instructions and documentation. MARIS will then provide guidance and validate the initial process of getting connected.