How to manage and link datasets
Step 1. Navigate to datasets
Dataset cards describe the data used to train, test, validate or operate AI systems. Together, they form a data catalogue of all the datasets used in your AI systems. The Datasets view shows basic information about each dataset, including name, organisation owner, subject, modality and the number of systems using the dataset.
Step 2. Add dataset
Add a new dataset card in the systems overview including name, owner, dataset card owner, dataset subject, modality and description.

Step 3. Fill in dataset card data
Complete the additional information in the dataset card view, including the type of personal data, special categories, source, version, license, tags, and data collection and processing methods. You may also add specific documents describing the dataset, see which other systems use the same dataset, and add links to other tools containing information about the dataset.
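If you collect dataset metadata before entering it in the dataset card view, a small checklist structure can help you spot gaps. The following is a minimal sketch only: the class `DatasetCardDraft` and the helper `missing_fields` are hypothetical names made up for illustration, not part of any Saidot API, and the fields simply mirror the ones listed in Steps 2 and 3.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields


@dataclass
class DatasetCardDraft:
    """Hypothetical local checklist mirroring the dataset card fields."""

    # Fields entered when adding the dataset card (Step 2)
    name: str = ""
    owner: str = ""
    dataset_card_owner: str = ""
    subject: str = ""
    modality: str = ""
    description: str = ""
    # Additional fields completed in the dataset card view (Step 3)
    personal_data_type: str = ""
    special_categories: list[str] = field(default_factory=list)
    source: str = ""
    version: str = ""
    license: str = ""
    tags: list[str] = field(default_factory=list)
    collection_and_processing_methods: str = ""


def missing_fields(card: DatasetCardDraft) -> list[str]:
    """Return the names of fields that are still empty."""
    return [f.name for f in fields(card) if not getattr(card, f.name)]


# Placeholder values for illustration only
draft = DatasetCardDraft(name="example-dataset", owner="Example Org", modality="text")
print(missing_fields(draft))
```

The printed list shows which fields still need to be gathered before you fill in the card.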

Step 4. Link data-related risks
Link data-related risks to your Dataset card so that they are automatically inherited by the AI System Risks. You can use the Saidot Risk Library to analyse potential risks and the search field to find them. The Saidot Risk Library has an extensive selection of data-related risks, which can be found in several ways, for example by searching for the word “data”.

Step 5. Link dataset to your system
Link a documented dataset to an AI system in the System components section of the system overview. Describe the Dataset purpose when linking an existing Dataset. You can also describe the use of this Dataset in more detail if needed.
The Dataset will then be visible in the System components.

The risks linked to the dataset are automatically inherited by the system and appear in the Risk tab for further governance.

Link dataset to your Model Card
Dataset linking can also be established through the Model Catalogue. When a Dataset has been linked to the Model Card, it is automatically added to the System components when the Model is linked from the Model Catalogue.

When a Dataset is linked to the System components via the Model Catalogue, the risks are not inherited automatically. If you would like to inherit the data-related risks, you can unlink the Dataset and link it again from the Data Catalogue.
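The difference between the two linking routes can be summarised in a small conceptual model. This is a sketch under stated assumptions, not how Saidot implements it: the names `Dataset`, `System`, `LinkSource`, `link`, `unlink` and `inherited_risks` are hypothetical, the risk title is a placeholder, and the sketch simply recomputes inherited risks from the current links (what happens to already inherited risks on unlinking is not described here).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class LinkSource(Enum):
    """Hypothetical marker for how a dataset was linked to a system."""

    DATA_CATALOGUE = "data catalogue"
    MODEL_CATALOGUE = "model catalogue"


@dataclass
class Dataset:
    name: str
    risks: list[str] = field(default_factory=list)


@dataclass
class System:
    name: str
    # dataset name -> (dataset, route used to link it)
    components: dict[str, tuple[Dataset, LinkSource]] = field(default_factory=dict)

    def link(self, dataset: Dataset, source: LinkSource) -> None:
        self.components[dataset.name] = (dataset, source)

    def unlink(self, dataset_name: str) -> None:
        self.components.pop(dataset_name, None)

    def inherited_risks(self) -> list[str]:
        """Only datasets linked from the Data Catalogue contribute risks."""
        risks: list[str] = []
        for dataset, source in self.components.values():
            if source is LinkSource.DATA_CATALOGUE:
                for risk in dataset.risks:
                    if risk not in risks:
                        risks.append(risk)
        return risks


# Placeholder dataset and risk title for illustration only
images = Dataset("training-images", risks=["Example data-related risk"])
system = System("demo-system")

# Linked via the Model Catalogue: visible as a component, no risks inherited
system.link(images, LinkSource.MODEL_CATALOGUE)
print(system.inherited_risks())  # []

# Unlink and link again from the Data Catalogue: risks are inherited
system.unlink(images.name)
system.link(images, LinkSource.DATA_CATALOGUE)
print(system.inherited_risks())  # ['Example data-related risk']
```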
