MISP Opendata

Description

MISP opendata can be used to query open data portals (such as data.public.lu) in order to create, update or delete a dataset or a resource. Every resource created points to a restSearch query in MISP, giving access to the actual data shared on the given MISP server.
(The list of supported portals will be extended to cover more European open data portals.)

The opendata.py Python script provides the functionality needed to interact with open data portals. MISP calls it to make these features available from any MISP server, but it also works as a standalone tool. Details about its usage are provided below.

The Opendata format

The opendata format provides metadata information describing the datasets and resources stored within the portal.

Datasets are the containers used to give a general description of the data stored within the resources.
A dataset has some mandatory fields that must be defined by its creator:

Some extra mandatory fields are generated by the portal when the dataset is created or updated, such as the creation date, the date of last modification or update, the URL of the dataset, etc.
Alongside those required fields, users can add optional information to further specify the dataset, such as an acronym, the license used, a temporal or spatial coverage, or the resources.

A dataset has 2 identifiers:

Either of those identifiers can be used in a link to access a dataset.

Resources are the containers used within datasets to describe each data collection.
A resource also has mandatory fields:

As for datasets, some optional fields can also be defined for resources, such as the description of the resource, its release date, its size in bytes, its mime type, etc.

A resource is identified by a unique id, which is set when the resource is created and never changes.

A dataset can contain multiple resources, and a resource always belongs to a dataset. You can find more information about the format and the different fields in the References section.
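The containment relationship described above can be sketched with two small Python data classes. This is only an illustration: the class names are hypothetical, and the fields shown (title and description for a dataset; title, type and format for a resource) are the ones that appear in the query examples later in this document, not the portal's full schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    # Assumed subset of fields, taken from the query examples in this document.
    title: str
    type: str
    format: str


@dataclass
class Dataset:
    title: str
    description: str
    # A dataset can contain multiple resources.
    resources: List[Resource] = field(default_factory=list)

    def add_resource(self, resource: Resource) -> None:
        # A resource always belongs to a dataset, so it is attached here.
        self.resources.append(resource)


dataset = Dataset(
    title="x509 certificates shared in MISP",
    description="Dataset test from MISP containing data shared via a MISP platform.",
)
dataset.add_resource(
    Resource(title="All x509 certificates shared with MISP", type="api", format="json")
)
```

Identifiers and portal-generated fields (creation date, URL, etc.) are deliberately left out of this sketch, since the portal assigns them.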


Requirements

For an optimal usage, the following features are required:

For the creation or update of datasets or resources, some json documents are required:

When deleting a dataset or a resource, only the body.json document is still required, and you need to know the identifier of the dataset and/or resource(s) you want to delete.

The filters defined by default in the last-mentioned document, which is provided as an example, are a requirement on the MISP side (returnFormat) and a useful tag filter to avoid sharing data that is not public. Please read the references listed below for more information about filtering data with the built-in restSearch API in MISP.

The fields defined in the setup document are the minimum required for an API query to succeed (some of the required fields are defined in the Python script and are thus not mandatory in the JSON document). Please also refer to the open data API documentation mentioned below for more details about the fields and the requirements.


Usage

The script uses several parameters to define the values associated with the required dataset/resource fields, the API key to use, and the kind of MISP data to share. These 3 parameters point by default to the file names of the 3 JSON documents mentioned above. Users can supply different files as long as their content meets the requirements.

Another important parameter is the URL of the MISP instance that serves as the source of the actual data described in the open data resources. This URL also has a default value that the user can override when executing the script.
The default URL is misppriv.circl.lu, but you can choose any MISP server you have access to and want to share data from, using the --misp_url parameter.

You can also choose which portal to query. The default is data.public.lu, but you can query any of the supported portals by using the --portal_url parameter. (An overview of the current progress on support for some open data platforms is also available.)
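As a rough illustration of how these two parameters and their defaults behave, here is a minimal argparse sketch. It is not the actual parser of opendata.py: only the flag names --misp_url and --portal_url and their default values come from the text above, and misp.example.org is a placeholder host.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser covering only the two URL parameters described above.
    parser = argparse.ArgumentParser(description="Sketch of the URL parameters.")
    parser.add_argument(
        "--misp_url",
        default="misppriv.circl.lu",
        help="MISP instance holding the data described by the resources",
    )
    parser.add_argument(
        "--portal_url",
        default="data.public.lu",
        help="Open data portal to query",
    )
    return parser


# No arguments: both defaults apply.
defaults = build_parser().parse_args([])

# Overriding the MISP instance only (placeholder host name).
custom = build_parser().parse_args(["--misp_url", "misp.example.org"])
```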

The last parameter is the type of MISP data to use (attributes or events). It defines the level of MISP data used as the data resource. In other words, should the open data resource URL point to MISP events containing at least 1 attribute matching the restSearch filters defined in the body.json document, or simply to the individual attributes matching those filters?
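The effect of this choice can be sketched as picking between MISP's two restSearch endpoints, /attributes/restSearch and /events/restSearch. The helper below is hypothetical and only builds the endpoint URL; how opendata.py actually encodes the body.json filters into the resource URL is not covered here, and defaulting to https is an assumption.

```python
def restsearch_url(misp_url: str, level: str) -> str:
    """Return the restSearch endpoint for the given data level (sketch)."""
    if level not in ("attributes", "events"):
        raise ValueError("level must be 'attributes' or 'events'")
    base = misp_url.rstrip("/")
    # Assumption: default to https when no scheme is given.
    if "://" not in base:
        base = "https://" + base
    return f"{base}/{level}/restSearch"


attributes_endpoint = restsearch_url("misppriv.circl.lu", "attributes")
events_endpoint = restsearch_url("https://misppriv.circl.lu/", "events")
```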

Alternatively, there is an option to delete a dataset and/or its resource(s).

For the following examples, assume we want to publish on the open data portal some MISP data collections containing single attributes of x509 certificates tagged as tlp:white.

Create a dataset

In this case, the dataset with the title mentioned as example does not exist yet.

Update a dataset

Only if the dataset with the title mentioned already exists.

Create a resource

Only if the dataset with the title mentioned already exists, but not the resource with the title mentioned.

Update a resource

Only if the dataset and resource with the titles mentioned already exist.

Delete a dataset

Delete at least one resource

Alternatively, you can just look up the existing datasets and resources, without modifying anything, by using the search parameter.
In that case, the required fields are the same as the ones used to delete content.
Unlike deletion, a search uses only the titles of datasets and resources.

There are no other requirements for this query to succeed, since it only reads data and modifies nothing.


Usage in MISP

The functionality of creating, updating or deleting datasets and resources is now available in MISP via its restSearch client.

Creation and update

We can then use the same example as before and query the open data portal to create or update a dataset or one of its resources.

Example of creation or update of a resource within the given dataset:

{
    "returnFormat": "opendata",
    "type": "x509-fingerprint-md5",
    "tags": "tlp:white",
    "auth": "_YOUR_OPENDATA_PORTAL_API_KEY_",
    "setup": {
        "dataset": {
            "description": "Dataset test from MISP containing data shared via a MISP platform.",
            "title": "x509 certificates shared in MISP"
        },
        "resources": {
            "title": "All x509 certificates shared with MISP",
            "type": "api",
            "format": "json"
        }
    },
    "misp-url": "https://misppriv.circl.lu",
    "portal-url": "data.public.lu"
}
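When such a query is generated by a program rather than typed by hand, building it from a Python dict and serializing it with the json module avoids syntax mistakes. The helper below is a hypothetical sketch: its name and signature are not part of MISP, and only the key names are taken from the example above.

```python
import json
from typing import Any, Dict


def build_opendata_query(
    api_key: str,
    dataset: Dict[str, str],
    resource: Dict[str, str],
    filters: Dict[str, Any],
    misp_url: str,
    portal_url: str = "data.public.lu",
) -> Dict[str, Any]:
    # Start from the restSearch filters (e.g. "type", "tags").
    body: Dict[str, Any] = dict(filters)
    # Key names below mirror the example query in this document.
    body["returnFormat"] = "opendata"
    body["auth"] = api_key
    body["setup"] = {"dataset": dataset, "resources": resource}
    body["misp-url"] = misp_url
    body["portal-url"] = portal_url
    return body


query = build_opendata_query(
    api_key="_YOUR_OPENDATA_PORTAL_API_KEY_",
    dataset={
        "title": "x509 certificates shared in MISP",
        "description": "Dataset test from MISP containing data shared via a MISP platform.",
    },
    resource={
        "title": "All x509 certificates shared with MISP",
        "type": "api",
        "format": "json",
    },
    filters={"type": "x509-fingerprint-md5", "tags": "tlp:white"},
    misp_url="https://misppriv.circl.lu",
)
payload = json.dumps(query, indent=4)
```

The resulting payload string can then be pasted into the restSearch client or sent by whatever HTTP client you already use.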

Deletion

It is also possible to delete a dataset or its resource(s) using the restSearch client in MISP.

Example of deletion of resources:

{
    "returnFormat": "opendata",
    "auth": "_YOUR_OPENDATA_PORTAL_API_KEY_",
    "setup": {
        "dataset": "x509 certificates shared in MISP",
        "resources": [
            "x509 certificates (sha256) shared with MISP",
            "x509 certificates (sha1) shared with MISP",
            "x509 certificates (md5) shared with MISP"
        ]
    },
    "delete": 1,
    "portal-url": "data.public.lu"
}

Search

As for the previous features, the search functionality is also available in MISP:

{
    "returnFormat": "opendata",
    "setup": {
        "dataset": "x509 certificates shared in MISP",
        "resources": "All x509 certificates shared with MISP"
    },
    "search": 1,
    "portal-url": "data.public.lu"
}
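Judging from the three example queries in this section, the "delete" and "search" flags select the operation, and their absence means creation or update. The function below is a hypothetical sketch of that reading, not MISP's actual implementation; in particular, giving "delete" precedence over "search" when both are set is an assumption.

```python
from typing import Any, Dict


def requested_action(body: Dict[str, Any]) -> str:
    """Classify an opendata query body by the flags it carries (sketch)."""
    if body.get("returnFormat") != "opendata":
        raise ValueError("not an opendata query")
    # Assumption: delete takes precedence if both flags are present.
    if body.get("delete"):
        return "delete"
    if body.get("search"):
        return "search"
    return "create_or_update"


search_body = {
    "returnFormat": "opendata",
    "setup": {
        "dataset": "x509 certificates shared in MISP",
        "resources": "All x509 certificates shared with MISP",
    },
    "search": 1,
    "portal-url": "data.public.lu",
}
action = requested_action(search_body)
```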

References

This feature was co-funded as part of the European Union INEA/HADEA CEF VARIoT project 2018-EU-IA-0100.