Data Matching

Manage Data Matching Definitions: configurations for matching jobs that identify duplicates or link records across data sources. The endpoints below create, read, update, and delete these configurations.

Create Data Matching Definition

Create a new Data Matching Definition with the given configuration.

Configuration example:

  <?xml version="1.0" standalone="no" ?>
  <duke>
      <object class="no.priv.garshol.duke.comparators.QGramComparator" name="NameComparator">
          <param name="formula" value="DICE"/>
          <param name="q" value="3"/>
      </object>
      <schema>
          <threshold>0.7</threshold>
          <property type="ignore">
              <name>STORAGE_ID</name>
          </property>
          <property type="ignore">
              <name>DATA_SOURCE_ID</name>
          </property>
          <property type="id">
              <name>BUSINESS_PARTNER_ID</name>
          </property>
          <property lookup="required">
              <name>COUNTRY_SHORTNAME</name>
              <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
              <low>0.0</low>
              <high>0.5</high>
          </property>
          <property lookup="true">
              <name>NAME</name>
              <comparator>NameComparator</comparator>
              <low>0.1</low>
              <high>0.9</high>
          </property>
      </schema>
      <database class="no.priv.garshol.duke.databases.LuceneDatabase">
          <param name="max-search-hits" value="100"/>
          <param name="min-relevance" value="0.9"/>
          <param name="fuzzy-search" value="true"/>
          <param name="boost-mode" value="INDEX"/>
      </database>

      <data-source class="cdq.cdl.matching.datasource.MatchingDataSource">
          <column cleaner="no.priv.garshol.duke.cleaners.LowerCaseNormalizeCleaner"
                  name="businessPartner.address.country.shortName"
                  property="COUNTRY_SHORTNAME"/>
          <column cleaner="no.priv.garshol.duke.cleaners.LowerCaseNormalizeCleaner"
                  configProperty="COUNTRY_SHORTNAME"
                  name="businessPartner.names[0].value"
                  property="NAME"/>
      </data-source>
  </duke>
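
Because the configuration is sent as a string in `xmlDukeConfiguration`, it can help to check locally that it is well-formed before submitting it. The following is a minimal sketch using only the Python standard library; the function name `summarize_duke_config` is hypothetical, the embedded XML is an abbreviated copy of the example above, and the checks shown are illustrative rather than the validation the service itself performs.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example configuration (one comparator, two properties).
DUKE_XML = """<duke>
    <object class="no.priv.garshol.duke.comparators.QGramComparator" name="NameComparator">
        <param name="formula" value="DICE"/>
        <param name="q" value="3"/>
    </object>
    <schema>
        <threshold>0.7</threshold>
        <property type="id">
            <name>BUSINESS_PARTNER_ID</name>
        </property>
        <property lookup="true">
            <name>NAME</name>
            <comparator>NameComparator</comparator>
            <low>0.1</low>
            <high>0.9</high>
        </property>
    </schema>
</duke>"""


def summarize_duke_config(xml_text: str) -> dict:
    """Parse a Duke configuration and return its threshold and compared properties."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if the XML is not well-formed
    if root.tag != "duke":
        raise ValueError("root element must be <duke>")
    schema = root.find("schema")
    if schema is None:
        raise ValueError("configuration has no <schema> element")
    threshold_text = schema.findtext("threshold")
    if threshold_text is None:
        raise ValueError("schema has no <threshold> element")

    compared = {}
    for prop in schema.findall("property"):
        low, high = prop.findtext("low"), prop.findtext("high")
        # Only properties with both bounds take part in the comparison.
        if low is not None and high is not None:
            compared[prop.findtext("name")] = (float(low), float(high))
    return {"threshold": float(threshold_text), "compared": compared}


summary = summarize_duke_config(DUKE_XML)
```

Here `summary["compared"]` maps each compared property name to its `(low, high)` pair, which makes it easy to spot a property whose bounds were swapped or left out.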
Security: apiKey
Request
Request Body schema: application/json
required
goldenRecordConfiguration
object (GoldenRecordConfiguration)

Configuration for creating Golden Records in the context of Data Matching.

name
required
string <= 30 characters

A human-readable name for the Data Matching Definition instance.

Example: "Custom Matching Definition Name"
reportConfiguration
object (ReportConfiguration)

Configuration for generating reports related to the Data Matching Definition.

type
required
string (MatchingType)

Type of the matching configuration.

Enum values:
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: "DEDUPLICATION"
workspace
object (Workspace)

Collaboration space to share/configure configurations.

xmlDukeConfiguration
required
string

Holds the Duke XML configuration for data matching.

Example: "<duke>...</duke>"
Responses
200

OK

400

The sent request is malformed.

POST /datamatchingdefinitions
Request samples
application/json
{
  "name": "Custom Matching Definition Name",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2025-01-29T16:52:44Z",
  "lastModifiedAt": "2025-01-29T16:52:44Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
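
As an illustration of the create call, the sketch below builds (but does not send) a POST request with the three required body fields. The host `https://api.example.com` and the header name `X-API-KEY` are assumptions made for the example, since this section names the `apiKey` security scheme without stating either; `build_create_request` is a hypothetical helper, not part of any client library.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not given in this reference
API_KEY_HEADER = "X-API-KEY"          # assumed header name for the apiKey scheme
MATCHING_TYPES = ("DEDUPLICATION", "LINKAGE")


def build_create_request(api_key: str, name: str, matching_type: str,
                         xml_duke_configuration: str) -> urllib.request.Request:
    """Build a POST /datamatchingdefinitions request without sending it."""
    # Constraints taken from the request body schema above.
    if len(name) > 30:
        raise ValueError("name must be at most 30 characters")
    if matching_type not in MATCHING_TYPES:
        raise ValueError(f"type must be one of {MATCHING_TYPES}")

    body = {
        "name": name,
        "type": matching_type,
        "xmlDukeConfiguration": xml_duke_configuration,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/datamatchingdefinitions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


request = build_create_request(
    "my-api-key", "Supplier deduplication", "DEDUPLICATION", "<duke>...</duke>"
)
# Sending would be: urllib.request.urlopen(request)
```

Validating the name length and the type locally gives an immediate error instead of a round trip that ends in a 400 response.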

Delete Data Matching Definition

Delete a Data Matching Definition by its ID.

Security: apiKey
Request
path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

DELETE /datamatchingdefinitions/{id}
Response samples
application/json
{
  "status": "400",
  "path": "/v2/businesspartners/lookup",
  "timestamp": "2025-01-29T16:52:44Z",
  "error": "BAD_REQUEST",
  "message": "This user is not allowed to access this service."
}
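
A delete call carries no body, only the ID in the path. The sketch below builds the request and maps the three documented status codes to their descriptions. The host and the `X-API-KEY` header name are assumptions, and both helper functions are hypothetical names chosen for this example.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical host, not given in this reference
API_KEY_HEADER = "X-API-KEY"          # assumed header name for the apiKey scheme

# Status descriptions as listed for this endpoint.
DELETE_STATUS = {
    200: "OK",
    400: "The sent request is malformed.",
    403: "Permission denied.",
}


def build_delete_request(api_key: str, definition_id: str) -> urllib.request.Request:
    """Build a DELETE /datamatchingdefinitions/{id} request without sending it."""
    if not definition_id:
        raise ValueError("definition_id must not be empty")
    # Percent-encode the ID so it cannot alter the path structure.
    url = f"{BASE_URL}/datamatchingdefinitions/{quote(definition_id, safe='')}"
    return urllib.request.Request(url, headers={API_KEY_HEADER: api_key}, method="DELETE")


def describe_delete_status(status: int) -> str:
    """Return the documented description for a status code, if there is one."""
    return DELETE_STATUS.get(status, f"Undocumented status {status}")


delete_request = build_delete_request("my-api-key", "6461e6113b1865304b3038b6")
```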

List Data Matching Definitions

Read a page of existing Data Matching Definitions.

Security: apiKey
Request
query Parameters
matchingType
string (MatchingType)

Type of the matching configuration.

Enum values:
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: matchingType=DEDUPLICATION
page
integer <int32> >= 0
Default: 0

Number of the retrieved result page.

Example: page=1
pageSize
integer <int32> >= 1
Default: 20

Maximum number of items to be returned in the result page.

Example: pageSize=20
workspaceId
string (WorkspaceId)

ID of the workspace.

Example: workspaceId=c074b9f3-abf0-4f8e-9a20-74deb6cfa2a4
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

GET /datamatchingdefinitions
Response samples
application/json
{
  "page": "1",
  "pageSize": "100",
  "numberOfPages": "3",
  "total": "67",
  "values": [ ]
}
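
The query parameters above are all optional, so a client typically sends only the ones it needs. A minimal sketch of building the list URL follows; the host is an assumption and `build_list_url` is a hypothetical helper. The range checks mirror the documented constraints (`page >= 0`, `pageSize >= 1`).

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host, not given in this reference
MATCHING_TYPES = ("DEDUPLICATION", "LINKAGE")


def build_list_url(matching_type: Optional[str] = None, page: int = 0,
                   page_size: int = 20, workspace_id: Optional[str] = None) -> str:
    """Build the URL for GET /datamatchingdefinitions with the given filters."""
    if matching_type is not None and matching_type not in MATCHING_TYPES:
        raise ValueError(f"matchingType must be one of {MATCHING_TYPES}")
    if page < 0:
        raise ValueError("page must be >= 0")
    if page_size < 1:
        raise ValueError("pageSize must be >= 1")

    params = {
        "matchingType": matching_type,
        "page": page,
        "pageSize": page_size,
        "workspaceId": workspace_id,
    }
    # Drop unset filters so the server applies its own defaults.
    query = urlencode({key: value for key, value in params.items() if value is not None})
    return f"{BASE_URL}/datamatchingdefinitions?{query}"


first_page = build_list_url(matching_type="DEDUPLICATION")
```

To walk all results, a client could request successive `page` values until it has covered the `numberOfPages` reported in the response.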

Read Data Matching Definition

Read an existing Data Matching Definition by its ID.

Security: apiKey
Request
path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

GET /datamatchingdefinitions/{id}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2025-01-29T16:52:44Z",
  "lastModifiedAt": "2025-01-29T16:52:44Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
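
On the client side, a response like the sample above can be turned into a typed object so that timestamps become real `datetime` values. This is a sketch only: the `DataMatchingDefinition` dataclass and `parse_definition` function are hypothetical, they cover just the scalar fields of the sample, and the nested configuration objects are left out because their contents are not shown here.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataMatchingDefinition:
    id: str
    name: str
    type: str
    creator_username: str
    creator_organization: str
    created_at: datetime
    last_modified_at: datetime
    xml_duke_configuration: str


def _parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_definition(payload: str) -> DataMatchingDefinition:
    """Convert a JSON response body into a DataMatchingDefinition."""
    data = json.loads(payload)
    return DataMatchingDefinition(
        id=data["id"],
        name=data["name"],
        type=data["type"],
        creator_username=data["creatorUsername"],
        creator_organization=data["creatorOrganization"],
        created_at=_parse_timestamp(data["createdAt"]),
        last_modified_at=_parse_timestamp(data["lastModifiedAt"]),
        xml_duke_configuration=data["xmlDukeConfiguration"],
    )


SAMPLE_RESPONSE = """{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2025-01-29T16:52:44Z",
  "lastModifiedAt": "2025-01-29T16:52:44Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>"
}"""

definition = parse_definition(SAMPLE_RESPONSE)
```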

Update Data Matching Definition

Update an existing Data Matching Definition with the given configuration.

Security: apiKey
Request
path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Request Body schema: application/json
required
goldenRecordConfiguration
object (GoldenRecordConfiguration)

Configuration for creating Golden Records in the context of Data Matching.

name
required
string <= 30 characters

A human-readable name for the Data Matching Definition instance.

Example: "Custom Matching Definition Name"
reportConfiguration
object (ReportConfiguration)

Configuration for generating reports related to the Data Matching Definition.

type
required
string (MatchingType)

Type of the matching configuration.

Enum values:
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: "DEDUPLICATION"
workspace
object (Workspace)

Collaboration space to share/configure configurations.

xmlDukeConfiguration
required
string

Holds the Duke XML configuration for data matching.

Example: "<duke>...</duke>"
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

PUT /datamatchingdefinitions/{id}
Request samples
application/json
{
  "name": "Custom Matching Definition Name",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2025-01-29T16:52:44Z",
  "lastModifiedAt": "2025-01-29T16:52:44Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
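
The update body uses the same fields as the create body, whereas a read response also carries fields such as `id` and `createdAt` that do not appear in the request schema. One way to update a definition is therefore to read it, keep only the request-schema fields, and apply the changes. The sketch below shows that step; `to_update_body` is a hypothetical helper, and whether the service would tolerate the extra response fields in a PUT body is not stated in this reference, so they are simply left out.

```python
# Fields that appear in the update request schema above.
WRITABLE_FIELDS = (
    "name",
    "type",
    "xmlDukeConfiguration",
    "reportConfiguration",
    "goldenRecordConfiguration",
    "workspace",
)
REQUIRED_FIELDS = ("name", "type", "xmlDukeConfiguration")


def to_update_body(definition: dict, **changes) -> dict:
    """Derive a PUT body from a previously read definition plus field changes."""
    unknown = set(changes) - set(WRITABLE_FIELDS)
    if unknown:
        raise ValueError(f"not part of the request schema: {sorted(unknown)}")

    # Keep only request-schema fields, then overlay the requested changes.
    body = {key: definition[key] for key in WRITABLE_FIELDS if key in definition}
    body.update(changes)

    missing = [key for key in REQUIRED_FIELDS if key not in body]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if len(body["name"]) > 30:
        raise ValueError("name must be at most 30 characters")
    return body


current = {
    "id": "6461e6113b1865304b3038b6",
    "name": "Supplier deduplication",
    "creatorUsername": "johndoe",
    "createdAt": "2025-01-29T16:52:44Z",
    "type": "DEDUPLICATION",
    "xmlDukeConfiguration": "<duke>...</duke>",
}
update_body = to_update_body(current, name="Supplier dedup v2")
```

The resulting `update_body` would be sent as the JSON body of `PUT /datamatchingdefinitions/{id}`, with the `id` taken from the original response.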