Data Matching

Endpoints for creating, reading, updating, and deleting Data Matching Definitions.

Create Data Matching Definition

Create a new Data Matching Definition with the given configuration.

Configuration example:

  <?xml version="1.0" standalone="no" ?>
  <duke>
      <object class="no.priv.garshol.duke.comparators.QGramComparator" name="NameComparator">
          <param name="formula" value="DICE"/>
          <param name="q" value="3"/>
      </object>
      <schema>
          <threshold>0.7</threshold>
          <property type="ignore">
              <name>STORAGE_ID</name>
          </property>
          <property type="ignore">
              <name>DATA_SOURCE_ID</name>
          </property>
          <property type="id">
              <name>BUSINESS_PARTNER_ID</name>
          </property>
          <property lookup="required">
              <name>COUNTRY_SHORTNAME</name>
              <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
              <low>0.0</low>
              <high>0.5</high>
          </property>
          <property lookup="true">
              <name>NAME</name>
              <comparator>NameComparator</comparator>
              <low>0.1</low>
              <high>0.9</high>
          </property>
      </schema>
      <database class="no.priv.garshol.duke.databases.LuceneDatabase">
          <param name="max-search-hits" value="100"/>
          <param name="min-relevance" value="0.9"/>
          <param name="fuzzy-search" value="true"/>
          <param name="boost-mode" value="INDEX"/>
      </database>

      <data-source class="cdq.cdl.matching.datasource.MatchingDataSource">
          <column name="businessPartner.address.country.shortName"
                  property="COUNTRY_SHORTNAME"
                  cleaner="no.priv.garshol.duke.cleaners.LowerCaseNormalizeCleaner"/>
          <column name="businessPartner.names[0].value"
                  property="NAME"
                  configProperty="COUNTRY_SHORTNAME"
                  cleaner="no.priv.garshol.duke.cleaners.LowerCaseNormalizeCleaner"/>
      </data-source>
  </duke>
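
Before submitting a configuration like the one above, it can help to inspect its schema locally. The following is a minimal sketch using Python's standard xml.etree.ElementTree module on a shortened copy of the example. The helper name summarize_duke_schema is hypothetical, and the low/high range check is an assumption made for illustration, not the validation the service itself performs.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the configuration example above (illustration only).
DUKE_XML = """<duke>
  <schema>
    <threshold>0.7</threshold>
    <property type="id"><name>BUSINESS_PARTNER_ID</name></property>
    <property lookup="required">
      <name>COUNTRY_SHORTNAME</name>
      <comparator>no.priv.garshol.duke.comparators.ExactComparator</comparator>
      <low>0.0</low>
      <high>0.5</high>
    </property>
    <property lookup="true">
      <name>NAME</name>
      <comparator>NameComparator</comparator>
      <low>0.1</low>
      <high>0.9</high>
    </property>
  </schema>
</duke>"""


def summarize_duke_schema(xml_text):
    """Return the threshold and the compared properties of a Duke schema."""
    root = ET.fromstring(xml_text)
    schema = root.find("schema")
    if schema is None:
        raise ValueError("missing <schema> element")
    threshold = float(schema.findtext("threshold"))
    compared = {}
    for prop in schema.findall("property"):
        comparator = prop.findtext("comparator")
        if comparator is None:
            continue  # "id" and "ignore" properties carry no comparator
        low = float(prop.findtext("low"))
        high = float(prop.findtext("high"))
        # Assumed sanity check: both values are probabilities and low <= high.
        if not 0.0 <= low <= high <= 1.0:
            raise ValueError(f"bad low/high for {prop.findtext('name')}")
        compared[prop.findtext("name")] = (comparator, low, high)
    return threshold, compared


threshold, compared = summarize_duke_schema(DUKE_XML)
print(threshold, sorted(compared))
```
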
Security: apiKey
Request
Request Body schema: application/json
required
goldenRecordConfiguration
object (GoldenRecordConfiguration)

Configuration for creating Golden Records in the context of Data Matching.

name
required
string <= 30 characters

A human-readable name for the Data Matching Definition instance.

Example: "Custom Matching Definition Name"
reportConfiguration
object (ReportConfiguration)

Configuration for generating reports related to the Data Matching Definition.

type
required
string (MatchingType)

Type of the matching configuration.

Enum: Description
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: "DEDUPLICATION"
workspace
object (Workspace)

Collaboration space to share/configure configurations.

xmlDukeConfiguration
required
string

Holds the Duke XML configuration for data matching.

Example: "<duke>...</duke>"
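
The constraints listed above (a name of at most 30 characters, a type taken from the MatchingType enum, and the Duke XML passed as a string) can be checked on the client before a request is sent. This is a minimal sketch; the function name build_definition_payload is hypothetical, and the nested configuration objects are left out because their structure is not described here.

```python
import json

MATCHING_TYPES = {"DEDUPLICATION", "LINKAGE"}
MAX_NAME_LENGTH = 30


def build_definition_payload(name, matching_type, xml_duke_configuration):
    """Validate the documented constraints and return the request body."""
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("name must contain 1 to 30 characters")
    if matching_type not in MATCHING_TYPES:
        raise ValueError(f"type must be one of {sorted(MATCHING_TYPES)}")
    if not xml_duke_configuration.strip():
        raise ValueError("xmlDukeConfiguration must not be empty")
    return {
        "name": name,
        "type": matching_type,
        "xmlDukeConfiguration": xml_duke_configuration,
    }


payload = build_definition_payload(
    "Customer dedup", "DEDUPLICATION", '<duke><schema name="x"/></duke>'
)
# json.dumps escapes the quotes inside the XML so it travels as one string.
body = json.dumps(payload)
print(body)
```

Serializing with a JSON library, rather than concatenating strings, keeps the quotes and line breaks of the XML configuration correctly escaped.
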
Responses
200

OK

400

The sent request is malformed.

POST /datamatchingdefinitions
Request samples
application/json
{
  "name": "Custom Matching Definition Name",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2024-10-22T10:00:22Z",
  "lastModifiedAt": "2024-10-22T10:00:22Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
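
A create call could be assembled with Python's standard library roughly as follows. This is a sketch under stated assumptions: the host in BASE_URL is a placeholder, and the header name X-API-KEY is a guess at how the apiKey security scheme is transmitted; neither is confirmed by this reference. The request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host
API_KEY_HEADER = "X-API-KEY"          # assumed header name for the apiKey scheme


def build_create_request(api_key, payload):
    """Build (but do not send) the POST request for a new definition."""
    return urllib.request.Request(
        url=f"{BASE_URL}/datamatchingdefinitions",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            API_KEY_HEADER: api_key,
        },
    )


request = build_create_request(
    "my-api-key",
    {
        "name": "Customer dedup",
        "type": "DEDUPLICATION",
        "xmlDukeConfiguration": "<duke>...</duke>",
    },
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request) returns the response on success
# and raises urllib.error.HTTPError for 4xx/5xx status codes.
```
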

Delete Data Matching Definition

Delete a Data Matching Definition by its ID.

Security: apiKey
Request
Path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

DELETE /datamatchingdefinitions/{id}
Response samples
application/json
{
  "status": "400",
  "path": "/v2/businesspartners/lookup",
  "timestamp": "2024-10-22T10:00:22Z",
  "error": "BAD_REQUEST",
  "message": "This user is not allowed to access this service."
}
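
The sketch below shows one way a client might build the delete call and turn an error body of the shape shown above into a readable message. BASE_URL and the X-API-KEY header name are assumptions, and both helper names are hypothetical. The ID is percent-encoded so that unexpected characters cannot alter the request path.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host
API_KEY_HEADER = "X-API-KEY"          # assumed header name for the apiKey scheme


def build_delete_request(api_key, definition_id):
    """Build (but do not send) the DELETE request for one definition."""
    safe_id = urllib.parse.quote(definition_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/datamatchingdefinitions/{safe_id}",
        method="DELETE",
        headers={API_KEY_HEADER: api_key},
    )


def describe_error(body_text):
    """Turn an error body into a one-line message."""
    error = json.loads(body_text)
    return f'{error["status"]} {error["error"]}: {error["message"]}'


request = build_delete_request("my-api-key", "6461e6113b1865304b3038b6")
print(request.get_method(), request.full_url)

sample_error = """{
  "status": "400",
  "error": "BAD_REQUEST",
  "message": "This user is not allowed to access this service."
}"""
print(describe_error(sample_error))
```
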

Read Data Matching Definition

Read an existing Data Matching Definition by its ID.

Security: apiKey
Request
Path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

GET /datamatchingdefinitions/{id}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2024-10-22T10:00:22Z",
  "lastModifiedAt": "2024-10-22T10:00:22Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
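
A response of this shape can be decoded into a typed object, with the timestamps converted to timezone-aware values. The following is a sketch only: the DataMatchingDefinition class and both function names are hypothetical, and the timestamp format is inferred from the sample above rather than from a documented contract.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DataMatchingDefinition:
    id: str
    name: str
    type: str
    created_at: datetime
    last_modified_at: datetime
    xml_duke_configuration: str


def parse_timestamp(value):
    """Parse a UTC timestamp such as 2024-10-22T10:00:22Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def parse_definition(body_text):
    """Decode the JSON body of a read response."""
    raw = json.loads(body_text)
    return DataMatchingDefinition(
        id=raw["id"],
        name=raw["name"],
        type=raw["type"],
        created_at=parse_timestamp(raw["createdAt"]),
        last_modified_at=parse_timestamp(raw["lastModifiedAt"]),
        xml_duke_configuration=raw["xmlDukeConfiguration"],
    )


sample_body = """{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "createdAt": "2024-10-22T10:00:22Z",
  "lastModifiedAt": "2024-10-22T10:00:22Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>"
}"""
definition = parse_definition(sample_body)
print(definition.id, definition.created_at.isoformat())
```
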

Read Data Matching Definitions

Read a page of existing Data Matching Definitions.

Security: apiKey
Request
Query Parameters
matchingType
string (MatchingType)

Type of the matching configuration.

Enum: Description
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: matchingType=DEDUPLICATION
page
integer <int32> >= 0
Default: 0

Number of the retrieved result page.

Example: page=1
pageSize
integer <int32> >= 1
Default: 20

Maximum number of items to be returned in the result page.

Example: pageSize=20
workspaceId
string (WorkspaceId)

ID of the workspace.

Example: workspaceId=c074b9f3-abf0-4f8e-9a20-74deb6cfa2a4
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

GET /datamatchingdefinitions
Response samples
application/json
{
  "page": "1",
  "pageSize": "100",
  "numberOfPages": "3",
  "total": "67",
  "values": [ ]
}
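
To collect all definitions, a client can request pages until numberOfPages is reached. The sketch below assumes that page numbering starts at 0 (suggested by the default and the minimum of the page parameter) and that numberOfPages counts the pages available for the current pageSize. The function names are hypothetical, and fake_fetch stands in for a real HTTP call with made-up data.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_list_path(matching_type=None, page=0, page_size=20, workspace_id=None):
    """Build the request path with the documented query parameters."""
    if page < 0:
        raise ValueError("page must be >= 0")
    if page_size < 1:
        raise ValueError("pageSize must be >= 1")
    params = {"page": page, "pageSize": page_size}
    if matching_type is not None:
        params["matchingType"] = matching_type
    if workspace_id is not None:
        params["workspaceId"] = workspace_id
    return "/datamatchingdefinitions?" + urlencode(params)


def iter_definitions(fetch_page, **filters):
    """Yield every definition, requesting one page at a time.

    fetch_page is any callable that takes a request path and returns the
    decoded JSON body as a dict.
    """
    page = 0
    while True:
        result = fetch_page(build_list_path(page=page, **filters))
        yield from result["values"]
        page += 1
        # The sample response carries numbers as strings, hence int().
        if page >= int(result["numberOfPages"]):
            break


# Stand-in for a real HTTP call: two pages of made-up items.
FAKE_PAGES = [[{"id": "a"}, {"id": "b"}], [{"id": "c"}]]


def fake_fetch(path):
    page = int(parse_qs(urlsplit(path).query)["page"][0])
    return {"page": str(page), "numberOfPages": "2", "values": FAKE_PAGES[page]}


ids = [item["id"] for item in iter_definitions(fake_fetch, page_size=2)]
print(ids)
```
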

Update Data Matching Definition

Update an existing Data Matching Definition with the given configuration.

Security: apiKey
Request
Path Parameters
id
required
string (DataMatchingDefinitionId)

ID of the Data Matching Definition.

Example: 6461e6113b1865304b3038b6
Request Body schema: application/json
required
goldenRecordConfiguration
object (GoldenRecordConfiguration)

Configuration for creating Golden Records in the context of Data Matching.

name
required
string <= 30 characters

A human-readable name for the Data Matching Definition instance.

Example: "Custom Matching Definition Name"
reportConfiguration
object (ReportConfiguration)

Configuration for generating reports related to the Data Matching Definition.

type
required
string (MatchingType)

Type of the matching configuration.

Enum: Description
DEDUPLICATION: The matching configuration is used to identify duplicates in a storage.
LINKAGE: The matching configuration is used to identify links between records in different storages.

Example: "DEDUPLICATION"
workspace
object (Workspace)

Collaboration space to share/configure configurations.

xmlDukeConfiguration
required
string

Holds the Duke XML configuration for data matching.

Example: "<duke>...</duke>"
Responses
200

OK

400

The sent request is malformed.

403

Permission denied.

PUT /datamatchingdefinitions/{id}
Request samples
application/json
{
  "name": "Custom Matching Definition Name",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
Response samples
application/json
{
  "id": "6461e6113b1865304b3038b6",
  "name": "Custom Matching Definition Name",
  "creatorUsername": "johndoe",
  "creatorOrganization": "cdq_monitor",
  "createdAt": "2024-10-22T10:00:22Z",
  "lastModifiedAt": "2024-10-22T10:00:22Z",
  "type": "DEDUPLICATION",
  "xmlDukeConfiguration": "<duke>...</duke>",
  "reportConfiguration": { },
  "goldenRecordConfiguration": { },
  "workspace": { }
}
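
The update request body contains fewer fields than a read response: id, creatorUsername, creatorOrganization, createdAt, and lastModifiedAt appear only in responses. A read-modify-write update can therefore be sketched as below. The helper name to_update_body is hypothetical, and treating the response-only fields as server-managed is an assumption drawn from the two samples, not a documented rule.

```python
# Fields that appear in the update request body sample above.
UPDATABLE_FIELDS = (
    "name",
    "type",
    "xmlDukeConfiguration",
    "reportConfiguration",
    "goldenRecordConfiguration",
    "workspace",
)


def to_update_body(definition, **changes):
    """Derive a PUT body from a previously read definition plus changes."""
    unknown = set(changes) - set(UPDATABLE_FIELDS)
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    # Keep only the fields the request schema lists, then apply the changes.
    body = {key: definition[key] for key in UPDATABLE_FIELDS if key in definition}
    body.update(changes)
    return body


existing = {
    "id": "6461e6113b1865304b3038b6",
    "name": "Custom Matching Definition Name",
    "creatorUsername": "johndoe",
    "createdAt": "2024-10-22T10:00:22Z",
    "type": "DEDUPLICATION",
    "xmlDukeConfiguration": "<duke>...</duke>",
}
update_body = to_update_body(existing, name="Renamed definition")
print(update_body)
```

The resulting dictionary would be sent as the JSON body of the PUT request, with the definition ID placed in the path.
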