
Data Curation API (3)

This API provides services to curate and enrich Business Partner and address data.

Servers
Production

https://api.cdq.com/data-curation/rest/

Batch Curation

Everything about Batch Curation

Operations

Request

Start a batch curation job on a given storage.

Security
apiKey
Body: application/json (required)
name: string, <= 50 characters

Name of a Job.

Example: "Process vendor data."
description: string, <= 200 characters

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId: string (BusinessPartnerStorageId), required

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds: Array of strings (BusinessPartnerStorageDataSourceId)

If set, only records that belong to the data sources identified by these IDs are processed. By default, records from all data sources of the storage are processed, subject to the other filters.

Example: ["648824a691d8d2503d65103e"]
countryShortNames: Array of strings (CountryShortName)

If set, only records that belong to the countries identified by these short names are processed. By default, records from all countries in the storage are processed, subject to the other filters.

Example: ["CH"]
workers: integer, [ 1 .. 8 ]

Number of workers to be used for the job. By default, the number of workers is 1.

Default: 1
Example: 3
profile: string (Profile)

Profiles are predefined sets of features to configure the curation process.

Enum values and their descriptions:
STANDARD

The standard profile is applied by default. It consists of the most important API feature toggles and is therefore recommended for most data curation cases.

Features included:

  • PARSE_ADDRESS,
  • ENRICH_ADDRESS,
  • ENRICH_POST_CODE_TYPE,
  • ENRICH_POST_CODE,
  • TRANSLATE_ADDRESS,
  • ENRICH_ADMINISTRATIVE_AREA_ISO,
  • PREPROCESS_ADDRESS,
  • DETECT_INDUSTRIAL_ZONE,
  • NORMALIZE_ADDRESS,
  • LAB_ALLOW_LONG_LOOKUP_CALLS,
  • ENRICH_CATEGORY,
  • ENRICH_IDENTIFIERS,
  • REMOVE_INVALID_IDENTIFIERS,
  • ENRICH_LEGAL_FORM,
  • PARSE_LEGAL_FORM,
  • TRANSLATE_NAMES,
  • PREPROCESS_BUSINESS_PARTNER,
  • NORMALIZE_BUSINESS_PARTNER,
  • HARMONIZE_IDENTIFIERS,
  • HARMONIZE_NAME,
  • PARSE_NAMES,
  • EXTRACT_CARE_OF,
  • REMOVE_CONTACT_INFORMATION,
  • EXTRACT_DOING_BUSINESS_AS,
  • EXTRACT_ADDRESS_CONTEXT,
  • EXTRACT_NAME_DETAILS
ADDRESS_ONLY

This profile is recommended when the goal is to curate the address of your business partner. It consists of the most important address API features from the precuration to the harmonization phase.

Features included:

  • PARSE_ADDRESS,
  • ENRICH_ADDRESS,
  • ENRICH_POST_CODE_TYPE,
  • ENRICH_POST_CODE,
  • TRANSLATE_ADDRESS,
  • ENRICH_ADMINISTRATIVE_AREA_ISO,
  • PREPROCESS_ADDRESS,
  • DETECT_INDUSTRIAL_ZONE,
  • NORMALIZE_ADDRESS,
  • EXTRACT_CARE_OF,
  • REMOVE_CONTACT_INFORMATION,
  • EXTRACT_DOING_BUSINESS_AS,
  • EXTRACT_ADDRESS_CONTEXT,
  • EXTRACT_NAME_DETAILS,
  • CACHE
STANDARD_ADDRESS_CURATION_AND_ENRICHMENT

The Address Curation & Enrichment profile allows for cleansing addresses in terms of parsing the given content, identifying reference addresses in specifically configured or default address data sources, enriching the input based on the reference addresses and additional CDQ reference data, and harmonizing the different address components.

Features included:

  • DETECT_INDUSTRIAL_ZONE,
  • ENRICH_ADDRESS,
  • USE_ADDRESS_INDEX_KR,
  • ENRICH_ADMINISTRATIVE_AREA_ISO,
  • ENRICH_GEOGRAPHIC_COORDINATES,
  • ENRICH_POST_CODE,
  • ENRICH_POST_CODE_TYPE,
  • EXTRACT_ADDRESS_CONTEXT,
  • EXTRACT_CARE_OF,
  • NORMALIZE_ADDRESS,
  • PARSE_ADDRESS,
  • PREPROCESS_ADDRESS,
  • TRANSLATE_ADDRESS,
  • CACHE
ADDRESS_STANDARDIZATION

The Address Standardization profile standardizes a given input address according to the CDQ standards without considering any reference addresses from address data sources. It extracts different address components and places them in distinct fields (e.g. a PO Box maintained as street is put into a separate PO Box concept), enriches address components only based on already provided input (e.g. country name is enriched based on a given country code) and harmonizes given components (e.g. post code is formatted according to the reference standard in a country).

Features included:

  • DETECT_INDUSTRIAL_ZONE,
  • ENRICH_ADMINISTRATIVE_AREA_ISO,
  • EXTRACT_ADDRESS_CONTEXT,
  • EXTRACT_CARE_OF,
  • NORMALIZE_ADDRESS,
  • PARSE_ADDRESS,
  • PREPROCESS_ADDRESS,
  • CACHE
ADDRESS_TRANSLATION

Translates only the business partner address, without any enrichment. Transliteration is also possible by turning it on in the curation settings.

Features included:

  • TRANSLATE_ADDRESS,
  • CACHE
BUSINESS_PARTNER_ONLY

This profile is recommended when the goal is to curate the business partner name, legal form, identifiers, etc. It consists of the most important API features from the precuration to the harmonization phase.

Features included:

  • PREPROCESS_BUSINESS_PARTNER,
  • PARSE_NAMES,
  • ENRICH_CATEGORY,
  • TRANSLATE_NAMES,
  • PARSE_LEGAL_FORM,
  • ENRICH_LEGAL_FORM,
  • REMOVE_INVALID_IDENTIFIERS,
  • ENRICH_IDENTIFIERS,
  • LAB_ALLOW_LONG_LOOKUP_CALLS,
  • NORMALIZE_BUSINESS_PARTNER,
  • HARMONIZE_IDENTIFIERS,
  • HARMONIZE_NAME,
  • CACHE
FEATURES_OFF

This profile turns off all API features that are enabled by default (standard profile). It can be used for testing, to check the results of curation with single features.

Features included:

  • CACHE
NATURAL_PERSON_SCREENING

Identifies Natural Person data based on identifiers, legal forms, and known legal entities.

Features included:

  • ENRICH_CATEGORY,
  • CACHE
PRECURATION

Parses, preprocesses, and harmonizes business partner data without additional enrichment.

Features included:

  • PREPROCESS_ADDRESS,
  • PREPROCESS_BUSINESS_PARTNER,
  • PARSE_ADDRESS,
  • PARSE_LEGAL_FORM,
  • PARSE_NAMES,
  • NORMALIZE_ADDRESS,
  • NORMALIZE_BUSINESS_PARTNER,
  • CACHE
GOLDEN_RECORD

The Golden Record curation profile consists of the API feature toggles used during golden record generation and is recommended for Data Clinic monitoring.

Features included:

  • CACHE,
  • DETECT_INDUSTRIAL_ZONE,
  • ENABLE_FUZZY_ENRICHMENTS,
  • ENRICH_VAT_REGISTERED_INFORMATION,
  • ENRICH_ADDRESS,
  • ENRICH_ADMINISTRATIVE_AREA_ISO,
  • ENRICH_CATEGORY,
  • ENRICH_IDENTIFIERS,
  • ENRICH_LEGAL_ADDRESS,
  • ENRICH_LEGAL_FORM,
  • ENRICH_LOCAL_ADDRESS,
  • ENRICH_POST_CODE_TYPE,
  • ENRICH_REGISTERED_NAME,
  • EXTRACT_ADDRESS_CONTEXT,
  • EXTRACT_CARE_OF,
  • EXTRACT_DOING_BUSINESS_AS,
  • HARMONIZE_IDENTIFIERS,
  • HARMONIZE_NAME,
  • NORMALIZE_ADDRESS,
  • NORMALIZE_BUSINESS_PARTNER,
  • PARSE_ADDRESS,
  • PARSE_LEGAL_FORM,
  • PARSE_NAMES,
  • PREPROCESS_ADDRESS,
  • PREPROCESS_BUSINESS_PARTNER,
  • REMOVE_CONTACT_INFORMATION,
  • REMOVE_INVALID_IDENTIFIERS,
  • TRANSLATE_ADDRESS,
  • TRANSLATE_NAMES
Example: "STANDARD"
language: string (LanguageTechnicalKey)

ISO 639-1 two-letter language code.

Example: "DE"
outputCharsets: Array of objects (OutputCharset)

List of Output Character Sets.

addressDataSources: object (AddressDataSources)

Preferred data sources for curation. The default PrimaryAddressDataSource is HERE; the default SecondaryAddressDataSource is CDQ.

addressCurationLevelThreshold: string (CurationLevel)

Indicator for curation quality. Defines how good curation was.

Identifier | Name | Description | Curation level score
UNKNOWN | UNKNOWN | No possibility to determine curation level. | overall score [0.0, 0.2]
LEVEL_1 | Not found | The address was not found by the CDQ in the employed external data sources. | overall score [0.2, 0.4]
LEVEL_2 | Low confidence match | The address was found, but there were significant changes in critical fields. | overall score [0.4, 0.6]
LEVEL_3 | Medium confidence match | The address was found and there are minor changes in highly important fields. | overall score [0.6, 0.7]
LEVEL_4 | High confidence match | The address was found by the CDQ. There were only changes in less critical fields such as the address/premise or address/thoroughfare/number. | overall score [0.7, 0.8]
LEVEL_5 | Reliable match | The address was found by the CDQ, but no major changes have been made as the address was correct. | overall score [0.8, 0.9]
LEVEL_6 | Validated | The address was found in the shared CDQ data pool. This means another company uses the same address, which is a very reliable indicator that the address is correct (only in an alpha version). | overall score [0.9, 1.0]
Example: "LEVEL_1"
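The score ranges in the table above share their boundary values (for example, 0.2 closes UNKNOWN and opens LEVEL_1). The following is a minimal sketch of a client-side lookup, assuming each lower bound is inclusive and each upper bound is exclusive, except for LEVEL_6, which includes 1.0. That boundary rule and the function name `curation_level` are assumptions for illustration, not documented API behavior.

```python
# Upper bounds taken from the table above; the half-open interpretation is an assumption.
LEVEL_UPPER_BOUNDS = [
    (0.2, "UNKNOWN"),
    (0.4, "LEVEL_1"),
    (0.6, "LEVEL_2"),
    (0.7, "LEVEL_3"),
    (0.8, "LEVEL_4"),
    (0.9, "LEVEL_5"),
    (1.0, "LEVEL_6"),
]


def curation_level(score):
    """Map an overall score in [0.0, 1.0] to a curation level identifier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    for upper_bound, level in LEVEL_UPPER_BOUNDS:
        if score < upper_bound:
            return level
    # Only a score of exactly 1.0 reaches this point.
    return "LEVEL_6"
```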
fields: Array of strings

Fields are deprecated.

Items Enum: "formattedAddress", "legalAddress", "companyStatus", "classifications"
Example: ["formattedAddress"]
featuresOn: Array of strings (Feature)

List of features to be activated.

Items Enum: "ACTIVATE_DATASOURCE_BVD", "ACTIVATE_DATASOURCE_DNB", "ALL_ADDRESS_VERSIONS", "CAPITALIZE_ADDRESS", "CONFIRM_IDENTIFIERS", "DETECT_INDUSTRIAL_ZONE", "ENABLE_FUZZY_ENRICHMENTS", "ENABLE_SETTINGS", "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION", "ENRICH_ADDRESS"
Example: ["ENRICH_ADDRESS"]
featuresOff: Array of strings (Feature)

List of features to be deactivated.

Items Enum: "ACTIVATE_DATASOURCE_BVD", "ACTIVATE_DATASOURCE_DNB", "ALL_ADDRESS_VERSIONS", "CAPITALIZE_ADDRESS", "CONFIRM_IDENTIFIERS", "DETECT_INDUSTRIAL_ZONE", "ENABLE_FUZZY_ENRICHMENTS", "ENABLE_SETTINGS", "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION", "ENRICH_ADDRESS"
Example: ["ENRICH_ADDRESS"]
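To reason about which features a request ends up with, the profile, featuresOn, and featuresOff can be combined on the client side. The sketch below assumes the effective set is the profile's features plus featuresOn, minus featuresOff, and that featuresOff wins when a feature appears in both lists. This resolution order is an assumption; the reference does not state it. The function name `effective_features` is hypothetical, and only three of the profiles listed above are reproduced.

```python
# Feature sets copied from the profile descriptions above (subset of profiles only).
PROFILE_FEATURES = {
    "FEATURES_OFF": {"CACHE"},
    "ADDRESS_TRANSLATION": {"TRANSLATE_ADDRESS", "CACHE"},
    "NATURAL_PERSON_SCREENING": {"ENRICH_CATEGORY", "CACHE"},
}


def effective_features(profile, features_on=(), features_off=()):
    """Return the assumed effective feature set for a request.

    Assumption: featuresOn is added to the profile first, then featuresOff is removed.
    """
    return (PROFILE_FEATURES[profile] | set(features_on)) - set(features_off)
```

For instance, testing a single feature with the FEATURES_OFF profile would, under this assumption, look like `effective_features("FEATURES_OFF", ["ENRICH_ADDRESS"])`.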
configurationId: string (DataCurationConfigurationId)

ID of the configuration used to set up curation. If provided, the parameters listed below are taken from the configuration. Any of them that is also provided in this request overwrites the value from the configuration, except for features, which are merged:

  • outputLanguageTechnicalKey
  • addressDataSources
  • profile
  • featuresOn
  • featuresOff
  • outputCharsets
  • addressCurationLevelThreshold
  • numberSeparator
  • inputAddressConceptsIgnored
Example: "5c5356588c72a028c448adbd"
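The override rule can be pictured as a small merge step. This is a sketch only: it uses the parameter names from the list above, and it assumes that "merged" means an order-preserving union of the feature lists from the configuration and the request. The exact merge behavior is not specified here, and `resolve_settings` is a hypothetical helper, not part of the API.

```python
# Parameters that the request overwrites when present (names from the list above).
OVERRIDABLE = (
    "outputLanguageTechnicalKey",
    "addressDataSources",
    "profile",
    "outputCharsets",
    "addressCurationLevelThreshold",
    "numberSeparator",
    "inputAddressConceptsIgnored",
)
# Parameters that are merged instead of overwritten.
MERGED = ("featuresOn", "featuresOff")


def resolve_settings(configuration, request):
    """Combine a stored configuration with request parameters."""
    resolved = dict(configuration)
    for key in OVERRIDABLE:
        if key in request:
            resolved[key] = request[key]
    for key in MERGED:
        # Assumption: merging is a union that keeps the configuration's order first.
        combined = list(configuration.get(key, []))
        for feature in request.get(key, []):
            if feature not in combined:
                combined.append(feature)
        if combined:
            resolved[key] = combined
    return resolved
```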
optionSkipReport: boolean, Deprecated

Deprecated and not usable. To create a report, use reportsRequest.

Default: true
Example: true
reportsRequest: object (DataCurationReportsRequest), Deprecated
curl -i -X POST \
  https://api.cdq.com/data-curation/rest/curationjobs \
  -H 'Content-Type: application/json' \
  -H 'X-API-KEY: YOUR_API_KEY_HERE' \
  -d '{
    "name": "Process vendor data.",
    "description": "I started this job to improve quality of our data.",
    "storageId": "72d6900fce6b326088f5d9d91049e3e6",
    "dataSourceIds": [
      "648824a691d8d2503d65103e"
    ],
    "countryShortNames": [
      "CH"
    ],
    "workers": 3,
    "profile": "STANDARD",
    "language": "DE",
    "outputCharsets": [
      {
        "concept": "ADDRESS",
        "charset": "LATIN"
      }
    ],
    "addressDataSources": {
      "primaryAddressDataSource": {
        "technicalKey": "HERE",
        "threshold": "0.4"
      },
      "secondaryAddressDataSources": [
        {
          "technicalKey": "HERE",
          "threshold": "0.4"
        }
      ]
    },
    "addressCurationLevelThreshold": "LEVEL_1",
    "fields": [
      "formattedAddress"
    ],
    "featuresOn": [
      "ENRICH_ADDRESS"
    ],
    "featuresOff": [
      "ENRICH_ADDRESS"
    ],
    "optionSkipReport": true,
    "reportsRequest": {
      "dataCurationJobId": "a34fb367-85aa-400f-b369-53863432050c",
      "name": "Data Curation Reports Job",
      "description": "The report will be generated for the Data Curation Job with ID: a34fb367-85aa-400f-b369-53863432050c.",
      "reportsConfiguration": {
        "addressCuration": {
          "build": "true"
        },
        "legalEntityCuration": {
          "build": "true"
        },
        "naturalPersonScreening": {
          "build": "true"
        },
        "vatRegistrationData": {
          "build": "true"
        }
      }
    },
    "configurationId": "5c5356588c72a028c448adbd"
  }'
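
The same request can be sketched in Python using only the standard library. The helper names `build_job_request` and `start_job` are hypothetical; the validation mirrors the constraints documented above (storageId required, name at most 50 characters, description at most 200 characters, workers between 1 and 8), and only a few of the body parameters are covered.

```python
import json
import urllib.request

BASE_URL = "https://api.cdq.com/data-curation/rest"


def build_job_request(storage_id, name=None, description=None, workers=1, profile="STANDARD"):
    """Validate the documented constraints and return the JSON body as a dict."""
    if not storage_id:
        raise ValueError("storageId is required")
    if name is not None and len(name) > 50:
        raise ValueError("name must be at most 50 characters")
    if description is not None and len(description) > 200:
        raise ValueError("description must be at most 200 characters")
    if isinstance(workers, bool) or not isinstance(workers, int) or not 1 <= workers <= 8:
        raise ValueError("workers must be an integer between 1 and 8")
    body = {"storageId": storage_id, "workers": workers, "profile": profile}
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    return body


def start_job(api_key, body):
    """POST the body to the curationjobs endpoint and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/curationjobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping validation separate from the network call makes it possible to check a body locally before any request is sent.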

Responses

OK

Body: application/json
id: string (JobId)

Unique identifier of a job.

Example: "35f23c03-1c22-45fe-9484-3ffe769325de"
name: string

Name of a Job.

Example: "Process vendor data"
description: string

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId: string (BusinessPartnerStorageId)

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds: Array of strings (BusinessPartnerStorageDataSourceId)

List of Data Source IDs.

Example: ["648824a691d8d2503d65103e"]
countryShortNames: Array of strings (CountryShortName)

List of country short names.

Example: ["CH"]
status: string (JobStatus)

Curation Job execution status.

Enum: "UNKNOWN", "CREATED", "PERSISTED", "SCHEDULED", "WAITING", "RUNNING", "FINISHED", "DIED", "CANCELED", "FAILED"
Example: "RUNNING"
statusMessage: string (JobStatusMessage)

Additional information to explain the status.

Example: "The job failed because storage is empty."
createdAt: string (CreatedAt)

Date of creation (ISO 8601-compliant).

Example: "2025-12-19T17:00:42Z"
user: string (JobUser)

ID of (human) user or API key.

Example: "742429-234242-4343-232323"
progress: integer (JobProgress), [ 0 .. 100 ]

Progress (%) of the job.

Example: 77
attachments: Array of objects (FileResource)

List of attachments.

reportsJobId: string

Unique identifier of a Reports Job.

Example: "6be92567-4327-4463-813f-a8c990410d79"
reportsConfiguration: object (DataCurationReportsConfiguration)

Configures if and how Data Curation reports are generated.

Response
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": ["648824a691d8d2503d65103e"],
  "countryShortNames": ["CH"],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2025-12-19T17:00:42Z",
  "user": "742429-234242-4343-232323",
  "progress": 77,
  "attachments": [{}],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {
    "addressCuration": {},
    "legalEntityCuration": {},
    "naturalPersonScreening": {},
    "vatRegistrationData": {}
  }
}

Request

After you start a curation job, the response contains a job ID: { "id": <ID> }. Use this ID to poll the status of the job with this endpoint. Once the status is FINISHED, you can download the results.

Security
apiKey
Path
id: string (JobId), required

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
curl -i -X GET \
  https://api.cdq.com/data-curation/rest/curationjobs/35f23c03-1c22-45fe-9484-3ffe769325de \
  -H 'X-API-KEY: YOUR_API_KEY_HERE'
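
Polling for the job status can be sketched as a loop that stops on a final status. The set of terminal statuses below (FINISHED, DIED, CANCELED, FAILED) is an assumption based on the status names; only FINISHED is documented as the signal that results are ready. The helpers `fetch_job` and `wait_for_job`, the polling interval, and the attempt limit are all illustrative.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.cdq.com/data-curation/rest"
# Assumption: these statuses are final; only FINISHED is documented as such.
TERMINAL_STATUSES = {"FINISHED", "DIED", "CANCELED", "FAILED"}


def fetch_job(api_key, job_id):
    """GET the job resource and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{BASE_URL}/curationjobs/{job_id}",
        headers={"X-API-KEY": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(job_id, fetch, interval_seconds=10.0, max_attempts=360, sleep=time.sleep):
    """Call fetch(job_id) repeatedly until the job reaches a terminal status."""
    for attempt in range(max_attempts):
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not reach a terminal status")
```

A call could look like `wait_for_job(job_id, lambda j: fetch_job("YOUR_API_KEY_HERE", j))`. Passing the fetch function in keeps the loop testable without network access.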

Responses

OK

Body: application/json
id: string (JobId)

Unique identifier of a job.

Example: "35f23c03-1c22-45fe-9484-3ffe769325de"
name: string

Name of a Job.

Example: "Process vendor data"
description: string

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
storageId: string (BusinessPartnerStorageId)

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
dataSourceIds: Array of strings (BusinessPartnerStorageDataSourceId)

List of Data Source IDs.

Example: ["648824a691d8d2503d65103e"]
countryShortNames: Array of strings (CountryShortName)

List of country short names.

Example: ["CH"]
status: string (JobStatus)

Curation Job execution status.

Enum: "UNKNOWN", "CREATED", "PERSISTED", "SCHEDULED", "WAITING", "RUNNING", "FINISHED", "DIED", "CANCELED", "FAILED"
Example: "RUNNING"
statusMessage: string (JobStatusMessage)

Additional information to explain the status.

Example: "The job failed because storage is empty."
createdAt: string (CreatedAt)

Date of creation (ISO 8601-compliant).

Example: "2025-12-19T17:00:42Z"
user: string (JobUser)

ID of (human) user or API key.

Example: "742429-234242-4343-232323"
progress: integer (JobProgress), [ 0 .. 100 ]

Progress (%) of the job.

Example: 77
attachments: Array of objects (FileResource)

List of attachments.

reportsJobId: string

Unique identifier of a Reports Job.

Example: "6be92567-4327-4463-813f-a8c990410d79"
reportsConfiguration: object (DataCurationReportsConfiguration)

Configures if and how Data Curation reports are generated.

Response
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": ["648824a691d8d2503d65103e"],
  "countryShortNames": ["CH"],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2025-12-19T17:00:42Z",
  "user": "742429-234242-4343-232323",
  "progress": 77,
  "attachments": [{}],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {
    "addressCuration": {},
    "legalEntityCuration": {},
    "naturalPersonScreening": {},
    "vatRegistrationData": {}
  }
}

List Business Partner Curation Results

Request

Retrieves the curation results for a particular job.

Security
apiKey
Path
id: string (JobId), required

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
Query
businessPartnerId: Array of strings (BusinessPartnerId)

IDs of the Business Partners to filter the results by.

Example: businessPartnerId=63e635235c06b7396330fe40
startAfter: string (StartAfter)

Used to retrieve the next page of results.

Example: startAfter=5712566172571652
limit: integer (int32), [ 1 .. 100 ]

Number of results to fetch. At most 100 results are returned per page.

Default 100
Example: limit=50
curl -i -X GET \
  'https://api.cdq.com/data-curation/rest/v2/curationjobs/35f23c03-1c22-45fe-9484-3ffe769325de/results?businessPartnerId=63e635235c06b7396330fe40&startAfter=5712566172571652&limit=50' \
  -H 'X-API-KEY: YOUR_API_KEY_HERE'

Responses

OK

Body: application/json
startAfter: string (StartAfter)

The ID which is used to read the page.

Example: "5712566172571652"
limit: integer (Limit)

Number of items per page.

Example: 100
total: integer (PageTotal)

Total number of items which can be paged.

Example: 67
values: Array of objects (CurationJobResult)

List of Curation Job Results.

nextStartAfter: string (NextStartAfter)

Provides the value to use as startAfter in the next page request.

Example: "5712566172571652"
Response
application/json
{
  "startAfter": "5712566172571652",
  "limit": 100,
  "total": 67,
  "values": [{}],
  "nextStartAfter": "5712566172571652"
}
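
Paging through all results with startAfter and nextStartAfter can be sketched as a generator. It assumes that the last page either omits nextStartAfter or returns it empty; the reference does not say how the end of the result set is signaled, so the loop also stops if the cursor does not advance. The helper names `fetch_results_page` and `iter_results` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.cdq.com/data-curation/rest"


def fetch_results_page(api_key, job_id, start_after=None, limit=100):
    """GET one page of curation results and return the parsed JSON response."""
    params = {"limit": limit}
    if start_after is not None:
        params["startAfter"] = start_after
    url = f"{BASE_URL}/v2/curationjobs/{job_id}/results?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(url, headers={"X-API-KEY": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(fetch_page):
    """Yield every result, following nextStartAfter from page to page.

    fetch_page takes the startAfter value (None for the first page) and returns a page dict.
    """
    start_after = None
    while True:
        page = fetch_page(start_after)
        yield from page.get("values", [])
        next_start_after = page.get("nextStartAfter")
        # Assumption: a missing, empty, or unchanged cursor means there are no more pages.
        if not next_start_after or next_start_after == start_after:
            return
        start_after = next_start_after
```

A call could look like `iter_results(lambda cursor: fetch_results_page("YOUR_API_KEY_HERE", job_id, cursor))`.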

Business Partners

Everything about Business Partners

Operations