Batch Curation

Everything about Batch Curation

Poll Curation Job

After you start a curation job, the response contains the job ID in the id field. Use this ID to poll the status of the job with this endpoint. Once the status is FINISHED, you can download the results.

Security: apiKey
Request
path Parameters
id
required
string (JobId)

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
Responses
200

OK

401

Unauthorized

403

Forbidden

404

Not Found

GET /curationjobs/{id}
Response samples
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": [],
  "countryShortNames": [],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2024-11-20T14:46:33Z",
  "user": "742429-234242-4343-232323",
  "progress": "77",
  "attachments": [],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {}
}
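
The polling flow described for this endpoint can be sketched roughly as follows. This is a minimal sketch, not an official client: fetch_job is a hypothetical callable that performs GET /curationjobs/{id} and returns the decoded JSON body, and the interval and attempt limit are arbitrary choices. FINISHED is the only terminal status named in this section, so the loop stops only on FINISHED or when the attempts run out.

```python
import time
from typing import Any, Callable, Dict


def wait_until_finished(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a curation job until its status is FINISHED.

    fetch_job is a hypothetical helper (an assumption, not part of the API)
    that calls GET /curationjobs/{id} and returns the parsed JSON response.
    """
    for attempt in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") == "FINISHED":
            return job
        # Any other status (for example RUNNING) means: wait and ask again.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(
        f"job {job_id} did not reach FINISHED after {max_attempts} polls"
    )
```

Passing sleep as a parameter keeps the loop testable without real waiting; in production code the default time.sleep would normally be left in place.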

Read Business Partner Curation Batch Results

Retrieves the curation results for a particular job.

Security: apiKey
Request
path Parameters
id
required
string (JobId)

ID of the Data Curation job.

Example: 35f23c03-1c22-45fe-9484-3ffe769325de
query Parameters
businessPartnerId
Array of strings (BusinessPartnerId)

Business Partner IDs to filter the results by.

Example: businessPartnerId=63e635235c06b7396330fe40
limit
integer <int32> [ 1 .. 100 ]
Default: 100

Number of results to fetch. At most 100 results can be returned per page.

Example: limit=50
startAfter
string (StartAfter)

Cursor for retrieving the next page of results. Use the nextStartAfter value from the previous Curation Job Result Page.

Example: startAfter=5712566172571652
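
As an illustration of how the path and query parameters above fit together, here is a small sketch that assembles the request URL. The host in BASE_URL is a placeholder, and sending the businessPartnerId array as a repeated query key is an assumption; neither is stated in this section.

```python
from typing import Iterable, Optional
from urllib.parse import quote, urlencode

# Assumption: placeholder host; the real base URL is not given in this section.
BASE_URL = "https://api.example.com"


def results_url(
    job_id: str,
    business_partner_ids: Iterable[str] = (),
    limit: int = 100,
    start_after: Optional[str] = None,
) -> str:
    """Build the URL for GET /v2/curationjobs/{id}/results."""
    # The documented range for limit is 1..100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = [("limit", str(limit))]
    # Assumption: the array parameter is serialized as a repeated key.
    params += [("businessPartnerId", bp_id) for bp_id in business_partner_ids]
    if start_after is not None:
        params.append(("startAfter", start_after))
    path = f"/v2/curationjobs/{quote(job_id, safe='')}/results"
    return f"{BASE_URL}{path}?{urlencode(params)}"
```
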
Responses
200

OK

400

Unauthorized or missing

GET /v2/curationjobs/{id}/results
Response samples
application/json
{
  "startAfter": "5712566172571652",
  "limit": "100",
  "total": "67",
  "values": [],
  "nextStartAfter": "5712566172571652"
}
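
Paging through results with startAfter and nextStartAfter might look like the sketch below. fetch_page is a hypothetical callable that requests one page for a given startAfter value (None for the first page) and returns the decoded JSON. Treating a missing or empty nextStartAfter as the end of the result set is an assumption, as is the guard that stops when the cursor does not advance.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
) -> Iterator[Any]:
    """Yield every entry of "values" across all result pages."""
    start_after: Optional[str] = None
    while True:
        page = fetch_page(start_after)
        yield from page.get("values", [])
        next_start_after = page.get("nextStartAfter")
        # Assumption: no nextStartAfter means there are no further pages.
        if not next_start_after:
            break
        # Defensive guard: stop if the cursor did not move, to avoid looping.
        if next_start_after == start_after:
            break
        start_after = next_start_after
```

Because the function is a generator, callers can stop early without fetching the remaining pages.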

Start Curation Job

Start a batch curation job on a given storage.

Security: apiKey
Request
Request Body schema: application/json
required
addressCurationLevelThreshold
string (CurationLevel)

Indicator of curation quality. Defines how good the curation was.

Enum: Description
UNKNOWN

No possibility to determine curation level.

LEVEL_1

The address was not found by the CDQ in the employed external data sources.

LEVEL_2

The address was found, but there were significant changes in critical fields.

LEVEL_3

The address was found and there are minor changes in highly important fields.

LEVEL_4

The address was found by the CDQ. There were only changes in less critical fields such as the address/premise or address/thoroughfare/number.

LEVEL_5

The address was found by the CDQ, but no major changes have been made as the address was correct.

LEVEL_6

The address was found in the shared CDQ data pool. This means another company uses the same address, which is a very reliable indicator that the address is correct (only in an alpha version).

Example: "LEVEL_1"
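
For client-side handling, the levels above can be modelled as an ordered enumeration. This is only a sketch: the section does not say how the service applies addressCurationLevelThreshold, so the comparison below assumes that higher levels are more reliable, that the threshold is an inclusive lower bound, and that UNKNOWN ranks below LEVEL_1.

```python
from enum import IntEnum


class CurationLevel(IntEnum):
    # Assumption: UNKNOWN is ranked lowest; the source gives no ordering for it.
    UNKNOWN = 0
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3
    LEVEL_4 = 4
    LEVEL_5 = 5
    LEVEL_6 = 6


def meets_threshold(level: str, threshold: str) -> bool:
    """Return True if level is at least threshold (assumed semantics)."""
    return CurationLevel[level] >= CurationLevel[threshold]
```
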
addressDataSources
object (AddressDataSources)

Preferred data sources for curation. Default PrimaryAddressDataSource is HERE. Default SecondaryAddressDataSource is CDQ.

configurationId
string (DataCurationConfigurationId)

ID of the configuration used to set up curation. If provided, the configuration supplies values for the parameters listed below. A parameter that is also set in this request overwrites the value from the configuration, except for features, which are merged:

  • outputLanguageTechnicalKey
  • addressDataSources
  • profile
  • featuresOn
  • featuresOff
  • outputCharsets
  • addressCurationLevelThreshold
  • numberSeparator
  • inputAddressConceptsIgnored
Example: "5c5356588c72a028c448adbd"
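
The override rule for configurationId can be pictured with the sketch below. It runs on plain dictionaries and is not the service's implementation. In particular, "merged" is interpreted here as an order-preserving union of the configuration's and the request's feature lists, which is an assumption.

```python
from typing import Any, Dict

# Parameters where a value in the request replaces the configuration value.
OVERRIDABLE = (
    "outputLanguageTechnicalKey",
    "addressDataSources",
    "profile",
    "outputCharsets",
    "addressCurationLevelThreshold",
    "numberSeparator",
    "inputAddressConceptsIgnored",
)
# Parameters that are merged instead of replaced.
MERGED = ("featuresOn", "featuresOff")


def effective_settings(
    configuration: Dict[str, Any], request: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine stored configuration values with values from the request."""
    result: Dict[str, Any] = {}
    for key in OVERRIDABLE:
        if key in request:
            result[key] = request[key]
        elif key in configuration:
            result[key] = configuration[key]
    for key in MERGED:
        # Assumption: merging means union, configuration entries first.
        combined = list(configuration.get(key, [])) + list(request.get(key, []))
        merged = list(dict.fromkeys(combined))
        if merged:
            result[key] = merged
    return result
```
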
countryShortNames
Array of strings (CountryShortName)

If set, only the records that belong to the countries identified by these short names are processed. By default, all records of the storage (that is, from all countries) are processed, subject to the other filters.

Example: ["CH"]
dataSourceIds
Array of strings (BusinessPartnerStorageDataSourceId)

If set, only the records that belong to the data sources identified by these IDs are processed. By default, all records of the storage (that is, from all data sources) are processed, subject to the other filters.

Example: ["648824a691d8d2503d65103e"]
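
The combined effect of countryShortNames and dataSourceIds can be sketched as a predicate over a record. The record field names countryShortName and dataSourceId are hypothetical, and treating an empty list the same as "not set" is an assumption; the section only states that each filter narrows the processed records and that the filters apply together.

```python
from typing import Any, Dict, Optional, Sequence


def is_selected(
    record: Dict[str, Any],
    country_short_names: Optional[Sequence[str]] = None,
    data_source_ids: Optional[Sequence[str]] = None,
) -> bool:
    """Return True if the record passes every filter that is set."""
    # Hypothetical field names; the record schema is not shown in this section.
    if country_short_names and record.get("countryShortName") not in country_short_names:
        return False
    if data_source_ids and record.get("dataSourceId") not in data_source_ids:
        return False
    return True
```
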
description
string <= 200 characters

Detailed description of a Job.

Example: "I started this job to improve quality of our data."
featuresOff
Array of strings (Feature)

List of features to be deactivated.

Items Enum: "ACTIVATE_DATASOURCE_BVD" "ACTIVATE_DATASOURCE_DNB" "ALL_ADDRESS_VERSIONS" "CAPITALIZE_ADDRESS" "CONFIRM_IDENTIFIERS" "DETECT_INDUSTRIAL_ZONE" "ENABLE_FUZZY_ENRICHMENTS" "ENABLE_SETTINGS" "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION" "ENRICH_ADDRESS" "ENRICH_ADDRESS_TYPE" "ENRICH_ADMINISTRATIVE_AREA_ISO" "ENRICH_CATEGORY" "ENRICH_CLASSIFICATIONS" "ENRICH_COMPANY_STATUS" "ENRICH_GEOGRAPHIC_COORDINATES" "ENRICH_GOOGLE_PLACES_INFORMATION" "ENRICH_IDENTIFIERS" "ENRICH_LEGAL_ADDRESS" "ENRICH_LOCAL_ADDRESS" "ENRICH_LEGAL_FORM" "ENRICH_MINORITY_INDICATOR" "ENRICH_POST_CODE" "ENRICH_POST_CODE_TYPE" "ENRICH_REGISTERED_NAME" "ENRICH_VAT_REGISTERED_INFORMATION" "EXTRACT_ADDRESS_CONTEXT" "EXTRACT_CARE_OF" "EXTRACT_DOING_BUSINESS_AS" "HARMONIZE_IDENTIFIERS" "HARMONIZE_NAME" "JOIN_THOROUGHFARE_NUMBER_POSTFIX" "JOIN_THOROUGHFARE_NUMBER_PREFIX" "NORMALIZE_ADDRESS" "NORMALIZE_BUSINESS_PARTNER" "PARSE_ADDRESS" "PARSE_LEGAL_FORM" "PARSE_LEGAL_FORM_WORLDWIDE" "PARSE_NAMES" "PREFER_COUNTY" "PREFER_THOROUGHFARE_NAME" "PREFER_THOROUGHFARE_SHORTNAME" "PREPROCESS_ADDRESS" "PREPROCESS_BUSINESS_PARTNER" "REMOVE_CONTACT_INFORMATION" "REMOVE_INVALID_IDENTIFIERS" "SHARING_PRIMARY" "SHOW_DEBUG_INFO" "SHOW_FORMATTED_ADDRESS" "SHOW_FORMATTED_SAP_RECORD" "SHOW_SETTINGS" "STANDARDIZE_REGISTERED_ADDRESS" "STANDARDIZE_VAT_REGISTERED_ADDRESS" "TRANSLATE_ADDRESS" "TRANSLATE_NAMES" "USE_ADDRESS_INDEX_KR" "LAB_ALLOW_LONG_LOOKUP_CALLS" "LAB_QUERY_CLIENT" "LAB_QUERY_GOOGLE_WITH_COMPANY" "LAB_QUERY_PELIAS_WITH_COMPANY" "LAB_USE_GOOGLE_MAPS" "LAB_USE_LOOKUP_CLASSIFICATION_ONLY" "LAB_USE_PELIAS" "LAB_USE_QUEUES"
Example: ["ENRICH_ADDRESS"]
featuresOn
Array of strings (Feature)

List of features to be activated.

Items Enum: "ACTIVATE_DATASOURCE_BVD" "ACTIVATE_DATASOURCE_DNB" "ALL_ADDRESS_VERSIONS" "CAPITALIZE_ADDRESS" "CONFIRM_IDENTIFIERS" "DETECT_INDUSTRIAL_ZONE" "ENABLE_FUZZY_ENRICHMENTS" "ENABLE_SETTINGS" "ENABLE_UNALLOWED_NAME_VALUE_VALIDATION" "ENRICH_ADDRESS" "ENRICH_ADDRESS_TYPE" "ENRICH_ADMINISTRATIVE_AREA_ISO" "ENRICH_CATEGORY" "ENRICH_CLASSIFICATIONS" "ENRICH_COMPANY_STATUS" "ENRICH_GEOGRAPHIC_COORDINATES" "ENRICH_GOOGLE_PLACES_INFORMATION" "ENRICH_IDENTIFIERS" "ENRICH_LEGAL_ADDRESS" "ENRICH_LOCAL_ADDRESS" "ENRICH_LEGAL_FORM" "ENRICH_MINORITY_INDICATOR" "ENRICH_POST_CODE" "ENRICH_POST_CODE_TYPE" "ENRICH_REGISTERED_NAME" "ENRICH_VAT_REGISTERED_INFORMATION" "EXTRACT_ADDRESS_CONTEXT" "EXTRACT_CARE_OF" "EXTRACT_DOING_BUSINESS_AS" "HARMONIZE_IDENTIFIERS" "HARMONIZE_NAME" "JOIN_THOROUGHFARE_NUMBER_POSTFIX" "JOIN_THOROUGHFARE_NUMBER_PREFIX" "NORMALIZE_ADDRESS" "NORMALIZE_BUSINESS_PARTNER" "PARSE_ADDRESS" "PARSE_LEGAL_FORM" "PARSE_LEGAL_FORM_WORLDWIDE" "PARSE_NAMES" "PREFER_COUNTY" "PREFER_THOROUGHFARE_NAME" "PREFER_THOROUGHFARE_SHORTNAME" "PREPROCESS_ADDRESS" "PREPROCESS_BUSINESS_PARTNER" "REMOVE_CONTACT_INFORMATION" "REMOVE_INVALID_IDENTIFIERS" "SHARING_PRIMARY" "SHOW_DEBUG_INFO" "SHOW_FORMATTED_ADDRESS" "SHOW_FORMATTED_SAP_RECORD" "SHOW_SETTINGS" "STANDARDIZE_REGISTERED_ADDRESS" "STANDARDIZE_VAT_REGISTERED_ADDRESS" "TRANSLATE_ADDRESS" "TRANSLATE_NAMES" "USE_ADDRESS_INDEX_KR" "LAB_ALLOW_LONG_LOOKUP_CALLS" "LAB_QUERY_CLIENT" "LAB_QUERY_GOOGLE_WITH_COMPANY" "LAB_QUERY_PELIAS_WITH_COMPANY" "LAB_USE_GOOGLE_MAPS" "LAB_USE_LOOKUP_CLASSIFICATION_ONLY" "LAB_USE_PELIAS" "LAB_USE_QUEUES"
Example: ["ENRICH_ADDRESS"]
fields
Array of strings

Fields are deprecated. Use features instead.

Items Enum: "formattedAddress" "legalAddress" "companyStatus" "classifications"
Example: ["formattedAddress"]
language
string (LanguageTechnicalKey)

ISO 639-1 two-letter language code.

Example: "DE"
name
string <= 50 characters

Name of a Job.

Example: "Process vendor data."
optionSkipReport
boolean
Deprecated
Default: true

Deprecated and not usable. To create a report, use reportsRequest.

Example: "true"
outputCharsets
Array of objects (OutputCharset)

List of Output Character Sets.

profile
string (Profile)
Enum: "STANDARD" "ADDRESS_ONLY" "STANDARD_ADDRESS_CURATION_AND_ENRICHMENT" "ADDRESS_STANDARDIZATION" "ADDRESS_TRANSLATION" "BUSINESS_PARTNER_ONLY" "FEATURES_OFF" "NATURAL_PERSON_SCREENING" "PRECURATION" "GOLDEN_RECORD"
Example: "STANDARD"
reportsRequest
object (DataCurationReportsRequest)
Deprecated

Deprecated. Reports are available in Data Clinic app.

storageId
required
string (BusinessPartnerStorageId)

Unique identifier of the Storage.

Example: "72d6900fce6b326088f5d9d91049e3e6"
workers
integer [ 1 .. 8 ]
Default: 1

Number of workers to be used for the job. By default, the number of workers is 1.

Example: "3"
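
The documented constraints on the request body (storageId required, name up to 50 characters, description up to 200 characters, workers between 1 and 8) can be checked before sending. The helper below is a sketch with a hypothetical name; it covers only these four fields and passes any other field through unchecked.

```python
from typing import Any, Dict, Optional


def build_start_job_body(
    storage_id: str,
    name: Optional[str] = None,
    description: Optional[str] = None,
    workers: int = 1,
    **other_fields: Any,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for POST /curationjobs."""
    if not storage_id:
        raise ValueError("storageId is required")
    if name is not None and len(name) > 50:
        raise ValueError("name must be at most 50 characters")
    if description is not None and len(description) > 200:
        raise ValueError("description must be at most 200 characters")
    if not 1 <= workers <= 8:
        raise ValueError("workers must be between 1 and 8")
    body: Dict[str, Any] = {"storageId": storage_id, "workers": workers}
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    # Remaining fields (profile, featuresOn, ...) are passed through as given.
    body.update(other_fields)
    return body
```
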
Responses
200

OK

201

Created

400

The sent request is malformed.

401

Unauthorized

403

Forbidden

404

Not Found

POST /curationjobs
Request samples
application/json
{
  "name": "Process vendor data.",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": [],
  "countryShortNames": [],
  "workers": "3",
  "profile": "STANDARD",
  "language": "DE",
  "outputCharsets": [],
  "addressDataSources": {},
  "addressCurationLevelThreshold": "LEVEL_1",
  "fields": [],
  "featuresOn": [],
  "featuresOff": [],
  "optionSkipReport": "true",
  "reportsRequest": {},
  "configurationId": "5c5356588c72a028c448adbd"
}
Response samples
application/json
{
  "id": "35f23c03-1c22-45fe-9484-3ffe769325de",
  "name": "Process vendor data",
  "description": "I started this job to improve quality of our data.",
  "storageId": "72d6900fce6b326088f5d9d91049e3e6",
  "dataSourceIds": [],
  "countryShortNames": [],
  "status": "RUNNING",
  "statusMessage": "The job failed because storage is empty.",
  "createdAt": "2024-11-20T14:46:33Z",
  "user": "742429-234242-4343-232323",
  "progress": "77",
  "attachments": [],
  "reportsJobId": "6be92567-4327-4463-813f-a8c990410d79",
  "reportsConfiguration": {}
}
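
A client will usually want typed values out of a job response like the sample above. The sketch below picks four fields and converts them; the JobSummary class is hypothetical, and it assumes progress arrives as a numeric string (as in the sample) and createdAt as an ISO 8601 UTC timestamp ending in Z.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class JobSummary:
    id: str
    status: str
    progress: int
    created_at: datetime


def parse_job(payload: Dict[str, Any]) -> JobSummary:
    """Convert a decoded job response into a typed summary."""
    # fromisoformat rejects a trailing "Z" before Python 3.11, so normalize it.
    created_at = datetime.fromisoformat(payload["createdAt"].replace("Z", "+00:00"))
    return JobSummary(
        id=payload["id"],
        status=payload["status"],
        progress=int(payload["progress"]),
        created_at=created_at,
    )
```
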