Getting started
POST /v2/people/match
The Match API v2 is designed to find exact matches for your data. It uses a different matching technology from Match API v1, and returns only high confidence matches.
Matching to the Places and People databases is supported.
curl https://qa.api.data-axle.com/v2/people/match \
-H "X-AUTH-TOKEN: ffddaa789012345678901234" \
-d '{
"identifiers": {
"first_name": "Homer",
"last_name": "Simpson",
"street": "123 Main St",
"city": "Springfield",
"state": "WA",
"postal_code": "98103"
}
}'
The response includes a persistent person_id along with appended attributes:
{
"document": {
"person_id": "002293031973",
"rules": [
"name",
"address"
],
"score": "1.0",
"attributes": {
"person_id": "002293031973",
"family_id": "800047540468",
"first_name": "Homer",
"last_name": "Simpson",
"gender": "M",
"age": 58,
"city": "Springfield",
...
}
}
}
- GET or POST can be used.
- By default, one match is returned. Set the limit option to retrieve multiple matches.
- All available fields are included in the output. This can be changed with the fields option.
- If a match is not found, a null document is returned.
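The request and the notes above can be sketched in Python. This is a minimal illustration, not an official client: the helper names build_match_request and extract_document are hypothetical, the host and token are the placeholder values from the curl example, and the Content-Type header is an assumption the source does not state. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://qa.api.data-axle.com"  # host taken from the curl example


def build_match_request(token, identifiers, limit=None, fields=None):
    """Build (but do not send) a POST request for /v2/people/match."""
    body = {"identifiers": identifiers}
    if limit is not None:
        body["limit"] = limit    # ask for more than one match
    if fields is not None:
        body["fields"] = fields  # restrict the returned fields
    return urllib.request.Request(
        BASE_URL + "/v2/people/match",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-AUTH-TOKEN": token,
            "Content-Type": "application/json",  # assumption, not stated in the source
        },
        method="POST",
    )


def extract_document(response_body):
    """Return the matched document, or None when no match was found."""
    return json.loads(response_body).get("document")


request = build_match_request(
    "ffddaa789012345678901234",
    {"first_name": "Homer", "last_name": "Simpson", "postal_code": "98103"},
)
no_match = extract_document('{"document": null}')
```

Passing the request to urllib.request.urlopen would perform the call. Checking the extracted document for None covers the no-match case described in the last note.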
Parameters
Parameter | Description | Default |
identifiers | The set of fields used to find a match for your record. | Required. |
required_rule_groups | The set of rules the result must match. | All rules are used by default. |
filter | Reduce potential results with the Filter DSL. | All records visible to the user. |
fields | The fields returned with the matched document. | All Data Dictionary fields. |
include_labels | Get the labels for encoded fields. | false |
limit | Return multiple results up to the given limit. Since v2 returns only high confidence matches, setting this parameter above 2 is not likely to have a useful impact. | 1 |
minimum_match_score | Exclude results below a certain match score. May be used to fine tune the returned matches. Lowering it will not lower the quality of the returned matches. | 0.85 |
contract | If multiple contracts are present on an account, you will have to specify a contract ID. | User's current contract if it's the only one. |
Identifiers
The fields you can submit for matching are:
Field | Description | Example |
reference_id | Optional field used to reference IDs from your system. | 1cd345b729, User 12567 |
person_id | The unique identifier for the person. | 123456789000 |
first_name | The first name of the person. | John, Mary Ann, ... |
last_name | The last name of the person. | Smith |
street | The location address of the person. | 123 N Main St |
mailing_street | The mailing address of the person or family. | PO Box 123 |
suite | The unit or apartment number of the person. | A |
city | The city name for the person. | Seattle |
state | The state of the person. | WA |
postal_code | The postal code for the person's address. | 98134 |
phone | Telephone numbers associated with the family. | (206) 555-1212, 2065551212, ... |
email | The email address of the person or family. | someone@example.com |
email_md5 | MD5 hash of the email address. | 16d113840f999444259f73bac9ab8b10 |
email_sha256 | SHA-256 hash of the email address. | 72497f475e4f76d0b28f57c73a084ece576... |
See the rule descriptions for which fields are required for each rule to match.
reference_id is not used for matching, but is an optional field that can be used to reference requests using an ID from your own system. It will be returned with the results of batch requests.
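The hashed identifiers email_md5 and email_sha256 can be produced with Python's hashlib. The sketch below is illustrative only. The helper name email_hashes is hypothetical, and the normalization step (trimming whitespace and lowercasing) is an assumption, since the source does not say how the address should be normalized before hashing.

```python
import hashlib


def email_hashes(email):
    """Return hex digests for the email_md5 and email_sha256 identifiers."""
    # Assumption: trim whitespace and lowercase before hashing. The source does
    # not specify a normalization, so confirm the expected form before relying on it.
    normalized = email.strip().lower().encode("utf-8")
    return {
        "email_md5": hashlib.md5(normalized).hexdigest(),
        "email_sha256": hashlib.sha256(normalized).hexdigest(),
    }


hashes = email_hashes("  Someone@Example.com ")
```

The returned dictionary can be merged into the identifiers object in place of a plain email value.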
Rules
The data provided is matched using a series of rules, sequentially, with the rules ordered from high confidence to medium-high confidence. When a match according to the rule is found, that match is returned and the other rules are skipped. Each rule is a combination of one or more matching identifiers. For example, in order for the "Name + Address" rule to be satisfied, the input must match both the name and address.
Rule | Description |
person_id | Match by person_id . Use this by itself if you already have a person_id and would like data appended. |
family_id | Match by family_id . Will be used if you have a family_id in your inputs. In most cases the Head of Family will be returned. |
address + name + phone | Match by address, name and phone. |
address + name | Match by address and name. |
name + phone | Match by name and phone number. Cell phone numbers will also be matched. |
address + name + email + phone | Match by address, name, phone and email. Any one of email , email_md5 , or email_sha256 can be provided. |
address + name + email | Match by address, name and email. |
address + email + phone | Match by address, email and phone. |
name + email + phone | Match by name, email and phone. |
address + email | Match by address and email. |
name + email | Match by name and email. |
address + name (last name only) | Match by address and last name. First name will not match, therefore this is a lower ranked rule. |
email | Match by email. Only exact email matches are returned. |
phone | Match by phone. |
address | Match by address. This is inherently the lowest confidence rule in the list. |
Score
Match scores are provided to make it easier to compare the similarity of the input to the matched record. Records are scored 0-1. No matches with scores below 0.85 are returned by v2.
Match score is relative to the rule that was used to pick the winning match. For example, if the resulting match used the address + email rule and scored 1, that means that address and email matched exactly. It does not imply that other inputs, for example name or phone, match the returned output.
API v2 returns only high confidence matches. To select only the absolute best matches, use the minimum_match_score parameter.
{
"minimum_match_score": 0.96
}
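The same threshold can also be applied on the client to results that have already been retrieved. The following is a sketch under that assumption. The function name filter_by_score and the third sample record are made up for illustration, while minimum_match_score remains the way to apply the threshold on the server.

```python
def filter_by_score(documents, minimum=0.96):
    """Keep only documents whose match score is at least `minimum`."""
    # The examples in this document show the score both as a string ("1.0")
    # and as a number (0.96), so convert before comparing.
    return [doc for doc in documents if float(doc["score"]) >= minimum]


sample = [
    {"person_id": "002293031973", "score": "1.0"},
    {"person_id": "200089221963", "score": 0.96},
    {"person_id": "000000000000", "score": 0.9},  # made-up record for illustration
]
kept = filter_by_score(sample)
```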
Required Rule Groups
Use the required_rule_groups parameter to specify groups of rules. Only results that match one or more of the groups are returned.
{
"required_rule_groups": [
["address", "name"],
["address", "phone"]
]
}
In this example, only the following rules will be evaluated:
- address + name + phone
- address + name
- address + name + email + phone
- address + name + email
- address + email + phone
- address + name (last name only)
Note that any rule that does not include address + name or address + phone is skipped:
- name + phone
- name + email + phone
- address + email
- name + email
- phone
- address
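The selection above can be reproduced with a set-containment check: a rule is kept when it contains every component of at least one required group. This reading is inferred from the two lists in this section and is not a statement of how the service implements it. The names RULES, components and evaluated_rules are hypothetical.

```python
# Rule names in the order given in the Rules table. person_id and family_id
# are left out because the source does not say how they interact with groups.
RULES = [
    "address + name + phone",
    "address + name",
    "name + phone",
    "address + name + email + phone",
    "address + name + email",
    "address + email + phone",
    "name + email + phone",
    "address + email",
    "name + email",
    "address + name (last name only)",
    "email",
    "phone",
    "address",
]


def components(rule):
    """Split a rule name into its identifier components."""
    base = rule.split(" (")[0]  # drop qualifiers such as "(last name only)"
    return set(base.split(" + "))


def evaluated_rules(required_rule_groups):
    """Return the rules that contain every component of at least one group."""
    groups = [set(group) for group in required_rule_groups]
    return [rule for rule in RULES if any(g <= components(rule) for g in groups)]


kept = evaluated_rules([["address", "name"], ["address", "phone"]])
```

With the groups from the example, kept holds the six rules listed as evaluated, in the same order.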
Filter
The filter parameter reduces results to records matching specified criteria, using the Filter DSL.
Fields
By default, all fields in the Data Dictionary are included in the output. Use the fields parameter to reduce the number of elements returned:
{
...
"fields": ["city", "date_of_birth", "street", "zip", "first_name", "last_name"]
}
Some contact information, including email addresses, will be used for matching, but will not be included with the matched record. Some data may be suppressed for certain records and will not be returned.
Include Labels
The fields returned within records frequently contain encoded values that reference lookup data. To retrieve the labels for lookups, add the include_labels option:
{
...
"include_labels": true
}
Read the Lookups API documentation for more information.
Limit
By default, one match is returned. Specify a limit parameter to retrieve multiple matches. When limit is specified, an array of documents is returned.
{
"identifiers": {
"first_name": "Joe",
"last_name": "Smith",
"street": "123 Main St",
"city": "Seattle",
"state": "WA"
},
"limit": 2
}
Note that the response has a documents property which is an array:
{
"documents": [
{
"person_id": "002293031973",
"rules": [
"name + address"
],
"score": 1.0,
"attributes": {
"person_id": "002293031973",
"family_id": "800047540468",
"first_name": "Joe",
"last_name": "Smith",
"street": "123 Main St",
...
}
},
{
"person_id": "200089221963",
"rules": [
"email"
],
"score": 0.96,
"attributes": {
"person_id": "200089221963",
"family_id": "800047540468",
"first_name": "Jamie",
"last_name": "Smith",
"street": "123 Main St",
...
}
}
]
}
Bulk Match
Use bulk requests to process large volumes of match requests in a batch:
- Create a batch
- Add match requests to existing batch
- Retrieve results
- Request results with appended data to be delivered
Create a Batch
POST /v2/people/match/batch
Start by creating a batch. The initial request can include up to 1,000 match requests:
curl -XPOST https://qa.api.data-axle.com/v2/people/match/batch \
-H "X-AUTH-TOKEN: ffddaa789012345678901234" \
-d '{
"contract": "a86ef85a2ec",
"minimum_match_score": 0.8,
"identifiers": [
{
"reference_id": "123",
"first_name": "John",
"last_name": "Smith",
"street": "123 Main St",
"city": "Seattle",
"state": "WA",
"postal_code": "98103"
},
{
"reference_id": "125",
"email": "johnson-family@example.com"
}
]
}'
The immediate response includes a batch_id and an array of objects containing a match_id. Each match_id is returned in the order the match request identifiers were submitted:
{
"batch_id": "8a140451fe3f095f1c205cf185efffec",
"matches": [
{
"match_id": "5cbb3b15c8bee0f706239b45cb763fed"
},
{
"match_id": "5f0b1070c3e4761cabf116bbed2b49c4"
}
]
}
Adding Requests
PUT /v2/people/match/batch/:batch_id
Add match requests to an existing batch_id:
curl -XPUT https://qa.api.data-axle.com/v2/people/match/batch/:batch_id \
-H "X-AUTH-TOKEN: ffddaa789012345678901234" \
-d '{
"identifiers": [...]
}'
Millions of match requests can be added to a batch. However, only 1,000 match requests are allowed per API request.
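Because each API request accepts at most 1,000 match requests, a larger input has to be split before it is submitted. The sketch below shows one way to do that. The helper names chunk_identifiers and plan_batch_calls are hypothetical, and the code only plans the calls (one POST to create the batch, then one PUT per remaining chunk) without sending anything.

```python
def chunk_identifiers(identifiers, size=1000):
    """Yield successive slices of at most `size` match requests."""
    for start in range(0, len(identifiers), size):
        yield identifiers[start:start + size]


def plan_batch_calls(identifiers):
    """Return (method, path) pairs: one POST to create the batch, then PUTs."""
    calls = []
    for index, _chunk in enumerate(chunk_identifiers(identifiers)):
        if index == 0:
            calls.append(("POST", "/v2/people/match/batch"))
        else:
            # :batch_id comes from the response to the initial POST
            calls.append(("PUT", "/v2/people/match/batch/:batch_id"))
    return calls


records = [{"reference_id": str(n)} for n in range(2500)]
chunks = list(chunk_identifiers(records))
calls = plan_batch_calls(records)
```

For 2,500 records this yields chunks of 1,000, 1,000 and 500, submitted as one POST followed by two PUT requests.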
Viewing Bulk Match Stats
GET /v2/people/match/batch/:batch_id/report
Use the Match Results API with batch_id and report to view match batch stats.
curl https://qa.api.data-axle.com/v2/people/match/batch/8a140451fe3f095f1c205cf185efffec/report \
-H "X-AUTH-TOKEN: ffddaa789012345678901234"
The match report shows total counts for processed and matched records as well as a breakdown by rule groups and match rates. Matches that are pending will not appear in the results.
{
"match_request_count": 501,
"processed_count": 501,
"matched_count": 489,
"rule_counts": [
{
"group": "Address + Name + Phone",
"count": 165
},
{
"group": "Address + Name",
"count": 219
},
{
"group": "Address + Phone",
"count": 27
},
{
"group": "Name + Phone",
"count": 8
},
{
"group": "Address",
"count": 24
},
{
"group": "Phone",
"count": 46
},
{
"group": "Non-matched Records",
"count": 12
}
]
}
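A match rate can be derived from the report by dividing matched_count by processed_count. The following sketch uses the sample report above. The function names match_rate and counts_by_group are hypothetical, and treating "Non-matched Records" as the one non-match group is an assumption based on this sample.

```python
def match_rate(report):
    """Fraction of processed requests that produced a match."""
    if report["processed_count"] == 0:
        return 0.0
    return report["matched_count"] / report["processed_count"]


def counts_by_group(report):
    """Map each rule group name to its count."""
    return {row["group"]: row["count"] for row in report["rule_counts"]}


report = {
    "match_request_count": 501,
    "processed_count": 501,
    "matched_count": 489,
    "rule_counts": [
        {"group": "Address + Name + Phone", "count": 165},
        {"group": "Address + Name", "count": 219},
        {"group": "Address + Phone", "count": 27},
        {"group": "Name + Phone", "count": 8},
        {"group": "Address", "count": 24},
        {"group": "Phone", "count": 46},
        {"group": "Non-matched Records", "count": 12},
    ],
}
groups = counts_by_group(report)
matched_total = sum(n for g, n in groups.items() if g != "Non-matched Records")
```

In the sample, the per-group counts add up to matched_count (489), and adding the 12 non-matched records gives processed_count (501).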
Getting Bulk Match Results
GET /v2/people/match/batch/:batch_id
Use the Match Results API with the batch_id to fetch completed results for the batch.
Matches that are pending will not appear in the results.
curl https://qa.api.data-axle.com/v2/people/match/batch/8a140451fe3f095f1c205cf185efffec \
-H "X-AUTH-TOKEN: ffddaa789012345678901234"
Use the "status" object to determine batch progress. The batch has completed when processed is the same as requests.
{
"next_token": "13835315192676945401741312",
"status": {
"requests": 500,
"processed": 223
},
"documents": [
{
"match_id": "1a221c5c42ca3458ff715378d951801f4d65976d",
"reference_id": "7",
"documents": [
{
"person_id": "000027083596",
"rules": [
"phone"
],
"score": "1.0",
"attributes": {
"first_name": "Laura",
"last_name": "Smith",
"street": "777 Windy Rd",
"city": "Chatham",
"state": "VA",
"postal_code": "24531-2222"
}
}
]
},
{
"match_id": "e870b9bf4bea74ecac078055c992d1c498863078",
"reference_id": "6",
"documents": [
{
"person_id": "000017742859",
"rules": [
"phone"
],
"score": "1.0",
"attributes": {
"first_name": "Wendy",
"last_name": "Noble",
"street": "3737 Eastern Rd",
"city": "North Chesterfield",
"state": "VA",
"postal_code": "23234-9191"
}
}
]
},
...
]
}
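The completion check on the status object can be written as a small helper. This is a sketch only. The function name batch_progress is hypothetical, and the input is the decoded JSON body of a batch result response.

```python
def batch_progress(result):
    """Return (is_complete, fraction_processed) from a batch result body."""
    status = result["status"]
    requests, processed = status["requests"], status["processed"]
    fraction = processed / requests if requests else 0.0
    return processed == requests, fraction


done, fraction = batch_progress({"status": {"requests": 500, "processed": 223}})
```

For the status shown above, the batch is not yet complete and 223 of 500 requests (44.6%) have been processed.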
Scrolling Through Match Results
GET /v2/people/match/batch/:batch_id?since=next_token
Each request returns up to 1,000 results. To read the next set of results, use the next_token from the previous request and append it to the request URL via the since parameter:
curl "https://qa.api.data-axle.com/v2/people/match/batch/8a140451fe3f095f1c205cf185efffec?since=13835315192676945401741312" \
-H "X-AUTH-TOKEN: ffddaa789012345678901234"
Repeat this process until an empty list of documents is returned. Store the final next_token for use in future requests from the same batch:
{
"next_token": "13835315198738972450972346",
"status": {
"requests": 500,
"processed": 500
},
"documents": []
}
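The scrolling procedure can be sketched as a loop that follows next_token until a page with no documents comes back. The function scroll_results and its fetch_page callback are hypothetical. The callback stands in for the GET request, and the sketch assumes every response carries a next_token. The pages used here are made up so the example runs without network access.

```python
def scroll_results(fetch_page, since=None):
    """Collect documents page by page until an empty page is returned.

    `fetch_page(since)` is a caller-supplied function that performs the GET
    and returns the decoded JSON body; it is hypothetical, not part of the API.
    """
    collected = []
    while True:
        page = fetch_page(since)
        since = page["next_token"]  # keep the latest token for future requests
        if not page["documents"]:
            return collected, since
        collected.extend(page["documents"])


# Stand-in for the network call, using made-up pages for illustration.
fake_pages = {
    None: {"next_token": "t1", "documents": [{"match_id": "a"}, {"match_id": "b"}]},
    "t1": {"next_token": "t2", "documents": [{"match_id": "c"}]},
    "t2": {"next_token": "t3", "documents": []},
}
documents, last_token = scroll_results(fake_pages.get)
```

The returned last_token is the value to store and pass as since when polling the same batch again later.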
Bulk Match Parameters
Parameter | Description | Default |
identifiers | The set of fields that are used to find a match for your record. | Required. |
required_rule_groups | Rule groups which determine which rules are used. | All rules are used by default. |
filter | Reduce potential results with a filter. | All records visible to the user. |
limit | Return multiple results up to the given limit count. Maximum of 400 results. | 1 |
Bulk Match Result Parameters
Parameter | Description | Default |
fields | The fields that are returned with the matched document. This overrides any fields that were created with the batch. | All fields in the Data Dictionary |
include_labels | Get the labels for encoded fields. | false |
since | The token of the earliest result you would like to receive. | None - start at the beginning of the batch. |
required_rule_groups | Further filter the results by these rule groups. | All results. |
Bulk Match Result Response
The match result includes the following fields:
Field | Description |
next_token | The next token to use when requesting more results. |
status | Processing status of this batch. |
status.requests | The count of requests in the batch. |
status.processed | The count of requests that have been processed. The batch is complete when this number equals the requests count. |
documents | The documents that matched your identifiers. If there were no matches, this will be null, or an empty array if a limit is provided. |
documents[n].match_id | The ID of the match request. |
documents[n].rules | The rule that provided the best match to your identifiers. |
documents[n].score | The score of the match result. |
documents[n].person_id | The ID of the matched record. |
documents[n].attributes | The fields specified by fields in the request or those specified by your contract. |
Special Note on Using Limit
Using the limit parameter when calling POST /v2/people/match/batch determines how the subsequent call to GET /v2/people/match/batch/:batch_id formats its results.
For example, for a match batch request without specifying limit, such as:
{
"identifiers": [
{
"first_name": "Rosa",
"last_name": "Sanchez",
"phone": "4024012938"
},
{
"last_name": "Walker",
"street": "81 Union Drive",
"state": "GA",
"postal_code": "31707"
}
]
}
a call to GET /v2/people/match/batch/:batch_id will always return a single document (or null) for every set of identifiers.
Notice how in the example below, each match_id is associated with a single document ({}):
{
"status": {
...
},
"documents": [
{
"match_id": "4828afa0c77bb282c872be2695a57d5aded36382",
"document": {
"person_id": "900859717460",
"attributes": {
...
}
}
},
{
"match_id": "1f820be8c349d99bef7fc7764915033dcf44356b",
"document": null
}
]
}
If, however, a limit is specified:
{
"limit": 2,
"identifiers": [
{
"street": "8818 N Colton St",
"postal_code": "77218"
},
...
]
}
then a call to GET /v2/people/match/batch/:batch_id will return a list of documents for every set of identifiers.
Notice how in the example below, each match_id is associated with an array ([]) of documents:
{
"status": {
...
},
"documents": [
{
"match_id": "2b1dc5ae81f7bcbad5b4a9777b14efd1e8abf2fc",
"reference_id": "001",
"documents": [
{
"person_id": "800402483665",
"attributes": {
...
}
},
{
"infogroup_id": "783747234",
"attributes": {
...
}
}
]
},
{
"match_id": "621649dd2ac450033afb774e814bdff3c2b15dbb",
"documents": [
{
"person_id": "800555277933",
"attributes": {
...
}
}
]
}
]
}
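Client code that reads batch results may need to handle both shapes. One option, sketched here with a hypothetical helper named documents_for, is to normalize each entry to a list regardless of whether the batch was created with a limit. The sample entries are trimmed versions of the examples above.

```python
def documents_for(entry):
    """Return the matched documents of one batch result entry as a list."""
    if "documents" in entry:          # batch was created with a limit
        return entry["documents"] or []
    document = entry.get("document")  # batch was created without a limit
    return [] if document is None else [document]


without_limit = [
    {"match_id": "4828afa0c77bb282c872be2695a57d5aded36382",
     "document": {"person_id": "900859717460"}},
    {"match_id": "1f820be8c349d99bef7fc7764915033dcf44356b", "document": None},
]
with_limit = [
    {"match_id": "621649dd2ac450033afb774e814bdff3c2b15dbb",
     "documents": [{"person_id": "800555277933"}]},
]
counts = [len(documents_for(entry)) for entry in without_limit + with_limit]
```

A null document and an empty documents array both become an empty list, so downstream code can iterate without checking which form it received.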
Bulk Match Deliveries
Deliveries of the bulk match results as files may be managed using the Bulk Match Delivery API.
Requesting A Delivery
POST /v2/people/match/batch/:batch_id/deliveries
Request a file delivery for a match batch:
curl -XPOST https://qa.api.data-axle.com/v2/people/match/batch/8a140451fe3f095f1c205cf185efffec/deliveries \
-H "X-AUTH-TOKEN: ffddaa789012345678901234" \
-d '{
"contract": "a86ef85a2ec",
"name": "Delivery for Imogen",
"fields": ["person_id", "first_name", "last_name", "street"],
"file_format": "csv",
"location": "s3://bucket-somewhere-in-aws-s3/deliveries/..."
}'
The delivery may be requested at any time. If the match batch has not yet completed processing, the delivery will not be generated until after all the requests in the match batch have been processed.
Delivery Request Parameters
Parameter | Description | Example |
name | Name for this delivery. Will show up in the status calls and in the UI. | "customers in NC" |
fields | List of fields to be included in the delivery. | ["street", "city", "state", "gender"] |
file_format | Delivery file format. | "csv", "json", "xml", "pipe" |
location | May be used to specify a custom AWS S3 URL for this delivery. | "s3://somewhere-in-the-clouds/path/to/delivery" |
contract | Contract to be used to look up permissions, allowed fields, etc (optional). | "abcab34134" |
Listing Deliveries for Match Batch
GET /v2/people/match/batch/:batch_id/deliveries
Use this API to list all the deliveries that have been built for the given match batch:
curl https://qa.api.data-axle.com/v2/people/match/batch/8a140451fe3f095f1c205cf185efffec/deliveries \
-H "X-AUTH-TOKEN: ffddaa789012345678901234"
{
"deliveries": [
{
"id": "5df8267b2ce8d9b5",
"name": "customers in NC",
"status": "success",
"type": "match",
"url": "https://axle-deliveries.s3.amazonaws.com/.../deliveries/5df8267b2ce8d9b5",
"format": "csv",
"delivery_time": "2023-10-13T15:55:21Z"
},
{
"id": "5ee78fc3fcf77ec4",
"name": "triggered from api on 13 Oct 10:26",
"status": "success",
"type": "match",
"url": "https://axle-deliveries.s3.amazonaws.com/.../deliveries/5ee78fc3fcf77ec4",
"format": "csv",
"delivery_time": "2023-10-13T15:55:20Z"
}
]
}
Delivery Details
GET /v2/people/deliveries/:delivery_id
Use this API to get the details of the given delivery. Note, the results will include signed links to download the files. All signed links expire after 5 minutes. Avoid storing the signed links in your database.
curl https://qa.api.data-axle.com/v2/people/deliveries/5412f077b97ed005 \
-H "X-AUTH-TOKEN: ffddaa789012345678901234"
{
"id": "5412f077b97ed005",
"name": "triggered from api on 13 Oct 10:46",
"status": "success",
"type": "match",
"url": "https://places.test/api/v1/deliveries/5412f077b97ed005",
"files": [
{
"filename": "triggered_from_api_on_13_Oct_10_46-00000_people.csv.gz",
"md5": "f5c890ae9eb215281ffee72c29f03303",
"model": "person",
"records": 15,
"size": 1272,
"url": "https://axle-tmp.s3.amazonaws.com/organizati..."
}
],
"sample": [
{
"filename": "sample_people.csv.gz",
"md5": "6361898c6b085ef353cfa17f0d711c75",
"model": "person",
"records": 10,
"size": 907,
"url": "https://axle-tmp.s3.amazonaws.com/organizati..."
}
],
"format": "csv",
"delivery_time": "2023-10-13T15:55:21Z"
}
The delivery status API response includes the following fields:
Field | Description |
id | Delivery ID. |
name | Name assigned to this delivery. |
status | Status of this delivery's processing, usually "success" . |
type | All the match batch deliveries will have this set to "match" . |
url | UI URL showing the details of this delivery. |
format | File format. |
delivery_time | Time this delivery was built. |
files | List of files in this delivery. |
files[n].filename | File name (without path). |
files[n].md5 | MD5 checksum for this file. |
files[n].model | Model name. Used to denote nested objects in separate files. |
files[n].records | Number of records in the file. |
files[n].size | File size. |
files[n].url | Signed URL that may be used to retrieve the file. |
sample | Sample file for this delivery. Like "files" , it is a list with the same detail attributes. |
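The md5 value in each files[n] entry can be used to check a download. The sketch below assumes the checksum covers the file exactly as downloaded (for example the .gz file, not its decompressed contents), which the source does not confirm. The helper names md5_of_file and matches_checksum are hypothetical, and a temporary file stands in for a real delivery file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, file_entry):
    """Compare a downloaded file against the md5 in its files[n] entry."""
    return md5_of_file(path) == file_entry["md5"].lower()


# Demonstration with a temporary file standing in for a download.
contents = b"example delivery contents"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(contents)
entry = {"filename": "example.csv.gz", "md5": hashlib.md5(contents).hexdigest()}
ok = matches_checksum(tmp.name, entry)
os.remove(tmp.name)
```

Since signed links expire after 5 minutes, a check like this is best run immediately after the download, with a fresh Delivery Details request made if the file has to be fetched again.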