Introduction to Data Onboarding

Data onboarding is the transfer of customer data you’ve gathered offline into an online environment—a critical process because customer journeys are now omnichannel.

By Kelly Kirwan

If you've driven a car and obtained your license—hopefully not in that order—you know that every vehicle has blind spots. These are areas next to or behind the car that you can't see in your rearview mirror.

Most businesses also have blind spots in their customer data. They might have a full view of what their customers do online, but anything that happens offline is either not consolidated with that online data or not captured at all.

Such offline blind spots can cause all sorts of problems. Sometimes they take the form of missed opportunities, like ads that could have converted with more data behind them. Other times they lead to damaging mistakes, like showing an online ad to a customer who has already purchased that product.

Data onboarding helps illuminate this offline blind spot. To understand how it does that, we'll review:

  • What is data onboarding?

  • How does the data onboarding process work?

  • Three components of a successful data onboarding strategy

  • Four benefits of data onboarding

  • Segment: the most powerful data onboarding tool on the market

  • Data onboarding FAQs

What is data onboarding?

Data onboarding is the transfer of customer data you’ve gathered offline into an online, digital environment. These days, that’s usually a Customer Data Platform (CDP), where offline information gets matched with customer data from online sources to enrich customer profiles.

Data onboarding has become critical in marketing because modern customer journeys are omnichannel, which includes offline touchpoints. Without that information, it's hard to offer personalized customer experiences—or offer them at all.

Offline data includes information customers leave or generate in a store, like in-store purchase and transaction details or a loyalty program registration with their name and email address. Customers also generate offline data through surveys, discount coupons, or information a salesperson gathers and enters into a CRM.

Online data is all the information businesses can collect from actions their customers take on the company's digital channels—web pages visited, clicks made, forms submitted, media consumed, topics searched, purchases made, and anything else someone might do online.

How does the data onboarding process work?

Once you upload offline data to a CDP like Segment, it looks for identifiers that match existing online data. This process is called identity resolution, and identifiers can include names, email addresses, or telephone numbers. You can also resolve an identity based on identifiers generated automatically by specific platforms and services, such as an IP address, user ID, anonymous ID, ad ID, or device ID.

[Figure: Segment allows you to see which IDs are connected to a customer's profile.]

Segment's Personas feature will update an existing customer profile if it finds a match or will create a new profile when it doesn't. If there are multiple matching profiles, it will merge them.
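
As a rough illustration of the update, create, and merge behavior described above, here is a minimal Python sketch of deterministic matching. It is not Segment's implementation: the `resolve` function, the profile layout, and the identifier names are all assumptions made for this example.

```python
from typing import Any, Dict, List

# Hypothetical profile layout: {"ids": set of (type, value) pairs, "traits": dict}.
Profile = Dict[str, Any]


def resolve(profiles: List[Profile], identifiers: Dict[str, str],
            traits: Dict[str, str]) -> Profile:
    """Attach an incoming record to a profile using exact identifier matches only."""
    # Normalize values so "Ana@Example.com" and "ana@example.com" match.
    # (A simplification: some identifier types are case-sensitive.)
    incoming = {(kind, value.strip().lower()) for kind, value in identifiers.items()}

    # Every existing profile that shares at least one identifier is a match.
    matches = [p for p in profiles if p["ids"] & incoming]

    if not matches:
        # No match: create a new profile.
        profile = {"ids": set(incoming), "traits": dict(traits)}
        profiles.append(profile)
        return profile

    # One or more matches: fold all of them into the first one.
    target, duplicates = matches[0], matches[1:]
    for other in duplicates:
        target["ids"] |= other["ids"]
        # Traits already on the target win over traits from merged duplicates.
        target["traits"] = {**other["traits"], **target["traits"]}
    # Drop the merged duplicates (compared by identity, not by content).
    profiles[:] = [p for p in profiles if not any(p is d for d in duplicates)]

    # Finally, apply the incoming record itself.
    target["ids"] |= incoming
    target["traits"].update(traits)
    return target


profiles: List[Profile] = []
resolve(profiles, {"anonymous_id": "a-1"}, {"last_page": "/pricing"})      # online visit
resolve(profiles, {"email": "Ana@Example.com"}, {"loyalty_tier": "gold"})  # in-store signup
# A record carrying both identifiers links the two profiles into one.
merged = resolve(profiles, {"anonymous_id": "a-1", "email": "ana@example.com"}, {})
```

In this sketch the online visit and the in-store signup start out as two separate profiles. They only become one once a record arrives that carries both identifiers, which is the point at which a deterministic approach has actual evidence that they belong to the same person.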

Data brokers and third-party cookies—identifiers placed on your devices by advertisers—used to be rich sources for identity resolution. But consumer sentiment and government regulations have forced a shift toward first-party data, which is information you have collected yourself. This change further increases the value of other, still acceptable sources of customer information, like the data you gather offline.

[Figure: An overview of the differences between first and third-party data.]

Three components of a successful data onboarding strategy

When you start a program for onboarding your offline data, you’ll need to pay attention to three critical factors: accuracy, speed, and privacy and security.


Accuracy

Accuracy is crucial during identity resolution, where offline customer data gets matched and potentially merged with existing online data. Mistakes in this process can corrupt your customer profiles. Using corrupted data in a downstream tool, like email marketing, push notifications, or advertising, can lead to ineffective or even damaging campaigns.

Segment guarantees accuracy by using “deterministic” identity resolution:

“Deterministic is where you resolve identities based on what you know to be true. It merges new data into customer records by searching for matches among the phone numbers, emails, device IDs, and user IDs you already have. Deterministic identity resolution is a high-confidence approach using first-party data where you know with certainty that this user did that.”

We believe deterministic identity resolution is the most accurate approach because it uses first-party data your customers produce. The other method, probabilistic identity resolution, uses predictive algorithms to infer who your customers likely are, but without complete certainty.


Speed

Today's customers are active 24 hours a day, 7 days a week. They might race through the customer journey on several devices in a matter of hours or even minutes. Under these conditions, merging your offline and online data at set intervals (say, once per day) isn't sufficient anymore. Data needs to be continuously synced in real time.

Segment Connections pulls data from sources you've set up immediately when there's activity. It also instantly delivers any new or changed information to destinations connected to Segment. This two-way, real-time synchronization ensures you always have the most up-to-date data across all your tools and channels, whether they're online or offline.

Privacy and security

Consumer expectations and government regulations require you to keep your customer data private and secure. When the information you've collected gets misused, stolen, or lost, the responsibility is primarily on you as the data collector.

This liability is especially critical to keep in mind during data onboarding, when making mistakes is easy. You might, for example, transfer offline data that you’re not allowed to use into your online system. Or data might get stolen or lost if you don't have secure data management policies for the people handling the information.

Segment’s Privacy Portal automatically creates an inventory of your customer data and keeps it up to date. It classifies Personally Identifiable Information (PII) as it comes in, so you only store data that local regulations allow and your customers have consented to.
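
To make the idea of classifying PII on arrival more concrete, here is a small Python sketch. It does not reflect how Segment's Privacy Portal works internally. The regular expressions, the `classify` and `filter_record` helpers, and the single `consented` flag are simplifying assumptions for illustration only.

```python
import re
from typing import Dict

# Deliberately loose, illustrative patterns; real PII detection needs far more care.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")


def classify(value: str) -> str:
    """Label a single field value as 'email', 'phone', or 'non_pii'."""
    if EMAIL_RE.match(value):
        return "email"
    if PHONE_RE.match(value):
        return "phone"
    return "non_pii"


def filter_record(record: Dict[str, str], consented: bool) -> Dict[str, str]:
    """Keep the full record only with consent; otherwise drop fields that look like PII."""
    if consented:
        return dict(record)
    return {key: value for key, value in record.items() if classify(value) == "non_pii"}


signup = {"email": "ana@example.com", "phone": "+1 (555) 010-9999", "tier": "gold"}
safe_to_store = filter_record(signup, consented=False)  # only the "tier" field remains
```

Pattern matching like this misclassifies easily (a long order number can look like a phone number), so a sketch of this kind is a starting point for discussion and not a compliance control.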

Four benefits of data onboarding

Connecting your offline and online data has four main benefits for your business and marketing efforts.

A single view of the customer

A single customer view sits at the heart of your business. It gives all departments in your organization valuable information to enhance their performance, either through improved decision-making or more effective marketing campaigns. But you can only truly call this view "single" if it also includes your offline data—otherwise, you might have one view with a large blind spot.

[Figure: Segment automatically creates unified customer profiles from all the incoming sources you connect to your CDP.]

Personalized customer journeys

You need to continuously relate to your customers and provide them relevant experiences at every stage of their journey. The more accurate data you have on your customers, the higher the degree of personalization you can offer them on their journey across your channels. Consequently, customers feel like you know and understand them, leading to higher loyalty and lifetime value (LTV).

Extended retargeting options

You have more opportunities for retargeting—showing ads based on people's previous behavior—when you have more customer data. You can target people with online ads based on their recent offline behavior when you continuously onboard offline data. Say someone purchased a product in your store yesterday. You can then target them with online ads to offer an upsell or a service subscription that complements their purchase.
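
The in-store purchase example above could be expressed as a simple audience filter. The following Python sketch assumes a hypothetical list of onboarded purchase records with `email`, `channel`, and `purchased_at` fields. None of these names come from Segment or any ad platform.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def recent_store_buyers(purchases: List[Dict[str, Any]], now: datetime,
                        window_days: int = 2) -> List[str]:
    """Return unique emails of customers who bought in a store within the window."""
    cutoff = now - timedelta(days=window_days)
    audience: List[str] = []
    for purchase in purchases:
        is_recent_store_sale = (
            purchase["channel"] == "store" and purchase["purchased_at"] >= cutoff
        )
        if is_recent_store_sale and purchase["email"] not in audience:
            audience.append(purchase["email"])
    return audience


now = datetime(2024, 1, 10, 12, 0)
purchases = [
    {"email": "ana@example.com", "channel": "store", "purchased_at": datetime(2024, 1, 9, 18, 0)},
    {"email": "ben@example.com", "channel": "web", "purchased_at": datetime(2024, 1, 10, 9, 0)},
    {"email": "cy@example.com", "channel": "store", "purchased_at": datetime(2024, 1, 1, 10, 0)},
]
audience = recent_store_buyers(purchases, now)  # yesterday's in-store buyer only
```

The resulting list is the kind of audience you might then pass to an advertising tool for an upsell campaign. How often this filter runs determines how "recent" the offline behavior really is.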

Increased ROI on marketing spend

Including offline information in the data you use for your marketing campaigns increases the ROI on your marketing spend. As you base your ads on more complete customer profiles, they'll likely be more relevant and lead to higher conversions. You can also avoid showing ads to people for whom they are not relevant at all, such as a customer who indicated they're not interested in a specific product category while signing up for your loyalty program.

Segment: the most powerful data onboarding tool on the market

You can now eliminate a car's blind spots with a simple technological innovation: a camera on the back of your vehicle that connects to your rearview mirror. Segment does the same for your company's offline blind spot.

Segment Connections connects to hundreds of sources out of the box, including ones like Stripe and Salesforce that make it easy to gather data coming in from your stores or sales teams. You can also connect custom sources for any offline data that requires a manual or tailored approach.

[Figure: An overview of the hundreds of sources Segment connects to.]

Segment can automatically do most of the identity resolution work required when you connect your offline data sources. And you can insert Segment into your existing processes and systems quickly and without much disruption.

Segment standardizes all the data you onboard through automated tracking plans as part of Segment Protocols and our Privacy Portal, so you ensure high-quality data and compliance with privacy regulations. It then uses the Personas feature to create or merge customer profiles from the imported information, giving you a single view of each customer.

Frequently asked questions

What are the benefits of data onboarding?

By adding data collected offline to customer profiles, marketers have a more accurate and complete view of their customers. This enrichment enables improved personalization, more retargeting options for advertising, and a higher ROI on advertising spend.

Can data onboarding be automated?

Depending on the sources you want to include, you can automate all or most of the data onboarding process. If it involves information that's already digital but previously siloed, such as in-store payment or loyalty card transactions, the process can be fully automated. If the data is in physical form, like business cards or paper customer surveys, some manual work is involved in getting the information ready for transfer into Segment.

What is the purpose of data onboarding?

The purpose of data onboarding is to bring offline data online to have a complete and accurate view of your customers. Such a view enables marketing campaigns, personalization, and other business decisions and processes that rely on customer information.

What is offline data onboarding?

Offline data onboarding connects and matches offline information with existing, online data, usually through a Customer Data Platform (CDP).
