What does it take to achieve good data at enterprise scale?

By Seth Familian

You’ve probably heard that having “high-quality” data is critical for enterprise success. It drives trustworthy analytics, reliable automations, and measurable business impact like revenue growth and customer retention. But what ensures good data—especially at scale?

As a Solutions Architect helping customers implement Segment, I’ve found that achieving high-quality data always boils down to three key ingredients: standardization, ownership, and agility.

In this post, you’ll learn why data is worth standardizing, two models of ownership for driving data standards at your company, and how to stay agile in the process.

Why standardize?

Let’s say your company runs a SaaS app on web, iOS, and Android. If you don’t pay attention to data standards, you run the risk of measuring the same events (like Signed In or Step Completed) with slightly different spellings, hyphenation, property names, and values on each platform:

[Table: the same events (such as Signed In and Step Completed) as tracked on website, iOS, and Android]

There’s a lot of inconsistent data in the table above:

  • Website and Android use spaces in event names, while iOS uses hyphens

  • Website and iOS use camelCased property names, while Android uses snake_case

  • Website uses lowercase property values, while iOS uses Title Case and Android uses Title Case or integers

As a result of these inconsistencies, you can’t accurately compare the same event across platforms. To fix this problem you need standardized data—which ensures that…

asset 42M9th3zqUEc57ow
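
To make the idea concrete, here is a minimal Python sketch of normalizing an event before it is sent. The convention it enforces (Title Case event names with spaces, snake_case property names, lowercase string values) is an assumption chosen for illustration, not a standard prescribed by Segment. The function names and the `stepName` / `step_number` properties are likewise hypothetical.

```python
import re


def normalize_event_name(name: str) -> str:
    """Convert 'signed-in', 'signed_in', or 'Signed in' to 'Signed In'."""
    words = re.split(r"[\s_\-]+", name.strip())
    return " ".join(word.capitalize() for word in words if word)


def normalize_property_name(key: str) -> str:
    """Convert 'stepName' or 'Step Name' to 'step_name'."""
    # Insert an underscore at each lower-to-upper camelCase boundary.
    key = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", key.strip())
    # Replace spaces and hyphens with underscores, then lowercase everything.
    return re.sub(r"[\s\-]+", "_", key).lower()


def normalize_event(name: str, properties: dict) -> tuple:
    """Return the event name and properties rewritten to the assumed convention."""
    clean = {}
    for key, value in properties.items():
        if isinstance(value, str):
            value = value.strip().lower()  # assumed rule: lowercase string values
        clean[normalize_property_name(key)] = value
    return normalize_event_name(name), clean


print(normalize_event("step-completed", {"stepName": "Profile Setup", "step_number": 2}))
# ('Step Completed', {'step_name': 'profile setup', 'step_number': 2})
```

Normalizing at the source like this is only a stopgap. The stronger fix is for every platform to emit the agreed names in the first place, which is what the rest of this post is about.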

While these issues can be automatically detected with Segment’s Protocols product, it’s still important that your organization stays focused on ensuring this consistency even during the data planning process. Doing so drives a number of benefits for yourself, your team, and your organization:

  • Data science and IT won’t waste hours or days performing “retroactive ETL” to normalize otherwise inconsistent property values.

  • Product, engineering, and BI will produce reports with greater clarity and consistency when exploring the data in analytics and dashboarding tools.

  • Marketing will build more accurate automations and audiences, which will lead to higher ROI and ROAS.

  • The C-Suite will view your product and performance metrics as trustworthy and reliable. And that trust will cascade down through all levels of the organization, erasing the suspicion that those great (or problematic) outcomes shown in reporting “must be due to bad data.”

As your standardized data gains trust throughout your organization, it’ll also become easier to onboard new brands and products onto your tracking framework. Ultimately, this paves the way for unified analytics across teams and business units. This shared framework will become a common language for employees across teams—whether in BI, marketing, product, finance, sales, or engineering—to more easily communicate and collaborate with one another. 

How to standardize?

So how do you achieve organizational data Zen? By standardizing ownership of your data framework through people and not just a data dictionary. Don’t get me wrong—data dictionaries and solid documentation are critical for driving successful adoption of any data framework. That’s why Segment encourages all of its customers to build a robust tracking plan. But having the right technology and people in place to advocate for that framework—and to enforce it—is what really makes all the difference in the world. 
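
As an illustration of what a tracking plan encodes, the following Python sketch represents a plan as a plain dictionary and checks events against it. The plan format, event names, property names, and the `find_violations` helper are all hypothetical. This is not Segment’s tracking plan schema or the Protocols API, only a sketch of the kind of check a written plan makes possible.

```python
# Hypothetical tracking plan: event name -> required properties and their types.
TRACKING_PLAN = {
    "Signed In": {"method": str},
    "Step Completed": {"step_name": str, "step_number": int},
}


def find_violations(event_name: str, properties: dict) -> list:
    """Return human-readable violations; an empty list means the event conforms."""
    spec = TRACKING_PLAN.get(event_name)
    if spec is None:
        return [f"unplanned event: {event_name!r}"]

    violations = []
    # Every planned property must be present and have the planned type.
    for prop, expected_type in spec.items():
        if prop not in properties:
            violations.append(f"missing property: {prop!r}")
        elif not isinstance(properties[prop], expected_type):
            actual = type(properties[prop]).__name__
            violations.append(
                f"{prop!r} should be {expected_type.__name__}, got {actual}"
            )
    # Properties that the plan does not mention are flagged as well.
    for prop in properties:
        if prop not in spec:
            violations.append(f"unplanned property: {prop!r}")
    return violations


print(find_violations("Step Completed", {"step_name": "intro", "step_number": "2"}))
print(find_violations("step-completed", {}))
```

The first call reports a type mismatch for `step_number`, and the second reports an unplanned event because of the hyphenated name. A check like this can find violations, but someone still has to decide what goes into the plan and follow up when it is broken.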

Two models of ownership: The Wrangler & The Champions


In our experience helping thousands of companies onboard to Segment, we’ve found that two basic models of ownership can each drive successful adoption of data standards across an organization. Neither of these frameworks is inherently “better” than the other, and their efficacy all depends on the nature of your organizational culture. So with that in mind, let’s explore each.

The Wrangler is the white hat standards sheriff in the wild west of your organization’s data management. This individual (usually there’s only one Wrangler) typically:

  • Owns the authorship of data standards, 

  • Instructs product, engineering, and marketing managers on those data standards, 

  • Oversees and approves the creation and revision of all tracking plans, 

  • Monitors the Segment workspace for violations, and 

  • Holds each team accountable for any data inconsistencies that might arise. 

The Wrangler is especially good for organizations that rely on a sole “Directly Responsible Individual” (DRI) to drive change management initiatives, or for organizations with strongly hierarchical models and reporting structures. Within these organizations, the Wrangler reinforces accountability to a unified, standardized model of data reporting. And while the Wrangler might often be seen as the data “Bad Cop,” they can be quite effective in their role as long as all data standards and violations monitoring flow through them. 

The Champions model fosters the development of a series of more enthusiastic and positive-minded Wranglers throughout the organization. As a result, this model helps address the one big downside to the Wrangler model: that standards and violations monitoring rests upon the shoulders of one person. In contrast, Champions act to collectively educate on and enforce data standards. This model is more useful for matrix organizational structures or “flatter” hierarchies which have many teams reporting up to a large executive team. 

Each functional group within the organization (such as product, marketing, sales, and finance) has its own “Champion” responsible for buying into the organization’s data standards and advocating for their team’s needs. As a result, their teammates are more likely to abide by the standards framework, since they know their voice is represented on the larger “council” of Champions. This council can also help collectively steer improvements to the company’s common schema and data standards, meeting periodically to review change requests. 

While the Champions model seems potentially idyllic, it’s a structure that only works for the most collaborative and interconnected organizations. Applying a Champions model to a more hierarchical company might result in slowdowns and frustration in efforts to build consensus. 

Embrace agility

Regardless of which ownership model you adopt, being agile and open to constant change is critical to your data governance and standardization strategy. The initial hypotheses posited by the first versions of your data standards might be disproven over time, and if they are, that’s okay! Here are some of the easiest ways to stay agile with your data standards development:

  • Periodically send a “data standard satisfaction survey” to all relevant stakeholders—from engineers and product managers to marketers and analysts—so you can take an organizational temperature check on the efficacy of the data standard. 

  • Conduct a quarterly data standard review either on your own (if you’re the Wrangler) or with all Champions to brainstorm and evaluate adjustments that will make your data increasingly useful and consistent. 

  • Consider the implications of changing the standard before introducing those changes, so you’ll avoid wasting engineering time on retroactive ETL or other potential headaches.
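
The last point, weighing the implications of a change before making it, can be partly mechanized. The Python sketch below compares two versions of a hypothetical tracking plan, represented here as a plain dictionary mapping event names to property types. It lists the changes most likely to break downstream reports: removed events, removed properties, and changed types. The `breaking_changes` function, the plan format, and the plan contents are assumptions for illustration, not a Segment schema or API.

```python
def breaking_changes(old_plan: dict, new_plan: dict) -> list:
    """List changes in new_plan that could break consumers of old_plan data."""
    changes = []
    for event, old_props in old_plan.items():
        if event not in new_plan:
            changes.append(f"event removed: {event}")
            continue
        new_props = new_plan[event]
        for prop, old_type in old_props.items():
            if prop not in new_props:
                changes.append(f"{event}: property removed: {prop}")
            elif new_props[prop] is not old_type:
                changes.append(
                    f"{event}: {prop} changed from "
                    f"{old_type.__name__} to {new_props[prop].__name__}"
                )
    return changes


# Hypothetical plan versions, for illustration only.
v1 = {
    "Step Completed": {"step_name": str, "step_number": int},
    "Signed In": {"method": str},
}
v2 = {
    "Step Completed": {"step_name": str, "step_number": str},
}

for change in breaking_changes(v1, v2):
    print(change)
# Step Completed: step_number changed from int to str
# Event removed: Signed In is reported as "event removed: Signed In"
```

Additions (new events or new optional properties) are deliberately ignored here, on the assumption that they rarely require retroactive ETL. A quarterly review could start from a list like this one.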

Ready for good data?

Here at Segment we’re always looking to deliver useful products, tooling, and processes to help customers standardize and optimize their data. Our infrastructure helps organizations of every size take a proactive approach to good data by helping them plan standards thoughtfully, monitor easily, and enforce effortlessly. That’s why we believe good data is Segment data. Ready to standardize your data with Segment? Reach out. We’re happy to discuss how we can help! 
