What Are They Talking About? Data & Analytics Part 1
Tech, localization, and global strategy - decoded.
Note: All of the images in today’s article feature works by artist Nathalie Miebach, who tells data stories through installations, woven sculptures, and musical scores/performances. I highly recommend checking out her work and seeing one of her exhibits in person (especially if you are in the Boston area)!
She’s Coming on Strong by Nathalie Miebach, 2011, paper, wood, data. Description of the work: This piece is a musical score that tracks the paths of both Hurricane Grace and the Halloween Storm, which together created the “Perfect Storm.”
Have you ever been in a meeting where someone says,
“Let’s pull the metrics from the last A/B test and check the funnel drop-off,”
and you’re sitting there thinking… “funnel what now?”
You’re not alone. Data vocabulary sounds so cool, and it’s also so intimidating! The first time I worked with a data scientist, I had to ask the most basic questions just to understand what he was talking about. Those questions later evolved into how I could apply what I learned to my own work and reporting.
Data and analytics terms describe an entire ecosystem of tools and workflows that power every decision in tech. Understanding how data is captured, recorded, and interpreted (and what kinds of data people are actually talking about) is essential if you want to follow the “data stories” that drive decisions.
But just as importantly, it helps YOU create and tell your own data stories: how your work moves the needle, what impact you’re making, and why it matters.
So in this edition of What Are They Talking About?, I’m breaking down the key Data & Analytics vocabulary you’ll hear in product, engineering, and leadership discussions. My goal is to help you understand these terms, so you can work to pull more data into your own work and reporting.
Metric vs. KPI
Let’s start with two words you’ll hear constantly in tech: metrics and KPIs.
At first, they might sound like the same thing - both involve numbers, right? But there’s an important difference:
A metric is any measurable number that helps you understand what’s happening.
A KPI (Key Performance Indicator) is a specific metric you’ve chosen to track progress toward a goal.
Think of it like this:
All KPIs are metrics, but not all metrics are KPIs.
→ Example:
“Number of words translated this week” is a metric, because it tells you what’s happening.
“Reduce average time to localization by 20% this quarter” is a KPI, because it’s a goal-driven metric that shows success or progress toward an outcome.
Metrics help you observe what’s happening.
KPIs help you focus on what matters most, as you work to achieve a specific goal.
You can have hundreds of metrics across your systems, but your KPIs are the handful you highlight in reports, dashboards, or presentations. These are the metrics that tell the story of whether your team is succeeding.
Tip: When you hear someone say “What’s the KPI for this?” they’re really asking, “How will we know this is working?”
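To make the difference concrete, here is a minimal Python sketch. The numbers and names (baseline_days, current_days, target_reduction) are made up for illustration. The idea is simply that the metric is the number you observe, and the KPI is that same number checked against a goal.

```python
# Hypothetical numbers, for illustration only.
baseline_days = 10.0      # average time to localization last quarter
current_days = 8.5        # average time to localization so far this quarter (the metric)
target_reduction = 0.20   # the KPI goal: reduce the average by 20%


def reduction(baseline: float, current: float) -> float:
    """Return the fraction by which the metric dropped relative to the baseline."""
    return (baseline - current) / baseline


achieved = reduction(baseline_days, current_days)
kpi_met = achieved >= target_reduction

print(f"Metric: {current_days} days on average")
print(f"KPI: reduced by {achieved:.0%} against a {target_reduction:.0%} target (met: {kpi_met})")
```

With these sample numbers, the metric has improved, but the KPI has not been met yet. That is exactly the kind of story a KPI is meant to tell.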
How are metrics tracked?
Now, here’s where it gets tricky (and really important for localization and international work): how data is broken out and tracked.
In many companies, data teams track language and country data differently — and that can change how metrics are interpreted.
For example:
One team might measure user activity by language used (like English, Spanish, or Japanese).
Another might group users by geo-location (US, Spain, Japan).
And another might only track key markets separately (say, US, France, Japan, Brazil) and lump everything else into “ROW” — Rest of World.
These differences matter. If one team’s “Japanese users” means “users in Japan,” and another’s means “users using Japanese,” your numbers may not line up.
That’s why it’s so important to ask how data is tracked and defined, and when possible, influence those definitions so that reporting is consistent across teams. This ensures that your metrics (and especially your KPIs) are comparing apples to apples, not apples to kiwis and dragon fruit.
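Here is a small Python sketch of how the same set of users can produce different numbers depending on the grouping. The user records, field names (ui_language, country), and the KEY_MARKETS list are all hypothetical, so treat this as an illustration rather than how any particular data team does it.

```python
from collections import Counter

# Hypothetical user records: each has a UI language and a country.
users = [
    {"id": 1, "ui_language": "ja", "country": "JP"},
    {"id": 2, "ui_language": "ja", "country": "US"},
    {"id": 3, "ui_language": "ja", "country": "BR"},
    {"id": 4, "ui_language": "en", "country": "JP"},
    {"id": 5, "ui_language": "es", "country": "ES"},
    {"id": 6, "ui_language": "es", "country": "MX"},
]

# Assumption: only these markets are tracked individually.
KEY_MARKETS = {"US", "JP"}

# Three different ways of grouping the same six users.
by_language = Counter(u["ui_language"] for u in users)
by_country = Counter(u["country"] for u in users)
by_market = Counter(
    u["country"] if u["country"] in KEY_MARKETS else "ROW" for u in users
)

print("Users using Japanese:", by_language["ja"])
print("Users in Japan:", by_country["JP"])
print("Users lumped into ROW:", by_market["ROW"])
```

In this sample, “Japanese users” is 3 if you count by language and 2 if you count by country, and half of the users disappear into ROW. Same data, three different stories.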
Questions to ask your data or analytics partners
Here are some practical questions that can help you dig deeper and align on data definitions:
How are users grouped: by language, locale, or country?
Do we have separate fields for UI language and user location, or are they merged?
Which markets are tracked individually, and which are grouped under ROW (Rest of World)?
Are we tracking by browser language, device settings, or account preferences?
If a user changes their language mid-session, how is that recorded?
Are there any known gaps or inconsistencies in how international data is collected?
For KPIs involving localization or content usage, how can we ensure we’re comparing the same dimensions across teams?
Even if you’re not the one pulling the data, these kinds of questions show that you understand how metrics are shaped, AND they help ensure that the story your data tells is accurate, consistent, and actionable. Asking them also positions you as a reliable partner who brings your expertise to meet theirs, so that you can work together!
Comfort Zones II by Nathalie Miebach, 2022, paper, string, data. Description of the work: This piece explores the temperature range outdoors that was comfortable to (the artist) while having to wear a mask. Using data from Boston, this piece translates data from May ’21 - Jan ’22: cloud cover, temperature, vaccination status of every state as of February 22, monthly weather anomalies, Covid-19 variants, and US Covid-19 infection and death rates.
Event Tracking
An event is any action a user takes in your product, like clicking a button, completing a form, or opening a page.
Teams use event tracking to understand user behavior.
→ Example: “We added event tracking for language switch usage to see how many users actually change their app language.”
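If you are curious what an event actually looks like under the hood, here is a rough Python sketch. The event name (language_switched), the property names, and the build_event helper are all hypothetical. Real analytics tools each have their own format, but most events boil down to something like this: what happened, who did it, when, and a few extra details.

```python
import json
from datetime import datetime, timezone


def build_event(name: str, user_id: str, properties: dict) -> dict:
    """Build a single event record with a UTC timestamp (illustrative shape only)."""
    return {
        "event": name,
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "properties": properties,
    }


# A user switches the app language from English to Japanese in settings.
event = build_event(
    "language_switched",
    user_id="user-123",
    properties={"from_language": "en", "to_language": "ja", "surface": "settings"},
)

print(json.dumps(event, indent=2))
```

Notice that the properties decide what you can analyze later. If nobody records from_language, nobody can answer “which language are people switching away from?”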
Data Pipeline
A pipeline is the system that moves data from where it’s generated (like your app or platform) to where it’s stored and analyzed (like a database or dashboard).
It often includes three steps: extract, transform, and load (ETL, more on this below).
→ Example: “Our localization metrics pipeline pulls data from multiple systems and normalizes it so we can track turnaround time across vendors.”
ETL (Extract, Transform, Load)
This is the backbone of most analytics systems:
Extract: Pull data from different sources
Transform: Clean or standardize it
Load: Move it into a database or data warehouse
→ Example: “We run ETL jobs nightly to update translation cost data in the dashboard.”
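The three steps can be sketched in a few lines of Python. This is a toy version under stated assumptions: the vendor rows, the translation_costs table, and the column names are invented, and an in-memory SQLite database stands in for a real database or warehouse.

```python
import sqlite3


def extract() -> list:
    """Extract: pull raw rows from source systems (hard-coded, hypothetical data)."""
    return [
        {"vendor": "Vendor A", "lang": "JA", "words": "1200", "cost_usd": "144.00"},
        {"vendor": "vendor b", "lang": "es ", "words": "800", "cost_usd": "64.00"},
    ]


def transform(rows: list) -> list:
    """Transform: standardize names and language codes, compute cost per word."""
    cleaned = []
    for row in rows:
        words = int(row["words"])
        cost = float(row["cost_usd"])
        cleaned.append(
            (
                row["vendor"].strip().title(),
                row["lang"].strip().lower(),
                words,
                round(cost / words, 4),
            )
        )
    return cleaned


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Load: write the cleaned rows into a table that a dashboard could query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS translation_costs "
        "(vendor TEXT, language TEXT, words INTEGER, cost_per_word REAL)"
    )
    conn.executemany("INSERT INTO translation_costs VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

rows = conn.execute(
    "SELECT vendor, language, cost_per_word FROM translation_costs ORDER BY vendor"
).fetchall()
print(rows)
```

The transform step is where definitions get baked in. Here it decides that “JA” and “es ” become “ja” and “es”, which is the same kind of decision that shapes how language and country show up in your reports.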
Data Warehouse
A data warehouse stores structured data (meaning it’s been cleaned, organized, and formatted into tables with defined columns - like in a spreadsheet or database).
It’s optimized for analysis and reporting.
→ Example:
Your localization dashboard might pull from a data warehouse that organizes metrics like:
language
country
time_to_translation
cost_per_word
Analysts and PMs love warehouses because the data is tidy and easy to query, which is perfect for dashboards and business decisions.
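To show what “easy to query” means, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for a warehouse. The table name (localization_metrics), the sample rows, and the assumption that time_to_translation is measured in days are all mine. Only the four column names come from the example above.

```python
import sqlite3

# In-memory SQLite stands in for a real data warehouse here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE localization_metrics (
        language TEXT,
        country TEXT,
        time_to_translation REAL,  -- assumed to be in days
        cost_per_word REAL
    )
    """
)

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO localization_metrics VALUES (?, ?, ?, ?)",
    [
        ("ja", "JP", 3.0, 0.14),
        ("ja", "US", 5.0, 0.14),
        ("es", "ES", 2.0, 0.08),
        ("es", "MX", 4.0, 0.10),
    ],
)

# Because the data is already in tidy columns, one short query answers the question.
avg_by_language = conn.execute(
    """
    SELECT language, AVG(time_to_translation)
    FROM localization_metrics
    GROUP BY language
    ORDER BY language
    """
).fetchall()

print(avg_by_language)
```

Swap GROUP BY language for GROUP BY country and you get a different breakdown from the same table, which is why having separate language and country columns matters.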
Data Lake
A data lake stores raw, unstructured, or semi-structured data (logs, images, JSON files, even text dumps). It’s a big, flexible “pool” where all kinds of data can flow in before being cleaned or transformed.
It’s optimized for storage and exploration, not quick analysis.
→ Example:
Imagine you have millions of raw translation job logs, MT output files, and QA comments. You might store all of that in a data lake, where data scientists can later process it to find patterns (like where MT quality drops or certain errors repeat).
How Data Lakes + Warehouses Work Together
In many companies, the data flow looks like this:
Data Lake → ETL (Extract, Transform, Load) → Data Warehouse → Dashboard
The data lake collects everything in its raw form.
ETL jobs clean and structure that data.
The data warehouse stores the cleaned version.
The dashboard visualizes the KPIs you care about.
You can think of the data lake as a giant filing cabinet full of everything, and the data warehouse as the organized report binder that people actually use to make decisions.
Dashboard
A dashboard is a visual display of key metrics and KPIs, often updated in real time.
Think of it as a control center for tracking progress.
→ Example: “Our localization dashboard shows time-to-translation, cost per word, and vendor SLA compliance at a glance.”
Cohort
A cohort is a group of users who share something in common, like the month they signed up or the market they’re from. Analyzing cohorts helps identify trends over time.
→ Example: “We saw higher retention in cohorts where the app launched with localized onboarding flows.”
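Here is a minimal Python sketch of cohort analysis. The user records and field names (signup_month, localized_onboarding, retained) are hypothetical, and the retention_by_cohort helper is just one simple way to do the grouping. The same users get grouped two different ways, and a retention rate is computed for each cohort.

```python
from collections import defaultdict

# Hypothetical users: signup month, whether onboarding was localized,
# and whether the user was still active later.
users = [
    {"signup_month": "2024-01", "localized_onboarding": True, "retained": True},
    {"signup_month": "2024-01", "localized_onboarding": True, "retained": True},
    {"signup_month": "2024-01", "localized_onboarding": False, "retained": False},
    {"signup_month": "2024-01", "localized_onboarding": False, "retained": True},
    {"signup_month": "2024-02", "localized_onboarding": True, "retained": True},
    {"signup_month": "2024-02", "localized_onboarding": True, "retained": False},
    {"signup_month": "2024-02", "localized_onboarding": False, "retained": False},
    {"signup_month": "2024-02", "localized_onboarding": False, "retained": True},
]


def retention_by_cohort(records: list, key: str) -> dict:
    """Group records by the given field and return the retained share per cohort."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for record in records:
        cohort = record[key]
        totals[cohort] += 1
        retained[cohort] += int(record["retained"])
    return {cohort: retained[cohort] / totals[cohort] for cohort in totals}


by_month = retention_by_cohort(users, "signup_month")
by_onboarding = retention_by_cohort(users, "localized_onboarding")

print("Retention by signup month:", by_month)
print("Retention by localized onboarding:", by_onboarding)
```

The key argument is the cohort definition. Change it, and the same users tell a different story, which is why it is worth asking how cohorts are defined before reading too much into the numbers.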
Understanding How Cohorts Are Defined
From a localization perspective, it’s extremely important to understand how cohorts are defined so that you can better understand the data (or influence the way it is collected and stored). These questions help you uncover what data is grouped together and whether it aligns with how you think about markets and languages.
How are cohorts currently defined: by country, language, locale, or market segment?
Are language and country tracked separately, or are they tied together (e.g., “Spanish users in Spain” vs. “Spanish users anywhere”)?
When users switch languages or travel, do they stay in the same cohort or move to a new one?
Are there custom cohorts for key markets (like Japan, Germany, or Brazil), or is the rest of the world grouped together as ROW (“Rest of World”)?
How do we define “new users” vs. “returning users” within each locale?
Exploring Behavior & Engagement by Locale
Once you understand how cohorts are defined, here are some questions you can ask to dig deeper and better understand where/how you can use cohort data.
Note: I defined these questions to help you see how localization (or missing localization) affects engagement, but you can use these as inspiration to look into other metrics as well.
Can we look at cohort retention or churn by locale or country?
Do certain cohorts (e.g., users in non-English locales) have lower engagement after onboarding?
Are there differences in conversion rates or feature adoption between localized and non-localized experiences?
Can we see whether localized content correlates with higher repeat usage or subscription renewals?
How do cohorts behave after a new market launch or after we add localization for that language?
Solar Beginnings of Everything that Changes by Nathalie Miebach, 2008, reed, wood, data. Description of the work: This piece translates data related to recorded ecological changes observed at Herring Cove Beach (Cape Cod) during October ’07 - March ’08.
What’s Next? And subscribe for Deep Dives + *even more* focused resources
That’s what I’ve got for this first installment. I hope you walk away from this post feeling more confident in your technical vocabulary! You don’t need to be a data scientist to understand analytics or contribute to it. But knowing the words lets you ask better questions, plan more effectively, and show up with confidence in technical spaces.
If you enjoyed this one, stay tuned for Part 2 (which will release in a few weeks), where I’ll dive into the next layer: funnels, A/B testing, data visualization, and data governance.
And for paid subscribers, next week’s Deep Dive, “Data Pipelines for Humans: How Information Flows Shape What We See,” will take this a step further. I’ll walk through how data actually moves from product events to dashboards, and why that journey shapes what we see (and don’t see) in our reports. I’m including a one-page visual diagram, a simple template to map your own team’s data flow, and a few practical prompts to spark better conversations with your data partners.
If that sounds useful, subscribe to get it straight to your inbox! Thanks for being here. I really love putting these together.




