What is i18n? How Can I Be a Resource for Engineers I Work With?
Demystifying the Tech Side of Product Globalization
If you're working in localization, you've probably come across the term i18n (short for internationalization) but you might not have had the chance to learn what it actually means, why it’s important, and how you can help your organization do it better.
This post is your quick-start guide to i18n: what it is, why it matters, which tools developers can lean on (like Unicode libraries), and how localization professionals can play a vital role in improving i18n outcomes across your product.
Alsoooooo, two bonus learning opps: 1) If you are attending LocWorld Malmo this year, I will be giving a talk on How Localization Teams Can Use Unicode to Solve i18n Issues. I would love to see you there! 2) My June Deep Dive will expand this post into a full guide on how localization teams can implement strategies to improve i18n and be the go-to resource for their teams (I’ll be including a bunch of resources in the Deep Dive). Subscribe for access: it’s the price of buying me a coffee a month and helps me justify spending Saturdays writing these posts.
First, what the heck is i18n?
For starters… we have to talk about this strange naming system. Whoever started globalization was really into numeronyms (number-based abbreviations). That just means you keep the first and last letters of a word and replace everything in between with a count of the letters you dropped: "internationalization" has 18 letters between the first i and the last n, hence i18n. You can tell we are a really fun bunch, the life of the party.
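If it helps to see the naming rule spelled out, here is a tiny sketch in TypeScript. The function name `numeronym` is my own for illustration, not something from a standard library.

```typescript
// Build a numeronym: first letter + number of letters in between + last letter.
function numeronym(word: string): string {
  if (word.length < 3) return word; // too short to abbreviate
  const lettersInBetween = word.length - 2;
  return word.charAt(0) + String(lettersInBetween) + word.charAt(word.length - 1);
}

console.log(numeronym("internationalization")); // i18n
console.log(numeronym("localization"));         // l10n
console.log(numeronym("translation"));          // t9n
console.log(numeronym("globalization"));        // g11n
```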
Internationalization = The process of making an application work internationally.
Internationalization (i18n) comes first and is the groundwork that enables all the subsequent steps. I18n allows the codebase to support multiple languages (Unicode encoding whenever possible, setting up locale frameworks, extracting strings for translation) and also enables the code to support regional preferences (date/time, number formats, currency formats, address fields, etc.). Think of it this way: i18n sets the code up so that localization can happen.
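To make "extracting strings" a bit more concrete, here is a minimal sketch of what it can look like once text lives outside the code. Everything in it is hypothetical (the `catalogs` object, the keys, and the `t` helper); real projects usually rely on an established i18n framework and per-locale resource files rather than a hand-rolled lookup.

```typescript
// Hypothetical message catalogs. In a real project these usually live in
// per-locale resource files that translators work on.
type Catalog = Record<string, string>;

const catalogs: Record<string, Catalog> = {
  en: { "cart.checkout": "Check out", "cart.empty": "Your cart is empty" },
  de: { "cart.checkout": "Zur Kasse" }, // "cart.empty" not translated yet
};

// Look up a string by key, trying e.g. "de-AT", then "de", then the fallback locale.
function t(locale: string, key: string, fallbackLocale: string = "en"): string {
  const dash = locale.indexOf("-");
  const language = dash === -1 ? locale : locale.slice(0, dash);
  for (const candidate of [locale, language, fallbackLocale]) {
    const message = catalogs[candidate]?.[key];
    if (message !== undefined) return message;
  }
  return key; // last resort: show the key so the gap is visible in testing
}

console.log(t("de-AT", "cart.checkout")); // Zur Kasse
console.log(t("de-AT", "cart.empty"));    // Your cart is empty
```

The point is not the helper itself but the shape: the code asks for a key, and the language-specific text comes from data that can grow without touching the code.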
Once i18n is done, the next step is l10n, which is the process of adapting an application to a particular local market.
Localization: Translation, Language style (tone/voice), UX/UI front end customizations, content adaptation, Sound (music, includes voiceover for video, subtitles), Graphics/symbols/icons/colors, etc
Of course we all know that many companies don’t know about i18n and they start with translation (not even a full localization), but soon they run into i18n issues and they don’t always know how to fix them. OR they build custom solutions not knowing that standardized solutions exist that they could make use of.
Visual of what falls under the umbrella of Globalization -> Internationalization (or i18n), then Localization (or l10n) and finally translation (or t9n).
When done right, i18n means:
Your app could support any language, including right-to-left scripts like Arabic or Hebrew.
Text expands or contracts without breaking layouts.
Fonts, encodings, sorting orders, and number/date formats are all handled correctly.
You don’t hardcode anything that varies by language or region (text, images, even logic sometimes).
Done wrong? You'll face broken layouts, incorrect character display, garbled text, and costly re-engineering work. That’s why i18n matters, and why localization professionals can and should be active contributors to getting it right.
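Here is a small sketch of what "handled correctly" can look like for numbers and dates. It uses the `Intl` API that is built into JavaScript and TypeScript runtimes (in most engines it draws on ICU and CLDR data under the hood). I am assuming a runtime that ships full locale data, which recent versions of Node.js and modern browsers do by default.

```typescript
const amount = 1234567.89;

// Same number, three locale conventions for grouping and decimals.
const numberUS = new Intl.NumberFormat("en-US").format(amount); // "1,234,567.89"
const numberDE = new Intl.NumberFormat("de-DE").format(amount); // "1.234.567,89"
const numberIN = new Intl.NumberFormat("en-IN").format(amount); // "12,34,567.89"

// Same date, two orderings: is 3/4 March 4th or April 3rd?
const launch = new Date(Date.UTC(2025, 2, 4)); // 4 March 2025 (months are zero-based)
const dateUS = new Intl.DateTimeFormat("en-US", { timeZone: "UTC" }).format(launch); // "3/4/2025"
const dateGB = new Intl.DateTimeFormat("en-GB", { timeZone: "UTC" }).format(launch); // "04/03/2025"

console.log(numberUS, numberDE, numberIN);
console.log(dateUS, dateGB);
```

Notice that nothing in the code knows how Germans or Indians group digits. The locale is passed in as data, and the formatting rules come from the library.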
Unicode: The Bedrock of Modern i18n
Unicode is the global standard for encoding and handling text. It ensures that when you type “こんにちは” or “مرحبا” or “🌍,” those characters appear correctly, are stored properly, and can be transmitted and rendered across systems consistently. Before Unicode, companies like Apple and Adobe were figuring this out on their own, creating their own systems and logic to expand their tech products to global markets. “Uni” (one) “code” is a global standard that ensures an Arabic character is encoded the same way on an Apple phone as it is on an Android phone. So you can think of it as a set of encoding standards that everyone should adhere to so that their products can interact with the many other products around the world.
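If you are curious what "encoded the same way" means in practice, this sketch prints the Unicode code points behind a few strings and shows two classic gotchas engineers run into. The helper name `codePoints` is my own; the string methods it uses are standard JavaScript.

```typescript
// Format each Unicode code point in a string as "U+XXXX".
function codePoints(text: string): string[] {
  return Array.from(text).map((ch) => {
    const hex = (ch.codePointAt(0) ?? 0).toString(16).toUpperCase();
    return "U+" + "0".repeat(Math.max(0, 4 - hex.length)) + hex;
  });
}

console.log(codePoints("こんにちは")); // ["U+3053", "U+3093", "U+306B", "U+3061", "U+306F"]
console.log(codePoints("🌍"));         // ["U+1F30D"]

// Gotcha 1: .length counts UTF-16 code units, not characters.
console.log("🌍".length);             // 2
console.log(Array.from("🌍").length); // 1

// Gotcha 2: the same visible "é" can be stored two ways.
// Normalizing both sides makes them comparable.
const precomposed: string = "\u00E9"; // é as a single code point
const decomposed: string = "e\u0301"; // e + combining acute accent
console.log(precomposed === decomposed);                                   // false
console.log(precomposed.normalize("NFC") === decomposed.normalize("NFC")); // true
```

Character limits, search, and duplicate detection are common places where these two gotchas show up as bugs.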
For most people, if they’ve heard of Unicode, that is the extent of their knowledge. Kind of like how people assume “localization” is just “translation.” But Unicode is much more than encoding standards and character sets. It also comes with a suite of libraries and tools to help engineers handle complex i18n requirements.
There are MANY Unicode resources and libraries that you can and should learn about, and I cannot possibly cover all of them today. But to get you started, here are some key resources to know about and to share with your engineering teams.
Key Unicode Libraries
There are three main ICU libraries (ICU = International Components for Unicode). Pause. I can hear you asking “time out, what’s a component?” A component is a self-contained, reusable piece of code that represents a specific part of a user interface (UI) or application functionality. For front end development (front end = what you see in an app or on a page), a component would be a piece of the UI, like a button, a form, a header, or even an entire page. For backend (the parts of an app that you don’t see), a component could be any modular part of a system, like a service, a database access layer, or an API module. The important thing to know is that building with components is a best practice: they enable reusability (write once, use many times) and maintainability (each part can be updated or fixed without touching everything else), and they keep concerns separate, which keeps code cleaner and easier to work with.
Ok, back to Unicode. Each of the three ICU libraries targets a different common programming language, and each can be adapted to work in other languages as well. The components you get from these libraries (remember, they are like modules, so you can use them all or just a few) help solve common internationalization issues.
ICU4C : ICU for C and C++
ICU4J : ICU for Java
ICU4X : ICU for Rust (and GitHub repository)
Unicode is used by ALL the large tech companies (Google, Netflix, Microsoft, Apple, Meta, etc.), and all of those companies (plus many medium and smaller sized tech companies) contribute to these libraries, which are always being adapted to meet current needs. Billions of people use Meta apps alone, for example, so key issues that users experience around the world are surfaced to Unicode through Meta’s contributions and then handled through Unicode components. Add the contributions of other Unicode members like Netflix, Microsoft, and Apple, and you can see how Unicode is able to identify and handle most of the i18n issues users experience around the world.
It’s like a huge knowledge share with the goal of making the international experience for users flawless!
So how do you know which libraries to use when? I put together the following chart to give you an overview of the three libraries (which doubles as a cheat sheet for use cases):
Bonus: Using ICU4X for JavaScript and TypeScript
These libraries handle critical i18n functions like:
Locale-aware string comparison and sorting.
Plural rules and gendered message formatting.
Date, time, number, and currency formatting.
Script and character handling (e.g., normalization, bidirectional text).
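To give a feel for two of these functions, here is a short sketch of locale-aware sorting and plural rules. It uses JavaScript's built-in `Intl` API rather than ICU4C, ICU4J, or ICU4X directly, because it runs without installing anything; the behavior comes from the same kind of CLDR data. The helper name `pluralCategory` is my own, and I am assuming a runtime with full locale data.

```typescript
// Locale-aware sorting: "ä" sorts next to "a" in German but after "z" in Swedish.
const words = ["zebra", "äpple", "apple"];
const sortedDE = words.slice().sort(new Intl.Collator("de").compare);
const sortedSV = words.slice().sort(new Intl.Collator("sv").compare);

console.log(sortedDE); // ["apple", "äpple", "zebra"]
console.log(sortedSV); // ["apple", "zebra", "äpple"]

// Plural categories: English has two, while Russian and Arabic have more.
function pluralCategory(locale: string, count: number): string {
  return new Intl.PluralRules(locale).select(count);
}

console.log(pluralCategory("en", 1), pluralCategory("en", 5));                           // one other
console.log(pluralCategory("ru", 1), pluralCategory("ru", 3), pluralCategory("ru", 5));  // one few many
console.log(pluralCategory("ar", 0), pluralCategory("ar", 2), pluralCategory("ar", 11)); // zero two many
```

This is why a string like "1 item(s)" cannot simply be translated: some languages need three, four, or even six different forms of the same message.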
Lesser-Known Gems
In addition to the above, here are some other resources + libraries you need to know about and elevate to your team:
CLDR (Common Locale Data Repository): The world’s largest repository of locale-specific data, powering ICU and used by most modern systems.
Unicode Technical Reports: These define standards for things like line breaking, emoji grouping, and more. They may not be “libraries,” but understanding them can help you understand why something behaves the way it does (and craft solutions).
I highly suggest checking out this specific Unicode technical report doc on line breaking for Asian languages and sharing it with your engineering teams!
BudouX is a lightweight, machine-learned line-breaking library that helps fix one of the thorniest issues in multilingual UI: automatic line breaks for languages without whitespace between words, like Japanese, Simplified Chinese, Traditional Chinese, and Thai. It’s available in Python, JavaScript, and Java, so it can be used on both the front end and the backend.
BudouX is a really cool solution because it predicts natural breakpoints in sentences using an ML model, improving text readability in narrow UI containers (e.g., mobile cards, buttons). It's a great example of how a specialized tool can solve a language-specific UX problem that Unicode libraries don’t directly address.
Ok cool, but how can Localization Professionals use this knowledge to improve i18n?
Even though i18n is a developer-focused domain, localization professionals are uniquely positioned to help spot issues early and scale knowledge across teams.
Here are 5 ways you can be a resource (in June’s Deep Dive for paid subscribers I will delve into all of these in more detail!):
Shared vision: Ok, if you followed my Change Management series on LinkedIn, you know that the first step to creating organizational change is a shared understanding of the problem. With this in mind, the strategy I find MOST HELPFUL with engineering teams is to have an i18n Bug Bash. What’s a bug bash? Here’s a guide showing how you can set one up and run it. In June’s Deep Dive I will go into this in more detail (subscribe to paid to get access, it’s the cost of buying me a coffee a month!). ALSO, if you run a bug bash, make sure to reward participants with snacks, because people show up for snacks :)
Partner with QA: Ideally you want to know what the key i18n issues are that your end users are experiencing. The best team to partner with on this is QA; however, if you have an LQA flow with a language vendor, you can work with them as well.
The goal is to document i18n bugs during testing (e.g., layout issues, encoding errors, incorrect plural forms). The way I have done this in the past is to have testers assign a label (or some other form of metadata) to the Jira tickets or issues. These labels define which type of i18n issue is coming up in testing (e.g., date/time/number/currency issue, text expansion, line break, right-to-left issue, etc.).
Once you start collecting this data, track it over time. Quantifying these issues helps build the case for prioritizing i18n. You can also make a clear argument that they create a poor end user experience. Depending on the type and prevalence of the issue, it could be a major problem for your end users.
Encourage teams to adopt static analysis tools that catch hardcoded strings or missing locale support at build time.
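Both halves of this (counting labeled bugs, and catching hardcoded text before it ships) are easier to pitch when people can picture them, so here is a rough sketch of each. Everything in it is made up for illustration: the ticket shape, the "i18n-" label names, and the sample markup are assumptions, and the regex check is deliberately naive. A real team would pull tickets from their tracker and use a proper lint rule rather than this.

```typescript
// --- Part 1: tally i18n labels per month from exported tickets ---
// Hypothetical ticket shape and label names, for illustration only.
interface Ticket {
  id: string;
  month: string; // e.g. "2025-03"
  labels: string[];
}

function tallyI18nLabels(items: Ticket[], prefix: string = "i18n-"): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ticket of items) {
    for (const label of ticket.labels) {
      if (label.indexOf(prefix) !== 0) continue; // skip non-i18n labels
      const key = ticket.month + " " + label;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}

const sampleTickets: Ticket[] = [
  { id: "APP-101", month: "2025-03", labels: ["i18n-text-expansion", "ui"] },
  { id: "APP-102", month: "2025-03", labels: ["i18n-text-expansion"] },
  { id: "APP-107", month: "2025-03", labels: ["i18n-date-format"] },
  { id: "APP-115", month: "2025-04", labels: ["i18n-rtl", "i18n-text-expansion"] },
];
const labelCounts = tallyI18nLabels(sampleTickets);
console.log(labelCounts.get("2025-03 i18n-text-expansion")); // 2

// --- Part 2: naive check for literal text sitting between markup tags ---
// Only catches Latin-letter text on a single line; a real linter goes much further.
const literalBetweenTags = />\s*([A-Za-z][^<>{}]*)</;

function findHardcodedText(source: string): { line: number; text: string }[] {
  const findings: { line: number; text: string }[] = [];
  source.split("\n").forEach((content, index) => {
    const match = literalBetweenTags.exec(content);
    if (match) findings.push({ line: index + 1, text: (match[1] ?? "").trim() });
  });
  return findings;
}

const sampleMarkup = [
  "<button>Save changes</button>",           // hardcoded: should be flagged
  '<button>{t("settings.save")}</button>',   // externalized: should pass
].join("\n");
console.log(findHardcodedText(sampleMarkup)); // one finding, on line 1
```

Even a rough count like "text expansion bugs doubled this quarter" tends to land better with engineering leads than a general request to "do more i18n."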
Create an i18n Education Deck
Build a slide deck or Notion doc explaining key i18n principles and Unicode tools.
Share real examples of i18n failures to show why it matters (possibly from the bug bash).
Find the right places for these educational resources to live (engineering onboarding, Localization FAQs, pinned resources in Slack channels, etc.).
Get familiar with these tools
Familiarize yourself with ICU, CLDR, and BudouX. You don’t need to code, but you should understand what these tools do and how they fit into the tech stack.
Create a Unicode cheat sheet summarizing what each library handles (or use the table I created above, I did the heavy lifting for you!).
Be a Bridge
Get in front of the right people to position yourself as a resource! Join product/engineering Slack channels and meetings where UI or infrastructure is discussed. Flag potential i18n risks before code gets written, shout out resources they can use to avoid i18n issues, and present solutions to UI issues.
If you are able to, translate localization challenges into engineering terms (e.g., “this needs to support bidirectional text” → “use logical vs. visual ordering”). Don’t worry, as your knowledge grows this will get easier.
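As one example of what that translation into engineering terms can lead to, here is a sketch of a common bidirectional-text technique: text is stored in logical (typing) order, and any value inserted into a sentence is wrapped in Unicode directional isolate characters so it cannot scramble the text around it. The helper name `isolate`, the sample name, and the sample sentence are my own for illustration; the two control characters come from the Unicode bidirectional algorithm.

```typescript
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Wrap an inserted value so its direction cannot reorder the surrounding text.
function isolate(value: string): string {
  return FSI + value + PDI;
}

// "Dana" in Hebrew, written with escapes so the logical order is unambiguous:
// dalet (U+05D3), nun (U+05E0), he (U+05D4).
const userName = "\u05D3\u05E0\u05D4";

// In a real app the sentence itself would come from a message catalog.
const message = "Shared by " + isolate(userName) + " (3 files)";

// Logical order: the first stored character is the first one typed (dalet),
// even though a renderer draws it at the right edge of the name.
console.log(Array.from(userName)[0] === "\u05D3"); // true
console.log(message.length);                        // 25
```

You do not need to write this yourself. Knowing that the technique exists is enough to ask "are we isolating user-generated text in RTL layouts?" in a design review.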
Closing Thought
i18n isn’t just an engineering problem, it’s a product quality issue that touches UX, accessibility, scalability, and even revenue. The more you understand how it works, the better equipped you’ll be to help your teams ship global-ready experiences.
And with Unicode and tools like BudouX in your toolkit, plus the right partnerships across QA and engineering, you can be the go-to localization pro who helps bridge the gap between language and code!
What’s Next? And subscribe for Deep Dives + more focused resources
Hope this was a useful one, I had a lot of fun putting it together! In the next two posts, I’ll go more in depth on two areas where I have grown my technical skills and been able to apply them in meaningful ways in my roles:
Next week I will release the April Deep Dive for paid subscribers: Let’s Talk Localization Infrastructure: What Do I Need to Know? This Deep Dive will be a full on nerdy lesson summarizing what I’ve learned from 2 paid courses I took on infrastructure and sharing it in a practical way related to Localization specifically (i.e. how I’ve applied it and how you can too).
The first week of May: How do I position my engineering asks? How do I build a case for what I need that will be successful?
At the end of this month, I’ll release the deep dive I mentioned above for paid subscribers: “Let’s Talk Localization Infrastructure: What Do I Need to Know?” And my next free post will be on positioning engineering asks (which I know is a tough one!). Subscribing is the cost of grabbing me a coffee, and it helps justify me spending a few Saturday afternoons creating these structured lessons.
What did you think? Let me know in the comments and/or connect with me on LinkedIn!