Migrate from IAB Content Taxonomy 2.x to 3.0 (Free Open-Source Mapper + CLI Tool)

Migrate from IAB Content Taxonomy 2.x to 3.0 in minutes with this free open-source mapper. Runs locally, offers a demo UI, pip/npm CLI, and AI-powered methods (TF-IDF, BM25, KNN, LLM re-rank) for accurate results.

Migration from IAB Content Taxonomy 2.x to 3.0 is here — and it’s not optional. Whether you’re an SSP, DSP, publisher, or brand safety vendor, keeping your classification pipelines compliant with IAB’s latest standards is table stakes.

Today we’re releasing the IAB Taxonomy Mapper, a free, open-source tool that makes upgrading painless. It runs locally and combines several retrieval and AI techniques to deliver accurate, confidence-scored mappings.


What is the IAB Taxonomy Mapper?

The IAB Taxonomy Mapper is a utility for converting 2.x category codes into their 3.0 equivalents.

  • Upload a CSV/JSON (UI) or run the CLI on your machine.
  • The mapper uses multiple methods (from exact label match to LLM re-ranking) to identify the best 3.0 category.
  • You get back a mapped file with codes, labels, confidence scores, and methods.

It’s designed for:

  • Adtech platforms ensuring compliance
  • Brand safety vendors improving classification accuracy
  • Publishers tagging CTV/video content at the scene level

Why does this matter?

  • IAB 3.0 expands coverage → from ~400 categories to 1,500+
  • Mandatory compliance → 2.x is being deprecated
  • Manual migration is painful → errors = rejected ads, lost revenue
  • Automation is key → mapper gives deterministic, reproducible results

In short: this tool saves teams weeks of manual work and ensures you’re 3.0-ready.


How it Works

At the heart of the mapper is a pipeline of methods that progressively improve the match between 2.x and 3.0:

1. Exact Label Match

If the label is identical across versions, it maps instantly.

  • Example: “Sports” → “Sports”
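
A minimal sketch of how an exact label match could work, assuming labels are compared after light normalization (case, accents, whitespace). The `normalize` and `exact_match` helpers and the candidate list are hypothetical illustrations, not the mapper's actual code.

```python
import unicodedata


def normalize(label: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.lower().split())


def exact_match(label_2x, labels_3x):
    """Return the 3.0 label whose normalized form equals the 2.x label, or None."""
    index = {normalize(label): label for label in labels_3x}
    return index.get(normalize(label_2x))


# Illustrative candidates taken from the examples in this post.
labels_3x = ["Sports", "Cooking & Recipes", "Smoking & Vaping"]

print(exact_match("  sports ", labels_3x))  # Sports
print(exact_match("Tobacco", labels_3x))    # None
```

Anything that returns `None` here would fall through to the next, fuzzier stage.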

2. TF-IDF / BM25 (keyword similarity)

For slight text variations, the mapper uses classical information retrieval methods to find the best match.

  • Example: “Food & Drink” → “Cooking & Recipes”
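
To make the keyword stage concrete, here is a small pure-Python BM25 scorer. It is a sketch under stated assumptions: the tokenizer, the `k1` and `b` defaults, and the candidate labels (borrowed from this post's examples) are illustrative choices, and the mapper's real implementation may differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a label into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with Okapi BM25 (non-negative IDF variant)."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n or 1.0
    # Document frequency: in how many docs does each term appear?
    df = Counter(term for toks in tokenized for term in set(toks))
    scored = []
    for doc, toks in zip(docs, tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scored.append((doc, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "Sports", "Cooking & Recipes", "Smoking & Vaping",
    "Health & Fitness", "Sports Entertainment", "Team Sports",
]
ranked = bm25_rank("Fitness Shows", candidates)
print(ranked[0][0])  # Health & Fitness
```

Keyword methods only reward shared tokens. A pair such as “Tobacco” and “Smoking & Vaping” has no words in common and scores zero, which is the gap the semantic stage is meant to cover.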

3. Vector KNN (semantic similarity)

When labels differ significantly, categories are embedded into vector space. A k-nearest neighbor search finds semantically similar concepts.

  • Example: “Tobacco” → “Smoking & Vaping”
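
The nearest-neighbor step can be sketched as a cosine-similarity search over label vectors. The 3-dimensional vectors below are invented purely for illustration; a real run would get its vectors from an embedding model, and the post does not say which one the mapper uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn(query_vec, index, k=3):
    """Return the k labels whose vectors are most similar to query_vec."""
    scored = [(label, cosine(query_vec, vec)) for label, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy vectors (hypothetical): real embeddings have hundreds of dimensions.
index = {
    "Smoking & Vaping": [0.9, 0.1, 0.0],
    "Cooking & Recipes": [0.1, 0.9, 0.1],
    "Sports": [0.0, 0.1, 0.9],
}
tobacco_vec = [0.8, 0.2, 0.1]  # stand-in for the embedding of "Tobacco"

top = knn(tobacco_vec, index, k=2)
print([label for label, _ in top])  # ['Smoking & Vaping', 'Cooking & Recipes']
```

A brute-force scan like this is fine for a few thousand categories; an approximate index only becomes worthwhile at much larger scale.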

4. LLM Re-ranking (Ollama)

For ambiguous cases, a local LLM (via Ollama) re-ranks top candidates based on contextual cues.

  • Example: “Fitness Shows” could match either “Health & Fitness” or “Sports Entertainment”; the LLM uses context to choose between them.
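
One way the re-ranking step could look, as a sketch rather than the mapper's actual code. It calls Ollama's local `/api/generate` endpoint on the default port; the prompt wording, the `llama3` model name, and the `build_prompt`, `ask_ollama`, and `rerank` helpers are all assumptions made for illustration. The example swaps in a stub for the model call so it can be read and run without a model installed.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(source_label, candidates):
    """Ask the model to pick one numbered candidate (hypothetical prompt wording)."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, 1))
    return (
        f"A content category from an older taxonomy is labeled '{source_label}'.\n"
        f"Which candidate below is the closest match?\n{numbered}\n"
        "Answer with the number only."
    )


def ask_ollama(prompt, model="llama3"):
    """Send a non-streaming generate request; the model name is an assumption."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


def rerank(source_label, candidates, ask=ask_ollama):
    """Move the model's pick to the front; keep the original order if the reply is unusable."""
    reply = ask(build_prompt(source_label, candidates))
    match = re.search(r"\d+", reply)
    if match:
        idx = int(match.group()) - 1
        if 0 <= idx < len(candidates):
            rest = [c for i, c in enumerate(candidates) if i != idx]
            return [candidates[idx]] + rest
    return list(candidates)


# Stubbed model call so the example runs offline; drop `ask=` to hit a local Ollama.
candidates = ["Sports Entertainment", "Health & Fitness"]
print(rerank("Fitness Shows", candidates, ask=lambda prompt: "2"))
```

Falling back to the incoming order when the reply cannot be parsed keeps the earlier stages' ranking intact instead of failing the row.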


Using the Demo UI

You don’t need to install anything — just head to mxp.co/taxonomy.

  • Drag-and-drop a CSV or JSON of 2.x categories.
  • See results in a table with confidence scores and mapping methods.
  • Filter by confidence threshold or “unmatched only.”
  • Export mapped results to CSV/JSON.

Using the CLI

For devs and ops teams, the CLI is the fastest way to integrate mapping into pipelines.

Install:

pip install iab-mapper

Available via https://pypi.org/project/iab-mapper/

Run a mapping job:

iab-mapper map input.csv --out mapped.csv

📄 Example Input (CSV):

code,label
1-4,Sports
2-12,Food & Drink
5-9,Tobacco

📄 Example Output (CSV):

input_code,input_label,output_code,output_label,confidence,method,notes
1-4,Sports,2-3-18,Team Sports,0.94,exact_code,
2-12,Food & Drink,3-5-2,Cooking & Recipes,0.89,label_match,
5-9,Tobacco,,,,,No equivalent in 3.0
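
Once you have a mapped file, a short script can separate rows that are safe to accept from rows that need human review. This is a sketch that assumes the column names shown in the example output above; the 0.9 threshold and the `split_by_confidence` helper are arbitrary illustrations, not part of the tool.

```python
import csv
import io


def split_by_confidence(rows, threshold=0.9):
    """Split mapped rows into (accepted, needs_review) by confidence score."""
    accepted, review = [], []
    for row in rows:
        raw = (row.get("confidence") or "").strip()
        try:
            score = float(raw)
        except ValueError:
            score = None  # unmapped rows have no confidence value
        if score is not None and score >= threshold:
            accepted.append(row)
        else:
            review.append(row)
    return accepted, review


# Inline copy of the example output; for a real file use open("mapped.csv", newline="").
sample = """input_code,input_label,output_code,output_label,confidence,method,notes
1-4,Sports,2-3-18,Team Sports,0.94,exact_code,
2-12,Food & Drink,3-5-2,Cooking & Recipes,0.89,label_match,
5-9,Tobacco,,,,,No equivalent in 3.0
"""
rows = list(csv.DictReader(io.StringIO(sample)))
accepted, review = split_by_confidence(rows, threshold=0.9)

print([r["input_label"] for r in accepted])  # ['Sports']
print([r["input_label"] for r in review])    # ['Food & Drink', 'Tobacco']
```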


Advanced Features

  • Runs entirely locally — no data leaves your machine.
  • Open source — MIT licensed, extend however you want.
  • Configurable thresholds — set minimum confidence cutoff.
  • Batch processing — map thousands of rows at once.
  • Version-aware — command to fetch the latest official IAB taxonomies.

Use Cases

  • Ad exchanges: ensure all ads classify cleanly in 3.0
  • Brand safety tools: better mapping for “sensitive” categories
  • Publishers: precise tagging for video/CTV catalogs
  • Data scientists: experimenting with taxonomy enrichment + semantic methods

Benefits Recap

  • Save weeks of manual migration
  • Deterministic + reproducible mappings
  • Confidence-scored outputs for human review
  • Open-source, transparent, and extensible
  • Built for the ecosystem, not vendor lock-in

Get Started

Try the demo UI at mxp.co/taxonomy, or install the CLI with pip install iab-mapper.

About the author
Ethan Steininger

Former lead of MongoDB's Search Team, Ethan noticed the most common problem customers faced was building indexing and search infrastructure on their S3 buckets. Mixpeek was born.

Mixpeek Engineering Blog

Deep dive into multimodal AI, data processing, and best practices from our engineering team.
