Introduction

If you run a Hugo site, you’ve probably ended up with a messy tag list at some point.

Duplicate tags, one-off singletons, forgotten categories… it all builds up over time and clutters your taxonomies. Hugo won’t clean them for you.

I wanted a way to see exactly which tags and categories I’ve used, how often, and where, so I built a quick little scanner. Think of it as spring cleaning for your frontmatter: fast, and without the sneezing.


Elevator pitch:

Tag Audit is a no-frills Python CLI that:

  • Counts tag and category usage across your content tree
  • Flags singletons (items used only once)
  • Shows which posts use which tag or category
  • Outputs in text, markdown, CSV, or JSON
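
To make the counting and singleton ideas concrete, here is a minimal sketch of how they could work. This is not the script's actual source. It assumes each page's front matter has already been parsed into a dict (for example with PyYAML), and the helper names `as_list`, `count_terms`, and `singletons` are made up for illustration. It also assumes a tag repeated within one page should count once.

```python
from collections import Counter


def as_list(value):
    """Normalize a front matter value (missing, string, or list) to a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return [str(v) for v in value]


def count_terms(pages, key):
    """Count how many pages use each term under `key` ('tags' or 'categories')."""
    counts = Counter()
    for meta in pages.values():
        # Assumption: a term listed twice in the same page counts once.
        counts.update(set(as_list(meta.get(key))))
    return counts


def singletons(counts):
    """Return the terms used exactly once, sorted by name."""
    return sorted(name for name, n in counts.items() if n == 1)


# Hypothetical parsed front matter, keyed by file path.
pages = {
    "content/posts/a.md": {"tags": ["cli", "python"], "categories": ["tools"]},
    "content/posts/b.md": {"tags": ["cli"], "categories": "projects"},
    "content/posts/c.md": {"title": "no taxonomy here"},
}

tag_counts = count_terms(pages, "tags")
print(sorted(tag_counts.items()))
print(singletons(tag_counts))
```

Using a `Counter` keeps the totals and the singleton check in one pass over the parsed pages.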

Quickstart

🔗 View tag-audit on GitHub

# Install dep
python3 -m pip install pyyaml

# Basic run (from your Hugo repo root)
python3 tag-audit.py

# One file only
python3 tag-audit.py --file content/posts/metaclean.md

# Show which files use each tag (markdown output)
python3 tag-audit.py --by-tag --format markdown

# Top 20 tags/categories by count, with mappings
python3 tag-audit.py --top 20 --by-tag --by-cat

# Ignore drafts, only include items used ≥ 2 times
python3 tag-audit.py --ignore-drafts --min-count 2

Features

  • Scan entire Hugo content/ tree or a single file
  • Totals for tags and categories
  • Singleton detection (used only once)
  • Inverse mappings: files grouped by tag or category
  • Multiple output formats: text, markdown, csv, json
  • Filters: --min-count, --top, --ignore-drafts, --ext
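
For the curious, scanning a content tree and pulling out front matter might look roughly like the sketch below. It is an illustration under assumptions, not the tool's real code: it handles only YAML front matter delimited by `---` lines, the function names are hypothetical, and the `exts` parameter is my guess at what an extension filter like `--ext` does. The extracted block would then be handed to `yaml.safe_load` from PyYAML, which is left out here so the sketch runs with the standard library alone.

```python
from pathlib import Path


def iter_content_files(root, exts=(".md",)):
    """Yield files under `root` whose suffix is in `exts`, in sorted order."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in exts:
            yield path


def front_matter_block(text):
    """Return the text between the opening and closing '---' lines, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return None  # opening delimiter without a closing one


def is_draft(meta):
    """True when parsed front matter marks the page as a draft (Hugo's `draft: true`)."""
    return bool(meta.get("draft", False))


sample = "---\ntitle: Demo\ndraft: true\ntags: [cli, python]\n---\nBody text\n"
print(front_matter_block(sample))
# Next step (needs PyYAML): meta = yaml.safe_load(front_matter_block(sample))
```

Skipping drafts then comes down to dropping any page where `is_draft(meta)` is true before counting.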

Usage

tag-audit.py [OPTIONS]

Options (common):

  • --dir PATH — Path to Hugo content (default: ./content)
  • --file FILE — Scan a single file
  • --ignore-drafts — Skip drafts
  • --per-file — Show per-file usage
  • --by-tag — Show files grouped by tag
  • --by-cat — Show files grouped by category
  • --format — Output format: text (default), markdown, csv, json
  • --sort — Sort by count (default) or alpha
  • --min-count N — Only include items with count ≥ N
  • --top N — Limit to top N items
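
Here is a rough sketch of how the `--min-count`, `--top`, and `--sort` options could combine. The function name `select_items` is hypothetical, and the order of operations is an assumption on my part: rank by count, apply the minimum, cut to the top N, and only then reorder alphabetically if asked. The real script may do it differently.

```python
def select_items(counts, min_count=1, top=None, sort="count"):
    """Filter and order (name, count) pairs the way the options above describe."""
    # Rank by count descending, breaking ties by name.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = [(name, n) for name, n in ranked if n >= min_count]
    if top is not None:
        kept = kept[:top]
    if sort == "alpha":
        # Assumption: alphabetical order is applied after the top-N cut.
        kept.sort(key=lambda kv: kv[0])
    return kept


counts = {"cli": 21, "python": 14, "bash": 10, "zsh": 1}
print(select_items(counts, min_count=2))
print(select_items(counts, top=2, sort="alpha"))
```

Cutting to the top N before the alphabetical sort keeps "top" meaning "most used" rather than "first in the alphabet".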

Sample Output

Tags
====
count  name
-----  ----------------
   21  cli
   14  python
   13  utilities
   10  bash
    5  unix
    4  fun
    4  sysadmin
    3  cromulent
    3  gibberish
    3  homelab
    3  productivity
    3  scripting
    2  hugo
    2  linux
    2  star-trek
    1  adminjitsu
    1  automation
    1  commodore-64
    1  zsh
    [...]
Total tags: 171

Categories
==========
count  name
-----  ---------------
   22  projects
   13  tools
    3  fun
    3  trivia
    2  guides
    2  workflow
    1  retrocomputing
    1  sysadmin
    1  tips
    [...]
Total categories: 61

Singleton tags (used only once)
===============================
count  name
-----  ------------------------
    1  adminjitsu
    1  automation
    1  commodore-64
    1  cryptography
    1  docker
    1  espanso
    1  fortune
    1  vim
    1  zsh
    [...]
Total singleton tags: 69

Singleton categories (used only once)
=====================================
count  name
-----  ------------------------------
    1  retrocomputing
    1  sandbox
    1  sysadmin
    1  troubleshooting
    [...]
Total singleton categories: 9
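
If you want to reproduce the text layout above in your own scripts, a sketch like this gets close. It is inferred from the sample output rather than taken from the tool: counts are right-aligned in a five-character column, and the name column's rule is as wide as the longest name. `render_table` is a hypothetical helper that expects rows already in display order.

```python
def render_table(title, rows):
    """Render (name, count) rows as a plain-text table with an underlined title."""
    name_width = max([len("name")] + [len(name) for name, _ in rows])
    lines = [
        title,
        "=" * len(title),
        f"{'count':>5}  name",
        f"{'-' * 5}  {'-' * name_width}",
    ]
    lines += [f"{n:>5}  {name}" for name, n in rows]
    return "\n".join(lines)


table = render_table("Tags", [("cli", 21), ("python", 14), ("utilities", 13)])
print(table)
```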

Files by Tag (excerpt)

linux

  • content/posts/kernel-tips.md
  • content/posts/bash-wizardry.md

docker

  • content/posts/compose-cleanup.md
  • content/posts/build-secrets.md

hugo

  • content/posts/theme-tweaks.md
  • content/posts/tag-audit-release.md
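
The files-by-tag view is just the usage data turned inside out. A minimal sketch, assuming front matter is already parsed into dicts keyed by file path (the paths and the `files_by_term` name below are made up for illustration):

```python
from collections import defaultdict


def files_by_term(pages, key):
    """Map each term under `key` to the sorted list of files that use it."""
    mapping = defaultdict(list)
    for path in sorted(pages):
        value = pages[path].get(key) or []
        terms = [value] if isinstance(value, str) else value
        for term in set(terms):
            mapping[term].append(path)
    # Return a plain dict with terms in alphabetical order.
    return {term: mapping[term] for term in sorted(mapping)}


# Hypothetical parsed front matter.
pages = {
    "content/posts/a.md": {"tags": ["linux", "hugo"]},
    "content/posts/b.md": {"tags": ["linux"]},
    "content/posts/c.md": {"tags": "hugo"},
}

by_tag = files_by_term(pages, "tags")
for term, paths in by_tag.items():
    print(term)
    for path in paths:
        print(f"  • {path}")
```

Passing "categories" as the key would give the category view from the same function.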

Conclusion

Whether you’re prepping a cleanup pass or just curious which tags are pulling their weight, tag-audit.py gives you a clear snapshot of your Hugo taxonomy.

PRs, issues, and suggestions welcome! feedback@adminjitsu.com