dev101.io

Base56 Encode / Decode

Encode and decode Base56 — no-confusables alphabet, safe to type by hand.

How to use Base56 Encode / Decode

  1. Choose Encode or Decode at the top of the tool.
  2. Paste your raw text (Encode mode) or Base56 string (Decode mode) into the left pane.
  3. Watch the output update live. Invalid characters in Decode mode are flagged with position and a hint if they're a known confusable.
  4. Press ⌘/Ctrl + Enter to swap modes and feed the last output back — the fastest way to sanity-check a round-trip.
  5. Use Copy output to grab the result, or Share to produce a URL that restores your session.

Base56 Encode / Decode

A lossless Base56 converter built for the cases where a human will read, type, or say the output. Paste any text or Base56 string, pick a mode, copy the result — every transform runs in your browser using TextEncoder / TextDecoder and native BigInt arithmetic, so nothing leaves the page.

Why Base56

Base56 exists because O and 0 look identical in almost every font, 1 and I and l are a well-known trio, and lowercase o is a sneaky fourth offender for 0. Any encoding that a human will eventually read, type, or speak is one transcription error away from a support ticket. Base56 solves the problem at the alphabet level: the six most confusable characters are simply never produced, so there's nothing to mistake.

The alphabet is 23456789 for digits (no 0, no 1), A-Z minus I and O, and a-z minus l and o. That's 8 + 24 + 24 = 56 characters, deliberately chosen so no pair of printed characters looks the same at a glance — even in a rushed scan on a phone screen or a smudged thermal receipt.
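As a quick sanity check, the alphabet can be assembled from its three groups and verified in a few lines. The constant names below are illustrative and not taken from the tool's source.

```typescript
// Illustrative constant names; the character set is the one described above.
const DIGITS = "23456789";                 // no 0, no 1
const UPPER = "ABCDEFGHJKLMNPQRSTUVWXYZ";  // A-Z minus I and O
const LOWER = "abcdefghijkmnpqrstuvwxyz";  // a-z minus l and o
const ALPHABET = DIGITS + UPPER + LOWER;

const CONFUSABLES = ["0", "1", "O", "I", "l", "o"];

console.log(ALPHABET.length);                                // 56
console.log(new Set(ALPHABET).size);                         // 56 (no duplicates)
console.log(CONFUSABLES.some((c) => ALPHABET.includes(c)));  // false
```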

Where Base56 wins

  • Activation and voucher codes. A 12-character Base56 code carries about 69 bits of entropy (12 × log₂ 56 ≈ 69.7), more than enough for a single-use coupon or product activation key, without the "is that an I or a 1?" support calls.
  • Printed receipts and packing slips. Thermal print can blur an O into a 0 or an l into a 1 after a week in a wallet. Base56 survives the blur.
  • Verbal sharing. Reading a Base56 string over the phone avoids the "I as in India" clarifications that a Base62 or Base64 code invites. The speaker still has to call out upper and lower case, because case carries information.
  • Screenshots and OCR. When a customer photographs a code on a sign, even good OCR can confuse 0/O and 1/l. Base56 removes the ambiguity before it starts.
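To size a code for a given use case, it helps to convert between code length and bits of entropy. The sketch below assumes every character is chosen uniformly at random; the function names are hypothetical.

```typescript
const BITS_PER_CHAR = Math.log2(56); // about 5.807

// Entropy of a uniformly random Base56 code of the given length.
function entropyBits(length: number): number {
  return length * BITS_PER_CHAR;
}

// Shortest code length that carries at least `bits` bits of entropy.
function charsForBits(bits: number): number {
  return Math.ceil(bits / BITS_PER_CHAR);
}

console.log(entropyBits(12).toFixed(1)); // "69.7"
console.log(charsForBits(64));           // 12
console.log(charsForBits(128));          // 23
```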

How the encoding works

Same strategy as Base62, Base58, and other big-int variable-length encodings: interpret the input bytes as a single big-endian unsigned integer, then divide repeatedly by 56 to emit digits. Two details keep the round-trip lossless:

  • Leading zero bytes become leading 2 characters (2 is the first character of the alphabet). Without this, encoding [0x00, 0xff] and [0xff] would collapse to the same output. The encoder counts leading zero bytes and prefixes one 2 per zero byte; the decoder does the reverse.
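  • UTF-8 encoding. Text input runs through TextEncoder first, which turns the string into UTF-8 bytes, so emoji, CJK ideographs, and accented Latin all survive a round-trip without any special handling. TextEncoder does not apply Unicode normalisation, so the text that goes in is the text that comes back.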
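The steps above can be sketched with native BigInt arithmetic. This is a minimal illustration of the technique, not the tool's actual source: the function bodies are assumptions, and `encodeText` is a hypothetical wrapper name.

```typescript
const ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz";
const BASE = BigInt(ALPHABET.length); // 56

function encodeBytes(bytes: Uint8Array): string {
  // Count leading zero bytes; each one becomes a leading "2".
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;

  // Interpret the remaining bytes as one big-endian unsigned integer.
  let value = BigInt(0);
  for (let i = zeros; i < bytes.length; i++) {
    value = value * BigInt(256) + BigInt(bytes[i]);
  }

  // Repeated division by 56 yields digits least-significant first.
  const digits: string[] = [];
  while (value > BigInt(0)) {
    digits.push(ALPHABET[Number(value % BASE)]);
    value = value / BASE; // BigInt division truncates
  }

  return ALPHABET[0].repeat(zeros) + digits.reverse().join("");
}

// Hypothetical wrapper: text goes through UTF-8 before encoding.
function encodeText(text: string): string {
  return encodeBytes(new TextEncoder().encode(text));
}

console.log(encodeBytes(new Uint8Array([0xff])));       // "6Z"
console.log(encodeBytes(new Uint8Array([0x00, 0xff]))); // "26Z"
```

The two sample inputs differ only in the leading zero byte, and the extra leading 2 is what keeps their encodings distinct. Big-integer conversion is quadratic in input length, which is fine for codes and keys but slow for large files.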

Encode to Base56

Paste any text into the left pane in Encode mode. The output is a string drawn only from the 56-character alphabet, guaranteed to contain no 0, 1, O, I, l, or o. For an N-byte input the Base56 encoding is approximately N × log₂(256) / log₂(56) ≈ N × 1.378 characters, so it's about 2.5% longer than Base62 and about 3% longer than unpadded Base64. That's the cost of removing six characters from the alphabet, and it's nearly always worth it when the output is going to touch a human.
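The length ratios follow directly from the radix, so they are easy to check. A small sketch (the helper name is hypothetical) that ignores Base64 padding and the rounding up to a whole character:

```typescript
// Average output characters per input byte for a base-`radix` encoding.
function charsPerByte(radix: number): number {
  return 8 / Math.log2(radix);
}

const base56 = charsPerByte(56); // about 1.378
const base62 = charsPerByte(62); // about 1.344
const base64 = charsPerByte(64); // about 1.333 (unpadded)

// Relative overhead of Base56, as a percentage.
const vsBase62 = (base56 / base62 - 1) * 100;
const vsBase64 = (base56 / base64 - 1) * 100;

console.log(vsBase62.toFixed(1), vsBase64.toFixed(1)); // 2.5 3.3
```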

Decode from Base56

Decode mode accepts any Base56 string and reconstructs the original byte sequence, then decodes it as UTF-8 text. If the decoded bytes aren't valid UTF-8 (a Base56-encoded SHA-256 digest, for instance), the tool surfaces a typed error rather than silently returning replacement glyphs.
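One way to get this strict behaviour in a browser is the `fatal` option of TextDecoder, which throws instead of substituting U+FFFD. Whether the tool does exactly this is an assumption, and the helper name below is hypothetical.

```typescript
// Hypothetical helper: decode bytes as UTF-8, throwing on malformed input.
function decodeUtf8Strict(bytes: Uint8Array): string {
  // With `fatal: true`, TextDecoder throws a TypeError on invalid UTF-8.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

const ok = decodeUtf8Strict(new Uint8Array([0x68, 0x69])); // "hi"

let rejected = false;
try {
  decodeUtf8Strict(new Uint8Array([0xff, 0xfe])); // not valid UTF-8
} catch (err) {
  rejected = err instanceof TypeError;
}

console.log(ok, rejected); // hi true
```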

Three specific error conditions produce helpful, actionable messages:

  • Confusable characters. If you paste a string containing 0, 1, O, I, l, or o, the error not only points at the exact position — it also reminds you that Base56 deliberately omits those characters. That's the most common transcription error, and the one worth catching loudly.
  • Other invalid characters. Anything outside the alphabet (a stray $, whitespace in the middle of the string) produces an error with the offending character and its index.
  • Non-UTF-8 payload. When the decoded bytes aren't valid UTF-8, the error says so explicitly — the tool never quietly replaces bad bytes with U+FFFD.
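The character-level checks can be sketched as a decoder that returns either the bytes or a description of the first bad character. The result shape and function body are assumptions made for illustration; only the alphabet and the list of confusables come from the text above.

```typescript
const ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz";
const CONFUSABLES = "01OIlo";

// Hypothetical result type: decoded bytes, or the first offending character.
type DecodeResult =
  | { ok: true; bytes: Uint8Array }
  | { ok: false; char: string; index: number; confusable: boolean };

function decodeToBytes(input: string): DecodeResult {
  const base = BigInt(ALPHABET.length);
  let value = BigInt(0);

  for (let i = 0; i < input.length; i++) {
    const digit = ALPHABET.indexOf(input[i]);
    if (digit === -1) {
      // Report the position and whether this is a known confusable.
      const confusable = CONFUSABLES.includes(input[i]);
      return { ok: false, char: input[i], index: i, confusable };
    }
    value = value * base + BigInt(digit);
  }

  // Leading "2" characters map back to leading zero bytes.
  let zeros = 0;
  while (zeros < input.length && input[zeros] === ALPHABET[0]) zeros++;

  // Peel off bytes least-significant first, then reverse.
  const tail: number[] = [];
  while (value > BigInt(0)) {
    tail.push(Number(value % BigInt(256)));
    value = value / BigInt(256);
  }
  tail.reverse();

  const bytes = new Uint8Array(zeros + tail.length); // zero-filled
  bytes.set(tail, zeros);
  return { ok: true, bytes };
}

console.log(decodeToBytes("26Z")); // ok: true, bytes [0, 255]
console.log(decodeToBytes("6O"));  // ok: false, char "O", index 1, confusable: true
```

Returning a result object rather than throwing keeps the position and the confusable flag available to the UI without parsing an error message.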

What this tool deliberately doesn't do

  • Custom alphabets. There are many "no-confusables" alphabets in the wild — some drop u and v, some keep O but drop Q. This tool ships one opinionated choice so that encode and decode always agree. If you need a different alphabet, the transform module is small enough to fork.
  • Checksums. A Base56 string is just a reversible encoding, not a tamper-proof code. If you want the code to reject itself on a typo, pair this with a Luhn-style check digit or a Crockford Base32 checksum — that's a separate problem from confusable removal.
  • Case folding. Upper and lower case carry information: A and a are different characters in the alphabet. If your downstream system folds case (some printers, some OCR pipelines), a single-case subset leaves only 32 symbols (8 digits plus 24 letters), which is 5 bits per character instead of about 5.8. For case-insensitive human input, Base32 is usually a better fit.
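For readers who do want typo detection on top of the encoding, one option is the Luhn mod N algorithm with N = 56, which appends a single check character. This is a sketch of that general algorithm and is not part of the tool; the function names are hypothetical.

```typescript
const ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz";
const N = ALPHABET.length; // 56

// Luhn mod N sum, walking right to left.
// startFactor is 2 when computing a check character, 1 when validating.
function luhnSum(code: string, startFactor: number): number {
  let factor = startFactor;
  let sum = 0;
  for (let i = code.length - 1; i >= 0; i--) {
    const value = ALPHABET.indexOf(code[i]);
    if (value === -1) throw new Error(`invalid character at index ${i}`);
    const addend = factor * value;
    // Add the base-N "digits" of the product, as Luhn does in base 10.
    sum += Math.floor(addend / N) + (addend % N);
    factor = factor === 2 ? 1 : 2;
  }
  return sum;
}

// Check character to append to `code`.
function checkChar(code: string): string {
  return ALPHABET[(N - (luhnSum(code, 2) % N)) % N];
}

// True when the last character is a valid check character for the rest.
function isValid(codeWithCheck: string): boolean {
  return codeWithCheck.length > 0 && luhnSum(codeWithCheck, 1) % N === 0;
}

const code = "7GhK2mQ";
const full = code + checkChar(code);
console.log(isValid(full));             // true
console.log(isValid("8" + full.slice(1))); // false: first character mistyped
```

Because 56 is even, this scheme catches every single-character substitution. Like decimal Luhn, it does not catch every swap of adjacent characters.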

Privacy promise

Every byte of your input stays in your browser. No analytics on input content, no server logging, no third-party scripts in the transform path. The Share button encodes state into the URL fragment — which browsers never send to servers — so shared links preserve the same guarantee.

Related tools

  • Base62 Encode / Decode — the compact, URL-safe alphanumeric encoding when humans won't be reading the output.
  • Base64 Encode / Decode — the standard 64-character encoding used in email, data URIs, and HTTP headers.
  • Base32 Encode / Decode — case-insensitive, phone-friendly 32-character alphabet; a good alternative when your channel drops case.

Frequently asked questions

What makes Base56 different from Base62 or Base58?

Base56 is Base62 with the six classic confusable characters removed. `0` and `O` drop out, `1`, `I`, and `l` drop out, and lowercase `o` drops out because it is easily mistaken for a zero, leaving exactly 56 characters. Base58 (the Bitcoin alphabet) removes only `0`, `O`, `I`, and `l`; that is enough to disambiguate fixed-width hashes, but it keeps lowercase `o`, which a reader can still take for `0`. Base56 is the pedantic, humans-first choice when someone will actually read or type the result: activation codes, receipt references, voucher numbers, verbal share codes over the phone.

When should I reach for Base56 instead of Base62?

Anywhere a human is in the loop. If a code will be printed on a receipt, read out loud over a phone call, typed into a form, or transcribed from a screenshot, Base56 removes the most common transcription errors before they happen. If the code lives entirely in a URL or a database row and never touches a human, Base62 is shorter and simpler. The roughly 2.5% length overhead of Base56 over Base62 is a fair trade for "zero support tickets about `l` vs `1`."

Why drop `o` but not `c` or `e`?

Lowercase `o` is dropped to avoid any ambiguity with `0`. With `0` absent from the alphabet, a printed `o` is technically unambiguous, but a reader who doesn't know the alphabet, or anyone copying the code by hand, can't tell the two shapes apart. Characters like `c` and `e` have distinctive shapes (an open curve, a crossbar) that stay readable in most handwriting. The goal isn't to eliminate every character that's ever been misread; it's to remove the classic offenders (`0`/`O`, `1`/`I`/`l`, and `o`) that cause the overwhelming majority of transcription errors.

Can Base56 encode any binary data, or only text?

Any bytes. The encoder treats the input as a big-endian unsigned integer, so it round-trips arbitrary byte sequences including UUIDs, hashes, encryption keys, and protocol buffers. The tool itself pipes string input through UTF-8 first, so emoji and CJK characters work transparently in the browser UI. If you need to encode raw bytes (not text), import `encodeBytes` / `decodeToBytes` from the module directly — they take and return `Uint8Array`.

Is the Base56 alphabet standardised anywhere?

Not formally: there is no RFC for Base56. Different tools pick slightly different character sets depending on which confusables they consider worth dropping. This tool uses the alphabet `23456789 ABCDEFGHJKLMNPQRSTUVWXYZ abcdefghijkmnpqrstuvwxyz` (the spaces only separate the three groups and are not part of the alphabet), a common "drop all confusables" variant. Because Base56 is unstandardised, always pair encoding and decoding with the same alphabet, in the same order: a Base56 string from one tool will not decode correctly in another unless the alphabets match exactly.
