Base64 vs Base64URL: The Tiny Difference That Breaks Tokens
There's a class of bug that makes senior engineers feel stupid: your JWT validation fails intermittently. Sometimes the token works. Sometimes it doesn't. The payload looks fine when you decode it by hand. The signature is correct. And yet, every few hundred requests, your auth middleware throws a fit.
Nine times out of ten, the culprit is a single misplaced function call. Somewhere in the pipeline, someone called btoa() when they should have called the URL-safe variant, or vice versa. The characters look almost identical. The output is almost correct. Almost.
Let's dissect exactly what's different, why it matters at a mechanical level, and what happens when you mix the two.
The Original Problem: Binary in Text-Only Channels
Base64 was invented to solve a specific problem: you have arbitrary binary data (a file, a hash, a blob of encrypted bytes) and you need to send it through a channel that only speaks printable ASCII — think email in the 1980s, or old MIME attachments. The algorithm takes every 3 bytes of input, treats them as a 24-bit number, then splits that into four 6-bit groups. Each 6-bit group indexes into a 64-character alphabet.
The original Base64 alphabet, standardized in RFC 4648 §4, uses these 64 characters:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
When the input length isn't a multiple of 3, padding = signs are appended to make the output length a multiple of 4.
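The grouping and padding rules above can be sketched in a few lines of Python. This is an illustrative encoder only; the names ALPHABET and encode_base64 are made up for this example, and real code should call the standard library's base64 module instead.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_base64(data: bytes) -> str:
    """Illustrative encoder: 3 bytes -> 24 bits -> four 6-bit indices."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Left-align a short final chunk by appending zero bytes.
        n = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Split the 24-bit number into four 6-bit groups.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Only len(chunk) + 1 characters carry data; the rest become "=".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)

print(encode_base64(b"Man"))  # TWFu
print(encode_base64(b"Ma"))   # TWE=
print(encode_base64(b"M"))    # TQ==
```

Swapping the last two characters of ALPHABET for - and _ and dropping the "=" step would give the URL-safe variant discussed later.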
This worked beautifully for email. Then the web happened.
Why Two Characters Ruin Everything in URLs
URLs have a specific grammar. RFC 3986 defines a set of reserved characters: characters that can act as delimiters in a URL, and that therefore must be percent-encoded when they appear as plain data in a path segment or query string. Two of the characters from the standard Base64 alphabet are on that reserved list: + and /.
+ has a particular double life: in application/x-www-form-urlencoded (the format used when HTML forms submit), a + is shorthand for a space character. So a Base64 string like abc+def comes out of form decoding as abc def. Your token is now corrupted, and nothing raised an error.
/ is worse. In a URL path, / is the segment delimiter. If your Base64-encoded value contains a forward slash and you embed it in a path, say /tokens/eyJh/bGc/verify, the server sees four path segments (tokens, eyJh, bGc, and verify) instead of the three you intended. Your router matches a completely different route handler. Or returns a 404. Neither is helpful.
The = padding character also causes trouble. Some URL parsers treat = as a key-value separator, some proxy servers strip trailing equals signs from query parameters, and older systems round-trip it unreliably.
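Both failure modes are easy to reproduce with Python's urllib.parse. The sketch below uses made-up values (the token strings and the example.test host are placeholders, not real data) and shows the silent + to space conversion, the extra path segment, and what percent-encoding would have to do to keep a standard Base64 value intact.

```python
from urllib.parse import parse_qs, quote, urlsplit

# A "+" placed unescaped in a query string is decoded as a space.
corrupted = parse_qs("token=abc+def")["token"][0]
print(repr(corrupted))  # 'abc def'

# A "/" inside a value embedded in a path creates an extra segment.
url = "https://example.test/tokens/eyJh/bGc/verify"
segments = [s for s in urlsplit(url).path.split("/") if s]
print(segments)  # ['tokens', 'eyJh', 'bGc', 'verify']

# Percent-encoding every reserved character avoids both problems,
# at the cost of a longer and less readable value.
escaped = quote("abc+def/ghi=", safe="")
print(escaped)  # abc%2Bdef%2Fghi%3D
```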
Base64URL: Three Changes, One Purpose
RFC 4648 §5 defines the URL-safe variant, often called Base64URL, Base64url, or base64url depending on who you ask. The modification is minimal but complete:
- + is replaced with - (hyphen)
- / is replaced with _ (underscore)
- Padding = is omitted (RFC 4648 makes this optional; the JWT specifications require it)
Both - and _ are defined as "unreserved characters" in RFC 3986. They can appear anywhere in a URL without percent-encoding, without ambiguity, and without being interpreted as delimiters. Dropping the padding works because the decoder can always infer the original length from the encoded string length — padding was only ever a convenience, not a necessity for decoding.
The result is a string that travels safely through URLs, HTTP headers, cookie values, and HTML attribute values without transformation.
JWTs Live Entirely in Base64URL
JSON Web Tokens are the most common place developers encounter this distinction under pressure. A JWT has three parts, each Base64URL-encoded and joined with periods:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkFsaWNlIn0.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
Every single one of those segments is Base64URL-encoded — no padding, no +, no /. The header, the payload, and critically the signature. If you decode the signature using standard Base64 instead of Base64URL, you get the wrong bytes. Your HMAC verification fails. Your RSA signature is invalid. The JWT library rejects the token.
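As a rough illustration, the example token above can be taken apart with Python's standard library. The helper name decode_segment is invented for this sketch, and the code only decodes the segments; it does not verify the signature.

```python
import base64
import binascii
import json

def decode_segment(segment: str) -> bytes:
    # Restore the padding that JWTs omit, then decode with the URL-safe alphabet.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkFsaWNlIn0."
    "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
)
header_b64, payload_b64, signature_b64 = token.split(".")

header = json.loads(decode_segment(header_b64))
payload = json.loads(decode_segment(payload_b64))
signature = decode_segment(signature_b64)
print(header, payload, len(signature))

# A strict standard-Base64 decoder rejects the "_" in the signature segment.
try:
    base64.b64decode(signature_b64 + "=", validate=True)
    strict_standard_accepts = True
except binascii.Error:
    strict_standard_accepts = False
```

Without validate=True, Python's b64decode silently discards characters outside the standard alphabet, which is one way the "wrong bytes" outcome arises.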
What makes this painful is that many standard Base64 libraries produce valid-looking output that's subtly wrong. Consider this Node.js code:
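// signature: a Buffer or Uint8Array holding the raw signature bytes
// Wrong for JWTs: standard Base64 may contain +, /, and = padding
const standardEncoded = Buffer.from(signature).toString('base64');
// Correct: converts the standard output to Base64URL
const urlSafeEncoded = Buffer.from(signature)
  .toString('base64')
  .replace(/\+/g, '-')
  .replace(/\//g, '_')
  .replace(/=/g, '');
// Recent Node.js releases also accept the 'base64url' encoding directly:
// Buffer.from(signature).toString('base64url')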
Or in Python, where the difference is between two functions in the same base64 module:
import base64
# Wrong for JWTs
standard = base64.b64encode(data)
# Correct for JWTs and URLs
url_safe = base64.urlsafe_b64encode(data).rstrip(b'=')
The urlsafe_b64encode function handles the character substitution, but you still have to strip the padding yourself. That trailing .rstrip(b'=') is easy to forget, and when you do forget it, the bug stays hidden whenever the input length is a multiple of 3 (no padding is produced) or the consumer tolerates padding. It surfaces only for the other lengths, against a strict decoder.
The Intermittent Failure Pattern
Here's why Base64/Base64URL confusion produces that maddening intermittent behavior rather than consistent failures:
Whether a Base64-encoded value contains + or / depends entirely on the actual byte values being encoded. If you're encoding a random 32-byte secret, each output character has a 2-in-64 (1 in 32) chance of being a + or /. Over the 43 data characters that adds up: roughly three in four such secrets contain at least one, and about one in four contain none.
Whether padding = appears depends on the input length modulo 3. If lengths are spread evenly, one-third of inputs need no padding, one-third need one = (two bytes left over), and one-third need two (one byte left over).
So if you're encoding JWTs with standard Base64, only some tokens work: the ones that happen to contain no +, /, or = pass through fine, and the rest fail. You spend hours debugging. The bug only reproduces on some users, some of the time, depending on their randomly-assigned user IDs or session tokens.
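The arithmetic behind these odds can be checked directly. The sketch below assumes uniformly random 32-byte inputs; the sample size and seed are arbitrary choices for the example.

```python
import base64
import random

# Analytic estimate for a 32-byte value: 43 data characters, but the last one
# carries only 4 bits and can never be "+" or "/", so 42 positions matter.
p_clean = (31 / 32) ** 42
p_has_reserved = 1 - p_clean

# Empirical check with a seeded generator.
rng = random.Random(0)
trials = 5000
hits = 0
for _ in range(trials):
    secret = bytes(rng.getrandbits(8) for _ in range(32))
    encoded = base64.b64encode(secret)
    if b"+" in encoded or b"/" in encoded:
        hits += 1
observed = hits / trials

# Padding depends only on the input length modulo 3.
padding_by_length = {n: base64.b64encode(bytes(n)).count(b"=") for n in range(1, 7)}

print(f"analytic: {p_has_reserved:.3f}, observed: {observed:.3f}")
print(padding_by_length)
```

Both figures should land near 0.74, which is why a bug tied to + or / shows up often but not always for values of this size.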
Where Each Encoding Belongs
The practical rule is straightforward once you internalize it:
Standard Base64 belongs in: MIME email attachments, PEM certificates (those -----BEGIN CERTIFICATE----- blocks), binary file encoding in JSON APIs where the value won't be embedded in a URL, basic authentication headers (Authorization: Basic), and anywhere the encoded value travels as an opaque blob through a channel that passes +, /, and = through unchanged.
Base64URL belongs in: JWTs and anything in the OAuth/OIDC ecosystem, URL query parameters, URL path segments, HTML form values, cookie values, any token intended to be shared via link, and hash fragments.
A reliable heuristic: if the encoded value will ever appear in an address bar, in a link, in a log that gets clicked, or in an HTTP header parsed as a URL — use Base64URL.
Detection and Conversion
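If you're handed an encoded string and aren't sure which variant it is, look at the characters. A + or / means standard Base64; a - or _ means Base64URL. A trailing = on its own is weaker evidence, because some URL-safe encoders (Python's urlsafe_b64encode, for one) keep the padding. If none of those characters appear, the string is valid in both alphabets and decodes to the same bytes either way, as long as the padding is handled.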
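A detection heuristic along these lines might look like the following. The function name guess_variant and its return labels are invented for this sketch, and it treats = as inconclusive because padded URL-safe output exists in practice.

```python
def guess_variant(encoded: str) -> str:
    """Guess which alphabet an encoded string uses (heuristic only)."""
    has_standard = "+" in encoded or "/" in encoded
    has_url_safe = "-" in encoded or "_" in encoded
    if has_standard and has_url_safe:
        return "invalid"      # mixes both alphabets
    if has_standard:
        return "standard"
    if has_url_safe:
        return "base64url"
    return "ambiguous"        # valid in either alphabet

print(guess_variant("+//+"))      # standard
print(guess_variant("-__-"))      # base64url
print(guess_variant("TWFu"))      # ambiguous
print(guess_variant("ab+c_d"))    # invalid
```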
Converting between them is character substitution, not re-encoding:
// Standard Base64 → Base64URL
function toBase64URL(standard: string): string {
  return standard
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=/g, '');
}
// Base64URL → Standard Base64 (for libraries that require padding)
function fromBase64URL(urlSafe: string): string {
  const padded = urlSafe + '==='.slice((urlSafe.length + 3) % 4);
  return padded.replace(/-/g, '+').replace(/_/g, '/');
}
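The padding reconstruction formula, '==='.slice((length + 3) % 4), adds the minimum number of = signs needed to make the string length a multiple of 4. It's one of those formulas worth memorizing. One caveat: a length that leaves a remainder of 1 when divided by 4 is never valid Base64, and for such input the formula appends === rather than rejecting it, so check the length first if the input is untrusted.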
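The same padding arithmetic in Python is "=" * (-len(value) % 4). A minimal pair of helpers, with names chosen for this example, might look like this; the explicit length check is an addition of this sketch, not something the formula does on its own.

```python
import base64

def b64url_encode(data: bytes) -> str:
    # URL-safe alphabet, padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(value: str) -> bytes:
    # A remainder of 1 can never come from a valid encoding.
    if len(value) % 4 == 1:
        raise ValueError("invalid Base64URL length")
    # Add back 0, 1, or 2 "=" so the length is a multiple of 4.
    padded = value + "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(padded)

print(b64url_encode(b"M"))           # TQ
print(b64url_decode("TQ"))           # b'M'
print(b64url_decode("-__-").hex())   # fbfffe
```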
Hash Converters and the Same Problem
SHA-256 hashes encoded as Base64 run into the same issue when used as cache keys, ETags, content-addressable IDs, or integrity verification tokens in URLs. A SHA-256 digest is 32 bytes, which encodes to 43 Base64URL characters without padding; standard Base64 emits 44 characters, the last of which is a single =. If you encode that digest with standard Base64 and embed it in a URL, you get exactly one character of padding and quite possibly a + or / to corrupt the value.
The Web Crypto API in browsers is not the problem here: SubtleCrypto.digest() gives you raw bytes, and the encoding choice is yours. If those bytes are headed for a URL, use a Base64URL encoder. But many tutorials demonstrate using btoa() on the result, which produces standard Base64. The tutorials work in most demos because demos rarely put the result somewhere that +, /, or = gets reinterpreted.
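The length arithmetic is easy to confirm with Python's hashlib. The input string here is an arbitrary placeholder.

```python
import base64
import hashlib

digest = hashlib.sha256(b"hello world").digest()  # 32 raw bytes

standard = base64.b64encode(digest).decode("ascii")
url_safe = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

print(len(digest), len(standard), len(url_safe))  # 32 44 43
print(standard.endswith("="))                     # True
```

Whatever the input, the standard form of a 32-byte digest ends in exactly one =, so it always needs escaping before it can sit in a URL.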
One Concrete Test to Add to Your Suite
The most reliable way to catch these bugs is to test with a value that guarantees problematic characters. The byte sequence [0xFB, 0xFF, 0xFE] encodes to +//+ in standard Base64 and -__- in Base64URL. If you're building any token generation or verification code, include this byte sequence in your test fixtures:
const problematic = new Uint8Array([0xFB, 0xFF, 0xFE]);
// Standard: +//+ (breaks in URLs)
// URL-safe: -__- (safe everywhere)
Run your encoding pipeline against it. If the output contains + or / and the output is destined for a URL, you've found a bug before your users did.
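In Python, such a fixture check could be written roughly as follows. The helper is_url_safe and the test function name are hypothetical; the idea is to run whatever encoder your pipeline uses in place of the standard-library calls shown here.

```python
import base64

PROBLEMATIC = bytes([0xFB, 0xFF, 0xFE])

def is_url_safe(encoded: str) -> bool:
    # True when the string has no characters that need escaping in a URL.
    return not any(c in encoded for c in "+/=")

def test_fixture_exercises_both_special_characters() -> None:
    standard = base64.b64encode(PROBLEMATIC).decode("ascii")
    url_safe = base64.urlsafe_b64encode(PROBLEMATIC).rstrip(b"=").decode("ascii")
    assert standard == "+//+"
    assert url_safe == "-__-"
    assert not is_url_safe(standard)
    assert is_url_safe(url_safe)

test_fixture_exercises_both_special_characters()
```

Adding a one-byte and a two-byte input to the fixtures would cover the padding cases as well, since this three-byte value never produces =.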
The Takeaway
Base64 and Base64URL are two variants of the same algorithm, differing in three characters. The difference is invisible in most data, fatal in some, and intermittent in a way that resists easy debugging. The rule is simple: anything that will live in a URL uses Base64URL. Everything else can use standard Base64. The hard part isn't knowing the rule — it's knowing which category your specific use case falls into before you write the code.
JWTs, OAuth tokens, and shared links are URLs in disguise. Treat them accordingly.