Base64 is one of those things every developer uses but few truly understand. You've seen it in JWTs, data URIs, email attachments, and API payloads. But why does it exist? When should you use it? And what are the common pitfalls?
What Base64 Actually Does
Base64 converts binary data into a text-safe representation using 64 characters: A-Z, a-z, 0-9, +, and / (with = for padding).
The encoding takes every 3 bytes of input (24 bits) and splits them into 4 groups of 6 bits. Each 6-bit value maps to one of the 64 characters. This means Base64 output is always ~33% larger than the input.
Input: "Hi" → 01001000 01101001 (2 bytes, 16 bits)
Padded: 01001000 01101001 00000000 (3 bytes, 24 bits)
Split: 010010 000110 100100 000000
Mapped: S G k =
Output: "SGk="
The = padding indicates the last group had fewer than 3 input bytes.
Why Base64 Exists
The fundamental problem: many protocols and formats only handle text characters safely. Binary data (images, files, compressed data) contains bytes that can break text-based systems:
- Null bytes (
0x00) terminate strings in C and many protocols - Control characters (bytes 0-31) can be misinterpreted by terminals and parsers
- High bytes (128-255) break ASCII-only systems
Base64 solves this by representing any binary data using only safe, printable ASCII characters.
Real-World Use Cases
1. Data URIs
Embedding small images directly in HTML or CSS:
<img src="data:image/png;base64,iVBORw0KGgo..." />
.icon { background-image: url(data:image/svg+xml;base64,PHN2Zy...); }
When to use data URIs:
- Icons under 2KB — eliminates an HTTP request
- Critical images that must load with the page
- Email templates (many email clients block external images)
When NOT to use data URIs:
- Images over 10KB — Base64 increases size by 33%, and you lose browser caching
- Repeated images — each occurrence duplicates the full encoded string
- Dynamic content — URLs can be cached, data URIs cannot
2. JWT Tokens
JWTs use a variant called Base64URL where + becomes -, / becomes _, and padding = is stripped. This makes the encoded string safe for URLs without percent-encoding.
Standard Base64: SGVsbG8gV29ybGQ+Pw==
Base64URL: SGVsbG8gV29ybGQ-Pw
When you decode a JWT, you're Base64URL-decoding the header and payload segments, then parsing the resulting JSON.
3. API Payloads
REST APIs that accept binary data (file uploads, images, documents) often use Base64 encoding in JSON:
{
"filename": "report.pdf",
"content": "JVBERi0xLjQKMSAwIG9iago8PAovVHlwZS...",
"content_type": "application/pdf"
}
The alternative is multipart form data, which is more efficient but harder to work with in some API clients.
4. Email Attachments (MIME)
Email protocols are text-based. Binary attachments are Base64-encoded in MIME format:
Content-Type: application/pdf
Content-Transfer-Encoding: base64
JVBERi0xLjQKMSAwIG9iago8PAovVHlwZS...
This is why email attachments make messages ~33% larger than the original file.
5. Configuration Files
Storing binary data in config files (Kubernetes secrets, CI/CD variables):
apiVersion: v1
kind: Secret
data:
tls.crt: LS0tLS1CRUdJTi...
tls.key: LS0tLS1CRUdJTi...
Common Mistakes
Confusing Encoding with Encryption
Base64 is encoding, not encryption. It provides zero security. Anyone can decode a Base64 string instantly. Never use Base64 to "hide" sensitive data.
Base64 encode "password123" → "cGFzc3dvcmQxMjM="
Base64 decode "cGFzc3dvcmQxMjM=" → "password123"
If you need to protect data, use actual encryption (AES, RSA) and then Base64-encode the encrypted output if you need text representation.
Double Encoding
A surprisingly common bug: encoding something that's already Base64, or URL-encoding a Base64 string that's already URL-safe.
Original: "Hello"
Base64: "SGVsbG8="
Double-encoded: "U0dWc2JHOD0=" ← Now you need to decode twice
If your decoded output looks like Base64, you probably double-encoded somewhere.
Not Handling Padding Correctly
Some systems strip the = padding, others require it. When interoperating between systems:
- Standard Base64: Requires padding (
=or==) - Base64URL (JWTs): Strips padding
- Many libraries: Accept both with and without padding on decode
If decoding fails, try adding padding: append = until the string length is a multiple of 4.
Using Base64 for Large Files
Base64 increases data size by 33%. For a 10MB file, that's 13.3MB encoded. Transferring, storing, and processing that extra data adds up:
- Network bandwidth wasted
- JSON parsing slows down with large strings
- Memory usage spikes during encode/decode
For files over a few hundred KB, use proper binary transfer (multipart upload, presigned URLs, or streaming).
Base64 Variants
| Variant | Characters | Padding | Used In |
|---|---|---|---|
| Standard | A-Z, a-z, 0-9, +, / | = | Email (MIME), PEM certificates |
| URL-safe | A-Z, a-z, 0-9, -, _ | Optional | JWTs, URLs, filenames |
| XML-safe | A-Z, a-z, 0-9, ., - | N/A | XML identifiers |
Base32 and Base58
Base32 uses only uppercase letters and digits 2-7. It's case-insensitive and avoids confusing characters. Used in TOTP (Google Authenticator) codes.
Base58 (used in Bitcoin addresses) removes visually ambiguous characters: 0, O, I, l, +, /. The goal is reducing transcription errors when humans type encoded values.
Performance: Browser vs Server
Encoding and decoding Base64 in JavaScript is fast for reasonable data sizes:
btoa()/atob(): Built-in browser functions. Handle strings only (not binary).btoafails on characters outside Latin1.Buffer.from(str, 'base64'): Node.js. Handles binary data correctly.TextEncoder+ manual conversion: For binary data in the browser.
For files under 1MB, client-side Base64 encoding completes in milliseconds. For larger files, consider using Web Workers to avoid blocking the UI thread.
The Practical Takeaway
Base64 is a transport encoding — it makes binary data safe for text-based systems. Use it when you need to embed binary data in JSON, HTML, CSS, or config files. Avoid it when you have access to binary transport mechanisms.
And remember: it's encoding, not encryption. If you can see the Base64 string, you can see the original data.