Base64 Encoding Explained: When, Why, and How Developers Use It

Base64 is one of those things every developer uses but few truly understand. You've seen it in JWTs, data URIs, email attachments, and API payloads. But why does it exist? When should you use it? And what are the common pitfalls?

What Base64 Actually Does

Base64 converts binary data into a text-safe representation using 64 characters: A-Z, a-z, 0-9, +, and / (with = for padding).

The encoding takes every 3 bytes of input (24 bits) and splits them into 4 groups of 6 bits. Each 6-bit value maps to one of the 64 characters. This means Base64 output is always ~33% larger than the input.

Input:  "Hi"  →  01001000 01101001 (2 bytes, 16 bits)
Padded: 01001000 01101001 00000000 (3 bytes, 24 bits)
Split:  010010 000110 100100 000000
Mapped: S      G      k      =
Output: "SGk="

The = padding indicates the last group had fewer than 3 input bytes.

Why Base64 Exists

The fundamental problem: many protocols and formats only handle text characters safely. Binary data (images, files, compressed data) contains bytes that can break text-based systems:

Null bytes (0x00) terminate strings in C and many protocols
Control characters (bytes 0-31) can be misinterpreted by terminals and parsers
High bytes (128-255) break ASCII-only systems

Base64 solves this by representing any binary data using only safe, printable ASCII characters.

Real-World Use Cases

1. Data URIs

Embedding small images directly in HTML or CSS:

<img src="data:image/png;base64,iVBORw0KGgo..." />

.icon { background-image: url(data:image/svg+xml;base64,PHN2Zy...); }

When to use data URIs:

Icons under 2KB — eliminates an HTTP request
Critical images that must load with the page
Email templates (many email clients block external images)

When NOT to use data URIs:

Images over 10KB — Base64 increases size by 33%, and you lose browser caching
Repeated images — each occurrence duplicates the full encoded string
Dynamic content — URLs can be cached, data URIs cannot

2. JWT Tokens

JWTs use a variant called Base64URL where + becomes -, / becomes _, and padding = is stripped. This makes the encoded string safe for URLs without percent-encoding.

Standard Base64: SGVsbG8gV29ybGQ+Pw==
Base64URL:       SGVsbG8gV29ybGQ-Pw

When you decode a JWT, you're Base64URL-decoding the header and payload segments, then parsing the resulting JSON.

3. API Payloads

REST APIs that accept binary data (file uploads, images, documents) often use Base64 encoding in JSON:

{
  "filename": "report.pdf",
  "content": "JVBERi0xLjQKMSAwIG9iago8PAovVHlwZS...",
  "content_type": "application/pdf"
}

The alternative is multipart form data, which is more efficient but harder to work with in some API clients.

4. Email Attachments (MIME)

Email protocols are text-based. Binary attachments are Base64-encoded in MIME format:

Content-Type: application/pdf
Content-Transfer-Encoding: base64

JVBERi0xLjQKMSAwIG9iago8PAovVHlwZS...

This is why email attachments make messages ~33% larger than the original file.

5. Configuration Files

Storing binary data in config files (Kubernetes secrets, CI/CD variables):

apiVersion: v1
kind: Secret
data:
  tls.crt: LS0tLS1CRUdJTi...
  tls.key: LS0tLS1CRUdJTi...

Common Mistakes

Confusing Encoding with Encryption

Base64 is encoding, not encryption. It provides zero security. Anyone can decode a Base64 string instantly. Never use Base64 to "hide" sensitive data.

Base64 encode "password123" → "cGFzc3dvcmQxMjM="
Base64 decode "cGFzc3dvcmQxMjM=" → "password123"

If you need to protect data, use actual encryption (AES, RSA) and then Base64-encode the encrypted output if you need text representation.

Double Encoding

A surprisingly common bug: encoding something that's already Base64, or URL-encoding a Base64 string that's already URL-safe.

Original:        "Hello"
Base64:          "SGVsbG8="
Double-encoded:  "U0dWc2JHOD0="  ← Now you need to decode twice

If your decoded output looks like Base64, you probably double-encoded somewhere.

Not Handling Padding Correctly

Some systems strip the = padding, others require it. When interoperating between systems:

Standard Base64: Requires padding (= or ==)
Base64URL (JWTs): Strips padding
Many libraries: Accept both with and without padding on decode

If decoding fails, try adding padding: append = until the string length is a multiple of 4.

Using Base64 for Large Files

Base64 increases data size by 33%. For a 10MB file, that's 13.3MB encoded. Transferring, storing, and processing that extra data adds up:

Network bandwidth wasted
JSON parsing slows down with large strings
Memory usage spikes during encode/decode

For files over a few hundred KB, use proper binary transfer (multipart upload, presigned URLs, or streaming).

Base64 Variants

Variant	Characters	Padding	Used In
Standard	A-Z, a-z, 0-9, +, /	=	Email (MIME), PEM certificates
URL-safe	A-Z, a-z, 0-9, -, _	Optional	JWTs, URLs, filenames
XML-safe	A-Z, a-z, 0-9, ., -	N/A	XML identifiers

Base32 and Base58

Base32 uses only uppercase letters and digits 2-7. It's case-insensitive and avoids confusing characters. Used in TOTP (Google Authenticator) codes.

Base58 (used in Bitcoin addresses) removes visually ambiguous characters: 0, O, I, l, +, /. The goal is reducing transcription errors when humans type encoded values.

Performance: Browser vs Server

Encoding and decoding Base64 in JavaScript is fast for reasonable data sizes:

btoa() / atob(): Built-in browser functions. Handle strings only (not binary). btoa fails on characters outside Latin1.
Buffer.from(str, 'base64'): Node.js. Handles binary data correctly.
TextEncoder + manual conversion: For binary data in the browser.

For files under 1MB, client-side Base64 encoding completes in milliseconds. For larger files, consider using Web Workers to avoid blocking the UI thread.

The Practical Takeaway

Base64 is a transport encoding — it makes binary data safe for text-based systems. Use it when you need to embed binary data in JSON, HTML, CSS, or config files. Avoid it when you have access to binary transport mechanisms.

And remember: it's encoding, not encryption. If you can see the Base64 string, you can see the original data.