What Is Base64 Encoding?
Base64 is a binary-to-text encoding scheme that converts binary data into a string of printable ASCII characters. It takes any sequence of bytes — an image, a PDF, a chunk of raw binary — and represents it using only 64 safe characters: A–Z, a–z, 0–9, +, and /, with = used for padding.
The purpose of Base64 is simple: many protocols and data formats (such as email, HTTP headers, and JSON) were designed to carry text, not raw binary. Base64 bridges that gap by encoding binary data into a text-safe format that can pass through text-only channels without corruption.
Base64 is an encoding, not encryption. It does not protect or secure data in any way. Anyone can decode a Base64 string instantly. Never use Base64 as a security measure — it provides zero confidentiality.
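As a small illustration of how little protection Base64 offers, the sketch below decodes the credentials from the HTTP Basic Authentication header shown later in this article. It uses only Python's standard `base64` module; the variable names are illustrative.

```python
import base64

# Header value taken from the Basic Authentication example in this article.
header = "Authorization: Basic dXNlcjpwYXNzd29yZA=="

# Everything after "Basic " is plain Base64; no key or secret is involved.
token = header.split("Basic ", 1)[1]
credentials = base64.b64decode(token).decode("utf-8")

print(credentials)  # user:password
```

Anyone who can read the header can run the same two lines, which is why Basic Authentication is only reasonable over an encrypted connection such as HTTPS.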
How Base64 Encoding Works
Base64 encoding converts every 3 bytes (24 bits) of binary data into 4 characters of ASCII text. Here is the step-by-step process:
Convert Input to Binary
Take the input data and represent it as a stream of bytes. For example, the ASCII string Hi! becomes three bytes: 72, 105, 33 — or in binary: 01001000 01101001 00100001.
Split into 6-Bit Groups
Concatenate all the bits and split them into groups of 6 (instead of the usual 8). Our 24 bits become four 6-bit groups:
Binary:  01001000 01101001 00100001
6-bit:   010010 000110 100100 100001
Decimal: 18     6      36     33
Map to Base64 Characters
Each 6-bit value (0–63) maps to a character in the Base64 alphabet:
Index: 0-25 → A-Z
Index: 26-51 → a-z
Index: 52-61 → 0-9
Index: 62 → +
Index: 63 → /
Our values: 18=S, 6=G, 36=k, 33=h
Result: "SGkh"
Handle Padding
Base64 processes 3 bytes at a time. If the input length is not a multiple of 3, padding characters (=) are added to the output:
| Input Length | Leftover Bytes | Padding | Example |
|---|---|---|---|
| Multiple of 3 | 0 | None | Hi! → SGkh |
| Multiple of 3, plus 1 | 1 | == | H → SA== |
| Multiple of 3, plus 2 | 2 | = | Hi → SGk= |
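The steps above can be sketched as a short hand-written encoder. This is a teaching sketch rather than something to use in production (the standard library is faster and better tested); the names `ALPHABET` and `encode_base64` are made up for this example.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_base64(data: bytes) -> str:
    """Encode bytes by hand, following the steps described above."""
    # Write every byte as 8 bits and concatenate them into one bit string.
    bits = "".join(f"{byte:08b}" for byte in data)

    # Pad the bit string with zeros so it splits evenly into 6-bit groups.
    bits += "0" * (-len(bits) % 6)

    # Take 6 bits at a time and look each value (0-63) up in the alphabet.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]

    # Pad the output with "=" so its length is a multiple of 4 characters.
    chars.append("=" * (-len(chars) % 4))
    return "".join(chars)

# Compare against the standard library for the three cases in the table.
for sample in (b"Hi!", b"H", b"Hi"):
    assert encode_base64(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, "->", encode_base64(sample))
```

The zero bits added before grouping are what make the padded cases work: a single leftover byte supplies only 8 of the 12 bits needed for two characters, so 4 zero bits fill the rest.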
Encoding and Decoding in Practice
Every major programming language and platform provides built-in Base64 functions. Here are examples in the most common environments:
JavaScript (Browser & Node.js)
// Encoding a string
const encoded = btoa("Hello, World!");
console.log(encoded); // "SGVsbG8sIFdvcmxkIQ=="
// Decoding back
const decoded = atob("SGVsbG8sIFdvcmxkIQ==");
console.log(decoded); // "Hello, World!"
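// For binary data / Unicode (modern approach)
// btoa() only accepts Latin-1 characters, so encode Unicode text to UTF-8 bytes first
const bytes = new TextEncoder().encode("Hello 🌍");
// Note: spreading a very large array can exceed the engine's argument limit
const base64 = btoa(String.fromCharCode(...bytes));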
// Node.js Buffer approach
const buf = Buffer.from("Hello, World!");
const b64 = buf.toString("base64");
const original = Buffer.from(b64, "base64").toString("utf-8");
Python
import base64
# Encode
encoded = base64.b64encode(b"Hello, World!")
print(encoded) # b'SGVsbG8sIFdvcmxkIQ=='
# Decode
decoded = base64.b64decode(encoded)
print(decoded) # b'Hello, World!'
# Encode a file
with open("image.png", "rb") as f:
    file_b64 = base64.b64encode(f.read()).decode("ascii")
Command Line
# Linux / macOS
echo -n "Hello, World!" | base64 # Encode
echo "SGVsbG8sIFdvcmxkIQ==" | base64 -d # Decode
# Encode a file
base64 image.png > image_b64.txt
# Windows PowerShell
[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("Hello"))
[Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("SGVsbG8="))
Common Use Cases for Base64
Base64 encoding appears in more places than you might expect. Here are the most important real-world applications:
Data URIs (Inline Images in CSS/HTML)
Data URIs let you embed small files directly in HTML or CSS, eliminating an HTTP request:
/* CSS background with embedded image */
.icon {
  background-image: url("data:image/png;base64,iVBORw0KGgo...");
}
<!-- HTML inline image -->
<img src="data:image/svg+xml;base64,PHN2ZyB4bWxu..." alt="icon" />
Data URIs are best for very small assets, such as icons under 2-4 KB. For larger images, the 33% size overhead of Base64 and the loss of browser caching make regular image files a better choice. Also consider using inline SVG directly instead of Base64-encoding it.
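A data URI is simply the MIME type, the marker `;base64,`, and the encoded bytes joined together. The sketch below shows one way to build one in Python; the helper name `to_data_uri` and the tiny SVG are stand-ins invented for this example.

```python
import base64

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a Base64 data URI from raw bytes and a MIME type."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# A tiny SVG used as stand-in content (hypothetical asset).
svg = b'<svg xmlns="http://www.w3.org/2000/svg" width="8" height="8"/>'
uri = to_data_uri(svg, "image/svg+xml")

print(uri[:38])
print(len(svg), "bytes raw,", len(uri), "characters as a data URI")
```

Printing both lengths makes the trade-off visible: the URI is always longer than the file it replaces, so the saved HTTP request only pays off for small assets.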
Email Attachments (MIME)
Email was designed around 7-bit text, and classic SMTP cannot reliably carry raw binary. MIME typically uses Base64 to embed attachments (PDFs, images, spreadsheets) inside email messages. Nearly every binary attachment you have sent by email was Base64-encoded behind the scenes.
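Python's standard `email` package makes this visible. In the sketch below, attaching raw bytes to a message results in a part whose `Content-Transfer-Encoding` header is `base64`. The subject, filename, and payload are placeholders chosen for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("See the attached file.")

# Stand-in binary payload; a real program would read the bytes from a file.
payload = bytes(range(256))
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="report.bin",
)

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])  # base64
print(attachment.get_content() == payload)      # True
```

Calling `msg.as_string()` shows the encoded attachment as lines of Base64 text inside the message body.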
API Payloads (JSON)
JSON does not support binary data natively. When APIs need to transmit files, thumbnails, or binary blobs within JSON payloads, Base64 encoding is the standard approach:
{
  "filename": "report.pdf",
  "content_type": "application/pdf",
  "data": "JVBERi0xLjQKMSAwIG9iago8PC..."
}
Authentication Headers
HTTP Basic Authentication encodes the username and password as a Base64 string in the Authorization header:
// "user:password" → Base64
Authorization: Basic dXNlcjpwYXNzd29yZA==
Other Use Cases
- JWTs (JSON Web Tokens) — The header, payload, and signature segments are Base64URL-encoded
- Cryptographic data — Keys, certificates, and signatures are often represented in Base64 (PEM format)
- Binary data in XML — Similar to JSON, XML uses Base64 for binary content
- URL-safe encoding — Base64URL replaces + with - and / with _ for safe use in URLs
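The URL-safe variant mentioned in the list above is available in Python's standard library as `base64.urlsafe_b64encode`. The sketch below compares the two alphabets and adds a pair of helpers, with the hypothetical names `b64url_encode` and `b64url_decode`, that drop and restore the `=` padding the way JWT segments do.

```python
import base64

# Bytes chosen so the standard encoding contains both "+" and "/".
raw = bytes([0xFB, 0xEF, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # ++//
print(url_safe)  # --__

def b64url_encode(data: bytes) -> str:
    """Base64URL without trailing "=" padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(b64url_encode(b"H"))  # SA
```

Restoring the padding before decoding matters because Python's decoder rejects input whose length is not a multiple of 4.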
The 33% Size Overhead
Base64 encoding is not free. Because it converts 3 bytes into 4 characters, the encoded output is always approximately 33% larger than the original binary data. This is an inherent mathematical property of the encoding — you cannot avoid it.
| Original Size | Base64 Size | Overhead |
|---|---|---|
| 1 KB | ~1.33 KB | +0.33 KB |
| 10 KB | ~13.3 KB | +3.3 KB |
| 100 KB | ~133 KB | +33 KB |
| 1 MB | ~1.33 MB | +0.33 MB |
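The figures in the table follow from a simple formula: padded Base64 output is 4 characters for every 3 input bytes, rounded up. The sketch below reproduces them, assuming 1 KB means 1,024 bytes; `base64_length` is a name made up for this example. It does not count the line breaks that some formats (such as MIME) insert into the encoded text.

```python
import base64

def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input: 4 * ceil(n / 3)."""
    return 4 * ((n_bytes + 2) // 3)

# Sanity check against the standard library for a range of small inputs.
for n in range(50):
    assert base64_length(n) == len(base64.b64encode(bytes(n)))

# Sizes from the table above, assuming 1 KB = 1,024 bytes.
for size in (1_024, 10_240, 102_400, 1_048_576):
    encoded = base64_length(size)
    overhead = (encoded - size) / size
    print(f"{size:>9} bytes -> {encoded:>9} characters (+{overhead:.1%})")
```

For very small inputs the relative overhead is higher than 33% because of padding: a single byte becomes four characters.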
This overhead matters when you are considering using data URIs for images or Base64 for large API payloads. For small assets (a few KB), the overhead is negligible. For large files, it becomes significant and you should use binary transfer methods instead.
Base64-encoded data URIs also cannot be cached on their own. If you inline a 50 KB image as Base64 in your CSS file, roughly 67 KB of Base64 text is downloaded every time the CSS is fetched, because the image cannot be cached independently of the stylesheet. Use regular image files for anything above a few kilobytes.
When NOT to Use Base64
Base64 is a tool with specific use cases. Here is when you should avoid it:
- Large files — Do not Base64-encode images, videos, or documents larger than a few KB for web embedding. Use proper file serving with caching headers instead.
- Security — Base64 is not encryption. Do not use it to "hide" sensitive data like passwords, tokens, or personal information. Anyone can decode it in seconds.
- Database storage — Storing Base64-encoded files in a database is almost always worse than storing binary blobs or using file system / object storage (like S3).
- Compression — Base64 output usually compresses worse than the original bytes, because regrouping the data into 6-bit units obscures byte-aligned repetition. Compress first, then encode if needed.
- Performance-critical paths — Encoding and decoding Base64 consumes CPU cycles. In high-throughput systems, prefer binary protocols (gRPC, WebSocket binary frames).
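The "compress first, then encode" advice in the list above can be tried with Python's standard `zlib` and `base64` modules. The sketch below runs both orderings on a repetitive stand-in payload and prints the resulting sizes. Exact numbers depend on the data, so none are claimed here.

```python
import base64
import zlib

# Stand-in payload: repetitive text, which compresses well.
data = b"Base64 turns every 3 bytes into 4 characters. " * 200

# Recommended order from the list above: compress, then encode.
compressed_then_encoded = base64.b64encode(zlib.compress(data))

# Reverse order, for comparison: encode, then compress.
encoded_then_compressed = zlib.compress(base64.b64encode(data))

print("original:              ", len(data))
print("compress, then encode: ", len(compressed_then_encoded))
print("encode, then compress: ", len(encoded_then_compressed))

# Decoding reverses the steps: Base64-decode first, then decompress.
restored = zlib.decompress(base64.b64decode(compressed_then_encoded))
assert restored == data
```

Whatever the sizes, the second ordering produces binary output again, so it could not pass through a text-only channel without another round of encoding.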
🎯 Key Takeaways
- Base64 converts binary data to ASCII text using a 64-character alphabet, processing 3 bytes into 4 characters
- It exists to safely transmit binary data through text-only channels (email, JSON, HTTP headers)
- The encoding adds a 33% size overhead — keep this in mind for large payloads
- Common uses include data URIs, email attachments, API payloads, JWTs, and HTTP Basic Auth
- Base64 is NOT encryption — it provides zero security and anyone can decode it instantly
- Avoid Base64 for large files, database storage, or anywhere binary transfer is available
- Base64URL is a URL-safe variant that replaces + and / with - and _
Conclusion
Base64 encoding is a fundamental piece of web infrastructure that most developers interact with regularly — often without realizing it. Understanding how it works, when to use it, and when to avoid it will help you make better architectural decisions.
Use Base64 when you need to embed small binary assets in text-based formats. Avoid it when binary transfer is available, when performance matters, or when you are tempted to treat it as a security measure. It is an encoding — a way to represent data — and nothing more. Used appropriately, it is an essential tool in your developer toolkit.