Eight Bits to Six Bits

Base-64 encoding takes a sequence of 8-bit bytes, breaks the sequence into 6-bit pieces, and maps each 6-bit piece to one of the 64 characters that make up the base-64 alphabet. These 64 output characters are common and safe to place in HTTP header fields. They comprise the uppercase and lowercase letters, the digits, +, and /. The special character = is also used, for padding. The base-64 alphabet is shown in Table E-1.

Note that because base-64 encoding uses an 8-bit character to represent each 6 bits of information, every 3 bytes of input become 4 characters of output, so base-64 encoded strings are about 33% larger than the original values.
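The size overhead can be checked with a little arithmetic. The sketch below is illustrative only: the helper name encoded_length is hypothetical, and it assumes the padded form of base 64 (output rounded up to a multiple of four characters) with no line breaks inserted into the output.

```python
import base64


def encoded_length(n: int) -> int:
    """Length of the padded base-64 output for n input bytes.

    Assumes padded output with no line wrapping: every group of up to
    3 input bytes becomes exactly 4 output characters.
    """
    return 4 * ((n + 2) // 3)


# Cross-check the formula against Python's standard base64 module.
for n in (0, 1, 2, 3, 300, 3000):
    assert len(base64.b64encode(bytes(n))) == encoded_length(n)

# 300 input bytes become 400 output characters, a growth of one third.
print(encoded_length(300))
```

For inputs whose length is not a multiple of three, the padding makes the overhead slightly higher than 33%.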

Table E-1. Base-64 alphabet

 0  A     8  I    16  Q    24  Y    32  g    40  o    48  w    56  4
 1  B     9  J    17  R    25  Z    33  h    41  p    49  x    57  5
 2  C    10  K    18  S    26  a    34  i    42  q    50  y    58  6
 3  D    11  L    19  T    27  b    35  j    43  r    51  z    59  7
 4  E    12  M    20  U    28  c    36  k    44  s    52  0    60  8
 5  F    13  N    21  V    29  d    37  l    45  t    53  1    61  9
 6  G    14  O    22  W    30  e    38  m    46  u    54  2    62  +
 7  H    15  P    23  X    31  f    39  n    47  v    55  3    63  /

Figure E-1 shows a simple example of base-64 encoding. Here, the three-character input value “Ow!” is base 64-encoded, resulting in the four-character base 64-encoded value “T3ch”. It works like this:

  1. The string “Ow!” is broken into three 8-bit bytes (0x4F, 0x77, 0x21).

  2. The 3 bytes create the 24-bit binary value 010011110111011100100001.

  3. These bits are segmented into the 6-bit sequences 010011, 110111, 011100, 100001.

  4. Each of these ...
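The steps above can be traced in a few lines of Python. This is a minimal sketch, not a full encoder: the function name encode_triplet is hypothetical, it handles exactly three input bytes (so no padding is needed), and it assumes that the final step looks up each 6-bit value in the alphabet of Table E-1.

```python
import base64

# The alphabet from Table E-1: index 0 is "A", index 63 is "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_triplet(data: bytes) -> str:
    """Base-64 encode exactly three bytes, following the listed steps."""
    if len(data) != 3:
        raise ValueError("this sketch handles exactly 3 bytes")

    # Steps 1-2: join the three 8-bit bytes into one 24-bit value.
    bits = format(int.from_bytes(data, "big"), "024b")

    # Step 3: cut the 24 bits into four 6-bit sequences.
    groups = [bits[i:i + 6] for i in range(0, 24, 6)]

    # Assumed final step: map each 6-bit value to its alphabet character.
    return "".join(ALPHABET[int(group, 2)] for group in groups)


print(encode_triplet(b"Ow!"))  # T3ch

# The standard library encoder agrees with the hand-rolled version.
assert encode_triplet(b"Ow!") == base64.b64encode(b"Ow!").decode("ascii")
```

Working with a bit string keeps the code close to the prose description; a production encoder would normally use shifts and masks instead, and would also handle inputs whose length is not a multiple of three.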
