simons blog

The Bijection Between Natural Numbers and Binary Strings

Computers fundamentally operate on bits and sequences of bits. It is therefore useful to study the associated mathematical structure of the concept of sequences of bits. In this blogpost we give a brief introduction to the mathematical concept of finite binary strings and it's relation to the natural numbers.

We write a binary string of length n as x=x1x2...xn where xi{0,1}.

We define the concatenation of two binary string x=x1...xn and y=y1...ym as xy=x1...xny1...ym.

Let B*={ϵ,0,1,00,01,11,000,...} the set of all finite binary strings with ϵ the empty string. We call l the length of a binary string, i.e. l(00)=2.

Let B:N+B* be the function that takes a number and returns its binary representation, i.e. B(4)=100.

B by itself won't give us a bijection because we can't match binary strings like ϵ or 00.

To get a bijection we have to use the following function from N0 to B*

n:=B(n+1)2:l where l=log2(n+1)+1

This function simply adds one to the number n, than represents n as a binary and takes the substring from the second character on. The first character is always a one and we drop it.

For example we have 3=B(4)2:3=00 or 0=B(1)2:1=ϵ where we use the convention that a slice xi:j with i>j is empty.

We now proof that the mapping above is indeed bijective:

Injective: Suppose x=y, this would mean B(x+1)2:lx=B(y+1)2:ly. This can only be the case when l(x)=l(y). But than equality would mean that B(x+1)=B(y+1) because both have the same leading one as first char. That implies x+1=y+1 because the binary representation of a number is unique. Therefore we can conclude that n is injective.

Surjective: Let x be an arbitrary binary string. Concatenate this string with 1 from the left side and call this string y. The natural number which fulfils B(n+1)=y is the number that fulfils n=x.

More formally we may define the inverse as

x1=b(1x)1 where b(x)=i=1l(x)2l(x)i[[xi=1]]

b:B*N+ is the natural mapping that interprets the binary string 1x as a natural number.

Example:

x=101, l(101)=3

b(101)=231·1+232·0+233·1=5.

Note that 0b(x) because the min value is obtained iff all xi are 0

Note that b(x)2l(x)i=1l(x)2i=2l(x)(12l(x))=2l(x)1 because the maximum value is obtained iff all xi are 1

We have b(1x)=2l(x)+b(x) and therefore 2l(x)1b(1x)1. Furthermore we have b(1x)12l(x)+2l(x)11=2l(x)+12.

This gives us an effective bound for x1:

2l(x)1x12l(x)+12

This shows that the range in which a number associated with a binary string is determined solely by the length of the binary string.

Note that the above bijection is often used in literature such that we write l(n) for l(n) and x+1 instead of x1+1 if clear from the context. It also gives us a way to enumerate all binary strings.