Cantor set

Klaus Kähler Holst

Oct 16, 2020 4 min read

The Cantor ternary set is a remarkable subset of the real numbers named after German mathematician Georg Cantor who described the set in 1883. It has the same cardinality as $R$ , yet it has zero Lebesgue measure.

The set can be constructed recursively by first considering the unit interval $C_{0} = [0, 1]$ . In the next step, we divide the set into three equal parts and discard the open middle set. This leads to the new set $C_{1} = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$ . This procedure is repeated on the two remaining subintervals $[0, \frac{1}{3}]$ and $[\frac{2}{3}, 1]$ , and iteratively on all remaining subsets such that $C_{n} = \frac{1}{3} C_{n - 1} \cup (\frac{2}{3} + \frac{1}{3} C_{n - 1})$ . The Cantor set is defined by the limit as $n \to \infty$ , i.e., $C = \cap_{n = 0}^{\infty} C_{n}$ .

The recursion can be illustrated in python in the following way

  import numpy as np

  def transform_interval(x, scale=1.0, translation=0.0):
      return tuple(map(lambda z: z * scale + translation, x))

  def Cantor(n):
      if n==0: return {(0,1)}
      Cleft = set(map(lambda x: transform_interval(x, scale=1/3), Cantor(n-1)))
      Cright = set(map(lambda x: transform_interval(x, translation=2/3), Cleft))
      return Cleft.union(Cright)

  print(Cantor(1))

{(0.6666666666666666, 1.0), (0.0, 0.3333333333333333)}

The following code displays the first 5 iterations of the algorithm and clearly illustrates the self-similarity property in the sets $C_{n}$ .

  def intervalprint(x, n=240):
     sym = n*[' ']
     val = np.linspace(0, 1, num=n)
     for interval in x:
	idx = np.where((val >= interval[0]) & (val <= interval[1]))[0]
	for i in idx:
           sym[i] = u'\u2500'
     st = ''.join(sym)
     print(st)

  x = list(map(Cantor, [0,1,2,3,4,5]))
  for i in range(0, len(x)): intervalprint(x[i])

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────                                                                                ────────────────────────────────────────────────────────────────────────────────
───────────────────────────                           ──────────────────────────                                                                                ──────────────────────────                           ───────────────────────────
─────────         ─────────                           ────────         ─────────                                                                                ─────────         ────────                           ─────────         ─────────
───   ───         ───   ───                           ───   ──         ───   ───                                                                                ───   ───         ──   ───                           ───   ───         ───   ───
─ ─   ─ ─         ─ ─   ─ ─                           ─ ─    ─         ─ ─   ─ ─                                                                                ─ ─   ─ ─         ─    ─ ─                           ─ ─   ─ ─         ─ ─   ─ ─

It is clear that $C$ is bounded (being a subset of the unit interval) and closed since it constructed by removing countably many open intervals from a closed interval. It also has zero measure, meaning that it does not contain any intervals.

To see this, note that in the first step the open middle set $(\frac{1}{3}, \frac{2}{3})$ is removed which has measure 1/3. From the remaining two closed sets, $[0, \frac{1}{3}]$ and $[\frac{2}{3}, 1]$ , the open middle sets are removed in the next iteration: $(\frac{1}{3^{2}}, \frac{2}{3^{2}})$ and $(\frac{2}{3} + \frac{1}{3^{2}}, \frac{2}{3} + \frac{2}{3^{2}})$ , each of measure $\frac{1}{3^{2}}$ . In the $n$ th iteration $2^{n} - 1$ open intervals are removed each of length $\frac{1}{3^{n}} .$ Hence, the Lebesgue measure of the Cantor set is $m (C) = m ([0, 1]) - \sum_{k = 0}^{\infty} \frac{2^{n}}{3^{n + 1}} = 1 - \frac{1}{3} \sum_{k = 1}^{\infty} {(\frac{2}{3})}^{n} = 1 - \frac{1}{3} (1 - \frac{2}{3})^{- 1} = 0.$

In this sense, we can think of the Cantor set just containing “dust”.

It can also be shown that the Cantor set can be represented as $C = {\sum_{i = 1}^{\infty} \frac{x_{n}}{3^{n}} ∣ x_{n} \in 0, 2},$ from which it again follows that the set is uncountable. This can also be seen by considering an arbitrary countable subset $N = {n_{1}, n_{2}, \dots} \subseteq C .$ In the construction of $C$ , we must have that $n_{1}$ belongs to one of the two closed intervals in $C_{1}$ . Hence, there is one interval $I_{1}$ of length $\frac{1}{3}$ such that $n_{1} \notin I_{1}$ . Similarly, $C_{2} \cap I_{1}$ is the union of two new closed intervals of length $\frac{1}{3^{2}}$ , and $n_{2}$ can at most be in one of these two sets. Let $I_{2} \subset I_{1}$ be this interval s.t. $n_{2} \notin I_{2}$ . Continuing in this fashion we construct a series of intervals $C \supset I_{1} \supset I_{2} \supset \dots \supset I_{k} \supset \dots$ , where $I_{k}$ is one of the closed intervals of $C_{n}$ of length $\frac{1}{3^{n}}$ such that $n_{k} \notin I_{k}$ . The limit $I_{\infty} = \cap_{k = 1}^{\infty} I_{k}$ is non-empty (see below), and by construction $N \cap I_{\infty} = \emptyset$ . Hence given any arbitrary countable subset of $C$ we can still find elements in $C$ that does not belong to this subset, and thus there cannot exist a bijection between $N$ and $C$ and we conclude that $C$ is uncountable infinite.

Here we used Cantor’s Intersection Theorem: Let $(X, d)$ be a complete metric space. Let $(I_{k})_{k \in N}$ be a sequence of decreasing non-empty, closed subsets of $X$ then $⋂_{k \in N} I_{k} \neq \emptyset .$

Proof: For each $k \in N$ choose $x_{k} \in I_{k}$ . The diameter of $I_{k}$ is defined as $diam (I_{k}) = sup {𝑑 (𝑥, 𝑦) : 𝑥, 𝑦 \in I_{k}}$ . We only need to consider the case where $diam (I_{k}) \to 0.$ In this case, let $ϵ > 0$ then there exists $N > 0$ such that $d (x, x^{'}) < ϵ$ for all $x, x^{'} \in I_{N}$ . Let $m, n > N$ then $x_{n}, x_{m} \in I_{N}$ since $I_{N} \supset I_{N + 1} \supset \dots \supset I_{m}, I_{n}$ , and thus $d (x_{n}, x_{m}) < ϵ$ . We conclude that $(x_{k})_{k \in N}$ is a Cauchy sequence. Since $X$ is complete $x_{n} \to x$ as $n \to \infty$ for some $x \in X$ . Note that $(x_{n})_{n = N}^{\infty} \subset I_{N}$ and since $I_{N}$ is closed, the limit point also lies in the set $x \in I_{N}$ . As this holds for any $N$ , we conclude that $x \in ⋂_{k \in N} I_{k}$ . This concludes the proof. Note, that in this case the limit point $x$ is in fact the only element. Since if we let $x^{'}$ be another element then $x, x^{'} \in I_{n}$ for all $n \in N$ but $d (x, x^{'}) \leq diam (F_{n}) \to 0$ as $n \to \infty$ . Hence $d (x, x^{'}) = 0$ , and therefore $⋂_{k \in N} I_{k} = {x} .$