Cauchy Sequences

Definition: Cauchy sequence

Let (x_n) be a sequence of real numbers.  We say that (x_n) is a Cauchy sequence if it has the following property:

\forall \varepsilon > 0, \exists N \in {\mathbb N} such that n,m \ge N \Longrightarrow |x_m - x_n| < \varepsilon

The first thing to notice is that this definition looks remarkably similar to the definition of convergence.  However, in this case we’re not looking at a sequence getting closer and closer to a given number.  Now we’re looking at a sequence whose terms are all getting closer and closer to each other.

So, let’s give a simple example of a Cauchy sequence.  Consider x_n = \frac{1}{n}.  We can quite easily prove that this sequence is Cauchy as follows:

Fix \varepsilon > 0.  Then, choose N \in {\mathbb N} such that N > \frac{2}{\varepsilon }.  Now, let n, m \ge N, so that \frac{1}{n} \le \frac{1}{N} < \frac{\varepsilon }{2} and likewise \frac{1}{m} < \frac{\varepsilon }{2}.

Then, |x_m - x_n| \le \frac{1}{m} + \frac{1}{n} (by the triangle inequality)

< \frac{\varepsilon }{2}+\frac{\varepsilon }{2}

= \varepsilon.

And so, \frac{1}{n} is a Cauchy sequence by definition.
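
If you’d like to see the definition in action, the following short Python sketch spot-checks the estimate above for one particular value of \varepsilon.  The choice of \varepsilon and the finite range of indices are mine; a finite check like this illustrates the proof rather than replaces it.

    import math

    # Spot-check the Cauchy estimate for x_n = 1/n with one particular epsilon.
    eps = 0.01
    N = math.floor(2 / eps) + 1          # any natural number N > 2/eps will do

    # Check |x_m - x_n| < eps over a finite window of indices n, m >= N.
    worst = max(abs(1.0 / m - 1.0 / n)
                for n in range(N, N + 500)
                for m in range(N, N + 500))
    print(N, worst, worst < eps)         # the largest gap found is indeed below eps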

But now let’s think a bit more deeply about what it means for a sequence to be Cauchy.  The definition requires that, for every \varepsilon > 0, there must be some point beyond which all remaining terms of the sequence are within \varepsilon of one another, the crucial word being ‘all’.  So, with that in mind, consider the following question:

Let x_n be a sequence with the property that (x_{n+1} - x_n) \to 0 as n \to \infty.  Does it follow that x_n must be Cauchy?

So, putting that property into English, x_n is such that the difference between consecutive terms is tending to zero.  This property clearly holds for \frac{1}{n}, a sequence already shown to be Cauchy.  But are all such sequences necessarily Cauchy?  Contrary to what intuition might suggest, the answer is no.  The Cauchy property requires something stronger, namely that the difference between all sufficiently late terms can be made arbitrarily small, and it is in fact possible to find sequences which satisfy the property above and yet are not Cauchy.  I’m making this point clear now because a common mistake that some students make is to try to show that a sequence is Cauchy by showing that (x_{n+1} - x_n) \to 0, which is not sufficient.

So, for an example of such a sequence, consider x_n = \sqrt{n}.

We already know that \sqrt{n+1} - \sqrt{n} tends to 0, and thus that (x_{n+1} - x_n) tends to 0.  However, we can show that \sqrt{n} is not Cauchy, as follows:

Assume for contradiction that \sqrt{n} is Cauchy.  Then, applying the definition with \varepsilon = 1, there exists N \in {\mathbb N} such that n,m \ge N \Longrightarrow |x_m - x_n| < 1.  But now consider n = N and m = 4N.  Then,

|x_m - x_n| = \sqrt{4N} - \sqrt{N}

= \sqrt{N} (since \sqrt{4N} = 2\sqrt{N})

\ge 1 (since N \ge 1).

And we have our desired contradiction.  Even though the difference between consecutive terms of the sequence is tending to 0, this isn’t enough to make the sequence Cauchy.
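
The contrast between the two properties is easy to see numerically.  In the Python sketch below (an illustration only, with the sample values of N chosen by me), the consecutive gaps of \sqrt{n} shrink towards 0, while the gap between x_N and x_{4N}, which is exactly \sqrt{N}, grows without bound.

    import math

    for N in (10, 100, 1000, 10000):
        consecutive_gap = math.sqrt(N + 1) - math.sqrt(N)   # tends to 0
        distant_gap = math.sqrt(4 * N) - math.sqrt(N)       # equals sqrt(N), so it grows
        print(N, consecutive_gap, distant_gap)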

So, that having been said, what does make a sequence Cauchy?  You may already have wondered if there is a link between convergent sequences and Cauchy sequences.  After all, if a sequence is converging to some limit, so that its terms are getting closer and closer to some fixed real number, then it should surely follow that those terms must also be getting closer and closer to each other.  Therefore, it is reasonable to hypothesise that every convergent sequence must be a Cauchy sequence, and it turns out that this hypothesis is very easy to prove.

Theorem: All convergent sequences are Cauchy

Let x_n be convergent.  Then x_n is Cauchy.

Proof

Let x_n be an arbitrary convergent sequence with limit L \in {\mathbb R}.  Fix \varepsilon > 0.

Then, by definition of convergence, there exists N \in {\mathbb N} such that n \ge N \Longrightarrow |x_n-L|<\frac{\varepsilon }{2}.

Now, let m, n \in {\mathbb N} with m, n \ge N.  Then, |x_m-x_n| = |x_m-L+L-x_n|

\le |x_m-L| + |x_n-L| < \frac{\varepsilon }{2} + \frac{\varepsilon }{2}

= \varepsilon .

And so, x_n is Cauchy by definition.

\Box .
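
Here is an informal numerical illustration of the \frac{\varepsilon }{2} trick in this proof.  The sequence, the value of \varepsilon and the finite range of indices below are all choices of mine made purely for illustration: we pick an N that works for convergence with \frac{\varepsilon }{2}, and check that the very same N witnesses the Cauchy condition.

    eps = 0.01
    L = 2.0

    def x(n):
        # An example sequence converging to L = 2.
        return 2 + (-1) ** n / n

    # Here |x_n - L| = 1/n, so any N > 2/eps gives |x_n - L| < eps/2 for n >= N.
    N = int(2 / eps) + 1

    # Finite spot-check that the same N witnesses the Cauchy condition.
    ok = all(abs(x(m) - x(n)) < eps
             for n in range(N, N + 300)
             for m in range(N, N + 300))
    print(N, ok)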

And with one stroke, this result gives us access to a vast range of Cauchy sequences, since we now know that every convergent sequence we have encountered is also Cauchy.  But this does raise a fairly significant question; namely, can we find a Cauchy sequence which isn’t convergent?

The answer is no.  All Cauchy sequences of real numbers converge.  Together with the result I’ve just proved, this tells us that convergent sequences and Cauchy sequences (in the context of the real numbers) are one and the same.  Proving this, however, will take a little more work!

We start with an intermediate step: that Cauchy sequences are bounded.

Theorem: Boundedness of Cauchy sequences

Let x_n be a Cauchy sequence.  Then x_n is bounded.

Proof

By definition (applied with \varepsilon = 1), there exists N \in {\mathbb N} such that m, n \ge N \Longrightarrow |x_m - x_n| < 1.

Now, let n \ge N.  Then |x_n| \le |x_n - x_N| + |x_N|

< 1 + |x_N|.

Then, define M = max\{ |x_1|, |x_2|, ... , |x_{N-1}|, 1 + |x_N| \}.

Then by construction, |x_n| \le M for all n \in {\mathbb N}.

\Box .
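
The bound M in this proof is completely explicit, so it is easy to compute for a concrete Cauchy sequence.  The Python sketch below does this for x_n = \frac{100}{n}, a sequence I’ve chosen because its early terms are large; the value N = 101 is simply one value that works for this particular sequence with \varepsilon = 1.

    def x(n):
        # An example Cauchy sequence with large early terms.
        return 100.0 / n

    # For eps = 1, N = 101 works here, since |x_m - x_n| <= 100/N < 1 for m, n >= N.
    N = 101

    # The bound from the proof: M = max(|x_1|, ..., |x_{N-1}|, 1 + |x_N|).
    M = max([abs(x(k)) for k in range(1, N)] + [1 + abs(x(N))])
    print(M)    # 100.0, and indeed |x_n| <= M for every n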

Now, you might already have foreseen one particular difficulty with attempting to prove that all Cauchy sequences converge.  If we are to prove that a given Cauchy sequence converges, then we must be able to produce some number that the sequence converges to.  How on earth are we to find such a number?  Thankfully, the Bolzano-Weierstrass Theorem provides a way, and the result we have just proved enables us to invoke it!

Theorem:  Convergence of Cauchy sequences

Let x_n be a Cauchy sequence of real numbers.  Then x_n converges.

Proof

Let x_n be an arbitrary Cauchy sequence.  Then by the previous theorem x_n is bounded, and so by the Bolzano-Weierstrass Theorem has a convergent subsequence, x_{n_k}.  Denote the limit of this subsequence as L.  We now proceed to show that x_n also converges to L.

Fix \varepsilon > 0.  Then, since the sequence is Cauchy, there exists N_1 \in {\mathbb N} such that n, m \ge N_1 \Longrightarrow |x_m - x_n| < \frac{\varepsilon }{2}.

Also, by convergence of the subsequence, there exists K \in {\mathbb N} such that k \ge K \Longrightarrow |x_{n_k} - L| < \frac{\varepsilon }{2}.

So, define N = max\{ N_1, K \} and let n \ge N.

Then, |x_n - L| = |x_n - x_{n_N} + x_{n_N} - L|

\le |x_n - x_{n_N}| + |x_{n_N} - L|

< \frac{\varepsilon }{2} + \frac{\varepsilon }{2}                           (since n_N \ge N \ge N_1 and N \ge K, both estimates above apply)

= \varepsilon .

And so, x_n converges to L by definition.

\Box .

And so we have that Cauchy sequences and convergent sequences (in {\mathbb R}) are one and the same.  But why do we care?  Why have two definitions for what is, in effect, the same thing?

There are two answers to that question.  Firstly, note that I have been continually emphasising that these two are the same in {\mathbb R}.  If you go on to study mathematical analysis at a higher level (by studying metric spaces for example) then you will encounter a more general definition of convergent sequences and Cauchy sequences, and in this more general setting, the two are certainly not the same.  The space of real numbers is simply a special ‘nice’ case in which they do happen to be the same.

For a simple example of this, consider the sequence defined as follows:  For each n \in {\mathbb N}, define x_n to be an arbitrary rational number in the open interval (\sqrt{2}-\frac{1}{n}, \sqrt{2}) (such a number exists because the rationals are dense in the reals).  It follows easily by the squeezing theorem that x_n converges to \sqrt{2} and is thus a Cauchy sequence.

However, let’s now consider x_n as a sequence in the rational numbers, which we can do since every term of the sequence is rational.  The sequence is still Cauchy (the Cauchy property refers only to the terms themselves), but it doesn’t converge to any rational number.  If it did, then the sequence would also converge to that rational number in the space of real numbers, and by the uniqueness of limits that rational number would have to equal \sqrt{2}, which is impossible since \sqrt{2} is irrational.  Therefore x_n, although Cauchy in the rationals, is not convergent in the rationals.
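
If you want to experiment with such a sequence, here is one concrete way to build it in Python using exact rational arithmetic; truncating \sqrt{2} to n decimal places is just one convenient choice of rational number inside (\sqrt{2}-\frac{1}{n}, \sqrt{2}), made here for illustration.

    from fractions import Fraction
    from math import isqrt

    def x(n):
        # A rational number in (sqrt(2) - 1/n, sqrt(2)): truncate sqrt(2)
        # to n decimal places using exact integer arithmetic.
        return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

    # Every term is rational and the terms squeeze up towards sqrt(2),
    # so the sequence is Cauchy; but its limit is irrational, so no
    # rational number can serve as its limit within Q.
    for n in (1, 2, 5, 10):
        print(n, x(n), float(x(n)) ** 2)    # the squares creep up towards 2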

Now, I said there were two reasons, so here’s the second one.  It is often easier to prove that a given sequence does or does not converge by considering the Cauchy property.  For instance, consider the sequence x_n = \sin(n).  We could prove that the sequence isn’t convergent by letting L \in {\mathbb R} be arbitrary and showing that \sin(n) cannot converge to L.  However, it is far simpler to prove that it cannot be Cauchy.

To do so, assume for contradiction that \sin(n) is Cauchy.  Then, by definition (applied with \varepsilon = 1), there exists N \in {\mathbb N} such that m, n \ge N \Longrightarrow |\sin(m) - \sin(n)| < 1.

And now we generate a contradiction as follows.  Firstly, using the fact that the sine function is 2\pi-periodic, we know that for any x \in [\frac{\pi }{6}+ 2\pi k , \frac{5\pi }{6}+ 2\pi k], \sin(x) \ge \frac{1}{2} (where k is any natural number).  Similarly, for any x \in [\frac{7\pi }{6}+ 2\pi k , \frac{11\pi }{6}+ 2\pi k], \sin(x) \le -\frac{1}{2}.

Notice that both these intervals have a length of \frac{2\pi }{3}> 2 > 1.

So to generate the required contradiction, choose m, n \in {\mathbb N} such that m \in [\frac{\pi }{6}+ 2\pi N , \frac{5\pi }{6}+ 2\pi N] and n \in [\frac{7\pi }{6}+ 2\pi N , \frac{11\pi }{6}+ 2\pi N].  Clearly, both m and n are greater than N, and we know that such natural numbers must exist in these intervals, since each has length greater than 1.

And thus, by our choice of m and n, |\sin(m) - \sin(n)| \ge 1, and we have our required contradiction (since the Cauchy condition above demanded |\sin(m) - \sin(n)| < 1).  The sequence cannot be Cauchy and thus cannot be convergent.
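
To see this argument in action, the Python sketch below takes a few sample values of N (chosen by me purely for illustration), picks an integer from each of the two intervals described above, and checks that the two sine values really do differ by at least 1.

    import math

    for N in (1, 10, 100):
        # An integer in [pi/6 + 2*pi*N, 5*pi/6 + 2*pi*N], where sin >= 1/2 ...
        m = math.ceil(math.pi / 6 + 2 * math.pi * N)
        # ... and an integer in [7*pi/6 + 2*pi*N, 11*pi/6 + 2*pi*N], where sin <= -1/2.
        n = math.ceil(7 * math.pi / 6 + 2 * math.pi * N)
        print(N, m, n, abs(math.sin(m) - math.sin(n)) >= 1)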

Now before I close this section on Cauchy sequences, I’d like to remind you of something I said long ago in a previous article – namely that the key assumption we needed to construct the real numbers was the Completeness axiom, which, I hope you remember, is that every non-empty set of real numbers that is bounded above has a supremum.  I also said at the time that this is not the only assumption we can make in order to construct the real numbers: there are equivalent assumptions that we can make, and which other mathematical courses very well might (and I’m sure do) make.  These other assumptions were as follows:

  • The Bolzano-Weierstrass Theorem.
  • That every real Cauchy sequence converges.
  • That every increasing and bounded sequence converges.

Now, you might not have known then what these results were, but you certainly should do now!  And you have seen that we can use the Completeness axiom to prove that every increasing and bounded sequence converges, which we can then use to prove the Bolzano-Weierstrass Theorem, which we can then use to prove that every real Cauchy sequence converges.

However, it would be really neat if we could show that the convergence of Cauchy sequences implies the Completeness Axiom.  If we could show this, then we would effectively have proved that these four statements are equivalent, since if any one of them is true, then all the rest have to be true as well.

If you’re still confused about what it is we’re doing, then you can think of it another way.  Everything we’ve done so far – every major result we’ve proved in the study of the mathematical analysis of real numbers – has relied upon the Completeness Axiom, which remember is an unproved assumption.  Proving that the convergence of Cauchy sequences implies the Completeness Axiom will not, from our point of view, prove the Completeness Axiom to be true – that would be circular reasoning.  What such a proof does do, however, is give us good grounds to believe that the initial assumption was reasonable.  After all, if, using the fact that all Cauchy sequences converge, we could construct a non-empty set of real numbers, bounded above, that didn’t have a supremum, then we would effectively have proved (by a very long proof by contradiction) our assumption to be false.

Theorem

The Completeness axiom is implied by the convergence of Cauchy sequences.

Proof

Let A be a non-empty set of real numbers that is bounded above.  Then there exists a \in A, and there exists an upper bound M \in {\mathbb R} for A; by adding 1 to it if necessary, we may assume that M > a.  So y \le M for all y \in A, and M - a > 0.

Now I construct a sequence of real numbers, x_n, and a sequence of closed intervals, I_n, as follows.

Set x_1 = \frac{M+a}{2}.  If x_1 is an upper bound for A, set I_1 = [a, x_1].  If not, set I_1 = [x_1, M].

Now, I_1 is of the form [p, q], for some p, q \in [a, M].  Define x_2 = \frac{p+q}{2}.  And again, if x_2 is an upper bound for A, set I_2 = [p, x_2], and if not, I_2 = [x_2, q].

And now I go on to define the sequences x_n and I_n inductively.  If x_n and I_n are defined, with I_n = [p_n, q_n] for some p_n and q_n in [a, M], then set x_{n+1} = \frac{p_n+q_n}{2}, and set I_{n+1} = [p_n, x_{n+1}] if x_{n+1} is an upper bound for A, or [x_{n+1}, q_n] if it isn’t.

This definition probably looks quite terrifying, but the idea behind it is very simple.  The intervals I_n represent the range of values in which the supremum of A, if it exists, could lie.  What each individual x_n does is to bisect the interval I_{n-1} (except of course for x_1, which bisects [a, M]), thereby allowing us to narrow down the range of values which could be the supremum (again, if it exists).  If x_n is an upper bound for A, then certainly no greater number can be the supremum, so we discard the second half of I_{n-1} and define the first half to be I_n.  If however x_n is not an upper bound, then no smaller number can be the supremum, so we discard the first half of I_{n-1} and define the second half to be I_n.

So notice that by construction, I_{n+1} \subset I_n for all n \in {\mathbb N}.  Also by construction, notice that the two end points of any arbitrary I_n (which I’ve denoted as p_n and q_n in the definition above) must each be either M, a, or a previous term of the sequence.  In fact we can say slightly more: the right endpoint q_n is always an upper bound for A (it is either M or a term that was chosen precisely because it was an upper bound), while the left endpoint p_n is either a itself or a term that was not an upper bound.  We will need these observations later.
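
The construction is easier to digest once you’ve seen it run.  The Python sketch below carries it out for a concrete choice of A (a finite set of sample values, so that the question ‘is x an upper bound for A?’ can be answered by brute force), and you can watch the midpoints x_n home in on the supremum.  The particular set, the upper bound M and the number of steps are all choices of mine made for illustration.

    # Illustrative only: A is a finite sample set, so its supremum is just max(A),
    # but the bisection below mirrors the construction described above.
    A = [1 - 1.0 / k for k in range(1, 50)]   # a non-empty set, bounded above
    a = A[0]                                  # an element of A
    M = 2.0                                   # an upper bound for A, with M > a

    def is_upper_bound(x):
        return all(y <= x for y in A)

    p, q = a, M                               # the current interval I_n = [p, q]
    for n in range(1, 25):
        x_n = (p + q) / 2                     # bisect the current interval
        if is_upper_bound(x_n):
            q = x_n                           # the supremum can be no larger than x_n
        else:
            p = x_n                           # the supremum can be no smaller than x_n
        print(n, x_n)

    print(max(A))                             # the true supremum of this finite set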

So how can we use this to prove that A must have a supremum?  Well, the idea is quite simple.  First, I will prove that x_n is Cauchy and thus that it converges to some \alpha \in {\mathbb R}.  I will then show that this \alpha has to be A’s supremum.

So to prove that x_n is Cauchy, notice that |x_{n+1} - x_n| = \frac{M-a}{2^{n+1}} for all n \in {\mathbb N}.  This follows from the fact that x_{n+1} is the midpoint of I_n, x_n is an endpoint of I_n, and the length of I_n is \frac{M-a}{2^n} (each interval being half the length of the one before it, starting from [a, M], which has length M - a).  This can be proved quite easily (if a little tediously) by a simple induction argument, which I leave to you to verify.

So the difference between consecutive terms is clearly tending to 0; but of course, as I was careful to emphasise above, this alone is not sufficient to prove that x_n is Cauchy.  However, so strong is this bound that the Cauchy property follows quite easily from it.  Observe:

Let n \in {\mathbb N} be a fixed arbitrary natural number with m \in {\mathbb N} such that m > n.  Then,

|x_m - x_n| = |(x_m - x_{m-1}) + (x_{m-1} - x_{m-2}) + ... + (x_{n+1} - x_n)|

\le |x_m - x_{m-1}| + |x_{m-1} - x_{m-2}| + ... + |x_{n+1} - x_n|

= \frac{M - a}{2^m} + \frac{M - a}{2^{m-1}} + ... + \frac{M - a}{2^{n+1}}

= (M-a)\sum^m_{i=n+1}{{(\frac{1}{2})}^i}

< (M-a)\sum^{\infty }_{i=n+1}{{(\frac{1}{2})}^i}   (*)

= \frac{2(M-a)}{2^{n+1}}                          (**)

= \frac{M-a}{2^n}.

So we have that for any m > n, |x_m - x_n| < \frac{M-a}{2^n}.

But n itself was arbitrary, and the bound \frac{M-a}{2^n} certainly tends to 0 as n tends to infinity.  This is sufficient to prove that x_n is Cauchy.  Fix \varepsilon > 0 and choose N \in {\mathbb N} such that n \ge N \Longrightarrow \frac{M-a}{2^n}<\varepsilon.  Then for any m, n \ge N with m \neq n (the case m = n being trivial), we may assume m > n, and so |x_m - x_n| < \frac{M-a}{2^n} < \varepsilon.

Now, the steps in the argument above that I’ve marked (*) and (**) are a little cheeky and merit further comment, since they rely on concepts that I haven’t yet defined.  Nevertheless, most A level students of mathematics encounter geometric series and are shown that \sum^{\infty }_{i=p}{x^i} = \frac{x^p}{1-x} for |x| < 1, so the argument above hopefully shouldn’t look too bizarre.  Needless to say, steps (*) and (**) will be shown to be rigorous when, in a later article, I come to define the concept of an infinite series of numbers.
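
For the record, here is the computation behind (**): taking x = \frac{1}{2} and p = n+1 in that formula gives

\sum^{\infty }_{i=n+1}{{(\frac{1}{2})}^i} = \frac{{(\frac{1}{2})}^{n+1}}{1-\frac{1}{2}} = \frac{2}{2^{n+1}} = \frac{1}{2^n},

and multiplying through by (M-a) gives exactly the bound \frac{M-a}{2^n} obtained above.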

But back to the argument.  Since x_n is Cauchy, we know it converges to some \alpha \in {\mathbb R}; and since every term of the sequence lies in [a, M], we must in fact have \alpha \in [a, M].  I will now show that \alpha must be the supremum of A.

There are two things to prove here, firstly that \alpha is an upper bound for A, and secondly that it is the smallest of all the upper bounds.  Let’s take the first of these.  Assume for contradiction that \alpha is not an upper bound for A.  Then, there exists y \in A such that \alpha < y \le M.

The first point to notice is that it is impossible for there to be any term of the sequence that lies in the interval (\alpha, y).  If there were such a term, x_k, for some natural k, then x_k would not be an upper bound for A (since y \in A and y > x_k), and so by construction I_k = [x_k, q_k].  But I_n \subset I_k for all n \ge k, and each x_{n+1} \in I_n, so x_n \ge x_k > \alpha for all n > k.  And this would make it impossible for the sequence to converge to \alpha.

And so we have that no term of the sequence lies in the interval (\alpha, y).  So define \varepsilon = y - \alpha.  Then, by definition of convergence, there exists N \in {\mathbb N} such that n \ge N \Longrightarrow |x_n - \alpha| < \frac{\varepsilon }{2}.  In particular, for every n \ge N we have x_n < \alpha + \frac{\varepsilon }{2} < y, and since no term lies in (\alpha, y), it follows that x_n \le \alpha for all n \ge N; note also that x_N > \alpha - \frac{\varepsilon }{2}.  Now, x_N \le \alpha < y with y \in A, so x_N is not an upper bound for A, and hence I_N is of the form [x_N, q] for some q.  But we know more about q: as noted earlier, the right endpoint of every interval is an upper bound for A, and so q \ge y.

And so, x_{N+1} = \frac{x_N+q}{2} (by construction, x_{N+1} is the midpoint of I_N = [x_N, q])

\ge \frac{x_N+y}{2} = \frac{x_N+\alpha +\varepsilon }{2} > \frac{(\alpha -\frac{\varepsilon }{2})+\alpha +\varepsilon }{2} = \alpha + \frac{\varepsilon }{4} > \alpha.

But this contradicts the fact, established above, that x_n \le \alpha for all n \ge N.  Therefore, \alpha must be an upper bound for A.

Now to prove that \alpha must be the smallest upper bound.  Assume for contradiction that it isn’t, and therefore that there exists a number \beta satisfying a \le \beta < \alpha such that \beta is an upper bound for A.  This time the contradiction comes directly from the left endpoints of the intervals.  Notice that the left endpoints p_n of the intervals I_n converge to \alpha, since x_{n+1} \in I_n and the lengths of the I_n tend to 0; so we can choose n such that p_n > \beta.  But, as noted earlier, p_n is by construction either a itself or a term of the sequence that is not an upper bound for A, and in either case A must contain an element greater than \beta (namely a itself in the first case, or an element exceeding p_n in the second).  This contradicts the assumption that \beta is an upper bound for A.

And so we have that \alpha must be the supremum of the set A.  The convergence of all real Cauchy sequences does indeed imply that every non-empty set of real numbers that is bounded above must have a supremum.

\Box .