SYSTEMS OF LINEAR ALGEBRAIC EQUATIONS
1.1 LINEAR ALGEBRAIC EQUATIONS
Systems of equations such as (1)–(2), (3′), (4′), (5′), and (6) given in the “Introduction to Part I” are instances of a mathematical structure that pervades practically every application of mathematics.
Systems of Linear Equations
An equation that can be put in the form

a1x1 + a2x2 + … + anxn = b,

where the coefficients a1, a2, … , an and the right-hand side b are constants, is called a linear algebraic equation in the variables x1, x2, … , xn. If m linear equations, each having n variables, are all to be satisfied by the same set of values for the variables, then the m equations constitute a system of linear algebraic equations (or simply a linear system):

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
⋮
am1x1 + am2x2 + … + amnxn = bm
We assume that the reader has had some experience solving linear systems of two equations in two unknowns, such as (1)–(2) in the preceding section, although the lack of solutions to (4′) and the multiplicity of solutions to (5′) may have come as a surprise. The graph of the solutions to a single equation in two unknowns is, of course, a straight line (hence the designation “linear”), and the graphs in Figure 1.1(a–c) illustrate why a system of two such equations typically has one solution, but special alignments can result in zero or an infinite number of solutions. If a third equation is added to the system (Figure 1.1(d–f)), typically there will be no solutions, although accidental alignments can result in one, or an infinity of, solutions.
Fig. 1.1 Typical graphs of linear systems with two unknowns.
The graph of the solutions to a “linear” algebraic equation in three unknowns is not literally a line, but a plane, and Figure 1.2 reinforces our expectation that systems with three unknowns can again possess zero, one, or an infinity of solutions.
Fig. 1.2 Graph of three equations in three unknowns.
Although the graphs provide much insight into the nature of the solutions of linear systems, they are obviously inadequate as tools for finding these solutions when more than three unknowns are present. We need an analytic technique for solving simultaneous linear equations that is efficient, accurate, foolproof, and easy to automate. After all, the task of solving linear systems is performed billions of times each day in computers around the world. It arises in an enormous number of applications. The allocation of medical resources such as hospital beds, antibiotics, health personnel, etc. after a disaster involves dozens of variables. Displaying and/or saving an image on a computer screen may require thousands of equations converting the red/green/blue intensities to the wavelet coefficients for the JPEG 2000 image compression format; similar considerations hold for face recognition and other image processing schemes. The algorithm employed in CAT scans, we have noted, can involve millions of unknowns. In fact, the task of solving large systems of linear equations is intimately entwined with the history and evolution of the electronic computer. So bear in mind that although you will be solving small linear systems by hand as you learn the basics, in practice you will usually be entrusting your analyses to software. In such applications it is absolutely essential that the computer’s speed be coupled with a precise understanding of what it is doing, and that is the goal of our exposition in the next few sections.
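As a concrete glimpse of what “entrusting your analyses to software” looks like, a library routine such as numpy.linalg.solve does this job in one call. The 2 × 2 system below is made up for illustration; it is not one of the systems from the text.

```python
import numpy as np

# A small made-up system, purely for illustration:
#   3x1 + 2x2 = 7
#    x1 + 4x2 = 9     (solution: x1 = 1, x2 = 2)
A = np.array([[3.0, 2.0],
              [1.0, 4.0]])
b = np.array([7.0, 9.0])

# LAPACK-backed Gaussian elimination (with pivoting) under the hood.
x = np.linalg.solve(A, b)
print(x)
```

The point is that the library performs precisely the elimination procedure developed in this section, refined with the safeguards discussed in Sections 1.3 and 1.5.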
We are going to focus on one algorithmic procedure for solving linear systems. It is known as Gauss elimination (in tribute to the great German mathematician Carl Friedrich Gauss (1777–1855), although his other works dwarf this effort). It is used almost universally for solving generic linear systems. (To be sure, faster algorithms exist for systems with special structures.)
The Gauss elimination algorithm, “by the book,” is inflexible. Just the same, anytime you’re solving a system by hand and you see a shortcut, by all means take it; most of the time you will save time and work. However, there will be occasions when you get nonsense (see Problems 15–18 for examples), and then you must back up and follow Gauss’s rules rigidly.
The basic idea of Gauss elimination is to add multiples of earlier equations to subsequent equations, choosing the multiples so as to reduce the number of unknowns. (Equivalently, one can say we are subtracting multiples of earlier equations from subsequent equations. We’ll employ whichever interpretation seems appropriate.) The resulting system of equations can then be solved in reverse order. We demonstrate with a system of three equations in three unknowns before giving the general formulation.
Solve the system
Solution. We eliminate x1 in the second equation by adding (−2) times the first equation (or subtracting 2 times the first equation):
Similarly, x1 is eliminated from the third equation by adding (−1) times the first:
Next x2 is eliminated from the third equation by adding (−1/3) times the second:
The third equation has only one unknown, and its solution is immediate:
But now, the second equation has, effectively, only one unknown since we can substitute 1 for x3. Hence, its solution is
And substitution for x3 and x2 in the first equation yields
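The sequence of operations in this example can be sketched in code. The system below is a stand-in, chosen so that the same multipliers (−2), (−1), and (−1/3) arise; it is not the system (3) of the text.

```python
# Hypothetical 3x3 system (NOT the text's system (3)), chosen so the
# same multipliers -2, -1, -1/3 arise as in the example:
#    x1 +  x2 +  x3 =  4
#   2x1 + 5x2 + 2x3 = 11
#    x1 + 2x2 + 4x3 =  8
A = [[1.0, 1.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 4.0]]
b = [4.0, 11.0, 8.0]

def add_multiple(A, b, src, dst, m):
    """Add m times equation `src` to equation `dst`, in place."""
    for j in range(len(A[dst])):
        A[dst][j] += m * A[src][j]
    b[dst] += m * b[src]

# Forward elimination, step by step as in the example.
add_multiple(A, b, 0, 1, -A[1][0] / A[0][0])  # multiplier -2: eliminate x1 from eq. 2
add_multiple(A, b, 0, 2, -A[2][0] / A[0][0])  # multiplier -1: eliminate x1 from eq. 3
add_multiple(A, b, 1, 2, -A[2][1] / A[1][1])  # multiplier -1/3: eliminate x2 from eq. 3

# Back substitution, bottom up.
x3 = b[2] / A[2][2]
x2 = (b[1] - A[1][2] * x3) / A[1][1]
x1 = (b[0] - A[0][1] * x2 - A[0][2] * x3) / A[0][0]
print(x1, x2, x3)   # -> 2.0 1.0 1.0
```

Note that each multiplier is computed as the negative of the leading coefficient to be eliminated, divided by the corresponding coefficient in the pivot equation.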
To maintain focus on the methodology of Gauss elimination, we usually contrive examples like the above, with integer or small-fraction constants. But the very simplicity of such examples can obscure the underlying algorithm. For example, if you examine the system (3), you may see that adding (2/3) times the second equation to the first would have enabled us to conclude x1 = 2 immediately. Obviously we can’t rely on such serendipity in general. So we include the following unwieldy example to focus your attention on the regimented steps of Gauss elimination. Don’t bother to follow the details of the arithmetic; just note the general procedure.
Solve the linear system
Solution. To eliminate x1 from the second equation, we add (–0.333333/0.202131) times the first. Similarly we add (0.486542/0.202131) times the first equation to the third, and add (–0.101101/0.202131) times the first to the fourth. These three operations produce (to 6 digits)
(Roundoff effects will be discussed in Section 1.5.)
Next, we eliminate x2 from the third and fourth equations by adding, in turn, the multiples (2.26317/1.32102) and (–0.045289/1.32102) of the second equation. And so on. Continuing with the forward part of the algorithm we arrive at the system
And performing “back substitution,” that is, solving for the variables in reverse order from the bottom up, we obtain the solution. It turns out to be
We now give a brief exposition of the steps of the Gauss elimination process, as demonstrated in these examples. It may have occurred to you that the algorithm, as described so far, has a flaw: it can be foiled by the untimely appearance of zeros at certain critical stages. But we’ll ignore this possibility for the moment. After you’ve become more familiar with the procedure, we shall patch it up and streamline it (Section 1.3). Refer to Example 2 as you read the description below.
Gauss Elimination Algorithm (without anomalies) (n equations in n unknowns, no “zero denominators”)
- Eliminate x1 from the second, third, … , nth equations by adding appropriate multiples of the first equation.
- Eliminate x2 from the (new) third, … , nth equations by adding the appropriate multiples of the (new) second equation.
- Continue in this manner: eliminate xi from the subsequent equations by adding the appropriate multiples of the ith equation.
When xn–1 has been eliminated from the nth equation, the forward part of the algorithm is finished. The solution is completed by performing back substitution:
- Solve the nth equation for xn.
- Substitute the value of xn into the (n–1)st equation and solve for xn–1.
- Continue in this manner until x1 is determined.
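Under the stated “no zero denominators” assumption, the steps above can be sketched as a short routine. This is a bare-bones version: it does no pivoting (the Section 1.3 repair) and pays no attention to roundoff (Section 1.5).

```python
def gauss_solve(A, b):
    """Solve A x = b by Gauss elimination, assuming no zero pivots arise.

    A is an n x n list of lists, b a list of length n. Both are copied,
    so the caller's data is left intact. No pivoting is performed.
    """
    n = len(b)
    A = [row[:] for row in A]   # work on copies
    b = b[:]

    # Forward elimination: for each pivot equation i, add the appropriate
    # multiple of it to every later equation k so that x_i is eliminated.
    for i in range(n):
        for k in range(i + 1, n):
            m = -A[k][i] / A[i][i]          # the multiplier
            for j in range(i, n):
                A[k][j] += m * A[i][j]
            b[k] += m * b[i]

    # Back substitution: solve for x_n, x_{n-1}, ..., x_1 in turn.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x
```

For instance, gauss_solve([[2.0, 1.0], [1.0, 3.0]], [5.0, 10.0]) returns [1.0, 3.0], the solution of 2x1 + x2 = 5, x1 + 3x2 = 10.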
Problems 15–18 at the end of this section demonstrate some of the common traps that people can fall into if they don’t follow the Gauss procedure rigorously. So how can we be sure that this algorithm neither introduces new “solutions” nor loses valid ones? To answer this, let us define two systems of simultaneous linear equations to be equivalent if, and only if, they have identical solution sets. Then we claim that the following three basic operations, which are at the core of the Gauss elimination algorithm, are guaranteed not to alter...