When I am working on a problem, I never think about beauty. I think only of how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong.
R. Buckminster Fuller
If you cannot describe what you are doing as a process, you don't know what you're doing.
W. Edwards Deming
A programming language is a system of notation for describing computations. A useful programming language must therefore be suited both for description (i.e., for human writers and readers of programs) and for computation (i.e., for efficient implementation on computers). But human beings and computers are so different that it is difficult to find notational devices that are well suited to the capabilities of both.
R.D. Tennent
The central concept underlying all computation is that of the algorithm, a
step-by-step sequence of instructions for carrying out some task. When you interact with
a computer, you provide instructions in the form of mouse clicks, menu selections, or
command sequences. These instructions tell the computer to execute a particular task,
such as copying a piece of text or saving an image. At a more detailed level, when you
write a JavaScript program, you are providing instructions that the browser must follow
in order to perform a task, such as prompting the user for input or displaying a message
in the page.
This chapter presents an overview of algorithms, their design and analysis, and their connection to computer science. After introducing techniques for solving problems, we provide several example algorithms for you to study, accompanying each with an interactive Web page that will help you visualize and experiment with the algorithms. We also discuss the evolution of programming languages and the advantage of writing computer algorithms at a higher level of abstraction. When you finish reading the chapter, you will understand the steps involved in designing an algorithm, coding it in a programming language, and executing the resulting program.
Programming may be viewed as the process of designing and implementing algorithms that a computer can carry out. The programmer's job is to create an algorithm for accomplishing an objective and then to translate the individual steps of the algorithm into a programming language that the computer understands. For example, the JavaScript programs you have written have contained statements that instruct the browser to carry out particular tasks, such as displaying a message or updating an image in the page. The browser understands these statements and is therefore able to carry them out and achieve the desired results.
The use of algorithms is not limited to the domain of computing. A recipe for baking chocolate-chip cookies is an algorithm. By following the instructions carefully, anyone familiar with cooking should be able to bake a perfect batch of cookies. Similarly, when you give someone directions to your house, you are defining an algorithm that the person can follow to reach a specific destination. Algorithms are prevalent in modern society, because we are constantly performing unfamiliar tasks for which we require instructions. You probably don't know exactly how long to cook macaroni or how to assemble a bicycle, so algorithms are provided (cooking instructions on the macaroni box, or printed assembly instructions accompanying the bike) to guide you through these types of tasks.
Of course, we have all had experiences with vague or misleading instructions, such as recipes that assume the knowledge of a master chef or directions to a house that rely on obscure landmarks. In order for an algorithm to be effective, it must be stated in a manner that its intended executor can understand. If you buy an expensive cookbook, it will probably assume that you are experienced in the kitchen. Instructions such as "create a white sauce" or "fold in egg whites until desired consistency" are probably clear to a chef. On the other hand, the typical instructions on the side of a macaroni-and-cheese box do not assume much culinary expertise. They tend to be much more precise, such as "pour contents of box into boiling water" and "add 1/2 cup of milk and stir." As you have no doubt experienced in developing interactive Web pages, computers are probably the most demanding of all algorithm executors. Computers require extremely precise languages for specifying the steps in an algorithm. Even simple mistakes in a program, such as misspelling a variable name or forgetting a comma, can confuse the computer and leave it unable to execute the program.
Life is full of problems whose solutions require a careful, step-by-step approach. Designing solutions to those problems requires logical reasoning and creativity. In his classic book, How to Solve It, the mathematician George Polya outlined four steps that can be applied to solving most problems:

1. Understand the problem.
2. Devise a plan for solving the problem.
3. Carry out your plan.
4. Examine your solution.
Understanding the problem involves identifying exactly what is required (What are the initial conditions? What is the overall goal?) and what constraints are involved (What properties must the solution have?). For example, suppose that you are presented with the following problem:
PROBLEM STATEMENT: Find the oldest person in a room full of people.
At first glance, this problem seems relatively straightforward to understand. The initial condition is a room full of people. The goal is to identify the oldest person. However, deeper issues have to be explored before a solution can be proposed. How do you determine how old a person is? Can you tell someone's age just by looking at her? If you ask someone his age, will he tell the truth? What if there is more than one oldest person (born on the same day at the same time)?
For simplicity, let us assume that when asked, a person will give his or her real birthday. Let us further assume that we don't care about the time of day at which a person was born: if two people were born on the same day, they are considered the same age. If there is more than one oldest person, finding any one of them is acceptable.
Given our new understanding of the problem, we can now devise a solution:
Finding the oldest person in a room (Algorithm 1):
- Line up all the people along one wall.
- Ask the first person his or her name and birthday, and write this information down on a piece of paper.
- For each successive person in line:
- Ask the person his or her name and birthday.
- If that person's birthday is earlier than the date written on the paper, cross out the old information and write down the name and birthday of this person.
- When you have reached the end of the line, the name and birthday of the oldest person will be written on the paper.
To see how this algorithm works, we can carry out the steps on an example (Figure 10.2). Initially, you write down the name and birthday of the first person (Chris, 8/4/82). Working your way down the line, you ask each person's birthday and update the paper when an earlier birthday is encountered (first Pat, 10/7/80; then Joan, 2/6/80). When you reach the end of the line, the name and birthday of the oldest person (Joan, 2/6/80) is written on the paper.
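For readers who want to see Algorithm 1 as code, here is a minimal JavaScript sketch. The function name, the person objects, and the date format are illustrative choices for this sketch, not details from the text:

```javascript
// Algorithm 1: scan the line once, keeping the earliest birthday seen so far.
// Each person is represented as an object like { name: "Chris", birthday: "1982-08-04" }.
function findOldest(people) {
  // "Write down" the first person's name and birthday.
  let oldest = people[0];
  // Ask each successive person in line.
  for (let i = 1; i < people.length; i++) {
    // An earlier birthday means an older person.
    if (new Date(people[i].birthday) < new Date(oldest.birthday)) {
      oldest = people[i];  // cross out the old information
    }
  }
  return oldest;
}

// The example from Figure 10.2: Joan (born 2/6/80) is the oldest.
const line = [
  { name: "Chris", birthday: "1982-08-04" },
  { name: "Pat",   birthday: "1980-10-07" },
  { name: "Joan",  birthday: "1980-02-06" }
];
console.log(findOldest(line).name);  // Joan
```

Note that the single pass through the array mirrors the single pass down the line of people: every person is asked exactly once.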
[Figure 10.2: Applying Algorithm 1 to find the oldest person in a line of people.]
This algorithm is simple, and it's easy to see that it works in general. Since you go through the entire line of people, every person is eventually asked his or her birthday. When you reach the oldest person, his or her birthday will be earlier than the birthday currently on the paper. And once you write the oldest person's information down, that name will stay on the paper, since no one older will be found.
Algorithm 1, as demonstrated in Figure 10.2, effectively locates the oldest person in a room. However, this problem, like most, can be solved in many different ways. Once alternative algorithms for the same task have been proposed, you can analyze each option to determine which is simpler, more efficient, or better suited to your particular objectives. As an example, consider the following alternative algorithm for finding the oldest person:
Finding the oldest person in a room (Algorithm 2):
- Line up all the people along one wall.
- As long as there is more than one person in the line, repeatedly:
- Have the people pair up (1st and 2nd in line, 3rd and 4th in line, etc.). If there is an odd number of people, the last person will remain without a partner.
- Ask each pair of people to compare their birthdays.
- Request that the younger of the two leave the line.
- When there is only one person left in the line, that person is the oldest.
Algorithm 2 is slightly more complex than Algorithm 1, but it has the same end result: the oldest person in the room is located. In the first round, each pair compares their birthdays, and the younger of the two partners subsequently leaves the line. Thus, the number of people in line is cut roughly in half (if there is an odd number of people, the extra person remains). The older members of the original pairs then repeat this process, reducing the size of the line each time. Eventually, the number of people in line drops to one. Clearly, the oldest person cannot have left, since he or she would have been older than any potential partner in the line. Therefore, the last person in line is the oldest. Figure 10.3 demonstrates performing this algorithm on the same eight people as in Figure 10.2. In only three rounds, the line of people shrinks until it contains only the oldest person.
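Algorithm 2 can also be sketched in JavaScript. As with the earlier sketch, the names and person objects here are illustrative. One caveat: a sequential program performs the pairwise comparisons one after another, so the speedup described in the text assumes the pairs really do compare their birthdays simultaneously.

```javascript
// Algorithm 2: repeatedly pair up the people in line and keep the older
// person from each pair, roughly halving the line each round (8 -> 4 -> 2 -> 1).
function findOldestByPairs(people) {
  let line = people.slice();  // copy, so the caller's array is untouched
  while (line.length > 1) {
    const nextRound = [];
    // Pair up: 1st with 2nd, 3rd with 4th, and so on.
    for (let i = 0; i + 1 < line.length; i += 2) {
      const a = line[i], b = line[i + 1];
      // The older person (earlier birthday) stays in line.
      nextRound.push(new Date(a.birthday) < new Date(b.birthday) ? a : b);
    }
    // With an odd number of people, the last person keeps their place.
    if (line.length % 2 === 1) {
      nextRound.push(line[line.length - 1]);
    }
    line = nextRound;
  }
  return line[0];
}
```

Each pass through the while loop corresponds to one round of comparisons, so the loop body executes roughly log2(N) times for a line of N people.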
[Figure 10.3: Applying Algorithm 2 to the same eight people, halving the line each round.]
When more than one algorithm can be used to solve a problem, it is necessary to determine which is better. Often, there is not a single correct answer: your choice for the "better" algorithm depends upon what features matter to you. If you don't care how long it takes to solve a problem but want to be sure of the answer, it makes sense for you to select the simplest, most easily understood algorithm. However, if you are concerned about the time or effort required to solve the problem, you will want to analyze your alternatives more carefully before making a decision.
For example, consider the two algorithms that we have developed for finding the oldest person in a room. As Figure 10.2 demonstrates, Algorithm 1 involves asking each person's birthday and then comparing it with the birthday written on the page. Thus, the amount of time needed to find the oldest person using Algorithm 1 will be proportional to the number of people. If there are 100 people in line and it takes 5 seconds to compare birthdays, then Algorithm 1 will require 5*100 = 500 seconds. Likewise, if there are 200 people in line, then Algorithm 1 will require 5*200 = 1,000 seconds. In general, if you double the number of people, the time necessary to complete Algorithm 1 will double.
By contrast, Algorithm 2 allows you to perform multiple comparisons simultaneously, which saves time. While the first pair of people compares their ages, all other pairs are doing the same. Thus, you can eliminate half the people in line in the time it takes to perform one comparison. This implies that the total time needed to find the oldest person using Algorithm 2 will be proportional to the number of rounds needed to shrink the line down to one person. In Figure 10.3, we reduced a group of eight people to the single oldest person in three rounds: 8 -> 4 -> 2 -> 1. If there were twice as many people in the room, it would still only require four rounds to reduce those 16 people down to the single oldest person: 16 -> 8 -> 4 -> 2 -> 1. In mathematics, the notion of a logarithm captures this halving effect: log2(N) represents the number of times a value N can be halved before it reaches 1 (see Figure 10.4). Therefore, Algorithm 2 will find the oldest person in an amount of time proportional to the logarithm of the number of people. For example, if there are 100 people in line and it takes 5 seconds to compare birthdays, Algorithm 2 will require 5*log2(100) = 5*7 = 35 seconds. If there are 200 people in line, then Algorithm 2 will require 5*log2(200) = 5*8 = 40 seconds. In general, if you double the number of people, the time required to execute Algorithm 2 will increase by the duration of only one additional comparison.
Figure 10.4: Values of log2(N), rounded up to the nearest whole number.

N | log2(N)
---|---
100 | 7
200 | 8
400 | 9
800 | 10
1,600 | 11
... | ...
10,000 | 14
20,000 | 15
40,000 | 16
... | ...
1,000,000 | 20
Performance measures the speed at which a particular algorithm (or a program designed to carry out that algorithm) accomplishes its task. It should be noted that the relative performance of these algorithms remains the same as the number of people increases, regardless of how long it takes to complete an individual comparison. In the timings we have considered so far, we assumed that birthdays could be compared in 5 seconds. If a comparison instead took 10 seconds, you would still find that doubling the number of people doubles the time needed to perform Algorithm 1: 100 people would require 10*100 = 1,000 seconds, and 200 people would require 10*200 = 2,000 seconds. Likewise, doubling the number of people increases Algorithm 2's execution time by the duration of one additional comparison: 100 people would require 10*log2(100) = 10*7 = 70 seconds, and 200 people would require 10*log2(200) = 10*8 = 80 seconds.
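These back-of-the-envelope timings can be reproduced with a short JavaScript sketch. The function names are illustrative, and log2 is rounded up to a whole number of comparisons, as in Figure 10.4:

```javascript
// Estimated time for Algorithm 1: one comparison per person in line.
function time1(numPeople, secondsPerComparison) {
  return secondsPerComparison * numPeople;
}

// Estimated time for Algorithm 2: one comparison per halving round,
// i.e., log2 of the number of people, rounded up to a whole round.
function time2(numPeople, secondsPerComparison) {
  return secondsPerComparison * Math.ceil(Math.log2(numPeople));
}

console.log(time1(100, 5));  // 500 seconds
console.log(time2(100, 5));  // 35 seconds (5 * 7)
console.log(time2(200, 5));  // 40 seconds (5 * 8)
```

Doubling `numPeople` doubles `time1` but adds only one more comparison's worth of time to `time2`, which is exactly the contrast described above.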
The problem of finding the oldest person in a crowd is similar to a more general problem that occurs frequently in computer science. Computers are often required to store and maintain large amounts of information and then search that information for particular values. For example, commercial databases often contain large numbers of records, such as product inventories or payroll receipts, and allow the user to search for information on particular entries. Similarly, a Web browser that executes a JavaScript program must keep track of all the variables used throughout the program, as well as their corresponding values. Each time a variable appears in an expression, the browser must search through the list of variables to obtain the associated value.
If a computer attempted to locate information in an arbitrary manner, searching for a particular entry in a list could be time-consuming and tedious. For example, consider the task of searching a large payroll database for a particular record. If the computer simply selected entries at random and compared them with the desired record, there would be no guarantee that the computer would eventually find the correct entry. A systematic approach (i.e., an algorithm) is needed to ensure that you find the desired entry, no matter where it appears in the database.
The simplest algorithm for searching a list is sequential search, a technique that involves examining each list item in sequential order until the desired item is found.
Sequential search for finding an item in a list:
- Start at the beginning of the list.
- For each item in the list:
- If that item is the one you are looking for, then you are done.
- If not, then go on to the next item in the list.
- If you reach the end of the list and have not found the item, then it was not in the list.

This algorithm is simple and guaranteed to find the item if it is in the list, but its execution can take a very long time. If the desired item is at the end of the list or not in the list at all, the algorithm will have to look at every listed item before it can obtain a result. Although sequentially searching a list of 10 or 100 items might be feasible, this approach becomes impractical when a list contains thousands or tens of thousands of entries. Just imagine how tedious searching a phone book in this manner would be!
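Sequential search translates almost directly into JavaScript. In this sketch (with an illustrative function name and sample list), the function returns the item's position, or -1 if the item is absent:

```javascript
// Sequential search: examine each item in order until the desired one is found.
// Returns the index of the item, or -1 if it is not in the list.
function sequentialSearch(list, item) {
  for (let i = 0; i < list.length; i++) {
    if (list[i] === item) {
      return i;  // found it: done
    }
    // otherwise, go on to the next item in the list
  }
  return -1;  // reached the end of the list without finding the item
}

const states = ["Alabama", "Colorado", "Florida", "Illinois", "Missouri"];
console.log(sequentialSearch(states, "Illinois"));  // 3
console.log(sequentialSearch(states, "Texas"));     // -1
```

In the worst case (the item is last, or missing entirely), the loop visits every entry, which is why the running time grows in proportion to the list's length.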
Fortunately, there is a more efficient algorithm for searching a list, as long as the list adheres to some organizational structure. For example, entries in the phone book are alphabetized, which enables a person to find a particular entry quickly. If you were looking for "Dave Reed" in the phone book, you might guess that it would appear toward the back of the book, since 'R' is late in the alphabet. If you opened the book to a page near the end and found "Joan Smith" at the top, you would know that you had gone too far and would back up some number of pages. In most cases, your knowledge of the alphabet and the phone book's alphabetical ordering allow you to home in on an entry after a few page flips.
By generalizing this approach, we can create an algorithm for searching any ordered list. The first step is to inspect the entry at the middle position of the list (rounding the position down if there is an even number of items in the list). If this middle entry is the one you are looking for, then you are done. If, in the list's ordering scheme, the middle entry comes after the item you are seeking (e.g., "Joan Smith" comes after "Dave Reed" alphabetically), then you know the desired item must appear in the first half of the list. However, if the middle entry comes before the desired item in sequence (e.g., "Adam Miller" comes before "Dave Reed"), then the desired item must appear in the second half of the list. Once you have determined the half in which the item must appear, you can search that half via the same technique. Because each check cuts the list that must be searched in half, this algorithm is known as binary search.
Binary search for finding an item in an ordered list:
- Initially, the potential range in which the item could occur is the entire list.
- As long as the potential range is nonempty and the item has not been found, repeatedly:
- Look at the middle entry in the potential range.
- If the middle entry is the item you are looking for, then you are done.
- If the middle entry is greater than the desired item, then reduce the potential range to those entries left of the middle.
- If the middle entry is less than the desired item, then reduce the potential range to those entries right of the middle.
Since step 2 rules out entire sections of the list and thus reduces the range where the item could occur, repeating this step will eventually converge on the item (or reduce the potential range down to nothing if the item is not in the list). Figure 10.5 depicts a binary search that locates "Illinois" in an alphabetical list of state names. Since "Illinois" could conceivably appear anywhere in the list, the initial range that must be searched is the entire list (positions 1 through 15). Thus, you begin by checking the midpoint of this range, which is located by averaging the left and right boundaries of the range: position (1 + 15)/2 = 8. Since the entry at position 8, "Missouri", comes after "Illinois" in the alphabet, you know that "Illinois" must appear in the left half of the list. Note that, in the second step, we have crossed out the entries at position 8 and beyond, to highlight the fact that they are no longer under consideration. You then repeat this process, checking the midpoint of the new potential range, position (1 + 7)/2 = 4. Since the entry at position 4, "Florida", comes before "Illinois" in the alphabet, you know that "Illinois" must occupy a position to the right of position 4. The potential range in which "Illinois" can appear is now limited to positions 5 through 7. As before, you check the midpoint of the new potential range, position (5 + 7)/2 = 6. Since the entry at that position is indeed "Illinois", the search concludes successfully.
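The steps above can be sketched in JavaScript. The list below is a stand-in for the figure's 15 alphabetized state names (the exact list in the figure is not reproduced here, but it is chosen so that positions 4, 6, and 8, counting from 1, match the walkthrough):

```javascript
// Binary search: check the middle of the potential range, then discard the
// half of the range that cannot contain the item.
function binarySearch(list, item) {
  let left = 0;                  // potential range: initially the entire list
  let right = list.length - 1;   // (0-based positions, unlike the figure)
  while (left <= right) {        // as long as the potential range is nonempty
    const middle = Math.floor((left + right) / 2);
    if (list[middle] === item) {
      return middle;                 // found it: done
    }
    if (list[middle] > item) {
      right = middle - 1;            // item must be in the left half
    } else {
      left = middle + 1;             // item must be in the right half
    }
  }
  return -1;  // range shrank to nothing: item is not in the list
}

const states = ["Alabama", "Colorado", "Delaware", "Florida", "Georgia",
                "Illinois", "Iowa", "Missouri", "Nevada", "New York",
                "Ohio", "Oregon", "Texas", "Utah", "Wyoming"];
console.log(binarySearch(states, "Illinois"));  // 5
```

Searching for "Illinois" checks "Missouri" (middle of the whole list), then "Florida", then "Illinois" itself: the same three steps as in the walkthrough, with every position shifted down by one because JavaScript arrays are numbered from 0.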
[Figure 10.5: Binary search locating "Illinois" in an alphabetized list of 15 state names.]
The Web page search.html provides an interactive tool for studying the behavior of the binary search algorithm. Included with the page is a list of state names similar to the one used in Figure 10.5. The user can enter a desired state name in a text box, then click a button to see the steps required to locate that name in the list.
Recall that, in the worst case, sequential search involves checking every single entry in a list. This implies that the time required to search a list using sequential search is proportional to the size of the list. Thus, sequential search is an O(N) algorithm, where problem size N is defined as the number of items in the list.
By contrast, when you perform a binary search, each examination of an entry enables you to rule out an entire range of entries. For example, in Figure 10.5, checking the midpoint of the list and finding that "Missouri" comes after "Illinois" eliminated 8 entries (positions 8 - 15). Subsequently checking the midpoint of the list's first half and finding that "Florida" comes before "Illinois" eliminated an additional 4 entries (positions 1 - 4). In general, each time you look at an entry, you can rule out roughly half the remaining entries. This halving effect yields the same logarithmic behavior that we noted in Algorithm 2 for finding the oldest person in a room. Thus, the time required to search a list using binary search is proportional to the logarithm of the size of the list, making binary search an O(log N) algorithm. This means that binary search is much more efficient than sequential search when searching large amounts of information. Using binary search, a phone book for a small town (10,000 people) could be searched in at most log2(10,000) = 14 checks, a large city (1 million people) in at most log2(1,000,000) = 20 checks, and the entire United States (280 million people) in at most log2(280,000,000) = 29 checks.
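These worst-case counts are simply log2(N) rounded up to the nearest whole number, which is easy to check in JavaScript (the function name is illustrative):

```javascript
// Worst-case number of binary search checks for a list of N items:
// the number of times N can be halved before reaching 1, rounded up.
function worstCaseChecks(n) {
  return Math.ceil(Math.log2(n));
}

console.log(worstCaseChecks(10000));      // 14  (small town)
console.log(worstCaseChecks(1000000));    // 20  (large city)
console.log(worstCaseChecks(280000000));  // 29  (entire United States)
```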
Programming is all about designing and coding algorithms for solving problems. The intended executor of those algorithms is the computer itself, which must be able to understand the instructions and then carry them out in order. Since computers are not very intelligent, the instructions they receive must be very specific. Unfortunately, human languages such as English are notoriously ambiguous; thus, the ability to program computers in a human language is still many years away. Instead, computer instructions are written in programming languages, which are more constrained and exact than human languages are.
The level of precision necessary to write a successful program often frustrates beginning programmers. However, it is much easier to program today's computers than those of fifty years ago. As we explained in Chapter 6, the first electronic computers were not programmable at all. Huge machines such as the ENIAC were wired to perform a certain computation. Although users could enter inputs via switches, the computational steps carried out by the computer could be changed only by rewiring the physical components into a different configuration. Clearly, this arrangement made specifying an algorithm tedious and time-consuming. With the advent of von Neumann's stored-program architecture, computers could be programmed instead of rewired. Users could specify algorithms as sequences of instructions in a language understood by the computer. The instructions could then be loaded into memory and executed by the computer.
Programming languages introduce a level of abstraction that insulates the programmer from the computer's low-level, mechanical details. Instead of having to worry about which wires should connect which components, the programmer must instead learn to speak the language that controls the machinery. However, the first programming languages, which were developed in the late 1940s, provided a very low level of abstraction. Instructions written in these languages correspond directly to the hardware operations of a particular machine. As such, these languages are called machine languages. Machine-language instructions deal directly with the computer's physical components, including main memory locations and registers, fast memory cells within the CPU. Examples of such primitive instructions might include ones that move a value from a main memory location into a register, add the contents of two registers, or store a register's value back into a memory location.
Of course, machine-language instructions aren't represented as statements in English. As we learned in Chapter 1, computers store and manipulate data in the form of binary values (i.e., patterns of 0s and 1s). It is not surprising, then, that machine-language instructions are also represented in binary: each of the above instructions would be encoded as a particular pattern of 0s and 1s.
Thus, machine-language programs consist of binary-number sequences corresponding to primitive machine operations. Although writing such a program and storing it in memory is certainly preferable to rewiring the machine, machine languages do not make for easy programming. Programmers have to memorize the binary codes that correspond to machine instructions, which is extremely tedious. Entering long sequences of zeros and ones is error-prone, to say the least. And, if errors do occur, trying to debug a sheet full of zeros and ones is next to impossible. To make matters worse, each type of computer has its own machine language that is specific to the machine's underlying hardware. Machine-language programs written on one type of computer can be executed only on identical machines.
To provide a taste of what these early programmers went through, Figure 10.6 depicts an excerpt from an actual machine-language program. How easy do you think it would be to look at this program and determine what it does? If you ran this program and it did not behave as desired, how easy would it be to work through the code and determine where it went wrong?
[Figure 10.6: Excerpt from an actual machine-language program.]
Although machine languages enable computers to be reprogrammed without being rewired, these languages introduce only a low level of abstraction. Since each binary instruction corresponds to a physical operation, programmers must still specify algorithms at the level of the machinery. Furthermore, the programmer has to specify all instructions and data as binary numbers. In the early 1950s, programming capabilities evolved with the introduction of assembly languages. Assembly languages provide a slightly higher level of abstraction by allowing programmers to specify instructions using words, rather than binary-number sequences. However, these words still correspond to operations performed by the computer's physical components, so assembly-language programming still involves thinking at a low level of abstraction.
In order to solve complex problems quickly and reliably, programmers need to write instructions at a higher level of abstraction that more closely relates to the way humans think. When we solve problems, we make use of high-level abstractions and constructs, such as abstract values, conditional choices, repetition, and modules. A language that includes such abstractions provides a more natural framework through which to solve problems. Starting in the late 1950s, computer scientists began developing high-level languages to address this need. The first such language was FORTRAN, written by John Backus at IBM in 1957. Two years later, John McCarthy invented LISP at MIT, and a multitude of high-level languages soon followed. JavaScript, the high-level language you have been using throughout this text, was invented in 1995 by Brendan Eich and his research team at Netscape Communications Corporation.
Figure 10.7 shows two different high-level language programs that perform the same task: each asks the user to enter a name and then displays a greeting. The program on the left is written in JavaScript, whereas the program on the right is written in C++. As you can see, high-level programming languages are much closer to human languages than machine languages are. With only a little knowledge of each language, anyone could detect and fix errors in these programs.
[Figure 10.7: Two programs that prompt the user for a name and display a greeting: JavaScript (left) and C++ (right).]
Another advantage of high-level languages is that the resulting programs are machine-independent. Because high-level instructions are not specific to the underlying computer's hardware configuration, programs written in high-level languages are theoretically portable across computers. Such portability is essential, because it enables programmers to market their software to a wide variety of users. If different versions of the same program were required for each brand of computer, software development would be both inefficient and prohibitively expensive.
While it is desirable for programmers to be able to reason at a high level of abstraction, we cannot forget that programs are written to be executed. At some point, a high-level program must be translated into machine instructions that the computer hardware can execute. Two basic techniques, known as interpretation and compilation, are employed to perform these translations. Before we formally define these approaches, let us examine the following analogies.
Consider the task of translating a speech from one language to another. Usually, this is accomplished by having an interpreter listen to the speech as it is being given and translate it a phrase or sentence at a time (Figure 10.8). In effect, the interpreter becomes a substitute for the original speaker, repeating the speaker's words in a different language after a slight delay to perform the translation. One advantage of this approach is that it provides a real-time translation: an observer can see the speaker and hear the translated words in the same approximate time frame. The immediacy of the translation also makes it possible for the observer to give feedback to the interpreter. For example, if the interpreter were using the wrong dialect of a language, the observer could correct this early in the speech.
[Figure 10.8: Real-time translation of a speech by an interpreter, one phrase or sentence at a time.]
If immediacy is not an issue, the interpreter could instead translate the speech in its entirety after the fact. Given a recording (or transcript), a translator could take the time to convert the entire speech to the desired language and then produce a recording of the translation (Figure 10.9). Although this approach is less immediate and more time-consuming than real-time translation, it has several advantages. Once the speech has been translated, multiple people can listen to it as many times as desired, without the need for retranslation. In addition, after the initial translation process, no additional delays occur when hearing the content of the speech. This approach is clearly necessary for translating books, since it is impractical to translate a book one phrase at a time every time someone wants to read it.
[Figure 10.9: After-the-fact translation of a recorded speech in its entirety.]
Translating a high-level language program into machine language is not unlike translating spoken languages. The two techniques we have described are analogous to the two approaches used in programming-language translation. The interpretation approach involves a program known as an interpreter, which translates and executes the statements in a high-level language program. An interpreter reads the statements one at a time, immediately translating and executing that statement before processing the next one. This is, in fact, what happens when JavaScript programs are executed. A JavaScript interpreter is embedded in modern Web browsers. When the browser loads a page containing JavaScript code, the interpreter executes the code one line at a time, displaying the code's output in the page (Figure 10.10).
[Figure 10.10: A JavaScript interpreter embedded in a Web browser translates and executes code one line at a time.]
By contrast, the compilation approach uses a program known as a compiler to translate the entire high-level language program into its equivalent machine-language instructions. The resulting machine-language program can then be executed directly on the computer (Figure 10.11). Most programming languages used in the development of commercial software, such as C and C++, employ the compilation approach.
[Figure 10.11: A compiler translates the entire high-level program into machine language before execution.]
As with the human-language translation options, tradeoffs exist between the interpretation and compilation approaches. An interpreter produces results almost immediately, since it reads in and executes each instruction before moving on to the next. This immediacy is particularly desirable for languages used in highly interactive applications. For example, JavaScript was specifically designed for adding dynamic, interactive features to Web pages. If the first statement in a JavaScript program involved prompting the user for a value, then you would like this to happen immediately, without having to wait for the rest of the program to be translated. The disadvantage of interpretation, however, is that the program executes more slowly. The interpreter must take the time to translate each statement as it executes, so there will be a slight delay between executing each statement. A compiler, on the other hand, produces a machine-language program that can be run directly on the underlying hardware. Once compiled, the program will run very quickly. If a language is intended for developing large software applications, execution speed is of the utmost importance; thus, languages such as C++ are translated using compilers. The drawback to compilation, however, is that the programmer must take the time to compile the entire program before executing it.