How to Solve Mathematical Problems

April 5, 2018 | Author: Anonymous | Category: Documents



How to Solve Problems
ELEMENTS OF A THEORY OF PROBLEMS AND PROBLEM SOLVING

Wayne A. Wickelgren
University of Oregon

W. H. Freeman and Company
San Francisco

The Dover reprint added "Mathematical" to the title.

Library of Congress Cataloging in Publication Data
Wickelgren, Wayne A. 1938-
How to solve problems.
Bibliography: p.
1. Mathematics-Problems, exercises, etc. 2. Problem solving. I. Title.
QA43.W52 511 73-15787
ISBN 0-7167-0846-9
ISBN 0-7167-0845-0 (pbk.)

Copyright © 1974 by W. H. Freeman and Company

No part of this book may be reproduced by any mechanical, photographic, or electronic process, or in the form of a phonographic recording, nor may it be stored in a retrieval system, transmitted, or otherwise copied for public or private use without written permission of the publisher.

Printed in the United States of America

1 2 3 4 5 6 7 8 9

For as long as I can remember, I have been more interested in reflecting on what I was doing or thinking and in thinking about ways to improve my methods than I have been in the particular things I was doing or thinking about. This emphasis on self-analysis and improvement reflects the influence of my mother and father, Alma and Herman Wickelgren, to whom this book is dedicated and whose values and practical principles have contributed so much to my life.
Contents

Preface
1 Introduction
2 Problem Theory
3 Inference
4 Classification of Action Sequences
5 State Evaluation and Hill Climbing
6 Subgoals
7 Contradiction
8 Working Backward
9 Relations Between Problems
10 Topics in Mathematical Representation
11 Problems from Mathematics, Science, and Engineering
References
Index

Preface

In the mathematics and science courses I took in college, I was enormously irritated by the hundreds of hours that I wasted staring at problems without any good idea about what approach to try next in attempting to solve them. I thought at the time that there was no educational value in those "blank" minutes, and I see no value in them today. The general problem-solving methods described in this book virtually guarantee that you will never again have a blank mind in such circumstances. They should also help you solve many more problems and solve them faster. But whether or not you solve any particular problem, you will always have lots of ideas about ways to attack the problem. Also, the use of general problem-solving methods often indicates the properties of the principles you need to know from the subject matter that the problem is attempting to teach and test. Thus, whether you succeed or fail in solving any particular problem, the effort will be interesting and educational.

The theoretical and practical analyses of problems and problem solving presented here were heavily influenced by advances made over the last · years in the fields of artificial intelligence and computer simulation of thought. My greatest intellectual debts are to Allen Newell, Herbert Simon, and George Polya.
Newell and Simon's analyses of problems and problem solving constituted my starting point for working in this area, and many of the best ideas in the book are ideas they have already presented in one form or another. Many other good ideas were taken more or less directly from Polya, whose books on mathematical problem solving are a rich source of methods and a stimulus for thought.

My efforts to understand and organize problem-solving methods began in 1959 when, as an undergraduate at Harvard, I first became aware of the pioneering work of Allen Newell, Cliff Shaw, and Herb Simon on the computer simulation of thinking. During graduate school at the University of California, Berkeley, I regarded problem solving as my major research area. I do not think that my experimental studies of human problem solving ever amounted to much. However, I thought at the time (and think today) that my theoretical (mathematical) understanding of problems and problem solving was immeasurably increased and that this greatly enhanced my ability to solve all kinds of mathematical problems.

Shortly after coming to MIT as a new faculty member in the Psychology Department, I decided that one contribution I could make to the undergraduates there was to teach them this newly acquired skill of mathematical problem solving. The students enjoyed the course and, more important, reported back to me in later years that they thought that their problem-solving ability in mathematics, science, and engineering courses had been greatly increased by learning these general problem-solving methods. Enrollment in the course went from · to · in three years, when I stopped giving it because my primary research interest had shifted to human memory.
Some years later, after moving to the University of Oregon, I decided that I now had the time to write a book containing all the ideas that I had acquired from others and generated myself concerning problems and problem solving.

The purpose of the book is to improve your ability to solve all kinds of mathematical problems, whether in mathematics, science, engineering, or business, or purely recreational mathematical problems (puzzles, games, and so on). This book is primarily intended for college students who are currently taking elementary mathematics, science, or engineering courses. However, I hope that students with less mathematical background can read the book and master the methods without an undue degree of additional effort and also that more advanced readers will profit from it without being bored. I believe that almost everyone who solves mathematical problems can profit substantially from learning the general problem-solving methods described here, and I have tried to write in a way that will communicate effectively to all such people.

The approach is to define each general problem-solving method and illustrate its application to simple recreational mathematics problems that require no more mathematical background than that possessed by someone with a year of high school algebra and a year of plane geometry. An elementary knowledge of "new mathematics" (sets, relations, functions, probability, and so on) would be helpful, and some of this is briefly taught in a later chapter. The solutions to example problems are presented gradually, usually in the form of hints, to give the reader more and more chances to go back and solve the problem. This technique is founded on the belief that you will remember best what you discover for yourself.
The book aims to guide you to discovering how to apply general problem-solving methods to a rich variety of problems. I believe that if you read this book and try to apply the methods to around · or · of your own problems, you will improve substantially in problem-solving ability, with consequent benefits in job performance, school grades, and "intelligence" test scores (including SAT college entrance exams and the Graduate Record Exam).

Finally, I would like to make a negative acknowledgment. This book was written in spite of my four-year-old son, Abraham, and my six-year-old daughter, Ingrid, who are such delightful people that I cannot resist spending vast amounts of time with them.

October 1973
Wayne A. Wickelgren

How to Solve Problems

1 Introduction

The purpose of this book is to help you improve your ability to solve mathematical, scientific, and engineering problems. With this in mind, I will describe certain elementary concepts and principles of the theory of problems and problem solving, something we have learned a great deal about since the 1950s, when the advent of computers made possible research on artificial intelligence and computer simulation of human problem solving. I have tried to organize the discussion of these ideas in a simple, logical way that will help you understand, remember, and apply them.

You should be warned, however, that the theory of problem solving is far from being precise enough at present to provide simple cookbook instructions for solving most problems. Partly for this reason and partly for reasons of intrinsic merit, teaching by example is the primary approach used in this book. First, a problem-solving method will be discussed theoretically, then it will be applied to a variety of problems, so that you may see how to use the method in actual practice.
To master these methods, it is essential to work through the examples of their application to a variety of problems. Thus, much of the book is devoted to analyzing problems that exemplify the use of different methods. You should pay careful attention to these problems and should not be discouraged if you do not perfectly understand the theoretical discussions. The theory of problem solving will undoubtedly help those students with sufficient mathematical background to understand it, but students who lack such a background can compensate by spending greater time on the examples.

SCOPE OF THE BOOK

This book is primarily a practical guide to how to solve a certain class of problems, specifically, what I call formal problems or just "problems" (with the adjective formal being understood in later contexts). Formal problems include all mathematical problems of either the "to find" or the "to prove" character but do not include problems of defining "mathematically interesting" axiom systems. A student taking mathematics courses will hardly be aware of the practical significance of this exclusion, since defining interesting axiom systems is a problem not typically encountered except in certain areas of basic research in mathematics. Similarly, the problem of constructing a new mathematical theory in any field of science is not a formal problem, as I use the term, and I will not discuss it in this book. However, any other mathematical problem that comes up in any field of science, engineering, or mathematics is a formal problem in the sense of this book.

Problems such as what you should eat for breakfast, whether you should marry X or Y, whether you should drop out of school, or how you can get yourself to spend more time studying are not formal problems.
These problems are virtually impossible at the present time to turn into formal problems because we have no good ways of restricting our thinking to a specified set of given information and operations (courses of action we might take), nor do we often even know how to specify precisely what our goals are in solving these problems. Understanding formal problems can undoubtedly make some contributions to your thinking in regard to these poorly specified personal problems, but the scope of the present book does not include such problems. Even if it did, it would be extremely difficult to specify any precise methods for solving them.

However, formal problems include a large class of practical problems that people might encounter in the real world, although they usually encounter them as games or puzzles presented by friends or appearing in magazines. A practical problem such as how to build a bridge across a river is a formal problem if, in solving the problem, one is limited to some specified set of materials (givens), operations, and, of course, the goal of getting the bridge built. In actuality, you might limit yourself in this way for a while and, if no solution emerged, decide to consider the use of some additional materials, if possible. Expanding the set of given materials (by means other than the use of acceptable operations) is not a part of formal problem solving, but often the situation presents certain givens in sufficiently disguised or implicit form that recognition of all the givens is an important part of skill in formal problem solving. That skill will be discussed later.

Practical problems or puzzles of the type we will consider differ from problems in mathematics, science, or engineering in that to pose them requires less background information and training.
Thus, puzzle problems are especially suitable as examples of problem-solving methods in this book, because they communicate the workings of the methods most easily to the widest range of readers. For this reason, puzzle problems will constitute a large proportion of the examples used in this book, at least prior to the last chapter.

In principle, it might seem that most important problem-solving methods would be unique to each specialized area of mathematics, science, or engineering, but this is probably not the case. There are many extremely general problem-solving methods, though, to be sure, there are also special methods that can be of use in only a limited range of fields.

It may be quite difficult to learn the special methods and knowledge required in a particular field, but at least such methods and knowledge are the specific object of instruction in courses. By contrast, general problem-solving methods are rarely, if ever, taught, though they are quite helpful in solving problems in every field of mathematics, science, and engineering.

GENERAL VERSUS SPECIAL METHODS

The relation between specific knowledge and methods, on the one hand, and general problem-solving methods, on the other hand, appears to be as follows. When you understand the relevant material and specific methods quite well and already have considerable experience in applying this knowledge to similar problems, then in solving a new problem you use the same specific methods you used before. Considering the methods used in similar problems is a general problem-solving technique.
However, in cases where it is obvious that a particular problem is a member of a class of problems you have solved before, you do not need to make explicit, conscious use of the method: simply go ahead and solve the problem, using methods that you have learned to apply to this class. Once you have this level of understanding of the relevant material, general problem-solving methods are of little value in solving the vast majority of homework and examination problems for mathematics, science, and engineering courses.

When problems are more complicated, in the sense of involving more component steps, and are not highly similar to previously solved problems, the use of general problem-solving methods can be a substantial aid in solution. However, such complex problems will be encountered only rarely by the beginning mathematics, science, and engineering students taking courses in high school and college. More important to the immediate needs of such students is the role of general problem-solving methods in simple homework and examination problems where one does not completely understand the relevant material and does not have considerable experience in solving the relevant class of problems. In such cases, general problem-solving methods serve to guide the student to recognize what relevant background information needs to be understood. For example, when one understands the general problem-solving method of setting subgoals, one can often set particular subgoals that directly indicate what types of specific information are being tested (and thereby taught) by a particular problem. One then knows what sections of the textbook to reread in order to understand the relevant material.
If, however, the book is not available, as in many examination situations, general problem-solving methods provide one with powerful general methods for retrieving from memory the relevant background information. For example, the use of general problem-solving methods can indicate for which quantities one needs a formula and can provide a basis for choosing among different alternative formulas.

Frequently, a student may know all the definitions, formulas, and so on, but not have strong associations to this knowledge from the cues present in each type of problem to which this knowledge is relevant. With experience in solving a variety of problems to which the knowledge is relevant, one will develop strong direct associations between the cues in such problems and this relevant knowledge. However, in the early stages of learning the material, a student will lack such direct associations and will need to use general problem-solving methods to indicate where in one's memory to retrieve relevant information or where in the book to look it up. Assuming this idea is true (and this book aims to convince you it is), mastering general problem-solving methods is important to you both so you can use problems as a learning device and so you can achieve the maximum range of applicability of the knowledge you have stored in mind, whether on an examination, on a job, or elsewhere.

The goal of this book is to teach as many of these general problem-solving methods as I know about, so that if you spend the time to master these methods you can more effectively learn the subject matter of your courses. Also, since the ability to use the information given in most mathematics, science, and engineering courses is often primarily the ability to solve problems in these fields, the book aims to increase this ability to use knowledge.
RELATION TO ARTIFICIAL INTELLIGENCE

It should be emphasized that this text is primarily a practical how-to-do-it book in a field where the level of precise (mathematical) formulation is far below what I am sure it will be in the future, perhaps even the near future. Artificial intelligence and computer simulation of human problem solving are currently very active fields of research, and results from some of this work have heavily influenced this book. However, theoretical formulations of problem solving superior to those we currently have will eventually make the present formulation outdated. Nevertheless, the methods described in the present book, however imperfectly, can be of substantial benefit to any student who masters them. When someone has a beautiful mathematical theory of problems and problem solving sometime in the future, then clearer and more effective how-to-do-it books can be written. Meanwhile, it is my hope that this book will help many people to solve problems better than they did before.

APPLYING METHODS TO PROBLEMS

As discussed previously, to master the problem-solving methods described in this book, it is necessary to study the example problems illustrating their use. The problems and solutions analyzed in Chapters 3 to 10 illustrate the use of the methods discussed in the particular chapter. Chapter 11 considers a variety of homework and examination problems for mathematics, science, and engineering courses. Of course, you probably have lots of your own problems to solve in school or work, and you should begin using the methods on these problems immediately. Merely reading this book provides only the beginning concepts necessary to mastering general problem-solving methods. Practice in using the methods is essential to achieving a high level of skill.
Everyone who solves problems uses many or all of the methods described in this book, but if you are not an extremely good problem solver, you may be using the methods less effectively or more haphazardly than you could be by more explicit training in the methods. At first, the application of such explicitly taught problem-solving methods involves a rather slow, conscious analysis of each problem. There is no particular reason to engage in this careful, conscious analysis of a problem when you can immediately get some good ideas on how to solve it. Just go ahead and solve the problem "naturally." However, after you solve it or, even better, while you are solving it, analyze what you are doing. It will greatly deepen your understanding of problem-solving methods, and you might discover new methods or a new application of an old method.

As you get extensive practice in using these problem-solving methods, you should become so skilled in their use that the process becomes less conscious and more automatic or natural. This is the way of all skill learning, whether driving a car, playing tennis, or solving mathematical problems.

2 Problem Theory

FOUR SAMPLE PROBLEMS

To illustrate the concepts involved in the theory of problems described in this chapter, we will begin with four sample problems.

Instant Insanity

Instant Insanity is the name of a popular puzzle consisting of four small cubes. Each face of every cube has one of four colors: red (R), blue (B), green (G), or white (W). Each cube has at least one of its six faces with each of the four different colors, but the remaining two faces necessarily must repeat one or two of the colors already used. The exact configurations of colors on the faces of the cubes are shown in Fig. 2-1.
The faces of the cubes in the figure have been cut along the edges and flattened out for easy presentation on the two-dimensional page. (To reconstruct the cube in three dimensions, one would simply cut out the outlined figure, turn the top flap over on the top and the bottom over on the bottom, and wrap the left side and back around to join up with the right side at the rear of the cube.) For convenience, the faces of one cube in the figure have been labeled front, top, bottom, back, left side, right side. If you think of the front cube as being closest to you (facing you), then mentally constructing the cube from the two-dimensional drawing should be relatively easy to do. However, you may wish to buy the puzzle to provide a more concrete and enjoyable representation.

FIGURE 2-1 (diagram not reproduced). The six colored faces of each of the four cubes in the Instant Insanity puzzle. You could cut out each of the above figures and fold along the edges to make cubes. In the above figure R = red, B = blue, W = white, and G = green. The cubes have been given these "names": 2B2G, 2G2W, 2R2W, and 3R, which indicate the colored faces of which they have more than one.

The goal of the puzzle is to arrange the cubes one on top of the other in such a way that they form a stack four cubes high, with each of the four sides having exactly one red cube, one blue cube, one green cube, and one white cube.

Chess Problem

From the board configuration shown in Fig. 2-2, describe a sequence of moves such that white can achieve mate in five moves.

Find Problem from Mechanics

What constant force will cause a mass of 3 kilograms to achieve a speed of 30 meters per second in 6 seconds, starting from rest?
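As a numerical check on the mechanics problem, here is a short worked sketch. It assumes constant acceleration from rest and Newton's second law; the symbols a, v, v_0, t, m, and F are labels introduced here for illustration and do not appear in the problem statement.

```latex
\begin{align*}
a &= \frac{v - v_0}{t}
   = \frac{30\ \text{m/s} - 0\ \text{m/s}}{6\ \text{s}}
   = 5\ \text{m/s}^2, \\
F &= m a
   = (3\ \text{kg})(5\ \text{m/s}^2)
   = 15\ \text{N}.
\end{align*}
```

Under these assumptions the required constant force is 15 newtons.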
Proof Problem from Modern Algebra

You are given a mathematical system consisting of a set of elements (A, B, C), with two binary operations (call them addition and multiplication) that combine two elements to give a third element. The system has the following properties:

(1) Addition and multiplication are closed; that is, A + B and A · B are members of the original set for all A and B in the set.
(2) Multiplication is commutative; that is, AB equals BA for all A and B in the set.
(3) Equals added to equals are equal; that is, if A = A' and B = B', then A + B = A' + B', for all A, B, A', B' in the set.
(4) The left distributive law applies; that is, C(A + B) = CA + CB, for all A, B, C in the set.
(5) The transitive law also applies; that is, if A = B and B = C, then A = C.

From these given assumptions, you are to prove the right distributive law, that is, that (A + B)C = AC + BC, for all A, B, C in the set.

FIGURE 2-2 (diagram not reproduced). A famous chess problem. White to achieve mate in five moves.

WHAT IS A PROBLEM?

All the formal problems of concern to us can be considered to be composed of three types of information: information concerning givens (given expressions), information concerning operations that transform one or more expressions into one or more new expressions, and information concerning goals (goal expressions). There may be intermediate subgoal expressions mentioned explicitly in the problem, or the problem solver may define these subgoal expressions for himself; but we will assume that there is only one terminal goal per problem. Any problem stated with two or more independent terminal goals could always be viewed as two or more problems with the same givens and operations and different goals.
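The three-part description of a problem can be made concrete with a small data structure. What follows is only a minimal sketch under stated assumptions, not a formulation taken from the book: the names `Problem`, `modus_ponens`, `solve`, and `toy` are hypothetical, expressions are modeled as plain strings, and the single operation only ever adds new expressions to the set of existing ones.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Tuple

# Hypothetical modeling choices: an expression is a string, and the
# "world of the problem" is the set of expressions currently available.
Expression = str
State = FrozenSet[Expression]
Operation = Callable[[State], Iterable[Expression]]


@dataclass(frozen=True)
class Problem:
    givens: State                      # expressions present at the outset
    operations: Tuple[Operation, ...]  # ways to produce new expressions
    goal: Callable[[State], bool]      # test for the single terminal goal


def modus_ponens(state: State) -> Iterable[Expression]:
    """From "P" and "P -> Q" in the state, produce "Q"."""
    for expr in state:
        if " -> " in expr:
            premise, conclusion = expr.split(" -> ", 1)
            if premise in state:
                yield conclusion


def solve(problem: Problem, max_rounds: int = 10) -> bool:
    """Apply every operation repeatedly until the goal holds or nothing new appears."""
    state = problem.givens
    for _ in range(max_rounds):
        if problem.goal(state):
            return True
        derived = set()
        for operation in problem.operations:
            derived.update(operation(state))
        if derived <= state:           # no new expressions: give up
            return False
        state = state | derived
    return problem.goal(state)


# A toy problem: the givens are A, "A implies B", and "B implies C"; the goal is C.
toy = Problem(
    givens=frozenset({"A", "A -> B", "B -> C"}),
    operations=(modus_ponens,),
    goal=lambda state: "C" in state,
)
```

Calling `solve(toy)` searches forward from the givens. Swapping in a different `goal` while keeping the same givens and operations gives a different problem, which mirrors the remark that two independent terminal goals amount to two problems.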
For convenience and accuracy, I tend to take the more formal view that a problem involves expressions of information rather than actual physical objects. Even in a practical problem stated in terms of physical objects, it is always possible to consider objects or sets of properties of objects as represented by expressions. Indeed, we must have representations in our heads of objects, properties of objects, and operations when we solve practical problems, since we certainly do not have the real objects there. Thus, definitions of problems, solutions, and methods need not make any distinction between practical (concrete) and symbolic (abstract, mathematical). However, when dealing with a practical problem, there is no need to talk of representations or expressions, if the problem is more easily solved without using this more abstract language.

Givens

Givens refer to the set of expressions that we accept as being present in the world of the problem at the onset of work on the problem. Indeed, the givens and the operations together constitute the entire world of the problem at the beginning of work on it. This definition of the givens encompasses expressions representing objects, things, pieces of material, and so on, as well as expressions representing assumptions, definitions, axioms, postulates, facts, and the like.

In some kinds of puzzles the givens consist of the materials. For example, the givens in Instant Insanity are four cubes, with each side of each cube having one of four colors (red, blue, green, or white), as shown in Fig. 2-1. In the chess problem, the givens are the pieces of each player and their positions on the board plus the information concerning whose move it is. In the particular chess problem shown in Fig.
2-2, the givens are that white has a king, a rook, and a pawn at the positions indicated; that black has a king, a bishop, and two pawns at the positions indicated; and that it is white's move. The implicitly specified given information consists of all the rules of chess, including such information as that a rook can move any number of squares along a row or column until blocked by another piece, that a king can move one square in any direction (horizontally, vertically, or diagonally), and that checkmate consists of putting the opponent's king into a position where it would be captured on the next move if it was not moved out of the square it was in and such that all squares that the king could move to would also result in capture.

In the find problem, the givens are the information explicitly stated in the problem plus whatever other mathematical or scientific knowledge is to be implicitly assumed as part of the givens. In the physics problem described above, the explicitly described given information includes the following: the mass of the given object is 3 kilograms, its initial speed is zero, its final speed after 6 seconds of applying a force is 30 meters per second, and the force and mass are constant. Implicitly specified information includes Newton's second law, that force equals mass times acceleration, and the rules of algebra and possibly calculus (depending upon how one solves the problem).

In a mathematical proof problem, the givens are all the axioms that one is allowed to assume. The givens in the particular proof problem described above are three of the five assumptions: (1) that the system is closed, (2) that multiplication is commutative, and (4) that the left distributive law holds.
Assumptions (3), that equals added to equals are equal, and (5), that the transitive law holds (that is, if A = B and B = C, then A = C), are really rules of inference rather than givens. Rules of inference are operations, discussed below.

Operations

Operations refer to the actions you are allowed to perform on the givens or on expressions derived from the givens by some previous sequence of actions. Other terms for operations include transformations and rules of inference, though the latter term seems to be appropriate only for conclusion-drawing problems and not so appropriate for action-oriented problems.

In Instant Insanity, the allowable operations can be conceptualized in a variety of equivalent ways, the simplest of which is just that cubes can be placed on top of one another in a single tower (such that all faces of all cubes are either parallel or perpendicular to one another).

In a chess problem, the allowable operations are given by the allowable moves of each piece on the board of the player whose turn it is to move.

In a find problem, the operations are sometimes peculiar to the problem but are often the operations (or rules of inference) of mathematics or logic. In the mechanics problem described at the beginning of this chapter, multiplying or dividing both sides of an equation by the same quantity is an allowable operation.

In a proof problem, the operations are those rules of inference that are allowable within the mathematical system in question. For example, in propositional logic, if proposition A is true and if the statement "A implies B" is true, then one may infer that proposition B is true.
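The modern-algebra proof problem posed earlier in the chapter can be worked through using only inferences of this kind. The following is one possible derivation, offered as a sketch rather than as the solution the text has in mind. It treats properties (3) and (5) as the rules of inference and properties (1), (2), and (4) as the givens; the line-by-line justifications are annotations added here.

```latex
\begin{align*}
(A + B)C &= C(A + B) && \text{by (2), since } A + B \text{ is in the set by (1)} \\
C(A + B) &= CA + CB  && \text{by (4)} \\
(A + B)C &= CA + CB  && \text{by (5), from the two lines above} \\
CA = AC, \quad CB &= BC && \text{by (2), applied twice} \\
CA + CB &= AC + BC   && \text{by (3), from the line above} \\
(A + B)C &= AC + BC  && \text{by (5), from the third and fifth lines}
\end{align*}
```

Each use of (3) or (5) takes two existing equalities and yields one new equality, while (1), (2), and (4) only supply the equalities to start from.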
In the modern-algebra proof problem described at the beginning of the chapter, the two rules of inference that constitute the allowable operations in this problem are property (3), that if A = A' and B = B', then A + B = A' + B', and property (5), that if A = B and B = C, then A = C. Note that these operations take two input expressions and produce a single new output expression. Also note that, although addition and multiplication are certainly operations within the mathematical system described in the proof problem, multiplication and addition are not the operations to be used in solving the problem. Something that is an operation in one problem may be only a part of the given expressions in another problem.

Let me distinguish between destructive operations, which produce new expressions by destroying old expressions, and nondestructive operations, which produce new expressions to increase the set of existing expressions without destroying any old expressions. In the above examples, Instant Insanity and chess involve destructive operations; algebraic find problems and logical proof problems involve nondestructive operations.

Although many problems allow one to use any allowable operation at any time, some problems place restrictions on the number of times an operation can be used or the conditions under which it can be used. For instance, in chess a pawn first can be moved either one or two squares, but thereafter it can be moved ahead only one square at a time.

Let us adopt the convention that an operation refers to a class of actions, with the actions being distinguished only by the operands (expressions or objects) to which the operation is applied. Assume that a particular operation, F, can be applied to any expression within some set of expressions, {Xi}.
The particular X_i to which we will apply the operation will be called the operand. The operation applied to a particular operand, namely, F(X_i), will be called an action. Obviously, these definitions of operations, operands, and actions generalize easily to functions of more than one variable, for example, F(x, y, z).

Goals

The goal of a problem is a terminal expression one wishes to cause to exist in the world of the problem. There are two types of goals specified in problems: completely specified goal expressions in proof problems and incompletely specified goal expressions in find problems.

For example, consider the problem of finding the value of x, given the expression 4x + 5 = 17. In this problem, one can regard the goal expression as being of the form x = ___, where the correct number is to be found in order to fill in the blank in the goal expression. The goal expression in a find problem of this type is incompletely specified. If the goal expression were specified completely (for example, x = 3), then the problem would be a proof problem, with only the sequence of operations to be determined in order to solve the problem. Of course, if one were not guaranteed that the goal expression x = 3 was true, then the terminal goal expression should really be considered to be incompletely specified, something like the statement "x = 3 is (true or false)."

In Instant Insanity, the goal is incompletely specified. The goal is to get a tower of four cubes arranged in such a way that each of the four rows of sides has one of each of the four colors. However, one is not told exactly what the arrangement of the colors is to be; if one were, it would be a very simple proof problem instead of a rather hard find problem. In many chess problems, the goal is to checkmate the other player in some small number of moves.
This goal is clear, but it is certainly not the same as giving a complete specification of the terminal board position.

Incomplete specification of the goal state does not imply any ambiguity about what constitutes a correct or incorrect solution to the problem, as I shall define the term solution. There may be more than one correct solution to a problem, but all formal problems discussed in this book have the property that a solution is either correct or incorrect, without ambiguity.

One reason for discussing the completeness of specification of the goal is to clearly describe the nature of the difference between find and proof problems. Another reason is to point out that find problems have a terminal or goal expression that is specified (in various ways and to different degrees) in a manner rather similar to the theorem to be proved in a proof problem. It turns out that the degree of similarity in the specification of the goal expression is sufficient to allow most of the same problem-solving methods to be applied to find problems and to proof problems. Working backward from the goal is probably the only general problem-solving method that is used primarily in proof problems and virtually never in find problems. All other methods discussed in this book are frequently used in both find and proof problems. Thus, although the distinction between find and proof problems is perhaps the most familiar distinction between types of problems, it has only moderate significance for problem-solving methods.

Implicit Specification of Givens, Operations, and Goals

Although some problems (for example, some proof problems) explicitly specify all of the givens, operations, and goals, other problems specify them only implicitly.
For example, in solving the typical physics problem, all of the assumptions, operations, and previously proved theorems of real-variable and complex-variable mathematics are at one's disposal in working on the problem, though this fact is generally not stated explicitly. Usually, the implicit givens, operations, and goals of a problem are clear to the problem solver, but sometimes they are not.

Incomplete Specification of Givens, Operations, and Goals

There are often deliberately incomplete statements of givens, operations, and goals. That is, the problem solver may have some degree of choice among a set of possible given expressions, a set of possible operations, and a set of possible goal expressions. We have already discussed the case where the terminal goal expression is not specified completely, but instead the problem solver has to find the correct expression to fill into a blank space in the terminal goal expression.

Many find problems, such as the example given earlier of finding x = ___ given 4x + 5 = 17, are equivalent to a problem with a completely specified goal, 4x + 5 = 17, but with an incompletely specified given, x = ___. Equivalences like this obtain where operations are uniquely reversible (that is, where there exist inverse operations for all operations).

In algebra problems (for instance, solving for x = ___ in a cubic equation such as x^3 + 2x^2 - x - 2 = 0) it is probably somewhat better to view the problem as having a completely specified goal expression, x^3 + 2x^2 - x - 2 = 0, and an incompletely specified given expression, x = ___, than the reverse. Often you are asked to determine all the values of x that satisfy the equation, which means that you need to know all the values of x from which you could derive the complicated equation.
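Viewing the cubic as a fixed goal with an unknown given suggests a simple check: propose a value for x and see whether the goal equation follows from it. The sketch below does this over a small candidate set. The choice of candidates (the integers from -5 to 5) is an assumption made for illustration; nothing in the text prescribes it.

```python
def cubic(x):
    """Left-hand side of the goal equation x^3 + 2x^2 - x - 2 = 0."""
    return x**3 + 2 * x**2 - x - 2


# Hypothetical candidate givens "x = k" for k from -5 to 5.
candidates = range(-5, 6)

# Keep each candidate from which the goal equation can be derived.
solutions = [x for x in candidates if cubic(x) == 0]
print(solutions)  # [-2, -1, 1]
```

Each surviving candidate is a separate given state from which the same completely specified goal can be reached.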
Basically, this is a hypothesis generation (guessing) and testing situation, because the direction of implication (by ordinary arithmetic operations) is from an unknown x = ___ to a known goal, x^3 + 2x^2 - x - 2 = 0, not the reverse. There are three values of x that satisfy the equation x^3 + 2x^2 - x - 2 = 0, so the latter equation cannot imply three contradictory equations, x = 1, x = -1, and x = -2.

Other examples of problems with incomplete specification of givens or operations include many construction problems. Many such problems require one to build something with a range of possible given materials and operations, but there are costs or other restrictions attached to the use of the materials (givens) and operations. The problem solver must select an unordered set of materials and an ordered set of (sequence of) operations that satisfies some constraints specified in the problem and also achieves the goal.

Optimization problems are a natural extension of problems where givens or operations have costs. In an optimization problem, one is supposed to find the way to achieve the goal that minimizes some cost or maximizes some utility.

WHAT IS A PROBLEM STATE?

A problem state, the state of the world of a problem, is the set of all the expressions that exist in the world of the problem at a particular time. The problem state can be changed only by applying an operation to one or more expressions existing in the previous problem state to produce one or more new expressions.

In problems that have only nondestructive operations, a problem state consists of all the expressions that have been obtained from the givens up to that moment in working on the problem. In problems that have one or more destructive operations, the problem state includes only the currently existing expressions (those obtained that have not been destroyed).
Often problems with destructive operations are considered to have only a single expression representing their state at the current moment, with the operations being able to change that entire state into a new state. In such problems, there is no reason to distinguish between state and expression.

The given problem state is the set of all given expressions. When the givens are not specified completely, there are multiple possible given states. When the givens are completely specified, there is a unique given state. A goal state is a state that includes the goal expression. When the goal is not completely specified or when there are nondestructive operations, there are multiple possible goal states. When the goal is completely specified and all operations are destructive, there may be a unique goal state.

WHAT IS A SOLUTION?

A solution to a problem contains all four of the following parts. (a) Complete specification of the givens; that is, a unique given state from which the goal can be derived via a sequence of allowable operations. (b) Complete specification of the set of operations to be used. (c) Complete specification of the goals. (d) An ordered succession or sequence of problem states, starting with the given state and terminating with a goal state, such that each successive state is obtained from the preceding state by means of an allowable action (operation applied to one or more expressions in the preceding state).

Part (d) really includes the first three parts, so it may be taken to be a sufficient definition of a problem solution. However, part (d) appears to place primary emphasis on the sequencing of actions, and in many problems it is the specification of givens or operations that constitutes the main source of difficulty in the problem. Thus, it is important to give these matters proper emphasis.
A simple and completely equivalent definition of a solution is to say that a solution is a sequence of allowable actions that produces a completely specified goal expression.

In Instant Insanity, a solution could be considered to consist of some given configuration of the four cubes, followed by a sequence of different configurations of the cubes, each of which was obtained by an allowable operation from the previous configuration, and ending with a configuration that satisfies the goal of having each of the four colors represented once on each of the four sides of the row of four cubes.

In a chess problem, a solution consists of some given board configuration, followed by a sequence of board configurations, each of which is derived from the previous configuration by an allowable move, and ending with a checkmate configuration. If the problem asserts that this solution is to be accomplished with some restrictions on the number of moves, then the description of the problem state must include a move counter that is increased by one on every move. Not only must the terminal expression be a checkmate position, but the move counter must also be less than or equal to some value. Chess problems are often optimization problems, in which the different solutions have different values depending upon how few moves they require.

In algebraic find problems or logical proof problems, the solution consists of a sequence of states such that (a) the given state is the conjunction of all the givens, (b) each successive state is derived from the previous state by adding an expression that has been obtained by applying an allowable operation to one or more of the previously obtained expressions, and (c) the goal state includes a completely specified goal expression.
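The definition of a solution as a sequence of allowable actions leading from the given state to a goal state can be made concrete with a toy problem. Everything specific in the sketch below is hypothetical: the state is a single integer (so every operation is destructive), the two operations `double` and `add_three` are invented for illustration, and the goal is to reach 11 starting from 1.

```python
# Hypothetical toy problem: the whole problem state is one integer.
OPERATIONS = {
    "double": lambda n: 2 * n,
    "add_three": lambda n: n + 3,
}


def is_solution(given, actions, is_goal):
    """Replay a sequence of action names from the given state.

    Returns True only if every action is an allowable operation and the
    final state satisfies the goal test.
    """
    state = given
    for name in actions:
        if name not in OPERATIONS:  # not an allowable operation
            return False
        state = OPERATIONS[name](state)  # the old state is replaced
    return is_goal(state)


def reaches_eleven(n):
    return n == 11


print(is_solution(1, ["double", "double", "double", "add_three"], reaches_eleven))  # True
print(is_solution(1, ["double", "add_three"], reaches_eleven))  # False
```

The first sequence passes through the states 1, 2, 4, 8, 11, so it ends in a goal state; the second ends at 5 and is therefore not a solution.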
When there are several given expressions, the most common practice is to write down the given expressions only as soon as they are needed for some operation. This procedure makes it easier for the reader to follow the proof, but I think it is more logical to regard all the givens as having been written down in the given problem state. If there is some psychological benefit in writing them down again in problems involving only nondestructive operations, of course you should do it. But I do not think this writing exercise should influence your definition of a problem solution.

STATE-ACTION TREE

Although the solution of a problem can be defined in terms of either a sequence of actions or a sequence of states (terminating with the achievement of the goal), it is very useful to represent both the possible sequences of actions and the possible sequences of states in a common diagram, which could be called a state-action tree for a problem. An example of such a tree is shown in Fig. 2-3.

FIGURE 2-3 State-action tree for a problem with two possible actions at each state, showing how the number of possible "terminal" states at level n (equaling the number of different action sequences that are n actions in length) increases geometrically with n.

In a state-action tree, the nodes or branch points of the tree represent all the possibly different problem states that could result from all the different action sequences. The concept of a node in a state-action tree differs from the concept of a problem state in a somewhat subtle, but important, way. To be sure, every node represents a state of the problem, but two distinct nodes do not necessarily represent two distinct or different states of the problem. That is, two or more action sequences, which result in two different nodes, may result in two identical problem states. Strictly speaking, a node represents the sequence of actions or the sequence of states that led up to it, not the problem state achieved by that sequence of actions or states. However, as long as you bear in mind that distinct nodes do not necessarily represent distinct problem states, there is no harm in considering a node to represent a state, rather than the sequence of actions or states that led up to it.

The branches from each node represent the different actions that could be selected at that node. Obviously, the actions possible at each node need not be similar to the actions possible at any other node, but in many problems the actions that are possible at each node fall into the same action classes or operations, with only the available operands being different; however, this similarity is not true of every problem. In addition, the number of possible actions at each node need not be equal either at the same level or across different levels.

These possible differences from node to node do not alter the primary lesson to be learned from examining a state-action tree, namely, how rapidly the number of possible nodes or action sequences increases in such a tree as a function of level, that is, the length of the prior action sequence. If M actions occur at each node, then there are M^n possible action (or state) sequences terminating at level n. Each of these different action (state) sequences is represented by a node at level n in the state-action tree, so there are M^n different nodes at level n. This geometric (discrete exponential) increase is perhaps the single most important fact to consider in developing problem-solving methods.
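To get a feel for the M^n growth, it helps to tabulate a few values. The helper name `nodes_at_level` is an assumption introduced for this sketch; the formula itself is the one stated above.

```python
def nodes_at_level(m, n):
    """Number of action sequences of length n when m actions are possible at every node."""
    return m ** n


# Two actions per node: 1, 2, 4, 8, 16 nodes at levels 0 through 4.
for level in range(5):
    print(level, nodes_at_level(2, level))

# Ten actions per node and sequences only six actions long.
print(nodes_at_level(10, 6))  # 1000000
```

Even with modest branching, sequences of quite ordinary length already give far more nodes than anyone could examine one by one.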
To solve a problem you must state the exact sequence of actions (states) that results in the goal, and many problems require a moderately long sequence of actions to accomplish the goal. Thus, we are often faced with a search among an extremely large number of alternative action sequences. In these cases, we must "prune the tree" so that there are not so many possible action sequences to investigate. But, of course, we must prune in such a manner that we do not cut off all the branches that have "fruit," that is, states including the goal.

If you had no basis for choosing between the alternative actions at each node, if all the nodes at all levels represented distinct states (distinct sets of expressions), and if only one of the states (up to and including level n) included the goal, then there would be no way to prune the tree and reduce the search. However, in most problems, it is possible to prune the tree.

Different sequences of actions often result in equivalent problem states, allowing you to combine nodes, prune branches, construct equivalent reduced state-action trees, and so on (for example, classificatory trial and error and macroaction in Chapter 4). Usually, there are good reasons for choosing certain actions at any node and ignoring other actions and the branches they generate (for example, state evaluation and hill climbing in Chapter 5). Frequently, a large problem can be broken up into subproblems, thereby transforming a large tree into several smaller trees, with a great reduction in the total number of branches (for example, subgoals in Chapter 6). Sometimes, a much smaller tree results from trying to get from the goal back to the givens, rather than the reverse (for example, working backward in Chapter 8).

Problems with multiple given states can be represented by as many state-action trees as there are possible given states.
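The remark that different action sequences often reach equivalent states can be illustrated by counting, level by level, both the nodes of the full tree and the distinct states those nodes represent. The toy problem here is an assumption made for the sketch: the state is an integer starting at 0, and the two hypothetical operations add 1 or add 2.

```python
def count_nodes_and_states(given, operations, depth):
    """For each level, return (level, nodes in the full tree, distinct states)."""
    frontier = [given]  # one entry per node, i.e. per action sequence
    rows = []
    for level in range(depth + 1):
        rows.append((level, len(frontier), len(set(frontier))))
        # Expand every node by every operation to get the next level.
        frontier = [op(state) for state in frontier for op in operations]
    return rows


def add_one(n):
    return n + 1


def add_two(n):
    return n + 2


for level, nodes, states in count_nodes_and_states(0, [add_one, add_two], 4):
    print(level, nodes, states)
```

At level 4 the full tree has 16 nodes but only 5 distinct states (the integers 4 through 8), so combining equivalent nodes shrinks the search considerably in this case.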
In some problems, the principal task is to choose, among the given states (alternative sets of givens), the one or more given states whose state-action trees contain a goal state. Often these problems require only a very short action sequence to achieve the goal, once the correct given state has been selected. In such problems, the main difficulty is to find the correct type of tree in a large forest; climbing the tree may pose only a minor problem. The method of contradiction discussed in Chapter 7 is often useful for these problems.

There is a special case of problems with multiple given states that occurs quite frequently and is of particular interest. In these problems, the solver has the option of considering state A to be given and state B to be the goal or of considering state B to be given and state A to be the goal. This kind of equivalence between two problems occurs where inverse operations exist for all operations. One problem of this type was discussed earlier in the chapter, namely, the equivalence of deriving x = 3 from 4x + 5 = 17 or vice versa.

3 Inference

Virtually all problems present some of the relevant information in implicit, rather than explicit, form. That is, some of the information concerning givens, operations, or occasionally even goals is presented in a subtle manner that may not strongly attract your attention, unless you know what to look for. In a sense, this situation might be said to be poor communication of the components of a problem. Why do not the people who make up problems simply do a better job of communicating the relevant information?

I would agree that, in some cases, problems used for teaching purposes could be improved by making the relevant information very clear.
In these cases the problem is difficult enough in explicit form without the added difficulty of the relevant information being presented implicitly. However, when you are posing and solving mathematical, scientific, and engineering problems for yourself in some real-life endeavor, your own initial posing of problems will contain implicit statements of information. Unless you know how to analyze a problem for implicit information, you will have difficulty solving actual problems later on.

Problems often evolve from (a) vaguely formulated to (b) semiprecisely formulated to (c) precisely but partly implicitly formulated to (d) precisely and explicitly formulated stages. It is very important for problem solvers to know what kinds of implicit information to look for in problems, because this information is often a critical step in problem solving, whether in school or in life.

Furthermore, even when all the givens and operations are explicitly presented in the problem, it is, of course, necessary to transform the givens by means of the operations in some way in order to solve the problem. The solver must make inferences, draw conclusions, from the given information, a process that is, in essence, rendering explicit the statements that were (in a somewhat different sense) only implicit in the givens.

When implicit information refers to the consequences of given information, it is a somewhat different use of the term than when it refers to information not contained in the explicit statement of the problem (although, by convention, one is supposed to know that it is part of the information in the problem). However, there are all degrees of explicit mention of implicit information found in different problems. For example, a problem might refer to even numbers.
In one sense, this statement is explicit mention of even numbers from which one can draw the inference that, if n is an integer and an even number, then it can be expressed as 2m, where m is also an integer. However, the definition of even numbers is not presented explicitly in the problem and must be supplied from memory. This sort of semiexplicit, semi-implicit presentation of information occurs all the time in problems. Thus, it is probably not too useful to distinguish between the drawing of conclusions from different degrees of implicitly versus explicitly presented information.

Drawing inferences from implicitly or explicitly presented information is essentially random trial and error, unless some criteria are specified regarding which inferences (more generally, which transformations of the goal or the given information) should be made first. There are essentially two criteria that can be formulated semiprecisely, but not completely precisely, at the present time. The first criterion is that the inferences should be those that you have frequently made in the past from the same type of information. You assume that the properties that proved useful in the past will most likely prove useful in the present problem. The second criterion is that the inferences you draw should be those inferences that are concerned with properties mentioned in the goal, the givens, or in previously derived consequences of the goal or the givens. Inferences that satisfy this second criterion are likely to combine with other information to yield still further inferences.
Thus, the general problem-solving method described in this chapter may be stated as follows: Draw inferences from explicitly and implicitly presented information that satisfy one or both of the following two criteria: (a) the inferences have frequently been made in the past from the same type of information; (b) the inferences are concerned with properties (variables, terms, expressions, and so on) that appear in the goal, the givens, or inferences from the goal and the givens.

Throughout the rest of the book, the expression "drawing inferences" will be used to refer to the above statement of the method, namely, drawing inferences that satisfy one or both of the previously stated criteria.

Drawing inferences (more generally, making transformations of the goal or the givens) is probably the first problem-solving method you should employ in attempting to solve a problem. You are essentially expanding the goal or the givens by bringing to bear all of the knowledge you have concerning this problem in your memory. Frequently, problems are quite simply solved, once all the relevant information is retrieved from memory, in the drawing of inferences from explicitly and implicitly presented information.

Most people do make frequent use of the inference method, at least in connection with drawing inferences from givens. (This procedure is often thought to be random trial and error, but this characterization is largely inaccurate, since people's inferences usually do meet one or both of the stated criteria.) The general problem-solving methods discussed later in the book are somewhat less universally used by human problem solvers, but the discussion of them should not lead you to ignore the basic inference method. For this reason, this method is the first general problem-solving method discussed in this book.
Furthermore, a greater understanding of how the inference method operates and an awareness of some illustrative uses can greatly facilitate your proficiency in using the method, particularly with respect to inferences from the goal information, which people do not pay enough attention to. People have a bias to start at the beginning, which they take to mean the givens. This bias is often inappropriate in problem solving, since the goal is frequently a better beginning point than the givens.

So-called insight problems are often problems in which the principal step in solution is to draw the appropriate inference from certain explicitly or implicitly presented information. Very few steps are required to solve the problem. What is necessary is to make that one critical transformation of the givens that essentially solves the problem. Difficult insight problems are often difficult precisely because they require you to draw an inference that is not too close to the top of your hierarchy of inferences from this type of given information [criterion (a)]. Obviously, the more you have stored in your memory concerning the principal inferences to be drawn from the types of given information contained in the problem, the more likely you are to be able to achieve the critical insight. However, whatever your level of specific knowledge concerning the given information, greater understanding and experience in the use of the inference method will increase your chances of systematically discovering the required insight in the course of drawing inferences concerning properties of the given information.
Just knowing that what you are doing is surely not random trial and error may cause you to go further and further down the list of inferences to be made from the information in the problem, rather than giving up this approach after the first few inferences fail.

With the knowledge of problem-solving methods contained in this book and experience in applying them to the solution of problems, you can gradually develop a fairly accurate intuition as to which problems are insight problems and thus most suited to the inference method and not to other problem-solving methods. If you classify a problem as an insight problem, then you should continue drawing inferences (rather than use other methods) for a longer period of time than if you do not classify it as an insight problem.

Of course, drawing inferences (including explicit representation of implicit information) is often an important part of solving any problem, not just insight problems. Insight problems are simply those in which inference is the principal or only method employed in solving them. In noninsight problems, you should stop using the inference method when you "run out of gas" using the method, that is, when you find it difficult to draw from the given information any new conclusions that seem to have any likelihood of being useful in solving the problem. In noninsight problems, you should then go on to consider employing other general problem-solving methods, using the expanded set of given information provided by the inference method. In insight problems, when you run out of gas, you should go back and try over and over again to look at the problem from a different point of view to yield additional new inferences.

The discussion of inference and implicit information naturally divides into three sections.
First, givens may be, to some extent, stated implicitly and, in any event, can usually be expanded considerably by use of the inference method. Second, operations are not always explicitly stated. Third, the goal of the problem is occasionally not completely clear, and the solver must get a precise and correct definition of the goal. In addition, it is often helpful to specify the properties of the goal in more detail. This procedure frequently involves drawing inferences from presented information (givens and goal), including explicit symbolic or diagrammatic representation of information that may appear only implicitly in the problem.

GIVENS

The problems at the end of a section in a textbook are there to test the reader's knowledge of the material presented in that section. Each problem, then, includes all of the given assumptions, proved theorems, and operations that appeared in the section as well as the particular givens of the particular problem. In addition, some previous material presented in the book may be relevant to solving the problem, and certain background knowledge from other books may also be needed. Such background information concerning givens and operations is one kind of implicit information in problems.

You should be aware of this kind of implicit information in problems, and take care to master background subject matter before proceeding on to courses that have this background as a prerequisite. If you have not fully understood what was presented previously in the course or what was presented in relevant background courses, you should face this fact and go back to learn the relevant prior material, either simultaneously with or instead of taking a subsequent course.
It is lunacy to go on to more advanced courses without a reasonably clear understanding of the relevant background material. The general problem-solving methods taught in this book will not substitute for lack of the relevant knowledge. It is true that you can understand the relevant material and not be able to solve problems for lack of understanding of general problem-solving methods. However, you will also fail to solve problems if you lack the relevant knowledge, no matter how skillful a problem solver you are. In today's schools a C or even a B in a course may represent an inadequate level of understanding for going on to more advanced courses, and the conscientious student should recognize this fact and act accordingly.

In addition to background information, there is another kind of implicit problem information that the skilled problem solver can come to recognize rather easily, sometimes greatly facilitating solution. This other kind of implicit information concerns the properties possessed by each of the givens or operations in a problem. When a familiar object or activity is presented in a problem, all of the known properties of that object or activity (including all its known relations to other objects or activities) are usually considered to be part of the given information. There may be no question that everyone who works on the problem knows all of the relevant properties of all the givens and operations in the problem. That is, no specialized background knowledge is required. However, amateur problem solvers frequently fail to ask themselves what they know about the givens and operations in a problem from their own past experience. Insight problems are very often problems that require one to notice (which means represent explicitly) properties of givens presented in the problem.
Of course, many of the implicit properties of the givens are irrelevant to solving the problem. We know that most people have two legs, two arms, two eyes, skin, hair, a nose, a mouth, and so on, but most of these properties are irrelevant to the solution of any single problem where people are included in the given information. Such irrelevant properties should be ignored, and problem solvers are usually able to reject such truly irrelevant implicit properties. The difficulty usually comes in abstracting or consciously considering the possibly relevant implicit properties. Some examples are described in the following subsections.

Numerical Properties

Whenever numbers are involved in a problem in any way, you should consider whether the known properties of the kind of numbers involved in the problem might be of any value in solving the problem. For example, if some number is known to be a positive integer, then it cannot be negative, zero, or a fraction. If an integer, n, is known to be even, then it can be expressed as $n = 2m$, where m is also an integer, or as $n = 2^s p$, where s is a positive integer and p is an odd integer. If an integer, n, is known to be odd, then it can be expressed as $n = 2m + 1$, where m is an integer, or $n = 2^s p + 1$, where s is a positive integer and p is an odd integer.

A somewhat famous example in the psychology of problem solving of the abstraction of numerical properties comes in the 13 problem of Karl Duncker (1945, p. 31). The problem can be stated as follows:

Prove that all six-place numbers of the form abcabc (for example, 416416 or 258258) are divisible (evenly) by 13.

Stop reading and try to solve this problem, then read on. You might try a variety of special cases, verifying that in every case the number was divisible by 13, but that would probably not suggest how to prove the theorem in general.
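The special-case checking mentioned above is easy to automate. The following is only a minimal sketch, not part of Duncker's problem: the function name abcabc_numbers is a hypothetical helper, and the check merely confirms divisibility case by case without explaining why it holds.

```python
def abcabc_numbers():
    """Yield every six-place number of the form abcabc (leading digit nonzero)."""
    for abc in range(100, 1000):
        # Writing the three digits twice gives the six-place number abcabc.
        yield int(f"{abc}{abc}")


numbers = list(abcabc_numbers())

# Check each special case for divisibility by 13.
all_divisible = all(n % 13 == 0 for n in numbers)
print(len(numbers), all_divisible)
```

Running through every case in this way settles the claim by exhaustion, but, as the text says, it does not by itself suggest the general reason.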
The critical step is to inquire whether you know any numerical properties of a number of the form abcabc. If you could not solve this problem before, stop reading and try again by abstracting numerical properties of numbers of the form abcabc. If you still could not solve the problem, consider whether you could factor a number of the form abcabc into a product of other numbers. Now stop reading and try again.

In factoring the number, you no doubt determined that abcabc = (abc)(1001), for all numbers of the form abc and therefore for all numbers of the form abcabc. Now, of course, 1001 is divisible (evenly) by 13, since 1001 = 7 · 11 · 13, so (abc)(1001) is divisible by 13, and the theorem is proved. Furthermore, the factoring of abcabc into abc(1001) can be achieved quite automatically by representing the numerical properties of abcabc in the following standard way (for which abcabc is really the conventional abbreviation):

\[
\begin{aligned}
\mathit{abcabc} &= (a \cdot 10^5) + (b \cdot 10^4) + (c \cdot 10^3) + (a \cdot 10^2) + (b \cdot 10) + (c) \\
&= a(10^5 + 10^2) + b(10^4 + 10) + c(10^3 + 1) \\
&= a \cdot 10^2(10^3 + 1) + b \cdot 10(10^3 + 1) + c(10^3 + 1) \\
&= (1001)(a \cdot 10^2 + b \cdot 10 + c) \\
&= (1001)(\mathit{abc})
\end{aligned}
\]

Topological Properties

Topology is concerned with the properties of geometric figures that remain unaltered when the figures are stretched, shrunk, and twisted in any regular or irregular way. For example, consider the square shown in Fig. 3-1. Imagine that the square was drawn on a sheet of very flexible rubber and that it was stretched so that the square looked like that shown at right in the figure. What properties remain invariant under the stretching, shrinking, and twisting of the rubber sheet? Actually, a number of properties are unchanged. Points inside the figure remain inside, points outside the figure remain outside, and points on the edges (lines) of the figure remain on the edges.
If you consider that the figure has only four points (namely, the four vertices A, B, C, and D) and that the edges are defined merely as unordered pairs of the vertex points, then the set of points and the set of edges (unordered pairs of points) have not been changed by the distortion either.

FIGURE 3-1 Distorting a square drawn on a rubber sheet to illustrate the topological properties of a figure (those properties that are unchanged by stretching, shrinking, and twisting).

Consider a figure with several faces or regions entirely enclosed by lines with no interior lines, such as the three-face figure shown in Fig. 3-2. All of the invariants described in the preceding paragraph for a single-face figure obtain for the multiface figure. In addition, the faces that border on each other (have a common edge) still border on exactly the same faces after the distortion. Thus, if you constructed the set of unordered pairs of faces that border on each other, namely, (f, g) and (g, h), this set would remain invariant under stretching, shrinking, and twisting.

FIGURE 3-2 Distorting a three-face figure drawn on a rubber sheet to illustrate topological properties. Faces are represented by f, g, and h. Vertices are represented by A, B, C, D, E, and F.

One of my favorite problems involves the property of the bordering (direct connection) of faces in an important way. This is the notched-checkerboard problem:

You are given a checkerboard and 32 dominoes. Each domino covers exactly two adjacent squares on the board. Thus, the 32 dominoes can cover all 64 squares of the checkerboard. Now suppose two squares are cut off at diagonally opposite corners of the board (see Fig. 3-3). Is it possible to place 31 dominoes on the board so that all of the 62 remaining squares are covered? If so, show how it can be done.
If not, prove it impossible.

FIGURE 3-3 The notched checkerboard.

Stop reading and try to solve this problem. If you could not solve it, consider the following hint. This problem primarily involves use of the inference method to explicitly represent certain properties of the checkerboard and dominoes that are only implicitly presented in the present problem. Once the appropriate property or properties are recognized, the solution to the problem is obvious. Now stop reading and try to solve the problem, if you could not do so before.

The critical property is that of the two squares of the checkerboard that are covered by any domino. What are some of the properties of any such two squares? If you have not yet solved the problem, stop reading and try again, considering this hint.

The critical properties of the two squares covered by any domino can be expressed in terms of the colors of these two squares. What are the colors of the two squares covered by any domino on a checkerboard? If you have not yet solved the problem, stop reading and try again, considering this hint.

The key insight required to solve the notched-checkerboard problem is to notice that a domino covers two squares that are always of different colors (that is, one black and one white). Since the diagonally opposite corner squares are of the same color, there are now 30 squares of one color and 32 squares of the other color. Because 31 dominoes would have to cover exactly 31 squares of each color, the 62 squares cannot be covered by 31 dominoes.

What has intrigued me most about the problem is this: the impossibility of covering the remaining 62 squares with 31 dominoes can be proved irrespective of whether the eight-by-eight matrix is presented as a checkerboard with a checkerboard coloring pattern and even irrespective of whether the problem solver has ever experienced a checkerboard coloring pattern.
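The color count in this argument can be checked mechanically, even when the board carries no printed colors. The sketch below is an illustration under stated assumptions, not the author's method: squares are modeled as (row, column) pairs, "color" is taken to be the parity of row plus column, the removed corners are assumed to be (0, 0) and (7, 7), and the names color and dominoes are hypothetical helpers.

```python
SIZE = 8

# Assumption: squares are (row, col) pairs on an 8-by-8 board.
board = {(r, c) for r in range(SIZE) for c in range(SIZE)}

# Assumption: the two notched squares are the diagonally opposite corners below.
notched = board - {(0, 0), (SIZE - 1, SIZE - 1)}


def color(square):
    """Model a checkerboard color as the parity of row + col (0 or 1)."""
    r, c = square
    return (r + c) % 2


def dominoes(squares):
    """Yield every pair of edge-adjacent squares that one domino could cover."""
    for (r, c) in squares:
        for neighbor in ((r + 1, c), (r, c + 1)):
            if neighbor in squares:
                yield (r, c), neighbor


# Number of remaining squares of each color.
counts = [sum(1 for s in notched if color(s) == k) for k in (0, 1)]

# Every possible domino placement covers one square of each color.
every_domino_mixed = all(color(a) != color(b) for a, b in dominoes(notched))

print(counts, every_domino_mixed)
```

Under these assumptions the two color classes come out unequal while every domino placement is mixed, which is exactly the mismatch the proof relies on.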
But what problem-solving method would lead one to discover the elegant proof that comes from imposing a checkerboard coloring pattern on the matrix? Is this kind of ingenious idea a chance happening, or something only very brilliant people can think of, using methods that are not understandable by others? I do not think so. I think that use of the problem-solving method of representing all of the possibly relevant properties of the givens in a problem makes it likely that many problem solvers would discover the elegant solution of even the notched eight-by-eight colorless matrix problem.

FIGURE 3-4 The notched checkerboard with ordered-pair (coordinate) labeling of the squares.

I think that it is not likely a person unfamiliar with a checkerboard coloring pattern would impose such a pattern on a colorless eight-by-eight matrix. However, I think that it is likely that a person would do something equivalent to imposing checkerboard coloring on the matrix, as follows. Using the method of trying to represent all of the possibly relevant properties of the givens in the problem, one would eventually label the squares in the eight-by-eight matrix in ordered-pair (coordinate) notation, as shown in Fig. 3-4. Now one might eventually look for some property common to all pairs of squares that a single domino could cover.
If the idea occurred to one to look for this kind of property, then having labeled the squares in ordered-pair (coordinate) notation, it is likely that one would see that a domino must cover two squares, one of whose coordinate sums is odd and the other even. Since the diagonally opposite corner squares of the matrix both have either an odd or an even coordinate sum, the notched matrix has 30 squares of one parity and 32 of the other and cannot be covered by the 31 dominoes. The solution is in every way equivalent to that given for the notched checkerboard using the color property but in no way requires one to invent some special labeling scheme such as a checkerboard coloring pattern. Only the very generally useful and familiar coordinate labeling scheme is needed.

Let us examine why this problem is an example of the abstraction of the topological properties of a figure. A domino covers two faces that border on each other in a complex figure composed of faces with a very special type of bordering structure. It is the bordering structure of the matrix of faces that is represented by the coordinate labeling scheme (or the checkerboard coloring pattern), and the shapes and sizes of the faces or the matrix are completely irrelevant. Thus, the notched eight-by-eight matrix problem is a problem where the critical properties to be represented are topological properties.

Other problems in which representing topological information is important for achieving solution are those in which a block is cut into component subblocks. The following cube-cutting problem is, I guess, the classic such problem:

You are working with a power saw and wish to cut a wooden cube, 3 inches on a side, into 27 1-inch cubes. You can do this by making six cuts through the cube, keeping the pieces together in the cube shape (see Fig. 3-5). Can you reduce the number of necessary cuts by rearranging the pieces after each cut?
FIGURE 3-5 Slicing a 3-by-3-by-3-inch cube into 27 subcubes.

Stop reading and try to solve the problem.

Consider the 3-by-3-by-3-inch cube to be already divided into its 27 component cubes but still stacked in such a way as to form a 3-by-3-by-3 cube. The important topological properties of such a structure are concerned with the vertices, the edges, and the faces of the component cubes. If you did not solve the problem, stop reading and try again.

Among the important topological properties the one most likely to be relevant to the solution of the present problem concerns the faces of the component cubes, since the power saw essentially separates the faces of certain component cubes from the faces of other component cubes. If you have so far not solved the problem, stop reading and try again, using this hint.

The 27 component cubes fall into several classes on the basis of how many of their faces (sides) border on other component cubes versus how many are parts of the exterior faces of the 3-by-3-by-3 cube. Classify the component cubes by this criterion and consider this information in relation to the solution of the problem. If you have not solved the problem thus far, stop reading and try again.

There are four classes of cubes with respect to the property of the number of "interior" faces (that is, the number of faces that border on faces of other component cubes and thus must be cut). There are the 8 corner cubes that have only three interior faces; there are the 12 edge cubes that have four interior faces; there are the 6 face cubes that have five interior faces; and there is the one center cube that has six interior faces (and is totally hidden from view in the 3-by-3-by-3 cube).
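The classification just described can be reproduced by a short enumeration. This is a minimal sketch under assumptions of its own: each component cube is identified by integer coordinates (x, y, z) in the range 0 to n - 1, and the function name interior_face_counts is hypothetical.

```python
from collections import Counter
from itertools import product


def interior_face_counts(n=3):
    """Tally the unit cubes of an n-by-n-by-n block by number of interior faces."""
    tally = Counter()
    for position in product(range(n), repeat=3):
        # Along each axis a cube touches a neighbor on the low side if its
        # coordinate is above 0, and on the high side if it is below n - 1.
        interior = sum((x > 0) + (x < n - 1) for x in position)
        tally[interior] += 1
    return dict(tally)


classes = interior_face_counts(3)
print(classes)
```

The keys of the resulting dictionary are numbers of interior faces and the values are how many component cubes fall in each class, so the four classes listed above can be compared against it directly.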
A cross-sectional diagram of the cube representing the number of interior faces for each component cube is shown in Fig. 3-6. If you have not solved the problem thus far, consider the information in Fig. 3-6 and try again.

Top section    Middle section    Bottom section
3 4 3          4 5 4             3 4 3
4 5 4          5 6 5             4 5 4
3 4 3          4 5 4             3 4 3

FIGURE 3-6 The number of interior faces (needing to be cut) for each component cube of the 3-by-3-by-3-inch cube.

The key insight required to solve block-cutting problems in general and this cube-cutting problem in particular is to focus on that subblock which has the greatest number of faces that must be cut in order to separate it from the other subblocks. The reason for focusing on the subblock with the largest number of faces to cut is that the number of such faces on this block sets a minimum to the number of cuts that must be made. It is obvious why this is so, since under no circumstances can one cut more than one face of a subblock at a time. In the case of the cube, this fact means focusing on the most central cube, which has no exposed faces to begin with. This cube has six faces that must be cut, and therefore no fewer than six cuts will solve the problem. Since we know by inspection that six cuts will solve the problem, the number of cuts that is required is exactly six.

The same principle can be applied to a large class of other problems to set a minimum on the number of cuts that are required. For example, a 2-by-2-by-2 cube cut into eight subcubes, as in Fig. 3-7, requires three cuts, since each of the eight subcubes has three unexposed faces that must be cut.

FIGURE 3-7 Slicing a 2-by-2-by-2-inch cube into subcubes.

Operations

Many practical problems require you to think of a type of operation that will solve the problem.
The operation is usually one with which you would be quite familiar, but thinking of that operation may be far from trivial. Nevertheless, it is probably of some help to be explicitly aware of the possibility of implicit operations and to have some examples of such problems in your mind. One example is the well-known radiation problem of Duncker (1945, p. 1):

Given a human being with an inoperable stomach tumor, and rays which destroy organic tissue at sufficient intensity, by what procedure can one free him of the tumor by these rays and at the same time avoid destroying the healthy tissue which surrounds it?

There are a number of plausible solutions, each of which involves thinking of some operation not specified in the statement of the problem. For example, the rays might be focused from several sources so that they intersected in the region of the tumor. A single source of radiation could be rotated around the body so that all the beams intersected in the region of the tumor. Perhaps a source of radiation could be implanted inside the tumor.

In some formal problems with a precisely delimited set of operations, the properties of one or more of the operations may be somewhat implicit. Failure to achieve a completely explicit and accurate understanding of the properties of such operations may block solution of the problem. As an example of this type of situation, consider the following one-heavy-coin problem:

You have a pile of 24 coins. Twenty-three of these coins have the same weight, and one is heavier than the others. Your task is to determine which coin is heavier and to do so in the minimum number of weighings. You are given a beam balance (scale), which will compare the weights of any two sets of coins out of the total set of 24 coins.

Stop reading and try to solve the problem.
Consider the properties of the weighing operation if a beam balance is used. What kind of information does the beam balance provide concerning relative weights in any two sets of coins? How many different outcomes are there to a weighing on a beam balance? If you have not solved the problem thus far, stop reading and try again.

A beam balance actually has three different outcomes, not two: the left pan is heavier than, lighter than, or equal in weight to the right pan. Since there are three different outcomes to weighing on a beam balance, it is at least theoretically possible that a beam balance could provide one with an answer as to which of three subsets of coins contains the heavy coin (not just deciding which of two subsets contains the heavy coin). Consider this hint, and you should easily be able to solve the problem.

A beam balance has two pans and compares the weights of two sets of coins. For this reason, many people assume that the operation they have available to solve the problem is essentially to ask which of two (equally large) sets of coins contains the heavy coin. Accordingly, they reason that the optimal strategy must be to divide the total set of coins in half and weigh one half against the other half (12 coins against 12 coins). Then, having determined which set of 12 coins contains the heavy coin, they proceed to divide that set in half and weigh 6 coins against 6 coins, then 3 against 3, and finally 1 against 1, or 2 against 2 followed by 1 against 1. When the number of coins remaining is only three, it might occur to a person that one's original characterization of the operation as a two-way question was in error.
However, sometimes even this simple terminal problem does not necessarily indicate to the problem solver that the beam balance can actually provide one with an answer to a three-way question, if the coins are divided into three piles (two of which are equal) on each and every weighing. This procedure is, of course, the solution to the problem: you should first weigh two sets of eight coins against each other. If one pan is heavier than the other, then it contains the heavy coin. If the two pans balance, then the heavy coin is in the remaining set of eight coins that was left off the pans of the balance scale. In any case, you find out which subset of eight coins contains the heavy coin. You then continue to partition the remaining set of eight coins into three parts by weighing three coins against three coins. No matter which subset of coins contains the heavy coin, the answer will be found in one additional weighing, or three weighings in all. By contrast, dividing the set into two equal parts (whenever possible) requires four weighings.

If you are careful to explicitly state the properties of the operations available in a problem, then you will be more likely to avoid the inaccurate characterizations that frequently occur in problems such as this one.

GOALS

Occasionally in school and more frequently in real-life problems, the goal of a problem is not completely clear. Obviously, an important step in representing the information in a problem is to be sure you have a precise and correct definition of the goal. It is often worthwhile to question whether or not you understood the goal correctly, since it is sometimes easy to make a mistake in this regard. As an example, consider the following logic problem:

The country of Marr is inhabited by two types of people, liars and truars (truth tellers).
Liars always lie and truars always tell the truth. As the newly appointed United States ambassador to Marr, you have been invited to a local cocktail party. While consuming some of the native spirits, you are engaged in conversation with three of Marr's most prominent citizens: Joan Landill, Shawn Farrar, and Peter Gant. At one point in the conversation Joan remarks that Shawn and Peter are both liars. Shawn vehemently denies that he is a liar, but Peter replies that Shawn is indeed a liar. From this information can you determine how many of the three are liars and how many are truars?

Stop reading and try to solve this problem, then read on.

In solving logic problems of this type, it is a common procedure to list all of the possibilities in the form of one or more tables. For example, in this problem one might attempt to fill out the following table:

Person    Liar    Truar
Joan
Shawn
Peter

In a problem very similar to this one, a student in one of my problem-solving classes attempted to solve the problem by filling out just such a table. Perhaps you did this in the present problem. However, in translating the problem into this form, a subtle transformation has taken place with respect to the goal: the goal has been changed from determining how many of the three are liars to determining whether each of the three persons is a liar or a truar. If you try to answer the new version of the problem, you will never be able to reach a solution! All you must determine is how many of the three are liars. The correct answer is a number: 0, 1, 2, or 3. The names of the people are largely irrelevant information, added to make the problem appear more interesting and simultaneously to act as a distraction. If you have not yet solved the problem, stop reading and try again, then read on.
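Once you have attempted the problem yourself, the difference between the two goals can be made concrete by enumerating every liar/truar assignment and keeping only those consistent with the three statements. The sketch below is an illustration, not the procedure the text recommends; the names PEOPLE, consistent, and worlds are hypothetical, and each statement is encoded as the author reports it.

```python
from itertools import product

PEOPLE = ("Joan", "Shawn", "Peter")


def consistent(is_liar):
    """Check one liar/truar assignment against the three reported statements."""
    statements = {
        "Joan": is_liar["Shawn"] and is_liar["Peter"],  # "Shawn and Peter are both liars"
        "Shawn": not is_liar["Shawn"],                  # "I am not a liar"
        "Peter": is_liar["Shawn"],                      # "Shawn is a liar"
    }
    # A truar's statement must be true; a liar's statement must be false.
    return all(statements[p] != is_liar[p] for p in PEOPLE)


# All eight assignments of liar (True) or truar (False) to the three people.
worlds = [dict(zip(PEOPLE, flags)) for flags in product((False, True), repeat=3)]
consistent_worlds = [w for w in worlds if consistent(w)]

# The original goal asks only for the number of liars, not who they are.
liar_counts = {sum(w.values()) for w in consistent_worlds}
print(len(consistent_worlds), liar_counts)
```

More than one assignment survives, so the table cannot be filled out, yet the set of possible liar counts contains a single number, which is all the original goal asks for.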
In fact, it is impossible to determine whether Shawn is a liar and it is also impossible to determine whether Peter is a liar. However, one can conclude that either Shawn is a liar and Peter a truar or Shawn is a truar and Peter a liar. Either way one and only one of the men is a liar. Thus, Joan must be a liar. This conclusion implies that there are exactly two liars and one truar in the group of three natives to whom you are talking. This answer solves the original problem but does not allow you to completely fill out the table.

In addition to reading and rereading a problem to avoid misunderstanding the goal, you should have a clear, precise statement of the goal, rather than some vague formulation of it. Sometimes vague statements of the goal are partly or completely due to unclear statements of the goal in the problem, and sometimes the vague formulation is due partly or completely to sloppy reformulation by the problem solver. In either case, a vague formulation of the goal may do considerable harm when you attempt to solve a problem. For example, consider the cheap-necklace problem:

You are given four separate pieces of chain that are each three links in length. It costs 2¢ to open a link and 3¢ to close a link. All links are closed at the beginning of the problem. Your goal is to obtain a single closed chain, using all links, at a cost of no more than 15¢.

The goal is to obtain a single closed chain, using all links. But what does that mean? Is a single closed chain a simple loop or circle? Or would multiple loops be satisfactory? Is it conceivable that a closed chain might mean a long chain formed without joining the ends together in a loop? Could there be some other variations on these possibilities?
Until you can decide which of the reasonable possibilities constitutes the actual goal of the problem, you do not really know what the problem is and cannot expect to make much progress in solving it. Some people object to such deliberately vague statements of goals, but vagueness regarding goals is often a feature of real problem solving and you probably should get some experience in dealing with this kind of problem. Whether in school or elsewhere, the lesson is that you should be sure you have a precisely formulated and accurate understanding of the goal.

In addition to having a precise and accurate understanding of the goal, you will frequently find it helpful to have a more detailed representation and understanding of the goal than may be provided in the original statement of the problem. As Polya (1962, p. 7) has emphasized, it may be useful to imagine for a moment that you have already solved the problem and ask yourself, "What would I have?" Polya variously calls this exercise wishful thinking or taking the problem as solved. Whatever one calls this type of thinking, it involves some sort of increase in the explicit representation of the goal either in symbolic (verbal) or diagrammatic form.

The usefulness of increasing the specification of the goal may involve little more than introducing names (labels, symbols) for concepts that appear in the goal but are not explicitly represented in that form in the original statement of the problem. Thus, one purpose of increasing the specification of the goal is to introduce the necessary working concepts for reaching the goal.
Another purpose is to derive some additional properties possessed by the goal, either by rigorous inference from the information contained in the goal and/or the givens of the problem or by representing a reasonable conjecture (guess) based on one or another heuristic consideration. In either case, deriving additional properties of the goal may make it easier to reach the goal because then you have a more specific idea of the different components that you are attempting to achieve.

As an example of a problem where it is useful to represent the goal explicitly in a diagrammatic form, consider the following geometry construction problem:

Given an acute angle UVW and a point P within the angle, use a compass and straightedge to construct a segment QR passing through P, such that QP and PR stand in the ratio 2:1, Q and R lying on UV and VW, respectively.

FIGURE 3-8 Explicit representation of goal line.

In solving this problem, it is, of course, useful to represent the acute angle UVW and the point P within the angle as shown in Fig. 3-8. However, although we do not yet know exactly where point Q lies on the line UV or where point R lies on the line VW, it is useful to explicitly represent the goal line QR. This representation is done by drawing in a hypothetical dashed line, as shown in Fig. 3-8. The solution to this problem will be discussed in more detail in Chapter 4, but the advantage of explicit representation of the line QR is that you are more likely to see how to construct similar triangles involving the line segment QR, and the construction of these similar triangles is a critical step in solving the problem. Although the placement of the line QR in Fig. 3-8 is certainly not expected to be exactly correct, it gives you a more explicit representation of what the final goal would look like.
In this case, that makes it much more probable that you will see certain relations that are critical in solving the problem.

Increasing the specificity of the goal generally means more than merely drawing an extra line or two in a figure or introducing a few new symbols (important though this purely representative aspect may be). Often it means deriving additional properties possessed by the goal, using either the statement of properties of the goal as given in the original problem or possibly also using given information to derive properties of the goal (without necessarily achieving the entire goal). A marvelous example of the importance of deriving properties of the goal is provided by the following plane-geometry problem:

Can two triangles have five of their six parts (three sides and three angles) be equal and yet the triangles not be congruent?

Stop reading and try to solve this problem, using the problem-solving method of explicit representation of the goal and deriving properties of the goal.

The first inference you might make is that, for two triangles to have five of their six parts equal, this goal subdivides into two alternative, more specifically stated goals: the two triangles having three sides and two angles be equal or the two triangles having two sides and three angles be equal. Having explicitly represented both possibilities, it is easy to see that the first is impossible, because two triangles with three equal sides must be congruent, by a theorem of plane geometry. Thus, we need only consider the case where the two triangles have two sides equal and all three angles equal. If you did not solve the problem previously, stop reading and try again.

Another property that can be derived regarding the goal pair of triangles is that the two triangles must be similar.
This property is a trivial restatement of the property already derived that the three angles are equal. Nevertheless, restating this property using the words "similar triangles" is quite helpful in bringing to mind a useful representation and the proper theorems regarding the relationship of corresponding parts in similar triangles. If you did not solve the problem so far, stop reading and try again.

If you had not already done so, you should have introduced some kind of diagrammatic representation of the two similar triangles that constitute the goal, such as the diagram illustrated in Fig. 3-9. Besides drawing the two similar triangles, you should also have labeled the sides in a manner that easily reflects which sides are corresponding, as is also shown in Fig. 3-9, by using the same letter for corresponding sides and distinguishing the two triangles by the presence or absence of a prime. It is not strictly necessary for the solution of this problem to label the angles, but it does not hurt. If you have not solved the problem already, stop reading and try again.

Another inference that can be drawn regarding the goal is that the two equal sides in the triangles ABC and A'B'C' will be noncorresponding sides. This conclusion is clearly true (from the method of contradiction, to be explained in Chapter 7), since, if the two sides were corresponding, we should have three equal angles and two equal corresponding sides in the two triangles, and such triangles are clearly congruent, by several theorems of plane geometry. If you still have not solved the problem, stop reading and try again.
Yet another relevant inference concerning the properties of the goal is that the ratios of all corresponding sides of similar triangles are equal. Thus, a'/a = b'/b = c'/c. If you have not solved the problem, stop reading and try again.

[Figure 3-9. Two similar triangles with two equal sides (not corresponding sides).]

Another important set of inferences to be drawn from the goal are the inequalities that hold between the lengths of the different sides within each of the two triangles. Of course, it is completely arbitrary which side we decide is longest, next longest, and shortest. Nevertheless, it is important to represent this information explicitly in solving this problem. As indicated in Fig. 3-9, we have assumed that c ≥ b ≥ a and c' ≥ b' ≥ a'. If you have not solved the problem thus far, stop reading and try again.

Now it is useful to consider whether the goal triangles ABC and A'B'C' could be equilateral or isosceles triangles. Since the triangles are similar, if one is equilateral the other is equilateral, and if one is isosceles the other is isosceles. Clearly, the triangles cannot be equilateral, since then all three sides of triangle ABC would have to be equal to all three sides of A'B'C', or else all three sides of triangle ABC would have to be unequal to all three sides of triangle A'B'C'. Neither case satisfies the goal constraint of having two equal sides. In a somewhat similar manner, we can contradict the possibility that the triangles ABC and A'B'C' are isosceles. If you have not already done so, stop reading and prove this and attempt to solve the rest of the problem.
In proving that the two triangles cannot be isosceles and in further work in connection with this problem, it is useful to derive another property of the goal triangles, namely, that triangle A'B'C' is bigger than triangle ABC. Clearly, one is free to make this assumption without any loss of generality, since the labeling of the triangles is purely arbitrary. We can simply adopt a convention for convenience that the A'B'C' triangle refers to the larger of the two similar triangles, no matter what pair of similar triangles we choose to work with. Incidentally, this trick of observing when one is free to assume certain relations without any loss of generality comes up often enough in problem solving to be worth taking special note of.

In the present instance, if the triangles are isosceles, there are two possible cases: the two longest sides are equal (c = b and c' = b') or the two shortest sides are equal (b = a and b' = a'). In the former case, the two larger sides (c' and b') of triangle A'B'C' will be larger than any of the three sides of triangle ABC. Thus, there cannot be two sides of A'B'C' equal to two sides of ABC. Similarly, in the latter case, there will be two sides (a and b) of triangle ABC that will be smaller than any of the three sides of triangle A'B'C'. So, in the latter case, there also cannot be two sides of triangle ABC that are equal to two sides of triangle A'B'C'. Thus, we can assume that c > b > a and c' > b' > a'. That is, the sides must have a strict inequality relationship among them, within any given triangle. Now, if you have not solved the problem already, stop reading and try again.

Again continuing to focus on the properties of the goal, we can derive which of the two sides of triangle ABC must be equal to which of the two sides of triangle A'B'C'.
If you have not solved the problem, stop reading and answer the question concerning which sides of triangle ABC must be equal to which sides of triangle A'B'C'. Having answered that question, try again to solve the problem, if you have not done so already.

Since c' is the largest side of triangle A'B'C' and triangle A'B'C' is larger than triangle ABC (c' > c, b' > b, a' > a), in the goal triangles, b' must be equal to c and a' must be equal to b. That is, the largest side (c') of triangle A'B'C' can have no side equal to it in triangle ABC, and the smallest side (a) of triangle ABC can have no side equal to it in triangle A'B'C'. Again, if you have not solved the problem already, stop reading and try again.

It is now helpful to represent another concept connected with the goal triangles, namely, the expansion ratio of the corresponding sides of the two triangles: x = a'/a = b'/b = c'/c. Using the relationships expressed in this series of equations, we can derive the equations b' = xb and a' = xa. Recalling that in the goal triangles, b' must equal c and a' must equal b, we can derive the expressions x = c/b and x = b/a. From this fact we conclude that in the goal triangles c/b = b/a. That is, the ratio of the large side to the middle side in the triangle ABC must be equal to the ratio of the middle side to the small side of triangle ABC.

Now all that is necessary is to realize that we have so completely specified the goal pair of triangles in this case that we have everything we need to solve the original problem. We know that two triangles can have five of their parts equal, provided that those parts are three angles and two sides and that the ratio of the large side of one triangle to the middle side is equal to the ratio of the middle side to the small side of the same triangle.
Then clearly this relationship will hold for both triangles, since the triangles are similar. In addition to triangle ABC satisfying the relation c/b = b/a, it is also necessary that triangle ABC satisfy the triangle inequality, namely, c < b + a. However, there is an infinity of sets of three lines (a, b, c) that do indeed form a triangle (satisfy the triangle inequality) and are in the relation c/b = b/a. For each such triangle (ABC), one can construct exactly one larger triangle (and one smaller triangle) such that five of the six parts of the two triangles are equal. The required expansion of the triangle ABC is obviously the factor x = c/b = b/a. Thus, the original problem is solved.

Note that in essence this was a construction problem, though it was phrased more as if it were an existence problem. However, in order to determine the existence of pairs of triangles having five of their six parts equal, it was necessary in this case to sketch a specific means by which such a pair of triangles could be constructed. The strategy used in this proof was to indicate a plan for constructing a pair of such triangles.

This problem provides a truly remarkable example of the importance in some problems of focusing on the goal and deriving properties of the goal (drawing inferences concerning the goal). The number of times that properties of the goal were represented or inferences were made concerning the goal in this problem was unusually large. Practically the entire problem-solving process consisted of representing or deriving properties of the goal, in this case. The reason for this extensive focusing on the goal was primarily that the givens were so unspecific: they were all the axioms and theorems of plane geometry.
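As a numerical check of the construction plan, the following minimal Python sketch builds one such pair of triangles. The function names `angles` and `five_part_pair` and the sample values a = 8 and x = 1.5 are illustrative assumptions, not part of the text. The bound x < (1 + √5)/2 used in the code is what the triangle inequality c < b + a reduces to when b = xa and c = x²a are substituted (giving x² < x + 1); it is a derived step rather than one stated above.

```python
import math

def angles(a, b, c):
    """Angles in degrees opposite sides a, b, c, from the law of cosines."""
    angle_a = math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))
    angle_b = math.degrees(math.acos((a * a + c * c - b * b) / (2 * a * c)))
    return (angle_a, angle_b, 180.0 - angle_a - angle_b)

def five_part_pair(a, x):
    """Sides of ABC and A'B'C' for shortest side a and expansion ratio x.

    Assumes 1 < x < (1 + sqrt(5)) / 2 so that c < b + a holds.
    """
    golden = (1 + math.sqrt(5)) / 2
    if not 1 < x < golden:
        raise ValueError("x must lie strictly between 1 and the golden ratio")
    small = (a, a * x, a * x * x)           # a, b = xa, c = xb, so c/b = b/a
    large = (a * x, a * x * x, a * x ** 3)  # a' = b, b' = c, c' = xc
    return small, large

small, large = five_part_pair(8, 1.5)
print(small, large)
print(angles(*small), angles(*large))
```

With these sample values the two triangles have sides 8, 12, 18 and 12, 18, 27. The sides 12 and 18 appear in both, the three angles agree, and yet the triangles are not congruent.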
The only unique aspect of the problem that indicated what to select from all of our knowledge of plane geometry was the goal. Thus, we necessarily had to focus entirely on the goal, since it was the only unique aspect of the problem. Said another way, the goal provided us a unique beginning point from which to draw inferences, whereas if we had started from the givens our first step would have been to write down any axiom or theorem of plane geometry. Starting with the givens we would have had little idea where to proceed. Therefore, good strategy in this problem was to focus on the goal and make that goal progressively more and more specific to indicate exactly what aspects of plane geometry were relevant to solve the problem. In this case, once all the properties of the goal were explicitly represented, it was trivial to solve the problem from the beginning, by specifying a plan of construction.

A problem in a completely different context that illustrates the usefulness of focusing on the goal and deriving some of its properties at an early stage in problem solving is the following 63-link-chain problem:

Wanda the witch agrees to trade one of her magic broomsticks to Gaspar the ghost in exchange for one of his gold chains. Gaspar is somewhat skeptical that the broomstick is in working order and insists on a guarantee equal in days to the number of links in his gold chain. To facilitate enforcement of the guarantee, he insists on paying by the installment plan, one gold link per day until the end of the 63-day period, with the balance to be forfeit if the broomstick malfunctions during the guarantee period. Wanda agrees to this request, but insists that the installment payment be effected by cutting no more than three links in the gold chain.
Can this be done, and, if so, what links in the chain should be cut? The chain initially consists of 63 gold links arranged in a simple linear order (not closed into a circle).

Stop reading and try to determine an additional property possessed by the goal that would be helpful to derive for use in solving the problem.

The primary property of the goal that is useful in solving the problem is the information that only three links need be cut to achieve the goal. This property means that there will be at least three single links in the goal state, namely, the three links that have been cut. We still do not know how long the other lengths of chain will be (which single links in the 63-link chain will be cut), but we can then begin to work on the problem, knowing the lengths of three of the segments of chain in the solution of the problem (the three single-link chains). This problem also illustrates the subgoal method, and is discussed in this context in more detail in Chapter 6. If you cannot wait until Chapter 6 to check your solution to this problem, please turn ahead to pages 100-101.

[Figure 3-10. Part of a famous chess problem. White to move and to achieve mate in five moves.]

In some problems, one cannot rigorously infer additional properties possessed by the goal, but one can make reasonable conjectures based on heuristic principles. For example, consider the following chess problem, which constitutes one portion of a famous problem originated by Sam Loyd. The problem is for white to achieve checkmate in five moves from the starting position shown in Fig. 3-10. Stop reading and try to solve this problem by guessing one or more plausible goal positions in which black is checkmated and then try to determine how you could achieve such a checkmate position.
It seems reasonable to conjecture that the checkmate position will have white's rook at his own king's rook one. That is, white's rook will be at the end of the open file where he has the black king trapped. Black has the potential opportunity to interpose his bishop at two places in that file between the conjectured position of white's rook and black's king. However, the move sequences by which black can interpose his bishop between the white rook and the black king can all be frustrated by white in one way or another. The essential strategy for solving the chess problem comes by conjecturing that you wish to have the white rook at white's king's rook one, without the possibility of black blocking the attack by his bishop, and then working forward to determine what white must do at each move in order to achieve that terminal checkmate position.

4 Classification of Action Sequences

RANDOM TRIAL AND ERROR

The first thing that most people do when confronted with a problem is to start applying the allowable operations to the givens in the problem. Call this random trial and error. (Readers with a course in probability should understand that what I am calling random trial and error is equivalent to random sampling with replacement from the population of action sequences less than or equal to some maximum length.) If a very short sequence of such actions is sufficient to get from the givens to the goal, even randomly generated sequences of actions may yield the solution fairly quickly.

SYSTEMATIC TRIAL AND ERROR

To avoid going around in circles, it is obviously desirable to remember what sequences of actions have been tried already with no success. In addition, it is desirable to have a scheme for systematically generating different sequences of actions, which guarantees that all sequences (to some maximum length) will be generated.
Most desirable of all random trial-and-error schemes would be a generation method that automatically produced a mutually exclusive and exhaustive listing of all sequences of actions up to some maximum length. Call this systematic trial and error (equivalent to random sampling without replacement).

From the above discussion, it should be relatively obvious that there can be different degrees of systematicness between random and completely systematic trial and error. A problem solver could have some memory for past attempts, but it could be limited or subject to error. A problem solver could also have different degrees of effectiveness in systematically generating all of the different sequences of actions.

The degree of systematicness in the use of trial and error is one useful indicator of the intelligence of different species of animals. I think it is not known whether any species of animal below human beings can be trained to be more systematic in their trial and error, but humans certainly can be. People can overcome their memory limitations by writing things down, and they can often invent mutually exclusive and exhaustive generation schemes, though the difficulty of accomplishing the latter varies from problem to problem.

CLASSIFICATORY TRIAL AND ERROR

The most powerful kind of trial and error is what might be called classificatory trial and error, which requires that sequences of actions be organized into classes that are equivalent (or probably equivalent) with respect to the solution of the problem. That is, if one sequence of actions within a class will solve the problem, then all the other sequences of actions within the same class will probably also solve the problem.
Conversely, if one sequence of actions within the class can be shown not to solve the problem, then probably every other sequence of actions in the same class will also fail.

To appreciate the power of classificatory trial and error, consider a state-action tree with n possible actions at each node of the tree. With this representation, there are n^m possible sequences of actions that are m actions in length. For even rather small values of n and m, n^m can be so large as to prohibit the use of systematic trial and error. Obviously, if n^m sequences of actions could be reduced to a small number of equivalence classes (that is, classes that are equivalent with respect to the solution of the problem), it would make the problem much simpler to solve. In this case, you could systematically try one sequence from each class until you found a class that solved the problem. Such classificatory trial and error only works for problems where sequences of actions fall into classes that are equivalent with respect to solution of the problem, but most of the problems people solve probably exhibit some such equivalences.

There are four different categories of problems in which classificatory trial and error is helpful, and within each of the four are two subtypes. To discuss these types of problems, let us imagine that the states reached by various action sequences applied to the given information can be represented by a sequence of letters, where the first letter stands for the action taken at the first node, the second letter for the action at the second node, and so on. Thus, abc represents the state reached by taking action a at the first node, followed by action b at the second node, followed by action c at the third node in the state-action tree.
The basic principle is that two or more action sequences are equivalent if and only if they result in the same state or states thought to be equivalent with respect to solving the problem.

The first major type of equivalence class of action sequences is the obvious one that results from having equivalence classes of actions. In this case, for example, let us imagine that a set of actions {b_1, b_2, b_3, ...} = {b_i} are all identical or thought to be equivalent. In this case, the action sequence a b_i c is equivalent to the action sequence a b_j c, for all i and j. People usually have no trouble in identifying equivalence classes of action sequences based on such elementary equivalences of component actions. Such equivalent actions often arise in problems where there are a large number of equivalent givens, such as a large number of entities of the same type: for example, six sticks identical in length and every other important property. When the givens are not identical in every respect but are equivalent with respect only to the properties that are thought to be important to the problem, then recognition of such equivalences may be more difficult and subject to error. In any event, identical or equivalent actions (often resulting from identical or equivalent givens) produce the first type of equivalence classes of action sequences.

A second and relatively familiar type of equivalence class of action sequences arises in problems having commutative actions, that is, where the result of taking action a followed by action b yields the same result as taking action b followed by action a.
If three actions (a, b, c) are all commutative with respect to one another, then action sequences abc, acb, bac, bca, cab, and cba are all equivalent, since they result in the same state when applied to the same given information or other starting point in a problem. For example, in solving for x, given the equation 5x + 17 = 3x + 21, we could subtract 3x from both sides of the equation as the first step, then subtract 17 from both sides as the second step; but the same result is achieved by performing the actions in the reverse order. Even the final action of dividing both sides of the equation by 2 could be commuted with respect to the other two actions and equivalent results obtained.

A third major way in which action sequences may be equivalent occurs in problems where one or more actions have inverse actions. If action a has an inverse action a⁻¹, then the result of applying action a followed by action a⁻¹ is to leave the state of the problem identical to what it was before the sequence aa⁻¹. Said another way, the sequence of actions aa⁻¹ equals the identity action, which leaves the state of the problem unchanged. If all the actions in some sequence are commutative, then any co-occurrence of action a and its inverse a⁻¹ permits you to cancel both a and a⁻¹ from the sequence. For example, if the actions a, a⁻¹, b, c, d are all commutative with respect to each other, then the action sequence ca⁻¹bda is equivalent to the action sequence cbd, since the a and a⁻¹ cancel each other.

A good example of the power produced by the combined recognition of commutativity and inverse actions in reducing the number of different action sequences to a small set of equivalence classes is provided by the six-arrow problem:

You are given six arrows in a row, the left three pointing up, and the right three pointing down.
The goal is to transform these arrows into an alternating sequence such that the left-most arrow points up, the next arrow to it points down, the next up, then down, then up, and then down. The actions allowed are to simultaneously invert (turn upside down) any two adjacent arrows. Note that you cannot invert one arrow at a time but must invert two arrows at a time, and the two arrows must be adjacent. The given and goal states are illustrated in Fig. 4-1. Achieve the solution using the minimum number of actions (inversions of adjacent pairs).

[Figure 4-1. The six-arrow problem. Arrows are numbered 1 to 6. Given: ↑ ↑ ↑ ↓ ↓ ↓. Goal: ↑ ↓ ↑ ↓ ↑ ↓.]

Before reading further, try to solve this problem by determining the very small number of different equivalence classes of action sequences.

If you have not solved the problem, consider the following. In representing the information given in the problem, you should note that there are only five different possible actions that you can take at any given stage of the problem: namely, to invert arrows 1 and 2, 2 and 3, 3 and 4, 4 and 5, or 5 and 6, as shown in Fig. 4-1. Of course, if every different sequence of actions had to be considered, and you had no way of knowing how long a possible sequence might be necessary to solve the problem, then the problem could be extraordinarily difficult, despite the limited number of actions available at each node. In fact, a little reasoning concerning the equivalence of different action sequences reduces the number of nonequivalent action sequences to an extremely small number.

In the first place, you should note that the order in which you perform the actions makes no difference. That is, the actions commute one with another, so that inverting arrows 3 and 4 and then inverting 4 and 5 is completely equivalent to first inverting 4 and 5 and then inverting 3 and 4.
The same is true for any set of three or more actions in a sequence. Thus, you do not have to deal with ordered sets of actions but only with unordered sets. This statement simply means that all the different orderings (permutations) of a given unordered set of actions are equivalent, greatly reducing the number of possible solutions to be considered. Now stop reading and try to solve the problem, if you could not before.

If you still cannot solve the problem, consider the following hint. An optimal solution will contain no more than one occurrence of any given type of action. An action is its own inverse. Inverting arrows 2 and 3 twice leaves the arrows exactly the same as they were. Thus, any pair of two occurrences of a given action can be canceled (even if they are not adjacent, since the actions are completely commutative). Any even number of occurrences of an action is equivalent to zero occurrences of that action, and, for the same reason, any odd number of occurrences is equivalent to a single occurrence. Thus, we need consider only combinations (unordered sets) of from one to five possible actions. At this point we have reduced a potentially infinite number of different action sequences to 31 possible classes of action sequences. Each of the 31 classes can be represented by its simplest member, as follows: 5 single-step actions, 10 two-step action sequences, 10 three-step action sequences, 5 four-step action sequences, and 1 five-step action sequence. Stop reading and try to solve the problem, if you did not before.

We can now observe that two of the five actions (namely, inverting arrows 1 and 2 and inverting arrows 5 and 6) cannot possibly be included in the optimal solution, since these actions change arrows 1
and 6, which are in the right position in the beginning state. To change these end arrows back to the correct position would require another use of exactly the same action, since no other actions change arrows 1 and 6. This solution cannot possibly be optimal, since it is equivalent to not performing the action at all. Thus, we have reduced the number of possible actions to consider at any node to three. The maximum number of actions in a solution sequence is now reduced to three. It is then a simple matter to rule out all of the one-step and two-step action sequences, leaving only the single three-step action sequence as a solution to the problem. Of course, as illustrated in Fig. 4-2, there are actually six different action sequences that all achieve the goal in the smallest number of steps (three). These six solutions differ only in the order with which the three actions are applied. This solution points out once again the existence of a large variety of action sequences that are completely equivalent with respect to the solution of the problem.

[Figure 4-2. Six equivalent three-action sequences that solve the six-arrow problem.]

The fourth major type of equivalence class of action sequences arises in problems where some arbitrary sequence of actions abc results in
an identical or equivalent state as some other sequence of actions dafg, for example. You may not have any elegant theoretical definition for all the equivalences of this fourth major type in a problem, but nevertheless you should recognize such equivalences and take advantage of them in speeding up the solution to the problem. In this fourth, most general case of classificatory trial and error, what we are doing is to define certain problem states and recognize that any action sequence that achieves a given problem state is a member of the same equivalence class. We may not know prior to executing an action sequence that it will result in the same state as some already executed action sequence; however, the advantage of recognizing the equivalent state reached by both action sequences is that we need not continue pursuing an action sequence dafg that arrives at the same state as a previously executed action sequence abc, if we have already determined that no path from that state is likely to reach the goal. Thus, we can truncate in this way, recognizing that a different sequence of actions has resulted in the same state, and, therefore, we need not continue the action sequence from that point on, since it would be, in essence, repeating sequences previously shown to be fruitless. One practical way to implement this use of classificatory trial and error is to give names or otherwise store in your memory a representation of certain distinctive states reached in various attempts to solve the problem.

A good example of the usefulness of explicitly identifying distinctive states achieved in various attempts at solution of a problem is provided by the railroad-siding problem:

You are given a circular railroad track that passes through a tunnel and has one siding, as illustrated in Fig. 4-3.
In the given state of the problem, an engine, E, rests on the siding and two cars, A and B, rest on the circular track on opposite sides of the tunnel, as illustrated in Fig. 4-3. The goal is to interchange the positions of cars A and B and have the engine back on the siding. An important restriction is that only the engine can pass through the tunnel; the cars cannot. Both cars and engine may rest on a siding in any order and in any numbers. However, as with real-world railroad sidings, a series of cars coming off the siding must go on to the circular track in the direction of the tunnel; they cannot make the sharp angle turn from the siding onto the circular track in the direction away from the tunnel. Cars can be coupled and uncoupled from one another or coupled or uncoupled from the engine at any point.

[Figure 4-3. The railroad-siding problem: (a) given, (b) goal.]

Stop reading and attempt to solve the problem by drawing a diagram on a piece of paper and getting three distinguishable objects to act as the engine and cars A and B.

You might consider there to be many distinguishable states in the problem, where the states are defined to be the different arrangements of the engine and cars A and B on the circular track and siding; however, it might be extremely useful to consider a simple distinguishable state in which only one car or engine rests in each of the three major portions of the track: namely, the siding, the portion of the track in a clockwise direction from the siding to the tunnel, and the portion of the track in a counterclockwise direction from the siding to the tunnel.
Considering only these three different positions and limiting consideration to those cases where only a single car or engine rests in each of the three positions, there are only six possible configurations of the three entities in the three positions. One is the given state, another is the goal state, and the remaining four can easily be represented with pencil and paper. The six possible configurations result from the possibility of picking any of the three entities to fill a position on the siding, then picking either of the two remaining entities to fill the upper position on the circular track, which leaves only one remaining entity to fill the lower position on the circular track. This yields 3 × 2, or six, possible configurations. Having listed all six configurations, you might find it useful to quickly classify your action sequences with respect to whether they achieve as a subgoal any one of the four configurations other than the given or goal configuration. All action sequences that achieve any particular one of the four nonterminal configurations are equivalent. Thus, you might set each of the four nonterminal configurations as a subgoal, try to achieve it, and then see whether you could get to the goal position from that particular nonterminal position. In this way, you ensure a certain degree of variety in the action sequences you take, so that you are not going around in circles. Now stop reading and try to solve the problem again, if you did not before.

In my opinion, the ideal subgoal position to work for is to interchange the engine and car A, placing the engine on the upper portion of the
This subgoal configuration is probably optimal because it is just slightly more than halfway between the given state and the goal state in terms of the sequence of steps needed to solve the problem. Note that, although this intermediate state is being called a subgoal state, there is no sense in which we have any reason to think that this state is closer to the goal than the given state. Thus, we are not really using the subgoal method, as this will be defined in Chapter 6. Rather, the basis for selecting this configuration as a state to work toward is simply that getting to it from the given state and getting from it to the goal state each represent equivalence classes of action sequences. Stop reading and try again to solve the problem, if you did not before.

Consider an extension of the method of classifying action sequences with respect to states achieved, namely, identifying action sequences in terms of the sequence of states achieved. In the present problem, it turns out that the solution sequence passes through five of the six simple configurations that we distinguished. The sequence of five simple configurations in the solution of the problem is shown in Fig. 4-4. By following this series of subgoals, you should be able to solve the problem. This problem provides an excellent illustration of the utility of identifying equivalent action sequences by states achieved or sequences achieved, since it is rather easy to get mixed up concerning where one is and where one is going with all the involved movements of cars and engines required in order to solve the problem. If one does not identify landmarks along the way, it is easy to go around in circles.

FIGURE 4-4 A sequence of simple configurations in the solution to the railroad-siding problem.
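The count of six simple configurations in the railroad-siding problem can be checked by brute force. The following is a minimal sketch, assuming each configuration is written as a (siding, upper track, lower track) triple; the names ENTITIES, GIVEN, and GOAL are illustrative, and the placement of car A on the upper track in the given state is inferred from the subgoal described in the text.

```python
from itertools import permutations

# One entity per track portion; positions are (siding, upper track, lower track).
ENTITIES = ("E", "A", "B")  # engine, car A, car B

GIVEN = ("E", "A", "B")  # engine on the siding, A upper, B lower (inferred)
GOAL = ("E", "B", "A")   # engine back on the siding, cars interchanged

# Every way of assigning the three entities to the three positions.
configurations = list(permutations(ENTITIES))

# The nonterminal configurations are the candidates for subgoals.
nonterminal = [c for c in configurations if c not in (GIVEN, GOAL)]

print(len(configurations), "simple configurations in all")
for siding, upper, lower in nonterminal:
    print(f"subgoal candidate: siding={siding} upper={upper} lower={lower}")
```

The subgoal recommended in the text, with car A on the siding and the engine on the upper track, appears in this list as ("A", "E", "B").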
So far, the examples that have been presented have illustrated only cases where several different action sequences were considered equivalent because they achieved precisely identical states. In some problems, it is useful to consider action sequences to be equivalent when they achieve states that are considered equivalent with respect to solution of the problem, despite the fact that these states may not be identical in every respect. The states (and therefore the action sequences that lead to them) are considered equivalent because they all have certain properties in common that we judge to make the states equivalent insofar as solving this particular problem is concerned (though the states might well not be judged equivalent with respect to solving some other problem). Such classification of states (and action sequences) as equivalent is, of course, more dangerous than the completely safe classification of states as equivalent when the states are identical. If our judgment is faulty concerning which properties are relevant and irrelevant to the solution of the problem, then our judgment that all members of some equivalence class will fail to solve the problem may be faulty. Nevertheless, our judgment concerning relevant and irrelevant properties is generally sufficiently good that such equivalence classes are generally quite useful. An example of this type of equivalence classification of action sequences along with some of the identity-based equivalence classification of action sequences is provided by the cheap-necklace problem:

You are given four separate pieces of chain that are each three links in length (see left side of Fig. 4-5). It costs 2¢ to open a link and 3¢ to close a link. All links are closed at the beginning of the problem.
Your goal is to join all 12 links of chain into a single circle (see right side of figure) at a cost of no more than 15¢.

FIGURE 4-5 The given and goal states for the cheap-necklace problem (given state: chains A, B, C, and D; goal state: a single closed circle).

Stop reading and try to solve the problem by defining equivalence classes of action sequences based on the achievement of equivalent states.

If you did not solve the problem, consider these hints. There is an implicit operation of inserting one link into an open link that has a cost of 0¢ attached to it. In addition, there is another implicit operation of detaching an open link from a closed link that also has a cost of 0¢. These operations are only implicitly specified. Now try to solve the problem again, if you did not before because you did not explicitly represent these operations. If that was not your difficulty, read on.

Having represented all the important givens and operations in the problem, let us examine how many different types of action sequences there might be that achieve the goal of getting all the links into a closed chain (circle). The one type of action sequence that virtually everyone considers first is to open an end link of one chain (for example, chain A), insert an end link of another chain into it (for example, chain B), close the joining link, open an end link of the combined (6-link) chain, insert another 3-link chain (for example, chain C) into it, close the joining link, open an end link of the combined (9-link) chain, insert the last 3-link chain (chain D) into it, close the joining link, open an end link of the combined (12-link) chain, insert the other end link, and close the joining link to form a closed chain.
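The cost of the sequence just described can be tallied directly from the stated prices. The sketch below is only an illustration: the function name cost and the constants are assumptions introduced here, and inserting or detaching a link is treated as free, as in the hints above.

```python
OPEN_COST = 2    # cents to open one link
CLOSE_COST = 3   # cents to close one link
BUDGET = 15      # cents allowed by the problem

def cost(links_opened, links_closed):
    """Total cost in cents; inserting and detaching links cost nothing."""
    return links_opened * OPEN_COST + links_closed * CLOSE_COST

# The sequence above opens and closes one link per join:
# A to B, AB to C, ABC to D, and the final join that closes the circle.
end_to_end = cost(links_opened=4, links_closed=4)

# Every opened link must end up closed, so each one costs 5 cents in total.
max_links_within_budget = BUDGET // (OPEN_COST + CLOSE_COST)

print(end_to_end, end_to_end <= BUDGET)
print(max_links_within_budget)
```

Under these prices the budget covers at most three opened-and-closed links, which is a useful constraint to keep in mind when looking for other classes of action sequences.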
However, this action sequence costs 20¢ (four links opened at 2¢ each and four closed at 3¢ each), which exceeds the limit of 15¢. There are a large number of action sequences that are essentially equivalent to the one just mentioned, which we might refer to as the end-to-end action sequence. Obviously, it makes no difference which 3-link chain we start with, add on second, or add on third. In addition, it makes no difference which end links we open at various stages of the problem. If we have explicitly noticed the equivalence of all these different action sequences, we are in a favorable position to discover whether there are any action sequences that are not equivalent to end-to-end that might result in the solution. If you have so far not solved the problem, stop reading and try again.

It might occur to you that, after opening a link, you could insert two end links into it or insert a middle link. These actions as a part of any action sequence (which did not later essentially reverse the effect of these actions) would undeniably result in an outcome that was not equivalent to end-to-end. However, a little inspection of the nature of the goal to be achieved reveals that, in the final closed chain, there are no links that have more than two links inserted in them. Thus, all methods that insert two end links (of chains with two or more links) or a middle link (of a chain with three or more links) into an end link of any chain with two or more links could not produce the goal state. (This statement is true unless the critical action were later reversed, and reversing actions seems very unlikely to be a part of the correct solution, in view of the cost limitation.)

Assuming that we have heuristically rejected all action sequences that involve actions that result in three or more links being inserted inside another link, are there any other types of action sequences that are not essentially equivalent to end-to-end? If you have not yet solved the problem, stop reading and try again, then read on.

Yes, there is exactly one other type, and it is the solution to the problem. For some reason, this type of action sequence does not occur to many people very quickly, but if you have ruled out all examples of the previously mentioned two equivalence classes of action sequences, then you are in a rather favorable position for discovering this remaining type of action sequence. At least, you are not wasting time trying out many specific examples of each of the two large classes of action sequences that do not work. Perhaps you have already discovered this remaining type of action sequence and perhaps you have not. In any event, this type of action sequence could be called destroying a chain. The action sequence is as follows: Open a link of one (3-link) chain (for example, chain A), detach the link from that chain (chain A), insert the link into ends of two different 3-link chains (for example, B and C), close the joining link; open another link of the chain one is destroying (chain A), insert that link into an end of the combined (7-link) chain and an end of the remaining 3-link chain (D), close the joining link; open the last link of the first chain (A), insert the ends of the combined (11-link) chain into it, and close the joining link. The cost of this type of action sequence is exactly 15¢, solving the problem within the cost limitation.

MACROACTIONS

Now consider a sequence of actions that starts with some given state and achieves some other state.
Call that a microaction sequence. Furthermore, consider any other sequence of microactions that starts from the same given state and achieves the same terminal state to be a member of an equivalence class of microaction sequences. Call such an equivalence class a macroaction. Thus, a macroaction is defined to be an equivalence class of sequences of microactions, though in some cases the equivalence class may consist of only one member.

Defining one or more macroactions based on the microactions specified in the problem is sometimes a significant aid in solution. If you like silly analogies, it is like wearing seven-league boots and taking giant steps instead of baby steps. Often, when one has defined macroactions, the number of such macroactions necessary to go from the givens to the goal is extremely small. This means a small state-action tree, with only a relatively small number of distinct possible macroaction sequences to test. Systematic trial and error will often be quite adequate to solve the problem from this point on.

Defining macroactions from sequences of microactions does have one possible difficulty, namely, that application of a macroaction might take one past the goal. In problems with destructive operations, one would have sped past the goal much as an express subway train speeds past a local subway stop. You would not even have the opportunity to see the goal for an instant as you passed through it, whereas at least on the express subway there is some chance that you could see your local stop as you sped through it. Even with nondestructive operations, the effect of applying a macroaction that took you past the goal could be much the same as with destructive operations. To be sure, even when you have gone past the goal, you still have achieved it in some sense, when the problem involves nondestructive operations.
However, if you do not know that the goal was achieved, because you never wrote down the goal expression, then in a practical sense, you have not achieved the goal.

As an example of the successful use of macroactions and classificatory trial and error, consider the set of possible action sequences involved in reducing an equation of the form ax + b = cx + d to an equation of the form x = (d − b)/(a − c). There are a number of different microactions in different orders that will serve to reduce this equation. One could subtract b from both sides, then subtract cx from both sides, and then divide both sides by (a − c), but one could also add −cx to both sides, then multiply both sides by 1/(a − c), and then subtract b/(a − c) from both sides, and so on. To someone experienced in solving such simple algebra problems, this problem may seem pretty trivial, but many students learning elementary algebra for the first time find these problems difficult. One reason for this difference is that the experienced linear-equation reducer sees all these different action sequences as equivalent, while many of the inexperienced linear-equation reducers have yet to learn this fact. Although probably few people actually think in terms of a single equation for linear-equation reduction, what an experienced linear-equation reducer has in his head is essentially equivalent to a single macroaction for problems of the above type, namely, ax + b = cx + d → x = (d − b)/(a − c). The experienced linear-equation reducer probably goes through a short sequence of microactions to solve such a problem, but such a person does this in one of a very small number of completely routinized ways, with, at most, a single choice of one of the equivalent microaction sequences.
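The equivalence of the two microaction sequences named above can be checked mechanically. This is a minimal sketch under stated assumptions: an equation lx·x + lc = rx·x + rc is stored as the tuple (lx, lc, rx, rc), the helper names add_constant, add_x_term, scale, and macro are invented for illustration, and a is assumed to differ from c.

```python
from fractions import Fraction

# An equation lx*x + lc = rx*x + rc is stored as the tuple (lx, lc, rx, rc).

def add_constant(eq, k):
    """Microaction: add the constant k to both sides."""
    lx, lc, rx, rc = eq
    return (lx, lc + k, rx, rc + k)

def add_x_term(eq, k):
    """Microaction: add k*x to both sides."""
    lx, lc, rx, rc = eq
    return (lx + k, lc, rx + k, rc)

def scale(eq, k):
    """Microaction: multiply both sides by k."""
    return tuple(k * term for term in eq)

def macro(a, b, c, d):
    """Macroaction: ax + b = cx + d  ->  x = (d - b)/(a - c)."""
    return (Fraction(1), Fraction(0), Fraction(0), Fraction(d - b, a - c))

a, b, c, d = 5, 3, 2, 12  # example coefficients, chosen arbitrarily
start = tuple(Fraction(v) for v in (a, b, c, d))

# Subtract b, subtract cx, then divide by (a - c).
route_1 = scale(add_x_term(add_constant(start, -b), -c), Fraction(1, a - c))

# Add -cx, multiply by 1/(a - c), then subtract b/(a - c).
route_2 = add_constant(scale(add_x_term(start, -c), Fraction(1, a - c)),
                       Fraction(-b, a - c))

print(route_1 == route_2 == macro(a, b, c, d))
```

Both routes end in the same state as the single macroaction, which is what makes them members of one equivalence class.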
Working in this routinized way is what is meant by thinking in terms of a single macroaction rather than a sequence of microactions.

Another example of the usefulness of thinking in terms of macroactions occurs in geometry construction problems. In such problems, you are allowed to use a compass, an unmarked straightedge (no gradations as on a ruler), and, of course, pencil and paper. The microactions that you have available are then to draw arcs of circles and straight-line segments. In learning how to solve geometry construction problems, you first learn what sequences of microactions allow you to achieve certain states, such as constructing a perpendicular to a line at a given point, constructing a perpendicular bisector of a line segment, constructing an angle bisector, or constructing a parallel to a given line through a given outside point. Thereafter, in more complex geometry construction problems, you think in terms of what sequence of these macroactions is necessary in order to solve the problem rather than in terms of the original microactions of drawing arcs of circles and straight-line segments, though you must use a sequence of such microactions in achieving each macroaction. In constructing the basic plan for solving a more complex geometry construction problem, the use of a macroaction is extremely helpful.

As an example of how thinking in terms of such macroactions simplifies geometry construction problems, consider the following problem, which was previously discussed briefly in Chapter 3.

Given an acute angle UVW and a point P within the angle, use a compass and straightedge to construct a segment QR passing through P, such that QP and PR stand in the ratio 2:1, Q and R lying on UV and VW, respectively.

Of course, the line QR shown in Fig. 4-6 is not part of the given state but is rather the goal to be achieved. The line QR has simply been drawn in the figure to facilitate thinking about the problem. Now stop reading and attempt to solve the problem, thinking in terms of geometry construction macroactions.

If you did not solve the problem, consider the following hint. The line segments QP and PR will be in the ratio 2:1 if and only if the line segment QR and the line segment PR are in the ratio 3:1. This hint merely exemplifies a relatively trivial inference made from the given information, though in this case this trivial transformation of the given information can be of considerable help in solving the problem. Stop reading and try to solve the problem, if you did not before.

If you have not yet solved the problem, consider the following additional hint. One way to make the line segments QR and PR be in the ratio 3:1 is to make them corresponding parts of similar triangles, another pair of whose corresponding sides is known to be in the ratio 3:1. Since PR is part of the line segment QR, the obvious choice for similar triangles would be to construct a parallel line to the line UV through the point P, producing a little triangle MPR (illustrated in Fig. 4-7), which would be similar to the big triangle VQR. Of course, we have not yet determined the line QR, so this operation is still to be done in order to solve the problem. However, any line QPR drawn through the point P will now result in triangle MPR being similar to triangle VQR. Thus, all that remains is to determine which line QPR will result in similar triangles in which the ratios of the sides are 3:1.
Note that constructing a parallel to a given line through a given outside point is not an elementary microoperation, but rather a macrooperation that requires a sequence of microoperations to be achieved. However, in planning the solution of the problem, we need not bother to explicitly carry out the sequence of microoperations necessary to achieve this macrooperation. Stop reading and try to solve the problem, if you have not done so.

FIGURE 4-6 Construct QR such that QP = 2 × PR.

FIGURE 4-7 Constructing a parallel (PM) to UV through the point P.

Placing the line segments QR and PR in a 3:1 ratio is the same as placing the line segments VR and MR in the relation of 3:1. The latter is equivalent to placing the line segments VM and MR in the ratio 2:1. The length of VM is already determined. Therefore, all we need is to determine the length MR (which determines the point R). Since MR is half the length of VM, what we need is to determine what half the length of VM is. By this time it should be clear what macroaction will allow us to determine half of VM and mark it off from point M along the line MW to determine the point R. Stop reading and see if you can determine this macroaction and then solve the problem.

The required macroaction for determining half the line segment VM is to construct a perpendicular bisector to the line segment VM. This procedure determines the midpoint of VM, from which one can determine the length of half the segment VM by measuring from the midpoint to either point V or point M with the compass. Then we simply hold the compass at this setting, place one end of the compass at point M, and mark the other point along the line MW to determine point R.
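The plan just described can be checked numerically with coordinates. This is only a sketch under assumptions that are not in the original: V is placed at the origin, the rays VU and VW are given by direction vectors u and w, the sample angle and point P are arbitrary, and the helper names cross and construct are illustrative.

```python
import math

def cross(p, q):
    """2-D cross product; zero when p and q are parallel."""
    return p[0] * q[1] - p[1] * q[0]

def construct(u, w, p):
    """V is the origin; u and w are directions of rays VU and VW; p is P."""
    # M: where the parallel to VU through P meets VW (solve P + s*u = t*w).
    t = cross(p, u) / cross(w, u)
    m = (t * w[0], t * w[1])
    # R: mark off half of VM beyond M along VW, so that VM:MR = 2:1.
    r = (1.5 * m[0], 1.5 * m[1])
    # Q: where the line through R and P meets the line VU.
    pr = (p[0] - r[0], p[1] - r[1])
    k = -cross(r, u) / cross(pr, u)
    q = (r[0] + k * pr[0], r[1] + k * pr[1])
    return m, r, q

# Sample data: a 60-degree angle at V and a point P inside it.
u = (math.cos(math.radians(60)), math.sin(math.radians(60)))
w = (1.0, 0.0)
p = (2.0, 1.0)

m, r, q = construct(u, w, p)
ratio = math.dist(q, p) / math.dist(p, r)
print(round(ratio, 6))
```

For this sample the printed ratio QP:PR should come out as 2, and Q lands on the ray VU, in agreement with the similar-triangles argument.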
Having determined point R such that the line segment VM is in the ratio 2:1 to the line segment MR, we have now uniquely determined the line QPR such that the ratio of the line segment QP to the line segment PR is 2:1, and the problem is solved.

For any readers without experience in plane geometry construction problems or who have forgotten what they learned, a bit of instruction concerning the achievement of the two principal macroactions used in this problem may be helpful in making the solution of the problem completely concrete.

To construct a parallel to the line UV through the point P, use the arbitrarily drawn line QPR in Fig. 4-6. Place one point of the compass at point Q and draw an arc through the lines QU and QP. The arc may be of any reasonable radius. Now draw an arc of the same radius around the point P. Now use the compass to measure the distance between the intersections of the arc around Q with QU and QP. To do this operation, place one point of the compass at the intersection of this arc with the line QU and the other at its intersection with the line QP. Now, keeping the same radius with the compass, place one point of the compass at the intersection of the arc around the point P with the line PR and measure off the same distance along that arc. This point when connected to point P will produce a line parallel to the line VQU.

To achieve the macroaction of bisecting line segment VM, place the compass at point V and draw an arc around V intersecting the line VM at a point more than halfway between V and M. Now draw the same radius arc around the point M, intersecting the line VM and intersecting the arc around V once above the line and once below the line.
Connecting the two intersection points of the arc around V and the arc around M results in a perpendicular bisector to the line VM and therefore determines the midpoint of line VM.

Considering how complex especially the first of these two macroactions is, in terms of the sequence of required microactions, it is clear why it facilitates planning the solution of the problem to think in terms of macroactions rather than the sequences of microactions necessary to achieve them.

Knowing which sequences of microactions or equivalence classes of sequences of microactions to define as macroactions appears to depend very heavily (if not completely) on specific knowledge of the area from which the problem is taken. General problem-solving analysis makes it clear what the potential value is in defining macroactions, but it does not tell you which macroactions to define in any given problem area. This characteristic is very frequently the nature of the relationship between general problem-solving methods and specific knowledge; that is, general problem-solving methods direct you toward the type of specific knowledge that you should acquire and motivate you to acquire this knowledge by demonstrating its usefulness in the solution of problems.

Incidentally, there are other ways to solve this problem, using other geometric macroactions. So, if you thought of a different way to solve the problem, it may well be correct.

GETTING OUT OF LOOPS

Beginning problem solvers frequently run out of ideas to apply to a problem. Highly skilled problem solvers often experience the opposite difficulty; they have too many ideas and are forced to choose among a variety of possible approaches to the problem.
I have devoted some space in this book to this question of deciding among many possible problem-solving methods, but I have been mainly concerned with the matter of providing you with a rich variety of general methods for attacking problems to ensure that you do not spend a great deal of time staring at a problem without getting any ideas.

Sometimes you may have no ideas at all as to how to solve a problem, but more frequently you will run out of new ideas after having tried various methods, none of which worked. In such cases, you may repeatedly think of the inadequate methods for solving the problem and get the feeling that you are going around in circles. When you are caught in a loop like this, it is obviously time to do something different from what you have been doing. But how? In many cases that seems to be just the trouble: you are in a series of loops, thinking of the same inadequate ideas over and over again.

An excellent first step in getting out of a loop and doing something different is to analyze what you have been doing. You must determine the attributes (properties) of the approaches you have been taking. Usually when you make an effort to characterize what you have been doing in trying to solve a problem, you can immediately think of some ways to approach the problem differently. Often, what is critical is to step back and think about what you have been doing rather than think about the problem itself.

There are two basic levels at which this analysis of your problem-solving methods can take place: (a) the level of the specific action or action sequences specified in the problem (classifying action sequences) and (b) the level of general problem-solving methods (classifying problem-solving methods).
In each case, after you have characterized what actions or methods you have used, you should ask what other classes of actions or methods seem remotely applicable to the problem.

At the level of general problem-solving methods, there are generally many specific ways to implement any given general method in any particular problem. What are the properties of the way you have chosen? Could you construct an alternative way that had different properties? Is there any information that is explicitly or implicitly a part of the problem that has not been explicitly represented? What kind of information has been used? Can you think of any alternative representation of this same information?

At the action-sequence level, a good example of the usefulness of classifying action sequences to get out of loops is provided by the nine-dot four-line problem:

Without your pencil leaving the paper, draw four straight lines through the following three-by-three array of nine dots (see Fig. 4-8).

FIGURE 4-8 The nine-dot four-line problem.

Stop reading and try to solve this problem.

If you are like many amateur problem solvers, you may have produced a number of attempted solutions such as those shown in Fig. 4-9. Although there are many different ways you can produce incorrect solutions of this type, you can get the feeling rather quickly that they fall into a small number of classes, all of which are incorrect. You may feel you are going around in circles, producing attempted solutions that are of the same character as your previous tries and getting no closer to solution with each attempt. When you reach such a stage, it is well to try to determine the properties of your attempted methods of solution.
If you ask what all these action sequences have in common, one answer is that they all keep the four lines within the perimeter of the three-by-three array of nine dots. If you examine the given information in the problem, it is clear that this restriction to the perimeter of the array is not a part of the problem. Thus, it is permissible to attempt solutions in which the lines extend beyond the perimeter of the array of dots, and, with this insight, the solution is readily achieved as illustrated in Fig. 4-10.

FIGURE 4-9 Incorrect solutions to the nine-dot four-line problem.

FIGURE 4-10 Correct solution to the nine-dot four-line problem.

INCUBATION

When you have been going around in circles and wish to do something different to try to solve a problem, probably the most frequently given piece of advice is to put the problem aside for several minutes, hours, or days, and work on something else or get a good night's sleep before coming back to the problem. This is good advice, though in an examination situation the maximum period of time you can let any problem incubate is, of course, set by the time limitations of the exam. But even then it may be best to work on other problems and come back later to the more difficult ones, so that you will not spend too much time on difficult problems and fail to finish a number of easier questions. In addition, even a few minutes or tens of minutes spent solving other problems may give you a fresh perspective for solving problems you found difficult on the first attempt.

I must confess that incubation is not one of my favorite problem-solving methods, primarily, I suppose, because, when one is forced to use it, it indicates that all the other general problem-solving methods have failed.
However, when you have tried a large number of approaches to a problem with no success, there comes a point at which even the most skilled problem solver should undoubtedly put the problem aside for a few hours or days and come back to it later. This is true even though a skilled problem solver may still be able to generate new ideas concerning how to solve the problem.

Psychologists do not understand why incubation is useful in solving problems. The difficulty in explaining the beneficial effects of incubation on problem solving is not that we lack any ideas concerning possible mechanisms for the effect. On the contrary, there are too many possible mechanisms for the beneficial effects of incubation on problem solving.

First, you may be quite generally fatigued after you have worked on a problem for a long time, and coming back to it in a fresher state of mind seems likely to be beneficial (though again we do not understand the mechanisms of general intellectual fatigue or the need for sleep and so on).

Second, there may be more specific intellectual fatigue or interference in the use of your memory because of the large number of incorrect actions you have taken in trying to solve the problem. The passage of time filled with intervening activities provides an opportunity for these interfering memories to fade away. Only the most valuable lessons you have learned remain in the foreground of your mind when you go back to the problem, with a host of lesser interfering associations having decayed to a low level. It is not clear that this sort of memory loss should necessarily be beneficial to problem solving, but it well might be.

Third, when you come back to the problem, you have an altered memory and new set of things on your mind as a result of the intervening activity.
These new associ ati ons and new cues may wel l result i n the retri eval of new i deas from memory concerni ng how to sol ve the gi ven probl em. Thi s expl anati on i s probabl y the si ngl e most pl ausi bl e reason for the success of the method of i ncubati on. There i s a fourth, somewhat more exoti c possi bi l i ty, namel y, that a person' s mi nd goes on unconsci ousl y worki ng the probl em all during the l ong i ncubati on peri od. Either because the unconsci ous mi nd has a long time to work on the probl em or because somethi ng speci al i s added by unconsci ous problem sol vi ng, the probl em manages t o get solved in thi s way, when consci ous probl em sol vi ng has fai l ed. I n any event, the unconsci ous probl em sol vi ng may modi fy memory i n a man­ ner that faci l i tates consci ous probl em sol vi ng at a l ater ti me. There i s not one shred of evi dence for thi s expl anati on of i ncubati on, whereas the frst three possi bl e mechani sms are all extensi ons of previ ousl y establ i shed psychologi cal pri nci pl es. Neverthel ess, many psychol o­ gi sts bel i eve i n unconsci ous probl em sol vi ng. I am very skepti cal on the matter, but that i s pri mari l y a matter of phi l osophi cal preference. In any event, i ncubati on often works, whatever the mechani sm. THEORY Û State Evaluation and Hill Climbing I n the l ast chapter we reduced the amount of tri al -and-error search in a probl em by constructi ng equi val ent state-acti on trees of reduced si ze. In thi s chapter, we di scuss a very di ferent way of reduci ng the number of state-acti on sequences t hat have to be searched before achi evi ng the sol ut i on. The method has two part s: (a) defni ng an evaluation jimclion over all states i ncl udi ng the goal state and ( b) choos­ i ng acti ons at any gi ven state to achi eve a next state wi th an eval uati on cl oser to t hat of the goal . 
Picking an action on the basis of such a local evaluation of its consequences is known as hill climbing, since evaluation functions are frequently defined so that the goal state has the maximum value on some one-dimensional evaluation function.

Figure 5-1 illustrates the application of state evaluation and hill climbing to the state-action tree for some unknown problem with a hypothetical evaluation function defined over each state. The value of the function for each state is written inside the circle for each node (state). This example arbitrarily uses an integer-valued evaluation function, with the beginning state having value 0, the goal state having value 10, and nongoal states having values intermediate between 0 and 10. Application of a one-step hill-climbing method to this state-action tree with this evaluation function yields the sequence of action choices shown by arrows in Fig. 5-1. You will note that hill climbing need not succeed in achieving the goal the first time, and this time it did not.

FIGURE 5-1 State-action tree with an integer-valued evaluation function defined over every state (node). One-step hill climbing results in the action sequence shown by the arrow. Note that, in this case, hill climbing does not achieve the goal state.

Having failed to achieve the goal by hill climbing in the first attempt, there are many things you can do to achieve the goal, still using hill climbing. You could try choosing the action with the next to best value at one of the various nodes on the original hill-climbing path, use strict hill climbing at all other nodes, and see if you achieved the goal with any of these minimal violations of the general hill-climbing method.
In the present instance, this minimal modification of hill climbing would succeed if you took the next to best action going from state level 0 to state level 1, because, from that point on, hill climbing results in an action sequence that achieves the goal.

Alternatively, you could try two-step hill climbing and choose the sequence of two actions at any given node that resulted in a node with the greatest value. This two-step hill climbing would produce the goal the first time in the problem shown in Fig. 5-1.

Finally, you could question the evaluation function you had defined over the states in the problem. There is usually no way to be certain that you have defined the evaluation function that is ideal for representing progress in achieving the goal in any given problem. Sometimes the failure of hill climbing suggests that a reexamination of the (explicit or implicit) evaluation function is in order. Evaluation functions are generally not given in the problem (except in optimization problems), and so any evaluation function can be chosen to see if it works in conjunction with hill climbing (or some other problem-solving method) to produce the solution to the problem.

Sometimes when hill climbing is used in conjunction with a state-evaluation function, a real-valued (numerical) evaluation is defined for each state. In other cases, you may have some ability to compare several states and judge which is closer to the goal, but no actual numbers are assigned to the states. Whether or not numbers are assigned to states, two states can have equivalent evaluation and so you could not choose between them.
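The one-step and two-step procedures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree, its node names, and its integer evaluations are invented here and are not the tree of Fig. 5-1, the goal state `G` is simply given the maximum value 10, and `paths_from` and `hill_climb` are hypothetical helper names.

```python
# Hypothetical state-action tree: node -> list of successor nodes.
TREE = {
    "S": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1", "B2"],
    "B1": ["G", "B1x"],
}
# Hypothetical evaluation function; the goal state G has the maximum value.
VALUE = {"S": 0, "A": 4, "B": 3, "A1": 5, "A2": 6,
         "B1": 7, "B2": 2, "B1x": 8, "G": 10}


def paths_from(node, depth):
    """Action sequences of `depth` steps from `node` (shorter only at dead ends)."""
    successors = TREE.get(node, [])
    if depth == 0 or not successors:
        return [[]]
    return [[nxt] + rest
            for nxt in successors
            for rest in paths_from(nxt, depth - 1)]


def hill_climb(start, goal, lookahead=1):
    """Repeatedly follow the `lookahead`-step sequence that ends at the best-valued node."""
    path = [start]
    while path[-1] != goal and TREE.get(path[-1]):
        best = max(paths_from(path[-1], lookahead), key=lambda p: VALUE[p[-1]])
        path.extend(best)
    return path


one_step = hill_climb("S", "G", lookahead=1)  # stops at the dead end A2
two_step = hill_climb("S", "G", lookahead=2)  # reaches the goal G
print(one_step, two_step)
```

In this invented tree, taking the next to best action at the first level (B rather than A) and climbing strictly afterward also reaches G, which parallels the minimal modification of hill climbing discussed above.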
So far we have discussed problems with only a single-valued (one-dimensional) state-evaluation function, but there are also problems where the goal differs from the beginning state on several dimensions. In these cases, it is usually possible to make judgments regarding closeness to the goal on each of the dimensions separately, but there may be no single, necessarily optimal way to combine the evaluations on each separate dimension into a single overall evaluation of each state. Thus, you could have a vector-valued evaluation function assigned to each state, as shown in Fig. 5-2.

There are a number of hill-climbing options in regard to vector-valued evaluation functions, such as that shown in Fig. 5-2. You could try various alternation schemes, that is, hill climbing on one dimension for a while and then hill climbing on another dimension for a while. Obviously, when no improvement is possible on a particular dimension by any action that you could take from the node where you are currently located, you should hill-climb on a different dimension for at least that node. If you have reached the goal with respect to one dimension, you should also hill climb on other dimensions. In using these alternation schemes, it helps to keep records of the nodes where you could have chosen to improve on a different dimension than the one you did choose. When the first hill-climbing path through the state-action tree fails to produce the solution, these nodes where you had good alternative choices are the obvious places to back up to and start new paths.

FIGURE 5-2 State-action tree with a two-dimensional vector-valued evaluation function defined over every state (node). In this case, the goal state has the evaluation vector (5, 4),
and the beginning state has the evaluation vector (0, 0). The path taken by a hill-climbing method depends on whether you hill climb on weighted summed components or try some alternation scheme. In the former case, the exact weighting of the two component values is also important in determining the path taken by hill climbing.

Another approach to multidimensional evaluation functions is to combine the values on the separate dimensions into a single overall value for each state. If there is some single most natural way to combine them, do it that way first; but remember that, no matter how natural the combination method might be, it could be the wrong way to combine the values on the different dimensions, that is, wrong for achieving the solution by one-dimensional hill climbing. If the originally chosen combination method fails to work, try some other method of combination, alternation schemes with the original multidimensional evaluation function, multistep hill climbing, defining a new evaluation function, or the like.

APPLICATIONS

Examples of the use of state-evaluation functions and hill climbing abound in problem solving. For instance, when you plan a trip across the country on a map, you initially examine roads that go in nearly the right direction. The right direction is the direction that reduces the distance between where you are and where you are going at the fastest rate. Of course, choosing the road at the beginning of a trip that goes closest to the right direction may prove to be a bad choice. This road may eventually lead to a dead end or require you to go far out of the way to reach the goal.
In addition, planning a trip on a map usually involves other considerations (speed, scenery, or other properties) besides finding the shortest road between the starting and ending points. These considerations place you in the position of doing hill climbing on a vector-valued evaluation function. Despite all these complications, experience suggests that hill climbing is a prominent method used in solving trip-planning problems with a map.

Pencil-and-paper maze problems are rather similar to trip-planning problems on a map, and people frequently use hill climbing in an attempt to solve them. However, challenging maze problems are usually deliberately constructed to frustrate a hill-climbing approach. Maze problems frequently require nonoptimal choices at early and middle stages of the solution and may even require detours (increases in the distance from the goal, as measured by the most obvious evaluation function of physical distance). On the other hand, maze problems usually do not involve considerations of road speed or scenic beauty.

Defining an explicit evaluation function and employing hill climbing is also useful in solving the one-heavy-coin problem discussed in Chapter 3:

You have a pile of 24 coins. Twenty-three of these coins have the same weight, and one is heavier. Your task is to determine which coin is heavier and to do so in the minimum number of weighings. You are given a beam balance (scale), which will compare the weight of any two sets of coins out of the total set of 24 coins.

A suitable evaluation function for solving this problem would be the number of coins whose classification as heavy or light is known. At the beginning of the problem, the value of the function is zero, since none of the 24 coins is known to be either heavy or light.
In the goal state, the heavy-light classification of all 24 coins is known, so the value of the function is 24. Thus, a hill-climbing approach would choose an action at each node that maximized the number of coins whose heavy-light classification is known.

A very large number of alternative actions are present at each node. For example, at the first node, you might weigh any one of the coins against any two of the other coins. In general, you might weigh any set of $m$ coins against any set of $n$ coins, where $m + n \le 24$. The number of different pairs of sets of $m$ and $n$ coins that satisfy the restriction that $m + n \le 24$ is extremely large. However, the most elementary consideration of the previously mentioned evaluation function and the hill-climbing approach immediately rules out all actions that do not involve weighing two sets containing equal numbers of coins in the two pans of the beam balance. This exclusion reduces the number of alternative actions considerably.

Furthermore, using the method of defining equivalence classes of actions discussed in Chapter 4, note that, at the first node of the problem, you have no way to distinguish different subsets of i coins; thus, you must consider any two sets of i coins to be equivalent to each other (in their likelihood of containing the heavy coin). This consideration reduces the number of different alternative actions at the first node to 12: a set of 12 coins is weighed against a set of 12 coins, a set of 11 coins against another set of 11 coins, 10 against 10, and so on, down to 1 against 1.
If you explicitly inquire which of these 12 alternative actions results in the greatest number of known coins following the first weighing, you should be led to select the optimal action at the first node, that is, to weigh a set of 8 coins against another set of 8 coins, since this maximally increases the value of the evaluation function from 0 known coins to 16 known coins following the first weighing, whatever the outcome of the first weighing.

The same sort of evaluation function and hill-climbing approach can be used to solve more complex coin-weighing problems, such as those involving two heavy coins or one coin that might be either heavier or lighter than the other coins. When the coins are classified into three or more categories (for example, heavy, medium, and light), then it may be useful to use as an evaluation function the number of coin-classification pairings (for example, coin 1 is heavy, coin 2 is medium, coin 3 is light) that have been ruled out.

In all of the coin-weighing problems, from the simplest to the most complex, keep in mind that, after a given weighing, the value of the evaluation function may be different for the different outcomes of the weighing. In such cases, the value of the evaluation function for a particular weighing is usually best considered to be the expected value of the evaluation function across all different outcomes, where the value of the evaluation function for each outcome is weighted by the probability of obtaining that outcome. Thus, after the first weighing of eight coins against eight coins in the previously mentioned one-heavy-coin problem, the optimal choice in the second weighing is either to weigh two coins against two coins or three coins against three coins.
In either case, the three outcomes of the weighing (tilt left, balance, tilt right) are not equally likely, nor does each outcome result in an equivalent increase in the number of known coins. For example, with the three against three weighing (out of the eight remaining coins), the probability of their balancing evenly is $\frac{1}{4}$, while the probability of tilt left is $\frac{3}{8}$, and the probability of tilt right is $\frac{3}{8}$. For simplicity, let us use as the evaluation function the number of unknown coins, where the goal state has a value of zero unknown coins. Thus, hill climbing, in this case, means attempting to minimize the value of the evaluation function. Using this evaluation function, the value of a balanced outcome in the three-against-three weighing is two remaining unknown coins, while the value of tilt left is 3 and the value of tilt right is also 3. The overall evaluation of the three-against-three weighing, then, is $(\frac{3}{8} \cdot 3) + (\frac{3}{8} \cdot 3) + (\frac{1}{4} \cdot 2) = \frac{11}{4} = 2\frac{3}{4}$.

The three-against-three weighing produces the minimum expected value on the evaluation function. This fact can be seen by computing the expected value for the other three plausible weighings, namely, one against one, two against two, and four against four. The two-against-two weighing is almost as good as the three-against-three weighing, by this evaluation function. The two-against-two weighing has an expected value of $(\frac{1}{4} \cdot 2) + (\frac{1}{4} \cdot 2) + (\frac{1}{2} \cdot 4) = 3$. The four-against-four weighing has an expected value of $(\frac{1}{2} \cdot 4) + (\frac{1}{2} \cdot 4) = 4$. The one-against-one weighing has the poorest expected value of all, namely, $(\frac{1}{8} \cdot 0) + (\frac{1}{8} \cdot 0) + (\frac{3}{4} \cdot 6) = 4\frac{1}{2}$.
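The expected values above can be checked with a short calculation. The Python below is a minimal sketch, assuming that every suspect coin is equally likely to be the heavy one and that a single remaining suspect counts as zero unknown coins (as in the one-against-one case); `unknown` and `expected_unknown` are illustrative names, not anything defined in the text.

```python
from fractions import Fraction


def unknown(n):
    """Unknown coins left when n suspects remain (one suspect means the heavy coin is found)."""
    return n if n > 1 else 0


def expected_unknown(k, suspects):
    """Expected number of unknown coins after weighing k suspects against k suspects."""
    p_tilt = Fraction(k, suspects)                     # heavy coin is in one given pan
    p_balance = Fraction(suspects - 2 * k, suspects)   # heavy coin was left off the balance
    return 2 * p_tilt * unknown(k) + p_balance * unknown(suspects - 2 * k)


# Second weighing: eight suspects remain after the 8-against-8 first weighing.
second = {k: expected_unknown(k, 8) for k in (1, 2, 3, 4)}

# First weighing: all 24 coins are suspects; try 1 against 1 up to 12 against 12.
best_first = min(range(1, 13), key=lambda k: expected_unknown(k, 24))

print(second, best_first)
```

Under these assumptions the same function also picks 8 against 8 for the first weighing, leaving 8 unknown coins (16 known) whatever the outcome.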
In terms of achieving the goal of determining the one heavy coin out of 24 in the minimum number of weighings, either the three-against-three weighing or the two-against-two weighing is optimal on the second weighing. Thus, in this case, hill climbing is a successful problem-solving method, since it chooses one of the two actions that will lead to the goal with the minimum number of total actions (weighings).

Solving simple linear equations provides another example of the possibility of successful use of hill climbing in problem solving. Consider the linear equation $9x + 7 = 5x + 15$ as the given, with an expression of the form x = ___ being the goal. The blank, ___, represents some currently unknown real number that constitutes the value of x in the solution to the equation.

Initially, we might define a four-valued vector evaluation function for this problem, consisting of the coefficients of the x and numerical terms on the left-hand side of the equation and the x and numerical terms on the right-hand side. For the linear equation above, then, the value of the evaluation function at the given state would be (9, 7, 5, 15). The value of the evaluation function for the goal state is (1, 0, 0, ___), where ___ again indicates that we do not currently know what real number is acceptable in this position. We might choose actions at each step designed to increase the number of terms of this four-valued vector evaluation function that are in agreement with the corresponding terms of the evaluation function for the goal.
Thus, if we subtract 5x from both sides of the equation, the evaluation function is changed to (4, 7, 0, 15), which is known to disagree with the evaluation function for the goal in only the first two positions (the agreement of the value in the fourth position with the desired value in the goal expression cannot be determined). Subsequently, subtracting 7 from both sides of the equation changes the evaluation function to (4, 0, 0, 8), which disagrees with the goal expression in only one position (the first). Finally, dividing both sides of the equation by 4 has an evaluation function (1, 0, 0, 2), which is known to disagree with the evaluation function for the goal in zero positions. The state achieved at this point that includes the expression $x = 2$ constitutes the solution to the problem.

Rather than think of this at all in terms of a four-valued vector evaluation function, we can simply think of the number of "bad" terms in the expression. Initially there are three bad terms. After subtracting 5x from both sides of the equation (obtaining $4x + 7 = 15$), there are only two known bad terms. After subtracting 7 from both sides (obtaining $4x = 8$), there is only one known bad term. Finally, after dividing both sides of the equation by 4 (obtaining $x = 2$), there are no bad terms, and the problem is solved.

It may be somewhat difficult for someone experienced in solving such simple linear equations to imagine that anyone actually uses this sort of evaluation function and hill climbing in order to solve so simple a problem. However, this approach could be used, and, very likely, many beginning algebra students unconsciously use just such a method in solving their initial linear-equation problems.
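The "bad terms" count lends itself to a small hill-climbing sketch. The Python below is illustrative only: the state encoding `(a, b, c, d)` for the equation a·x + b = c·x + d, the three action functions, and the rule that ties are broken by the order in which actions are listed are all assumptions made for this example.

```python
from fractions import Fraction


# A state (a, b, c, d) stands for the equation a*x + b = c*x + d.
def subtract_x_term(s):
    a, b, c, d = s
    return (a - c, b, 0, d)


def subtract_constant(s):
    a, b, c, d = s
    return (a, 0, c, d - b)


def divide_by_coefficient(s):
    a, b, c, d = s
    return (Fraction(1), b / a, c / a, d / a) if a != 0 else s


ACTIONS = [subtract_x_term, subtract_constant, divide_by_coefficient]


def bad_terms(s):
    """Number of terms known to disagree with the goal form (1, 0, 0, ___)."""
    a, b, c, _ = s
    return (a != 1) + (b != 0) + (c != 0)


def solve(state):
    """One-step hill climbing: take the action that leaves the fewest bad terms."""
    path = [state]
    while bad_terms(state) > 0:
        best = min((act(state) for act in ACTIONS), key=bad_terms)
        if bad_terms(best) >= bad_terms(state):
            break  # no action reduces the count; hill climbing is stuck
        state = best
        path.append(state)
    return path


start = tuple(Fraction(n) for n in (9, 7, 5, 15))
path = solve(start)
print(path)
```

At the first step all three actions remove exactly one bad term, so the choice there depends entirely on the assumed tie-breaking order; a different order gives a different, possibly longer, route.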
The more experienced linear-equation solver very likely thinks of the problem in terms of three subgoals, namely, getting all the x terms on the left side of the equation, getting all the numerical terms on the right side of the equation, and dividing through by the coefficient of the x term. However, this subgoal method (to be described in detail in the following chapter) uses the same sort of evaluation function as used by the hill-climbing approach to linear-equation problems. Once you are an experienced solver of linear equations you probably never think of evaluation functions, subgoals, or hill climbing at all but simply solve the problem using the same type of action sequence you have used in solving other such problems: subtract the x term on the right-hand side of the equation from the x term on the left-hand side of the equation, then subtract the numerical term on the left-hand side of the equation from the numerical term on the right-hand side of the equation, and finally divide through by the coefficient of the x term. (This problem-solving method, knowing how to solve a problem because you recognize its relationship to other problems you solved previously, will be discussed in Chapter 9.) Thus, there are many different problem-solving methods that can all lead to roughly the same sequence of actions in solving a simple linear-equation problem. This simple problem is discussed primarily to communicate what is meant by such concepts as evaluation functions, hill climbing, subgoals, relations between problems, and the like.
Furthermore, hill climbing is frequently used to solve more complex equations or sets of equations, using as an evaluation function some measure or measures of the discrepancy in form between some equation you have produced and the goal equation. Thus, in solving equations involving exponential terms with the unknown in the exponent, a solver often takes logs of both sides of the equation to increase the similarity of the resulting equation to the goal equation (since in the goal equation the unknown is not in the exponent). In solving differential equations, you can integrate to get rid of the differential terms, and in solving integral equations, you can solve for the integrals or else differentiate in order to get rid of integrals, and so on.

The six-arrow problem discussed in Chapter 4 to illustrate the power of noticing equivalence classes of action sequences provides a very good example of a problem in which you can define multiple (at least three) different evaluation functions. The three evaluation functions differ considerably in their effectiveness for a hill-climbing approach. Recall that the six-arrow problem is as follows:

You are given six arrows in a row, the left three of which are pointing up and the right three of which are pointing down. The goal is to transform these arrows into an alternating sequence such that the left-most arrow points up, the next arrow to it points down, the next up, then down, then up, and then down. The actions allowed are to simultaneously invert (turn upside down) any two adjacent arrows. Note that you may not invert one arrow at a time but must invert two arrows at a time, and the two must be adjacent. The given and goal states are illustrated in Fig. 5-3. You are to achieve the solution using the minimum number of actions (inversions of adjacent pairs).
Stop reading and try to define three different evaluation functions that might be relevant to solving this problem by hill climbing, then read on.

    1 2 3 4 5 6        1 2 3 4 5 6
    ↑ ↑ ↑ ↓ ↓ ↓        ↑ ↓ ↑ ↓ ↑ ↓
       Given              Goal

FIGURE 5-3 The six-arrow problem.

The most obvious evaluation function is probably the number of arrows that are in the same position as in the goal state. This evaluation function starts out at four in the given state and ends at six in the goal state. However, this most obvious evaluation function turns out to be of no help whatsoever in solving the problem at any of the early stages. For example, of the five alternative actions you might take at the beginning state, four leave the evaluation function unchanged at four and only one action, inverting arrows 3 and 4, decreases the evaluation (from four to two). Even this limited degree of discrimination among actions is of negative value in solving the problem, since inverting arrows 3 and 4 is an action that is, in fact, desirable to perform at some stage in solving the problem, whereas inverting arrows 1 and 2 and inverting arrows 5 and 6 are actions that should not be performed at any stage. Even if you choose to invert arrows 2 and 3 or invert arrows 4 and 5 at the first step, this evaluation function again provides no assistance in choosing the correct action at the second step. It is only when you have chosen the correct two beginning actions that the evaluation function could immediately tell you which action to choose at the third step, a fact that would be obvious in any event.

Having read this discussion of one evaluation function, you might stop reading for a bit and try to generate some additional evaluation functions, if you are not satisfied with the ones you have thought of so far.
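The claim about the five first actions is easy to verify mechanically. The sketch below assumes a string encoding of the arrows ("U" for up, "D" for down) and 1-based arrow numbers as in Fig. 5-3; `invert_pair` and `in_goal_position` are illustrative names.

```python
GIVEN = "UUUDDD"  # left three arrows up, right three down
GOAL = "UDUDUD"   # alternating, starting with up
FLIP = {"U": "D", "D": "U"}


def invert_pair(state, i):
    """Invert arrows i and i+1 (numbered from 1, as in Fig. 5-3)."""
    j = i - 1
    return state[:j] + FLIP[state[j]] + FLIP[state[j + 1]] + state[j + 2:]


def in_goal_position(state):
    """First evaluation function: arrows oriented as in the goal state."""
    return sum(a == b for a, b in zip(state, GOAL))


# Value of the evaluation function after each of the five possible first actions.
scores = {(i, i + 1): in_goal_position(invert_pair(GIVEN, i)) for i in range(1, 6)}
print(in_goal_position(GIVEN), scores)
```

Four of the five actions leave the count at four, and only inverting arrows 3 and 4 lowers it, to two, so this function gives a hill climber nothing useful to follow at the first step.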
A somewhat different evaluation function that is considerably more useful in solving the problem is to count the number of runs of arrows (consecutive arrows with identical orientation). This evaluation function starts out at two runs for the beginning state and ends at six runs for the goal state. In the solution shown in Fig. 5-4, this evaluation function was not increased in going from the beginning state to the next state, but was increased at each of the two remaining states. In other solutions to the problem, the number of runs might be increased at the first step, held constant at the second step, and finally increased again at the third step. Thus, the number of runs is a more useful evaluation function in conjunction with the hill-climbing approach than is the number of arrows in the goal position.

    State        No. of arrows in    No. of runs    Distance between
                 goal position       of arrows      two incorrect arrows
    Beginning    4                   2              3
    Second       4                   2              2
    Third        4                   4              1
    Goal         6                   6

FIGURE 5-4 The values of three different evaluation functions for each successive state in a solution to the six-arrow problem.

However, the evaluation function that is optimal in conjunction with the hill-climbing approach to this problem is to consider the distance between the two incorrectly placed arrows and attempt to reduce that distance. Probably you would arrive at such an evaluation function, in essence, by working backward (see Chapter 7) and noting that you could solve the problem if you had all the arrows correctly positioned, except two incorrectly positioned arrows that were adjacent to each other.
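As a rough check on the three evaluation functions, the Python below climbs on the distance between the incorrect arrows and tabulates all three functions along the resulting path. It is a sketch under assumptions: arrows are encoded as "U" and "D", ties are broken by taking the left-most pair, and for states with more than two incorrect arrows the "distance" is taken (hypothetically) as the span from the first to the last incorrect arrow.

```python
GIVEN, GOAL = "UUUDDD", "UDUDUD"
FLIP = {"U": "D", "D": "U"}


def invert_pair(state, i):
    """Invert arrows i and i+1 (numbered from 1)."""
    j = i - 1
    return state[:j] + FLIP[state[j]] + FLIP[state[j + 1]] + state[j + 2:]


def in_goal_position(state):
    return sum(a == b for a, b in zip(state, GOAL))


def runs(state):
    """Number of runs of consecutive arrows with identical orientation."""
    return 1 + sum(a != b for a, b in zip(state, state[1:]))


def wrong_distance(state):
    """Span between the first and last incorrectly oriented arrows (0 if none)."""
    wrong = [i for i, (a, b) in enumerate(zip(state, GOAL)) if a != b]
    return wrong[-1] - wrong[0] if wrong else 0


def climb_on_distance(state, max_steps=10):
    """One-step hill climbing that minimizes wrong_distance at each step."""
    path = [state]
    for _ in range(max_steps):
        if state == GOAL:
            break
        state = min((invert_pair(state, i) for i in range(1, 6)), key=wrong_distance)
        path.append(state)
    return path


path = climb_on_distance(GIVEN)
table = [(in_goal_position(s), runs(s), wrong_distance(s)) for s in path]
print(path, table)
```

With these assumptions the climb reaches the goal in three inversions, and the distance column falls 3, 2, 1, 0 while the count of arrows in goal position stays at four until the final step.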
In fact, you might note that, since an action always changes the position of two arrows, the final step must necessarily be to change two arrows (both of which are incorrectly positioned) to being correctly positioned. Thus, if more than one step is required, you know for certain that you could not get five arrows correctly positioned and be able to solve the problem. Hence, there is no point to the first evaluation function; you should focus instead on what you need to do in order to achieve the subgoal of putting the two incorrectly positioned arrows adjacent to each other. In essence, this procedure defines the third evaluation function, which is the distance between the two incorrectly positioned arrows. Note that, in the given state, the value of this evaluation function is 3, and the successive actions in a correct solution to the problem can reduce this to 2 and then to 1, from which the final action is obvious.

Another problem that illustrates the possibility of defining several plausible evaluation functions for the solution of a problem by hill climbing is the following discrimination reversal problem:

In the one-dimensional world of Lineland, there are two races of "people": whites and blacks. As in our three-dimensional world, the whites have for a very long time discriminated against the blacks. However, of late, the blacks have been making some gains in the area of social justice, in some cases obtaining judgments from courts and legislatures that a certain degree of reverse discrimination should obtain for a period of time, as symbolic retribution to blacks and as a lesson to whites concerning the evils of discrimination.
One of the areas in which the blacks have just now achieved a court decision ordering discrimination reversal is in the matter of bus travel. In the past, whites have always ridden in the front of the bus and blacks in the back. Now the court has just ordered that for a time blacks will ride in the front and whites in the back. When the order took effect, there was one seven-passenger bus that was already loaded with three blacks in the last three seats, three whites in the next three seats, and the front seat empty. The bus is automatic, requiring no driver (steering is not required in Lineland). All this is illustrated in Fig. 5-5. Since the order had already gone into effect, the police insisted that the blacks and the whites must reverse positions completely. Of course, people in Lineland are able to move to adjacent positions in their linear world, so a person could move to an empty adjacent seat in the bus. However, in addition, Linelanders have invented a special device that allows them to pass through two-dimensional space for a very limited distance, hopping over intervening persons and objects in either direction along their linear world. This hopping ability has a maximum limit equal to two seats in the bus. Thus, either a white or a black could jump over one or two adjacent seats in the bus, provided the target seat was empty. For example, in the given state of Fig. 5-5, the first white could move into the front seat, or the second white could hop over one white into the front seat, or the third white could hop over two whites into the front seat. But the first black could not hop over all three whites into the front seat. Using these movement properties of whites and blacks in Lineland, solve the problem of reversing the relative positions of the blacks and the whites so that all three blacks are in front of all three whites in the bus. Do this in the minimum number of moves.

FIGURE 5-5 The discriminational reversal problem for a bus in Lineland.

In solving the problem, note that it is irrelevant to the satisfaction of the court order where the empty seat occurs in the bus, so long as all three blacks are sitting in front of all three whites. It is not possible in this problem to write down a single goal state, since a variety of possible goal states will satisfy the problem. All we can say for sure is that, in the goal state, the first four positions in the bus will contain all three blacks and the last four seats in the bus will contain all three whites, but one does not know where the empty seat will occur. This limited specification of the goal state, however, is quite adequate for defining a variety of evaluation functions that appear relevant to the solution of the problem. At this point, stop reading and, using a hill-climbing approach, try to define explicitly some evaluation functions that might prove useful for solving the problem.

There appear to me to be two obvious types of evaluation functions we can define for this problem, both of which are quite satisfactory for hill climbing. One evaluation function involves numbering the positions in the bus from 1 at the front to 7 at the back. The evaluation function would be something like the average position of the whites minus the average position of the blacks. We could then attempt to maximize this number.
With this evaluation function, a detour is required on the first move, but thereafter all moves in the optimal sequence do increase this evaluation function by an amount that is either greater than or equal to every other alternative action (usually greater than every other alternative action). Explicit computation of the value of this function is somewhat more difficult than the second evaluation function to be discussed, but all that really counts is the relative difference between the current state and every alternative state that can be achieved by taking any admissible action. This difference is relatively easy to determine, and so this evaluation function proves quite helpful in conjunction with a hill-climbing approach to solving the problem.

If you have not yet thought of a second evaluation function for the solution of this problem, stop reading and try to think of another one.

A second evaluation function, which is even easier to compute than the first, is the sum of the number of blacks in front of each white, summed across all whites. The value of this evaluation function for every state starting with the beginning state and ending with the goal state in the optimal solution of the problem is shown in Fig. 5-6. Clearly, this evaluation function represents most closely what we are trying to achieve in the problem.

    State        Configuration     Evaluation
    Given (0)    . W W W B B B     0
    1            W . W W B B B     0
    2            W B W W . B B     2
    3            W B W W B B .     2
    4            W B W . B B W     4
    5            . B W W B B W     5
    6            B . W W B B W     5
    7            B B W W . B W     7
    8            B B . W W B W     7
    Goal (9)     B B B W W . W     9

FIGURE 5-6 The optimal solution to the discrimination reversal problem, using hill climbing on the evaluation function of the number of Bs to the left of each W, summed over all three Ws.
In the given state, there are zero blacks in front of the whites, yielding an evaluation of zero. In the goal state, there will be all three blacks in front of all three whites, yielding an evaluation of 9. Thus, with this evaluation function, we know exactly what value is possessed by the goal state. With the first evaluation function, increases in the evaluation can be made, after the goal is achieved, by moving whites further to the back of the bus and positioning the empty seat in the middle. Since this additional move is not required for the solution of the problem, it is unnecessary and nonoptimal. However, this is a trivial matter, and, in fact, both evaluation functions serve almost equally well in the solution of the problem. Actually, the first evaluation function provides increases in the evaluation between states 2 and 3, states 5 and 6, and states 7 and 8 (see Fig. 5-6), where the second evaluation function cannot be changed by any action. Thus, in some ways, the first evaluation function is superior, though if you do some limited looking ahead (two-step hill climbing) with the second evaluation function, you will have the optimal choice at each of these three nodes.

The optimal solution to the discrimination reversal problem is achieved by using (a) the second (relative position) evaluation function with a limited degree of two-step hill climbing to decide among the equivalently valued actions in states 0 and 1, 2 and 3, 5 and 6, and 7 and 8, or (b) the first (absolute position) evaluation function with the trivial modification that you stop when the relative positions are correct, even though the absolute-position evaluation function could still be increased.
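The two evaluation functions can be checked against the states listed in Fig. 5-6 with a short sketch. The string encoding of a state (front of the bus first, "." for the empty seat) and the function names are assumptions made for illustration; the sketch only scores the listed states and does not generate moves.

```python
from fractions import Fraction

# States of the optimal solution, copied from Fig. 5-6.
# The first character is the front of the bus; "." marks the empty seat.
FIG_5_6_STATES = [
    ".WWWBBB",  # given state
    "W.WWBBB",
    "WBWW.BB",
    "WBWWBB.",
    "WBW.BBW",
    ".BWWBBW",
    "B.WWBBW",
    "BBWW.BW",
    "BB.WWBW",
    "BBBWW.W",  # goal state
]


def absolute_position_eval(state):
    """First evaluation: mean seat number of the Ws minus mean seat number of the Bs."""
    whites = [i + 1 for i, seat in enumerate(state) if seat == "W"]
    blacks = [i + 1 for i, seat in enumerate(state) if seat == "B"]
    return Fraction(sum(whites), len(whites)) - Fraction(sum(blacks), len(blacks))


def relative_position_eval(state):
    """Second evaluation: number of Bs in front of each W, summed over all Ws."""
    total = 0
    blacks_seen = 0
    for seat in state:
        if seat == "B":
            blacks_seen += 1
        elif seat == "W":
            total += blacks_seen
    return total


relative_values = [relative_position_eval(s) for s in FIG_5_6_STATES]
absolute_values = [absolute_position_eval(s) for s in FIG_5_6_STATES]

for step, state in enumerate(FIG_5_6_STATES):
    print(step, state, relative_values[step], absolute_values[step])
```

Under this encoding the second function reproduces the Evaluation column of Fig. 5-6, and the first function drops on the first move and then rises at every later step, including the three steps where the second function stays flat.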
It is a remarkable fact concerning the absolute-position evaluation function that it permits choice of the optimal action at each state, using a hill-climbing approach. That is, hill climbing using the absolute-position evaluation function will produce the solution the very first time by choosing the action that maximizes the evaluation on the next state (bearing in mind that, at the first move, you must choose the action that reduces the absolute-position evaluation by the least amount).

The definition of evaluation functions and the use of hill climbing have a substantial role in playing chess games and in the solution of many chess problems. However, there are at least two major difficulties in the use of hill climbing in chess. First, even the immediate evaluation of any move you take must depend to some extent on your opponent's immediately following move. This fact leads to a certain degree of uncertainty, but it can be resolved in at least two ways: (a) by assigning subjective probabilities to your opponent's different moves and accordingly determining the expected values of your own possible moves (as was done in the "game against nature" illustrated by the coin-weighing problem) and (b) by assigning an evaluation to your move consistent with the best next move your opponent could produce (where this move is in any sense determinable). Since your subjective probabilities for your opponent's moves are likely to be only approximately accurate at best and since your ability to judge your opponent's best response is limited as well, there are substantial difficulties in applying hill climbing, no matter which approach is taken.

The second principal difficulty of using hill climbing in chess concerns the very large variety of different evaluation functions that are relevant to playing a good game of chess or solving many chess problems.
For example, you are concerned with moving your own pieces to favorable positions on the board (where they control large numbers of squares), the subsidiary goal of control of the center, preventing your opponent's piece development, ensuring the safety of your king, jeopardizing the safety of your opponent's king, and many, many others. It is not at all obvious what all the relevant evaluation functions might be in chess nor how to weight them in different situations to come up with some overall evaluation of your next move.

Despite all these problems, the defining of evaluation functions and the use of hill climbing (looking ahead one or more steps) are important problem-solving methods in chess. To illustrate, consider the following very simple end-game problem involving black's rook and king against white's king, with the positions as illustrated in Fig. 5-7 and black to move. It is black's move and black's objective is to checkmate the white king in the minimum number of moves. Stop reading and try to think of at least one evaluation function that is relevant to this objective and that might dictate the choice for black's first move in the present instance.

FIGURE 5-7 Black to move in a manner that maximally restricts the squares of the board to which the white king might eventually move.

Although black must be continually concerned with avoiding stalemate (putting the white king in a position where he is not in check but has no move except one that would put him in check), the most obvious objective of black is to minimize the number of squares to which the white king can move without being in check.
Although minimizing the possible moves of the white king might be considered to apply only to the next move, it is more useful to consider the evaluation function to refer to minimizing the number of squares to which the white king might ever be able to move (that is, minimizing the number of squares reachable by a sequence of several moves). By the former evaluation function, moving the black rook to either the fifth, sixth, or seventh file (row) would be equally good. However, by the second, more adequate evaluation function, only the move of black's rook to his fifth file maximally restricts the number of squares on the board to which the white king might ultimately move (if the rook merely stayed on that file). Thus, this is the solution to the problem, and it is relatively straightforward in terms of a hill-climbing approach using the second evaluation function.

DIFFICULTIES WITH HILL CLIMBING

Local Maximum

One of the most common applications of hill climbing comes in optimization problems, where the evaluation function is already given in the first part of the problem. For example, suppose you are attempting to determine the maximum value of some function defined over a 1, 2, 3, ..., n dimensional space. That is, you have some complex function of several variables for which it is possible to compute the value of the function, given any particular set of values for the individual variables, but for which it is not possible to determine by analytic means what the maximum of the function might be. To solve such problems, it is best to begin with a particular set of values for the independent variables, compute the value of the function (dependent variable) for that set of independent variables, then determine the value of the function for points that are nearby in the space of the independent variables.
That is, you make small variations in the values of each of the independent variables in turn and determine the value of the dependent variable (evaluation function) for each new set of independent variables. Whatever direction of movement in the space of independent variables produces the greatest increase in the value of the dependent variable is chosen as the new focus for exploration. Proceed in this manner until you find a point for which no movement in any direction produces an increase in the evaluation function. At that point, you have climbed to the top of some local peak (local maximum) in the space, and hill climbing is no longer of any value in searching for the highest peak (absolute maximum) in the space of the independent variables.

The most frequently discussed difficulty with the hill-climbing approach in such optimization problems is that you can only reach a local maximum and have no guarantee that it is the absolute maximum (highest value of the dependent variable, defined over the space of the independent variables). The only real solution to the problem is to try a large number of widely dispersed starting points in the space of the independent variables and choose the maximum of the local maxima reached by hill climbing. Assuming that the givens of the problem do not include information concerning the optimum value of the evaluation function, you can never be absolutely sure that you have found the absolute maximum.

However, it should be pointed out that the application of hill climbing to most problems other than this class of optimization problems includes information concerning the value of the evaluation function at the goal. In such cases, you can always know whether or not you have reached the goal by hill climbing. This greatly attenuates the seriousness of the "local maximum" difficulty.
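The procedure just described, together with the remedy of trying several dispersed starting points, can be sketched as follows. The test surface `two_peaks`, the unit step size, and all function names are hypothetical choices for illustration, not part of the original discussion.

```python
def hill_climb(f, start, step=1):
    """Climb from `start`, varying one independent variable at a time by +/- step."""
    point = tuple(start)
    value = f(*point)
    while True:
        best_point, best_value = point, value
        for i in range(len(point)):
            for delta in (-step, step):
                neighbor = point[:i] + (point[i] + delta,) + point[i + 1:]
                neighbor_value = f(*neighbor)
                if neighbor_value > best_value:
                    best_point, best_value = neighbor, neighbor_value
        if best_point == point:  # no nearby point is higher: a local maximum
            return point, value
        point, value = best_point, best_value


def two_peaks(x, y):
    """Hypothetical surface with a low peak at (2, 2) and a high peak at (8, 8)."""
    low_peak = 10 - ((x - 2) ** 2 + (y - 2) ** 2)
    high_peak = 20 - 2 * ((x - 8) ** 2 + (y - 8) ** 2)
    return max(low_peak, high_peak)


def best_of_restarts(f, starts):
    """Hill climb from several dispersed starts and keep the highest local maximum."""
    return max((hill_climb(f, s) for s in starts), key=lambda result: result[1])


single = hill_climb(two_peaks, (0, 0))
restarted = best_of_restarts(two_peaks, [(0, 0), (5, 5), (9, 9)])
print(single, restarted)
```

Starting from (0, 0), the climb stops on the lower peak; only the restart near (9, 9) finds the higher one. Even then, nothing in the procedure itself certifies that the higher peak is the absolute maximum.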
End-bunching

Hill climbing is often used in construction problems, where you start putting together some of the materials to result in a state that is closer to the goal (more similar to the object being constructed) than was the original given state. In some construction problems, this method works well, but in others it is not useful. For example, consider the Instant Insanity problem described in Chapter 2. Stop reading and define some possible relevant evaluation functions for Instant Insanity.

One rather natural four-dimensional evaluation function might be the number of different colors you achieved on each side of the tower. The number of blocks already placed in the tower is included in this evaluation function in what appears to be a completely satisfactory way, since to achieve the goal of having four different colors on each side, you would have to have a tower of four blocks. If you were given more than four blocks to work with, this would not be a satisfactory evaluation function unless you considered the number of blocks, since you could achieve four different colors on each side by using more than four blocks. However, in the present problem, the goal state could be characterized exactly as having each of the four colors represented on each of the four vertical sides. Hence, we may consider the beginning state to have the evaluation vector (0, 0, 0, 0) and the goal to have the evaluation vector (4, 4, 4, 4). The four dimensions could rather naturally be combined into a one-dimensional evaluation function simply by summing the four components. In obtaining this sum, it seems natural to give equal weight to each component, since each dimension has the same range of values and an analogous meaning.
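As a minimal sketch, the four-dimensional evaluation vector and its equal-weight sum might be computed as below. The encoding of a block as a 4-tuple of the colors showing on its four vertical sides is an assumption, and the example orientations are made up; they are not the actual cubes of Chapter 2.

```python
def side_color_counts(tower):
    """Evaluation vector: number of distinct colors showing on each vertical side.

    `tower` is a list of blocks; each block is a 4-tuple giving the colors
    currently showing on the four vertical sides (a hypothetical encoding).
    """
    if not tower:
        return (0, 0, 0, 0)
    return tuple(len({block[side] for block in tower}) for side in range(4))


def summed_evaluation(tower):
    """One-dimensional evaluation: equal-weight sum of the four components."""
    return sum(side_color_counts(tower))


# Illustrative orientations only, not the real puzzle cubes.
partial_tower = [
    ("R", "G", "B", "W"),
    ("G", "B", "W", "R"),
    ("B", "G", "R", "R"),
]

print(side_color_counts([]), summed_evaluation([]))
print(side_color_counts(partial_tower), summed_evaluation(partial_tower))
```

The empty tower scores (0, 0, 0, 0), and a solved tower of four blocks would score (4, 4, 4, 4), for a sum of 16.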
Although few people are explicitly aware of it, virtually everyone who works on Instant Insanity attempts to use some form of hill climbing, using something like the above evaluation function. Systematic use of this kind of hill climbing greatly reduces the search space (number of alternative towers to be investigated), but the method still leaves a very large number of alternatives to investigate. There are many equivalent options at each of the four nonterminal nodes of the state-action tree for Instant Insanity, so hill climbing with this evaluation function hardly yields the answer with a single series of four choices. The difficulty with this state evaluation function applied to this problem is that it is much harder to increase the evaluation function by the required amount at the last (fourth) choice node than at earlier nodes. At most of the last nodes, no action will achieve the goal, even though the solver is currently at a node that has the evaluation (3, 3, 3, 3). Whether or not you can solve the problem is determined by the existence of such an action at the fourth node, but the evaluation function for the states that could be achieved at earlier nodes gives very inadequate information concerning the "correct" fourth node at which to be. That is, there are many fourth nodes with the evaluation (3, 3, 3, 3), and very few of these have any action that leads to a terminal node with the evaluation (4, 4, 4, 4).

There are many problems like this, where the restrictions bunch up at the end of the problem. It is as if you had many easy trails to climb most of the way up a mountain, but the summit was attainable from only a few of these trails, with the rest running into unscalable precipices.
Hill climbing (in the problem-solving sense) is often not a very good method to use in such cases, though it may considerably reduce the amount of trial-and-error search. The astute reader might note that the end-bunching of restrictions is a difficulty with hill climbing that is somewhat analogous to the local-maximum difficulty.

Detours and Circling

Problems with multiple equivalently valued paths at the early nodes can be difficult to solve with hill climbing, but perhaps the greatest frustration in using the method comes in detour problems, where at some node you must actually choose an action that decreases the evaluation. Somewhat less difficulty is encountered in what might be called circling problems, where at one or more nodes you must take actions that do not increase the evaluations. If the nodes where you must detour or circle have no better choices (that is, no choices that increase the evaluation), then you are more likely to try detouring or circling than if the critical nodes have better choices. When better choices are available, you tend to just choose them and go on without considering the possibility of detouring or circling. If the path you choose does not lead to the goal, you might go back and investigate alternative paths, but the first ones to be investigated will be those that were equivalent or almost equivalent at some previous node. Only after all of this fails should you try detouring, that is, choosing an action at some node that produces a state that has a lower evaluation than the previous state had.

The missionaries-and-cannibals problem is a famous example of the difficulties encountered by hill climbing in a detour problem. The problem is as follows: On one side of a river there are three missionaries and three cannibals. They have a boat on their side that is capable of carrying two people at a time across the river.
The goal is to transport all six people across to the other side of the river. At no point can the cannibals on either side of the river outnumber the missionaries on that side of the river (or the cannibals would eat the outnumbered missionaries). This constraint only holds when there is at least one missionary on the side of the river where there are more cannibals. That is, it is all right to have one, two, or three cannibals on the same side of the river with zero missionaries, because then they would have no missionaries to eat. Stop reading and try to solve the problem by explicitly defining some evaluation function and using a hill-climbing approach, then see Fig. 5-8 for a sequence of states that solves the problem.

Offhand, you might think this was an absolutely trivial problem, since the state-action tree for the problem is rather small, and hill climbing on an evaluation function such as "the number of people on the other side of the river" reduces the number of paths to search to a very small number. But that is just the trouble! Hill climbing on this obvious evaluation function reduces the search space in such a way as to eliminate the path that leads to the goal.

Given this evaluation function (the number of people on the other side of the river) for each state, there is a critical node at which you must detour (more than usual) to solve the problem by taking two people back across the river to the original side. Of course, at every other node in the problem, there is a necessary detour, when one person must row the boat back to the original side. But, as I mentioned before, necessary detours often cause little difficulty, especially in a problem like this one, where they are so obviously necessary on any path to the goal. But taking two people back to the original side is a detour that just does not occur to many people who work on this problem.
If they were consciously aware that they had defined an evaluation function and were hill climbing using that evaluation function, then it would quickly occur to them that a detour might be necessary or that a new evaluation function was in order, and so on.

Incidentally, one reason why people have no difficulty with the necessary detour on every other node of the state-action tree for the missionaries-and-cannibals problem is that solvers usually automatically use two-step hill climbing; that is, they maximize the number of people on the other side of the river after a trip across the river and the return trip as well. Using this two-step hill climbing, no detours at all are necessary, and at the one crucial node, all that is necessary is a circling action (not increasing the number of people on the other side as a result of the round-trip voyage of the boat).

FIGURE 5-8 A diagram of the successive states (node levels 0 through 11) in a solution to the missionaries-and-cannibals problem, where M = missionary, C = cannibal, b = boat, and the horizontal line is the river. The critical detour step occurs at node level 6. If two-step hill climbing is used until the last step (to ignore the necessary detour on every alternate action), the evaluation numbers considered are the underlined numbers.

You could define an evaluation function different from "the number of people on the goal side of the river." Obviously, you could use the two-dimensional vector of the number of missionaries and the number of cannibals on the goal side of the river, starting with (0, 0) and the goal being (3, 3).
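The detour and the circling step described above can be made concrete with a small search. The sketch below is an illustration under stated assumptions: it uses breadth-first search rather than hill climbing, simply to obtain one shortest solution, and the state encoding and names such as `solve` and `is_safe` are hypothetical.

```python
from collections import deque

TOTAL = 3  # three missionaries and three cannibals
BOAT_LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # (missionaries, cannibals)


def is_safe(m, c):
    """A bank is safe if it holds no missionaries or at least as many as cannibals."""
    return m == 0 or m >= c


def solve():
    """Breadth-first search. A state is (missionaries, cannibals, boat) on the original side."""
    start, goal = (TOTAL, TOTAL, 1), (0, 0, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        m, c, boat = state
        sign = -1 if boat == 1 else 1  # a boat on the original side carries people away
        for dm, dc in BOAT_LOADS:
            nm, nc = m + sign * dm, c + sign * dc
            if nm not in range(TOTAL + 1) or nc not in range(TOTAL + 1):
                continue
            nxt = (nm, nc, 1 - boat)
            if is_safe(nm, nc) and is_safe(TOTAL - nm, TOTAL - nc) and nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


path = solve()
# Evaluation: number of people on the goal side after each crossing.
people_across = [2 * TOTAL - m - c for m, c, _ in path]
# Two-step hill climbing looks only at the state after each round trip.
after_round_trips = people_across[0::2]
print(people_across)
print(after_round_trips)
```

In the one-step sequence the evaluation falls by one on most return trips but by two at the sixth crossing, which is the critical detour. In the round-trip sequence it never falls and stays level exactly once, which is the circling action.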
Using this vector does not avoid the necessity of detouring (or circling in two-step hill climbing).

Somehow we would like the constraint regarding cannibals outnumbering missionaries to be reflected in the evaluation function. If that were done in the proper way, we would suspect that, by that evaluation function, it would not be necessary to make anything but the obviously necessary detour involved in getting the boat back to the original side of the river. If you were to consider the number of missionary-cannibal pairs on the goal side of the river to be your evaluation of the state, then the only detours that would have to be taken would be those necessary to bring the boat back across the river. This evaluation function does not distinguish between states as finely as do the previously mentioned evaluation functions. That is, at any given node, there are more actions with the same evaluation than is the case with the previously mentioned evaluation functions. However, this does not cause difficulties, since most of the actions at any given node are eliminated from consideration by the constraint that the cannibals cannot outnumber the missionaries on either side of the river. I must say, though, that I am sure that someone who was thoroughly familiar with evaluation functions and hill climbing would look for a detour before defining some new evaluation function in so simple a problem.

Inference versus Action Problems

Most of the problems discussed in the present chapter as examples of the more or less successful use of hill climbing were action problems; only a few were inference problems. The best formal definition of the distinction between these two classes of problems is that action problems involve only destructive operations, whereas inference problems involve primarily or exclusively nondestructive operations.
Action problems are concerned with achieving changes in some physical world via constructions, movements, or the like. By contrast, inference problems are concerned with our knowledge of something (whether or not one thinks of there being any physical referent). In inference problems, the objective is to expand the set of true statements to include the desired goal statement.

By this definition, trip-planning problems, maze problems, the six-arrow problem, the discrimination reversal problem, Instant Insanity, and the missionaries-and-cannibals problem are all action problems. The coin-weighing problem, the linear and other equation-solving problems, and the function-optimization problem are all inference problems (though other optimization problems might be action problems).

Obviously, the hill-climbing method is not restricted to action problems and excluded from inference problems. However, there is almost always a substantially greater economy (parsimony) when you describe your state at any point in the solution of an action problem. Since action problems involve only destructive operations, the state description can usually be achieved by a single simple expression. Furthermore, the complexity of the expression does not usually grow enormously with increases in the number of actions that have been taken in the attempt to solve the problem. By contrast, in inference problems the number of expressions generated increases with every action. In inference problems, you are continually increasing the number of statements known to be true. The description of the problem state must generally be considered to include the entire set of expressions given or derived up to that point.
Since the goal is usually a single expression, it is generally much more difficult to define an evaluation function useful for hill climbing, one that compares the current state with the goal state.

Another reason for the greater difficulty in using hill climbing in inference problems is that the nondestructive operations frequently found in such problems are often not one-to-one operations, that is, operations that take one expression as input and produce one expression as output. There are such one-to-one operations, of course. However, in addition, inference problems usually contain a variety of two-to-one, three-to-one, or even more complex operations, that is, operations that take two or three or more expressions as input and produce one expression as the output (the inferred expression). One-to-one operations are usually called unary operations; two-to-one and three-to-one operations are usually called binary and ternary operations. By and large, problems with only unary operations are more susceptible to a hill-climbing approach than are problems containing binary and ternary operations.

Of course, there is always the trivial evaluation function of how much knowledge you have obtained from the given information. However, sheer amount of knowledge (for example, the number of derived expressions), while positively correlated with achieving the goal expression, may not be closely related to the achievement of the goal if the inferences are proceeding in the wrong direction (a direction not related to inferring the goal expression).

We need to define an evaluation function that measures the relevant progress toward achieving the goal expression in inference problems, and such evaluation functions can frequently be found.
However, when they are found, they are usually more useful in conjunction with the subgoal method (to be discussed in the following chapter) than they are with hill climbing. The reason is that usually a sequence of several actions is required to achieve an expression that moves the evaluation function in the direction of the evaluation characteristic of the goal. Most of the single actions taken in reaching each successive subgoal cannot themselves be identified as reducing the distance to the goal in terms of the goal evaluation function. However, in these cases, if we define evaluation functions relevant for reaching each successive subgoal, then conceivably in a very large proportion of inference problems it is possible to use the hill-climbing method to achieve various subgoals. Some examples of this combined use of hill climbing and subgoal methods will be discussed in the mathematics, science, and engineering problems of Chapter 11.

6 Subgoals

THEORY

A problem-solving method that is important but difficult to master is that of defining subgoals in order to facilitate solving the original problem. This method is sometimes called "analyzing a problem into subproblems," or "breaking up a problem into parts." In essence, the purpose is to replace a single difficult problem with two or more simpler problems.

Of course, if you already know how to solve some of the subproblems, or if some of them are analogous to problems you already know how to solve, then obviously it might be easier to solve the set of simpler problems than the single original problem. However, the fact that it is advantageous to break up a problem into subproblems does not mean you must be more familiar with the subproblems than with the original problem.
One way to see the advantage of defining subgoals is to look at the following analysis of the state-action tree for a problem (1) with M alternative actions at each node and (2) a sequence of n actions being necessary for solution.

Let us assume that we know that the solution to some problem will require a sequence of n actions (or fewer). By systematic trial and error there are M^n alternative paths (action sequences) to be investigated in the original problem. Now assume that you can define a subgoal state that is known to be on the correct path to the goal and, let us say, halfway from the beginning to the goal. Defining one subgoal divides the problem into two subproblems: first, getting from the given state to the subgoal and, second, getting from the subgoal to the goal. In this case, there are M^{n/2} paths to investigate in attempting to get from the givens to the subgoal, and there are the same number (M^{n/2}) of paths to investigate to get from the subgoal to the goal. Thus with the single subgoal, the number of action sequences to be investigated is 2M^{n/2} action sequences that are n/2 steps long, versus M^n action sequences that are n steps long in the original problem without a subgoal.

To get some concrete notion of the advantages of reducing the exponent of M in this manner, consider the case where M = 10 and n = 10. In this case, M^n = 10^{10} and 2M^{n/2} = 2(10^5). In this case, a single subgoal has reduced the search by a factor of 50,000, which is, of course, a staggering reduction. In addition, with the subgoal, the action sequences are only half as long. A state-action tree of a very simple problem, which vastly underestimates the power of the subgoal method, is shown in Fig. 6-1.
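The arithmetic above can be reproduced with a few lines of code. This is a sketch under the same simplifying assumptions as the text (subgoals known to lie on the correct path and evenly spaced); the function name `paths_to_search` and its `subproblems` parameter are hypothetical, with `subproblems=2` corresponding to a single subgoal.

```python
def paths_to_search(m, n, subproblems=1):
    """Number of action sequences to investigate by systematic trial and error.

    Assumes `n` is divisible by `subproblems` and that the subgoals are known
    to lie on the correct path at equal spacing.
    """
    if n % subproblems != 0:
        raise ValueError("n must be divisible by the number of subproblems")
    return subproblems * m ** (n // subproblems)


without_subgoal = paths_to_search(10, 10)                   # M^n
with_one_subgoal = paths_to_search(10, 10, subproblems=2)   # 2 * M^(n/2)
reduction_factor = without_subgoal // with_one_subgoal
print(without_subgoal, with_one_subgoal, reduction_factor)
```

Because the count is a small multiple of M raised to a reduced exponent, each further evenly spaced subgoal shrinks the exponent again while only adding one to the multiplier.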
If we defined four subgoals (five subproblems) in the problem, with M = 10 and n = 10, then the number of two-step paths involved in achieving all the subgoals plus the final goal is 5(10^2) = 500, which is a reduction of the search by a factor of 2(10^7), or 20,000,000.

FIGURE 6-1 State-action tree for a very simple problem showing how defining a subgoal on the correct path (action sequence) to the goal can reduce the search. In this case, the search is limited to the region inside the two boxes, which is eight action sequences each two steps long, instead of 16 action sequences each four steps long. Some simplifying assumptions are made, such as that one knows that the subgoal is two steps from the beginning and two steps from the end. However, the average problem is much longer, and the degree of reduction in search by defining subgoals is far greater than in this simple example.

To be sure, a number of simplifying assumptions were made in computing the comparative advantages of defining a series of subgoals. However, the primary assumption, which overestimates the advantages of the subgoal method, is that you could be sure the subgoals you defined were states on a path that led to the goal. In some cases, you can be sure of this, but in many other cases you cannot. Nevertheless, if you could find a true subgoal by making 5, 10, or even 100 guesses of it, you would still be reducing the search space by extremely large factors in all of the many problems that require more than a few steps to solve.

The subgoal method is advantageous for attacking problems that require a sequence of more than two or three actions to solve, which is what most nontrivial problems require. Still, some problems are not simplified appreciably by this method; they are sometimes called insight problems because they require few steps to solve once the critical insight has been achieved. These include problems in which one must represent the components of the problem in some suitable way, guess the correct set of givens (where there are multiple given states), or choose a solution approach that violates hill climbing but that requires choosing from among only a small number of action sequences, once the insight has been achieved. Examples of insight problems, for which the subgoal method has little to offer, are many implicit information problems, such as the notched-checkerboard and block-cutting problems of Chapter 3, and the detour problems, such as the missionaries-and-cannibals problem of Chapter 5.

The subgoal method also does not work, of course, if you cannot think of any plausible subgoals. However, if the problem seems likely to be a multistep rather than insight problem, it is usually advantageous to spend some time trying to generate plausible subgoals, because of the enormous power of the method.

How do you try to define a subgoal with reasonable hopes that you are on a path to the goal? Although there is no method of defining plausible subgoals that is mathematically precise and that applies to every type of problem, you can take the first step by defining an evaluation function over different problem states, as was done as a necessary precondition in applying the hill-climbing method. Having done this, you can recognize a plausible subgoal as a problem state with an evaluation part way between the given state and the goal state. If the evaluation function is multidimensional, then such a subgoal might have the goal values on some, but not all, of the dimensions. Or it might just have values closer to the goal values on some, or even all, of the dimensions.
Defining an evaluation function over problem states provides not only a way to recognize a plausible subgoal but also a way to generate or define plausible subgoals. Evaluation functions are single valued in the direction from a problem state to an evaluation vector for the problem state. (Multidimensional evaluation functions are still single valued so long as only one distinct evaluation vector is associated with each distinct problem state.) The inverse function may be multivalued, but that does not seriously reduce the value of this approach to generating plausible subgoals.

It may be that subgoals can always be determined to have intermediate values between the given and goal state according to some explicitly defined evaluation function. However, a person may frequently define subgoals without being able to state explicitly any relevant evaluation function. Thus, this book will generally discuss the application of the subgoal method to problems without attempting to describe any formal evaluation function.

In general, there are a multiplicity of plausible subgoals, some on a correct path to the goal and some not. As mentioned, even if the probability of a plausible subgoal being a true subgoal is only 0.1 or 0.01, the method is still reducing the search by an enormous factor in most problems. In any event, it is usually not very difficult to conjecture a variety of reasonably plausible subgoals, but the likelihood of defining a good subgoal will depend upon how good an evaluation function you have defined over problem states. In turn, how good your evaluation function is, how suitable it is to solving the problem at hand, often depends upon how adequately you represented the information in the problem (discussed in Chapters 3 and 10), the defining of macroactions, and the use of various other problem-solving methods.
Problem-solving methods are generally used in combination, and the combined power of several methods in reducing the search space can result in very fast solution of many problems with little trial-and-error search being required.

When you have defined two or more subgoals to be achieved in getting from the given state to the goal, you can make a logical distinction as to whether the subgoals must necessarily be achieved in a certain order or whether they can be achieved in any order. This simple logical distinction between ordered and unordered subgoals is illustrated in Fig. 6-2. In some problems it is obvious that one of the subgoals (SG₁) is closer to the given in terms of the evaluation function than is the other subgoal (SG₂), while the latter subgoal is closer to the goal than is the former subgoal. In such instances, the subgoals clearly should be achieved in a particular order. In other cases, while the achievement of two or more subgoals may constitute two components that are necessary in order to get to the goal from the givens, it is not obvious which subgoal is easier to achieve from the givens. In the latter case, you have a choice of the order in which to arrange the subgoals on a path from the givens to the goal. In these cases, if your first choice is not working out well, then you should switch to some other choice in ordering the subgoals. In some cases, where the ordering of subgoals is not immediately apparent at the outset of the problem, some orders of achieving subgoals may be easier to accomplish than others. Being aware of the distinction between ordered and unordered subgoals permits greater flexibility in the solution of problems involving unordered subgoals.
FIGURE 6-2 (panels: ordered subgoals; unordered subgoals) An example of ordered subgoals (SGᵢ) where there is a unique order in which the different subgoals must be achieved in getting from the givens to the goal versus an example of unordered subgoals where any order of achievement of the subgoals can lead to the goal.

If you have defined n subgoals (whether ordered or unordered), you have automatically defined n + 1 subproblems to be solved: namely, getting from the givens to one of the subgoals, getting from the first subgoal to the second subgoal, and so on, from the nth subgoal to the goal. Whether any particular ordering of the subgoals is required by the original evaluation function or is optional by that evaluation function, you may often be free to choose to work on any link in the chain first, second, and so on. By and large, it is advantageous to start with a subproblem of getting from the givens to one of the subgoals, or else to work on the problem of getting from one of the subgoals to the goal, because subgoals are frequently not fully defined problem states. It is usually preferable to work on a subproblem in which either the beginning or end state is completely specified. In most problems, this means starting from the givens is preferable, though in some cases working backward from the goal may be just as good or better. An additional advantage of starting with the givens is that you are simultaneously drawing inferences (see discussion of this method in Chapter 3) that you know are expanding the information you have available for the solution of the problem, no matter what the ultimate success of the particular subgoal approach that you have taken.
If you begin working on other subproblems that do not start from the given state, then a failure using the subgoal approach may not have generated information as useful as that which would have been generated by starting from the givens. All this is perhaps relatively obvious, but it is nevertheless important to bear in mind.

APPLICATIONS

Planning a trip across the country is a problem to which the subgoal method can be applied in a relatively trivial manner. If you wanted to travel from San Francisco to New York City, you might select Denver and Chicago as subgoals. Selection of such reasonable subgoals depends upon having an evaluation function defined over such cities (such as their two-dimensional coordinates on a map), which indicates that Denver and Chicago have intermediate values on the east-west coordinate compared to San Francisco and New York. In this case, it is primarily one of the two dimensions of the evaluation function that needs to be altered to get from the given state to the goal. But in going from Spokane, Washington, to Miami, Florida, by way of Denver and Memphis, Tennessee, you are substantially altering the values on both dimensions in going from the given state to the goal. Obviously, the subgoals are ordered in these trip-planning problems.

As an example of a problem that is extremely easy to solve using the subgoal method (with unordered subgoals), consider the following:

A light plane carrying three men crashes in the desert. The men decide that their best chance for survival consists of each of them setting out across the desert in different directions in hopes that one of the directions will pass by a sufficient number of oases to permit that man to reach civilization and get help for the others.
Before going their separate ways across the desert, they are faced with the problem of achieving an equal division of their stock of water and canteens. They have in their possession five canteens full of water, five canteens half-full of water, and five empty canteens. All canteens are the same size. Since water-carrying capacity is important should a man reach an oasis, they wish to divide both their supply of water and the number of canteens equally among themselves. How can they achieve this?

Stop reading and attempt to solve this problem.

If you were unable to solve it, consider the following hints. You might originally define three unordered subgoals consisting of attempting to divide the full canteens evenly among the three men, dividing the half-full canteens evenly among the three men, and dividing the empty canteens evenly among the three. It is immediately obvious that this subgoal approach will not work. An alternative definition of subgoals involves first making the inferences that the total quantity of water is 5 + 5/2, or 7½, canteens full of water and that the total number of canteens equals 15. From this you can conclude that, in the goal state, each person will have 2½ canteens full of water and 5 canteens. In essence, this defines a six-dimensional vector evaluation function such that in the given state each of the three persons has zero water and zero canteens and in the goal state each person has 2½ canteens full of water and 5 canteens. If you have not solved the problem, stop reading and attempt to define relevant subgoals.

The relevant subgoals are to attempt to give first one of the men (it obviously does not matter which one) 2½ canteens full of water distributed among five canteens.
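The ways of meeting this first subgoal are few enough to enumerate. The sketch below lists every bundle of (full, half-full, empty) canteens that gives one man 2½ canteens of water in 5 canteens, and then checks whether the remaining stock can still serve the other two men. It assumes that no water is poured from one canteen to another; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

STOCK = (5, 5, 5)                      # (full, half-full, empty) canteens
WATER_EACH = Fraction(5, 2)            # 2 1/2 canteens of water per man
CANTEENS_EACH = 5

def meets_subgoal(bundle):
    """Does one man's bundle hold 2 1/2 canteens of water in 5 canteens?"""
    full, half, empty = bundle
    return (full + half + empty == CANTEENS_EACH
            and full + Fraction(half, 2) == WATER_EACH)

def bundles(stock):
    """All bundles that can be drawn from the stock and meet the subgoal."""
    full, half, empty = stock
    return [b for b in product(range(full + 1), range(half + 1), range(empty + 1))
            if meets_subgoal(b)]

def others_can_be_served(first, stock=STOCK):
    """Assuming no pouring: can the other two men still reach their subgoals?"""
    rest = tuple(s - b for s, b in zip(stock, first))
    return any(
        meets_subgoal(tuple(r - s for r, s in zip(rest, second)))
        for second in bundles(rest)
    )

for bundle in bundles(STOCK):
    print(bundle, others_can_be_served(bundle))
```

Running the loop shows which first choices leave the other two subgoals achievable and which one blocks them.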
There are a number of ways of doing this, only one of which will make it impossible to achieve the other two subgoals of giving 2½ canteens full of water in 5 canteens to each of the other two men. The only way that prevents achievement of the goal is to give the first man the entire set of half-full canteens. Any other method of giving the first man 2½ canteens full of water and 5 canteens will permit achievement of the remaining two subgoals: namely, give the first man 1 full canteen of water, 3 half-full canteens of water, and 1 empty canteen, or give the first man 2 full canteens of water, 1 half-full canteen of water, and 2 empty canteens. Once the first subgoal has been achieved, it is trivially obvious whether or not it is possible to achieve the second and third subgoals.

Another problem that illustrates the power of the repeated use of the subgoal method is the following:

Nine men and two boys want to cross a river, using an inflatable raft that will carry either one man or the two boys. How many times must the boat cross the river in order to accomplish this goal? (A round trip equals two crossings.)

Stop reading and try to solve the problem, using the subgoal method.

Define as a subgoal the problem of getting one man across the river and getting the boat back to the starting side. Stop reading and attempt to solve the problem, if you have not already.

It takes exactly four crossings to get one man across the river and return the boat to the original side. First, the two boys cross the river in the boat, then one boy takes the boat back to the original side of the river, then a man takes the boat across the river, and then the second boy takes the boat back to the original side of the river. These four crossings put both boys and the boat in the same position they were in before they transported the first man across the river.
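The four-crossing cycle can be cross-checked by brute force. The sketch below runs a breadth-first search over the states of the problem, where a state records how many men and boys are still on the starting bank and which bank the raft is on. The state encoding and the function name are assumptions made for illustration; the allowed loads (one man, one boy, or both boys) follow the problem statement and the solution just described.

```python
from collections import deque

# Raft loads: one man, one boy alone, or both boys, written as (men, boys).
LOADS = [(1, 0), (0, 1), (0, 2)]

def min_crossings(men, boys=2):
    """Fewest one-way crossings that bring everyone to the far bank."""
    start = (men, boys, 0)             # people on the near bank; raft side (0 = near)
    goal = (0, 0, 1)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (m, b, side), steps = queue.popleft()
        if (m, b, side) == goal:
            return steps
        # People standing on the bank where the raft currently is.
        here_m, here_b = (m, b) if side == 0 else (men - m, boys - b)
        sign = -1 if side == 0 else 1  # leaving the near bank lowers its counts
        for dm, db in LOADS:
            if dm > here_m or db > here_b:
                continue
            nxt = (m + sign * dm, b + sign * db, 1 - side)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + 1))
    return None                        # no sequence of crossings works

# Each extra man should cost exactly one more four-crossing cycle.
print([min_crossings(k + 1) - min_crossings(k) for k in range(1, 9)])
```

If the subgoal analysis is right, every difference printed is 4, so the minimum for any number of men follows from the count for a single cycle plus the boys' final trip.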
Thus, to transport all nine men across the river will require 9 × 4, or 36, one-way crossings. At that point, both boys will be on the original side of the river with the boat, and one additional crossing will be required for them to get to the goal side of the river with the boat. Thus, a total of 37 one-way crossings are required in all.

An example of a somewhat similar subgoal problem in a probability context is provided by the following:

The ace, 2, 3, 4, 5, 6, 7, and 8 of hearts are placed face up in a row on the table. Then a pack of eight cards containing the ace, 2, 3, 4, 5, 6, 7, and 8 of spades is shuffled and placed in front of the player. As each successive spade is turned over, the corresponding heart is removed from the row. What is the probability that all the hearts can be removed without a break (hole) ever occurring in the row of hearts?

Stop reading and try to solve the problem.

Consider as a subgoal the probability that the first removed heart does not cause a break in the row. Stop reading and try to solve the problem, using this subgoal.

The probability of achieving the first subgoal (removing one heart without producing a break in the row) is exactly 1/4. This probability results from the fact that there are two end positions to the row, and only the two cards in these end positions may be removed without causing a break. Since there are eight cards in the row, the probability is 2/8. If you did not solve the problem before, stop reading and try to solve the original problem, having achieved the solution to the subgoal.

Once the first subgoal, removing one card from an end of the row, has been achieved (and its probability has been determined), the second subgoal should be to compute the probability that the second card will be removed from the end of the row.
If the first subgoal has been successfully achieved, there are still two cards at the ends, but now only seven cards in toto. Thus, the probability of successfully removing a second card from the ends of the row is 2/7. If you have not yet solved the entire problem, stop reading and try to complete the rest of the solution on your own.

Continue in this way, defining successive subgoals of removing cards from either end of the row until all cards have been removed. The probability that every subgoal will be successfully achieved with a random shuffling of the pack of spades is evidently 2/8 × 2/7 × 2/6 × 2/5 × 2/4 × 2/3 × 2/2, or 1/315. The probability of successful achievement of the entire set of necessary subgoals is simply the product of the probabilities of achieving each successive subgoal. Note that this probability problem represents a rather interesting variation in the use of the subgoal method, since there are, in essence, two parallel sets of subgoals involved in the problem. On the one hand, there is the series of subgoals of removing cards from an end of the row, progressively reducing the row in length without creating a hole. On the other hand, there is the series of subgoals of computing the probabilities of achieving each of these subgoals.

Now consider this rather different problem (previously discussed in Chapter 3), which also illustrates the use of the subgoal method:

Wanda the witch agrees to trade one of her magic broomsticks to Gaspar the ghost in exchange for one of his gold chains. Gaspar is somewhat skeptical that the broomstick is in working order and insists on a guarantee equal to the number of links in the gold chain. As a guarantee, he insists on paying by the installment plan, one gold link per day until the end of the 63-day period, with the balance to be forfeit if the broomstick malfunctions during the guarantee period.
Wanda agrees to this arrangement, but insists that the installment payment be effected by cutting no more than three links in the gold chain. Can this cutting be done, and, if so, what links in the chain should be cut? The chain initially consists of 63 closed gold links arranged in a simple linear order (not closed into a circle).

Stop reading and try to solve the problem.

Assume it is possible to solve the problem by making only three cuts. Obviously, if it is possible, then Gaspar and Wanda will have to make change on various days during the 63-day period. That is, they must exchange various lengths of chain on the different days, so that Wanda acquires one extra link each day, since it surely is not possible to separate the chain into 63 individual links by making only three cuts. If you have not solved the problem, stop reading and try again.

Still assuming that it is possible to solve the problem, note that, if it is possible, the solution will result in creating at least three single links of chain, as well as various other longer lengths of chain. If you have not solved the problem so far, stop reading and try again.

Having three individual chain links will permit payment of one link per day from days 1 to 3. Now as the first subproblem, you should determine the longest chain that can be used, with Wanda making change, in order to permit payment of an additional link on the fourth day. Obviously, the solution to this problem is to cut a chain that is four links long, since Wanda can return the three individual links. Then the second problem is to cut the longest chain that will permit payment when the three individual chain links and the four-link chain have been used up. Obviously, this chain would consist of eight links.
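The rule used in these two subproblems can be carried forward mechanically. The sketch below starts from the three single links and, each time the pieces already chosen are exhausted, adds the longest piece for which Wanda can still make change, namely one link more than the total of the pieces so far. It then checks that some set of pieces adds up to exactly d links for every day d. The function names and the subset-sum check are illustrative assumptions, not part of the original argument.

```python
def chain_pieces(total_links=63, single_links=3):
    """Greedy sketch: start with the cut single links, then repeatedly add the
    longest piece payable by change, i.e. one more than the total so far."""
    pieces = [1] * single_links
    while sum(pieces) < total_links:
        held = sum(pieces)
        pieces.append(min(held + 1, total_links - held))
    return pieces

def payable_every_day(pieces):
    """True if, for every day d, some subset of the pieces totals d links."""
    reachable = {0}
    for piece in pieces:
        reachable |= {r + piece for r in reachable}
    return all(day in reachable for day in range(1, sum(pieces) + 1))

pieces = chain_pieces()
print(pieces, payable_every_day(pieces))
```

Changing `single_links` shows how strongly the reachable total depends on the number of cuts allowed.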
Continue in this manner, defining as subgoals the making of change, using lengths of chain known to be part of the solution, until these are all given over to Wanda and a longer chain is required. Then determine what longer chain is required on that day. Thus, the solution of the problem is to have 3 individual links of chain, then a chain each of 4 links, 8 links, 16 links, and 32 links. Since, by inspection, this will require only three cuts (separating the 4 from the 8, the 8 from the 16, and the 16 from the 32), the problem is solved.

Note that the essential insight for solving the problem is to consider how to make change on each day of the 63-day period, starting from the first day and continuing through to the 63rd day, achieving these subgoals in order. In addition to the subgoal method, it is also important to the solution of this problem for you to make the inference from the goal that, when you have determined where to place the cuts in creating the longer lengths of chain, you will have also achieved three individual links. Thus, you should start the process of change making on the initial days, using the individual links until they are inadequate, and thereafter continue to use all the known lengths of chain until they are inadequate, at that point cutting off the largest length of chain that will solve the problem on that day. In a sense, the original problem is divided into 63 subgoals, that is, making correct change on each of the 63 days, though only a few of these days are special in that they require you to exchange one long piece of chain for all previously given shorter pieces of chain.

A very simple puzzle problem illustrates the use of the subgoal method in an entirely different context:

Five squares are inserted into a three-by-two rectangle, as illustrated in the given of Fig. 6-3.
Three of the squares have a label A, one square has label B, and one square has label C. Any square may be moved within the rectangle to an adjacent square, provided that the square moved into is empty. The problem is to make a sequence of moves so as to achieve the goal state, as illustrated in Fig. 6-3.

FIGURE 6-3 The given and goal states for the ABC puzzle.

Now make up five little squares of paper (or other tokens) that will fit in the rectangle in Fig. 6-3, and, by moving them around in the rectangle, attempt to solve the problem.

In attempting to solve the problem, you will find it helpful to try to define a subgoal state that is on the path from the givens to the goal by some evaluation function. Moving the ABC squares in the right four cells of the rectangle will not solve the problem, for the three squares can only be moved in a cyclic manner within the four squares, which will never change the relative ordering of the B and C squares, precisely what is required in the goal state. At some point, the B and C squares must be separated to the first and third columns of the rectangle in order to achieve a change in the cyclic order of the B and C squares. With this somewhat vague idea for a subgoal in mind, stop reading, attempt to define the subgoal more precisely, and then solve the problem, if you have not done so already.

A more specific definition of the subgoal of separating the B and C squares to opposite sides of the rectangle is illustrated in Fig. 6-4. Note that if the B and C squares are separated as in subgoal 1, it is relatively easy to move B next to C (subgoal 2) and then move the A's around in such a way that B could be on top of C in the third column. Thus, in the case of this particular subgoal, you can quickly verify that you could get from the subgoals to the goal and then attempt to reach subgoal 1 from the given state.
The latter problem is not too difficult and is left to you as an exercise.

FIGURE 6-4 A useful set of two subgoals for the solution of the ABC puzzle.

The relevant evaluation function for defining subgoal 2 is probably whether the cyclic order of B and C within the right four cells of the rectangle is BC or CB. Subgoal 2 shares the BC order with the goal state, whereas the cyclic order in the given state is CB. Since the BC ordering illustrated in subgoal 2 cannot be achieved from the prior state by moving B unless B is at some time moved out of the right four squares, you know that the preceding subgoal must have B in the extreme left-hand column of the rectangle. Subgoal 1 illustrates the simplest such possibility in relation to subgoal 2.

Surely one of the most remarkable simple examples of the use of the subgoal method comes in the Tower of Hanoi (disk transfer) problem. One version of the problem can be stated as follows:

There are three identical spikes and six disks, each with a different diameter but each having a hole in the center large enough for a spike to go through. At the beginning of the problem, the six disks are placed on one spike, one on top of another, with the largest disk on the bottom, then the next largest, and so on, in order of decreasing size until the smallest disk, which is on top. (See Fig. 6-5.) You are permitted to move only one disk at a time from one spike to another spike, with the restriction that a larger disk must never be moved on top of a smaller disk. The goal is to transfer all six disks to one of the other two spikes (without ever permitting a larger disk to rest on top of a smaller disk).

I believe I was once told some relatively routine mechanical procedure for solving this problem, but I do not remember it, since I was not given a proof that indicated why it worked.
However, a beautiful repeated hierarchical (recursive) use of the subgoal method provides a solution to this problem that dramatically illustrates the power of the subgoal method. Stop reading and try to solve the problem.

FIGURE 6-5 The given and goal states for the Tower of Hanoi (disk transfer) problem.

Apropos the problem-solving method (discussed in Chapters 3 and 10) of complete representation (naming) of all the concepts in a problem, the first step in solving this problem might be to give names (in this case numbers) to the disks in a manner that easily represents the one way in which they differ from each other, namely, in diameter. So let us number the disks 1 to 6 from smallest to largest. In addition, let us label the problem of transferring six disks from one spike to another a six-problem, which implicitly recognizes this problem as a particular case of a larger class of disk-transfer problems (five-problems, seven-problems, and so on). For convenience in verbal descriptions, let us also label the spikes A, B, and C. This representation of the problem is shown in Fig. 6-6. Now stop reading and try again to solve the problem, if you failed before.

FIGURE 6-6 The given and goal states for the Tower of Hanoi (disk transfer) problem, with numerical representation.

Having represented the problem in this way, I think that it is reasonably likely that one would think of the following elegant way to divide the problem into subgoals. Solving a six-problem from A onto C is equivalent to solving a five-problem from A onto B, moving the six-disk from A to C, and solving a five-problem from B onto C. The five-problems are equivalent to a four-problem, a move of the five-disk, and another four-problem.
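The decomposition of an n-problem into two (n - 1)-problems and a single move translates almost word for word into a recursive procedure. The sketch below is one way to write it down; the function names, the tuple format for a move, and the legality checker are illustrative assumptions, while the disk numbers and spike labels follow the representation in Fig. 6-6.

```python
def solve(n, source, target, spare, moves):
    """Append to moves the steps that transfer an n-problem from source to target."""
    if n == 0:
        return
    solve(n - 1, source, spare, target, moves)   # (n-1)-problem: uncover disk n
    moves.append((n, source, target))            # move disk n itself
    solve(n - 1, spare, target, source, moves)   # (n-1)-problem: restack on disk n

def is_legal(moves, n, spikes="ABC"):
    """Replay the moves, rejecting any that puts a larger disk on a smaller one."""
    stacks = {s: [] for s in spikes}
    stacks[spikes[0]] = list(range(n, 0, -1))    # largest disk at the bottom
    for disk, src, dst in moves:
        if not stacks[src] or stacks[src][-1] != disk:
            return False                         # disk is not on top of src
        if stacks[dst] and stacks[dst][-1] < disk:
            return False                         # larger disk onto a smaller one
        stacks[dst].append(stacks[src].pop())
    return stacks[spikes[-1]] == list(range(n, 0, -1))

moves = []
solve(6, "A", "C", "B", moves)
print(len(moves), is_legal(moves, 6))
```

The call stack does the bookkeeping of which level of subgoal is currently being worked on, which is the same information one would otherwise keep in notes on paper.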
In turn, the four-problems can be subgoaled into two three-problems and a move, and so on. Thus, the entire problem can be solved by a recursive use of the subgoal method. To actually implement the method, in this case, you must have the ability to remember what level of subgoal you are currently working on, but this problem can easily be solved by making some notes on a piece of paper. In any event, it is clear that this method solves the problem and, in addition, this subgoal method gives an excellent insight into the structure of the Tower of Hanoi problem.

Story-algebra problems frequently illustrate the usefulness of the subgoal method. Consider the following simple problem:

Each day, Abe either walks to work and rides his bicycle home or rides his bicycle to work and walks home. Either way, the round trip takes one hour. If he were to ride both ways, it would take 30 minutes. How long would a round trip take, if Abe walked both ways?

Stop reading and try to solve the problem by defining some simple subgoals.

The first subgoal you might define is to determine how long it takes Abe to ride one way. You can then determine, as a second subgoal, how long it takes to walk one way. Then it is trivially easy to determine how long it takes to walk both ways and to solve the problem. You note that the time to ride both ways is 30 minutes, from which it is obvious that the one-way riding trip requires 15 minutes. If these 15 minutes are subtracted from the one-hour round trip for walking plus riding, 45 minutes remain for the one-way walking trip. Doubling this yields a round-trip walking time of 90 minutes, which is the solution to the problem.

Most story-algebra problems are amenable to the subgoal approach.
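Abe's problem shows the pattern in miniature: each subgoal is an intermediate unknown, and each line of the sketch below settles one of them before the goal quantity is computed. The function and variable names are illustrative assumptions; the numbers are those given in the problem.

```python
def walking_round_trip(ride_round_trip, mixed_round_trip):
    """Minutes to walk both ways, reached through two intermediate subgoals."""
    ride_one_way = ride_round_trip / 2               # subgoal 1: ride one way
    walk_one_way = mixed_round_trip - ride_one_way   # subgoal 2: walk one way
    return 2 * walk_one_way                          # goal: walk both ways

# Riding both ways takes 30 minutes; walking one way and riding the other, 60.
print(walking_round_trip(ride_round_trip=30, mixed_round_trip=60))
```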
Instead of working directly to determine the value of the unknown quantity, you set subgoals of determining various other unknown quantities that are related to the goal quantity by some known relation. When all the unknown quantities except the goal quantity have been determined in the known relation, you can then use the known relation to solve for the goal quantity. You must also be able to represent the elements expressed in the story problem in algebraic (equation) form. However, after skill in algebraic representation, skill in defining subgoals is probably the next most important element in the solution of story-algebra problems.

For another simple example of the subgoal method applied to a simple story-algebra problem, consider the following:

Ingrid brings a quantity of hats to sell at the Saturday market. In the morning, she sells her hats for $3 each, grossing $18. In the afternoon, she reduces her price to $2 each and sells twice as many. What was Ingrid's gross income for the day from the sale of hats?

Stop reading and solve the problem by defining a simple sequence of subgoals.

The first subgoal is to determine how many hats are sold in the morning. From this, it is trivially easy to determine the number of hats sold in the afternoon, which is the second subgoal, and then the gross income for the day. The specific solution to the problem is as follows: If Ingrid grossed $18 in the morning by selling hats at $3 per hat, she evidently sold 6 hats. This implies that she sold 12 hats in the afternoon. Therefore, Ingrid grossed 6 × $3 + 12 × $2, or $42, for the day.

The subgoal method is also frequently useful in the solution of geometry problems, as in the following example:

Given the parallelogram ABCD illustrated in Fig. 6-7, prove that the perpendiculars AE and CF drawn to the diagonal BD are equal.
Stop reading and try to solve the problem by defining a relevant subgoal.

One very common way to prove two lines are equal is to prove that they are corresponding parts of congruent triangles. In the present case, this could mean either proving that triangle ABE is congruent to triangle CDF or proving that triangle AED is congruent to triangle CFB. These two alternative subgoals appear to be equivalent, and therefore we may arbitrarily choose to work on the subgoal of proving triangle ABE congruent to triangle CDF. Stop reading and attempt to solve the problem, possibly by defining a further subgoal.

FIGURE 6-7 Given AB = CD, BC = AD, AE ⊥ BD, and CF ⊥ BD, prove that AE = CF.

To prove triangle ABE congruent to triangle CDF, it is helpful to define a prior subgoal of proving that triangle ABD is congruent to triangle CDB. Stop reading and attempt to solve the problem, using this sequence of two subgoals.

Triangle ABD is evidently congruent to triangle CDB, since the three corresponding sides of each are equal. From this, we can conclude that angle ABD equals angle CDB in Fig. 6-7. From this, we can conclude that triangle ABE is congruent to triangle CDF, since both are right triangles and there are corresponding angles (besides the right angles) that are equal and the hypotenuses are equal. Now that these triangles have been proved congruent, side AE equals side CF by corresponding parts of congruent triangles, and the problem is solved.

Another geometry problem that is quickly solved by use of the subgoal method is the following:

Given the circle illustrated in Fig. 6-8, proceed from the circumference along a diameter of the circle for an arbitrary unknown distance, to point A, then turn perpendicular to the radius and draw a line connecting the radius to the circumference, point B.
Then erect another perpendicular at B until it intersects, at point C, the diameter perpendicular to the original diameter. The diameter of the circle is 100 feet. Determine the length of the line AC.

Stop reading and try to solve the problem by defining relevant subgoals.

At first, this problem may seem extremely difficult to solve, since very little numerical information is given in the problem. However, a reasonable subgoal for determining the length of the line AC is to determine some triangle to which ABC is congruent, where the length of one or more of the sides of the second triangle is known. Alternatively, you could attempt to determine some triangle congruent to the triangle AOC. Using this subgoal, attempt to solve the problem.

Of course, triangle ABC is congruent to triangle AOC, but this knowledge is not much help, since you do not know the lengths of any of the lines in either of these triangles. There are no other triangles drawn explicitly in Fig. 6-8. Thus, you will have to draw additional lines in order to define new triangles that may be congruent to the triangles already given in Fig. 6-8. Clearly, you should draw lines that result in triangles with one or more sides of known length. The only known lengths are the diameters and radii of the circle. Thus, the constructed triangle should evidently include a diameter or radius. Given this line of reasoning, sooner or later you should hit upon the idea of drawing the radius OB to define the triangles BOA and BOC, both of which are congruent to the original triangles ABC and AOC (easily proved). From this, we conclude that line AC equals line OB, by corresponding parts of congruent triangles, and, since line OB is a radius of the circle, we know that line AC equals the radius of the circle, which is 50 feet.
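The conclusion that AC does not depend on where A is placed can be checked numerically. The sketch below puts the center O at the origin and the first diameter along the x-axis, so that A = (OA, 0), B lies on the circle directly above A, and C is the foot of the perpendicular from B to the other diameter. The coordinate placement, the parametrization by the distance OA, and the function name are assumptions made for illustration.

```python
import math

def length_ac(radius, oa):
    """Length of AC in the construction, with A at distance oa from the center O."""
    ab = math.sqrt(radius**2 - oa**2)   # B is on the circle, directly above A
    a = (oa, 0.0)                       # A on the first diameter
    c = (0.0, ab)                       # C on the perpendicular diameter, level with B
    return math.dist(a, c)

# Whatever the position of A, AC should come out equal to the radius.
for oa in (0.0, 10.0, 25.0, 40.0, 49.9):
    print(oa, round(length_ac(50.0, oa), 6))
```

The check is no substitute for the congruence argument, but it confirms that the "arbitrary unknown distance" in the problem statement carries no information needed for the answer.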
FIGURE 6-8 Given the circle illustrated above with a diameter of 100 feet, determine the length of the line AC.

The common practice in mathematics of conjecturing and proving one or more lemmas as subgoals on the way to proving some major theorem is a good example of the use of the subgoal method. The skillful definition of lemmas (subgoals) to aid in proving a difficult theorem depends on having simple and complete representations of the relevant mathematical concepts and very good evaluation functions based on such elegant representations and experience in theorem proving. Some of this ability to represent concepts elegantly and define good evaluation functions can be gained by studying general problem-solving methods and applying them to problems requiring no specialized knowledge. However, you cannot expect to be able to prove difficult theorems in some area of mathematics without extensive studying of the concepts, assumptions, operations, and so on, in that area. To solve problems in a specialized area of knowledge, one must have learned certain elegant ways of representing concepts in that area. This knowledge is often required in order to define good evaluation functions for use in hill climbing and the subgoal method.

Another mathematical proof technique that is an ingenious use of the subgoal method is mathematical induction. Mathematical induction can be used to prove theorems that in some way involve natural numbers (positive integers). Let the goal expression you are trying to prove be represented by $E(n)$, where $n$ stands for any natural number ($n \ge 1$). The problem of proving $E(n)$ true for any natural number, $n$, can be divided into two subproblems: first, proving $E(n)$ true for $n = 1$, and, second, proving that if $E(n - 1)$ is true, then $E(n)$ is true.
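The two subproblems can be written down as two separate checks. The sketch below takes as $E(n)$ the statement that the first $n$ positive integers sum to $n(n + 1)/2$; the function names are hypothetical. It only spot-checks the base case and the step for a finite range of $n$, so it illustrates the structure of the method and is not a substitute for the proof.

```python
def closed_form(n: int) -> int:
    """Candidate formula n(n + 1)/2 for the sum 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def E(n: int) -> bool:
    """The statement E(n): the first n positive integers sum to n(n + 1)/2."""
    return sum(range(1, n + 1)) == closed_form(n)


def base_case_holds() -> bool:
    # First subproblem: E(1).
    return E(1)


def step_holds(n: int) -> bool:
    # Second subproblem at one particular n: E(n - 1) implies E(n).
    return (not E(n - 1)) or E(n)


checked_up_to = 200  # arbitrary cutoff chosen for this illustration
all_ok = base_case_holds() and all(step_holds(n) for n in range(2, checked_up_to + 1))
print(all_ok)
```

The design point is that neither function ever tests $E(n)$ for all $n$ directly: the base case and the step are separate subgoals, and only together do they cover every natural number.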
For example, consider a mathematical induction proof of the theorem that the sum of the first $n$ positive integers, $\sum_{i=1}^{n} i$, is $n(n + 1)/2$:

(1) $\sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}$, so true for $n = 1$.
(2) Assume true for $(n - 1)$: $\sum_{i=1}^{n-1} i = \frac{(n - 1)n}{2}$.
(3) Add $n$ to both sides: $\sum_{i=1}^{n} i = \frac{(n - 1)n}{2} + n$.
(4) Put over common denominator: $\sum_{i=1}^{n} i = \frac{(n - 1)n + 2n}{2}$.
(5) Factor: $\sum_{i=1}^{n} i = \frac{n(n - 1 + 2)}{2}$.
(6) Simplify: $\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$. Q.E.D.

Step (1) establishes that the theorem is true for $n = 1$, and steps (2) through (6) establish that if the theorem is true for $n - 1$, it is true for $n$.

7 Contradiction

As mentioned in Chapter 3, amateur problem solvers often do not pay enough attention to the goal or the set of possible goals as part of the information in a problem. They apply operations to the givens in an attempt to get to the goal, but they frequently do not consider applying operations to the possible goals in order to get to the givens or to meet the givens halfway.

In Chapter 3 we were concerned with inferences about the goal that could be made primarily from the partial information the solver already possessed about the goal, but also from given information or from information about both givens and the goal. Here we are also concerned with inferences that can be made from the goal in conjunction with the givens. However, the purpose of the types of inferences I will discuss here is quite different from the purpose of those in Chapter 3, where the purpose was to clearly specify the parts of the goal, so that we could more easily see exactly what was to be derived from the givens. By contrast, the purpose of the types of inferences to be discussed now is to derive an inference that contradicts some piece of given information. Deriving a contradiction proves that the goal could not possibly be obtained from the givens, since it is inconsistent with the givens.
This method of contradiction is appropriate for several types of problems in which you must decide which of two or more goals could be derived from the givens. The method of contradiction only tells you which goals cannot be derived from the givens. However, in some problems, the ability to decide whether a possible goal does or does not contradict the givens may be all that is required to solve the problem.

Problems for which the method of contradiction is appropriate include those where you must only determine whether a goal is consistent with the given information, not necessarily whether it could be derived from the given information using some particular set of operations. The method is also appropriate for problems that guarantee that exactly one of several alternative goals can be derived from the given information. Here if all alternatives but one can be ruled out, then that one must be derivable from the givens, and the method of contradiction constitutes a sufficient proof of it.

Many problems make the guarantee that one out of several alternative goals can be derived from the givens. In this chapter, I will discuss in four sections the problems that illustrate how the method of contradiction can be applied. The four sections are as follows.

Indirect Proof. The first section will be concerned with problems with only two or three alternative goals. It will focus on the method of indirect proof in mathematics, where the two alternative goals are usually that some statement is either true or false. We are not interested in whether we can always say a statement is either true or false but rather in the use of the method of contradiction in those cases where it is assumed that only these two alternatives can hold. In such cases, if a person can show that one of the two alternatives leads to a contradiction, then the other alternative has been proved.
Multiple Choice - Small Search Space. The second section will be concerned with the method of contradiction in problems involving a small (from two to 10) set of alternative goals that are mutually inconsistent (only one of the goals can be derived from the givens). In problems involving a small set of alternative goals, it is feasible to systematically apply the method of contradiction to every alternative goal. Examples of such problems include multiple-choice examination problems and certain logic problems.

Classificatory Contradiction - Large Search Space. The third section will be concerned with the use of the method of contradiction in problems in which there is a large, but discrete and finite, population of alternative goals. In these problems, it is generally not feasible to systematically search every alternative. It is necessary to devise some more efficient search procedure that contradicts large classes of alternative goals simultaneously. Problems in this category include the coin-weighing problem discussed earlier, many concept-attainment problems, and letter-arithmetic problems.

Classificatory Contradiction - Infinite Search Space. The fourth section will be concerned with the use of the method of contradiction in problems involving infinite (often continuous) populations of goals. In these problems, it is clearly impossible to contradict each goal individually; the solver must contradict infinitely large classes on the basis of some common property. An example of this case is provided by the half-interval search technique in the numerical solution for roots of equations.

INDIRECT PROOF

The method of indirect proof in mathematics is an extremely important example of the problem-solving method of contradiction.
To prove that a statement follows from certain givens, the method of indirect proof is to assume the contrary is true and show that the contrary statement, in combination with the givens, results in a contradiction. Therefore, since the contrary statement is false, the original statement must be true.

You will note that, in this case, for the method of indirect proof to be valid there must be only two possible alternatives: either the goal statement is true (can be derived from the givens) or it is false (the contradiction of the statement can be derived from the givens, but the original statement cannot). For the method of contradiction to be valid, the statement must be either true or false. The truth value of the statement cannot be undecidable. In addition, the set of given statements must themselves be free of internal contradiction; otherwise, contradictions could be derived from a possible goal in combination with the givens, not because of a contradiction between the goal and the givens, but because of a contradiction within the givens. However, the beginning student need not be very concerned with these limitations on the use of the method of contradiction. By and large, whenever it appears reasonable to use the method of contradiction, it is valid to use it.

There are, of course, innumerable examples of the use of indirect proof in every area of mathematics. Here is one example:

Given that you have already proved the theorem that all squares of nonzero integers are positive, prove that the equation $x^2 + 1 = 0$ has no integer solution.

Stop reading and attempt to prove the theorem, using the method of contradiction (indirect proof).

The first step in applying the method of contradiction to this problem is to assume the contradiction of the theorem, namely, that $x^2 + 1 = 0$ has an integer solution, $x = a$, where $a$ is an integer.
If you have so far not solved the problem, stop reading and try again.

Given that $x^2 + 1 = 0$, subtract 1 from both sides of the equation to get $x^2 = -1$. Now substitute $a$ for $x$, getting $a^2 = -1$. This result is a contradiction to the already proved theorem that the square of any nonzero integer must be positive (and $a = 0$ gives $a^2 = 0$, which is not $-1$ either).

A famous proof of the existence of irrational numbers also uses the method of contradiction. Rational numbers are numbers expressible by simple fractions, $m/n$, of integers $m$ and $n$, where $n$ is nonzero. To show the existence of irrational numbers, we need to show that there exists at least one such number, for example, $\sqrt{2}$.

Given an isosceles right triangle with sides of unit length, the Pythagorean Theorem asserts that the length of the hypotenuse equals the square root of the sum of the squares of the lengths of the sides; namely, $c = \sqrt{1^2 + 1^2} = \sqrt{2}$. Prove that the length of the hypotenuse of this triangle, namely, $\sqrt{2}$, is irrational.

Stop reading and try to solve the problem, using the method of contradiction.

The contradiction of the theorem is to assume that $\sqrt{2}$ is rational and therefore can be expressed as the ratio of two integers $m/n$, where both $m$ and $n$ are integers (greater than zero). Also, when we assume $\sqrt{2}$ equals $m/n$, we can assume that $m$ and $n$ have no common factors, since these common factors could already have been canceled out. If you have not solved the problem so far, stop reading and try again.

From the above, we derive that $2n^2 = m^2$, which implies that $m^2$ is even. This result in turn implies that $m$ is even ($m = 2p$, where $p$ is an integer). If you have still not solved the problem, stop reading and try again.

If $m = 2p$, we can substitute $2p$ for $m$ in the equation $2n^2 = m^2$, obtaining $2n^2 = (2p)^2 = 4p^2$. From this result we obtain $n^2 = 2p^2$, which implies that $n^2$ is even. Again, this result implies that $n$ is even (contains a factor of two).
However, we have now derived that both $m$ and $n$ are even (contain a common factor of 2), contradicting the hypothesis that $m$ and $n$ have no common factors. Thus, the contradiction is false, and $\sqrt{2}$ must be irrational.

A common feature of both these examples of indirect proofs that is characteristic of most proof problems to which the method of indirect proof is well suited is that the contradiction of the theorem permits a larger number of specific consequences to be derived from it than does the original statement of the theorem. This feature gives you a great deal more to work with by concentrating on the contradiction of the theorem than you would have by concentrating on the original statement of the theorem. It should be clear, then, why the method of contradiction is so useful in cases like this.

The method of contradiction is also used in proof problems where there are two or more incorrect alternatives to the correct theorem, each of which must be disproved by contradiction when combined with the givens. For example, consider a proof of the following theorem:

You are given three assumptions or previously proved theorems. (1) If $c > 0$ and $a = b$, then $ac = bc$. (2) If $c > 0$ and $a > b$, then $ac > bc$. (3) The law of trichotomy obtains: for any $a$ and $b$, one and only one of three alternatives holds: $a < b$, $a = b$, or $a > b$. Using these givens, prove that, if $c > 0$ and $ac < bc$, then $a < b$.

Stop reading and try to prove the theorem, using the method of contradiction.

To prove this theorem, you must test two incorrect alternatives to show that they result in contradictions, namely, $a > b$ and $a = b$. If you have not already proved the theorem, stop reading and try again.

First, let us assume $a > b$. If $a > b$, then with $c > 0$, we know that $ac > bc$. But this result is a contradiction to the given information that $ac < bc$, by the law of trichotomy.
Similarly, to rule out the alternative that $a = b$, we derive from $a = b$ and $c > 0$ that $ac = bc$, which contradicts the given information that $ac < bc$, by the law of trichotomy. Therefore, the only remaining possibility, by the law of trichotomy, is that $a < b$, which was to be proved. Note that we had to rule out two alternatives before we could conclude the theorem proved, although in this case the method of contradicting each alternative was extremely similar.

The method of indirect proof shows up in an enormous variety of problems. For example, recall that one essential part of the solution to the notched-checkerboard problem in Chapter 3 was to assume that there was a method of covering 62 squares with 31 dominoes. These 31 dominoes must cover 31 black squares and 31 white squares. From the given information, we can derive that removing the two diagonally opposite squares of the checkerboard will produce 32 squares of one color and 30 squares of the other color, resulting in a contradiction. Thus, there is no method of covering the 62 remaining squares with 31 dominoes.

Similarly, in the method used in Chapter 3 to establish the minimum number of cuts needed to solve various cube-cutting problems, we implicitly ruled out a smaller number of cuts by contradiction with the implicit given information that not more than one face of a subcube could be cut at a time.

The method of indirect proof is often useful in plane geometry proof problems. Indeed, high school plane geometry is usually the first opportunity for most students to become acquainted with the method of indirect proof. One reason why most people are so suspicious of the method of indirect proof when they first encounter it is that they encounter it so late.
I suspect they unconsciously feel that so basic a method of proof should have been explained to them much earlier in their lives, as indeed it should have been. Be that as it may, plane geometry proof problems often demonstrate the method of indirect proof, and the following is a particularly simple example:

Given the assumption that two distinct points determine one and only one straight line, prove that two lines can intersect at no more than one point.

Stop reading and try to prove this theorem, using the method of contradiction.

First, assume the contrary, namely, that there exist two lines that intersect in at least two points, A and B. If you did not solve the problem thus far, stop reading and try again.

Since the two straight lines intersect in points A and B, there are two distinct straight lines passing through the points A and B. However, this is contrary to the assumption that two points determine one and only one straight line. Thus, the contrary of the theorem is contradicted, and so the theorem is proved.

Finally, consider the following plane geometry problem as an example of the use of indirect proof:

You are given two assumptions or previously proved theorems. (1) A straight line is a 180° angle. (2) Two lines are perpendicular if they make a 90° angle where they intersect. From these assumptions, prove that from a point on a line, only one perpendicular line can be erected.

Stop reading and try to prove this theorem, using the method of contradiction.

To prove this geometric theorem it is useful, as it almost invariably is in plane geometry problems, to construct a figure. Consider Fig. 7-1. To prove that at most one line can be perpendicular to another line at a given point, assume the contrary, namely, that at least two lines can be drawn perpendicular to a given line through a given point. In Fig.
7-1 this assumption is represented by the two lines drawn through point A and represented as being perpendicular to line C. If you have not yet proved the theorem, stop reading and try again.

According to the hypothesis that the two perpendiculars are distinct, there is some angle $\alpha$ between them, $\alpha > 0$. Each of the perpendiculars forms a 90° angle with line C. Thus, the straight line C equals $90° + \alpha + 90° > 180°$, which is contrary to the assumption that a straight line is a 180° angle.

FIGURE 7-1 Figure to prove by the method of contradiction that, at a given point, A, on a line, C, only one perpendicular can be erected.

MULTIPLE CHOICE - SMALL SEARCH SPACE

Besides being useful for proving theorems, the method of contradiction is useful in the solution of a wide variety of other problems where there are usually more than two alternatives. Whenever you are guaranteed that exactly one of a small set of alternative goals is consistent with (or follows from) the given information, it is possible to determine which goal, by systematically examining each and deriving a contradiction from all but one of them.

Problems given on tests with multiple-choice answers have so few choices (five or less) that contradiction is frequently the ideal solution method. Simply take each alternative answer in turn and determine whether it is consistent with the given information. That is, combine each possible answer with the given information to attempt to derive a contradiction. If you can derive contradictions for all the answers except one, then that remaining answer is correct. For example, consider the following potential exam problem:

The solution of $\sqrt{7x - 3} + \sqrt{x - 1} = 2$ is: (A) $x = 3$, (B) $x = $ [?], (C) $x = 2$, (D) $x = 1$, (E) $x = 0$.

Stop reading and try to solve this problem, using the method of contradiction.
The slow way to solve the problem is to perform various operations on both sides of the equation (adding, subtracting, multiplying, dividing, squaring both sides of the equation). The fast way is to substitute each of the alternative values of $x$ into the equation and see which ones work. In this case, only $x = 1$ is consistent with the equation, so (D) is the answer.

Solving for the values of one or more variables that satisfy one or more equations is a primary example of problems where often the givens and the goal should be combined. In these problems, there may be several values of a variable, or several sets of values of the several variables, that satisfy the equation or equations. You are being asked to determine one or more such solutions that satisfy the equations, which is in essence saying that consistency of the givens and goal statements is all that is demanded. When you encounter a problem like this on a multiple-choice test, with a small number of choices for solutions of the equations, then the ideal problem-solving method is contradiction. You simply try each of the alternative solutions in turn to see if it satisfies the equations (that is, gives an identity like $a = a$ for all equations) when the set of values of variables is substituted into the equations. All of the incorrect answers will produce contradictions such as $5 = 3$. For example, consider the following problem:

Which of the following is a solution of the cubic equation, $x^3 + 4x^2 - 7x - 10 = 0$? $x$ equals: (A) $-2$, (B) $-5$, (C) 4, (D) 3, (E) none of these.

Stop reading and try to solve this problem, using the method of contradiction.

In this instance, you can factor the cubic into three linear factors, yielding three real roots. However, a much faster way to solve the problem is to check each of the first four specific alternative answers for consistency with the cubic equation.
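Checking each alternative for consistency amounts to one substitution per answer. A minimal sketch, with hypothetical names `cubic`, `choices`, and `consistent`:

```python
def cubic(x: int) -> int:
    """Left-hand side of x^3 + 4x^2 - 7x - 10 = 0."""
    return x**3 + 4 * x**2 - 7 * x - 10


# Alternatives (A) through (D); (E) "none of these" is what remains if all four fail.
choices = {"A": -2, "B": -5, "C": 4, "D": 3}

# An alternative is consistent with the given equation when substitution gives 0 = 0.
consistent = [label for label, x in choices.items() if cubic(x) == 0]

for label, x in choices.items():
    print(label, x, cubic(x))
print(consistent if consistent else ["E"])
```

Any alternative whose substituted value is nonzero yields a contradiction of the form "nonzero = 0" and is eliminated; no factoring of the cubic is needed.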
Since one of the answers, namely, $x = -5$, is consistent with the cubic equation, and all the rest are not, we know that (B) is the correct answer. Since all you are asked is whether a particular alternative goal is consistent with the given information (the cubic equation), determining one answer that is consistent is sufficient to solve the problem. You need not actually derive contradictions in the case of alternatives (C) and (D), except as a check that your determination of consistency in the case of alternative (B) was correct. If alternative (E) was an expression such as "several of these," it would be necessary to derive contradictions to all but one of the first four alternative answers in order to rule out this alternative.

Contradiction is useful in examinations on a very wide variety of problems. For example, consider the following:

The base of our number system is 10. If the base were changed to four you would count as follows: 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, and so on. The 22nd number in the base-four system is: (A) 22, (B) 37, (C) 64, (D) 104, (E) 112.

Stop reading and try to answer this question, using the method of contradiction.

In a number system with base of four, the digits 4, 5, 6, 7, 8, and 9 cannot appear. Thus, alternative answers (B), (C), and (D) could not possibly be correct. In addition, the number 22 has already been used prior to achieving the 22nd number in the base-four system. Thus, the only possible answer of the five that could be correct is answer (E), which is 112.

Some problems are necessarily solved by contradiction. For example, consider the following problem:

The formula expressing the relationship between $x$ and $y$ in the table is:

x    1    2    3    4    5
y    2    3    6   11   18

(A) $y = 2x + 1$, (B) $y = -x^3 + 2x^2 + 1$, (C) $y = x^4 - 2x^3 + 3x^2 - x + 1$, (D) $y = x^2 - 2x + 3$, (E) $y = x^2 + 1$.
Stop reading and try to solve this problem, using the method of contradiction.

The correct answer is alternative (D), and there is really no other way to determine that (D) is the correct answer except by using the method of contradiction. Infinitely many different functions are consistent with any finite number of points, and infinitely many different functions are inconsistent with any finite number of points. All we can do is to determine which functions are consistent and which are inconsistent by checking for contradictions between the proposed functions (goals) and the given information about points on the function.

No one would object to your using the method of contradiction to solve such problems in an examination situation and, of course, to using it in the case of indirect proof. However, I have occasionally heard objections to its being used in some of the above examples of multiple-choice problems, where there are direct methods for determining the goal from the given information. Some teachers have even protested that for students to use the answers and look for contradictions is mildly immoral, that it "educates people to be test takers." It is true that such students are not demonstrating their knowledge of the direct algorithmic methods for obtaining the goal from the given information. However, I think we must face the fact that these types of test questions simply do not adequately assess a student's knowledge of algorithmic methods, because the method of contradiction can be used in place of the algorithms. Some students are inevitably going to use the method of contradiction whether anybody tells them about it or not, and this only introduces an extra source of noise in the relative assessment of understanding of different students. Test questions must be made fool-proof; it will not do to ask students to be fools.
Furthermore, there are many times when a teacher wants to assess students' knowledge of certain specific mathematical concepts with questions that can be answered using the method of contradiction, and it is either not possible or the teacher does not care to assess their understanding of any algorithmic method for generating the solution.

Another class of problems that involves a search among a small population of alternative goals is the class of recreational logic problems that make up such a large part of problem books. In many of these problems it is difficult to make inferences from the givens to the goal, but it is generally quite easy to test any given assumption about the goal for consistency with the given information. Since the number of alternative goals is often quite small, the method of contradiction is ideally suited to the solution of such problems. Some of the most interesting recreational logic problems, and the most difficult to solve by inferences from the givens to the goal, are problems involving the possibility that some of the given information is false. These are the famous liar and truth-teller (truar) problems. We discussed one such problem in Chapter 3 in connection with the need for having a clear understanding of the goal; let us consider it again here from the standpoint of the method of contradiction:

The country of Marr is inhabited by two types of people, liars and truars (truth tellers). Liars always lie and truars always tell the truth. As the newly appointed United States ambassador to Marr, you have been invited to a local cocktail party. While consuming some of the native spirits, you are engaged in conversation with three of Marr's most prominent citizens: Joan Landill, Shawn Farrar, and Peter Gant. At one point in the conversation Joan remarks that Shawn and Peter are both liars.
Shawn vehemently denies that he is a liar, but Peter replies that Shawn is indeed a liar. From this information, can you determine how many of the three are liars and how many are truars?

Stop reading and try to solve this problem, using the method of contradiction.

You will recall from Chapter 3 that all we need to determine is how many of the three are liars, namely, whether there are zero, one, two, or three liars among the three people. However, in order to determine this number, it is useful to consider all eight possibilities for the liar-versus-truar status of Joan, Shawn, and Peter, namely: all three are liars; all three are truars; Joan and Shawn are liars, but Peter is a truar; Joan and Peter are liars, but Shawn is a truar; Shawn and Peter are liars, but Joan is a truar; Joan is a liar, but Shawn and Peter are truars; Shawn is a liar, and Joan and Peter are truars; and, finally, Peter is a liar, and Joan and Shawn are truars. It is easy to test the consistency of each of these eight possibilities with the given information. For example, all three cannot be truars, since Joan would not then say that both Shawn and Peter were liars. All three cannot be lying, since Peter would not then say that Shawn was a liar. We can also rule out each of the three possibilities in which there are one liar and two truars in the group. Of the remaining three possibilities, we can rule out the possibility that Joan is a truar and both Shawn and Peter are liars, but we cannot find a contradiction to either of the other two possibilities, namely, that Joan and Shawn are liars and Peter is a truar, or that Joan and Peter are liars and Shawn is a truar.
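The test of all eight possibilities is mechanical enough to sketch in a few lines. The encoding below is an assumption made for illustration: `True` stands for truar and `False` for liar, and a possibility survives only if every truar's remark is true and every liar's remark is false. The names `is_consistent`, `survivors`, and `liar_counts` are hypothetical.

```python
from itertools import product

PEOPLE = ("Joan", "Shawn", "Peter")


def is_consistent(joan: bool, shawn: bool, peter: bool) -> bool:
    """True means truar, False means liar."""
    joan_claim = (not shawn) and (not peter)  # "Shawn and Peter are both liars"
    shawn_claim = shawn                       # "I am not a liar"
    peter_claim = not shawn                   # "Shawn is a liar"
    # A truar's claim must be true and a liar's claim must be false.
    return joan == joan_claim and shawn == shawn_claim and peter == peter_claim


survivors = [status for status in product((True, False), repeat=3) if is_consistent(*status)]
liar_counts = {status.count(False) for status in survivors}

for status in survivors:
    print({name: ("truar" if t else "liar") for name, t in zip(PEOPLE, status)})
print(liar_counts)
```

Two possibilities survive, and both contain exactly two liars, so the set of liar counts has a single element even though the individual identities remain undetermined.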
As discussed in Chapter 3, the inability to decide between these two possibilities is of no consequence to the solution of the original problem, since all we were asked to determine was how many of the three are liars. Under either of the two possibilities that are not contradictory with the given information, there are exactly two liars and one truar, which is the answer to the problem.

Instead of examining all eight combinations of liar-and-truar status for each of the three people, it is possible to use the method of contradiction somewhat more efficiently by considering various classes of the eight alternatives. For instance, a judicious choice would be to consider the class of possibilities in which Joan is a truar. All of the four members of this class can be shown to be contradictory to the given information, since then both Shawn and Peter must be liars and Peter would then be telling the truth in calling Shawn a liar, which is a contradiction. This classificatory use of the method of contradiction is discussed more extensively in the next section. In problems involving search through only a small set of alternatives, it is usually quickest to test each of the possibilities individually for consistency with the given information.

Another liar-truar problem that simply illustrates the usefulness of the method of contradiction is as follows:

The Nelsons have gone out for the evening, leaving their four children with a new babysitter, Nancy Wiggens. Among the many instructions the Nelsons gave Nancy before they left was that three of their children were consistent liars and only one of them consistently told the truth, and they told her which one. But in the course of receiving so much other information, Nancy forgot which child was the truar. As she was preparing dinner for the children, one of them broke a vase in the next room. Nancy rushed in and asked who broke the vase.
These were the children's statements:

Betty: Steve broke the vase.
Steve: John broke it.
Laura: I didn't break it.
John: Steve lied when he said I broke it.

Knowing that only one of these statements was true, Nancy quickly determined which child broke the vase. Who was it?

Stop reading and try to solve the problem, using the method of contradiction.

There are two possible approaches to this problem. First, we might try to test each of the four possibilities for who broke the vase. This approach appears to be the most direct way to the goal; however, it will not work until we first determine which of the four is telling the truth and which three are lying. When the liar-versus-truar status of the four children has been determined, it is trivial to determine who broke the vase. Thus, to successfully apply the method of contradiction to the problem, we should test the four possibilities in regard to which of the children is a truar. If you did not solve the problem before, stop reading and try again, using this indirect application of the method of contradiction.

Betty cannot be the truar, since then both Betty and Laura would be telling the truth, contrary to the information that only one can be telling the truth. For the same reason, Steve could not be telling the truth, since then both Steve and Laura would be telling the truth. Laura cannot be telling the truth because then, if John is lying, Steve is telling the truth, contrary to the information that only one child can be telling the truth. The only possibility that is consistent with the given information is that John is telling the truth and Betty, Steve, and Laura are lying. Given this, it is trivial to determine that Laura must be the one who broke the vase.
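The same elimination can be sketched as a search over pairs of hypotheses: who the truar is, and who broke the vase. The encoding of each statement as a function of the culprit is my own reading of the four remarks, and the names `STATEMENTS` and `solutions` are hypothetical.

```python
CHILDREN = ("Betty", "Steve", "Laura", "John")

# Each statement expressed as a function of who actually broke the vase.
STATEMENTS = {
    "Betty": lambda culprit: culprit == "Steve",  # "Steve broke the vase."
    "Steve": lambda culprit: culprit == "John",   # "John broke it."
    "Laura": lambda culprit: culprit != "Laura",  # "I didn't break it."
    "John": lambda culprit: culprit != "John",    # "Steve lied when he said I broke it."
}

solutions = []
for truar in CHILDREN:        # hypothesis about who tells the truth
    for culprit in CHILDREN:  # hypothesis about who broke the vase
        # The truar's statement must be true and every other statement false.
        if all(STATEMENTS[child](culprit) == (child == truar) for child in CHILDREN):
            solutions.append((truar, culprit))

print(solutions)
```

Only one pair survives, with John as the truar and Laura as the culprit, in agreement with the argument above. Every other pair makes either two statements true or none.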
One of my all-time favorite recreational logic problems is the famous Smith, Jones, and Robinson problem:

Smith, Jones, and Robinson are the brakeman, fireman, and engineer of a train, not necessarily respectively. Today only three passengers are riding this train, and, by an extraordinary coincidence, their last names are the same as the last names of the brakeman, fireman, and engineer. To distinguish the passengers from the trainmen, let us refer to the passengers with the title Mr.: Mr. Smith, Mr. Jones, and Mr. Robinson. Here is some other relevant information:

(A) Mr. Robinson lives in Detroit.
(B) The brakeman lives halfway between Chicago and Detroit.
(C) The passenger who lives in Chicago has the same name as the brakeman.
(D) The brakeman's next-door neighbor, one of the passengers, earns exactly three times as much as the brakeman.
(E) Mr. Jones earns exactly $2,000 a year (and collects a lot of food stamps and welfare payments).
(F) Smith beat the fireman at billiards.

Who is the engineer?

Stop reading and try to solve the problem, using the method of contradiction.

The most direct application of the method of contradiction to this problem would be to test the three possibilities for the name of the engineer (Smith, Jones, or Robinson) against the given information. As in the previous problem, this most direct approach is not the best, since none of the six statements of information includes any reference to the engineer. Thus, it is obvious that if we are to determine the name of the engineer, we must consider some more indirect approach, which first involves determining who might be the brakeman or the fireman, who might live next door to whom, who might live in what city, and so on. If you did not solve the problem thus far, stop reading and try again.
A minimal expansion of the search space of alternatives, using the method of contradiction, is to consider each of the six possibilities for the assignment of names to the brakeman, fireman, and engineer, as illustrated in the following table (S = Smith, J = Jones, R = Robinson):

Hypothesis   1   2   3   4   5   6
Brakeman     S   S   J   J   R   R
Fireman      J   R   S   R   S   J
Engineer     R   J   R   S   J   S

Now we examine the six pieces of information to determine which of these six possibilities produces a contradiction and therefore can be eliminated from consideration. In the first place, hypotheses 3 and 5 can be eliminated, because condition (F) says that Smith beat the fireman at billiards; assuming a person cannot beat himself, Smith cannot be the fireman. All but one of the remaining four possibilities can be eliminated by verbal reasoning, but it can be a little confusing.

A great deal of the information in the problem concerns the passengers and, in particular, where they live. Thus, it probably would be helpful to go a step further away from the direct approach to the goal and to try to test various possibilities for the assignment of passengers' names to locations. If you have not solved the problem already, stop reading and try again, using the method of contradiction as applied to the various possibilities for home addresses of the three different passengers.

There are three home addresses for the passengers: Chicago, Detroit, and halfway between Chicago and Detroit. Furthermore, one and only one passenger lives in each of the three locations. Since the given information says that Mr. Robinson lives in Detroit, there are only two remaining possibilities for the complete assignment of passengers to home addresses: either Mr. Jones lives in Chicago and Mr. Smith lives between Chicago and Detroit or else Mr. Smith lives in Chicago and Mr.
Jones lives between Detroit and Chicago. Since Mr. Jones earns exactly $2,000, and $2,000 is not divisible by 3, and the brakeman's next-door neighbor earns exactly three times as much as the brakeman, Mr. Jones cannot live next door to the brakeman (halfway between Chicago and Detroit). Thus, Mr. Jones must live in Chicago and Mr. Smith must live halfway between Detroit and Chicago. Since Mr. Jones lives in Chicago, the brakeman is Jones by statement (C). This result eliminates, by contradiction, alternatives 1, 2, 5, and 6 in the assignment of names to the three positions of brakeman, fireman, and engineer. Since we already ruled out alternative 3, we are left with only alternative 4, consistent with the given information. Thus, Smith is the engineer (Jones is the brakeman and Robinson the fireman).

It often facilitates work on recreational logic problems of this type to set up various tables representing what goes with what. In the present instance, there are two useful tables. First is a table such as the following, involving the assignment of the names (Smith, Jones, Robinson) to the positions (brakeman, fireman, engineer):

            Brakeman   Fireman   Engineer
Smith
Jones
Robinson

In addition, it is useful to set up a table assigning passengers' names to home addresses, as follows:

                  Mr. Smith   Mr. Jones   Mr. Robinson
Chicago
Detroit
Halfway between

When you acquire a piece of information such as that Smith cannot be a fireman, you enter no in the box of the table appropriate to Smith being a fireman. When you know from given information that Mr. Robinson lives in Detroit, you enter yes in that box of the relevant table; you also enter no in every other box in the same row or column of the table, since there can be only one yes in each row or column of such logic tables.
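The two tables correspond to two small assignment spaces, six job assignments and six home assignments, which can be searched exhaustively. The following is a minimal Python sketch, not part of the original text; the function names and the encoding of statements (A) through (F) are illustrative assumptions. In particular, statements (B), (D), and (E) are combined into the single test that Mr. Jones does not live halfway between the two cities, on the text's assumption that salaries are whole dollars.

```python
from itertools import permutations

NAMES = ("Smith", "Jones", "Robinson")
JOBS = ("brakeman", "fireman", "engineer")
HOMES = ("Chicago", "Detroit", "halfway")


def consistent(job_of, home_of):
    """Return True if the two assignments contradict none of statements (A)-(F).

    job_of maps a trainman's name to his job; home_of maps a passenger's
    surname (Mr. Smith, Mr. Jones, Mr. Robinson) to his home.
    """
    brakeman = next(name for name in NAMES if job_of[name] == "brakeman")
    chicago_passenger = next(name for name in NAMES if home_of[name] == "Chicago")
    if home_of["Robinson"] != "Detroit":      # (A)
        return False
    if home_of["Jones"] == "halfway":         # (B), (D), (E): $2,000 is not divisible by 3
        return False
    if chicago_passenger != brakeman:         # (C)
        return False
    if job_of["Smith"] == "fireman":          # (F)
        return False
    return True


def solve():
    """Collect every (jobs, homes) pair that survives all six statements."""
    solutions = []
    for jobs in permutations(JOBS):
        for homes in permutations(HOMES):
            job_of = dict(zip(NAMES, jobs))
            home_of = dict(zip(NAMES, homes))
            if consistent(job_of, home_of):
                solutions.append((job_of, home_of))
    return solutions


if __name__ == "__main__":
    for job_of, home_of in solve():
        print(job_of, home_of)
```

Of the 36 combined hypotheses, only one survives, the same assignment reached by the verbal argument.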
It is the restriction to only one yes in a row or column that permits rather powerful use of this tabular representation: whenever you have a yes in a row or column, you can fill in the rest of both the row and the column with nos; whenever you have two nos in a row or column, you can fill in a yes in the remaining position in that row or column. Tabular representation permits us to draw inferences quite mechanically from previous inferences that are recorded in the table, avoiding complicated verbal reasoning and the possibility of memory loss. The finished versions of these two tables for solution of this Smith, Jones, Robinson problem are shown in Fig. 7-2.

            Smith   Jones   Robinson
Brakeman    No      Yes     No
Fireman     No      No      Yes
Engineer    Yes     No      No

            Mr. Smith   Mr. Jones   Mr. Robinson
Chicago     No          Yes         No
Detroit     No          No          Yes
Between     Yes         No          No

FIGURE 7-2 Final tables for solution of Smith, Jones, and Robinson problem.

A final problem of a completely different kind that illustrates the usefulness of the method of contradiction is a spatial-puzzle problem that I have called the bowling-pin reversal problem:

Six-year-old Heather Phillips set up the ten pins for her bowling game at the end of the hall in a manner exactly opposite to the correct configuration. Before Heather could throw the bowling ball down the hall, her father informed her that she had set up the pins in the wrong manner and that the pins should have the row of one pin in front, followed by the row of two pins, followed by the row of three, and, finally, the row of four in the back. Although Heather is given to childish reversal errors of this type when she forgets to put on her thinking cap, she is actually a budding mathematical genius.
So, upon being informed of her error, Heather quickly put her thinking cap back on, ran down the hall, and, by moving just three pins, was able to reverse the configuration from the given state to the goal state, as illustrated in Fig. 7-3. How did she do it? (By the way, Heather assumed that the exact placement of the pins on the floor was not important, so long as the relative placement of the pins with respect to each other was correct. You should assume this also.)

Stop reading, put on your thinking cap (if you do not have it on already), and try to solve the problem, using the method of contradiction (not just random trial and error).

    Given              Goal

      @              @ @ @ @
     @ @              @ @ @
    @ @ @              @ @
   @ @ @ @              @

FIGURE 7-3 Given and goal states for the bowling-pin reversal problem.

To apply the method of contradiction, we need to have a well-defined set of possibilities. The smallest such set of possibilities is to ask where the row of four pins will be in the goal state with respect to its position in the given state. There are six logical possibilities for this: the row of four pins in the goal state will be above the row with one pin in the given state; the row with one pin in the given state will become the row with four pins in the goal state; the row with two pins in the given state will become the row with four pins in the goal state; the row with three pins in the given state will become the row with four pins in the goal state; the row with four pins in the given state will remain the row with four pins in the goal state; or the row of four pins in the goal state will be below the row with four pins in the given state. If you have not solved the problem thus far, stop reading and try again, using the method of contradiction to eliminate all but one of the six possibilities.
Clearly, if the row of four pins were either above the row of one pin or below the row of four pins in the given state, we would have to move more than three pins in order to achieve this aspect of the goal state alone. Thus, we have contradicted these two possibilities. To make the row of one pin in the given state the row of four pins in the goal state would require a minimum of three moves to achieve that subgoal alone; in addition, the row of two pins would then have to become the row of three pins, producing already more than three moves. This arrangement cannot be the desired solution. If the row of four pins in the given state is to remain the row of four pins in the goal state, all six pins above it would have to be moved, contradicting the requirement of the proposed solution. Finally, if the row of three pins were to become the row of four pins in the goal state, all three pins above the row of three would have to be removed, and one of the pins in the row of four would have to be moved as well, contradicting the restriction to only three moves. This result leaves only the possibility of making the row of two pins in the given state into the row of four pins in the goal state, producing the solution in a fairly direct manner: the two extreme pins in the row of four are moved to the two extreme positions in the row of two, and the top pin in the given state is moved to the middle of the row, below all the other pins, achieving the goal state.

Probably the most common way of solving this problem is not to use the method of contradiction but rather to look for a subset of seven pins in the given state in identical positions to seven pins in the goal state.
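The search for a shared subset of pins can be carried out by sliding the goal triangle over the given triangle and counting coincident pins. The following is a minimal Python sketch, not part of the original text; the doubled horizontal coordinates and the names `triangle` and `best_overlap` are illustrative assumptions.

```python
def triangle(point_up):
    """Return the ten pin positions as (x, row) pairs.

    Horizontal coordinates are doubled so that the staggered rows land on
    integers. Row 0 is the top row of the figure.
    """
    pins = set()
    for row in range(4):
        count = row + 1 if point_up else 4 - row
        for k in range(count):
            pins.add((2 * k - (count - 1), row))
    return pins


def best_overlap(given, goal):
    """Try every translation of the goal and keep the one sharing most pins."""
    best = (0, (0, 0), set())
    for dy in range(-3, 4):
        for dx in range(-6, 7):
            shifted = {(x + dx, y + dy) for x, y in goal}
            shared = given & shifted
            if len(shared) > best[0]:
                best = (len(shared), (dx, dy), given - shifted)
    return best


if __name__ == "__main__":
    given = triangle(point_up=True)    # one pin on top, four at the bottom
    goal = triangle(point_up=False)    # four pins on top, one at the bottom
    shared, shift, to_move = best_overlap(given, goal)
    print("pins left in place:", shared)
    print("pins to move:", sorted(to_move))
```

With the goal shifted down one row, seven pins coincide and the three left over are the top pin and the two end pins of the row of four.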
In general, if you did not know the minimum number of moves to transform the given state into the goal state in a problem of this type, you might look for the maximum subset of entities in the given state that were in identical positions relative to one another to positions in the goal state. Implementation of this method is largely a perceptual method of scanning the given and goal states looking for matches of (usually compact) subsets.

CLASSIFICATORY CONTRADICTION: LARGE SEARCH SPACE

In discussing the method of contradiction above, we were able to conceptualize the problems so that there was a relatively small population of alternative goals to decide among by the method. In the problems here, however, the number of alternative specific goals is so large that contradicting them one at a time would be impractical. In such cases, we must use some efficient search strategy for contradicting large subgroups of alternative goals at a time. To implement this more efficient search, some explicit or implicit classification must be imposed on the alternative goals, and classes of goals must be contradicted on the basis of common properties possessed by all of them. In addition, there is often some natural ordering for the contradiction of different classes of goals such that it is easiest to rule out a particular class in the beginning, some other class next, another class next, and so on. The ruling out of earlier classes of goals provides the additional information necessary for contradicting subsequent classes of goals. Attempts to rule out classes of goals in orders other than the natural or easiest ordering will usually be extremely difficult or impossible.

The coin-weighing problems discussed in Chapters 3 and 5 are examples of the method of classificatory contradiction.
Recall that in the simplest of these problems, you must determine which of n coins is the heavy coin, using a beam balance. Weighing one group of coins against another provides information that contradicts a large class of possibilities with respect to which coin is the heavy coin. Whether you consider these problems to exemplify the contradiction of one set of alternatives or the implication of the complementary set of alternatives is obviously completely arbitrary.

A class of problems somewhat similar to the coin-weighing problems in the need for classificatory contradiction of alternative goals is the class of concept-attainment problems, of which the following is one example:

You are given a set of six-place numbers (for example, 792,674, which is to be read as 7 in place 1, 9 in place 2, and so on), some of which are examples of the concept and some of which are not. Concepts are either simple concepts of the form "concept is $d$ in place $p$" (that is, a particular digit $d$ in a particular place $p$) or conjunctive concepts of the form "concept is $d_1$ in place $p_1$ and ... and $d_i$ in place $p_i$" (that is, a conjunction of digits in particular places). If the concept were 9 in place 2 and 7 in place 5, then 792,674 would be an example of the concept, because it meets both necessary conditions. On the other hand, 722,674 would not be an example of the concept, because it lacks one of the necessary conditions: it does not have a 9 in place 2. Now, determine the conjunctive concept that is implied by the following information concerning some six-place numbers that are known to be examples and nonexamples of the concept:

107,254 is an example of the concept.
157,254 is an example of the concept.
937,254 is an example of the concept.
867,184 is an example of the concept.
295,684 is not an example of the concept.
367,497 is not an example of the concept.
Stop reading and try to solve the problem, making classificatory use of the method of contradiction.

From the first piece of information, that 107,254 is an example of the concept, we can determine that the concept will include some combination of the following six restrictions: 1 in place 1, 0 in place 2, 7 in place 3, 2 in place 4, 5 in place 5, and 4 in place 6. Rather than test all 63 different subsets of combinations of from one to six of these restrictions, it is much more efficient to test each of the six restrictions individually; namely, test whether a concept must involve the restriction of 1 in place 1, and so on. This procedure in essence amounts to testing the class of all concepts that involve the restriction 1 in place 1. Stop reading and solve the problem, if you have not done so already.

Clearly, the set of concepts that involve the restriction 1 in place 1 is contradicted, because some of the examples do not have the 1 in place 1. Proceeding in the same manner, we can rule out all concepts except those that require 7 in place 3 or 4 in place 6. What information tells you that both of these restrictions are necessary in order for a six-place number to be an example of the concept? This information comes from the two nonexamples of the concept, each of which illustrates that either of the restrictions in isolation is not sufficient to make a six-place number an example of the concept. Both are required.

Classificatory contradiction in concept-attainment problems is equivalent to deriving some rules of inference as to which dimensions or places of the examples of the concept are relevant to the concept and which are irrelevant.
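The two stages of this argument, contradicting restrictions with the examples and then confirming the survivors with the nonexamples, can be sketched in a few lines. The following Python sketch is an illustration, not part of the original text; the function name `infer_concept` and the encoding of the numbers as six-character strings are assumptions.

```python
def infer_concept(examples, nonexamples):
    """Return (candidates, required) as dicts mapping place -> digit.

    Places are numbered from 1, as in the problem statement.
    """
    first = examples[0]
    # Stage 1: a restriction taken from the first example is contradicted
    # as soon as any other example has a different digit in that place.
    candidates = {
        place: digit
        for place, digit in enumerate(first, start=1)
        if all(example[place - 1] == digit for example in examples)
    }
    # Stage 2: a surviving restriction is shown to be necessary by a
    # nonexample that violates it while satisfying all the other survivors.
    required = {}
    for place, digit in candidates.items():
        others = {p: d for p, d in candidates.items() if p != place}
        for number in nonexamples:
            violates = number[place - 1] != digit
            meets_others = all(number[p - 1] == d for p, d in others.items())
            if violates and meets_others:
                required[place] = digit
                break
    return candidates, required


if __name__ == "__main__":
    examples = ["107254", "157254", "937254", "867184"]
    nonexamples = ["295684", "367497"]
    print(infer_concept(examples, nonexamples))
```

For the numbers in the problem, both stages leave the same two restrictions, 7 in place 3 and 4 in place 6.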
For simple and conjunctive concepts of this type, there are two very simple rules: (a) If two examples of the concept differ in the values or digits they have on one or more dimensions (places), all of these dimensions are irrelevant to the concept (that is, not involved in the necessary conditions specified by the concept). (b) If an example of the concept and a nonexample differ on one and only one possibly relevant dimension, then that dimension is relevant (and the value of that dimension in the example is the necessary value). Having derived these rules of inference, we can now solve all simple and conjunctive concept-attainment problems in a very straightforward manner. Classificatory contradiction is essentially equivalent to this inference method.

Letter-arithmetic problems, such as that below, nicely illustrate how classificatory contradiction frequently can be combined with drawing inferences to provide a solution. I got the following problem from Bartlett (1958) and Simon and Newell (1971), who have studied how people solve this problem:

      DONALD
    + GERALD
    --------
      ROBERT

This problem is to be treated as an exercise in simple addition. All that is known is the following: (1) D = 5, (2) every number from 0 to 9 has its corresponding letter, (3) each letter must be assigned a number different from that given for any other letter. The goal is to find a number for each letter, stating the steps of the process and their order.

Stop reading and try to solve the problem.

Here you should use classificatory contradiction, which was suggested as an optimal method for the concept-attainment problems. You should test hypotheses concerning the values of each letter, which is, in essence, the testing of classes of hypotheses about how all the letters are assigned to different numbers.
That is, in testing the hypothesis that N = 3, you are, in essence, testing the entire set of possible solutions to the problem in which N = 3 and the other letters equal various other digits.

If you did not solve the problem, stop reading and try again.

By knowing that D = 5, we can infer that T = 0. Thus, from the O + E = O column, we know that E must equal 9, there having been a carry of 1 from the previous N + R = B column. Since there is a carry from D + D = T to the next column, we know that L + L + 1 = R must be an odd number. From the D + G = R column, we know that R is a number greater than 5. Thus, R could only be the number 7, since we have ruled out every other possible hypothesis. Now, since E = 9 and A + A must be an even number, we know there had to be a carry from the L + L = R column to the A + A = E column. Therefore, either A could be 4, so that 4 + 4 + 1 = 9, or A could be 9, so that 9 + 9 + 1 = 19, that is, 9 plus a carry. However, 9 is already used. Thus, we know that A can only be 4. As Simon and Newell (1971) point out, you can proceed to determine a unique number for each letter except N, B, and O. For these letters, you must actually try out the various combinations of remaining digits, 6, 3, and 2, assigning them to the three letters in each of the six possible ways and testing whether each assignment is consistent with the information given in the problem. If it is not, you must try a new assignment of the three remaining letters to the three remaining numbers, until an assignment is found that works.

Clearly, this last stage is contradiction pure and simple. However, in many of the preceding inferences you used the method of contradiction: you determined which letter worked by ruling out all possible alternative assignments of digits to that letter.
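The column-by-column reasoning can be imitated by a search that assigns digits one column at a time, starting from the units column, and abandons a partial assignment the moment a column sum contradicts it. Each such rejection discards a whole class of complete assignments. The following Python sketch is an illustration under that assumption and is not the procedure studied by Bartlett or by Simon and Newell; the names `solve_addition`, `options`, and `search` are hypothetical.

```python
def solve_addition(first, second, total, known):
    """Solve first + second = total column by column, given some known letters."""
    # Columns are processed from the units column leftward.
    columns = list(zip(reversed(first), reversed(second), reversed(total)))
    solutions = []

    def options(letter, mapping):
        """Digits still possible for a letter under the partial mapping."""
        if letter in mapping:
            return [mapping[letter]]
        used = set(mapping.values())
        return [digit for digit in range(10) if digit not in used]

    def search(index, carry, mapping):
        if index == len(columns):
            if carry == 0:                      # the sum has no extra leading digit
                solutions.append(dict(mapping))
            return
        x, y, z = columns[index]
        for dx in options(x, mapping):
            with_x = {**mapping, x: dx}
            for dy in options(y, with_x):
                with_xy = {**with_x, y: dy}
                column_sum = dx + dy + carry
                digit = column_sum % 10
                if z in with_xy:
                    if with_xy[z] != digit:     # contradiction: prune this class
                        continue
                    extended = with_xy
                elif digit in with_xy.values():  # digit already taken by another letter
                    continue
                else:
                    extended = {**with_xy, z: digit}
                search(index + 1, column_sum // 10, extended)

    search(0, 0, dict(known))
    return solutions


if __name__ == "__main__":
    for solution in solve_addition("DONALD", "GERALD", "ROBERT", {"D": 5}):
        print(sorted(solution.items()))
```

The first column forces T = 0 with a carry, just as in the verbal argument, and most later branches are cut off within a column or two.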
If you had not done so, you would not have known for sure that the digit that seemed to work was the only digit that would work when assigned to that letter. Notice, however, that an efficient solution to the problem requires classificatory contradiction. In this problem, then, you must determine as nearly as possible what number to assign to a given letter, independently of testing hypotheses about the numbers to be assigned to other letters. This procedure dictates that the values of different letters must be specified in a certain order because only in certain orders is it possible to determine a unique number to be assigned to each letter.

As illustrated in letter-arithmetic problems and concept-attainment problems, classificatory contradiction is somewhat analogous to the problem-solving method of defining subgoals in those problems requiring the construction of a long sequence of actions in order to achieve the goal. In contradiction problems, the difficulty is not in the long sequence of operations but in the large class of possible hypotheses. But either way, you have in essence a large set of alternatives to search through. To the extent that you can reduce that search space by considering large classes of alternatives at one time, it is advantageous to do so.

Another problem that exemplifies the combined use of inferences and contradiction to reduce a large number of hypotheses to a small number is the integer-path-addition problem:

Put the digits 1, 2, ..., 9 into a 3 × 3 matrix, one digit into each cell, as shown in Fig. 7-4. Your assignment of digits to cells must satisfy two conditions: (1) Row 1 plus row 2 must equal row 3 (considering each row as a three-digit number). (2) The digit i must be located immediately next to (above, below, to the right, or to the left) the digit i - 1, for i = 2, ..., 9.
This second condition means you may place the digit 1 anywhere, but 2 must be placed next to 1 along a row or column (not diagonally), 3 must be placed next to 2, and so on. This is what is meant by calling this problem an integer-path problem.

FIGURE 7-4 The 3 × 3 matrix for the integer-path-addition problem (an empty grid with rows labeled Row 1, Row 2, and Row 3).

Stop reading and try to solve the problem.

You could begin by trying various hypotheses as to sequences of filling the digits 1, 2, ..., 9 in the nine cells, but this process is long, slow, and chancy. A few inferences that can be made from the above information allow one to greatly reduce the number of hypotheses that must be tested. Probably the most important insight to gain at an early stage in working on the problem is to notice that, of two cells adjacent along a row or column, one must be filled with an odd number and the other with an even number. This arrangement can easily be proved by taking any two cells adjacent along a row or column and considering all possible paths from one cell to the other, realizing that an odd number of steps is required to reach the adjacent cell no matter what path is taken within the matrix. However, it is really not essential to prove the theorem that cells adjacent along a row or column must have one even number and one odd number. A large number of trial-and-error substitutions executed within a few minutes will show that the theorem is almost certainly true. Stop reading and try to solve the problem, if you could not do so before.

Given this theorem, you can now easily determine that the corner cells and the center cell must be filled with odd numbers and the other four cells with even numbers, since there are five corner plus center cells and five odd numbers. This determination cuts the number of alternatives per cell of the matrix approximately in half, greatly reducing the search space.
The restrictions might well be represented in a figure such as Fig. 7-5. If you have not yet solved the problem, stop reading and try again.

         Left    Middle   Right
Row 1    Odd     Even     Odd
Row 2    Even    Odd      Even
Row 3    Odd     Even     Odd

(No carry from the middle column to the left-hand column; carry from the right-hand column to the middle column.)

FIGURE 7-5 Restrictions on the digits that can fill cells in the integer-path-addition problem. The restrictions come from considering that the four corner cells and the center cell must have odd digits in them.

The astute problem solver might infer that in the right-hand column an odd plus an even number is equal to an odd number, but in the middle column an even number plus an odd number is equal to an even number. This event can only happen if there was a carry of 1 from the right-hand column to the middle column. Thus, we know that the sum of the two upper digits in the right-hand column must be greater than 10. Furthermore, in the left-hand column, an odd number plus an even number is equal to an odd number. Thus, there can be no carryover from the middle column to the left-hand column. Hence, the sum of the top two digits in the middle column, together with the carried 1, is 9 or less; that is, the two digits themselves sum to 8 or less. Also, we know that the sum of the digits in the left-hand column is 9 or less, since by given information there can be no carry from the left-hand column.

With all these restrictions, it is a relatively simple matter to consider the small number of hypotheses that are consistent with these restrictions. Perhaps the easiest way to proceed is to focus on the middle column and test all the hypotheses that are consistent with the information. This procedure means actually testing only six possible assignments of digits to the top two cells in the middle column, namely, 6-1, 4-3, 4-1, 2-5, 2-3, and 2-1. It is a relatively simple matter to check each of these assignments to see if any path of integers consistent with the assignment could solve the problem.
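For comparison with the six hand-checked hypotheses, the whole search space is small enough to enumerate by machine. The following Python sketch is an illustration, not part of the original text; the name `integer_path_solutions` is an assumption. It lays down the digits 1 through 9 along every path of row-or-column neighbors and keeps the grids in which row 1 plus row 2 equals row 3.

```python
def integer_path_solutions():
    """Return every 3 x 3 grid that satisfies both conditions of the problem."""
    solutions = []
    grid = [[0] * 3 for _ in range(3)]   # 0 marks an empty cell

    def place(row, col, digit):
        grid[row][col] = digit
        if digit == 9:
            # Condition (1): read each row as a three-digit number.
            values = [100 * r[0] + 10 * r[1] + r[2] for r in grid]
            if values[0] + values[1] == values[2]:
                solutions.append([r[:] for r in grid])
        else:
            # Condition (2): the next digit goes in an empty neighboring cell.
            for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                n_row, n_col = row + d_row, col + d_col
                if 0 <= n_row < 3 and 0 <= n_col < 3 and grid[n_row][n_col] == 0:
                    place(n_row, n_col, digit + 1)
        grid[row][col] = 0               # undo the placement before backtracking

    for start_row in range(3):
        for start_col in range(3):
            place(start_row, start_col, 1)
    return solutions


if __name__ == "__main__":
    for solution in integer_path_solutions():
        for row in solution:
            print(*row)
        print()
```

The parity inferences are not needed by the program, but they explain why so few of the paths come close to satisfying the addition condition.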
It turns out that only the 2-3 assignment of digits to the top two cells in the middle column will solve the problem. Furthermore, this assignment will solve the problem in only one way, namely, the way shown in Fig. 7-6.

Row 1    1  2  9
Row 2    4  3  8
Row 3    5  6  7

FIGURE 7-6 The solution to the integer-path-addition problem.

Another example of classificatory contradiction combined with inferences based on numerical properties is provided by the lonesome-eight problem, which was originated by Chessin (1954):

Determine all of the digits represented by X in the following long division and also determine the remaining four digits of the five-digit answer of which 8 is the third digit, as shown in Fig. 7-7.

Stop reading and try to solve the problem, using the method of contradiction to draw inferences.

              X X 8 X X
        ---------------
X X X ) X X X X X X X X
          X X X
          -----
            X X X X
              X X X
            -------
                X X X X
                X X X X

FIGURE 7-7 The lonesome-eight problem.

Since 8 times the divisor is a three-digit number, we know that the divisor must be 124 or less, because 8 × (125 + z) = 1,000 + 8z, which is a contradiction for all z ≥ 0. We can also determine that the last digit of the quotient must be 9, since the last digit times the divisor is a four-digit number, whereas 8 (or any smaller digit) times the divisor would equal a three-digit number (a contradiction). The initial digit of the quotient must be greater than 7, because 7 times any number less than or equal to 124 would leave a remainder that was greater than a two-digit number when subtracted from the dividend (a contradiction). Now stop reading and try to solve for the rest of the unknown digits, if you did not before.

Since the first digit of the quotient multiplied by the divisor equals a three-digit number, we know it cannot be 9. Thus, the first digit of the quotient must be 8. The second and fourth digits of the quotient must be zero, because in both cases two digits from the dividend were brought down in the work below. So we have the quotient, 80809.
Making use of the two places in the work below the dividend in the long division where differences are 99 or less (two-digit numbers), we can determine that 8 times the divisor must be a number between 990 and 999. The only divisor that will multiply by 8 to give a number between 990 and 999 is the divisor 124. Numbers 123 and less are rejected by contradiction. Thus, we have the quotient and the divisor, and from them we can determine the dividend and all of the values of X in the work underneath the dividend (see Fig. 7-8). Note that in solving this problem, we contradicted large classes of hypotheses (solutions) in making each inference.

              8 0 8 0 9
        ---------------
1 2 4 ) 1 0 0 2 0 3 1 6
          9 9 2
          -----
            1 0 0 3
              9 9 2
            -------
                1 1 1 6
                1 1 1 6

FIGURE 7-8 The solution to the lonesome-eight problem.

ITERATIVE CONTRADICTION IN INFINITE SEARCH SPACES

Occasionally, the method of contradiction can be used in problems that have (initially, at least) an infinite number of possible solutions. Naturally, in such cases, it is necessary to rule out large or infinite classes of alternatives by the method of contradiction. A particularly simple example of the use of classificatory contradiction is afforded by the following problem adapted from Polya (1957).

In numbering the pages of a book, a printer used 3,289 digits. How many pages were in the book, assuming that the first page in the book was numbered 1?

Stop reading and try to solve the problem, using the method of classificatory contradiction to rule out all but one of the infinity of positive integer answers.

Of course, there are not an infinite number of solutions to the problem, once one draws a relatively trivial inference. The inference is that the number of pages cannot possibly be greater than the number of digits (3,289), since at least one digit has to be used to number each page.
Thus, we might well regard this problem as an example of classificatory contradiction in a large, but finite, search space. However, since an inference using the method of contradiction was necessary in order to make the search space finite, it seems appropriate to consider the problem as having an initially infinite search space.

The problem provides a particularly simple example of the use of the iterative method of taking a preliminary estimate of the goal, determining the magnitude of the error of the estimate from the goal, then moving in the direction of the goal to obtain another estimate along with a magnitude of error, and so on, hoping ultimately to converge upon the goal. Since the preliminary estimates are contradicted by the given information, I think it is useful to consider iterative methods as a subclass of the method of contradiction, a subclass that is particularly useful in solving problems with infinite (but ordered) search spaces. Now stop reading, and try again to solve the problem, if you could not before.

To make the iterative solution to this problem clear, imagine that we start with a preliminary estimate of nine pages for the book. Each page number is a single digit. Thus, nine digits in all would be used to number the book. This number is clearly too small, so we move up to the end of the two-digit numbers, namely, the number 99. Numbering 99 pages requires 9 single-digit numbers plus 90 two-digit numbers, for a total of 9 + 180, or 189 digits. The number 189 is still substantially below the number 3,289. Thus, for our next estimate we will take the end of the three-digit numbers, namely, 999. Numbering 999 pages uses nine single-digit numbers plus 90 two-digit numbers plus 900 three-digit numbers, for a total of 9 + 180 + 2,700, or 2,889 digits.
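The successive estimates can be reproduced mechanically. The following Python sketch is an illustration, not part of the original text; the names `digits_used` and `pages_for` are assumptions. It counts the digits needed for a given number of pages, pushes the estimate through 9, 99, 999, and so on until the target is bracketed, and then keeps halving the bracketing interval, which is one purely iterative way to finish.

```python
def digits_used(pages):
    """Count the digits needed to number pages 1 through `pages`."""
    total, width, first = 0, 1, 1
    while first <= pages:
        last = min(pages, first * 10 - 1)      # last page number with this many digits
        total += (last - first + 1) * width
        width += 1
        first *= 10
    return total


def pages_for(target_digits):
    """Return the page count that uses exactly target_digits, or None."""
    # Coarse estimates: 9, 99, 999, ... until the estimate is no longer too small.
    high = 9
    while digits_used(high) < target_digits:
        print(f"estimate {high}: {digits_used(high)} digits, too few")
        high = high * 10 + 9
    low = high // 10                            # the previous, contradicted estimate
    # Fine estimates: halve the interval, discarding the contradicted half.
    while high - low > 1:
        middle = (low + high) // 2
        if digits_used(middle) >= target_digits:
            high = middle
        else:
            low = middle
    return high if digits_used(high) == target_digits else None


if __name__ == "__main__":
    print(pages_for(3289))
```

A return value of `None` would mean that no whole number of pages uses exactly the stated number of digits.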
The figure of 2,889 digits is quite close to the target of 3,289 digits, so we are encouraged perhaps to try a more direct analytic method at this point, namely, subtracting 2,889 from 3,289 to obtain an additional 400 digits that are needed to achieve the goal. (However, note that one could continue to use the iterative method.) Since we are now in the four-digit numbers, each page will require four digits. Thus, to use 400 more digits will require 100 additional pages. Thus, we can infer that we must add 100 to 999 to obtain an answer of 1,099 pages in the book, in order to use up 3,289 digits.

Iterative methods are frequently used in the numerical solution for roots of equations. For example, consider the following problem:

Determine the roots (permissible values of x) that satisfy the equation $x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30 = 0$.

Stop reading and attempt to specify an iterative method of contradiction, by which one might determine each root (permissible value of x) for the preceding equation. You can assume that you have a computer at your disposal to carry out the large number of steps that might be required in order to converge upon each real solution for this equation.

For large enough positive or negative values of x, the $x^6$ term in the expression must dominate the rest of the expression (be greater than the sum of all the other terms in the expression). Thus, for sufficiently large positive or negative values of x, the expression $x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30$ must be greater than zero and monotonically increasing for more extreme positive or negative values of x. Thus, there can be no real solutions for values of x more extreme than the points beyond which the expression $x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30$, starting from a positive value, increases monotonically without limit.
You can either determine these points or else you can make a safe guess, based on the values of the coefficients of the terms in the expression. In this case, you might assume the function to be monotonically increasing above zero for values of x greater than 1,000 or less than -1,000. Assuming that the function is monotonically increasing from positive values beyond this range ($-1000 \le x \le +1000$), we know that there can be no zero crossings (roots) beyond the interval $-1000 \le x \le +1000$. Thus, by contradiction, we have ruled out all values of x greater than 1,000 or less than -1,000. In the present case, we have not done this in a careful way, but we could. Now stop reading and try to solve the problem again, if you failed to sketch an iterative solution method before.

To determine solutions for x within the interval from x = -1,000 to x = +1,000, you may define a step size such that you think it unlikely that there would be two different solutions within a single step. In the present instance, we might choose a step size of unity, though if we wish to be more careful we might pick steps of less than unity. Now we can evaluate the function $x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30$ at all integer values of x from -1,000 to +1,000 to see if the value of the function changes sign (goes from minus to plus or plus to minus) over the step. If the function does change sign, we know that there is a solution (zero crossing) within the interval defined by that step. We can then proceed to use essentially the same method (but more efficiently, dividing the remaining interval in half each time) to determine the value of the solution to as fine a degree of approximation as we wish. If the function does not change sign over a step, we assume that there is no solution within that interval.
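The scan-then-halve procedure just described can be sketched in Python. This is a minimal illustration under stated assumptions: the polynomial is read as $x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30$, the search range of -1,000 to +1,000 and the unit step come from the text, and the function names `f`, `halve_interval`, and `find_roots` are hypothetical.

```python
def f(x):
    # Horner form of x^6 - 4x^5 + 2x^4 + 3x^3 - 7x^2 + 13x - 30
    return (((((x - 4) * x + 2) * x + 3) * x - 7) * x + 13) * x - 30


def halve_interval(func, lo, hi, tol=1e-12):
    """Narrow an interval whose end points give opposite signs.

    The half that shows no sign change is rejected (contradicted) at each step.
    """
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        if (f_lo < 0) != (f_mid < 0):
            hi = mid                 # sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # sign change lies in the upper half
    return (lo + hi) / 2.0


def find_roots(func, start=-1000, stop=1000, step=1):
    """Scan fixed-size steps for sign changes and refine each one found."""
    roots = []
    x = start
    while x < stop:
        a, b = x, x + step
        fa, fb = func(a), func(b)
        if fa == 0:
            roots.append(float(a))   # a grid point is itself a root
        elif fb != 0 and (fa < 0) != (fb < 0):
            roots.append(halve_interval(func, a, b))
        x = b
    if func(stop) == 0:
        roots.append(float(stop))
    return roots


for root in find_roots(f):
    print(root, f(root))
```

As the text warns, a step containing two zero crossings shows no sign change and would be skipped, so a smaller `step` is the natural check when that is a concern.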
This numerical method for solving higher order equations is called the half-interval search technique and is an excellent example of the iterative use of the method of contradiction. The half-interval search technique uses contradiction, because we consider all intervals over which the function does not change sign not to contain a solution to the equation, since a solution involves a zero crossing (passage from a plus value of the function to a minus value, or the reverse). The absence of a change of sign over some interval contradicts the possibility of a solution in that interval, provided we are justified in assuming that the function cannot have two zero crossings over that interval. If there were some reason to doubt the validity of this assumption for the chosen step size, you could always choose a smaller step size to see if more real roots would be discovered. Naturally, if you have already found n real roots for an equation of degree n, then you know you have obtained all the roots that it is possible to obtain, since there are at most n real roots for an equation of degree n.

Obviously, the same iterative method can be used to solve other types of equations involving logs, exponents, trig functions, and the like. In addition, you can define multidimensional analogs of the above iterative method to solve sets of several equations with several unknowns. However, it clearly gets more and more time consuming and more and more complicated, the greater the number of unknowns or the more complex the functions.

8 Working Backward

THEORY

The method of working backward is similar to the method of contradiction (Chapter 7) and the method of drawing inferences about the goal (Chapter 3) in that all three focus on the goal to a great extent and consider it rather than the givens as the starting point for the problem-solving process.
However, working backward differs in the way the goal is considered in relation to the givens. With the methods of contradiction and drawing inferences from the goal, the goal is considered to be part of the given information, and we attempt to derive consequences from the goal in conjunction with the givens. Thus, the direction of inference is from the goal statement to some new statements. In working backward, the goal is not considered to be a piece of given information. We start with the goal, but instead of drawing inferences from it, we try to guess a preceding statement or statements that, taken together, would imply the goal statement. Hence, the direction of inference is the same as in working forward, namely, from the given information to the goal. We start at the end point and try to determine preceding statements, which need not necessarily be given statements but which, when taken together, will produce the goal. Then we try to determine other statements that will imply those statements, gradually working our way back. We hope to arrive at given information that is sufficient to derive everything in between the givens and the goal.

Why should we want to reverse direction like this, proceeding from the goal to the givens rather than from the givens to the goal? When is this method more appropriate than working forward, and why? That is, which problems are appropriate for working backward and which for working forward?

The method of working backward is likely to be useful if a problem satisfies two criteria. The first is that the problem should have a uniquely specified goal, as is the case for all proof problems. Whenever there is a single, clearly and completely specified goal stated in the problem, you should seriously consider the possibility of working backward.
This is particularly true if, in contrast to the single goal statement, there are large numbers of given statements. Newell, Shaw, and Simon (1962) have clearly stated the advantage to working backward in such problems, namely, there is no ambiguity as to what statement to start with when you work backward, whereas such ambiguity is considerable when you work forward. As they so aptly put it, working forward in such a problem is analogous to finding a needle in a haystack, whereas working backward is analogous to the needle finding its way out of the haystack. You can start from many places outside the haystack in trying to find the single location of the needle. By contrast, the needle starting in a single location can solve the problem of getting out of the haystack by getting to an extremely large number of alternative locations outside the haystack.

In the needle-in-the-haystack problem, the large number of givens have a disjunctive relationship to one another. That is, to solve the problem, you need to get from any one of these givens to the goal or, by the method of working backward, from the goal to any one of the large number of different givens. In many problems to which the method of working backward is appropriate, such as proof problems, the givens have a conjunctive relationship to one another. That is, you must use several of the givens to derive the goal. Thus, in the method of working backward, it will usually be necessary to work backward from the goal to get to several of the givens rather than to only one of the givens. Nevertheless, the method is frequently very useful in such problems, because the unique starting point so frequently directs you to just those aspects of the given information that are relevant to the solution.
In problems where the goal is not so clearly and completely specified and there are, in fact, a variety of possible alternative goals, the advantages of working backward are often largely eliminated. In the case of inference problems (problems with nondestructive operations), the method of working backward is suitable for proof problems but not generally for find problems. In the case of action problems (problems with destructive operations), essentially the same distinction applies, namely, problems with uniquely specified goals are appropriate for the method of working backward and problems in which only certain characteristics of the goal are specified are generally not so suitable.

The second criterion of a problem as to whether the method of working backward is applicable concerns the nature of the operations specified in the problem. If all operations are unary and one-to-one operations, the method of working backward is likely to be helpful. Unary operations are operations that take one given input statement and produce one output statement. One-to-one operations are operations for which it is possible to uniquely determine what input statements produce the output statement. (These concepts will be discussed in more detail in Chapter 10.) Since a well-defined unary operation yields a unique output statement for each input statement, there is no ambiguity concerning the result of an operation when applied to some state working forward. However, a well-defined unary operation need not be one-to-one; several different input statements could produce the same output statement. Thus, when operations are not one-to-one, working backward may lead to a more rapidly branching tree of possible action sequences than will working forward. In such cases, it would generally be preferable to work forward.
The unary property of an operation is not as important as the one-to-one property for the applicability of working backward. Binary or ternary operations take two or three input statements and produce one output statement. Using the method of working backward, it would be necessary, given the output statement, to produce two or three input statements. Superficially, this might seem similar to what happens using the method of working backward when operations are not one-to-one. However, there is an important difference: with binary or ternary operations, the inference process is essentially as complicated when working forward as when working backward. A conjunction of inputs is related to a single output statement with binary or ternary operations, and this is equally true working forward or backward. From one problem to another, it may be more or less difficult to work in the forward or the backward direction, but the existence of binary or ternary operations does not necessarily invalidate working backward, and examples of working backward in just such problems will be given later in the chapter. In such problems, working backward generally results in a set of subgoal statements, and then the subgoal statements are frequently derived from the givens by working either forward or backward.

By contrast, with operations that are not one-to-one, working backward generates a multiplicity of alternative input statements that are disjunctively related one to another (rather than conjunctively related). Thus, you are generating a large set of alternative prior statements, using the method of working backward, which would never be present using the method of working forward, since only one of these statements is necessary in order to derive the goal (not a conjunction of several or all of them).
Thus, the more critical property of an operation that is beneficial for working backward is the one-to-one property, while the unary property facilitates problem solving working either forward or backward.

Another way to state the critical one-to-one property of operations desirable for working backward is to say that the operations in the problem should admit of the possibility of defining inverse operations. Inverse operations are operations that go from the output statement back to the input statement and reverse the effect of some given operation. Whenever operations are one-to-one, it is possible to specify well-defined inverse operations that will uniquely produce the input statement from the output statement. Clearly, if the originally specified operations were not one-to-one, a multiplicity of different input statements would produce the same output statement, and you would have no way of defining an inverse operation that uniquely specified a single input statement that produced some output statement.

If either action or inference problems specify a single goal and if the operations specified in the problem are one-to-one (admit the definition of inverse operations), working backward will often, though not always, be preferable to working forward. However, where one or both of these criteria are not satisfied by a problem, working backward will likely be inferior to working forward. Hence, working backward is by no means universally preferable to working forward. In fact, in my experience, it is generally more difficult to work backward than to work forward. Nevertheless, there is a large class of problems to which the method of working backward is appropriate, and some examples are given in the following section.
APPLICATIONS

Action Problems

The following clever little doubling-game problem illustrates the usefulness of working backward:

Three people play a game in which one person loses and two people win each game. The one who loses must double the amount of money that each of the other two players has at that time. The three players agree to play three games. At the end of the three games, each player has lost one game and each person has $8. What was the original stake of each player?

Offhand, it seems as if there is insufficient information to determine the answer. However, because the players all finish with the same amount of money, $8, it is possible to compute their original stake by working backward. We will label the first losing player P1, the second P2, and the third P3. Stop reading and try to solve the problem, if you did not do so before.

At the end of game 3, P1, P2, and P3 each had $8. Working backward to the end of game 2, P1 must have had $4 and P2 $4, since both won in game 3 (P3 lost), and thus both had their stakes doubled by the results of game 3. Since P1 and P2 each gained $4 in game 3, P3 must have lost $8 in game 3, so P3 had $16 at the end of game 2. Now work backward to determine the stakes of each player at the beginning of game 1, if you did not solve the problem before.

The complete solution obtained by working backward is shown in Fig. 8-1, where we observe that in the beginning P1 had $13, P2 had $7, and P3 had $4.

Note that if the players did not all end with the same amount of money, it would be impossible to determine what each player started with, because the order in which the players won and lost would make a difference.

                      P1     P2     P3
End of game 3         $8     $8     $8
End of game 2          4      4     16
End of game 1          2     14      8
Beginning             13      7      4

FIGURE 8-1 Working backward to solve a doubling-game problem.
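The backward computation summarized in Fig. 8-1 can be sketched in Python. The function names `undo_game` and `play_game` are hypothetical, and players are indexed 0, 1, 2 for P1, P2, P3; the inverse operation halves each winner's stake and returns the difference to the loser.

```python
def undo_game(stakes, loser):
    """Inverse operation: recover the stakes held before a game was played."""
    before = list(stakes)
    repaid = 0
    for player, amount in enumerate(stakes):
        if player != loser:
            if amount % 2:
                raise ValueError("a winner's stake must be even after doubling")
            before[player] = amount // 2
            repaid += amount // 2          # what the loser paid this winner
    before[loser] = stakes[loser] + repaid
    return before


def play_game(stakes, loser):
    """Forward operation: the loser doubles the stake of each other player."""
    after = [2 * amount for amount in stakes]
    after[loser] = stakes[loser] - (sum(stakes) - stakes[loser])
    return after


history = [[8, 8, 8]]                       # end of game 3
for loser in (2, 1, 0):                     # undo game 3 (P3 lost), game 2, game 1
    history.append(undo_game(history[-1], loser))

for row in history:
    print(row)
```

Replaying the games forward with `play_game` from the last row of `history` is a quick way to confirm that the inverse operation was defined correctly.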
In the present instance, however, the order in which the players won or lost games makes no difference to determining the initial stake of each player. Also, if you had names for the players, you could not tell which player started with $13, which with $7, and which with $4, unless you knew the order in which they won. Here I simply named the first losing player P1, the second P2, and the third P3. This was completely adequate, since the stated goal did not require pairing stakes with named players.

This doubling-game problem is an extreme example of the usefulness of working backward, since it is essentially impossible to solve the problem, except by working backward. The reason is that there is a uniquely defined goal, but no given state is specified at all. In fact, the problem is to derive a given state that, in conjunction with the operations, will produce the goal state. Although the operations are stated in a forward direction, they easily admit the definition of unique inverse operations (in which two people have their stakes cut in half and the other person has his stake increased by the sum of the amounts the others were decreased). Thus, it is clear we must use the method of working backward to solve this problem. Of course, it is equally correct to say that what we have done is to transform the operations into inverse operations and reverse the goal and the givens, taking the goal as the givens and attempting to derive the given state from the goal. Obviously, it makes little difference which way we describe what was done in this problem, since what was done is precisely the same under either description.

Nim games are games in which each player takes away tokens subject to a variety of restrictions and tries to be the last, or not the last, to take a token.
Nim games provide excellent examples of the usefulness of working backward to determine optimal strategy. One example is the following:

Fifteen pennies are placed on a table in front of two players. Each player is allowed to remove at least one penny but not more than five pennies at his turn. The players alternate turns, each removing from one to five pennies for a number of turns, until one player takes the last penny on the table, and wins all 15 pennies. Is there a method of play that will guarantee victory? If so, what is it?

Stop reading and try to determine the optimal strategy by working backward.

If you conjecture yourself to be in the goal state, this state would clearly consist of being the player whose turn it is to move, and there being anywhere from one to five pennies on the table. In this state, you could take all of the pennies left on the table and be the winner. Now, can you work backward from this set of possible goal states to conjecture a preceding state for the other player in which, no matter what that player does, you will be in one of these desirable goal states? Stop reading and try to solve the problem, if you did not before.

It is clear that if you confronted the opposing player with six pennies on the preceding turn, no matter how many pennies he or she took (from one to five), there would still be from one to five pennies on the table when it was your turn, giving you the victory. Thus, you should try to confront your opponent with six pennies after your move. But you cannot do this on your first move, so you must work backward again and ask what previous position you would have to put your opponent in so that, no matter what he did, you could have six pennies on the table after your move. Now stop reading and try to solve the rest of the problem, if you did not before.
Some thought reveals that if you could confront the opposing player with 12 pennies after your preceding move, then no matter how many pennies he took (from 1 to 5), you would be able to take enough pennies to confront him with 6 pennies on the next turn. Thus, you want to confront your opponent with 12 pennies, and you can do that on your first move by removing 3 pennies from the table.

Note that in this nim problem, there was no uniquely defined goal state, since the goal is to take the last one, two, three, four, or five pennies on the table, and we cannot know which of these moves would constitute the goal in any nim game we might win. Of course, you could easily transform this into the unique goal of facing your opponent with a table having zero pennies on it on the other player's next move. Obviously, it makes no difference which way you look at this nim problem. The point is that working backward will frequently generate many possible preceding states, and this fact does not necessarily invalidate the method of working backward. In the present instance, working backward two steps yields a unique preceding number of pennies that you should confront your opponent with on the turn prior to your opponent's last turn, namely, six pennies. Thus, considering only your own sequence of moves (rather than all the different moves that might be made by your opponent), we see that the method of working backward in the present problem proceeds to get one preceding state from one succeeding state. Thus, working in the backward direction, the tree of possible states is continually being pinched back to a single state at every alternate move.
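The same backward reasoning can be carried out mechanically, starting from the empty table and building up. The Python sketch below is illustrative only; the names `losing_positions` and `winning_move` are hypothetical, and a "losing position" means a number of pennies at which the player about to move cannot avoid defeat against best play.

```python
def losing_positions(total, max_take=5):
    """Pile sizes from 0 to total at which the player about to move loses."""
    # can_win[n]: the player to move with n pennies on the table can force a win
    can_win = [False] * (total + 1)       # with 0 pennies the last penny is gone
    for n in range(1, total + 1):
        can_win[n] = any(
            not can_win[n - take] for take in range(1, min(max_take, n) + 1)
        )
    return [n for n in range(total + 1) if not can_win[n]]


def winning_move(pennies, max_take=5):
    """Number of pennies to remove so the opponent faces a losing position."""
    losing = set(losing_positions(pennies, max_take))
    for take in range(1, min(max_take, pennies) + 1):
        if pennies - take in losing:
            return take
    return None                           # every move leaves the opponent winning


print(losing_positions(15))
print(winning_move(15))
```

The list of losing positions corresponds to the states the text says you should confront your opponent with, together with the empty table.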
The lesson in this problem is that you should not be too easily discouraged from working backward by a multiplicity of preceding states, if this multiplicity is only a temporary phenomenon or a one-time sprouting of branches of the tree followed by no further increase. We cannot expect the method of working backward to always produce a single one-to-one chain of states back to the givens from the goal with no alternatives to investigate.

Working backward was preferable to working forward in the preceding problem because the number of different action sequences that had to be considered working backward was considerably smaller than the number that had to be considered working forward. You could, at every alternate move, determine a unique state in which you should be. This is the most general criterion for the applicability of working backward, namely, that it produce a smaller space of alternative action sequences than would be produced by working forward. Sometimes working backward is preferable to working forward because it produces a smaller set of action sequences to consider when combined with a hill-climbing approach. An example is provided by the following checker-rearrangement problem:

On an infinitely extended checkerboard, one is given three black checkers and two white checkers initially placed in immediately adjacent squares on a single row, proceeding from left to right, as shown in Fig. 8-2: black (B), white (W), black, white, black. The problem is to transform this arrangement of alternating black and white checkers into an arrangement in which all three black checkers are on the left and both white checkers are on the right (BBBWW), with all checkers being in adjacent squares and in the same row (see Fig. 8-2). The allowable operation is to move two adjacent checkers at a time, one of which must be a black checker and one a white checker.
During a move, the two checkers being moved must remain together at all times, with no reversal of their left-to-right order. You are permitted to move a white-black or black-white pair of checkers to any adjacent pair of unoccupied squares along the same line. Note that there is no need to keep the checkers that are not being moved in immediately adjacent squares at any time. That is, there may be unoccupied squares between checkers at various stages between the givens and the goal. Also note that the five checkers in the goal state need not occupy the same five squares on the checkerboard as they did in the given state. They may occupy any immediately adjacent five squares in the same row.

Stop reading and try to solve the problem by defining an evaluation function and then using the method of working backward in conjunction with a hill-climbing approach on this evaluation function.

Given    B W B W B
Goal     B B B W W

FIGURE 8-2 Given and goal states for the checker-rearrangement problem.

This problem is frustrating to solve by working forward because there are many possible moves at each point, and it is not at all clear how to hill-climb in order to get to the goal. Nor is it clear what subgoals you ought to set on the way to the goal statement. By contrast, working in the backward direction, there is only one pair of checkers that can be moved initially, namely, the third and fourth checkers in the row. After the first move, the number of possibilities at each move is also more limited than would happen if you moved in the forward direction. The solution is relatively easy to obtain working backward because of this much greater restriction in the number of possible initial moves. Now stop reading and try to solve the problem, if you did not do so already.
In addition to working backward, it is helpful to define as an evaluation function the number of immediately adjacent black-white and white-black pairs of checkers. This evaluation function has a value of 1 in the goal state and a value of 4 black-white or white-black transitions in the given state. Choosing actions that increase this evaluation function is of some assistance in narrowing the space of possible moves in working backward from the goal state to the given state. Of course, it is also of some help in working forward. However, the problem is considerably easier working backward, because of the greater restriction of initial moves from the goal state, as well as for psychological reasons peculiar to this problem. An optimal solution to the problem, along with the values of the evaluation function for each state, is shown in Fig. 8-3.

Because of the greater restriction of initial moves starting from the goal and working backward than starting from the givens and working forward, working backward was clearly indicated in this checker-rearrangement problem. However, even if there is no reason to prefer the method of working backward in a problem, you should always consider its use whenever there is no reason to prefer working forward. That is, there are many problems in which it may not be obvious a priori which method, working backward or working forward, is superior; in such cases, you might well try working forward and, if it did not seem to be working out well, then try working backward.

[Figure: successive board states from the goal (BBBWW) back to the given (BWBWB), each labeled with the evaluation function (no. of BW or WB transitions): 1, 2, 3, 3, 4.]

FIGURE 8-3 Working backward to solve the checker-rearrangement problem.
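The evaluation function and the move rule can be made concrete in a few lines of Python. This sketch rests on several assumptions that are not in the text: squares are numbered by integers, the goal occupies squares 1 to 5, the helper names are hypothetical, and the four backward moves listed are simply one legal sequence from the goal to the given arrangement, chosen for illustration.

```python
def transitions(board):
    """Evaluation function: number of adjacent BW or WB pairs.

    `board` maps an occupied square number to 'B' or 'W'.
    """
    return sum(
        1
        for square, colour in board.items()
        if square + 1 in board and board[square + 1] != colour
    )


def move_pair(board, src, dst):
    """Move the checkers on squares src, src+1 to dst, dst+1, keeping their order."""
    left, right = board.get(src), board.get(src + 1)
    if left is None or right is None or left == right:
        raise ValueError("need one black and one white checker on adjacent squares")
    if dst in board or dst + 1 in board:
        raise ValueError("both target squares must be unoccupied")
    new_board = {sq: c for sq, c in board.items() if sq not in (src, src + 1)}
    new_board[dst], new_board[dst + 1] = left, right
    return new_board


def show(board):
    """Render the occupied stretch of the row, with '_' for empty squares."""
    return "".join(board.get(sq, "_") for sq in range(min(board), max(board) + 1))


goal = dict(zip(range(1, 6), "BBBWW"))
# (source square, target square) of each backward move; an illustrative choice
backward_moves = [(3, -1), (0, 3), (4, -3), (2, -5)]

states = [goal]
for src, dst in backward_moves:
    states.append(move_pair(states[-1], src, dst))

trace = [(show(state), transitions(state)) for state in states]
for picture, value in trace:
    print(picture, value)
```

Because every move is reversible, reading the trace from bottom to top gives a forward solution from the given arrangement to the goal.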
As a final example of the method of working backward, let us consider a water jar problem:

Given a jar that will hold exactly 7 quarts of water, a jar that will hold exactly 3 quarts of water, no other containers holding water, but an infinite supply of water, describe a sequence of fillings and emptyings of water jars that will result in achieving 5 quarts of water.

Stop reading and try to solve the problem, working backward.

Obviously, at the goal state we will have 5 quarts of water in the 7-quart jar. There are several ways this might be achieved from a preceding state, working backward. First, we might have 2 quarts of water in the 7-quart jar and pour in 3 quarts from the 3-quart jar. Second, we might have 3 quarts of water in the 7-quart jar and pour in 2 quarts from the 3-quart jar (this seems less likely than the first alternative). Third, we might have 4 quarts of water in the 7-quart jar and pour in 1 quart from the 3-quart jar. Fourth, we might have 7 quarts in the 7-quart jar, 1 quart in the 3-quart jar, and pour off 2 quarts into the 3-quart jar. Fifth, we might have 6 quarts in the 7-quart jar, 2 quarts in the 3-quart jar, and pour off 1 quart into the 3-quart jar. Now stop reading, and try to solve the problem, if you have not done so already.

Of the five alternatives for the state preceding the goal, the first and the fourth are the most plausible, since they involve quantities of water in one or the other jars that are easy to achieve, namely, 7 quarts in the 7-quart jar and 3 quarts in the 3-quart jar. Thus, we might well confine our attention to these two possibilities, at least initially. Now stop reading and try to solve the problem, if you have not done so already.

Although it is possible to achieve 2 quarts in the 7-quart jar as a subgoal and fill up the 3-quart jar as specified in the first alternative, the fourth alternative is actually optimal.
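One way to examine which route is shortest is an exhaustive forward search over jar states. The Python sketch below uses breadth-first search, which is a swapped-in technique for checking the claim and not the working-backward method itself; the state representation (quarts in the 7-quart jar, quarts in the 3-quart jar) and all names are assumptions made for illustration.

```python
from collections import deque

CAPACITY = (7, 3)   # jar 0 holds 7 quarts, jar 1 holds 3 quarts


def successors(state):
    """Yield (description, new_state) for every fill, empty, and pour."""
    for i in (0, 1):
        j = 1 - i
        filled = list(state)
        filled[i] = CAPACITY[i]
        yield f"fill the {CAPACITY[i]}-quart jar", tuple(filled)

        emptied = list(state)
        emptied[i] = 0
        yield f"empty the {CAPACITY[i]}-quart jar", tuple(emptied)

        amount = min(state[i], CAPACITY[j] - state[j])
        poured = list(state)
        poured[i] -= amount
        poured[j] += amount
        yield (
            f"pour the {CAPACITY[i]}-quart jar into the {CAPACITY[j]}-quart jar",
            tuple(poured),
        )


def shortest_plan(goal_test, start=(0, 0)):
    """Breadth-first search; returns a list of (action, resulting_state) pairs."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if goal_test(state):
            plan = []
            while parent[state] is not None:
                previous, action = parent[state]
                plan.append((action, state))
                state = previous
            return plan[::-1]
        for action, nxt in successors(state):
            if nxt not in parent:          # also skips moves that change nothing
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None


plan = shortest_plan(lambda state: state[0] == 5)
for action, state in plan:
    print(f"{action:45s} -> {state}")
```

Inspecting the next-to-last state of the printed plan shows which of the five preceding states the shortest route passes through.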
Working backward from the fourth alternative (7 quarts in the 7-quart jar and 1 quart in the 3-quart jar), we set as the subgoal the achievement of 1 quart in either jar.

It is probably not particularly useful to continue working backward any longer after having defined these five alternative preceding states. Rather, in attempting to achieve any one of these states, such as 7 quarts in the 7-quart jar and 1 quart in the 3-quart jar, it is probably most useful to set this as a subgoal and work forward from the given information in order to achieve it. In the present instance, it is quite easy to achieve the state of having 7 quarts in the 7-quart jar and 1 quart in the 3-quart jar. To achieve 1 quart in the 3-quart jar, fill the 7-quart jar, pour off 3 quarts two successive times to achieve 1 quart in the 7-quart jar, then transfer this 1 quart to the 3-quart jar. Now fill up the 7-quart jar, and the subgoal is achieved. After this, of course, it is simple to pour off 2 quarts from the 7-quart jar into the 3-quart jar, which already contains 1 quart. This leaves 5 quarts in the 7-quart jar, which is the goal.

This problem nicely illustrates how working backward can permit you to define a subgoal, which you can then achieve by working forward. This pattern is typical of working backward in both action problems such as this one and the inference problems to be discussed next.

Inference Problems

As a simple example of the method of working backward in inference problems, consider the following proof problem:

If $A > 0$ and $B > 0$, then $A^2 - AB + B^2 > 0$.

The theorem is actually true for all real A and B that are not both zero, but it needlessly complicates the example of the method of working backward to consider the more general case. Thus, restrict the proof to the case where $A > 0$ and $B > 0$.

Stop reading and try to solve the problem by working backward.
To apply the method of working backward, we first state the conclusion $A^2 - AB + B^2 > 0$. A preceding statement that would imply that conclusion can be obtained by regrouping the expression $A^2 - AB + B^2$ to give $A(A - B) + B^2 > 0$. If we could show this expression to be true, then it would imply the desired conclusion. Stop reading and try to solve the problem, if you did not do so before.

By working backward we note that we could derive the expression $A(A - B) + B^2 > 0$ from three previous expressions: first, $A > 0$, which is given information; second, $B^2 > 0$, which is true for every nonzero real number and hence for B; and, third, $A - B > 0$. We cannot derive $A - B > 0$ from the given information, but we could just assume it as one case. Obviously, in some cases where both $A > 0$ and $B > 0$, A will be greater than B. So in the case where $A > B$, we have $A - B > 0$, and the theorem is proved. Stop reading and try to solve the problem, if you did not before.

Now, work backward from the goal statement $A^2 - AB + B^2 > 0$ to try to derive the goal expression in the case where $B > A$. In this case, we regroup the goal expression into $A^2 + B(B - A) > 0$, which will be true if, first, $B > 0$ (given information); second, $A^2 > 0$ (true for every nonzero real A, and $A > 0$ is given); and, third, $B - A > 0$. Now, $B - A > 0$ follows from $B > A$. Thus, we have established that the conclusion follows where $A > B$ or $B > A$. In addition, we have to show that the conclusion follows when $A = B$, but this matter is trivial.

This problem is so short and relatively trivial that many people may not notice how much they are working backward in solving it. However, a careful examination reveals that the critical insights come from focusing on the goal statement and noticing what it can be factored into.
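The three cases of this argument can be set side by side as a worked derivation. This is only a summary sketch under the stated givens $A > 0$ and $B > 0$; the $A = B$ case, which the text calls trivial, is written out here as an assumption about the intended one-line argument.

```latex
\begin{align*}
\text{Case } A > B:\quad & A^2 - AB + B^2 = A(A - B) + B^2 > 0,
  && \text{since } A > 0,\ A - B > 0,\ B^2 > 0;\\
\text{Case } B > A:\quad & A^2 - AB + B^2 = A^2 + B(B - A) > 0,
  && \text{since } A^2 > 0,\ B > 0,\ B - A > 0;\\
\text{Case } A = B:\quad & A^2 - AB + B^2 = A^2 - A^2 + A^2 = A^2 > 0,
  && \text{since } A > 0.
\end{align*}
```

In each case the regrouping is chosen by looking at the goal expression first, which is the working-backward step.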
Frequently we use the method of working backward for only a few steps in order to derive some more congenial formulation of the statement to be proved, then we proceed to work in a forward direction in trying to derive this more congenial formulation. Working backward may simply result in the substitution of a single subgoal (directly related to the goal) for the original goal, or it may result in the substitution of two or more subgoals in place of the original goal.

An extremely simple proof of the Pythagorean Theorem can be obtained by initially working backward from the algebraic goal statement to obtain a single, more geometric subgoal. As you may remember, the Pythagorean Theorem states that, for any right triangle, $c^2 = a^2 + b^2$, where c is the length of the hypotenuse. Prove this theorem, where the givens are, first, the axioms of Euclidean geometry; second, the definition of the area of a rectangular figure (length times width); and, third, the assumption that the areas of nonoverlapping figures are additive. Stop reading and try to prove the theorem by working backward to obtain a single subgoal.

FIGURE 8-4 Squares erected on the sides of a right triangle by working backward one step from the algebraic formulation of the Pythagorean Theorem, $c^2 = a^2 + b^2$.

Instead of trying to use the givens in an attempt to derive the goal expression, it is far simpler to look at the goal expression and note that it is asserting that the area of a square with side c erected on the hypotenuse is equal to the sum of the areas of the squares with sides a and b, respectively, erected on the other sides of the right triangle. This situation is shown graphically in Fig. 8-4.
Thus, by working backward from the goal expression $c^2 = a^2 + b^2$, we have obtained a subgoal that might prove more tractable than the original goal, namely, to show that the area of the large square with side c is equal to the sum of the areas of the smaller squares with sides a and b. Now stop reading and try to prove the subgoal, if you have not done so already.

To prove that the subgoal statement is true, you need to get expressions for the areas of the three squares that are in the same terms, so that you can determine whether the sum of the two smaller areas equals the largest area. Since the original triangle is the basis for any relation among the areas of these three squares, it seems natural to try to express the area of each square in terms of the area of the original triangle, T. Now stop reading and try to formulate this expression, if you have not done so already.

It is quite straightforward to reflect the triangle, T, onto the squares erected on the nonhypotenuse sides, a and b. Assuming that $a > b$, we can lay out two triangles on the square with side a and have a rectangle with area $a(a - b)$ left over in the square. In the case of the smaller square with side b, two T triangles will use up a rectangle that has an area greater than the area of the square by an amount equal to a rectangle with area $b(a - b)$. All this is shown in Fig. 8-5. Thus, we can replace the terms $a^2 + b^2$ by $4T + a(a - b) - b(a - b) = 4T + (a - b)^2$.

Figuring out how to lay out T triangles inside the largest square (with side c) is more of a challenge. However, with the idea of reflecting the original triangle about the side it shares with the various squares erected on its sides, we should eventually wind up laying out four T triangles inside the square with side c, as shown in Fig. 8-5.
To lay out the four T triangles within the largest square, the most critical property of the original triangle to note is that $\alpha + \beta = 90^\circ$, where α and β are its two acute angles. Taking four T triangles out of the square with side c leaves a square with side $(a - b)$ inside the four triangles. Then the area of the large square is $c^2 = 4T + (a - b)^2$, exactly what was obtained for the sum of the areas of the other two squares. Thus, the area of the square erected on side c is equal to the sum of the areas of the squares erected on sides a and b, and the Pythagorean Theorem is proved.

An example of working backward to find several subgoals is provided by the following proof problem:

You are given the following four assumptions: (1) Multiplication is commutative; that is, $AB = BA$. (2) Equals added to equals are equal; that is, if $A = A'$ and $B = B'$, then $A + B = A' + B'$. (3) The left distributive law applies; that is, $C(A + B) = CA + CB$. (4) The transitive law applies; that is, if $A = B$ and $B = C$, then $A = C$. From these four givens, prove the right distributive law, that is, $(A + B)C = AC + BC$.

Stop reading and try to solve the problem, working backward to derive subgoals.

To prove this theorem, a good beginning point would be to start at the goal statement $(A + B)C = AC + BC$, and write down one or more preceding statements from which you could derive the goal statement as a conclusion. One pair of preceding statements from which you could derive the goal statement is the following: $(A + B)C = X$ and $X = AC + BC$. From these two preceding statements you could derive the goal statement, using the transitive law. Thus, we have subdivided the goal into two subgoals that are, however, somewhat dependent one upon another in that we must try to transform each of the expressions that are considered to be equal in the theorem into expressions that are identical (indicated by X).
Stop reading and try to solve the problem, if you did not before.

FIGURE 8-5 Expressing the areas of all three squares in terms of T (the area of the original triangle), a, and b, so that c terms are eliminated from the expression for the area of the large square.

In this problem, just applying legitimate operations to the expressions $(A + B)C$ and $AC + BC$ will easily result in deriving expressions that are identical in each case. So $(A + B)C = C(A + B)$, by the commutative law for multiplication. $C(A + B) = CA + CB$, by the left distributive law. Therefore, $(A + B)C = CA + CB$, by the transitive law of equality. Now $CA = AC$ and $CB = BC$, by the commutative law for multiplication. Therefore, $CA + CB = AC + BC$, because equals added to equals are equal. Thus, $(A + B)C = AC + BC$, by the transitive law for equality, and the theorem is proved.

9 Relations Between Problems

When mankind has a satisfactory theory of problems, it will be possible to state many deep and detailed relations between different types of problems. But even without such a theory, we can still state certain basic types of relations between different problems.

In particular, five fundamental types of relations can obtain between two problems, a and b: First, problem a is unrelated to problem b (problem a and problem b have no common elements). Second, problem a is equivalent to problem b (a and b have the same problem elements, a and b are completely analogous, a and b are isomorphic). Third, problem a is similar to problem b (problems a and b have some common elements, problems a and b are partially analogous). Fourth, problem a is a special case of problem b (problem a is included in problem b). Fifth, problem a is a generalization of problem b (problem b is included in problem a).
When problems a and b are similar, they may be of approximately equivalent difficulty, b may be simpler than a, or b may be more complex than a.

EQUIVALENT PROBLEMS

In determining whether any of these five relations holds between two problems, it is important to note that the critical problem elements concern the types of operations and the relations that can obtain between different expressions or things, not the specific expressions or things themselves. For example, in the checker-rearrangement problem described in Chapter 8, it would make no difference in the elements of that problem, from a problem-solving viewpoint, if the black checkers were changed to quarters and the white checkers were changed to pennies, provided that all the same restrictions and operations still applied. Alternatively, we could replace the black checkers by red poker chips and the white checkers by blue poker chips, and, if everything else remained the same, the problem would be equivalent to the original problem. Similarly, in the nim problem of Chapter 8, in which from one to five pennies were removed by a player on each turn, the pennies could be replaced by any token such as marbles, poker chips, buttons, or stones. In the Tower of Hanoi problem in Chapter 6, the disks of decreasing size could be replaced by any set of tokens having a simple order relation among them.

Problems that differ only with respect to the names attached to different elements of the problem, but all of whose relations and operations are identical, are considered equivalent, meaning they are completely analogous or isomorphic.
Recognizing that two problems are equivalent may sometimes involve realizing that many of the implied properties indicated by the different names attached to corresponding elements in the two problems are completely irrelevant to the solution of the problem. However, it is usually relatively trivial to recognize such irrelevancies of the differing properties of corresponding elements, and consequently the recognition of equivalent problems is frequently trivial.

SIMILAR PROBLEMS

Two problems can be extremely similar and yet not be equivalent in every respect. For example, in the Tower of Hanoi problem, you may start with 5, 6, 7, or 10 disks on a single spike; the number of disks you begin with is in no way critical to the method of solving the problem as outlined in Chapter 6. You can, in fact, state a solution to the general Tower of Hanoi problem in which n disks must be transferred from one spike to another. Thus, any particular Tower of Hanoi problem (a problem involving the transfer of some particular number of disks from one spike to another) is a special case of the general Tower of Hanoi problem. Any two special cases of the general Tower of Hanoi problem are similar problems, though I would hesitate to call them equivalent problems.

Similarly, in the nim problem, involving one to five pennies that can be removed at each turn, you can construct a large variety of problems that differ in the maximum number of pennies that may be removed on each turn or the number of pennies originally placed on the table. Each of these problems can be solved by essentially the same problem-solving method, though the specific number of pennies that the player should take on each turn will differ from problem to problem. Again, all of these problems are extremely similar, but not completely equivalent.
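The sense in which the 5-, 6-, 7-, and 10-disk Tower of Hanoi problems are special cases of one general problem can be illustrated with a single procedure parameterized by n. The sketch below is an assumption-laden illustration, not the method of Chapter 6 (which is not reproduced in this section): it assumes the usual rules (three spikes, one disk moved at a time, never a larger disk on a smaller one) and uses the standard recursive solution. The names `hanoi_moves` and `is_legal` are hypothetical.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the moves (disk, from_spike, to_spike) that transfer n disks
    from source to target; disk 1 is the smallest, disk n the largest."""
    if n == 0:
        return []
    # Clear the n-1 smaller disks onto the spare spike, move the largest
    # disk, then restack the smaller disks on top of it.
    return (
        hanoi_moves(n - 1, source, spare, target)
        + [(n, source, target)]
        + hanoi_moves(n - 1, spare, target, source)
    )


def is_legal(n, moves, source="A", target="C", spare="B"):
    """Replay the moves under the assumed rules and check that every disk
    ends up on the target spike."""
    spikes = {source: list(range(n, 0, -1)), target: [], spare: []}
    for disk, src, dst in moves:
        if not spikes[src] or spikes[src][-1] != disk:
            return False             # the disk is not on top of its spike
        if spikes[dst] and spikes[dst][-1] < disk:
            return False             # larger disk placed on a smaller one
        spikes[dst].append(spikes[src].pop())
    return spikes[target] == list(range(n, 0, -1))


if __name__ == "__main__":
    for n in (5, 6, 7, 10):
        moves = hanoi_moves(n)
        print(n, "disks:", len(moves), "moves, legal:", is_legal(n, moves))
```

The same function handles every special case; only the argument n changes, which is why any two special cases are similar problems that differ only in a quantity.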
The preceding two examples of extremely similar problems were examples in which the problems differed only in the quantities of certain elements of the problem. In each case, all of the qualitative or structural characteristics of the problems were identical.

Equivalent Difficulty

There are other partly analogous problems in which the structure is somewhat different in the two problems being compared but still highly similar. For example, consider the following problem called the fox, goose, corn problem:

A man (M), a fox (F), a goose (G), and some corn (C) are together on one side of the river (straight line) with a boat (B), as illustrated in the given state of Fig. 9-1. The goal is to transfer all of these entities to the other side of the river by means of the boat, which will carry the man and one other entity. The fox and the goose cannot be left alone together, nor can the goose and the corn.

Stop reading and try to solve the problem by recalling the methods used to solve a similar (partly analogous) problem.

FIGURE 9-1 Given and goal states for the fox, goose, corn problem.

The fox, goose, corn problem is similar to the missionaries-and-cannibals problem discussed in Chapter 5. Now stop reading and try again to solve the problem, if you did not before.

The solution to this problem is shown in Fig. 9-2. You will note that at one critical stage you must make an apparent detour in order to solve the problem, just as in the missionaries-and-cannibals problem. You might well have suspected that something of this character would be required, since the two problems are similar in having restrictions regarding what entities can be on the same side of the river with what other entities at the same time. Note, however, that instead of two types of entities, as in the missionaries-and-cannibals problem, this problem has four types. Furthermore, there is only one example of each entity, whereas in the missionaries-and-cannibals problem, any one of the missionaries or cannibals could row the boat. With all these differences, it is surprising that the one primary similarity is nevertheless the dominant element of the problem with respect to problem-solving methods: In each case, we find an implicit evaluation function in terms of the number of entities on the goal side of the river and take actions that increase that evaluation function. In each case, it is necessary to make a brief detour in terms of that evaluation function in order to solve the problem.

FIGURE 9-2 Solution to the fox, goose, corn problem. Successive states, given side | goal side: M F G C B | (given); F C | M G B; M F C B | G; C | M F G B; M G C B | F; G | M F C B; M G B | F C; | M F G C B (goal).

A spatial reasoning problem similar to a problem previously considered in this book is the following:

You are given six coins arranged in two rows (as shown on the left side of Fig. 9-3) so that each coin touches the coins immediately above or below it and to the left or right of it. Specify a procedure for moving exactly two coins so as to achieve the hexagonal arrangement shown on the right side of Fig. 9-3.

Stop reading and try to solve this problem by considering a previously solved related problem.

The most closely related problem is the bowling-pin reversal problem discussed in Chapter 7. Both problems involve spatially distributed objects that must be rearranged in a minimum number of moves to achieve some new configuration. Stop reading and try again to solve the problem, if you did not before.
By analogy to the bowling-pin reversal problem, it is useful to ask which four of the coins will remain in the same position and which two will be moved, going from the given to the goal. Since there are only $(6 \times 5)/2$, or 15, combinations of two "moved" coins, it would be relatively simple to investigate all the possibilities, using the method of contradiction. However, if you recall that in the bowling-pin reversal problem an effective strategy was to look for maximum subgroups in the given and goal states that occupy the same relative positions to one another, then this perceptual strategy can be applied to quickly give the answer to the present problem. Now stop reading and try to solve the problem, if you have not done so already.

FIGURE 9-3 Coin-rearrangement problem.

Clearly, the coins in positions 1, 2, 4, and 6 are in precisely the same relative configuration to one another as the top four coins in the goal state. Thus, you can achieve a solution by moving coins 3 and 5 to the two bottom positions. A symmetrically opposite solution can be achieved by keeping coins 1, 5, 6, and 3 in the same positions (forming the bottom of the goal hexagon) and moving coins 2 and 4 to the top two positions of the goal. However, these moves are the only two of the 15 alternative moves of two coins that solve the problem.

Recall the one-heavy-coin problem discussed in Chapters 3 and 5. In that problem, we had to determine which of 24 coins was heaviest, using a beam balance. Obviously, similar principles of problem solving are likely to be involved in problems where the original set of coins is some number other than 24. Furthermore, it is immediately apparent that making the odd coin lighter than the normal coins, rather than heavier, would not change the method of solution in any respect.
That is, the one-light-coin problem is equivalent to the one-heavy-coin problem.

Simpler Problems

What happens when we know the odd coin is either heavier or lighter than the normal coins but do not know which of the two relations the odd coin has to a normal coin? This problem is obviously rather similar to the previous problems, but it is different in a much more profound respect than simply a variation in the number of coins of the original set or the heavy versus light nature of the odd coin. In this new problem, where the heavier versus lighter nature of the odd coin is ambiguous, one principle of problem solving still applies: you still get an answer to a three-way question from the balance beam. However, the logical character of the reasoning, given the different outcomes on the balance beam, is much more complicated. Nevertheless, noticing the analogy to these other problems would enormously aid solution of the heavier-lighter-coin problem. Even if you had not previously solved the one-heavy-coin or one-light-coin problem, it might still be good strategy to pose and solve either of these simpler problems before you attempt to solve the more complex problem.

This strategy of posing similar, simpler problems before working on a complex problem is very useful, since many of the methods of representing information or of solving the problem are common to both. To be sure, the complex problem will invariably have some additional complications. However, if when solving the simpler problem you discovered some of the methods for solving the complex problem, it will be easier to discover the remaining methods of solving the more complex problem than if you had to solve the complex problem as a whole.
Along the same lines, when you originally confronted the one-heavy-coin problem with a set of 24 coins, you might have tried solving a simpler problem, with the odd coin embedded in a set of only three or four coins. With a set of three coins, you have the best opportunity to realize that the balance beam can provide a three-way partitioning of the set of alternatives.

Often, as in the heavy-coin problems, you judge the simplicity of the problem by the number of different elements or complications in it. In the one-heavy-coin problems, which differ only in the total size of the original set of coins, the complication of the problem seems to be variable in a simple way, namely, the change in the number of coins in the total set. However, we have also noted that a problem in which the odd coin might be either heavier or lighter was a substantially more complicated problem than the problem in which it was known definitely which weight relation the odd coin had to a normal coin. Simplicity in a problem is by no means a simple quantitative concept.

Another problem similar to the one-heavy-coin problem, but simpler to solve because it has one less restriction, is the problem of three-way-question information theory. The problem is to determine which element is the unique element, in a set of n possible elements, by successively partitioning the set into three subsets and asking which of the three subsets contains the unique element. In ordinary (two-way-question) information theory, the optimal strategy is to divide the total set into two equal or nearly equal parts and to continue dividing the remaining set into two equal parts until the unique element is determined. Interestingly, in three-way-question information theory, the optimal strategy is not always to divide into three equal (or nearly equal) parts, though this is not a bad strategy.
If the objective is to minimize the expected number of questions to be asked in order to determine the unique element, then other kinds of partitions besides the equal partition are optimal for some set sizes. For example, if there are six elements in the original total set, the optimal strategy with three-way-question information theory is not to divide into 2, 2, and 2 on the first partition. Instead, the optimal strategy is to divide into 3, 2, and 1, because one-sixth of the time this will give you the answer in one question and five-sixths of the time it will give you the answer in two questions. By contrast, the 2-2-2 split will give you the unique element in two questions in every case. Similarly, if there were seven elements in the original set, you should divide into 3-3-1 rather than 3-2-2, and so on.

Posing and solving the somewhat simpler three-way-question information-theory problem provides the surprising information that dividing into equal thirds is not necessarily the optimal solution. This principle can be applied in some cases in the one-heavy-coin problem, though it is limited by the restriction of having two equal subsets in that problem.

Sometimes when you pose a simpler problem you lose all the difficult aspects of the original problem. In that case, solving the simpler problem provides no help at all in solving the original, more complex problem. For example, in the one-heavy-coin problem involving 24 coins, if you chose to investigate the simpler problem involving only 2 coins, you would draw the same type of wrong conclusion regarding the weighing operation that many people fall into when working on the original problem, namely, dividing in half.
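The six- and seven-element figures quoted for the three-way-question problem can be checked with a short computation. The sketch below rests on assumptions not stated in the text: every element is equally likely to be the unique one, a question is only counted when it is actually asked, and a subset of size 1 needs no further question. The names `splits`, `split_cost`, `expected_questions`, and `best_split` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def splits(n):
    """Yield every split of n elements into three subsets, written as sizes
    (a, b, c) with a >= b >= c >= 0, excluding the useless split that
    leaves all n elements in one subset."""
    for a in range(n - 1, 0, -1):
        for b in range(min(a, n - a), -1, -1):
            c = n - a - b
            if 0 <= c <= b:
                yield (a, b, c)


def split_cost(n, split):
    """Expected number of questions if the first question uses `split`
    and every later question is chosen optimally."""
    return 1 + sum(Fraction(size, n) * expected_questions(size) for size in split)


@lru_cache(maxsize=None)
def expected_questions(n):
    """Minimum expected number of three-way questions for n elements."""
    if n <= 1:
        return Fraction(0)
    return min(split_cost(n, s) for s in splits(n))


def best_split(n):
    """Return a first split that achieves the minimum expected count."""
    return min(splits(n), key=lambda s: split_cost(n, s))


if __name__ == "__main__":
    for n in (6, 7):
        print(n, best_split(n), expected_questions(n))
    print(split_cost(6, (2, 2, 2)), split_cost(7, (3, 2, 2)))
```

Under these assumptions the 3-2-1 split of six elements costs 11/6 questions on average, which is exactly the one-sixth and five-sixths weighting described above, while 2-2-2 costs 2. For seven elements, 3-3-1 costs 13/7 against 2 for 3-2-2.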
More serious than the danger that posing simpler problems will lose the complexity of the original problem is the danger of posing apparently simpler problems that are really more difficult to solve than the original problem. Although it is generally true that reducing the number of elements in a problem reduces the complexity of the problem, it is not always so. Sometimes reducing the number of elements of a particular kind in a problem, or eliminating some of the features of the problem, results in a problem that is more difficult to solve than the original problem. Sometimes the supposedly simpler problem is impossible to solve. The following coin-weighing problem illustrates the danger involved in posing simpler problems:

You have 10 stacks of quarters with 10 quarters in each stack. One entire stack is composed of quarters, each of which weighs 2 grams less than it should. You know the correct weight of a quarter. You may weigh the coins on a pointer scale, which tells you how many grams a set of objects placed on it weighs. What procedure will determine the light stack in the smallest number of weighings?

Stop reading and try to solve the problem.

This coin-weighing problem is different from the previous coin-weighing problems, primarily because the weighing operation is different. That is, this problem uses a pointer scale, whereas previous problems used a beam balance. A weighing operation using a beam balance provides an answer to a three-way question, but a single weighing on a pointer scale provides an enormously greater amount of information (limited only by the accuracy of the pointer scale). Because of the great difference in the amount of information provided by the pointer versus beam-balance weighings, there is virtually no similarity between the solutions to these two types of coin-weighing problems.
Thus, these two types of coin-weighing problems are not really related at all, in the problem-solving sense. The presence or absence of concrete similarities, such as two problems being both concerned with coins and weighing operations, is less important than more abstract similarities concerned with the relationships among givens or between givens and operations. If you were misled into trying to apply methods similar to those used in solving the previous coin-weighing problems, stop reading and try again to solve this problem.

Another approach that fails is to try to simplify the problem by reducing the number of coins in each stack to one coin and simply determining which of the 10 remaining coins is light. This actually makes the number of weighings vastly greater than in the original problem, where one had 10 stacks of 10 coins each. What other way is there to simplify the problem? Stop reading and try again to solve the problem, if you did not before.

The other obvious way to simplify the problem is to reduce the number of stacks. Solving a problem with a reduced number of stacks could facilitate solution of the problem. Now stop reading and try to solve the problem, using this method of simplification, if you did not solve it before.

The simplest problem that can be posed, reducing the number of stacks, is to decide which of two stacks is light. Evidently, this could be done in one weighing by weighing a single coin from one of the two stacks and determining that it is either the correct weight for a quarter or 2 grams less than the correct weight. Of itself, this solution to the two-stack problem does not indicate how you should solve the 10-stack problem. However, it does provide you with a basic familiarity regarding the nature of the information provided by the pointer scale. You should then try to solve a three-stack problem.
Now stop reading and try to solve the three-stack problem in a way that will allow generalization to a 10-stack problem, if you have not solved the problem so far.

The three-stack problem can be solved in a single weighing, as can the 10-stack problem. However, an insight is required to accomplish this. I know of no general problem-solving method that would automatically provide you with the critical insight. Attempting to solve the simpler three-stack problem makes it more likely that you would achieve that insight, but it does not guarantee it. If you have not yet solved the problem, stop reading and try to determine what combination of coins from the different stacks would allow you to determine which of the three stacks was the light stack in a single weighing. Once the three-stack problem is solved, the same procedure will immediately generalize to the 10-stack problem.

The three-stack problem can be solved in a single weighing only by including some number of coins from each stack and using the amount of underweight as measured by the pointer scale to determine which of the stacks is light. To make use of the information concerning the number of grams by which the weighing is underweight (from what it would be if the coins were all true quarters), we clearly need to have some way of associating the amount of underweight with each of the three stacks. This type of reasoning is an illustration of drawing inferences from the goal (determining the light stack in a single weighing). Again stop reading and try to solve the problem, if you have not done so already.

The procedure required to associate each stack with an amount of underweight is to take one coin from the first stack, two from the second stack, and three from the third. If the pointer scale reads 2 grams underweight, you know that the first stack is light.
If it reads 4 grams underweight, the second stack is light. If it reads 6 grams underweight, the third stack is light. Generalize the solution to the 10-stack problem, if you have not done so already.

The original 10-stack problem is solved in a single weighing as follows: You take one coin from stack 1, two coins from stack 2, three coins from stack 3, and so on, up to 10 coins from stack 10. Now weigh this entire set of coins and determine by how many grams it is underweight. The number of grams of underweight divided by 2 is the number of the stack that is light. Thus, the solution can be achieved with only a single weighing, when you have a sufficiently large number of coins available in each stack. Reducing the number of coins in each stack does not simplify the problem; indeed, it makes the problem much more difficult, to the point of completely preventing you from seeing the elegant solution to the original problem. When you reduce the number of stacks rather than the number of coins in each stack, you obtain problems that are in some sense simpler, though, of course, you cannot reduce the number of weighings below 1.

Reducing the number of elements in a problem is not the only way to make it simpler. Another way is to change the problem so as to allow you to use an already proved theorem in the solution. That is, you change the problem so that it permits you to use a theorem or knowledge that you do not already know how to use in the original problem. You then hope that applying this theorem to the simpler problem will give you an idea of how to use the theorem in the original problem. A good example of this technique is provided by the following distance problem:

A ray of light travels from point A to point B in Fig. 9-4 by bouncing off a mirror represented by the line CD.
Determine the point X on the mirror such that the distance traveled from point A to point B is a minimum. What is the relationship between the angles $\alpha$ and $\beta$?

Stop reading and try to solve this problem.

Probably the most salient piece of knowledge we all have about minimum distance is the geometric assumption that the shortest distance between two points is the straight line connecting them. However, it is not immediately apparent how to apply this knowledge in the present problem, since we are constrained to connect A and B with a bent line that touches the line CD at some point X. If you have not solved the problem, stop reading and try to define a simpler problem that allows you to apply the principle that the shortest distance between two points is the straight line connecting them.

FIGURE 9-4 Minimum-distance problem.

The difficulty in applying this principle is that the points A and B lie on the same side of the line CD (which the shortest-distance line is required to intersect). This fact prevents the shortest line from being a straight line. However, if the points A and B lay on opposite sides of the line CD, it would then be possible to intersect the line CD with a straight line connecting the points A and B. This statement suggests the definition of a simpler problem in which the points A and B lie on opposite sides of the line CD. Try to define such a problem, if you have not done so already.

A simpler problem that permits the application of the shortest-distance principle is the following:

Find the point X such that the distance from point A to point E passing through the line CD in Fig. 9-5 is a minimum.

Since points A and E lie on opposite sides of the line CD in this new problem, the shortest distance between points A and E will be the straight line connecting them.
The point X will be the intersection of line AE with line CD. Now if point E is constructed as indicated in Fig. 9-5 to be the same distance from the line CD as point B, and at a point symmetrically opposite point B, then it is obvious that the distance XB will be equal to the distance XE (since these are corresponding parts of congruent triangles). This indicates that the solution for point X in the simpler problem is actually the solution for point X in the original problem. Furthermore, $\alpha = \gamma$, since they are opposite angles of intersecting straight lines, and $\beta = \gamma$, since they are corresponding angles of congruent triangles. Thus, $\alpha = \beta$ (the angle of incidence equals the angle of reflection), and the original problem is entirely solved.

FIGURE 9-5 Simpler minimum-distance problem.

Now that you have had some experience with minimum-distance problems, perhaps you would like to try your hand at my version of the classic walking-fly problem:

Billy Smith smudged his lollypop at a point on the wall of the living room 1 foot from the floor and 6 feet from each corner. A fly with a broken wing is standing on the opposite side wall 1 foot from the ceiling and 6 feet from each corner. If the living room is 30 feet long, 12 feet high, and 12 feet wide, what is the shortest path along which the fly should walk to get from where he is to the lollypop smudge?

Stop reading and try to solve the problem.

Since this is obviously a minimum-distance problem, it is similar in that respect to the preceding problem and thus perhaps may be approached in a similar way. In deciding how to apply the methods used in a previous problem to a new problem, it is important to state what you did in the preceding problem at some level of abstraction that is general enough to apply to both problems.
In trying to state such an appropriate level of generality, you may begin by stating what you did in whatever manner comes to mind most quickly. This statement may well be too specific to apply to the present problem, but you might then try to state your methods in progressively more abstract form, until you reach some statement that applies to the present problem. Now stop reading and try to formulate (at perhaps several levels of generality) what was done in the preceding minimum-distance problem, in order to get ideas for the present minimum-distance problem.

One thing that was done in the preceding problem was to reflect a point about a line in order to construct an equivalent distance for which the solution was a straight line. You might investigate the possibility of reflecting the starting point, goal point, or other points along the walls, floor, and ceiling of the room in the walking-fly problem, but this procedure will not produce a solution. Thus, although reflection about an axis is an operation that can be performed in the walking-fly problem, this operation will not help solve the problem. Can you think of a more general way to state what was done in the preceding problem that may suggest other operations to apply to the walking-fly problem and produce a solution? Stop reading and try to solve the problem, if you have not done so already.

A simple way to make something more general is to strip it of some of its properties, leaving these properties unspecified. To solve the preceding minimum-distance problem, we performed some operations so as to construct an equivalent problem for which the solution is a straight line. Now stop reading, and try to perform some operation such that you obtain an equivalent problem to which the solution is a straight line, if you have not solved the walking-fly problem already.
The only way the fly could follow a straight line in the original room is to fly across the room, which he cannot do because of his broken wing. Thus, we must construct a new medium through which the fly can walk an equivalent distance in a straight line from the starting point to the goal. The reflecting of a point in the previous minimum-distance problem could also be considered to be a rotation of a strip of paper containing the point 180° around the line axis shown in Fig. 9-5. In order to rotate only one and not both points, it would obviously be necessary to cut the paper at an angle along a perpendicular to the axis of rotation. Can you think of some way of cutting up the room and rotating some combination of walls, floor, and ceiling that would result in a completely flat two-dimensional surface? Having achieved an equivalent flat surface, you could then connect the starting and finishing points by a straight line. Stop reading and try to solve the problem, using this hint, if you have not done so already.

As a child, you may have made boxes out of flat pieces of paper. Since the living room in the fly problem is equivalent to a rectangular box, it is just as possible to cut along various edges and flatten it out as it is to construct it from an originally flat piece of paper. Thus, we can obtain a flattened analog of the living room as shown in Fig. 9-6. Having flattened out the room in the manner shown in Fig. 9-6, it is a simple matter to determine that a straight line connecting the given and the goal is the hypotenuse of a right triangle whose sides are 24 feet and 32 feet. Thus, using the Pythagorean Theorem, we find the length of the hypotenuse equals $\sqrt{24^2 + 32^2} = \sqrt{576 + 1024} = \sqrt{1600} = 40$ feet.
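As a rough numerical check of the flattening argument, the sketch below compares the straight-line lengths obtained from a few different ways of unfolding the room. Only the 24-by-32 right triangle comes from the text. The other leg lengths are my own reading of the room geometry (fly 1 foot below the ceiling and smudge 1 foot above the floor, each centered on a 12-foot end wall), and the dictionary labels are illustrative names, so treat those entries as assumptions.

```python
import math

# Legs (in feet) of the right triangle obtained when the room is flattened
# so that the straight path crosses the listed faces in order.
# Only the last entry (32 by 24) is taken from the text; the other
# entries are assumptions derived from the stated room dimensions.
unfoldings = {
    "end wall, ceiling, end wall": (42.0, 0.0),                     # 1 + 30 + 11, no sideways travel
    "end wall, side wall, end wall": (42.0, 10.0),                  # 6 + 30 + 6 by 11 - 1
    "end wall, ceiling, side wall, end wall": (37.0, 17.0),         # 1 + 30 + 6 by 6 + 11
    "end wall, ceiling, side wall, floor, end wall": (32.0, 24.0),  # 1 + 30 + 1 by 6 + 12 + 6
}

# Straight-line length on each flattened surface (Pythagorean Theorem).
path_lengths = {faces: math.hypot(a, b) for faces, (a, b) in unfoldings.items()}
shortest_faces = min(path_lengths, key=path_lengths.get)

for faces, length in path_lengths.items():
    print(f"{faces}: {length:.2f} ft")
print("shortest:", shortest_faces)
```

Under these assumed unfoldings, the route crossing five faces is the shortest of those listed, which agrees with the 40-foot hypotenuse computed above.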
Thus, the fly must travel a distance of 40 feet, and the path he must follow involves traveling across his own end wall, a portion of the ceiling, a portion of one of the long side walls, a portion of the floor, and a portion of the opposite end wall.

FIGURE 9-6 Flattened living room for the walking-fly problem.

More Complex Problems

Posing a problem that is more complex than the given problem is the logical inverse of posing a problem that is simpler than the given problem. Offhand, it would seem that posing a more complex problem would hardly be a useful problem-solving technique, and in general this is true. However, if all else fails, you might attempt to pose a more complex problem in which the elements of the given problem are included with additional complications, just on the chance that it would give you some ideas. This method is clearly a last resort and unlikely to be beneficial, but it is worth considering for one reason: you may already have solved a more complex problem in which your present problem is essentially embedded. If this is the case, then thinking of a more complex related problem that you have already solved will provide you with all the ideas necessary for the solution of your present, simpler problem. I do not know of many examples of this, but here is one:

Given a five-by-five checkerboard, as shown in Fig. 9-7, try to draw a line through all the squares of the checkerboard, starting from the square with the dot in it on the left side and passing through each box once and only once, without ever lifting pencil from paper and without ever passing outside of the checkerboard. Show how to do this or prove it impossible.
Stop reading and try to solve the problem by considering a previously solved related problem.

One problem that certainly has some relation is the notched-checkerboard problem discussed in Chapter 3. If you recall how the notched-checkerboard problem was solved, it might well provide you with all the ideas needed to solve the present problem. Now stop reading and try again to solve the problem, if you have not done so already.

The problem that is most closely similar is the integer-path addition problem discussed in Chapter 7. There the problem was to place the integers 1 to 9 in a continuous path over a three-by-three matrix such that the three-digit number in the first row plus the three-digit number in the second row summed to the three-digit number in the third row. Essentially the integer-path addition problem involves drawing a line starting from one cell of a three-by-three matrix (checkerboard) in precisely the same way as is involved in the present problem using a five-by-five matrix (checkerboard). The integer-path addition problem also involved another restriction, making it a more complex problem than the present one. However, the solution to the integer-path addition problem involved a consideration of the restriction on possible solutions placed by the path (continuous line) aspect of the problem. Now stop reading and try to solve the problem, if you have not done so already.

FIGURE 9-7 The five-by-five checkerboard.

In both the notched-checkerboard and the integer-path addition problems, the critical step is the imposition of a checkerboard coloring pattern; here the pattern is imposed on the five-by-five checkerboard. Now note that every time you draw a line through two adjacent squares, you necessarily draw a line through one white and one black square. Alternatively, you could impose a two-dimensional coordinate labeling scheme from (1, 1) to (5, 5).
In that case, notice that, whenever you leave a square with an odd coordinate sum, you pass into a square with an even coordinate sum, and whenever you leave a square with an even coordinate sum, you pass into a square with an odd coordinate sum. Now stop reading and try again to solve the problem, if you did not before.

Consideration of the implications of checkerboard coloring patterns for the present problem yields the following inference: If you start in a white square and must draw a line through an odd number of squares in total, then the color of the last square you pass through must be the same color as the square you started from. In the present instance, there are 25 squares in the five-by-five checkerboard. Thus, if you begin in a white square, you must end in a white square, and there must be exactly 13 white squares and 12 black squares in the checkerboard. However, in the checkerboard shown in Fig. 9-7, there are 13 black squares and 12 white squares. Thus, starting from any white square on the board, it will be impossible to solve the problem of drawing a continuous line through each square once and only once.

SPECIAL CASE

Particularly in proof problems, it often happens that the theorem to be proved states a general relation that holds over a number of special cases or entities. In such problems, it is often useful to try to prove the theorem first for one or more of these special cases before an attempt is made to prove the theorem in general. The reason is that it is usually easier to prove the theorem for a special case than to prove it in general. This argument is precisely the same one made for the advantages of posing and solving simpler problems in general. However, not all simpler problems are special cases of the problem you are trying to solve.
The reverse, however, is almost invariably the case: special cases are simpler problems than the general problem.

Proving a theorem true for one or more special cases increases the probability that the theorem is true in general, but unless you can prove the theorem true for all special cases, proving the theorem in a particular case does not, of course, prove the theorem in general. However, disproving a special case of a conjectured theorem does disprove the theorem in general. When you are uncertain about the truth of the theorem, it can be particularly useful to investigate the theorem in some special case, since a quick disproof of the theorem for the special case disproves the theorem in general. This exercise may save you considerable time that otherwise might be spent in fruitless attempts to prove a false theorem.

When the theorem is true, proving it true for one or more special cases may provide you with many of the elements needed in order to prove the theorem in general. This reason is perhaps the primary one for posing and solving special cases of general problems.

One use of the method of special case was discussed already in Chapter 6 on subgoals as a part of the method of mathematical induction. Recall that, in the method of mathematical induction, we had to first prove the theorem true for $n = 1$ (a special case) and then show that if the theorem was true for $n$ it was true for $n + 1$. Thus, in proving that the sum of the first n integers equals $n(n + 1)/2$, we initially established that this was true for $n = 1$.

Another use of the method of special case occurs sometimes in multiple-choice examination questions.
For example, if you were asked to choose one of five formulas for the sum of the first n integers, the fastest method might be to investigate each formula on some special case, such as $n = 5$, very likely determining that all but one of the answers produced a contradiction in that special case. Note that this is, in essence, a combination of two problem-solving methods, namely, special case and the method of contradiction.

A similar problem often arises when you try to remember some formula you learned previously and think you recall it but are not sure. For example, in trying to recall the formula for the sum of the first n integers, you might erroneously recall something such as $n(n - 1)/2$. Such erroneous conjectures can easily be tested and rejected by investigating their truth in one or more special cases. Since you often have a reasonably good idea of what the correct formula is, a few rejections of incorrect statements of the formula will usually be followed by a correct statement, which might simply be verified by mathematical induction.

Deriving a formula for the number of combinations of m things taken n at a time provides another good example of the use of the method of special case. Undoubtedly you have encountered this formula in the past; however, in my experience, many students fail to remember the formula and most do not know how to derive it. Even if you do know how to solve the problem, it is useful to think of how you would go about applying the method of special case to derive the formula. Thus, consider the following:

Derive a formula for the number of combinations of m things taken n at a time ($m \ge n$). Combinations refer to the number of different unordered sets of elements. That is, the set of two elements obtained by drawing X and then drawing Y is equivalent to the set obtained by drawing Y and then X.
The set XYZ is equivalent to the set YXZ or the set ZYX. The ordering of the elements in the set is irrelevant. Furthermore, you are restricted to drawing an element only once from the underlying population of m elements. That is, you may sample from the underlying population n times without replacing the elements you sampled (sampling without replacement).

Stop reading and try to solve this problem, making use of the method of special case.

There are four specific aspects to the problem. First, there is an underlying population of m elements. Second, you are picking a sample of n of these elements. Third, the sampling is done without replacement; that is, every time you pick an element for the sample, you do not put it back in the population, so the population is reduced by one element every time you choose an element for the sample. Fourth, you are concerned with the number of different unordered sets obtained by this sampling procedure, rather than the number of different ordered sets. Each of these four aspects could be changed to pose a problem related to the present one, some of them simpler than the present problem, which might facilitate its solution. If you have not yet solved the problem, stop reading and think how you might change one or more of the four aspects to derive a related problem or a special case that is simpler to solve than the original problem.

You could reduce the size of the underlying population of m elements to, say, the special case of two elements. You would then also have to reduce the size of the sample to either one or two elements ($n = 1$ or $2$). However, it is probably unnecessary to reduce both m and n in this way. It is quite possible to leave m as it is and reduce the sample size to two elements ($n = 2$).
You would now be considering the special case of the present problem where $n = 2$, namely, where one is selecting an unordered pair without replacement from the underlying population of m elements. Stop reading and try to solve the problem, if you have not done so already.

It will be easier to solve for the special case of the number of unordered pairs of elements if you consider the related problem of determining the number of ordered pairs of elements that can be selected from the population of m elements. This latter problem is quite trivial to solve: there are m ways to select the first element, and for each of these m ways, there are $(m - 1)$ ways to select the second element; thus, there are $m(m - 1)$ ways to select an ordered pair of elements. Having determined the number of ordered pairs, it is quite possible to determine the number of unordered pairs. Stop reading and consider how you would do this, then generalize your answer to solve the original problem, if you have not done so already.

The critical difference between ordered and unordered pairs is that the ordered pair XY is considered equivalent to the ordered pair YX when unordered pairs are being considered. Thus, there are exactly two different ordered pairs of elements for each unordered pair. Knowing this, how can you solve the problem of determining the number of unordered pairs, if you know how many ordered pairs there are? Stop reading and solve this problem and then generalize your answer to solve the original problem, if you have not done so already.

Clearly, if there are two ordered pairs for each unordered pair and $m(m - 1)$ ordered pairs, then there are $m(m - 1)/2$ unordered pairs. Now generalize this answer to the solution of the original problem, where you are selecting not a pair of elements but a set of n elements at a time.
The relevant generalization of the solution to the special case is that the problem should be broken into two parts. First, the number of ordered sets of elements should be determined, then how many different orderings there are for each unordered set. Now use this analysis of the two subproblems to solve the original problem, if you have not done so already.

If you are not able immediately to generalize the solution of the special case to obtain a solution to the general case, then consider another special case, where $n = 3$. Here the number of ordered sets obtained by sampling without replacement is $m(m - 1)(m - 2)$. The number of different orderings of each sample of three elements is the number of permutations of three things. The number of permutations of a set of three things is $3 \cdot 2 \cdot 1$, or 6, since there are three ways to pick the first element from the sample, two ways to pick the second, and one way to pick the third. Thus, the number of different unordered sets of three elements equals $m(m - 1)(m - 2)/(3 \cdot 2 \cdot 1) = m(m - 1)(m - 2)/3!$. Stop reading and generalize the formula to the case of an unordered set of n elements, if you have not done so already.

Clearly, the general formula for the number of unordered sets of n elements selected from a population of m elements without replacement is $m(m - 1) \cdots (m - n + 1)/n! = m!/[n!\,(m - n)!]$. The general principle for solving this problem comes from breaking it into two parts in order to determine, first, the number of ordered sets and, second, the number of distinct ways of ordering (reordering) the elements in a particular ordered set (sample). The number of unordered sets is equal to the number of ordered sets divided by the number of ways of ordering the elements in a particular set. This general principle is essentially present in all special cases where $n = 2, 3$, and so on.
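The two-part principle (count the ordered sets, then divide by the number of orderings of each set) can be checked by brute force on small cases. The following is a minimal sketch rather than part of the original derivation: the function names `ordered_count` and `unordered_count` are hypothetical, and Python's standard `itertools` and `math` modules are used only to compare the formula against direct enumeration.

```python
import itertools
import math


def ordered_count(m: int, n: int) -> int:
    """Number of ordered samples of n elements drawn without replacement from m."""
    count = 1
    for k in range(n):
        count *= m - k  # m ways, then m - 1, ..., down to m - n + 1
    return count


def unordered_count(m: int, n: int) -> int:
    """Ordered samples divided by the n! orderings of each unordered set."""
    return ordered_count(m, n) // math.factorial(n)


# Compare the formula with direct enumeration for a few special cases.
for m, n in [(5, 2), (6, 3), (7, 4)]:
    enumerated = sum(1 for _ in itertools.combinations(range(m), n))
    assert unordered_count(m, n) == enumerated == math.comb(m, n)
    print(m, n, ordered_count(m, n), unordered_count(m, n))
```

The first two cases in the loop correspond to the special cases $n = 2$ and $n = 3$ worked through above; the choice of population sizes is arbitrary.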
Thus, solving the problem for one or two special cases provides all the essential ingredients for solving the problem in general.

The method of special case is also frequently useful in geometric problems. Consider the following proof problem in Euclidean geometry:

You are given the following: (a) A straight line equals an angle of 180°. (b) A right angle equals 90°. (c) If two parallel lines are cut by a transversal, the alternate interior angles are equal. Prove that the sum of the angles of any triangle equals 180°.

Stop reading and try to prove this theorem, making use of the method of special case.

Since you are given information on the number of degrees in the right angle, it is reasonable to consider the special case where one of the angles of the triangle is a right angle. For example, consider the right triangle shown on the left in Fig. 9-8. Stop reading and try to solve the problem for this special case, then generalize your answer to a triangle without any right angle, if you have not done so already.

FIGURE 9-8 Proof of the theorem in Euclidean geometry that the sum of the angles of a triangle is 180°. Left is for the special case of a right triangle. Right is for the general case of any triangle.

Given the right triangle shown in Fig. 9-8, it would be reasonable to construct a line at point A parallel to the opposite side of the triangle. Having done this, the alternate angles are equal: $\beta' = \beta$ and $\gamma' = \gamma$. Since $\gamma' = \alpha + \beta'$, then $\alpha + \beta' = 90^\circ$; this establishes that $\alpha + \beta = 90^\circ$. Thus, $\alpha + \beta + \gamma = 180^\circ$, and the theorem is proved for the special case of a right triangle. Now stop reading and try to generalize your solution to the case of any triangle, if you have not already solved the problem.
In solving the special case, we have essentially mapped the angles of the triangle onto three angles that, taken together, form a straight line. This approach extends in a direct way to the general case for any triangle, not just a right triangle. Thus, for the general triangle, we would be led to construct a line at A parallel to the opposite side. Then, we would go through exactly the same reasoning as in the special case.

We might not recognize in the special case the general principle of mapping the angles of the triangle onto a straight line. However, it is likely we would consider erecting a line at A parallel to the opposite side of the triangle in the special case, noticing that $\alpha + \beta' = 90^\circ$ and that $\beta = \beta'$. This procedure includes the essential notion of constructing an additional line at A parallel to the opposite side, and thus, you might think of the more general principle.

In some instances, it turns out that proving the theorem for a small number of special cases constitutes the proof of the entire problem. Sometimes this result is obvious in advance, and sometimes it only becomes obvious after considering a special case. For example, in Chapter 7 we considered the problem of proving that $A^2 - AB + B^2 > 0$. We considered the proof of this theorem only for the case where $A > 0$ and $B > 0$ but noted that the theorem was actually true for all A and B. Proving the theorem for all A and B essentially requires us to prove it for four cases: (a) $A > 0$ and $B > 0$; (b) $A > 0$ and $B < 0$; (c) $A < 0$ and $B > 0$; (d) $A < 0$ and $B < 0$. The proof of the theorem for each of these four special cases involves breaking the problem up into three more special cases within each of the four previously mentioned special cases, namely, $A > B$, $A = B$, and $A < B$. Thus, in all, we have 12 special cases for which to prove that $A^2 - AB + B^2 > 0$ is true.
But for each of these 12 special cases, the theorem is rather simple to prove.

An example in geometry where the general problem can be divided into two essentially identical special cases is provided by the proof of the following theorem:

You are given the following: (a) The measure of an intercepted arc in degrees is the same as the measure of its corresponding central angle (namely, the angle determined by drawing the radii from the center of the circle to the ends of the intercepted arc). (b) The sum of the angles of a triangle equals 180°. (c) The angles opposite the equal sides of an isosceles triangle are equal. Prove that an angle inscribed in a circle has half as many degrees as its intercepted arc.

Stop reading and try to solve the problem by first considering a special case.

As is frequently true, there are many ways of formulating special cases of the present theorem. For example, we might consider the special case where the inscribed angle is a right angle and try to prove that its intercepted arc is 180° (that the chord for this arc is a diameter of the circle). Conversely, we might investigate the special case in which the intercepted arc is 180° and try to prove that the inscribed angle is 90°. Another type of special case would be to assume that the chords composing the inscribed angle are equal, and so on. We might consider many special cases before we hit on the special case that is most useful in solving the general problem. No previously mentioned special case is optimal for the solution of the present problem, but the alternate special cases may suggest such a one. Stop reading and try to solve the problem, if you have not done so already, by using the method of special case.

The optimal special case to consider is that where one of the sides of the inscribed angle is a diameter of the circle.
This special case is illustrated in the upper section of Fig. 9-9. Now stop reading and try to solve the special case, then extend your answer to a proof of the general theorem, if you have not proved the theorem already.

FIGURE 9-9 Diagrams for the proof of the theorem in Euclidean geometry that an inscribed angle has half as many degrees as the intercepted arc (which equals the central angle). Panels: special case, first subcase, second subcase.

Proving the theorem for the special case is relatively straightforward. First, draw in the dashed line shown in the upper circle of Fig. 9-9, to obtain the central angle $\beta$, which we know is equal to the intercepted arc. We can easily verify that the triangle shown in the figure is isosceles, since two of its sides are radii of the circle. Now making use of the assumption that the angles opposite equal sides of an isosceles triangle are equal, we know that the inscribed angle $\alpha = \alpha'$. Therefore, by the given that the sum of the angles of a triangle equals 180°, we know that $\alpha + \alpha' + \gamma = 180^\circ$. Since $\gamma$ and the central angle $\beta$ together form a straight line along the diameter, $\gamma = 180^\circ - \beta$. Therefore, $2\alpha = \beta$ or $\alpha = \beta/2$, and the theorem is proved for the special case. Now stop reading and extend your solution to the general case.

Actually, it is probably simpler to consider two types of general cases. In essence, we are subdividing the general case into two special cases that exhaust the entire category. Let us call these two cases the first subcase and the second subcase. In the first subcase, the inscribed angle includes the diameter of the circle drawn from the vertex of the angle. In the second subcase, the inscribed angle does not include the diameter drawn from the vertex of the inscribed angle. Now stop reading and try to solve the problem for each of the two subcases of the general case, if you have not done so already.
In the first subcase, we can divide the angle $\alpha$ into two components, $\alpha_1 + \alpha_2 = \alpha$, such that the dividing line for the two component angles is a diameter of the circle. Now for each of these component angles, $\alpha_1 = \beta_1/2$ and $\alpha_2 = \beta_2/2$, as shown in the lower left diagram of Fig. 9-9. Thus, $\alpha_1 + \alpha_2 = (\beta_1 + \beta_2)/2$, and the theorem is proved for the first subcase. Now stop reading and solve for the second subcase, if you have not already done so.

In the second subcase, we can consider the inscribed angle $\alpha$ to be equal to the difference between angle $\alpha_1$ and angle $\alpha_2$, as illustrated in the right-hand diagram of Fig. 9-9 ($\alpha = \alpha_1 - \alpha_2$). Since each of the component angles, $\alpha_1$ and $\alpha_2$, satisfies the requirement of the special case (that one of the chords of the angle be a diameter of the circle), we know that $\alpha_1 = \beta_1/2$ and $\alpha_2 = \beta_2/2$. Thus, $\alpha = \alpha_1 - \alpha_2 = \beta_1/2 - \beta_2/2 = (\beta_1 - \beta_2)/2 = \beta/2$, and the theorem is proved for the second subcase.

This problem provides a beautiful example of the multiple use of the method of special case. An extremely special case was first investigated to get the basic idea for the solution. Then the general case was subdivided into two special subcases, which were nevertheless more general than the original special case. In proving the theorem for each of these two more general special cases, the truth of the theorem for the special case was used as an integral part of the proof.

In some cases, the solution of a single special case may provide the solution to the general problem. One example of this is provided by the following problem:

A cylindrical hole 10 inches long is drilled through the center of a solid sphere, as shown in Fig. 9-10. What volume remains in the sphere?

Stop reading and try to solve the problem, using the method of special case.

FIGURE 9-10 The hole in the sphere problem.
The problem implies that the volume remaining in the sphere is independent of the diameter of the cylindrical hole, provided that the hole is 10 inches long. Assuming that the problem has a unique solution, independent of the diameter of the hole, we can get a solution for the general problem very simply by considering a special case. What is this special case? Stop reading and try to solve the problem, if you have not done so already.

Consider the special case where the cylindrical hole has a diameter of zero. In this case, since the cylindrical hole is 10 inches long, the sphere must have a diameter of 10 inches, and the volume of a solid sphere with a diameter of 10 inches equals $\frac{4}{3}\pi r^3 = \frac{4}{3}\pi 5^3 = 500\pi/3$. Making the assumption (which we have certainly not proved) that this problem has a unique solution, independent of the width of the hole, $500\pi/3$ must be the answer. Naturally, if we were conjecturing that the validity of this theorem was uncertain, we could not use this line of reasoning to solve the problem.

A second example of a general problem that can be solved by solving a single special case is the following:

In this two-person game the players alternately place poker chips on a circular table. The chips must not overlap and must be completely on the table; that is, no poker chip may stick out over the edge of the table. The last player to play a chip on the table is the winner. If each player makes the optimal move on his turn, will the first player or the second player be the winner?

Stop reading and try to solve the problem by considering a special case.

The problem suggests that optimal strategy will produce a forced win for either player 1 or player 2, independent of the size of the table. Assuming this, what special case yields a quick solution?
Consider a table that is big enough to accommodate only one poker chip (when placed in the center of the table). In such a case, the player who goes first will be able to place the first and the last poker chip on the table and will therefore be the winner. This case suggests that, if one of the players has a forced win playing by optimal strategy for all sizes of tables, then that player is the first player. Verifying this hypothesis for a table of any size requires a further clever insight beyond that provided by the special case. However, the special case does suggest that we should test the hypothesis that it is the first player who can force a win for himself by optimal play. Also, the winning first move for the first player in the special case might suggest the winning first move for the first player in the general case. Stop reading and try to solve the general case, if you have not done so already.

The insight involved in solving the general problem is that the first player initially places a poker chip in the center of the table (as in the special case) and thereafter plays chips in a symmetrically opposite position to that played by the second player. Clearly, if the second player has any place on the table available to place a poker chip, there will still then be a symmetrically opposite place on the table for the first player to place a chip, so that the first player must be the last to play a chip on the table, independent of the size of the table. Note that the initial, unique move by the first player (namely, placing a poker chip in the center of the table) is exactly the same move that the first player should make in the special case.

A third example of a general problem that can be solved by solving a single special case is the following:

Triangle ABC is formed by three tangents to a circle, as shown in Fig. 9-11.
Angle DAE = 26°. Solve for angle COB.

Stop reading and try to solve the problem.

Angle DAE (which is the same as angle BAC) is completely determined by two of the three tangents. It is impossible from the information given in the problem to determine the location of the tangent BC, as it may touch the circle anywhere within the arc ED. Thus, assuming that the problem has a unique solution for angle COB (which is certainly implied by the statement of the problem), we can solve for angle COB by considering any special case of the placement of tangent CB. Stop reading and try to solve the problem, if you have not done so already.

The obvious special case to consider is for tangent CB to touch the circle at the same point as does the line from the center O of the circle to point A (as shown above in Fig. 9-11). Having chosen this special case, it is now relatively easy to solve the problem.

FIGURE 9-11 Diagrams (special case, general case) for solution of the problem to find the angle COB.

Since angle AED and angle ADE intercept the same arc of the circle, these two angles are equal. Thus, AE = AD, since the sides opposite equal angles of a triangle are equal. Line OE = line OD, since both are radii of the same circle. Line OA = line OA, since they are the same line. Thus, triangle AOE is congruent to triangle AOD, by virtue of having the three corresponding sides equal. Thus, angle EAO = angle DAO = 13°, since angle DAE = 26°. Since there are 180° in a triangle, and angle AEO = angle ADO = 90°, we know that angle AOE = angle AOD = 90° - 13° = 77°. We can easily prove that triangle OEC is congruent to triangle OFC by having the same hypotenuse and one equal side (radii of the circle). Similarly, triangle ODB is congruent to triangle OFB. Thus, angle COF = ½ angle EOF and angle FOB = ½ angle FOD. Putting all this together, angle COB = angle COF + angle FOB = ½ angle EOD = ½ (77° + 77°) = 77°, and the problem is solved.
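The assumption that angle COB does not depend on where tangent BC touches arc ED can be checked numerically. The sketch below is not part of the original solution. It assumes a unit circle centered at the origin with line OA along the positive x-axis (both arbitrary choices), and the function names and the parameter `f_deg` (the angular position of the tangent point F, measured from line OA) are introduced here purely for illustration.

```python
import math

def tangent_intersection(t1, t2):
    """Intersection of the tangents to the unit circle at angles t1, t2 (radians).

    The tangent at angle t is the line x*cos(t) + y*sin(t) = 1;
    the 2x2 system is solved by Cramer's rule.
    """
    a1, b1 = math.cos(t1), math.sin(t1)
    a2, b2 = math.cos(t2), math.sin(t2)
    det = a1 * b2 - a2 * b1
    return ((b2 - b1) / det, (a1 - a2) / det)

def angle_cob(angle_a_deg, f_deg):
    """Angle COB in degrees, given angle DAE and the position of tangent point F."""
    # The tangent points E and D lie at +/-(90 - A/2) degrees from line OA.
    half = 90.0 - angle_a_deg / 2.0
    t_e, t_d, t_f = math.radians(half), math.radians(-half), math.radians(f_deg)
    cx, cy = tangent_intersection(t_e, t_f)  # vertex C
    bx, by = tangent_intersection(t_d, t_f)  # vertex B
    cos_cob = (cx * bx + cy * by) / (math.hypot(cx, cy) * math.hypot(bx, by))
    return math.degrees(math.acos(cos_cob))

# f_deg = 0 is the special case used in the text; the others are arbitrary.
for f_deg in (-60, -20, 0, 35, 70):
    print(f_deg, round(angle_cob(26, f_deg), 6))
```

Every position of F tried here gives 77° to within rounding error, which is consistent with (though of course not a proof of) the uniqueness assumption made above.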
A fourth example of solution of a general problem by means of a single special case is the following:

In Fig. 9-12 angle BAD = 20°, AB = AC, and AD = AE. Solve for angle CDE.

Stop reading and try to solve the problem, making use of the method of special case.

FIGURE 9-12 Geometry problem.

This problem at first seems to lack enough data to solve it. The given information in the problem is not adequate to specify a single unique triangle with these properties. There is a large variety of different (noncongruent) triangles consistent with the information that AB = AC, AD = AE, and angle BAD = 20°. Furthermore, these triangles are not even similar to one another; that is, the angle DAE can assume a variety of different values. Clearly, the absolute lengths of the sides AB, AC, AE, and AD are not relevant to determining the angle CDE. However, it is surprising to be told implicitly by the problem that the angle DAE is irrelevant to the value of the angle CDE. Assuming that within some range of values the magnitude of angle DAE is irrelevant to the magnitude of the angle CDE, then how might we go about solving for the magnitude of angle CDE? Stop reading and try to solve the problem, if you have not done so already.

Clearly, we can solve the problem for a special case of the value of angle DAE and determine the magnitude of the angle CDE. According to the implicit information stated in the problem, we should obtain the same solution for the angle CDE irrespective of our choice of angle DAE (over some range). Thus, let us pick angle DAE = 20°. Now stop reading and solve for angle CDE in this special case, if you have not already solved the problem.
If angle DAE = 20°, then angle BAC = 40° and angle ABD = ACD = 70°, since these angles are opposite the equal sides of an isosceles triangle and there are 180° in a triangle. By the same reasoning, angle ADE = angle AED = 80°. Thus, angle DEC = 100°, and therefore angle CDE = 180° - 100° - 70° = 10°. It turns out that an infinity of other values substituted for the angle DAE yield the same value (10°) for angle CDE, which is the solution to the problem. In general, we could simply substitute some arbitrary value $y$ for angle DAE and solve the set of equations to determine the value of angle CDE in a manner that would be independent of $y$. However, it is considerably simpler to solve the problem by choosing a single special case, since the given information implies that the solution to the special case is equivalent to the solution to the general problem.

GENERALIZATION

Just as it is sometimes useful to solve a special case prior to working on a more general problem, it is also frequently useful to do the opposite and generalize the problem somewhat. Generalization plays a role in problem solving in at least three different ways.

First, as a necessary part of problem solving, we usually abstract from a problem certain properties belonging to a more general class of problems and thus relevant for determining the previously established principles needed for solving our present problem.

Second, after we have solved the problem, it is often useful to consider whether we could generalize a solution from it to a wider class of problems in order to derive a more general conclusion or one or more corollaries of the principle established in the problem we just solved.
Third, occasionally (though in my experience not too frequently), it may be useful to pose and attempt to solve a more general problem prior to working on the current problem, even when the solution to that more general problem is not yet known to us.

The first role that generalization plays in problem solving has really already been discussed in Chapter 3 in connection with the representation of information. Recall that a critical aspect of solving many problems consists in retrieving from memory the relevant previously established relations and principles with common properties needed to solve the present problem. It may be that the current problem is really a special case of a general class of problems for which we already know a simple rule for solution. For example, if the present problem is the linear equation $2x + 5 = 13$, we know that the solution to this particular linear equation can be achieved by using the general methods for solving any linear equation of the form $ax + b = c$. Similarly, if the equation were a quadratic such as $7x^2 + 2x - 4 = 0$, we have a formula for solving any equation of the form $ax^2 + bx + c = 0$. A broad range of higher order equations can be solved by certain types of numerical methods. If we have a particular equation that falls within the scope of a numerical method, we know we can apply this method to solve the particular problem.
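To make the point concrete, the sketch below treats a particular equation as an instance of its general form. The function names are chosen here for illustration; the linear example is the one from the text, while the quadratic $x^2 - 5x + 6 = 0$ is a hypothetical example picked only because its roots are whole numbers.

```python
import math

def solve_linear(a, b, c):
    """Solve a*x + b = c for x; a must be nonzero."""
    if a == 0:
        raise ValueError("a must be nonzero for a linear equation in x")
    return (c - b) / a

def solve_quadratic(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 by the quadratic formula, in ascending order.

    Returns an empty tuple when there are no real roots; a repeated
    root appears twice.
    """
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation in x")
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))

print(solve_linear(2, 5, 13))      # the text's example, 2x + 5 = 13
print(solve_quadratic(1, -5, 6))   # illustrative example, x^2 - 5x + 6 = 0
```

Once the particular problem has been recognized as a member of the general class, solving it reduces to substituting its coefficients into a rule that is already known.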
In a geometric context, if a problem gives two sides of a right triangle and we are asked to solve for the third side, we know a general method that is applicable to solving all such problems, namely, use of the Pythagorean Theorem, $c^2 = a^2 + b^2$. If we are given a problem in which we must determine the number of combinations of seven things taken four at a time, we need only retrieve the formula for the number of combinations of $N$ things taken $M$ at a time and substitute in the appropriate values for $N$ and $M$ in order to solve the problem.

Ordinarily, to solve problems we must combine use of more than one previously established principle. Thus, in all proof problems, whether algebraic, geometric, or logical, the proof invariably requires a sequential application of several previously established principles. In a story-algebra problem, first the methods of representing the information algebraically must be applied, then the methods appropriate for solving whatever algebraic equations are derived from the story.

The examples of generalization in this most important context could be extended indefinitely. Suffice it to say that, in this sense, generalization plays an enormously important role in problem solving. However, as discussed in Chapter 3, this use of the method of generalization depends critically on the degree of understanding you have of the previously established principles in the areas relevant to the current problem. A few general principles regarding representation of information are discussed in Chapter 3 and Chapter 10; however, the field is just too vast and thus is outside the scope of this book.
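The two retrieval examples above (the Pythagorean Theorem and the combinations formula) can be sketched in a few lines. The function and parameter names are assumptions made for this illustration, the legs 3 and 4 are a hypothetical example, and the code handles only the case in which the two given sides are the legs of the right triangle.

```python
import math

def hypotenuse(a, b):
    """Third side c of a right triangle with legs a and b, from c**2 = a**2 + b**2."""
    return math.hypot(a, b)

def combinations(n_things, taken):
    """Number of combinations of n_things taken `taken` at a time: N! / (M! (N - M)!)."""
    return math.factorial(n_things) // (
        math.factorial(taken) * math.factorial(n_things - taken)
    )

print(hypotenuse(3, 4))     # hypothetical legs 3 and 4
print(combinations(7, 4))   # seven things taken four at a time, as in the text
```

Substituting 7 and 4 into the combinations formula gives 7!/(4! 3!) = 35.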
The second role of generalization in problem solving has little use for a student in a course but is frequently valuable for a mathematician or scientist solving a new problem to see if the solution can be generalized to a larger class of problems. Along the same line, we may try to derive some additional consequences as relatively straightforward corollaries of the solution to the present problem. For example, if we had established that the two diagonals of a rectangle were equal, we might ask whether this result could be generalized to a larger class of situations. It would then be relatively straightforward to notice that the solution generalizes to diagonals connecting equal and parallel sides of any polygon, including regular hexagons or octagons and parallelograms.

Another example in a geometric context is provided by the theorem that the alternate interior angles formed by a transversal intersecting two parallel lines are equal. This result easily generalizes to establish that the other pair of alternate interior angles are equal and that both pairs of corresponding exterior angles are equal as well.

A final example, again in a geometric context: Assume that we have already established that an inscribed angle has half as many degrees as its intercepted arc. From this result it is relatively trivial to show that the angle formed by a tangent and a chord meeting it at the point of contact has also half as many degrees as its intercepted arc. In fact, the latter theorem could be thought of as simply being a limiting case of the former theorem.

The third possible role of generalization in problem solving is, in a sense, the inverse of the previously described role of the method of special case; namely, it might facilitate solution of a specific problem to formulate a more general problem that had not been previously solved.
Then we might either solve the more general problem or, in any event, work on the solution of the more general problem for a time, before going back to working on the specific problem. Polya (1957, pp. 108-109) argues that this problem-solving technique is quite useful and he gives the following example: The problem is to find a plane that passes through a given line and bisects the volume of a given octahedron. Polya asserts that it would be useful to formulate the more general problem of finding a plane that passes through a straight line and bisects the volume of any solid with a center of symmetry. The solution to this problem is fairly obvious, namely, a plane determined by the given line and the center of symmetry of the solid. Since an octahedron is a special case of a solid with a center of symmetry, the original, specific problem is solved.

Polya asserts that the value in formulating the more general previously unsolved problem is that it can focus the problem solver's attention on the necessary properties in the original problem that must be used in order to solve it. As Polya himself points out, however, the primary function of generalization was in the formulation of the more general problem. If we had generalized the problem in an inappropriate way (that is, using some property that was not in fact central to the solution of the original problem), then the formulation of the general problem would likely have been of no help and might even have been a hindrance in the solution of the original problem.

A two-dimensional analog of the previous example would be to determine the line that passes through a given point and bisects the area of, say, a given square. A generalization of this problem would be to determine a line that passed through a given point and bisected the area of a given plane figure with a center of symmetry.
Again, formulating the more general problem directly suggests the solution, namely, that the line passes through the given point and the center of symmetry of the figure.

Personally, I am somewhat skeptical of the alleged benefits of trying to solve a more general problem, when the solution to the more general problem is not known to the problem solver. However, I am sure that thinking of possible generalizations of the current problem does often aid the problem solver in realizing all of the properties of the problem, some of which may be the critical properties in order to solve the problem. In this sense, the alleged third role of generalization is very similar to the first role of thinking of generalizations of the problem when the solution of the more general problems is already known. It is a question of representing the important properties or principles that are present in a special problem, and, to do so, an abstraction process is involved. Abstracting the properties from a problem is, in essence, generalization. Thus, once again we see that the role of generalization and the role of representation of information (as discussed in Chapters 3 and 10) are very closely linked and perhaps identical.

10 Topics in Mathematical Representation

As stated in Chapter 2, problems contain information concerning givens, actions, and goals. The first and most basic step in problem solving is to represent this information in either symbolic or diagrammatic form. Symbolic form refers to the expression of information in words, letters, numbers, mathematical symbols, symbolic logic notation, and so on. Diagrammatic form refers to the expression of information by a collection of points, lines, angles, figures, directed lines (vectors), matrices, plots of functions, graphs, and the like.
Often the same information should be represented using a variety of symbolic or diagrammatic notations. In fact, diagrammatic representation is generally labeled; for example, points, lines, and cells in a matrix have symbols attached to them in the diagram. Of course, problems are stated originally in some form, often relying heavily upon verbal language. The first step in solving such a problem is to translate from the representation given explicitly or implicitly in the original statement of the problem to a more adequate representation.

This chapter is concerned with selected topics in the mathematical or precise representation of information in problems. Although precise representation of the information in a problem is the first step to take in trying to solve a problem, I deferred discussing this important topic to this late chapter of the book for two reasons.

First, although some general statements can be made about the representation of information in a large variety of problems, most of the principles of representation are specific to particular problem areas. Effective representation for problems from some area of mathematics, science, or engineering depends upon knowing centuries of conceptual development in the relevant areas of mathematics, science, and engineering. I doubt that mankind will ever develop a general method for determining what are the useful concepts to define in any particular area. Certainly, no such general principles of how to define good concepts are presented in this book. The best I can do is to present those types of concepts and the principles for representing them that have proved the most useful in a wide variety of areas of formal problem solving.
This is what is done in the present chapter, without any claim to completeness (which would be preposterous) and with only minimal claim to logical organization of the concepts and the principles of mathematical representation.

Second, although some of the principles of mathematical representation are reasonably simple and can be communicated to even the most minimally prepared student, some of the principles discussed in the latter half of this chapter are concerned with concepts from various areas of mathematics with which some readers will be unfamiliar. I hope that these readers will profit from the sections on sets, relations, operations, mappings, functions, and real-valued functions of a real variable. However, it seemed wisest to put this material near the end of the book so as not to discourage readers with less mathematical sophistication.

The material in the latter portion of this chapter is really a brief, simple discussion of selected mathematical topics, largely modern algebra and combinatorial mathematics. This material is primarily intended for students who have some background in these topics in college, high school, or grade school new math courses. For such students, these sections are intended as review of the relation of certain mathematical concepts to the general methods of problem solving discussed in this book. For students with no background in set theory, modern algebra, and combinatorial mathematics, these sections may be rather hard going and require considerable study. Such students should consult regular mathematics books concerned with these topics, rather than try to master the material on the basis of the rather brief discussion presented here.
The primary basis for selecting the mathematical concepts discussed in this chapter is their applicability to the puzzle-type problems characteristic of recreational mathematics, which constitute the primary example problems in this book. A large subclass of all recreational mathematics problems consists of "insight" problems, where a major difficulty may be to recognize the important concepts for representing the information in the problem.

REPRESENTATION ON PAPER OR IN THE HEAD

This section has a simple message: use pencil and paper extensively when you are trying to solve problems. Of course, the primary representation of information is in your head, but virtually all problems can be solved faster by representing some of the information on paper (or a blackboard or other writing surface) than they can without a written graphic aid. Written representation of information is useful for both verbal symbolic information and visual diagrammatic information.

To try to solve problems without using pencil and paper is to subject yourself to an unnecessary handicap. Although an occasional problem may be solved faster purely "in the head," the vast majority of all problems will be more quickly solved by representing information on paper at an early stage in working on the problem. No one can say for sure why this is so, but there are at least four plausible reasons.

First, writing down the components of a problem focuses your attention on the need to give names (symbols, diagrammatic representation) to each of the important concepts in the problem.

Second, it automatically draws your attention to the information stated in the problem as you attempt to represent that information on paper.
Third, as you derive inferences or get to intermediate stages in the solution of the problem, writing aids your memory for these inferences or intermediate stages at later stages in the solution of the problem. After working on a problem for some time, it is easy to forget some of the given information or inferences you drew from the given information, and some of this information may be helpful later. Having this information written down allows you to use rapid visual scanning to jog your memory for prior concepts and facts that might usefully be combined with the concepts and facts to which you are currently paying attention.

Fourth, problems that involve tables or matrices of information are especially difficult to retain as a visual image purely in the mind. Such information is very efficiently represented by means of a table written on paper. For an example of the importance of constructing tables to represent information, see the Smith, Jones, Robinson problem in Chapter 7. Similar conclusions apply to graphs and other figures, which may be difficult to accurately imagine and remember purely mentally, without graphic aids.

Whatever the reason, experience indicates that pencil and paper representation of information is very useful in problem solving. So do not be lazy. Always have pencil and paper ready when you start to work on problems, and make extensive use of them through all stages of problem solving.

DIAGRAMMATIC REPRESENTATION

When a problem in some way involves spatial concepts (points, lines, angles, directions, vectors, surfaces or plane figures, solids, contiguity, connectedness, inside, outside, around), diagrammatic representation may be an extremely useful aid to symbolic representation, whether verbal, logical, or algebraic.
Even when the problem does not seem to involve any spatial concepts, it sometimes happens that you can form an analogy between the concepts in the problem and spatial concepts, so that you could draw a diagram that might be of some aid in solving the problem. For example, overlapping circles might be used to represent overlapping sets, points to represent elements of a set, and sets of arrows to represent mappings from one set to another.

Verbal symbolic representation is probably somewhat more important than visual diagrammatic representation in problem solving and in abstract thinking in general. The communication of the givens, operations, and goals of a problem is largely in verbal symbolic terms. Even when we employ diagrams in the solution of problems, they are usually labeled; that is, symbols are attached to the points, lines, and angles. For example, in solving for the lengths of lines or the magnitudes of the angles between lines in geometric figures, we invariably make extensive use of symbols attached to various points, line segments, or angles in the diagram (see Fig. 10-1).

FIGURE 10-1 A diagrammatic representation of the spatial information in some geometric problem.

Furthermore, all the spatial information represented by a diagram like Fig. 10-1 can be represented symbolically without having to employ diagrammatic representation. For example, the spatial information represented in Fig. 10-1 can be represented symbolically as follows: lines a, b, and h meet at common vertex B, lines a and d meet at vertex A, lines d, h, and c meet at vertex D, lines b and c meet at vertex C, lines c and d are collinear, and line h is perpendicular to lines d and c. If we wished to suppress the symbols for line segments, we could represent lines by unordered pairs of points, for example, (A, B) for line a.
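A purely symbolic encoding of this kind is easy to manipulate mechanically. The sketch below is only an illustration: the dictionary layout and the function name are assumptions made here, and the incidence data are copied from the verbal description of Fig. 10-1 above. It derives the "unordered pair of points" form of each line from the list of lines meeting at each vertex.

```python
# Lines meeting at each vertex, as stated in the description of Fig. 10-1.
LINES_AT_VERTEX = {
    "A": {"a", "d"},
    "B": {"a", "b", "h"},
    "C": {"b", "c"},
    "D": {"d", "h", "c"},
}

def lines_as_point_pairs(lines_at_vertex):
    """Map each line symbol to the unordered set of vertices it passes through."""
    endpoints = {}
    for vertex, lines in lines_at_vertex.items():
        for line in lines:
            endpoints.setdefault(line, set()).add(vertex)
    # frozenset makes the pair unordered and hashable.
    return {line: frozenset(points) for line, points in endpoints.items()}

pairs = lines_as_point_pairs(LINES_AT_VERTEX)
for line in sorted(pairs):
    print(line, sorted(pairs[line]))
```

Line a comes out as the pair {A, B}, in agreement with the text; the relations of collinearity and perpendicularity would need to be recorded separately, since they are not implied by the incidence data.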
By adopting some conventions regarding symbolic representation of spatial or geometric information, the above symbolic representation can be shortened considerably. For example, let the unordered sets of lines meeting at different vertices (points) be represented as follows: (a, b, h), (a, d), (d, h, c), (b, c). The fact that d and c are collinear (K) and h is perpendicular ($\perp$) to them could be represented by something like $d \; K \; c$ and $h \perp d$ and $c$. So there is nothing uneconomical about the symbolic representation in terms of time to write the information.

However, to say that, compared to symbolic representation, diagrammatic representation is less important is not to say it is unimportant. People could probably learn to solve problems involving spatial concepts with purely symbolic representations such as those just presented, but it is doubtful that they would solve them as efficiently. Current evidence indicates that there is a modality of the mind concerned with spatial concepts that functions differently from the modalities of the mind concerned with verbal symbolic concepts.

The symbolic modalities are much more generally useful (for example, even the spatial concepts can be represented in nonspatial symbolic terms), but it is very likely that the spatial modality of the mind is particularly well suited to reasoning about spatial concepts and problems involving those concepts. Representing the spatial information in a problem in diagrammatic terms probably brings a new part of your mind to work on the problem. Furthermore, that part of your mind is probably very well designed for reasoning regarding the spatial aspects of the problem that are represented in diagrams. Finally, much of your prior knowledge regarding spatial concepts, relations, and so on, is probably stored in the mind's spatial modality.
Since such prior knowledge is often assumed implicitly to be part of the givens in a problem, it is clearly important that you have access to your memory for such information.

SYMBOLIC REPRESENTATION

General Principles of Concept Representation

The simplest and most frequent step in symbolic representation of the information in problems is to choose some symbol (or sequence of symbols) to stand for a concept. The concept represented by a symbol can be anything the human mind can conceive. Let us take the symbol x and examine a few of the many concepts it can represent in various problems. The symbol x can represent any real number or it can represent a particular, but (in the present problem state) unknown, real number. Alternatively, x can range over the integers (..., -2, -1, 0, 1, 2, ...), positive integers, negative integers, rationals, irrationals, complex numbers (y + zi), the elements of some set, subsets of three elements from some given larger set, and so on. The symbol x can be the label of a particular point, line, or figure in a geometric problem, the label of some particular element in a practical construction problem (such as a gate, a piece of a fence, or a stump), or one of the tokens in a puzzle or game (such as ticktacktoe).

Any symbol can be used to represent any concept, subject to one and only one restriction: the same symbol should almost never be used to represent two concepts that are not known to be equivalent throughout the present problem. There are many other principles of effective representation of concepts, but we can violate them without the risk of producing conclusions that contradict the given information in the problem.
We cannot represent by the same symbol two concepts that are not equivalent in a problem, without running a substantial risk of generating contradictions to the given information and deriving incorrect answers.

By contrast, it is perfectly safe to use different symbols for concepts or quantities that may later prove to be equivalent or equal. However, it reduces the load on the memory if you notice such equivalences or equalities before you assign symbols to concepts or quantities, and assign the same symbol to concepts that must be equivalent. Nevertheless, in view of the grave danger involved in mistakenly using the same symbol for nonequivalent concepts, it is best to always use different symbols for different concepts, unless you are absolutely certain that the concepts are equivalent throughout the problem.

Mnemonics and Symbol Conventions

Although you may use any symbol you like to represent any concept, subject to the above-mentioned restriction, people tend to develop habits regarding the types of symbols used for different types of concepts. For example, in mathematics, i, j, k, l, m, and n tend to be used for variables that range over the integers; u, v, w, x, y, and z tend to be used for variables that range over real numbers; p and q are usually used for probabilities (ranging over the real numbers between 0 and 1); and f, g, and h tend to be used primarily to represent unknown functions.

In scientific and engineering problems, symbols chosen to represent different concepts tend to have some easy mnemonic relation to a longer name for the concept in verbal language, usually being the first letter of the name (or the first letter of the key word, if the name of the concept is some phrase containing several words). For example, f and F might be used for forces, A for area, p and P for pressures, t and T for times, r and R for rates, and w and W for work.
Other symbolic naming conventions in science and engineering may be purely arbitrary, such as using Greek letters (such as θ) for angles; but adhering to such conventions (whether mnemonic or arbitrary) makes it easy to remember in any particular problem what concepts the symbols represent. Maintaining consistency across different problems in the types of symbols you use to represent types of concepts brings long-term memory to the aid of short-term memory in recalling what is meant by each symbol you are using in any problem.

Single Symbols

If ease of remembering what a symbol represents is so important for efficient problem solving, why not use two or three letters from the full name or even the full name for the concept? Something like this is occasionally done in problems where the names of two or more concepts have the same first letter; for example, using "to" for total, "te" for tension, and "ti" for time, when all three concept names appear in some problem. This sort of multisymbol naming of a concept is useful in some cases, and these cases will be discussed in subsequent sections on the use of subscripts, vectors, and functions. However, in cases where the mnemonic advantages of using several symbols to represent a single concept are the only advantages, this practice is almost always a mistake. Unless there are a very large number of concept names that all have the same first letter, it is always possible to think of different single symbols to represent each concept, such that each symbol has an adequate mnemonic relation to the concept name so abbreviated.
For instance, you could use some letter in the name other than the first, capital letters as well as small letters (for example, T and t), corresponding Greek letters (for example, τ for t), phonetically similar letter names if you know phonetics (for example, d for t), or change the concept name to some functional synonym (for this problem) that starts with a different first letter (for example, sum for total, force for tension, duration for time).

For the purposes of unique representation in any one problem, it is unnecessary to represent concepts by a string of several symbols (as we do in verbal language), because so few different concepts are involved in any given problem. You do want to maintain a strong mnemonic relation between your chosen symbol and the concept name it abbreviates, because that provides you with access to the information stored in your long-term memory concerning the concept and its previously established relations to other concepts. However, we have just seen that a single symbol is usually adequate to accomplish this relation without a lot of table lookup.

This being the case, there are substantial cognitive advantages to using single symbols to represent concepts in problem solving. It is fair to say that psychology does not know for sure the exact reasons why this is so, but the experience of problem solvers establishes that it is easier to work with single-symbol names for concepts. Many plausible reasons can be given. Single symbols probably place less of a load on short-term and long-term verbal and visual memory. So, for example, it is easier to remember the visual image or verbal statement of an expression, formula, or equation that uses single symbols for each concept than one which uses more symbols to represent each concept.
Single symbols take less time to write on paper and generally have less potential for erroneous writing or reading.

Expressions

To reduce the cognitive and memory load in solving problems, you can also reduce the number of different symbols used to represent concepts by recognizing the relations between concepts right away when you are deciding upon representation. For example, if John is 30 pounds heavier than Bill, let b represent Bill's weight and let b + 30 represent John's weight, to avoid assigning any new symbol at all to represent John's weight. This use of expressions involving a small number of symbols to represent all the concepts in the problem can speed up the solution to the problem by immediately reducing the number of unknowns to those having only nontrivial relations to one another.

However, it must be recognized that you are performing two steps at once: representing the concepts in the problem and expressing some relatively simple relations between concepts. Trying to do these two steps at once increases the probability of error, though when performed successfully, it usually allows you to solve the problem faster. My advice is to try to combine these two steps and use expressions to represent concepts. However, if you find yourself making lots of errors, go back to the more compulsive and systematic procedure of first assigning symbols to all different and important concepts in the problem, and only then start expressing the relations between these concepts, for example, by equations such as j = b + 30, where j represents John's weight.

Subscripts

In a problem with many different quantities of the same type (such as many different times, rates, distances, volumes, heights, or radii of circular bases), it is often desirable to use two-symbol codes for representing these concepts.
In these problems, the type of concept can be thought of as a property of some object, activity, or object-performing-an-activity. One of the two symbols in the two-symbol code is used to represent the property and the other symbol to represent the object, activity, or object-performing-an-activity. In addition, the mnemonic convention is that the property is represented by the main symbol, with the object, activity, or object-performing-an-activity represented by a subscript. So, for example, the height of cylinder A could be represented by $h_A$ and the radius of its base by $r_A$. Similarly, $h_B$ and $r_B$ could be used for cylinder B, and so on.

Subscripts usually appear to the right and somewhat below the main symbol, but they occasionally appear to the lower left of the main symbol. So long as there are no exponents involved in solving a problem, you can use superscripts for the second symbol just as well as subscripts. Superscripts can appear to the upper left or upper right. However, since there is very frequently a danger that superscripts will be confused with exponents, it is good practice to avoid using them.

The only possible excuse for using a superscript occurs in problems where the objects to which a property applies are differentiated along two or more dimensions, and the values of the various dimensions are completely noncomparable. For example, imagine that in some problem you have various complex containers, each of which has several component containers with different shapes and dimensions. Container A might be composed of a cylindrical subcontainer, two cubical subcontainers, and a rectangular subcontainer. There are also complex containers B and C, each composed of subcontainers. How should you represent the volume of each subcontainer in container A, for example?
For ease in remembering what symbols represent what and the associated ease in retrieving formulas for volumes from your long-term memory, you might try using a notation such as the following: Let $V_c^A$ be the volume of the cylindrical subcontainer of A; let ${}_1V_s^A$ be the volume of the first cubical (square base) subcontainer of A; let ${}_2V_s^A$ be the volume of the second cubical subcontainer of A; let $V_r^A$ be the volume of the rectangular subcontainer of A.

You encounter notations like the above, with both left and right subscripts and a superscript (or even two!), but I think they are poor notations. Superscripts are dangerous because they can be confused with exponents. If a superscript is put to the upper left of the main symbol, it eliminates the possibility of its being confused with an exponent for the main symbol; but it is then apt to be mistaken for an exponent for the symbol to the left in a multiplication problem (for example, $x^A V$ for $x$ times ${}^A V$). If you are always careful to put the entire three- or four-symbol complex into parentheses, for example, $({}_2V_s^A)$, you will avoid these confusions. However, you will then have a complex symbol with as many as six component symbols; indeed, even four component symbols are too many.

In cases like the above, you should probably scrap the entire effort and use a single letter for each concept or, at most, a two-symbol code for each concept. If you can easily think of some single symbols that have some reasonably good mnemonic relation to the concepts they represent, use such symbols. Or, failing this, make an arbitrary assignment of single symbols to concepts and write down the assignment in a table or diagram, for consulting when necessary. The time saved in the repeated writing of the symbols in various statements or equations usually more than compensates for the extra time spent in table or diagram lookup.
Furthermore, it is rather hard to visualize, verbalize, or otherwise think about such complex symbols as the ones in the above example.

Multiple Subscripts

One important exception to the above advice regarding complex symbols is the case where you have an entire matrix (two-dimensional or higher dimensional) of objects or activities, each entry of which has one or more properties. In such cases, use multiple subscripts (possibly separated by commas), all on the lower right of the main symbol (for example, $x_{ijk}$ or $x_{i,j,k}$). Such cases arise frequently in statistics, where $x_{ijk}$ might represent the wheat yield on the kth plot of land, subjected to the ith value on one treatment dimension (for example, the amount of some kind of fertilizer), and subjected to the jth value on another treatment dimension (for example, the amount of water).

Why is use of a complex symbol with multiple subscripts recommended in the statistics example and not in the previous example? In the statistics example there are a number of dimensions, and every combination of values on every dimension (that is, every entry in the matrix) has a defined value of the property in question (for instance, wheat yield). In the problem concerning volumes of complex containers, every complex container could have different shapes of subcontainers and different numbers of each shape. Suppose you defined dimensions like the shape of a subcontainer; the complex container to which the subcontainer belonged; and whether it was the first, second, and so on, subcontainer of this shape to be a part of the complex container. You would then have lots of cells in the matrix with no objects corresponding to them in the problem (empty cells in the matrix). This is more trouble than it is worth.
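The multiple-subscript bookkeeping in the wheat-yield example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested list `yields`, its made-up integer values, and the helper names `total_over_plots` and `grand_total` are hypothetical and do not come from the example itself. Python indexes from 0, so `yields[i][j][k]` plays the role of $x_{ijk}$ with each subscript shifted down by one.

```python
# Hypothetical data: yields[i][j][k] is the wheat yield on plot k under
# fertilizer level i and water level j (values invented for illustration).
yields = [
    [[31, 29], [36, 34]],   # fertilizer level 0: water levels 0 and 1
    [[40, 42], [48, 46]],   # fertilizer level 1: water levels 0 and 1
]


def total_over_plots(x, i, j):
    """Sum x[i][j][k] over the plot subscript k for fixed i and j."""
    return sum(x[i][j])


def grand_total(x):
    """Sum x[i][j][k] over all three subscripts."""
    return sum(value for plane in x for row in plane for value in row)


print(total_over_plots(yields, 1, 0))
print(grand_total(yields))
```

Because every combination of subscript values has a defined entry, totals over any one subscript can be written without consulting a table of symbol meanings.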
When you actually have a large number of objects in some multidimensional matrix, there is really no feasible alternative to using multiple subscripts. Furthermore, in matrix problems like the statistics example, we often do operations such as summing over all the entries in some row or column without ever having to look up in any table or diagram the meaning of each complex symbol. When subscript notation is used, a convenient notation exists for indicating sums or products, for instance:

$$\sum_{i}\sum_{j} x_{ijk} \quad \text{and} \quad \prod_{i}\prod_{j} x_{ijk}$$

Subscript notation is always called for in problems where multiple sums or products of this type are likely to be used in the solution. Where such multiple sums and products or some other computations or relations involving the subscripts are not likely to appear anywhere in the solution, it is questionable whether you should ever use a symbol with more than one subscript.

Example Problem

Tom, Dick, and Harry mow lawns in the summer to earn money. They each have a lawn mower, and one Saturday they decide to mow a 5,900 square foot lawn together, using all three lawn mowers. Tom mows 70 square feet per minute, Dick 50, and Harry 40. Dick and Harry start mowing the lawn at the same time, but Tom has trouble starting his mower and is delayed for 30 minutes. All three boys stop mowing at the same time, when the lawn is finished. How long does Tom mow?

In solving this problem, the principal step is to represent the information in algebraic notation and set up the equations. After this, the sequence of algebraic actions (operations) is trivial. Stop reading, and represent the information in this problem, using the principles discussed in this chapter; then solve the problem.
There are several steps involved in representing the information, and you should be aware of them, even though you may be able to solve such problems quickly and easily. The same types of steps are involved in all story-algebra problems, and similar representational steps are involved in many other types of problems as well.

First we have to represent the unknown quantities for which we are to solve. Here that means having some expressions represent (stand for) the times that Tom, Dick, and Harry work. It is economical and an aid to visual and verbal memory to choose a single symbol (often a letter) to stand for an unknown quantity. You may choose any symbols you like for the unknown quantities represented, provided that you do not use the same symbol to represent quantities that might not be equal. It is perfectly safe to use different symbols for quantities that may later prove to be equal. However, it aids the memory if you notice such equalities before you assign symbols to quantities and assign the same symbol to quantities that must be equal. For example, in the present problem, we can use t for the time that Dick mowed and also for the time that Harry mowed. We know that these times are equal since Dick and Harry start and stop at the same time. It would place an unnecessary strain on the memory to use $t_1$, $t_2$, and $t_3$ to stand for the mowing times of Tom, Dick, and Harry, respectively, though there is no mathematical reason why numerical subscripts cannot be used for these purely nominal or naming purposes. Note that there is a good mathematical reason for not choosing to represent Tom's mowing time by the symbol 8, because all number symbols, including 8, are already implicitly given as concepts in a story-algebra problem, and Tom's mowing time is not known to be equal to 8.
This sort of objection does not apply to the use of numbers in a purely nominal way in subscripts, and, of course, nominal numerical subscripts are used frequently in mathematics, science, and engineering. Often the problem gives only the names "first force" or "second force," and in such cases the obvious representation is $f_1$ and $f_2$.

In problems with many unknown quantities, it is desirable, for the same reason of minimizing the load on human memory, to choose letters that have easy mnemonic relations to the full names of the quantities represented. For example, we choose t or T for time quantities and r or R for rate quantities. If there are many different quantities of the same type, such as several different times, then it may be best to represent them by a two-symbol code, such as $t_T$, $t_D$, and $t_H$, where the t indicates a time and the subscript indicates which time. The subscript should also have some easy mnemonic relation to the full name of the quantity represented. In the present problem, we might initially have chosen $t_T$, $t_D$, and $t_H$ to stand for the times that Tom, Dick, and Harry mow, though, as already noted, this is unnecessary.

Now here is the solution. This is a simple work-rate problem. The primary equation to use in such problems is that work equals the sum of all the rate-times-time components. In this problem, that means setting up an equation such as

$$70(t - 30) + 50t + 40t = 5{,}900$$

where

t = time Dick and Harry mow
t - 30 = time Tom mows

Collecting terms gives 160t - 2,100 = 5,900, so 160t = 8,000. Solving the above equation therefore gives t = 50, and Tom mows for 20 minutes.

SOME IMPORTANT MATHEMATICAL CONCEPTS

Ordered and Unordered Pairs

Ordered pairs without replacement

Each of the possible permutations of m things taken two at a time is just another name for an ordered pair of elements (i, j), such that both i and j are members of the set of m things and i and j are not the same thing or element of the set.
This is sampling twice from a set of m elements without replacement of the element already sampled. Ordered pairs of this type (permutations) are frequently involved in problems.

For example, consider a problem in which some group has the ritual of everyone kissing everyone else on the forehead, and you are supposed to determine the number of people in the group from the number of kisses or vice versa. Kissing on the forehead is represented by an ordered pair of persons in which the first member of the pair is the kisser and the second member is the person kissed. Having recognized that this is an ordered pair (permutation) problem, you can solve the problem in a manner similar to the way you solved other problems, namely, by determining how many ways there are to fill each position in the ordered pair. In this problem, there are m ways to fill the first position, and, with it filled, there are m - 1 ways to fill the second position. Hence, there are m(m - 1) ways to fill both positions (with different elements). Thus, the m persons exchange m(m - 1) forehead kisses. Of course, to determine m from m(m - 1), you have to solve a quadratic equation, but in the average problem of this type, that would be very simple.

Ordered pairs with replacement

A known or unknown point in a plane is often best represented by an ordered pair of symbols representing its distances from each of two (usually perpendicular) axes, though, of course, in many problems a point can be represented by just a single symbol such as A or a dot on a page. However, frequently the representation of a point in a plane should be by an ordered pair of symbols: an ordered pair of numbers for a known point, an ordered pair of letters for an unknown point.
Since the values of the two coordinates can be equal, the representation of a point in a plane by an ordered pair is an example of sampling twice with replacement from the entire population of coordinate values (whether finite or infinite). Sampling with replacement means that when you have chosen the value for the first coordinate, you put that value back into the population so that it can be drawn again as the value for the second coordinate. Thus, if there were m possible coordinate values, there would be $m^2$ possible points, that is, $m^2$ ordered pairs of coordinate values.

Unordered pairs without replacement

Although forehead kisses are represented by ordered pairs of persons, lip kisses (assuming mutual kissing) are represented by unordered pairs. That is, if A kisses B, B is assumed also to be kissing A. Thus, there is no basis for distinguishing between (A, B) and (B, A). This is a combinations problem rather than a permutations problem. As we discussed in Chapter 9, the number of combinations of m things taken two at a time is m(m - 1)/2. We derived this figure by reasoning as follows: The number of permutations is m(m - 1), and there are two permutations for every combination (two ordered pairs for each unordered pair). Thus, we should divide the number of ordered pairs (permutations) by 2 to get the number of unordered pairs (combinations).

A line segment is determined by its two end points. Often a line (segment) will be defined as an unordered pair of (distinct) points, with every different unordered pair of points in the total set of points representing a different line. Notice that a line is indeed an unordered pair of points, not an ordered pair, in most cases.
However, it is perfectly possible to have directed line segments in a problem (that is, lines with arrows on one end, or vectors), in which case the line (A, B) is different from the line (B, A), where A and B are two different points in the set of points.

The above two examples of unordered pairs are instances of unordered pairs where the sampling is without replacement. A person is not assumed to be able to kiss himself on the lips. A line segment is determined by two end points that are distinct, that is, two different points.

Unordered pairs with replacement

We can also obtain unordered pairs by sampling from a population with replacement. An example might be the distinct combinations obtained by throwing two dice, which might produce these results: 6-6, 6-5, 6-4, 6-3, 6-2, 6-1, 5-5, 5-4, . . . , 2-2, 2-1, 1-1. To compute the number of distinguishable throws of two dice, we reason that the order of the two dice is not important and thus 6-5 is the same outcome as 5-6.

Offhand you might think that the number of unordered pairs obtained by sampling with replacement would be equal to the number of ordered pairs sampling with replacement divided by 2, as was the case for sampling without replacement (permutations and combinations). However, this is not the case. Most of the unordered pairs of outcomes obtained by sampling with replacement do indeed have two distinct ordered-pair counterparts, but there is a subset of the unordered pairs each of which has only a single ordered-pair counterpart. These latter pairs are 6-6, 5-5, 4-4, 3-3, 2-2, and 1-1.
In general, if you are sampling with replacement two times from a population of m elements, the number of different unordered pairs will be

$$m + \frac{m^2 - m}{2} = \frac{m(m + 1)}{2}$$

This result is obtained by reasoning as follows: exactly m of the unordered pairs are of the form i-i, where i = 1, . . . , m. The remaining $m^2 - m$ ordered pairs provide exactly two ordered-pair counterparts for each remaining unordered pair. Thus, the quantity $(m^2 - m)$ should be divided by 2 to get the number of such unordered pairs, and m should be added to this total to get the total number of different unordered pairs.

Systematic listing of unordered pairs

Occasionally we need to list all the unordered pairs that can be obtained by sampling, either with or without replacement, from some population. The efficient way to accomplish such a listing is to put all of the elements in the population into an ordering, whether they have any natural ordering or not. Having ordered the elements from 1 to m, we can then list all of the unordered pairs by proceeding as follows: If the unordered pairs are being obtained by sampling with replacement, we take each element from the population and pair it with itself and every element below it in the ordering. Thus, we would obtain pairs such as 5-5, 5-4, 5-3, 5-2, 5-1, 4-4, 4-3, 4-2, 4-1, 3-3, 3-2, 3-1, 2-2, 2-1, 1-1. If the unordered pairs are being obtained by sampling without replacement, we take each element in the population and pair it with each of the elements below it in the ordering. Thus, we would have listings such as 5-4, 5-3, 5-2, 5-1, 4-3, 4-2, 4-1, 3-2, 3-1, 2-1.
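The listing procedure just described can be sketched as a short Python function. The function name `unordered_pairs` and its `with_replacement` flag are illustrative choices rather than anything from the text; the elements are assumed to be labeled 1 to m, as in the paragraph above.

```python
def unordered_pairs(m, with_replacement):
    """List unordered pairs drawn from the elements 1..m.

    Each element is paired with every element below it in the ordering,
    and also with itself when sampling is with replacement.
    """
    pairs = []
    for a in range(m, 0, -1):
        # Start at a itself (with replacement) or just below it (without).
        start = a if with_replacement else a - 1
        for b in range(start, 0, -1):
            pairs.append((a, b))
    return pairs


m = 5
with_rep = unordered_pairs(m, with_replacement=True)
without_rep = unordered_pairs(m, with_replacement=False)

# The counts agree with the formulas m(m + 1)/2 and m(m - 1)/2.
assert len(with_rep) == m * (m + 1) // 2
assert len(without_rep) == m * (m - 1) // 2

print(with_rep)
print(without_rep)
```

For m = 6 with replacement, the same function lists the 21 distinguishable throws of two dice.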
Importance of ordered and unordered pairs

Ordered and unordered pairs are very common concepts in problems, and you should be alert for the possibility of representing concepts as ordered or unordered pairs of other concepts, frequently an important step in solving the problem. In a sense, this representation involves your recognizing a relation between different concepts and incorporating that relation into your representation of concepts in the problem, which is similar to representing concepts by expressions, as we discussed previously. You should also be careful to note whether the ordered and unordered pairs are being obtained with or without replacement, for, as stated earlier, it makes a considerable difference in the number of such pairs. Often you will realize that some concept can be represented by a pair of other concepts before you realize whether the pair should be considered an ordered pair or an unordered pair and whether the sampling is with or without replacement. However, by being explicitly aware of the distinctions, you will try to decide whether the pair is ordered or unordered and whether the sampling is with or without replacement, before trying to use the concept in solving the problem.

Ordered and Unordered Sets

Ordered and unordered pairs generalize easily to ordered and unordered sets of more than two elements. So we can have an ordered or unordered set of three elements (A, B, C), four elements (w, x, y, z), and so on.

Ordered sets without replacement

Each permutation of m things taken n at a time (n ≤ m) is an ordered set of n elements. In getting any particular permutation, we are sampling n times without replacement from a set of m elements. Thus, there are m possible ways to fill the first position (m possible elements that could be selected first), (m - 1) ways to fill the second position, . . . , and (m - n + 1) ways to fill the nth (last) position, or

$$m(m - 1) \cdots (m - n + 1) = \frac{m!}{(m - n)!}$$

different ways to select all n elements (assuming all m elements are distinct).

An example of a permutation problem (ordered sets, sampling without replacement) is as follows:

A gym teacher wishes to put on a balancing demonstration in which one of the stunts will be to have four boys stand on each other's shoulders in a single tower. Out of the class of 20 boys, the gym teacher wishes to select the most stable tower of four boys. To do this he plans to try each possible tower of four boys once and time how long they are able to balance successfully on each other's shoulders without falling over. How many such towers of four boys must the gym teacher investigate?

Stop reading and try to solve the problem.

Since there are 20 boys in the gym class and a tower of four boys constitutes an ordered set of four elements sampled from the class without replacement, this is a permutations problem. Therefore, the number of possible towers is 20!/(20 - 4)! = 20!/16! = 116,280.

Ordered sets with replacement

A point or vector in n-dimensional space can be represented by an ordered set of its n coordinates $(x_1, x_2, \ldots, x_n)$. Such an ordered set of n elements is obtained by sampling with replacement from a population of, say, m possible coordinate values exactly n times. The number of possible ordered sets obtained by sampling n times from a population of m elements is equal to $m^n$.

Unordered sets without replacement

A triangle is an unordered set of three different points (sampling three times without replacement from the set of all points). A quadrilateral is an unordered set of four different points in a plane.
Each combination of m things taken n at a time (n ≤ m) is an unordered set of n elements such that no two of the elements are identically the same element (that is, one is sampling n times without replacement from a set of m elements). The number of combinations of m things taken n at a time is simply the number of permutations of m things taken n at a time, divided by the number of different permutations for the same combination of n elements. Since there are n! different permutations for each combination of n elements, there are m!/[n!(m - n)!] combinations of m things taken n at a time.

An example of a combinations problem is as follows:

How many different bridge hands (13 cards) can be obtained by dealing 13 cards out of a standard 52-card deck?

Dealing 13 cards from the standard 52-card deck is sampling without replacement. The order in which the cards are dealt to someone makes no difference in defining a bridge hand. Therefore, this is a combinations problem, that is, a problem involving an unordered set obtained by sampling without replacement. Thus, the number of bridge hands is 52!/[13!(52 - 13)!] = 52!/(13! 39!).

Unordered sets with replacement

Thus far we have discussed ordered sets obtained by sampling with and without replacement and unordered sets obtained by sampling without replacement. Computing the number of unordered sets obtained by sampling with replacement is a far less trivial problem. The problem has an extremely elegant solution, which I found in Feller (1957, p. 38) and which I think provides a good example of how clever representation of the information in the problem can facilitate its solution.
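As a rough numerical check on the three counting formulas discussed so far, the following Python sketch evaluates them for the tower and bridge-hand examples. The function names are hypothetical labels chosen for this illustration; only `math.factorial` from the standard library is used.

```python
import math


def ordered_without_replacement(m, n):
    """Permutations of m things taken n at a time: m! / (m - n)!."""
    return math.factorial(m) // math.factorial(m - n)


def ordered_with_replacement(m, n):
    """Ordered sets of n elements drawn with replacement from m: m ** n."""
    return m ** n


def unordered_without_replacement(m, n):
    """Combinations of m things taken n at a time: m! / [n! (m - n)!]."""
    return math.factorial(m) // (math.factorial(n) * math.factorial(m - n))


towers = ordered_without_replacement(20, 4)          # towers of four boys
bridge_hands = unordered_without_replacement(52, 13)  # 13-card hands

# Each combination of n elements accounts for n! permutations.
assert ordered_without_replacement(52, 13) == bridge_hands * math.factorial(13)

print(towers)
print(bridge_hands)
print(ordered_with_replacement(6, 2))  # ordered outcomes of two dice
```

Integer division is exact here because each quotient is a whole number of arrangements, so no rounding is involved.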
Thus, we will examine the solution from two viewpoints: that of determining the formula for the number of unordered sets obtained by sampling with replacement, and that of having an elegant example of how to represent information in a particular class of problems.

The basic problem is to determine how many unordered sets we can obtain by sampling n times with replacement from a population of m elements. The solution is obtained easily by considering the m elements of the population to be represented by the spaces between m + 1 boundary markers ordered along a line. That is, we will imagine we have a line with m + 1 interval boundaries marked off along that line defining m intervals, as shown in Fig. 10-2. In the figure, the n elements sampled are represented by circles between various boundary markers. The number of circles between the first and second boundary marker represents the number of times the first element was sampled. The number of circles between the second and third boundary marker represents the number of times the second element in the population was sampled, and so on. With this representation, we can easily compute the number of different unordered sets that can be obtained by sampling n times with replacement from a population of m elements. The two end boundary markers out of the m + 1 boundary markers must remain fixed at the ends. The remaining m - 1 boundary markers and n elements sampled can be rearranged at will. If we consider any particular rearrangement to be obtained by sampling n times without replacement from the population of n + m - 1 elements (that is, by choosing which n of the n + m - 1 interior positions hold circles), then the number of such rearrangements of the lines and circles is easily determined, namely, (n + m - 1)!/[n!(m - 1)!].

| o o | o o o | | o o o o | | o |

FIGURE 10-2 Clever reformulation of the information in the problem of determining the number of unordered sets that can be obtained by sampling n times from a population of m elements.

Thus, we have transformed the problem of sampling n times with replacement from a population of m elements to the problem of sampling n times without replacement from a population of n + m - 1 elements. In both cases, we are trying to compute the number of unordered sets obtained by such a sampling. Since we know the solution to the problem of obtaining the number of unordered sets obtained by sampling without replacement (combinations), we now know the answer to the problem of determining the number of unordered sets obtained by sampling with replacement.

Relations

Relations are labeled connections between concepts. Examples of relational concepts include, among others: father of, brother of, sibling of, descendant of, prior to, less than, equal to, identical to, heavier than, older than, beside, includes, is a member of. Relations can be written as "a R b" (meaning perhaps "a is the father of b") or as R(a, b). In the latter case, R(a, b) is true (equals 1), if and only if the relation R obtains for the ordered pair (a, b), and R(a, b) is false (equals zero), if and only if the relation R does not obtain for the ordered pair (a, b).

Relations can be classified according to whether they satisfy certain properties (axioms). For example, a relation can be reflexive (a R a, for all a in some set), antireflexive (a not-R a, for all a in some set), or neither. Relations can be symmetric or commutative (a R b implies b R a, for all a and b in some set), antisymmetric (a R b implies b not-R a, for all a and b in some set), or neither.
Relations can be transitive (a R b and b R c imply a R c, for all a, b, and c in some set) or not. Relations that are reflexive, symmetric, and transitive form an especially important class of relations known as equivalence relations. "Equal to" and "identical to" are equivalence relations, but so are "has the same weight as," "is the same color as," and "is just as good as."

Some students find it difficult to distinguish the concepts of equivalence, equality, and identity, and indeed the meanings of these concepts are somewhat variable, especially in going from science to mathematics. To get an idea of why we sometimes need to distinguish them, consider the following examples.

Suppose you were thinking about your prospects for immortality, and you imagined that it might be possible to form a complete duplicate of yourself that was the same configuration of molecules as yourself but, of course, used different molecules. Such a duplicate would be equal to you in every respect, but the duplicate would not be identical to you, since the two of you in fact would be two different parts of the universe. Identical twins are considered to be genetically equal, but they are not identical, since they are two different entities. All the molecules of some class are considered to be equal, but since there is more than one such molecule in each class, the molecules of a given class are not identical. A thing is identical only to itself, but it can be equal to all duplicates. Of course, many people feel that it is likely that no two entities can be exact duplicates in every respect, but this is not too important for the definition of equality.
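A rough programming analogy may help fix the three ideas. The sketch below is an illustration only: the `Ball` class, its fields, and the `same_weight` helper are hypothetical. It assumes that Python's `is` operator stands in for identity, `==` on a dataclass for equality in every recorded property, and a one-property comparison for equivalence.

```python
from dataclasses import dataclass


@dataclass
class Ball:
    color: str
    weight: int


def same_weight(a, b):
    """Equivalence in one respect only: the weight property."""
    return a.weight == b.weight


ball_a = Ball("red", 5)
ball_b = Ball("red", 5)    # a duplicate: equal in every recorded property
ball_c = Ball("blue", 5)   # same weight, different color
same_ball = ball_a         # a second name for the very same object

print(ball_a is same_ball)          # identical: two names, one object
print(ball_a == ball_b)             # equal: all recorded properties agree
print(ball_a is ball_b)             # but not identical: two objects
print(same_weight(ball_a, ball_c))  # equivalent in weight
print(ball_a == ball_c)             # but not equal
```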
We can simply say that two things are equal if there is no property that we can currently determine to distinguish them (except of course the fact that they are not the identically same entity).

Two entities may be equivalent to each other in some one respect (one property) without being equal to each other (equivalent to each other in all respects). This is pretty obvious when it is put this way. Two girls can have the same weight (to the nearest pound), have the same shooting percentage in basketball, have the same number of points on a test, and so on. Thus, it is clear that (a) two names of objects that are identically the same really refer to the same object, (b) two names of objects that are equal refer to two different objects that are equivalent in all respects, and (c) two names of objects that are equivalent in some respect are not necessarily identical or equal in all respects.

Sometimes there is no need to distinguish all three types of "sameness" concepts. For example, in real-variable mathematics, where a symbol has only one property, its numerical value, there is no need to distinguish equivalence and equality. Furthermore, our definition of identity is not quite the same intuitively as it is when we discuss objects presumed to exist in the real world. In mathematics not viewed as applying to the real world, two expressions are identically equal ($\equiv$) if and only if they have the same values across all substitutions for free variables (for example, $x^2 - 1 \equiv (x + 1)(x - 1)$). Two expressions are often said to be equal (=) if and only if they have the same values for at least one substitution for free variables (for example, $x^2 + 2 = 6$ for the substitution $x = 2$).

Operations

In Chapter 2, operations were contrasted with givens.
Intuitively, operations were the things that you could do to change the state of the problem and givens were the materials you had to work with (the starting point or initial problem state). You could attach a symbol to each operation that you could perform to change the problem state, but what would that accomplish? For one thing, it might allow you more conveniently to list the alternative actions that could be performed at each node in the state-action tree. However, this is not the primary reason for attaching symbols to operations. The primary reason for using a symbol to represent an operation is that you can formulate problems in which the givens are composed of statements that involve action concepts as well as object or property concepts.

Consider a problem to solve two linear equations in the variables x and y for the values of x and y, for example, $2x + y = 1$ and $x - 6y = 20$. Addition, subtraction, and multiplication operations are indicated in these statements, but the particular actions they indicate (for example, multiply y by 6) are not the actions you take in solving the problem. The actions you take are to add, subtract, multiply, or divide both sides of some equation by the same quantity and to substitute equals for equals.

By contrast, there are problems in which you might take the operations of addition, subtraction, multiplication, and so on, of two quantities as the operations used at various nodes of the state-action tree. For example, water-jar problems (such as the one discussed in Chapter 8) involve presenting several jars that hold different quantities of liquid. One is asked to produce some quantity of liquid that is different from the capacity of any jar. The method of solution involves, in essence, adding and subtracting the capacities of each of the jars.
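Water-jar problems of this kind are small enough that the whole state-action tree can be searched mechanically. The following is a minimal sketch, not a method given in the text: it assumes a state is a tuple of the current contents of each jar, that the allowed actions are filling a jar, emptying a jar, and pouring one jar into another, and that a breadth-first search is an acceptable way to explore the tree. The function names are hypothetical.

```python
from collections import deque


def successors(state, capacities):
    """Yield every state reachable from `state` by one fill, empty, or pour."""
    n = len(state)
    for i in range(n):
        # Fill jar i to capacity.
        yield state[:i] + (capacities[i],) + state[i + 1:]
        # Empty jar i.
        yield state[:i] + (0,) + state[i + 1:]
        # Pour jar i into jar j until i is empty or j is full.
        for j in range(n):
            if i != j:
                amount = min(state[i], capacities[j] - state[j])
                new_state = list(state)
                new_state[i] -= amount
                new_state[j] += amount
                yield tuple(new_state)


def solve_jars(capacities, target):
    """Breadth-first search from all-empty jars to a state holding `target`.

    Returns the list of states along one shortest path, or None.
    """
    start = tuple(0 for _ in capacities)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if target in state:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state, capacities):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# A five-quart jar and a three-quart jar, with one quart as the goal.
path = solve_jars((5, 3), 1)
print(path)
```

Each step along the returned path adds or subtracts jar capacities, which is the sense in which addition and subtraction are the operations at the nodes of the tree.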
Imagine that you have a five-quart jar and a three-quart jar and are attempting to obtain exactly one quart of water. You could fill the three-quart jar, pour it into the five-quart jar, fill the three-quart jar, pour two quarts into the five-quart jar, and have exactly one quart left in the three-quart jar, as required.

In water-jar problems, adding and subtracting quantities are the operations in the state-action tree. In solving equations, adding and subtracting quantities are operations used in constructing statements; adding and subtracting the same quantities on both sides of an equation are operations in the state-action tree. Perhaps there should be two different names to distinguish operations at these two different levels in a problem. In any event, it is important for clear thinking to keep operations at the two levels distinct in your mind.

Mappings and Functions

Imagine that you have the elements of some set A (the argument set), the elements of another set T (the target set), and a set of arrows, each going from one member of the argument set to one member of the target set. Not all of the elements of set A need have arrows going from them. Some elements of set A may have several arrows going from them to different elements in set T. Some of the elements of set T may have no arrows going into them, and some may have several arrows going into them. Any such system of arrows linking two sets is called a mapping.

Although it is easier to explain mappings in diagrammatic (spatial) terms, a mapping can be represented verbally as a set of ordered pairs where the first member of the pair is an element from set A and the second is a corresponding element in set T. The number of ordered pairs in the set is the same as the number of arrows in the diagrammatic representation.
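The ordered-pair representation translates almost directly into code. The sketch below is illustrative only: the element names, the particular arrows, and the helper functions `images` and `preimages` are hypothetical choices, not taken from the text or its figures.

```python
# Hypothetical argument set A and target set T.
A = {"a1", "a2", "a3", "a4"}
T = {"t1", "t2", "t3"}

# Each ordered pair (a, t) stands for one arrow from a in A to t in T.
arrows = {("a1", "t1"), ("a1", "t2"), ("a2", "t2"), ("a3", "t2")}


def images(mapping, a):
    """All elements of the target set reached by arrows leaving a."""
    return {t for (x, t) in mapping if x == a}


def preimages(mapping, t):
    """All elements of the argument set with an arrow going into t."""
    return {a for (a, y) in mapping if y == t}


print(len(arrows))              # number of pairs = number of arrows
print(images(arrows, "a1"))     # several arrows leave a1
print(images(arrows, "a4"))     # no arrow leaves a4
print(preimages(arrows, "t3"))  # no arrow goes into t3
```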
Of course, most mappings of interest are represented more simply by a rule that allows us to compute the element in T associated with each element in A.

One example of a mapping is illustrated in Fig. 10-3, which suggests that sets A and T are completely distinct, that is, have no elements in common (are nonoverlapping). This need not be the case. Sets A and T could be identical sets, either one could be completely included in the other, they could be overlapping (some elements common to both sets, but some elements in each set being not contained in the other set), or they could be nonoverlapping. In short, any set relation is possible for sets A and T. An example of a mapping involving two overlapping sets is shown in Fig. 10-4.

[Figure 10-3. A mapping from set A to set T, where set A and set T are nonoverlapping.]

[Figure 10-4. A mapping from set A to set T, where set A and set T are overlapping. In this case, the sets have two common elements. Sets are enclosed in circles. Common elements are in the overlapping part of the two circles.]

In addition to the relation between the argument set (set A) and the target set (set T) as just discussed, mappings have other properties. A mapping is complete if and only if it is defined over all members of the argument set (that is, all members of set A have an arrow going from them); otherwise it is incomplete. Often we assume that all mappings are complete, since the argument set could always be reduced to those elements for which the mapping is defined, with no loss of information regarding the mapping other than that regarding its incompleteness.

A mapping is single valued (is […] or is a function) if and only if each element in the argument set maps into no more than one element in the target set (that is, there is no more than one arrow going out of each element in set A).
If a mapping is both single valued and complete, then exactly one arrow goes out of each element in set A. The term function is sometimes reserved only for single-valued mappings, but frequently the expression multivalued function is also heard, which indicates that there is not complete consistency in the restriction of the term function to single-valued mappings. An example of a complete and single-valued mapping is shown in Fig. 10-5.

[Figure 10-5. A single-valued mapping from set A to set T. The mapping is also complete since all members of set A have arrows going from them. Note that the mapping is single valued in going from set A to set T but not in the inverse direction, nor is the inverse mapping even complete (arrows do not go into every element in set T).]

A mapping is an onto mapping if and only if every element of the target set has an arrow going into it. Another way to say this is that the inverse mapping (going backward along the arrows from set T to set A) is complete (that is, defined over every element in set T). An example of an onto mapping is shown in Fig. 10-6.

A mapping is one-to-one if and only if every element of the target set has no more than one arrow going into it. Another way to say this is that the inverse mapping (going backward along the arrows from set T to set A) is single valued. Note that a one-to-one mapping need not be onto, just as a single-valued mapping need not be complete. That is, the one-to-one property requires that no more than one arrow go into each element in the target set, whereas the onto property requires that at least one arrow go into each element in the target set. Similarly, the single-valued property requires that no more than one arrow go out from each element in the argument set, whereas the completeness property requires that at least one arrow go out from each element in the argument set.
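The four properties, and their pairing through the inverse mapping, can be stated compactly for finite sets. This is a minimal sketch under the assumption that a mapping is stored as a Python set of (argument, target) pairs; the function names and the sample mapping are hypothetical.

```python
def is_complete(mapping, argument_set):
    """At least one arrow leaves every element of the argument set."""
    sources = {a for a, _ in mapping}
    return set(argument_set) <= sources


def is_single_valued(mapping):
    """No more than one arrow leaves any element of the argument set."""
    sources = [a for a, _ in mapping]
    return len(sources) == len(set(sources))


def is_onto(mapping, target_set):
    """At least one arrow goes into every element of the target set."""
    targets = {t for _, t in mapping}
    return set(target_set) <= targets


def is_one_to_one(mapping):
    """No more than one arrow goes into any element of the target set."""
    targets = [t for _, t in mapping]
    return len(targets) == len(set(targets))


def inverse(mapping):
    """Reverse every arrow."""
    return {(t, a) for a, t in mapping}


# Hypothetical example: complete, single valued, onto, but not one-to-one.
A = {1, 2, 3}
T = {"p", "q"}
f = {(1, "p"), (2, "p"), (3, "q")}

print(is_complete(f, A), is_single_valued(f), is_onto(f, T), is_one_to_one(f))
# Onto means the inverse is complete; one-to-one means it is single valued.
print(is_onto(f, T) == is_complete(inverse(f), T))
print(is_one_to_one(f) == is_single_valued(inverse(f)))
```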
An example of a one-to-one mapping that is not single valued, onto, or complete is shown in Fig. 10-7.

[Figure 10-6. An onto mapping from set A to set T, meaning that the inverse mapping from set T to set A is complete (defined over all elements in set T). The mapping from set A to set T is also complete but is not single valued.]

[Figure 10-7. A one-to-one mapping from set A to set T that is not complete (two elements of set A have no arrows going from them), not onto (one element of set T has no arrow going into it), and not single valued (some elements of set A have more than one arrow going from them to different members of set T).]

Real-Valued Functions of a Real Variable

The most commonly encountered functions are real-valued functions of one or more real variables (real arguments). A statement such as $y = f(x)$ means that an element x is mapped into an element y according to the rule (function) represented by the symbol f. In this section, we are assuming that x and y are real numbers (the ordinary positive and negative integers, fractions, square roots of positive integers, and so on, with which we are familiar). Examples of functions that f could represent include polynomials with known coefficients ($y = 7x + 2$, $y = 4x^3 - 3x^2 + x - 37$), polynomials with unknown coefficients ($y = ax^2 + bx + c$), and trigonometric functions ($y = \sin x + a \tan 3x$).

In some problems, you may know that one variable is a function of one or more other variables, but it may take some problem solving to determine exactly what the function is. In such cases, it is generally helpful to assign some symbol like f to the unknown function and write equations involving the unknown function [for example, $y = f(x)$].
In some problems, you can reduce the number of symbols you have to remember by just writing an expression such as $y = y(x)$, meaning that the variable y is a function of the value of x, but without bothering to give a separate name to the function (separate from the value of the function when x is the argument). This is a useful mnemonic trick in simplifying notation in problems where there is no possibility of confusing the concept of the function (f) with the concept of the dependent variable (y). However, when such confusion is possible, this trick should be avoided.

11 Problems from Mathematics, Science, and Engineering

This chapter is designed to establish the generality of the problem-solving methods discussed throughout the book. In previous chapters, the problems used to illustrate the methods were deliberately selected so that they could be solved by the reader with no more background than that of a high school student with one year of algebra and one year of plane geometry. Many of the problems were of the puzzle (or brain teaser or recreational mathematics) variety, which require no specialized knowledge of mathematics, science, or engineering. Although methods for solving such problems have some recreational interest, there is also a serious practical reason for mastering them, for they are also useful for solving serious problems in all areas of mathematics, science, and engineering. This chapter is designed to demonstrate this applicability and to give the reader some experience in it.

ALGEBRA

The solution of systems of simultaneous linear equations provides a simple example of the use of evaluation functions, hill climbing, and subgoals.
As an example, consider the following system of three linear equations:

$$
\begin{aligned}
2x + y - 3z &= 1 && \text{(E1)} \\
x + 2y + 5z &= 9 && \text{(E2)} \\
3x - 3y - 10z &= 4 && \text{(E3)}
\end{aligned}
$$

The operations available for solving such a system are essentially the following. We can (a) multiply both sides of an equation by the same number, (b) add equals to equals (or subtract equals from equals), and (c) substitute equals for equals. As an example of the first, consider the action of multiplying both sides of equation (E2) by the number -2. This yields the equation $-2x - 4y - 10z = -18$. As an example of the second operation, consider the action of adding the equation just derived from (E2) to (E1):

$$
\begin{aligned}
2x + y - 3z &= 1 && \text{(E1)} \\
-2x - 4y - 10z &= -18 && (-2) \cdot \text{(E2)} \\
-3y - 13z &= -17 \\
3y + 13z &= 17 && \text{(E4)}
\end{aligned}
$$

As an example of the third operation, consider the following substitution of an expression for x, derived from (E2), into (E1):

$$
\begin{aligned}
x &= -2y - 5z + 9 && \text{(E2)} \\
2(-2y - 5z + 9) + y - 3z &= 1 && \text{(E1)}
\end{aligned}
$$

If we were given a particular numerical value for x, we could, of course, also substitute that particular value for x anywhere it appeared in any equation. The goal is to derive three expressions of the form x = ___, y = ___, and z = ___, where specific numbers appear in the blanks. Now stop reading and try to solve the problem.

The solution of such a problem involves primarily the use of an evaluation function and the subgoal method, with perhaps a little use of hill climbing. The evaluation function is concerned with the number of variables (unknowns) in each equation and the number of independent equations involving any particular set of variables (unknowns). The original system of equations consists of three equations, each of which has three variables (unknowns). From this starting point, a more highly valued state would be one in which we had two equations involving the same two unknowns.
An even more highly valued state would be one in which we had one equation involving one unknown. Still more highly valued would be a state in which we had two equations, each of which involved a different, single unknown. The most highly valued state of all, short of solution, would be one in which we had three equations, each involving a different, single unknown. For the purpose of defining the present state-evaluation function, note that we have ignored the subproblem of solving a single linear equation with one variable for the value of the unknown, since we assume that to be a trivial subproblem whose method of solution is already well known. We have not bothered to assign numbers to states that have the above-mentioned properties, because this is unnecessary for solving this problem. There are several ways to assign specific numbers to these states, and any of them would be equally satisfactory as a guide to the definition of successive subgoals in solving the problem.

In learning to solve systems of linear equations by means of the above three operations, you should first master the solution of linear equations involving one unknown, then systems with two independent equations involving two unknowns, then three independent equations involving three unknowns, and so on. You should learn that the first subgoal to achieve in a system of M equations with M unknowns is to derive a system of $M - 1$ equations involving $M - 1$ unknowns. The next subgoal is to derive a system of $M - 2$ equations involving $M - 2$ unknowns, and so forth. Occasionally, it is possible to jump several levels at once, and this is even better, but in general you must proceed one step at a time. Now stop reading and solve the problem, if you did not do so before.
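The chain of subgoals just described (M equations in M unknowns, then $M - 1$ in $M - 1$, and so on) is mechanical enough to sketch in code. The following is one possible rendering, not the author's procedure: it assumes the system has a unique solution, uses exact fractions, and the function name `solve_by_elimination` is hypothetical. Each pass through the loop achieves one subgoal by eliminating one unknown from all the remaining equations. The usage lines only check the result against the three equations, so the answer is not given away.

```python
from fractions import Fraction


def solve_by_elimination(rows):
    """Solve M equations in M unknowns; each row is [a1, ..., aM, rhs].

    Assumes a unique solution exists (no handling of singular systems).
    """
    system = [[Fraction(v) for v in row] for row in rows]
    m = len(system)
    retained = []  # one equation set aside for each eliminated unknown
    for col in range(m):
        # Pick an equation in which this unknown actually appears.
        pivot = next(r for r in system if r[col] != 0)
        system.remove(pivot)
        # Subgoal: remove this unknown from every remaining equation,
        # leaving a system with one fewer equation and one fewer unknown.
        system = [
            [rv - (r[col] / pivot[col]) * pv for rv, pv in zip(r, pivot)]
            for r in system
        ]
        retained.append(pivot)
    # Back-substitute, starting from the single equation in one unknown.
    values = [Fraction(0)] * m
    for col in reversed(range(m)):
        row = retained[col]
        known = sum(row[j] * values[j] for j in range(col + 1, m))
        values[col] = (row[-1] - known) / row[col]
    return values


# The system (E1)-(E3); check the result without printing it.
x, y, z = solve_by_elimination([[2, 1, -3, 1], [1, 2, 5, 9], [3, -3, -10, 4]])
assert 2 * x + y - 3 * z == 1
assert x + 2 * y + 5 * z == 9
assert 3 * x - 3 * y - 10 * z == 4
```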
To solve the above problem, we should first set a subgoal that we must achieve a system involving two equations and two unknowns. Somehow, then, we must derive two equations, from each of which we have eliminated the same unknown. Since two such equations must be derived, there are two parts to this first subgoal (two subgoals of the first subgoal). There are a variety of ways to accomplish the first subgoal, one of which is as follows:

$$
\begin{aligned}
2x + y - 3z &= 1 && \text{(E1)} \\
-2x - 4y - 10z &= -18 && (-2) \cdot \text{(E2)} \\
-3y - 13z &= -17 \\
3y + 13z &= 17 && \text{(E4)}
\end{aligned}
$$

$$
\begin{aligned}
3x - 3y - 10z &= 4 && \text{(E3)} \\
-3x - 6y - 15z &= -27 && (-3) \cdot \text{(E2)} \\
-9y - 25z &= -23 && \text{(E5)}
\end{aligned}
$$

Having achieved the first subgoal, the next subgoal is to solve this system of two equations and two unknowns to derive a single equation involving one unknown, as follows:

$$
\begin{aligned}
-9y - 25z &= -23 && \text{(E5)} \\
9y + 39z &= 51 && (3) \cdot \text{(E4)} \\
14z &= 28 \\
z &= 2
\end{aligned}
$$

Having achieved the second subgoal (including finding the value of one of the unknowns), it is time to proceed to the third subgoal of deriving another single equation involving a single unknown. This derivation can be done by using the substitution operation, as follows:

$$
\begin{aligned}
3y + 13 \cdot 2 &= 17 && \text{(E4)} \\
3y + 26 &= 17 \\
3y &= -9 \\
y &= -3
\end{aligned}
$$

Finally, the fourth subgoal and the final component in the solution of the problem is as follows:

$$
\begin{aligned}
x + [(2) \cdot (-3)] + [(5) \cdot (2)] &= 9 && \text{(E2)} \\
x - 6 + 10 &= 9 \\
x &= 5
\end{aligned}
$$

To solve the above problem, you have to know where you want to go at all stages in its solution. That is, you must have an evaluation function similar to that discussed here. The evaluation function provides the means for defining a series of subgoals (subproblems) that lead to the solution of the entire problem. Along the way, in the achievement of some of the subgoals, one equation might be multiplied by a number to yield an equation with the same coefficient for a particular unknown as some other equation already obtained.
This action illustrates the use of another evaluation function, namely, getting two equations to involve the same coefficient for a particular variable. Since achieving this subgoal is relatively simple, we might view the selection of the appropriate action to achieve this subgoal as hill climbing. However, I think that viewing the solution in terms of the subgoal method is far more accurate and important.

Now let us consider the solution of a very different type of equation:

$$4^{x-3} = 2^{x} \cdot 3^{x+1}$$

The goal is to derive an equation of the form x = ___. The operations available include all those specified in the previous problem. Also available are operations that may be stated generally as "doing the same thing to both sides of an equation": adding the same number to both sides, subtracting the same number from both sides, multiplying or dividing both sides of the equation by the same number, raising both sides to the same power, taking the same root of both sides, or taking logs of both sides. (Remember, however, that operations that increase the degree of an equation will add roots and operations that reduce the degree of an equation will subtract roots.) For the purposes of solving this problem, the only properties of logarithms that we need to know are that $\log(a^b) = b \log a$ and that $\log(a \cdot b) = \log a + \log b$. Now stop reading and try to solve the problem.

Since one property of the goal expression is that the x does not appear in an exponent, one subgoal that can be defined immediately is to derive an equation in which x does not appear in an exponent. Stop reading and try to solve the problem, if you did not before.
This sort of problem would appear in conjunction with a chapter on logarithms, since to achieve the subgoal we must take logarithms of both sides of the equation, as follows:

$$
\begin{aligned}
(x - 3)\log 4 &= x \log 2 + (x + 1)\log 3 \\
(\log 4 - \log 2 - \log 3)\,x &= \log 3 + 3\log 4 \\
x &= \frac{\log 3 + 3\log 4}{\log 4 - \log 2 - \log 3}
\end{aligned}
$$

Of course, we must know the relevant properties of logarithms in order to solve this problem. In addition, we must define the subgoal of achieving an equation that is in a form for which we know the appropriate solution methods, just as in the case of simultaneous linear equations. In virtually all problems from mathematics, science, and engineering, there is an interplay between the use of specialized knowledge and the use of general problem-solving methods. Either the lack of specialized knowledge or the failure to use general problem-solving methods will result in failure to solve problems.

In a similar vein, consider the following logarithmic equation:

$$\log_{10}(x - 1) + \log_{10} 5x = 1$$

Stop reading and try to solve the above equation for the value of the variable x.

The solution is analogous to the solution of the previous problem; namely, we set as a subgoal the derivation of an equation that is a simple polynomial in x, for which we may know a solution method (for example, factoring or substitution into the quadratic formula). In this instance, we set as an initial subgoal the derivation of an equation involving no log terms. That is, we attempt to eliminate logs in the above equation. Stop reading and try to solve the problem, making use of this subgoal.

Eliminating logarithms from the equation can be achieved by exponentiating each side of the equation, as follows:

$$
\begin{aligned}
10^{\log(x - 1) + \log 5x} &= 10^{1} \\
10^{\log(x - 1)} \cdot 10^{\log 5x} &= 10 \\
(x - 1) \cdot 5x &= 10 \\
5x^2 - 5x - 10 &= 0 \\
x^2 - x - 2 &= 0 \\
(x - 2)(x + 1) &= 0 \\
x = 2 \quad &\text{or} \quad x = -1
\end{aligned}
$$

From the above, two roots for the equation were derived: $x = 2$ and $x = -1$. The latter root is not a solution to the original equation (at $x = -1$ both $x - 1$ and $5x$ are negative, so their logarithms are not defined for real variables); it is a root that was added via the exponentiating process, since exponentiating increased the degree of the equation. Operations that result in equations with added roots are not as dangerous to use as operations that result in equations that eliminate roots of the original equation. When roots are added, it is easy enough to determine the correct roots by substitution in the original equation. When roots are subtracted, there may be no way to determine the value of the eliminated root, which may not be a serious problem if you do not need to get that eliminated root.

In the present problem, exponentiating both sides of the equation resulted in a quadratic that was factorable, permitting easy solution. Of course, any quadratic equation can be solved by the quadratic formula, which anyone who has mastered high school algebra should have memorized or be able to look up. In this and the preceding problem, the state evaluation function being used was that equations with logs or exponentials of polynomials in x are less highly valued than equations that are simple polynomials in x, regardless of the degree. The reason is that we do not know any direct algorithm for solving an exponential or logarithmic equation. Thus, we are required to transform the equation into some form for which we know a method of solution that works at least in some cases. In the last two problems, then, we had to transform the exponential or logarithmic equations into polynomial equations, hoping that the polynomial equations so derived would be solvable by factoring or substitution into the quadratic formula.
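Both results in this section can be checked by substitution, which is also how an added root is detected. The sketch below is a numerical check only. It assumes that the first equation has the exponential form $4^{x-3} = 2^{x} \cdot 3^{x+1}$ (the form implied by the logarithmic line in its solution) and that logarithms of nonpositive numbers are simply rejected; the names `x_exponential` and `satisfies_log_equation` are hypothetical.

```python
import math

# Solution of the first problem, as derived by taking logs of both sides.
x_exponential = (math.log(3) + 3 * math.log(4)) / (
    math.log(4) - math.log(2) - math.log(3)
)

# Substitute back into the assumed exponential form of the equation.
left = 4 ** (x_exponential - 3)
right = 2 ** x_exponential * 3 ** (x_exponential + 1)
print(math.isclose(left, right, rel_tol=1e-9))


def satisfies_log_equation(x):
    """Check log10(x - 1) + log10(5x) = 1, rejecting undefined logarithms."""
    if x - 1 <= 0 or 5 * x <= 0:
        return False
    return math.isclose(math.log10(x - 1) + math.log10(5 * x), 1.0)


# The two roots of the quadratic obtained by exponentiating.
candidates = [2, -1]
valid_roots = [x for x in candidates if satisfies_log_equation(x)]
print(valid_roots)
```

Only the root that survives substitution into the original logarithmic equation is kept; the other is the root added by exponentiating.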
TRIGONOMETRY

Determine the altitude, h, of a general scalene triangle, given the length of one side (its base b) and the angles made by the two other sides with the base (the two base angles, $\alpha$ and $\gamma$), as illustrated in Fig. 11-1.

[Figure 11-1. Altitude of a triangle problem.]

Knowing the base b and the angles $\alpha$ and $\gamma$, we have determined a specific (unique) triangle. Thus, in principle, the altitude and every other property of the triangle are specified. However, to compute h, we need to have relationships that link h to the values of the known quantities $\alpha$, $\gamma$, and b. We might consult a trigonometry text to determine whether there was any simple formula involving the unknown h and the three known quantities $\alpha$, $\gamma$, and b. Suppose the trigonometry text listed no such formula. How might we proceed to determine the value of h? Stop reading and try to solve the problem.

We might define a subgoal of determining the area of the triangle in terms of the known quantities $\alpha$, $\gamma$, and b. This subgoal is extremely useful since we already should know that an equation exists that relates h and b to the area (A) of the triangle, namely, $A = \tfrac{1}{2}bh$. Stop reading and try again to solve the problem, if you did not do so before.

Suppose the trigonometry text does indeed list several formulas for the area of a triangle, one of which looks somewhat similar to the equation we are after, namely, $A = (b^2 \sin\alpha \sin\gamma)/(2 \sin\beta)$. The only problem with this equation is that it involves the additional quantity $\beta$. However, if we remember that the sum of the angles of a triangle equals 180°, then knowing two angles of a triangle allows us to compute the value of the third angle of the triangle, namely, $\beta = 180° - \alpha - \gamma$. Thus, using this equation for the area of a triangle in terms of the three known quantities $\alpha$, $\gamma$, and b, we can determine the area, A.
From knowledge of the area, A, and the base, b, we can compute the height, h, which was to be determined.

Once again, note that specific knowledge of trigonometric and geometric relations is critical for solving the problem. However, the more complex formula for the area of the triangle in terms of three angles and one side is a relation that we need not have memorized but only be capable of looking up in a trigonometry text. Even the simple formula for finding the area of the triangle in terms of its height and its base might be looked up in such a text, though it is likely to be remembered by a student who has understood geometry and trigonometry. The specific geometric knowledge that the sum of the angles of the triangle is 180° probably must be known in order to solve this problem.

Solving this problem requires more than knowledge of the relevant geometric and trigonometric facts. We must also know which of all of the relevant facts should be selected to use in the solution. This selection requires the use of general problem-solving methods. The goal is to find the value of h. We might work backward from this statement of the goal to an equation in which h is involved along with some other quantities. Such an equation might be $A = \tfrac{1}{2}bh$, the most commonly known formula for the area of a triangle. Since this equation involves the area, which is also unknown, it immediately suggests the subgoal of finding the value of the area in terms of known quantities. This subgoal probably requires us to examine a table of trigonometric formulas to determine if there is a formula that relates the known quantities $\alpha$, $\gamma$, and b to the area of the triangle. If there is, then we have all we need to solve the problem.
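The chain of subgoals (third angle, then area, then altitude) can be followed step by step in a few lines. This is a minimal sketch, assuming the two base angles are written alpha and gamma, the third angle beta, and that angles are supplied in degrees; the function name `altitude` and the sample triangles are illustrative choices, not taken from the text.

```python
import math


def altitude(b, alpha_deg, gamma_deg):
    """Altitude on base b of a triangle with base angles alpha and gamma."""
    alpha = math.radians(alpha_deg)
    gamma = math.radians(gamma_deg)
    # Subgoal 1: the third angle, from the 180-degree angle sum.
    beta = math.pi - alpha - gamma
    # Subgoal 2: the area, from the formula involving one side and the angles.
    area = b ** 2 * math.sin(alpha) * math.sin(gamma) / (2 * math.sin(beta))
    # Goal: the altitude, from area = (1/2) * b * h.
    return 2 * area / b


# An isosceles right triangle standing on its hypotenuse of length 2.
print(altitude(2, 45, 45))
# An equilateral triangle of side 2.
print(altitude(2, 60, 60))
```

For the first triangle the altitude should come out as 1, and for the second as the square root of 3, which gives two quick checks on the formula.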
For us to solve this problem, having a specific knowledge of trigonometric formulas is less important than having access to books containing such information in conveniently usable form. What is critical is the use of the general problem-solving methods of working backward and defining subgoals. These methods provide the framework within which we can proceed in a goal-directed manner to solve the problem.

Hill climbing was also used to some extent to solve this trigonometry problem. In working backward, the selection of an equation relating to h and A is superior to an equation relating to quantities none of which is known. Here, the value of the evaluation function is greater, the more known quantities appear in the expression and the fewer unknown quantities appear in the expression. The same principle applies in working forward in trying to define as a subgoal another formula for the area of the triangle that involves as many known quantities as possible. In this case, it was possible to find a formula that involved all three known quantities and an additional quantity that was not originally specified in the givens as known but that could be trivially derived from the given information by means of a well-known geometric theorem (that the sum of the angles of a triangle is 180°).

ANALYTIC GEOMETRY

Determine the location and geometric properties of the figure specified by the equation $x^2 + y^2 - 5x + 7y = 3$.

The specifically relevant knowledge from the field of analytic geometry is that any equation of the form $x^2 + y^2 + Ax + By = C$ is the equation of a circle (provided that $C + A^2/4 + B^2/4$ is positive), and this equation can always be transformed to an equivalent equation of the form $(x - a)^2 + (y - h)^2 = c^2$, where (a, h) represents the coordinates of the center of the circle and c represents the radius of the circle.
Retrieving this relevant knowledge is surely essential to solving the problem, and it gives part of the answer already, namely, that the figure specified by the above equation is a circle. All that remains is to determine the coordinates of the center of the circle and its radius ($a$, $h$, and $c$). It is this problem to which general problem-solving methods are applicable. Stop reading and try to solve the problem.

The given expression is $x^2 + y^2 - 5x + 7y = 3$, and the goal expression is of the form $(x - a)^2 + (y - h)^2 = c^2$. Once again, we might define a subgoal by means of working backward from the goal expression. Starting with the goal expression, we know we can rewrite it in the form $(x^2 - 2ax + a^2) + (y^2 - 2hy + h^2) = c^2$. Now we need to work forward from the given expression to the subgoal expression. Stop reading and try again to solve the problem, if you did not do so before.

Matching the coefficients of $x$, we know that $-2a = -5$, from which $a = \tfrac{5}{2}$. Similarly, $-2h = 7$, or $h = -\tfrac{7}{2}$. This means that we must add $a^2 = 25/4$ and $h^2 = 49/4$ to the left side of the given equation in order to complete the squares and get to the left-hand side of the subgoal equation. The same $25/4 + 49/4$ must, therefore, be added to the right side of the given equation, yielding $3 + 74/4$, or $86/4$. Thus, $c^2 = 86/4$ and $c = \sqrt{86}/2$. Therefore, the coordinates of the center of the circle are $(\tfrac{5}{2}, -\tfrac{7}{2})$, the radius of the circle is $\sqrt{86}/2$, and the problem is solved.

Completing the square becomes a rather familiar specific technique in and of itself, once you have a certain degree of experience in mathematics. However, to the beginning mathematics student it is, at some point, a new, unknown technique. To the same student, the technique of expanding a term of the form $(x - a)^2$ to get $(x^2 - 2ax + a^2)$ is familiar.
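The coefficient matching used above for $x^2 + y^2 - 5x + 7y = 3$ can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name is hypothetical, and exact fractions are used so that the intermediate values $25/4$ and $49/4$ are not blurred by rounding.

```python
from fractions import Fraction
import math

def circle_from_general_form(A, B, C):
    """Center (a, h) and squared radius of x^2 + y^2 + A x + B y = C."""
    A, B, C = Fraction(A), Fraction(B), Fraction(C)
    # Match x^2 + A x with x^2 - 2 a x, and y^2 + B y with y^2 - 2 h y.
    a = -A / 2
    h = -B / 2
    # Adding a^2 + h^2 to both sides completes both squares.
    radius_squared = C + a * a + h * h
    if radius_squared <= 0:
        raise ValueError("the equation does not describe a circle")
    return a, h, radius_squared

a, h, r2 = circle_from_general_form(-5, 7, 3)
radius = math.sqrt(r2)  # r2 is held exactly as the fraction 43/2, that is, 86/4
```

The guard on `radius_squared` reflects the one case in which an equation of this form fails to describe a circle.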
By using the general problem-solving method of working backward to get a subgoal, we can get the idea of completing the square in a completely natural way and know exactly how to do it.

Determine equations for the new coordinates of a point in a plane when the new coordinate system is obtained by translation and rotation from an old coordinate system. Translation of a coordinate system means the origin is changed to a new point, and rotation of a coordinate system means both axes are turned through the same angle in the same direction, pivoting about the origin. It makes no difference in which order the translation and rotation operations are performed; the same new coordinate system is obtained in either case. Stop reading and try to solve the problem.

The solution of this problem results simply from breaking the problem into parts, that is, setting subgoals. First, solve the problem of characterizing the new coordinates obtained by simple translation alone. Having achieved this subgoal, then solve the second subgoal of characterizing the final set of coordinates after a rotation has been applied to the coordinate system previously derived from the translation. Stop reading and try to solve the problem again.

If $x$ and $y$ are the original coordinates, and the coordinates of the new origin in terms of the original $(x, y)$ axes are $(x_0, y_0)$, then let the new coordinates obtained by translation be represented by $x_1, y_1$. The formulas for simple translation of coordinates are as follows:

$$x = x_1 + x_0, \qquad y = y_1 + y_0$$
$$x_1 = x - x_0, \qquad y_1 = y - y_0$$

Stop reading and try to solve the rest of the problem, if you have not done so already.

Let the coordinates obtained by rotation through the angle $\alpha$ be $x_2$ and $y_2$.
Formulas relating the new $(x_2, y_2)$ coordinates to the previous $(x_1, y_1)$ coordinates are as follows:

$$x_1 = x_2 \cos\alpha - y_2 \sin\alpha$$
$$y_1 = x_2 \sin\alpha + y_2 \cos\alpha$$

To get formulas for the combined translation and rotation transformation, simply combine the above formulas to obtain the following:

$$x = x_2 \cos\alpha - y_2 \sin\alpha + x_0$$
$$y = x_2 \sin\alpha + y_2 \cos\alpha + y_0$$

To express the new coordinates in terms of the old coordinates requires some algebra, which yields the following:

$$x_2 = x \cos\alpha + y \sin\alpha - (x_0 \cos\alpha + y_0 \sin\alpha)$$
$$y_2 = -x \sin\alpha + y \cos\alpha - (-x_0 \sin\alpha + y_0 \cos\alpha)$$

With the problem broken into two subgoals, each of which was simpler to achieve, it was possible to obtain the solution to the original problem by simple algebraic combination of the solutions to the two subproblems.

Now let us consider the following problem involving the transformation of coordinate systems:

Is it possible to transform axes such that the straight line $4x - 3y + 2 = 0$ will have the form $x_2 = 0$ and such that the straight line $2x + y = 4$ will have the form $x_2 = ay_2$? If such is possible, derive the transformation. Stop reading and try to solve the problem.

Probably the first thing to note is that we have two equations and two unknowns. If the equations are independent, which they are, then it will be possible to solve the equations for the values of $x$ and $y$. From a geometric point of view, then, you must find the point of intersection of the two straight lines represented by these linear equations. Solving for the point of intersection of the two straight lines will prove to be an important subgoal in solving the problem, but you need not even realize that in the beginning.
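For readers who want the intersection computed mechanically, the following Python sketch solves two linear equations in two unknowns by Cramer's rule. The function name and the choice of Cramer's rule are assumptions of this illustration; any standard elimination method would serve equally well.

```python
from fractions import Fraction

def solve_two_linear_equations(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = Fraction(a1) * b2 - Fraction(a2) * b1
    if det == 0:
        raise ValueError("the equations are not independent")
    x = (Fraction(c1) * b2 - Fraction(c2) * b1) / det
    y = (Fraction(a1) * c2 - Fraction(a2) * c1) / det
    return x, y

# 4x - 3y + 2 = 0 is rewritten as 4x - 3y = -2; the second line is 2x + y = 4.
intersection = solve_two_linear_equations(4, -3, -2, 2, 1, 4)
```

The zero-determinant check corresponds to the independence condition mentioned in the text.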
Since solving these two linear equations in two unknowns is so simple, you should probably simply draw these inferences from the given information, without regard to the goal, at the outset of the problem (as discussed in Chapter 3). This assumes, of course, that you are familiar with the process of solving two linear equations with two unknowns, so that this is a trivial inference. When it is easy to represent explicitly the information that is given implicitly in the problem, you should undoubtedly do so in the beginning, before even thinking about how to reach the goal from the given information. This initial step will yield the information that the solution of the two equations (the point of intersection of the two straight lines) is $x = 1$, $y = 2$. Stop reading and try to solve the problem, if you did not before.

Consider the goal and derive a suitable plan for achieving the goal from the given information. Having drawn the inference that the two straight lines intersect at a particular known point, it is now clear that the goal is achievable. Why? Stop reading and try to solve the problem, if you still have not done so.

The problem obviously indicates a division into two subproblems: (a) making the first line have the form $x_2 = 0$ and (b) making the second line have the form $x_2 = ay_2$. Drawing inferences about these subgoals will point out the general types of transformation necessary to achieve each subgoal. Stop reading and try to solve the problem, if you did not do so.

It is somewhat easier to consider the achievement of the second subgoal first. The second subgoal equation, $x_2 = ay_2$, asserts that, in the new coordinate system, the second line will have zero $x_2$ and $y_2$ intercepts (being a strict proportionality). Having zero $x_2$ and $y_2$ intercepts means that the second line must pass through the origin of the new coordinate system.
Having drawn this inference from the subgoal (or worked backward from the subgoal, if you will), what transformation of the original coordinate system will achieve this new subgoal? Stop reading and try to solve the problem, if you did not before.

Clearly, we can achieve the second subgoal by a simple translation of the coordinate system from the origin to any point on the second line ($2x + y = 4$). Our preceding inference concerning the point of intersection of the two straight lines might bias us to translate the origin of the coordinate system to the point of intersection of the two straight lines, but at this stage of working on the problem we do not know for sure that this is the correct point of origin for the new coordinate system.

Now, how do we achieve the other subgoal, namely, that of transforming the first line so that it has the form $x_2 = 0$? Stop reading and try to solve the problem, if you have not done so already.

Again drawing inferences from the subgoal (working backward), we see that to transform the first line into a line with the equation $x_2 = 0$, we must transform the first line so that it is coincident with the $y_2$ axis in the new coordinate system. Such a line will always have the $x_2$ coordinate equal to zero for any value of the $y_2$ coordinate. Achieving this goal requires what type of transformation of the original coordinate system? Stop reading and try to complete the solution of the problem, if you have not done so already.

Clearly, to achieve this subgoal we must, first, translate the origin of the coordinate system to some point along the first line and, second, rotate the axes of the coordinate system so that the $y_2$ axis coincides with the first line.
The first aspect interacts with the transformation necessary to achieve the other subgoal, so we are restricted to locating the origin of the new coordinate system at the point of intersection of the two straight lines (since the origin of the new coordinate system must lie on both straight lines). A second restriction in achieving the present subgoal is that the axes of the coordinate system must be rotated around this new origin until the $y_2$ axis coincides with the first straight line ($4x - 3y + 2 = 0$). Geometric intuition indicates that this can be done, and so achieving the goal is clearly possible.

Now the problem is to derive the nature of the rotation, since the translation is already obvious, that is, to move the origin to the point $(1, 2)$. To solve the rest of the problem, we now need to know the angle of rotation of the coordinate system required to line up the $y_2$ axis with the straight line represented by $4x - 3y + 2 = 0$. Stop reading and solve the rest of the problem, if you have not done so already.

To solve this subproblem, let us write down the formulas for the new coordinates in terms of the old coordinates (noting that $x_0 = 1$ and $y_0 = 2$), namely, $x_2 = x \cos\alpha + y \sin\alpha - k$, where $k = x_0 \cos\alpha + y_0 \sin\alpha$. We might also go ahead and write down the equation for $y_2$ in terms of $x$ and $y$, but this actually is unnecessary to the solution of the problem. If we are continually aware of which terms represent constants and which terms represent variables in any expression, then we should note that, in the right side of the equation we have just written down, $x$ and $y$ are the only variables, and $\cos\alpha$, $\sin\alpha$, and $k$ are all constants (albeit unknown to us at present). Since the given information specifies that $x_2 = 0$ and $4x - 3y + 2 = 0$ describe the same line, we can equate $x \cos\alpha + y \sin\alpha - k$, up to a constant factor, to the expression $4x - 3y + 2$.
Now if we take the equations for the general transformation of coordinates involving both translation and rotation and substitute them into the equation $4x - 3y + 2 = 0$, we obtain

$$4(x_2 \cos\alpha - y_2 \sin\alpha + x_0) - 3(x_2 \sin\alpha + y_2 \cos\alpha + y_0) + 2 = 0.$$

Since we wish to find an $\alpha$ for which $x_2 = 0$, we can substitute $x_2 = 0$ into the above equation and also substitute the known values $x_0 = 1$ and $y_0 = 2$. This yields the equation $-(4 \sin\alpha + 3 \cos\alpha)\,y_2 = 0$. Since this must hold for every value of $y_2$, it implies that $4 \sin\alpha + 3 \cos\alpha = 0$. Thus, $4 \sin\alpha = -3 \cos\alpha$. To determine $\alpha$ from this equation, simply remember the trigonometric identity $\sin\alpha / \cos\alpha = \tan\alpha$. We can then derive from the above that $\tan\alpha = -\tfrac{3}{4} = -0.75$. Using the tables, this indicates that $\alpha = -36^\circ 52'$. Thus, the solution to the problem is to translate the origin of the coordinate system to the point $(1, 2)$ and rotate the coordinate system through an angle of $36^\circ 52'$ in the negative direction.

One slightly tricky aspect of the problem is the use of the equation $x_2 = 0$ in conjunction with the equation derived by substituting into $4x - 3y + 2 = 0$. After we have established that it is indeed possible to derive a transformation of the coordinates so that the equation $4x - 3y + 2 = 0$ can be transformed into an equation of the form $x_2 = 0$, we are not really trying to achieve the goal expression $x_2 = 0$. Rather, $x_2 = 0$ is part of the given information that we are using to achieve the subgoal of determining the angle of rotation $\alpha$. This problem clearly points up the need to carefully define and redefine what is given information and what is the goal at different stages in the solution of the problem.

CALCULUS

Prove that, within the set of triangles having a constant base and constant perimeter, the isosceles triangle has the maximum area.
The specific calculus necessary to solve this problem is to know that we can often find the maximum or minimum of a function by differentiating it with respect to the variable(s) of which it is a function. Stop reading and try to solve the problem.

Clearly, in the present instance, the function (dependent variable) is the area of a triangle. However, the area of a triangle can be expressed as a function of a number of different independent variables. Therefore, let us set the subgoal of finding a formula for the area of a triangle that involves those quantities that are specifically given in this problem (either constants or the variables with respect to which we wish to find maximum area). Stop reading and try again to solve the problem, if you did not do so before.

In this problem, the constants and independent variables are evidently the sides of the triangle (including the sum of the sides, which is the perimeter). Thus, we need to find a formula that involves only these quantities. Such a formula, which can be looked up in a book, is Heron's Formula: $A = [s(s - a)(s - b)(s - c)]^{1/2}$, where $A$ is the area of a triangle; $s$ is the semiperimeter, which equals $\tfrac{1}{2}(a + b + c)$; and $a$, $b$, and $c$ are the lengths of the sides. Stop reading and try to solve the problem, if you have not done so already.

Having achieved the first subgoal of finding a formula that involves the relevant constants and variables, we should note that the formula contains the semiperimeter (which is a constant, since the perimeter is a constant) and the length of one side (which is a constant; let it be side $a$). Two variable sides remain, namely, $b$ and $c$, and we might achieve the solution by simply differentiating $A$ with respect to both $b$ and $c$.
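Heron's Formula as quoted above translates directly into code. The sketch below is a minimal Python version; the function names are hypothetical, and the second helper simply expresses the problem's constraint (fixed base `a` and fixed perimeter) so that the area depends on the single remaining side `b`.

```python
import math

def heron_area(a, b, c):
    """Area of a triangle with side lengths a, b, c by Heron's Formula."""
    s = (a + b + c) / 2.0  # semiperimeter
    product = s * (s - a) * (s - b) * (s - c)
    if product < 0:
        raise ValueError("side lengths violate the triangle inequality")
    return math.sqrt(product)

def area_for_fixed_base_and_perimeter(a, perimeter, b):
    """Area as a function of side b when base a and the perimeter are held constant."""
    c = perimeter - a - b  # the third side is determined by the other quantities
    return heron_area(a, b, c)

area_345 = heron_area(3.0, 4.0, 5.0)
```

Evaluating the second helper for several values of `b` is one informal way to explore the problem before differentiating.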
However, suppose we are familiar only with differentiating functions of single variables with respect to the single variable in order to obtain the maximum or minimum of the function with respect to that variable. In this case, we must set a second subgoal to reduce the number of independent variables from two to one. Stop reading and try to solve the problem, if you have not already done so.

We can achieve the second subgoal, since with a constant perimeter and a constant base $a$, the sum of the other two sides must be equal to a constant. Thus, $b + c = k$ and $c = k - b$. Substituting $c = k - b$ into Heron's Formula for the area of a triangle, we obtain the area as a function of a single variable, namely, the length of side $b$. This can be differentiated with respect to $b$ and the derivative set equal to zero to determine that $k = 2b$. A trick that makes this a bit easier is to note that, if the area $A$ is a maximum, then $A^2$ is a maximum, and vice versa. Since it is somewhat easier to differentiate $A^2$ with respect to $b$ than to differentiate $A$ with respect to $b$, this little trick saves some work. In either case, we solve for $k = 2b$, from which it follows that $b = c$, and the theorem is proved. The work is given below:

$$A^2 = s(s - a)(s - b)(s - c)$$
$$A^2 = s(s - a)(s - b)(s - k + b)$$
$$d(A^2)/db = s(s - a)[(s - k + b)(-1) + (s - b)(+1)] = 0$$
$$-s + k - b + s - b = 0$$
$$k - 2b = 0$$
$$k = 2b$$
$$b + c = 2b$$
$$c = b$$

Incidentally, the above problem can be solved entirely without calculus, using the method of contradiction in conjunction with Heron's Formula. To use the method of contradiction, we assume that the squared area ($A_1^2$) in the case where $b_1 \neq c_1$ is greater than the squared area ($A_2^2$) in the case where $b_2 = c_2 = d$. Without loss of generality, assume $b_1 > c_1$. Since $b_1 + c_1 = 2d$, then $b_1 > d > c_1$. Using these equalities and inequalities and some algebraic manipulation of the inequality $(s - b_1)(s - c_1) > (s - d)^2$, we can eventually derive $(b_1 - d)^2 < 0$, which is a contradiction, since the square of any real number cannot be negative. The algebra is given below for the interested reader:

$$(s - b_1)(s - c_1) > (s - d)^2$$
$$(s - b_1)(s - 2d + b_1) > (s - d)^2$$
$$[(s - d) - (b_1 - d)][(s - d) + (b_1 - d)] > (s - d)^2$$
$$(s - d)^2 - (b_1 - d)^2 > (s - d)^2$$
$$(b_1 - d)^2 < 0$$

Derive the form of the following indefinite integral:

$$y = \int \frac{dx}{1 + \sqrt{x}}$$

The background information that is assumed to be given includes knowledge of the integrals of elementary functions (such as $x^n$, $e^x$, $\log x$, $\sin x$, and $\cos x$). Other important background information is the techniques of integration by substitution and by parts, and of differentiating a function of a function.

The first major choice in attacking an integration problem of this kind is whether to use the method of substitution or the method of integration by parts. Sometimes both methods must be used, but, in any event, you still have to decide which to apply first. Since integration by substitution is the more useful technique, it is to be preferred as an initial choice of integration method, unless there is some special reason for preferring integration by parts. Integration by parts is useful primarily when the function to be integrated is a product of two functions, $f(x) = g(x)h(x)$. Although all functions of $x$ can be written as a product of two functions, namely, $f(x) = f(x) \cdot 1$, this is a trivial type of product to which the application of integration by parts is only occasionally useful. Thus, in the present problem, there is no reason to use integration by parts, so we adopt integration by substitution as our initial operation. Stop reading and try to solve the problem.

Note that the problem-solving method considerations discussed so far in this problem are all specific to calculus. Substitution and integration by parts are not general problem-solving methods.
However, in deciding what type of substitution to make, general problem-solving methods play some role. In particular, hill climbing is useful. The hill climbing uses an evaluation function concerned roughly with simplicity of functional form and the likelihood of your knowing an integral for the function resulting from this substitution. At the same time, another evaluation function is generally working at cross purposes with the first one, namely, the simplicity of the functional form for the substitution $u = g(x)$. A good rule of thumb is to try a substitution whose functional form is less complicated than that of the original function and results in a function to be integrated that is also less complicated than the original function. Thus, in the present problem, although a substitution of the form $u = 1/(1 + \sqrt{x})$ would greatly simplify the original problem, the form of the substitution function would be as complicated as the original function to be integrated. Stop reading and try again to solve the problem, if you did not before.

Better substitutions would be $u = \sqrt{x}$ or $u = 1 + \sqrt{x}$. Using the latter substitution, $u = 1 + \sqrt{x}$ and $dx = 2(u - 1)\,du$, yields

$$\int \frac{dx}{1 + \sqrt{x}} = 2\int \frac{u - 1}{u}\,du = 2\int \left(1 - \frac{1}{u}\right) du = 2(u - \log u) + C$$
$$= 2 + 2\sqrt{x} - 2\log(1 + \sqrt{x}) + C = 2\sqrt{x} - 2\log(1 + \sqrt{x}) + C'$$

Frequently, several substitutions will be required in order to solve the problem, and at each step you are essentially using the method of hill climbing on an evaluation function concerned with simplicity of functional form. There is no precise definition of simplicity of functional form, but that lack should not prevent you from explicitly recognizing that this is what you are doing and that you have rather good judgment as to what functions are simpler than other functions (in the sense of being closer to functions for which you know the integral).
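A result obtained by substitution can be checked numerically. The Python sketch below assumes the integral under discussion is $\int dx/(1 + \sqrt{x})$ with antiderivative $2\sqrt{x} - 2\log(1 + \sqrt{x})$, and compares a definite integral computed by Simpson's rule with the difference of the antiderivative at the endpoints. The interval $[1, 4]$ and the function names are arbitrary choices for this illustration.

```python
import math

def integrand(x):
    return 1.0 / (1.0 + math.sqrt(x))

def antiderivative(x):
    # Candidate result of the substitution u = 1 + sqrt(x), constant omitted.
    return 2.0 * math.sqrt(x) - 2.0 * math.log(1.0 + math.sqrt(x))

def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0

numeric = simpson(integrand, 1.0, 4.0)
exact = antiderivative(4.0) - antiderivative(1.0)
difference = abs(numeric - exact)
```

A small `difference` supports the algebra but does not prove it; differentiating the antiderivative by hand remains the definitive check.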
As long as you are able to decide that the functions resulting from certain substitutions are simpler than functions resulting from other substitutions, you are in a position to make good use of the hill-climbing method, whether or not you can explicitly define the evaluation function.

Derive the functional form of the following integral: $\int x^2 e^x\,dx$. The specific background knowledge includes knowledge of the integrals of the elementary functions plus the integration by parts formula, namely, $\int uv\,dx = uV - \int u'V\,dx$, where $V = \int v\,dx$ and $u' = du/dx$.

Since the function to be integrated is an obvious product of two simpler functions, the method of integration by parts is suggested. Whether integration by parts is making progress toward the goal is determined a great deal by the general problem-solving method of hill climbing on an evaluation function of simplicity of functional form and ease of integration. Stop reading and try to solve the problem.

In the present case, two applications of the method of integration by parts are necessary in order to solve the problem. At each stage, the application of integration by parts results in functions to be integrated that are simpler than the functions to be integrated prior to the application of integration by parts. The specific solution is as follows:

$$\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx$$
$$= x^2 e^x - 2x e^x + 2e^x + C$$
$$= (x^2 - 2x + 2)e^x + C$$

In the present instance, it would be possible to give a precise definition of the evaluation function on which the hill climbing is occurring, namely, the exponent of $x$ in the product $x^n e^x$ when this product is the function to be integrated. Repeated application of the method of integration by parts results in reduction of the exponent, eventually to $x^0 e^x$, or $e^x$.
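The steady reduction of the exponent can be made concrete. One integration by parts gives $\int x^k e^x\,dx = x^k e^x - k\int x^{k-1} e^x\,dx$, and the Python sketch below applies that step repeatedly, building up from the base case $\int e^x\,dx = e^x$. The function name and the coefficient-list representation are assumptions of this illustration.

```python
def integrate_xn_exp(n):
    """Coefficients [c0, c1, ..., cn] of the polynomial P for which
    the integral of x^n e^x dx equals P(x) e^x + C."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    coeffs = [1]  # exponent 0: the integral of e^x is 1 * e^x
    for k in range(1, n + 1):
        # One integration by parts: x^k e^x minus k times the previous result.
        coeffs = [-k * c for c in coeffs] + [1]
    return coeffs

quadratic_case = integrate_xn_exp(2)  # constant term first, then x, then x^2
```

Each pass through the loop corresponds to one application of integration by parts, so the loop count is exactly the value of the evaluation function at the start.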
However, whether or not it is possible to explicitly define the evaluation function being used, hill-climbing methods can be extremely useful in solving such a problem, so long as your judgment of simplicity is reasonably accurate.

Find the values of $x$ for which the function $y = f(x)$ is a maximum or minimum. The function is defined by the equation $x^2 + xy + y^2 = 27$. Relevant background information includes the chain rule for differentiating the function of a function, the rule for differentiating the product of two functions, and the theorem that the derivative of a function equals zero at a minimum or maximum.

When we are finding the maxima or minima of even a function of a single variable, $y = f(x)$, we are essentially solving two equations for the values of two unknowns, $x$ and $y$. This fact is often not apparent to students when they originally learn the method of finding maxima and minima by differentiating $f(x)$, setting the derivative equal to zero, and solving for $x$, because the original equation was already solved for $y$ as a function of $x$. In such a case, the derivative will involve only the single variable $x$. When the derivative is set equal to zero, the resulting equation is solved for the value of $x$ for which the function is a maximum or minimum. In the present problem, the initial function $y = f(x)$ is defined implicitly by the equation $x^2 + xy + y^2 = 27$. In this case, it is necessary to take a more general approach to the problem, in which finding the derivative and setting it equal to zero allows us to obtain a second equation, in addition to the equation $x^2 + xy + y^2 = 27$. We hope that these two equations will permit us to solve for the $(x, y)$ points for which the function has a maximum or minimum. Stop reading and try to solve the problem.
To solve the problem, we should initially set a subgoal: to obtain an equation that involves the derivative $y' = dy/dx$. Stop reading and try again to solve the problem, if you did not before.

This subgoal can be achieved by differentiating the given equation with respect to $x$ (employing the product and chain rules for differentiation). The resulting equation is $2x + xy' + y + 2yy' = 0$. This equation can be solved for $y'$ by algebraic manipulation, yielding the equation

$$y' = -\frac{2x + y}{x + 2y}$$

When this derivative is set equal to zero, we obtain $y = -2x$. Substituting this equation into the original equation, we obtain $x^2 - 2x^2 + 4x^2 = 27$, or $3x^2 = 27$, or $x = \pm 3$ ($y = \mp 6$), and the problem is solved. Once again, a simple definition of a single subgoal, namely, obtaining an expression for $y'$ in terms of $x$ and $y$, resulted in straightforward solution of the problem.

DIFFERENTIAL EQUATIONS

The solutions of differential equations provide particularly good examples of the use of the problem-solving methods of hill climbing, subgoals, and analogy to similar problems. Perhaps the most important specific training is the ability to place a differential equation in the proper class. Once you note what other differential equations the one in front of you is similar to (what class it belongs to), you can then apply the techniques associated with the solution of that class of differential equations. You need not even have much specific knowledge of how to solve equations of a particular class, so long as you can identify the class and look up in a book how to solve equations of that class. Thus, analogy to similar problems is the crucial first step in the solution of many differential equations.

When the given differential equation is a member of a class for which solution methods are known, the methods of hill climbing and subgoals (using evaluation functions) are also quite important.
For example, in solving differential equations of different forms, we often proceed by setting as a subgoal the transformation of the differential equation into another differential equation of simpler form, and then by using the known solution methods for the simpler form. In grab-bag classes of differential equations (such as miscellaneous nonlinear differential equations), we may attempt to define subgoals such as transforming the equation to linear form or reducing the order or degree of the equation, but frequently we simply try out various operations on the given nonlinear differential equation to see which ones result in an equation of the simplest form. The latter is clearly an example of hill climbing, using an evaluation function that somehow weights different features of a differential equation for overall ease of solution.

For equations whose order or degree is greater than 1, the relevant evaluation function is frequently the vector (order, degree), with lower values of either component being more highly valued. Differential equations provide good examples of vector evaluation functions, where there are many different properties on which hill climbing might be tried to see which, if any, approach would solve the problem. Frequently, the solution of a differential equation requires a sequence of steps in which the degree and order of the equation are progressively reduced, finally resulting in a differential equation of the first order and first degree. The sequence in which the degree and order of the differential equation are progressively reduced may vary from problem to problem.

Once the stage is reached where you have a nonlinear differential equation of the first order and first degree to solve, a variety of potential solution sequences can follow, again depending on the type of first-order, first-degree differential equation.
The nonlinear differential equation may be reducible to linear form by some suitable transformation. No general rules exist for determining such transformations or the types of nonlinear equations to which they apply, but experience with a wide variety of such problems may indicate that the present problem is similar to some problem already solved in this way. Having achieved a linear equation of the first order and first degree, you then apply solution methods appropriate to this type of equation (for example, using Laplace transforms or integrating factors).

Another solution sequence starting with a nonlinear, first-order, first-degree differential equation (which is appropriate in some cases) is to try to transform the equation to be a member of a particular class of differential equations known as exact equations. Achieving this subgoal may require you to find an appropriate integrating factor to transform the given differential equation into an exact differential equation. Once an exact differential equation has been obtained, you simply follow solution methods appropriate for this type of equation.

Another solution sequence is appropriate to differential equations of the form

$$\frac{dy}{dx} = \frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2}$$

where $c_1, c_2 \neq 0$. To solve such equations, we set the subgoal of transforming this inhomogeneous equation into a homogeneous equation by making a substitution. The next subgoal is to transform this homogeneous equation into a differential equation in which the variables are separated, which is then solved by direct integration.

There are other solution sequences appropriate to other types of nonlinear differential equations of the first order and first degree.
However, just considering the solution sequences discussed here, note that an experienced solver of differential equations has established an evaluation function for first-order, first-degree differential equations, which is essentially a partial ordering of a variety of different forms of such differential equations. In this partial ordering, equations with separated variables are more highly valued than homogeneous equations in which the variables are not separated, the latter being more highly valued than the type of inhomogeneous equations described above, which in turn are more highly valued than many nonlinear differential equations not of this or any other identifiable type. At the same time, exact differential equations are more highly valued than these miscellaneous nonlinear differential equations, but there is no ordering of exact differential equations relative to equations within another solution sequence, such as that appropriate for the inhomogeneous equations of the previously specified type. Along the same lines, linear differential equations are more highly valued than the miscellaneous nonlinear differential equations, but they are not necessarily more highly valued than types of equations within some other solution sequence. This is what we mean by saying that these different types of first-order, first-degree differential equations have an evaluation function in the form of a partial ordering, rather than in a complete or simple rank ordering of all the different types of such equations.

If we know a variety of such types of differential equations and the appropriate partial-ordering type of evaluation function defined over them (that is, know the variety of different solution sequences), we are in a good position either to define subgoals or to recognize progress in the use of the hill-climbing method.
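The partial ordering described above can be sketched as a small directed graph in Python. The short class labels are hypothetical names chosen for this illustration, and only the relations stated in the text are encoded; two classes count as comparable only if one can be reached from the other by following those relations.

```python
# Direct "is more highly valued than" relations named in the text.
# The short class labels are hypothetical names chosen for this sketch.
MORE_VALUED_THAN = {
    "separated": ["homogeneous"],
    "homogeneous": ["inhomogeneous"],
    "inhomogeneous": ["miscellaneous"],
    "exact": ["miscellaneous"],
    "linear": ["miscellaneous"],
    "miscellaneous": [],
}

def is_more_valued(a, b):
    """True if class a lies above class b in the partial ordering."""
    stack = list(MORE_VALUED_THAN[a])
    seen = set()
    while stack:
        current = stack.pop()
        if current == b:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(MORE_VALUED_THAN[current])
    return False

def comparable(a, b):
    """True if the ordering ranks a and b relative to each other."""
    return a == b or is_more_valued(a, b) or is_more_valued(b, a)

exact_vs_linear = comparable("exact", "linear")
```

A simple rank ordering would make every pair comparable; here pairs drawn from different solution sequences are left unranked, which is the point of calling the evaluation function a partial ordering.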
However, there is still the problem of determining the proper operation (substitution, integrating factor, and the like) to take in order to achieve a differential equation of the more highly evaluated (simpler) form. Perhaps general problem-solving methods are applicable to this aspect of solving differential equations, but, frankly, I have so little experience in solving differential equations that I feel incompetent to discuss the matter further. In any event, once again, a solution of mathematical problems requires a mixture of specific knowledge of mathematics and the use of general problem-solving methods. You can, of course, learn how to solve differential equations and other mathematical problems without appreciating that you are thereby applying general problem-solving methods. However, understanding general problem-solving methods probably facilitates your understanding the variety of techniques applicable to such mathematical problems. A generally accepted dogma in educational psychology is that the more you can relate new knowledge to old knowledge, the faster and more complete your learning will be (though how good the evidence for this is I certainly do not know). So, if you know general problem-solving methods you should be able to quickly organize many specific methods for solving differential equations when these methods are introduced in terms of defining classes of similar problems and defining evaluation functions that permit you to use the subgoal and hill-climbing methods.

To complement this rather abstract discussion, let us consider the solution of the following differential equation, which was produced by Al Stevens, a student in one of my problem-solving classes: [...] Stop reading and try to solve this differential equation.
Stevens first defined as a subgoal the transformation of this equation into a linear, second-order differential equation, but he quickly replaced this subgoal with a different subgoal, namely, that of reducing the equation from second order, nonlinear to first order, nonlinear. Stop reading and try again to solve the problem, if you did not do so before. The second subgoal is easily achieved because the differential equation is of the form $d^2y/dx^2 = f(y, dy/dx)$, with $f$ not being a function of $x$. By recognizing the differential equation as a member of this class, Stevens made available his knowledge that a standard substitution, namely, $p = dy/dx$, would transform the second-order differential equation into a first-order differential equation. Since $d^2y/dx^2 = dp/dx$ and $dp/dx = (dp/dy)(dy/dx)$, then $d^2y/dx^2 = p\,(dp/dy)$. Substituting into the original equation yields the first-order nonlinear differential equation [...] = 1. Stop reading and try again to solve the problem, if you did not do so before. Algebraic manipulation of the equation yields [...]. Such an equation belongs to another specific class of differential equations, namely, Bernoulli equations, which are of the form $dp/dy + P(y)\,p = Q(y)\,p^n$. You can look up in a book that such equations can be reduced to linearity by the substitution $w = p^{1-n}$. In this case, that means the substitution [...], for which [...]. Substitution then yields [...], and multiplying through by [...] yields [...], which is a linear equation in $w$ and $y$. From this point, the solution is straightforward by methods that can be looked up in a book. In retrospect, Stevens noticed that his first-order, nonlinear differential equation, [...] = 1, had the form [...], in which the variables $y$ and $p$ are trivially separable. Thus, the equation is easily solvable without it having to be reduced to linear form.
The solution is as follows: [...] Integrating, [...] Since $p = dy/dx$, [...] From a table of integrals, [...]

PROBABILITY AND STATISTICS

A sample of two observations, $x$ and $y$, is drawn from the uniform distribution on the interval from zero to 1, $f(x) = f(y) = 1$, $0 \le x, y \le 1$. Find the $r$th raw moment of $z$, $\mu'_{r:z}$, where $z = xy$. Note that $\mu'_{r:z} = \int z^r g(z)\,dz$.

There are at least three different ways of solving this problem. Stop reading and try to think of as many ways as you can. The most obvious (but also the most difficult) way is to set two subgoals: (a) that of finding the probability density function, $g(z)$, for the new random variable $z = xy$ and (b) that of plugging this probability density function into the definition for the $r$th raw moment of $z$. Let us assume that we know a simple generalization of the above formula that extends it in order to find the $r$th moment of a function of random variables, $h(x, y)$, where the joint probability density function for the random variables $x$ and $y$ is represented by $f(x, y)$. The formula is $\mu'_{r:h(x,y)} = \iint [h(x, y)]^r f(x, y)\,dx\,dy$. Now stop reading and try to solve the problem, if you did not do so before. In this case, we can compute the $r$th moment of the function $z = xy$ provided we know the probability density function of the joint distribution of $x$ and $y$. Thus, the first subgoal is to determine this function, $f(x, y)$. We must assume as implicit information (though it was not specifically stated in the problem) that the two sample observations ($x$ and $y$) are independent or uncorrelated. Knowing these observations are independent and knowing the density functions for each, we get the joint density function $f(x, y) = f(x)\,f(y) = 1$ for $0 \le x, y \le 1$, and $f(x, y) = 0$ elsewhere. From this point on, the solution is a simple integration, as follows: $\mu'_{r:z} = \mu'_{r:h(x,y)} = \int_0^1\!\int_0^1 (xy)^r\,dx\,dy = \int_0^1 x^r\,dx \int_0^1 y^r\,dy = \frac{1}{(r+1)^2}$.

Finally, one can set a totally different subgoal of finding the moment generating function for the new variable $z = xy$ and differentiating the moment generating function $r$ times to find the $r$th moment of $z$. In solving the problem by this method, we need to know more specific background information (such as the definition of a moment generating function and the Taylor series expansion for the exponential function), but otherwise the problem is solved in a straightforward manner by this method as well. The principal general problem-solving method used to solve this problem was the setting of subgoals. A variety of such subgoals were logically related to the solution of the problem, namely, that of deriving the probability density function $g(z)$, that of deriving the joint distribution function $f(x, y)$, or that of deriving the moment generating function for $z$. Setting the subgoals in each case is a part of an overall calculative plan for solving the problem in each of the three cases. For example, a first step in solving the problem might be to write down the formula for the $r$th moment of $z$ in terms of a double integral of $xy$ and the joint probability density function $f(x, y)$. This would suggest that we needed to determine the joint probability density function as a subgoal in order to do the integration and solve the problem.

The lengths of two parts, A and B, are normally distributed with means $\mu_A = 2$ centimeters and $\mu_B = 4$ centimeters and standard deviations $\sigma_A = 0.03$ centimeter and $\sigma_B = 0.04$ centimeter. One A piece and one B piece are randomly assembled and laid end to end to form a length about 6 centimeters long. If the assembly is to fit certain quality control standards, it must be between 5.91 and 6.09 centimeters long. What percentage of such assemblies will fail to fall within these limits?
Stop reading and try to solve the problem. We set as a subgoal that we must determine the distribution function for the sum of two random variables, A and B. We know from background statistical knowledge that, if A and B are normally distributed random variables, then A + B will be a normally distributed random variable with a mean equal to the sum of the means of the component random variables and a variance equal to the sum of the variances. Thus, $\mu_{A+B} = 6$ centimeters and $\sigma_{A+B} = (0.03^2 + 0.04^2)^{1/2} = (0.0025)^{1/2} = 0.05$. Having achieved the subgoal of determining the distribution function for the random variable A + B, we now set the second and final subgoal to be to determine what percentage of the distribution lies outside a region of 0.09 on either side of the mean. This subgoal can be achieved from a table of the normal distribution, provided we know how many standard deviations is represented by 0.09 centimeter. To determine this amount we simply divide 0.09 by 0.05 to get 1.8 standard deviation units, telling us that we are asking for the percentage of cases falling in the two tails of a normal distribution 1.8 standard deviation units out from the mean. Looking up the value $z = 1.8$ in a table of the normal distribution gives the figure of 3.6 percent in one tail or 7.2 percent in both tails. Once again, the solution of the problem proceeds from our setting a series of one or more subgoals that, taken together, constitute the solution to the entire problem.

Determine one way in which the random variable $z$ might have been formed, where the moment generating function of $z$ is $M_z(\theta) = \dfrac{e^{4\theta + (\theta^2/2)}}{(1 - 2\theta)^{3/2}}$.

Stop reading and try to solve the problem.
In contrast to earlier problems, in this one the goal is specified but the givens are not, and we must determine some set of givens such that the goal, namely, the moment generating function, can be derived as a consequence. The obvious way to check out any hypothesized set of givens is to use the method of contradiction. In addition, we should use the method of working backward from the goal expression, since this is a unique starting point in the problem. The most relevant piece of background information is that the moment generating function of a sum of independent random variables is the product of the moment generating functions of each component random variable. Stop reading and try again to solve the problem, if you did not before. By examining a table of such moment generating functions, we can quickly exclude the possibility that the moment generating function of $z$ is itself that of a random variable with a simple standard distribution function. The next simplest hypothesis would be that $z$ is the sum of two random variables, each of which has a simple familiar distribution function. This being the case, we should work backward from the goal expression by factoring it into two components, each of which is a moment generating function for a familiar distribution function. The most obvious split of the goal moment generating function is probably to multiply the numerator times the reciprocal of the denominator. It turns out that the numerator is the moment generating function for a normal distribution with mean 4 and standard deviation 1, and the reciprocal of the denominator is the moment generating function for a random variable of the $\chi^2$ distribution on 3 degrees of freedom, which means that $z$ is the sum of these two random variables.
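The factorization can be checked numerically. The sketch below assumes the goal expression reads $M_z(\theta) = e^{4\theta + \theta^2/2} / (1 - 2\theta)^{3/2}$ and uses the standard textbook forms of the normal and chi-square moment generating functions. The function names are illustrative choices, not part of the original problem.

```python
import math


def mgf_normal(theta, mean, sd):
    """Moment generating function of a normal distribution."""
    return math.exp(mean * theta + 0.5 * (sd * theta) ** 2)


def mgf_chi_square(theta, df):
    """Moment generating function of a chi-square distribution (theta < 1/2)."""
    return (1.0 - 2.0 * theta) ** (-df / 2.0)


def mgf_goal(theta):
    """The goal expression, as read from the problem statement (an assumption)."""
    return math.exp(4.0 * theta + theta ** 2 / 2.0) / (1.0 - 2.0 * theta) ** 1.5


def matches_goal(thetas=(-0.4, -0.1, 0.0, 0.2, 0.4)):
    """Compare the product of the two component MGFs with the goal at sample points."""
    return all(
        math.isclose(
            mgf_normal(t, 4.0, 1.0) * mgf_chi_square(t, 3),
            mgf_goal(t),
            rel_tol=1e-12,
        )
        for t in thetas
    )
```

Agreement at a handful of sample points is not a proof, but it is a quick way to test a hypothesized split before matching it against a table by hand.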
Had this particular factorization not worked, there are a number of other simple factorizations of the moment generating function that might have been matched for form against our table of moment generating functions for familiar distributions. In many ways, the solution of the problem is terribly simple. We can ask, "How can we not start with the goal expression, which is the only given in the problem other than implicit given information?" Indeed, if we start to manipulate the goal expression and know the relevant background information about the moment generating functions, it is difficult to see how we can fail to solve the problem. Nevertheless, many people do fail to solve this problem and other equally simple problems, because they have no idea what to do. In many cases, they are genuinely deficient in important background information, but those who knew they had to work backward from the goal expression in the present problem would likely look up in books the relevant information about moment generating functions that was needed in order to solve the problem. Those who have a thorough knowledge of the specific subject matter probably need to have no conscious understanding of general problem-solving methods in order to solve this and many other problems. However, those who are learning the specific subject matter will be aided in this learning by a thorough knowledge of general problem-solving methods, which suggest what types of information are needed in order to solve problems. Again, in the problem with the two pieces A and B that are joined end-to-end to form a new combined piece that must fall within certain tolerance limits, students might lack the specific background information about the distribution function of the sum of two normally distributed random variables.
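For the two-piece assembly problem, the two subgoals (the distribution of the sum, then the tail percentage) can be sketched as follows. This assumes the two lengths are independent, which the random assembly suggests, and replaces the printed normal table with the error function from Python's standard library. The function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def percent_outside(mean_a, sd_a, mean_b, sd_b, low, high):
    """Percentage of A + B assemblies falling outside the interval [low, high]."""
    # Subgoal 1: distribution of the sum (means add, variances add).
    mean = mean_a + mean_b
    sd = math.sqrt(sd_a ** 2 + sd_b ** 2)
    # Subgoal 2: probability mass in the two tails.
    inside = normal_cdf((high - mean) / sd) - normal_cdf((low - mean) / sd)
    return 100.0 * (1.0 - inside)


result = percent_outside(2.0, 0.03, 4.0, 0.04, 5.91, 6.09)
```

The computed value is slightly under 7.2 percent, because a printed table rounds the one-tail figure before it is doubled.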
However, having clearly defined the subgoal of determining such a distribution function in order to determine the percentage of cases that lie in its tails, it is likely that students would look for the directly relevant piece of information they lacked. Formulas for getting certain information from other information often automatically provide you with a set of subgoals, namely, that of determining the values of the various components of these formulas. Thus, if you have enough specific background information to know the appropriate general formulas, you can often substitute that information for an understanding of general problem-solving methods in those cases where you know some general formula that encompasses all the aspects of the problem. However, if no such formula exists or if you do not know it, understanding general problem-solving techniques can be quite crucial in devising an adequate plan to solve the problem.

COMBINATORIAL ANALYSIS

How many ways can a set of contestants consisting of four men, three women, two boys, and three girls be selected from an audience consisting of eight men, nine women, six boys, and six girls?

Stop reading and try to solve the problem. To solve the problem, we might set a series of subgoals; that is, we might determine how many ways there are to pick first the men alone, then the women alone, then the boys alone, and then the girls alone. Let us call the solutions to these four subproblems $n_1$, $n_2$, $n_3$, and $n_4$. Stop reading and try again to solve the problem, if you did not before. The number of ways to pick an entire set of contestants is simply the product $n_1 \cdot n_2 \cdot n_3 \cdot n_4$. Each of the subgoals is a simple combinations problem (unordered sets obtained by sampling without replacement).
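The subgoal structure maps directly onto code: one combinations count per group, then a product. The sketch below uses `math.comb` and `math.prod` from the Python standard library (Python 3.8 or later). The dictionary layout and function names are illustrative assumptions.

```python
from math import comb, prod

# Group sizes taken from the problem statement.
AUDIENCE = {"men": 8, "women": 9, "boys": 6, "girls": 6}
CONTESTANTS = {"men": 4, "women": 3, "boys": 2, "girls": 3}


def ways_per_group(audience, contestants):
    """One subgoal per group: unordered selection without replacement."""
    return {group: comb(audience[group], contestants[group]) for group in contestants}


def total_ways(audience, contestants):
    """The overall answer is the product of the four subgoal answers."""
    return prod(ways_per_group(audience, contestants).values())
```

Calling `ways_per_group(AUDIENCE, CONTESTANTS)` shows the four intermediate counts, and `total_ways(AUDIENCE, CONTESTANTS)` multiplies them together.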
Thus, the total number of ways is simply $\binom{8}{4}\binom{9}{3}\binom{6}{2}\binom{6}{3} = \frac{8!}{4!\,4!}\cdot\frac{9!}{3!\,6!}\cdot\frac{6!}{2!\,4!}\cdot\frac{6!}{3!\,3!}$.

John and Fred agree to play a tennis match, with the winner to be the person who first wins two sets in a row or a total of three sets. Find the number of ways the match can occur.

Stop reading and try to solve the problem. The most straightforward way to determine the number of ways the match can occur is to construct a tree diagram, marking all terminals of the tree where either one of the conditions is first satisfied and stopping the growth of the tree from that point on. The tree has two branches at each node, namely, A wins or B wins. Alternatively, we can determine the answer without explicitly constructing the tree, by the following line of reasoning. We first make certain inferences from the information given in the problem, namely, that the match cannot end before two sets have been played and must end after a maximum of five sets have been played (since out of five sets one player must win at least three sets). Having made these inferences, the problem of determining the number of ways the match can occur can be reduced to a set of four subproblems, namely, we must determine how many ways the match can end after two sets, three sets, four sets, or five sets. Stop reading and try again to solve the problem, if you did not before. Clearly, there are only two ways the match can end after two sets, namely, A wins both sets or B wins both sets. There are also only two ways the match can end after three sets: A wins the first set and B wins the next two, or B wins the first set and A wins the next two. Now we might note that, in general, at each level of the tree, after the second, there will be exactly two terminal nodes and two nonterminal nodes under the rule that the winning player must win two sets in a row.
Thus, at every level prior to the last, there will be exactly two terminal nodes and two nonterminal nodes. At the last level, there will be four terminal nodes, since, by the three-set rule, all nodes must be terminal once five sets have been played. Thus, there are two terminal nodes after two sets, two terminal nodes after three sets, two terminal nodes after four sets, and four terminal nodes after five sets, or 10 terminal nodes in all (and so 10 ways the match can occur). The principal general problem-solving methods used in solving the problem were inference and the subgoal method. Note that although the specialized method of explicitly constructing a tree diagram will also solve this problem without the need for using the more general subgoal method, the subgoal method in combination with certain inferences generalizes easily to problems in which to construct an explicit tree would be extremely laborious.

John pays a quarter to play a simple coin-flipping game against a gambling casino. The quarter entitles him to play a maximum of five coin flips against the house. John wins $1 every time he calls the coin correctly (heads or tails) and loses $1 every time he calls the coin incorrectly. John begins with $3 and will stop playing whenever he loses his entire stake or wins $3 (that is, has a total of $6). Of course, he must quit after playing a maximum of five coin flips. Find the number of ways the playing can occur.

Stop reading and try to solve the problem. A first general problem-solving method we might use is to note the similarity to the previous problem.
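For reference, the explicit tree for the tennis match can be generated mechanically. The sketch below is one possible encoding, assumed here rather than taken from the text: a match is a string of set winners, "A" or "B", and the tree stops growing as soon as either stopping condition is met.

```python
def match_over(sets):
    """True once someone has won two sets in a row or three sets in total."""
    if len(sets) >= 2 and sets[-1] == sets[-2]:
        return True
    return sets.count("A") == 3 or sets.count("B") == 3


def enumerate_matches(prefix=""):
    """Return every complete sequence of set winners that extends prefix."""
    if match_over(prefix):
        return [prefix]  # terminal node of the tree
    results = []
    for winner in "AB":  # two branches at each nonterminal node
        results.extend(enumerate_matches(prefix + winner))
    return results


matches = enumerate_matches()

# Group the terminal nodes by level, mirroring the four subproblems.
by_length = {}
for m in matches:
    by_length[len(m)] = by_length.get(len(m), 0) + 1
```

The per-level counts in `by_length` correspond to the subproblems "how many ways can the match end after two, three, four, or five sets." A different stopping rule in `match_over` adapts the same skeleton to other games of this kind.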
This similarity leads to the conjecture that we could solve the problem either by constructing an explicit tree diagram or by making certain inferences about the tree diagram and then breaking up the problem into subproblems to determine how many ways the playing can occur, stopping after n coin flips, for all n ≤ 5. Stop reading and try again to solve the problem, if you did not before. One simple inference is that John must play for at least three coin flips, since, at worst, he will lose $1 on each coin flip, and he has $3 to play with. Since he can play at most for five coin flips for his original quarter, we know that we can break the problem into three subproblems, namely, to determine how many ways the playing will stop after three, four, and five coin flips. Clearly, there are exactly two ways the playing will stop after three coin flips (two terminal nodes), leaving six nonterminal nodes after three coin flips. Another relevant inference is that it is impossible to be ahead or behind by an even number of dollars after any odd number of coin flips (such as three flips). Thus, it is impossible for John to be either even or ahead or behind by $2 after three flips. Hence, the six nonterminal nodes must all involve different sequences of winnings and losings that total either +$1 or −$1 and, by symmetry, there must be three of each type. From this we can conclude that there are no terminal nodes after four coin flips and 12 nonterminal nodes. The 12 nonterminal nodes at level 4 lead to 24 nodes at level 5, all of which, by definition, must be terminal. Thus, there are precisely 26 different ways the playing can occur.

Binomial theorem. Prove that $(a + b)^n = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r$, where $\binom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$.
The basic problem-solving method to be used is mathematical induction, which we have already noted involves a combination of the general problem-solving methods of special case (proving the theorem true for $n = 1$) and the subgoal method (dividing the proof of the theorem into two parts: proving it true for $n = 1$ and showing that, if it is true for $n$, the theorem is true for $n + 1$). Stop reading and try to solve the problem. The theorem is trivially true for $n = 1$, so the crux of the proof consists of assuming that the theorem holds for $n$ and proving it is true for $n + 1$. To prove this lemma, we assume the theorem is true for $n$ and multiply both sides of the equation by $(a + b)$. This operation yields the term $(a + b)^{n+1}$ on the left side of the equation, as desired. The right side of the equation will clearly involve exactly $n + 2$ terms of the form $a^{n+1-r} b^r$, where $r$ goes from zero to $n + 1$, as desired. This is obvious by inspection. What remains is to prove that the coefficients of each $a^{n+1-r} b^r$ term have the form $\binom{n+1}{r}$. Except for the terms $a^{n+1}$ and $b^{n+1}$, which arise only once in the multiplication of $(a + b) \cdot \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r$, every other $a^{n+1-r} b^r$ term arises in two places. The term in the product that contains $a^{n+1-r} b^r$ is obtained from $a \cdot \binom{n}{r} a^{n-r} b^r + b \cdot \binom{n}{r-1} a^{n-r+1} b^{r-1} = \binom{n}{r} a^{n+1-r} b^r + \binom{n}{r-1} a^{n+1-r} b^r = \left[\binom{n}{r-1} + \binom{n}{r}\right] a^{n+1-r} b^r$. All that remains is to show that $\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r}$. This remaining subgoal is trivially established by algebraic combination of the two fractions, and the theorem is proved. Besides using the method of mathematical induction, which we have noted is an application of two general problem-solving methods, our implementation of the proof involved breaking the problem into two parts: first, determining that there were the correct number and type of terms $a^{n+1-r} b^r$ on the right side of the equation and, second, determining that the coefficients of each such term were of the proper form, namely, $\binom{n+1}{r}$.
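The algebraic combination of the two fractions, described in the text only as trivial, can be written out as follows. This is a sketch of one standard route, valid for $1 \le r \le n$. The particular common denominator chosen is an assumption about how the step would be carried out, not a transcription from the source.

```latex
\begin{align*}
\binom{n}{r-1} + \binom{n}{r}
  &= \frac{n!}{(r-1)!\,(n-r+1)!} + \frac{n!}{r!\,(n-r)!} \\
  &= \frac{n!\,r}{r!\,(n-r+1)!} + \frac{n!\,(n-r+1)}{r!\,(n-r+1)!} \\
  &= \frac{n!\,\bigl(r + n - r + 1\bigr)}{r!\,(n+1-r)!} \\
  &= \frac{(n+1)!}{r!\,(n+1-r)!} = \binom{n+1}{r}.
\end{align*}
```

Each line reduces the distance to the goal form: first a common denominator, then a single numerator, then one factorial over two.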
Achieving the final subgoal of showing that $\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r}$ could be said to involve hill climbing on a two-dimensional evaluation function consisting of the number of terms in the numerator and the denominator of the coefficient of $a^{n+1-r} b^r$. In the goal, the coefficient of $a^{n+1-r} b^r$ is a simple fraction consisting of one factorial in the numerator divided by two factorials in the denominator. The expression $\binom{n}{r-1} + \binom{n}{r}$ clearly involves two separate factorial fractions that must be combined into a single factorial fraction with two factorials in the denominator and one in the numerator. By analogy to similar problems, we first express both fractions in terms of the same denominator by multiplying numerator and denominator of each fraction by the appropriate numbers. Then we add the numerators, putting them over the common denominator, and factor the numerator to obtain a coefficient of $a^{n+1-r} b^r$ of the desired form. Of course, most problem solvers engaged in solving problems of this type will have practiced the latter sequence of operations so well in hundreds of preceding problems that they will hardly need to think of applying any general problem-solving method in order to implement the algebraic solution. The evaluation function used in defining the initial breakup into subproblems (determining whether there were the right number of $a^{n+1-r} b^r$ terms and determining whether the coefficients matched) comes straight from characterization of the right side of the goal expression, namely $\sum_{r=0}^{n+1} \binom{n+1}{r} a^{n+1-r} b^r$. After multiplying $(a + b) \cdot \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r$, we obtain $2(n + 1)$ terms, and these terms must be reduced to $n + 2$ terms of the proper form, in order to achieve the goal. This reduction can be subdivided into two parts: first we achieve the right number of terms having the proper $a^{n+1-r} b^r$ components and then we determine if the coefficients of these terms match the desired coefficients in the goal expression. Note that in order to define the subgoals, it is not necessary to explicitly define any single numerical or vector-valued evaluation function. All that is necessary is that we have a more or less explicit awareness of some of the dimensions on which the goal expression differs from the given expression and define subgoals on the basis that they match the goal expression on more dimensions than the given expression.

NUMBER THEORY

Prove that if $(2^N - 1)$ is a prime number, then $N$ is a prime number. This problem and its proof were given to me by Al Stevens.

Stop reading and try to solve the problem. A good general problem-solving method to apply initially is the method of contradiction. This method is suggested by the existence of two simple alternatives for $N$: either it is prime or it is not prime. If it is not prime, it is expressible as the product of two integer factors, neither of which equals unity. If $N$ being not prime in conjunction with $2^N - 1$ being prime can be shown to yield the contradictory conclusion that $2^N - 1$ is not prime, then the original theorem will be established. Stop reading and try again to solve the problem, if you did not do so before. To implement the method we must show that $N$ is not prime implies that $2^N - 1$ is not prime. If $N$ is not prime, then $2^N - 1 = 2^{jk} - 1$, where $j$ and $k > 1$. Under these circumstances $2^{jk} - 1$ can be factored into $(2^j - 1)(2^{j(k-1)} + 2^{j(k-2)} + \cdots + 2^j + 1)$. This latter is established by simple division of $(2^j - 1)$ into $(2^{jk} - 1)$. Thus, $2^N - 1$ is not prime, contradicting the given information.
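The factorization step can be checked for small cases. In the sketch below, the cofactor $2^{j(k-1)} + \cdots + 2^j + 1$ is the quotient obtained by dividing $2^j - 1$ into $2^{jk} - 1$. The function name `factor_mersenne` is an illustrative choice, and searching for the smallest divisor $j$ is just one convenient way to split $N$.

```python
def factor_mersenne(n):
    """For composite n = j * k (j, k > 1), return the pair
    (2**j - 1, 2**(j*(k-1)) + ... + 2**j + 1), whose product is 2**n - 1.
    Return None when n has no such split (n prime or n < 4)."""
    for j in range(2, n):
        if n % j == 0:
            k = n // j
            small = 2 ** j - 1
            cofactor = sum(2 ** (j * i) for i in range(k))
            return small, cofactor
    return None
```

For example, `factor_mersenne(6)` splits 6 as 2 × 3 and returns two factors, each greater than 1, whose product is 63. For a prime exponent such as 7 it returns `None`, since the argument gives no factorization in that case.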
About the only general problem-solving method I can suggest that might give you the idea of trying to use the factor $(2^j - 1)$ would be general experience with problems involving similar expressions, namely, those of the form $a^m - b^m$. Of course, there are not too many obvious factors to try to use other than either $(2^j - 1)$ or $(2^k - 1)$, either of which will do. Thus, once you decide to use the method of contradiction, the rest of the problem is relatively straightforward.

MODERN ALGEBRA

Given that the positive integers are well ordered (each nonempty subset of the positive integers contains exactly one smallest integer), prove that there is no integer between 0 and 1.

Stop reading and try to solve the problem. Use the method of contradiction. Now try again to solve the problem, if you did not do so before. Assume that there are one or more integers between 0 and 1. By the well-ordering property, there is in this nonempty set of integers between 0 and 1 some least integer, $M$, for which $0 < M < 1$. Multiplying both sides of these inequalities by the number $M$, we have $0 < M^2 < M$. Thus, $M^2$ must be another integer in the class of integers between 0 and 1 and, furthermore, $M^2 < M$, which contradicts the assumption that $M$ was the least integer between 0 and 1. Since the contradiction was reached by assuming the existence of integers between 0 and 1, this implies that there is no integer between 0 and 1.

Given a set of elements, $G$, with a binary operation, $*$, defined over $G$ such that $G$ is a group. The definition of the group is a set of elements with a binary operation such that all of the following four properties hold. (1) Closure: for all $a$ and $b$ in $G$, $a * b = d$ is a member of $G$. (2) Associativity: for all $a$, $b$, and $c$ in $G$, $(a * b) * c = a * (b * c)$. (3) Left identity: for all $a$ in $G$, there exists an $e$ in $G$ such that $e * a = a$.
(4) Left inverse: for all $a$ in $G$, there exists an $a^{-1}$ in $G$ such that $a^{-1} * a = e$. For such a system prove the following theorem: Unique left inverse: for all $a$ in $G$, there exists a unique $a^{-1}$ in $G$ such that $a^{-1} * a = e$.

Stop reading and try to prove the above theorem. Whenever we encounter a uniqueness proof, the method of contradiction is immediately suggested. That is, we should assume that there exist two different left inverses $a_1^{-1}$ and $a_2^{-1}$ such that $a_1^{-1} * a = e$ and $a_2^{-1} * a = e$, and attempt to show that $a_1^{-1} = a_2^{-1}$ (contradicting the assumption that $a_1^{-1}$ and $a_2^{-1}$ are different). Stop reading and try again to solve the problem. It is trivial to conclude that, if $a_1^{-1} * a = e$ and $a_2^{-1} * a = e$, then $a_1^{-1} * a = a_2^{-1} * a$. However, it may appear somewhat difficult to peel off the identical $a$'s from the right-hand side of the equation (since we have not already proved any right cancellation law). Therefore, we might set as a subgoal (lemma) to prove the right cancellation law, namely, that for all $b$ in $G$, $b * a = c * a$ implies that $b = c$. However, this law is certainly not easier to prove than the unique left inverse theorem itself. Thus, this subgoal appears unlikely to be useful in the present problem. However, there is another subgoal (conjectured lemma) that is easier to establish and thus facilitates solution of the present problem. This lemma does allow us in essence to peel off the $a$ from the equation $a_1^{-1} * a = a_2^{-1} * a$. Try to conjecture this lemma (subgoal) and then prove it, if you have not done so already. The useful lemma (subgoal) is that the left inverse of an element in a group is the same as the right inverse, namely, $a^{-1} * a = e$ implies that $a * a^{-1} = e$. Clearly, if this lemma were true, it would permit us to multiply both sides of the equation $a_1^{-1} * a = a_2^{-1} * a$ on the right by the quantity $a^{-1}$ and change the identical $a$'s into $e$'s on both sides of the equation. Since we have special given information regarding $e$'s, they might be easier to peel off than $a$'s. Stop reading and try to prove the lemma that the left inverse equals the right inverse. Proof of this lemma involves use of the inference method; that is, we simply perform substitution operations on the quantity $a * a^{-1}$ to attempt to show that $a * a^{-1} = e$. The exact proof of this lemma follows. $a * a^{-1} = a * (e * a^{-1}) = a * ((a^{-1} * a) * a^{-1}) = (a * a^{-1}) * (a * a^{-1})$. Let $a * a^{-1} = b$; then $b = b * b$. There exists $b^{-1}$ in $G$ such that $b^{-1} * b = e = b^{-1} * (b * b) = (b^{-1} * b) * b = e * b = b$. Thus, $b = e$ and $a * a^{-1} = e$. Q.E.D. Now stop reading and try to prove the rest of the theorem, if you have not done so already. The first lemma does not quite permit us to prove the theorem in a straightforward way, since what we obtain is an expression of the form $a_1^{-1} * e = a_2^{-1} * e$, and we are not yet justified in dropping the $e$'s from both sides of the equation. We have been given the left identity property but not the right identity property. Therefore, it is necessary to set a second subgoal (lemma) of proving the right identity property for a group, namely, that $e * a = a$ implies that $a * e = a$ for all $a$ in $G$. Stop reading and prove this second lemma and then continue to prove the rest of the theorem, if you have not done so already. The proof of the second lemma is quite trivial again by the inference method and is given below: $a * e = a * (a^{-1} * a) = (a * a^{-1}) * a = e * a = a$. Q.E.D. Given the above two lemmas, the proof of the original theorem concerning the uniqueness of the left inverse is quite trivial and is given below: $a_1^{-1} * a = a_2^{-1} * a$ and $(a_1^{-1} * a) * a^{-1} = (a_2^{-1} * a) * a^{-1}$, that is, $a_1^{-1} * (a * a^{-1}) = a_2^{-1} * (a * a^{-1})$. But, by the first lemma, $a * a^{-1} = e$. Therefore, $a_1^{-1} * e = a_2^{-1} * e$ and, by the second lemma, $a_1^{-1} = a_2^{-1}$ (Q.E.D.).

MECHANICS

What constant force will cause a mass of 3 kilograms to achieve the speed of 20 meters per second in 6 seconds starting from rest?
Relevant background information is Newton's second law: $f = ma$, where $a = dv/dt = d^2x/dt^2$. Also relevant is some very elementary knowledge of calculus. Stop reading and try to solve the problem.

Since we know a formula for the goal quantity (force), the first step is to work backward from the goal and write down the known formula for force, namely, $f = ma$. Since we know the mass ($m$), we immediately define as a subgoal the determination of the acceleration $a$. Stop reading and try again to solve the problem, if you did not do so before.

By definition, $a = dv/dt$ and $a = d^2x/dt^2$. We choose to work with the former formula, since the given information involves velocities $v$ and not positions $x$. Since we do not know $dv/dt$, but only certain values of the velocity at the beginning and end of the motion ($v_0 = 0$ and $v_6 = 20$ meters per second), we set another subgoal of transforming the equation $a = dv/dt$ into an equation relating $a$ to the known quantities $v_0$ and $v_6$. Stop reading and try again to solve the problem, if you did not before.

Elementary knowledge of calculus tells us that this solution is achieved by use of the integration operation, yielding $\int a\,dt = \int dv$, and since $a$ is a constant over time, $at = v + C$. The constant of integration $C = -v_0 = 0$, since $v = 0$ at $t = 0$. Thus, we have the formula $at = v$, which implies that $a = 20/6 = 3\tfrac{1}{3}$ meters/sec$^2$. Having achieved the subgoal of determining the acceleration, the rest is simple. Since the mass equals 3 kilograms, force equals $3 \times 3\tfrac{1}{3} = 10$ newtons.

The principal problem-solving methods used were (a) that of working backward to determine the principal subgoal in the problem, namely, determining the (constant) acceleration, and (b) that of hill climbing in the determination of acceleration, using an evaluation function concerned with how close to given quantities the quantities were on the right side of the equation.
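The chain of subgoals in this problem can be written out as a few lines of Python. This is only a sketch: the function name `constant_force` and its parameter names are illustrative choices, not notation from the text.

```python
def constant_force(mass_kg, v_start, v_end, duration_s):
    """Force that changes the velocity from v_start to v_end at a constant rate."""
    # Subgoal: with constant acceleration, a * t = v_end - v_start.
    acceleration = (v_end - v_start) / duration_s
    # Goal: Newton's second law, f = m * a.
    return mass_kg * acceleration


force = constant_force(mass_kg=3.0, v_start=0.0, v_end=20.0, duration_s=6.0)
print(f"force = {force:.2f} newtons")
```

The order of the two statements in the function mirrors the working-backward analysis: the goal formula is written last, after the subgoal quantity it needs has been computed.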
By this latter evaluation function, we choose $dv/dt$ over $d^2x/dt^2$, since the former is at least in some way concerned with a quantity (velocity) that is known at some points. By contrast, the latter expression is concerned with position, about which nothing is known at all in the given information. Furthermore, we choose to transform $dv/dt$, which is a statement about the derivative of velocity, into a statement about $v$'s at certain points in time, since the latter are directly known from the given information and the former is not.

HEAT

A calorimeter contains 500 grams of water and 300 grams of ice, all at a temperature of 0° C. A 1,000 gram mass of an unknown substance is taken from a furnace where its temperature was 240° C and is dropped immediately into the calorimeter. As a consequence, all the ice is just melted with the temperature of the water remaining at 0°. What would be the final temperature of the water had the mass of the unknown substance been 2,000 grams? Neglect heat loss from the calorimeter and the heat capacity of the calorimeter.

The relevant background information for solving this problem consists of the following. The heat of fusion of water equals 80 cal/gm, which means that 80 calories of heat must be supplied to convert 1 gram of ice at 0° C to 1 gram of water at 0° C. Materials are considered to have an approximately constant specific heat capacity ($c$) over modest ranges of temperatures (consider the ranges discussed in the present experiment to be "modest"). When a body changes temperature, the heat gained or lost equals the mass of the body times the specific heat capacity times the difference in temperature (in degrees C). Finally, under the conditions of calorimeter experiments, the law of conservation of heat holds, namely, heat lost equals heat gained.
Stop reading and try to solve the problem.

It is desirable to introduce efficient symbolic notation to represent the unknown quantities in the present problem. Let $t_2$ be the temperature of the system (the 2,000 gram substance and the water in the calorimeter) after the 2,000 gram substance has been dropped into the water and allowed to reach equilibrium. Let $c_s$ be the specific heat capacity of the unknown substance. The goal is to solve for $t_2$, but to do so, we must set a subgoal. What is it? Stop reading and try to solve the problem, if you have not done so already.

The obvious subgoal is to determine the specific heat capacity of the unknown substance. This subgoal can evidently be achieved by using the results from the first experiment, where a 1,000 gram mass of the substance was just sufficient to melt 300 grams of ice without changing its temperature. Using the formula that heat lost equals heat gained, we know that $1000 \cdot c_s \cdot (240 - 0) = 80 \cdot 300$, or $c_s = 0.1$. Stop reading and solve the rest of the problem, if you have not done so already.

Having solved for the specific heat capacity of the unknown substance, it is now possible to apply the heat-lost-equals-heat-gained formula to the results of the second experiment in order to derive the final temperature of the system after the second experiment. Taking the specific heat capacity of water to be 1 cal/gm per degree C, the calculation is as follows:

$$(2000)(0.1)(240 - t_2) = 500(t_2 - 0) + 300(t_2 - 0) + (80)(300)$$

This simple linear equation in one unknown is trivially solved to yield $t_2 = 24°$ C, which is the final temperature of the system.

Principal general problem-solving methods used in the present problem were to label unknown quantities and to set a subgoal.
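The two-stage calculation above (first the subgoal, then the goal) can be sketched in Python as follows. The function and constant names are illustrative, and the specific heat of water is assumed to be 1 cal/gm per degree C, which the heat balance uses implicitly.

```python
HEAT_OF_FUSION = 80.0  # cal/gm, from the background information
C_WATER = 1.0          # cal/(gm * deg C); assumed value for liquid water


def specific_heat_from_first_run(mass_g, t_hot, ice_g):
    """Subgoal: heat lost by the substance cooling to 0 deg C just melts the ice."""
    return HEAT_OF_FUSION * ice_g / (mass_g * (t_hot - 0.0))


def final_temperature(mass_g, c_s, t_hot, water_g, ice_g):
    """Goal: solve m*c_s*(t_hot - t2) = (water + ice)*C_WATER*t2 + fusion*ice for t2."""
    heat_available = mass_g * c_s * t_hot - HEAT_OF_FUSION * ice_g
    return heat_available / (mass_g * c_s + (water_g + ice_g) * C_WATER)


c_s = specific_heat_from_first_run(mass_g=1000.0, t_hot=240.0, ice_g=300.0)
t2 = final_temperature(mass_g=2000.0, c_s=c_s, t_hot=240.0, water_g=500.0, ice_g=300.0)
print(f"c_s = {c_s:.3f} cal/(gm deg C), t2 = {t2:.1f} deg C")
```

Keeping the subgoal in its own function makes the dependency explicit: `final_temperature` cannot be called until the first experiment has supplied `c_s`.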
Also, in a problem of this type presented in a physics book, you would have to generate the relevant background information, since it would not be stated explicitly in the problem.

ELECTRICITY

Derive a formula for the electric-field intensity, $E$, established by a charge distributed uniformly along an infinitely long line with a linear charge density $\lambda$.

The important background information includes the following: The magnitude of the electric field produced by a point charge of magnitude $q$ at a distance $r$ from the point charge is $E = q/4\pi\epsilon_0 r^2$, where $\epsilon_0$ is a known universal constant that is dependent upon the measuring units. The linear charge density is $\lambda = dq/dl$, where $l$ represents position along the line. The direction of the electric-field vector is radially out from the point charge. The electric field produced at a point by a set of point charges is equal to the vector sum of the electric field produced by all component point charges at that point. Also relevant is some knowledge of elementary trigonometry, vectors, and calculus. Stop reading and try to solve the problem.

The most relevant general problem-solving method to the solution of this problem is for us to define subgoals (break up the problem into parts). What is the first subgoal we might consider? Stop reading and try again to solve the problem, if you did not before.

Although the problem asks us to describe the entire electric field produced by the line (at an infinity of points in space), we know by analogy to similar problems that this statement means we must derive a formula for the electric field at some arbitrary point in space. Thus, the problem is simplified by considering the electric field at only a single (variable) point in space.
Furthermore, symmetry indicates that the only relevant information is the distance of the point from the line, represented by $h$ in Fig. 11-2. Clearly, the electric field set up by a charge distributed along a straight line must have cylindrical symmetry (be equal at all points at the same distance from the line), since there is nothing different about the given information for any such point. Stop reading and try again to solve the problem, if you did not do so before.

Another useful general problem-solving technique would be to draw a diagram representing the important information in the problem. The problem can be broken into parts by defining another subgoal. What subgoal might this be? Stop reading and try again to solve the problem.

[FIGURE 11-2. Electric field produced by an infinite line with linear charge density $\lambda$. Axes: $l$ (position along the line) and $h$ (distance from the line); a segment $dl$ is marked on the line.]

Since we know from a physical assumption that the electric field at a point is equal to the sum of the contributions of the electric field produced by all charges, it is relevant to attempt to determine the individual contribution to the electric field at a point $h$ distant from the line due to any little piece of charge along the line. Consider the electric field produced at the point by the amount of charge present along an infinitesimally small segment of the line $dl$. The charge in this segment is $dq = \lambda\,dl$. Stop reading and try again to solve the problem by first solving the subgoal, if you have not done so already.

The contribution to the electric field $dE$ produced by the charge $dq$ in an infinitesimally small segment of a line $dl$, at a distance $r$ from the point, is given by the formula (which we know from background information) $dE = dq/4\pi\epsilon_0 r^2 = \lambda\,dl/4\pi\epsilon_0 r^2$.
Note that $dE$ should be a vector quantity, and we have only obtained an expression for the magnitude of the vector. It is also necessary to state the direction of the vector. According to background information, this direction is evidently radially out from the point charge at $dl$, as shown in Fig. 11-2. Having achieved the first subgoal, our next subgoal is to combine the contributions to the field from all segments $dl$ along the infinitely long line. This combining will evidently involve an integration, since the segments are infinitesimally small, rather than a summation where the contributions to the field are finite in number. Stop reading and try to solve the rest of the problem, if you have not done so already.

In attempting to combine separate contributions of each $dl$ along the line, it is necessary to note that the directions of the vectors $dE$ produced by each $dl$ are different. Thus, we cannot simply integrate the magnitudes of these vectors with respect to $l$ from minus infinity to plus infinity. Instead, we must resolve each vector into $l$ and $h$ components and integrate each separately with respect to $l$ from minus infinity to plus infinity. Thus, the next subgoal is to resolve the electric field produced by each $dl$ into two components. Elementary trigonometry applied to the previous formula for the electric field produced by $dl$ yields the following components:

$$dE_l = dE \cos\theta = \frac{\lambda\,dl}{4\pi\epsilon_0 r^2}\left(\frac{-l}{r}\right) = \frac{-\lambda\,l\,dl}{4\pi\epsilon_0 (h^2 + l^2)^{3/2}}$$

$$dE_h = dE \sin\theta = \frac{\lambda\,dl}{4\pi\epsilon_0 r^2}\left(\frac{h}{r}\right) = \frac{\lambda\,h\,dl}{4\pi\epsilon_0 (h^2 + l^2)^{3/2}}$$

Note that in the final expressions for $dE_l$ and $dE_h$ we substituted $(h^2 + l^2)^{1/2}$ for $r$ because we are intending to integrate with respect to $l$, and $r$ is a function of $l$. Thus, we must express $r$ in terms of $l$.
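As a sanity check on the two component formulas, one can sum the contributions numerically over a long but finite stretch of the line. The sketch below makes several assumptions that are not in the text: SI units, a midpoint-rule sum, a truncation of the infinite line at 1,000 times the distance $h$, and illustrative values for the charge density and distance. It compares the summed $h$ component with $\lambda/2\pi\epsilon_0 h$, the standard closed-form result for an infinite line charge.

```python
import math

EPSILON_0 = 8.8541878128e-12  # permittivity of free space in F/m (SI units assumed)


def field_components(lam, h, half_length, steps):
    """Midpoint-rule sum of dE_l and dE_h over -half_length <= l <= half_length."""
    k = lam / (4.0 * math.pi * EPSILON_0)
    dl = 2.0 * half_length / steps
    e_l = 0.0
    e_h = 0.0
    for i in range(steps):
        l = -half_length + (i + 0.5) * dl   # midpoint of the i-th segment
        denom = (h * h + l * l) ** 1.5      # r**3 with r = (h**2 + l**2)**0.5
        e_l += -k * l * dl / denom          # component along the line
        e_h += k * h * dl / denom           # component away from the line
    return e_l, e_h


lam, h = 1e-9, 0.5  # illustrative values: coulombs per meter, meters
e_l, e_h = field_components(lam, h, half_length=1000.0 * h, steps=200_000)
closed_form = lam / (2.0 * math.pi * EPSILON_0 * h)
print(f"E_l = {e_l:.3e}, E_h = {e_h:.6e}, closed form = {closed_form:.6e}")
```

The contributions along the line cancel in pairs, as the symmetry argument predicts, while the contributions away from the line add up.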
This sort of manipulation to eliminate unnecessary terms by expressing them in other necessary terms is a form of hill climbing on an evaluation function concerned with the number of unknown terms. Stop reading and try to solve the rest of the problem, if you have not done so already.

Of course, all that remains now is to actually perform the two integrations to determine the $E_l$ and $E_h$ components of the field at the point $h$ distant from the line. This is shown in the work below:

$$E_l = \int_{-\infty}^{\infty} \frac{-\lambda\,l\,dl}{4\pi\epsilon_0 (h^2 + l^2)^{3/2}} = \frac{\lambda}{4\pi\epsilon_0}\left[\frac{1}{(h^2 + l^2)^{1/2}}\right]_{-\infty}^{\infty} = \frac{\lambda}{4\pi\epsilon_0}[0 - 0] = 0$$

$$E_h = \int_{-\infty}^{\infty} \frac{\lambda\,h\,dl}{4\pi\epsilon_0 (h^2 + l^2)^{3/2}} = \frac{\lambda h}{4\pi\epsilon_0}\left[\frac{l}{h^2 (h^2 + l^2)^{1/2}}\right]_{-\infty}^{\infty} = \frac{\lambda}{4\pi\epsilon_0 h}[1 + 1] = \frac{\lambda}{2\pi\epsilon_0 h}$$

Principal problem-solving methods used in the solution of this electrostatics problem were subgoals, representing information by symbols and diagrams, symmetry (noticing equivalence classes), similarity to previous problems, and perhaps some limited use of hill climbing. We might even contend that the subgoal of computing the contribution to the electric field at a point produced by a small quantity of charge $dq$ distributed over a small segment of the line $dl$ constituted the solution of a simpler problem and thus was an example of that general problem-solving method, in addition to representing the subgoal method.

ELECTRICAL ENGINEERING

Givens: You have a limited supply of 2-input AND gates and 2-input OR gates to use in constructing a variety of control circuits. A 2-input AND or OR gate has two input wires and one output wire. All input levels and output levels are either 0 or 1 (binary digital-logic circuits). A 2-input AND gate has a 1 on the output wire if and only if both input wires are at the 1 level. A 2-input OR gate has a 1 on the output wire if and only if either one or both of its inputs is at the 1 level.
In constructing control circuits, it is important to know that you may connect the same (source) wire to many different input wires of many different gates. Also, the output wire of one gate may be connected to one or more input wires of one or more gates in chains and even loops. In particular, the output of a gate may be connected to one of its own inputs. However, you may not connect two outputs. A somewhat related restriction is that you must not connect two wires to the same input wire, whenever doing so would complete an undesirable circuit between the two wires. In the present problem, assume that all such circuits are undesirable and do not connect two wires to the same input wire of a gate.

Goal: In the part of the circuit you are now constructing, there are 6 input wires and 15 output wires, each of which can be at the 0 level or the 1 level. Only 15 patterns of 0's and 1's will ever occur on the 6 input wires. Your task is to choose the set of 15 input patterns and construct a decoding circuit, using the minimum number of 2-input AND or OR gates, such that when any one of these 15 input patterns occurs, one and only one of the 15 output wires will be at the 1 level (the rest being at 0). Naturally, a different output wire should be at the 1 level for each of the 15 different input patterns. Stop reading and try to solve the problem.

The first step here, as in any problem, is to explore the problem, deriving whatever conclusions can be derived easily. For example, with 6 binary inputs, there are 64 possible input patterns, only 15 of which are being used. Four binary input wires would suffice to present 15 different input patterns, so there must be some advantage in using more input wires.
It would be a good guess that it simplifies the decoding circuit and minimizes the number of gates to employ 6 input wires, rather than 4. Along the same line, it would be reasonable to conjecture that the problem would be essentially solved if we knew which 15 input patterns to use.

If we were inclined toward number theory, we might inquire about the properties of the numbers 6 or 15. In this particular problem, such an inquiry could yield an immediate idea for the correct solution, especially if the problem solver were already sufficiently familiar with the use of AND and OR gates in circuit problems. However, let us not follow up this specific approach to the problem now. You can go back to consider this approach after we have gone through more straightforward and more general methods.

Another thing we might derive is the conclusion that the solution of the problem must require at least 15 gates, one for each different output wire. If the problem can be solved with 15 gates, this number must be the minimum. Of course, we do not know yet whether a greater number of gates than 15 will be required. Stop reading and try again to solve the problem, if you did not do so before.

So much for deriving quick conclusions. If you are inexperienced in using AND and OR gates in circuit problems, you will probably want to spend some time thinking about their properties and using them in a more or less random way, unrelated to the problem. You are probably somewhat familiar with "and" and "or" as used in logical expressions, but psychologically that is not quite the same as using AND and OR gates as transformations (operators) in circuit-design problems.

Near the beginning of your work on the problem, you should develop useful representations of concepts in it.
Vector notation for the six-bit binary input patterns will probably aid your thinking, for example, 110100 or 001000. Some sort of spatial representation of the two different kinds of gates (labeled boxes), and the input and output wires (lines) might also be helpful to you. Stop reading and try again to solve the problem, if you did not before.

These somewhat ponderous preliminaries to real work on the problem may seem completely unnecessary to some, but those for whom the preliminaries are unnecessary are either lucky in solving this particular problem or else are consciously or unconsciously accomplishing these preliminaries very quickly in their brains. Once a person becomes skilled at problem solving, these preliminaries, which take several paragraphs to explain, can be accomplished in seconds in the head.

Having accomplished the preliminaries of fully understanding the problem, deriving quick conclusions, and developing some useful verbal and spatial representations, it is time to see if some solution to the problem just pops into your head, probably because it is analogous to similar problems you have solved in the past. If nothing comes to mind, you might try more actively to think about whether you have solved similar problems in the past and what specific or general methods you used then. Let us assume that this is a failure; you never encountered a circuit problem before in your life or, in any event, you have not remembered anything that seems useful from previous problems.

What next? You might try breaking the problem into parts (subproblems or subgoals).
For example, you could note that three input wires can have eight different input patterns, which means that perhaps the problem could be broken down into two subgoals: that of mapping eight input patterns on wires 1, 2, and 3 onto eight of the output wires, and that of mapping seven input patterns on wires 4, 5, and 6 onto the remaining seven output wires. It seems a trifle inelegant to have been given 15, instead of 16, codes, but in a real-world problem, nothing guarantees this kind of elegance. Of course, this is a made-up problem, and it is more elegant than this. Ignoring the question of inelegance, we might spend some time trying this approach based on the subgoal method.

However, it happens in this case that the analysis into subproblems is not helpful. There is a power in the combinations across the two sets of three input lines that is being lost by this analysis into subproblems. I have not thought of any other analysis of this problem into subproblems that is helpful, either. Thus, the subgoal method is a complete bust on this problem, but if you tried this method you would be making a relatively intelligent error. What other general problem-solving method might you use? Stop reading and try again to solve the problem, if you have not already.

Many other general problem-solving methods could be tried, but the one that really cracks the problem open in a systematic, straightforward, though somewhat time-consuming, manner is to solve simpler problems. There are a large number of simpler problems. You can start as simple as you wish and work your way up through more complicated problems, and hope the general principle of the solution to the original problem becomes clear. Stop reading and try again to solve the problem, if you did not before.
A good subproblem to start with would be five input patterns on four input wires, to be decoded onto five output wires. There are only 16 possible binary input patterns on four wires, and you are to select five of the 16 to achieve a circuit using the smallest number of gates. Presumably, in working on this subproblem, you learn a number of principles that will be useful in solving the original problem. For example, you learn to focus on the input wires that are at the 1 level in any particular input pattern, because, without any inverters, it is only the 1-level inputs that can be used to turn on the correct output. Also, presumably you realize (if you did not already) that some circuits will turn on the correct output but also turn on some incorrect output wires, in violation of the requirements of the problem. This type of difficulty might incline you against selecting binary input codes that had too many 1's in them.

Note that we avoided choosing a simpler problem that had no more input patterns than it had input lines, because such a problem permits the trivial one-to-one solution that obviously will not work in the original problem and will give no insights into the original problem. Thus, we may already have realized that many or all of the input patterns must have more than a single 1 in them.

In the present simpler problem, the combination of not wanting too many 1's and wanting more than a single 1 in many or all of the input patterns essentially forces us to use different combinations of two 1's as inputs to AND gates, for example, 1100, 1010, 1001, 0110, 0101. At this point, we might see that this type of solution generalizes directly to the original problem, or we might solve another, slightly more difficult problem before seeing that the solution generalizes to the original problem.
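The difficulty with patterns that contain too many 1's can be made concrete with a short Python sketch. It assumes one 2-input AND gate for every pair of input wires and lists which gates would output a 1 for a given pattern; the function name `firing_gates` is illustrative.

```python
from itertools import combinations


def firing_gates(pattern):
    """Wire pairs (numbered from 1) whose 2-input AND gate outputs 1 for this pattern."""
    wires = range(len(pattern))
    return [(i + 1, j + 1) for i, j in combinations(wires, 2)
            if pattern[i] == "1" and pattern[j] == "1"]


for pattern in ["1000", "1100", "1110", "1111"]:
    print(pattern, firing_gates(pattern))
```

A pattern with a single 1 turns on no AND gate at all, a pattern with exactly two 1's turns on exactly one, and a pattern with three or four 1's turns on several at once, which is the violation described above.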
Stop reading and try again to solve the problem, if you did not before.

Let us back up a little. Maybe we never thought explicitly of the principle that too many 1's in an input pattern are no good. Nevertheless, we would be apt to obtain the solution to the simpler problem because the range of possible solutions is so much reduced. To be sure, there are a lot of different combinations of 16 patterns taken five at a time, and we must also describe the decoding circuit for any five we select. However, the number of logically different classes of potential solutions is much smaller than this. Without trying to enumerate all of the logically different classes of sets of five input patterns, we can indicate the nature of the features used to define these classes according to whether the pattern used consists of all 1's (1111); three 1's (such as 1110); two 1's (such as 1100); one 1 (such as 0100); all 0's (0000); or whether the same wire is at the 1 level in all of the five patterns, four of the five, three of the five, and so on. If you use the method of classificatory trial and error (being systematic about noting the features of the types of solutions you have considered and rejected), it should not take too long to hit upon the optimal solution to the simpler problem. Stop reading and try again to solve the problem, if you have not done so already.

The solution to the original problem is to choose the following set of 15 input patterns: 110000, 101000, 100100, 100010, 100001, 011000, 010100, 010010, 010001, 001100, 001010, 001001, 000110, 000101, 000011.
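The choice of patterns can be checked mechanically. The Python sketch below generates every six-bit pattern with exactly two 1's and decodes it with one 2-input AND gate per pair of input lines, then confirms that each pattern turns on one and only one output. The names `PAIRS`, `pattern_for`, and `decode` are illustrative, and wires are numbered from 0 inside the code.

```python
from itertools import combinations

N_INPUTS = 6
# One AND gate per unordered pair of input lines: 15 gates for 6 lines.
PAIRS = list(combinations(range(N_INPUTS), 2))


def pattern_for(pair):
    """Six-bit input pattern with 1's on exactly the two lines in `pair`."""
    return "".join("1" if k in pair else "0" for k in range(N_INPUTS))


def decode(pattern):
    """Outputs of the 15 AND gates (as 0 or 1) for the given input pattern."""
    bits = [ch == "1" for ch in pattern]
    return [int(bits[i] and bits[j]) for i, j in PAIRS]


patterns = [pattern_for(pair) for pair in PAIRS]
for index, pattern in enumerate(patterns):
    outputs = decode(pattern)
    # Exactly one output is on, and it is the gate wired to this pattern's pair.
    assert sum(outputs) == 1 and outputs[index] == 1

print(len(PAIRS), "gates;", ", ".join(patterns))
```

The number 15 arises as the number of ways of choosing 2 lines out of 6, which is the number-theoretic observation about 6 and 15 hinted at earlier in the discussion.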
The decoding circuit uses just 15 AND gates with input lines 1 and 2 connected to the first AND gate, input lines 1 and 3 to the second, input lines 1 and 4 to the third, and so on, up to input lines 5 and 6 to the 15th.

What if you chose so trivial a simpler problem that no useful insights were obtained? An example would be three input patterns on three input wires to be decoded onto three output wires. If you avoided the trivial one-to-one solution, you could still learn the necessary principles from this simpler problem. If you did not avoid the trivial solution, you could then pose a somewhat more complex problem, continuing this process until a problem was posed that was simple enough to solve easily but hard enough to involve some of the important principles of the solution to the original problem.

What if your judgment regarding simpler problems was faulty, and a harder problem was selected? For example, you might think that eight input patterns on four input wires was a simpler problem than the original problem (15 input patterns on six wires), but it is not. Furthermore, a solution to this problem will tend to lead you away from the optimal solution to the original problem. You must honestly face the fact that this is a potential trap that often accompanies the use of the simpler-problem method. Your criteria for judging the simplicity of a problem are vitally important for the success of the method. If you know this, explicitly specify criteria for problem simplicity within any class of problems, and continually question these criteria for problem simplicity when the supposedly simpler problem proves difficult, then you can avoid being trapped by the method.
What if, in working on the original problem or some simpler problem, you develop mental sets (unconscious assumptions) about the solution to the problem that are wrong and that prevent you from obtaining the necessary ideas for solving the problem? This often happens, especially when a person does not have a habit of continually trying to specify the methods being used and the assumptions being made. For example, in the present problem, we can develop the working hypothesis that somehow the six input wires should be considered in three groups of two wires each. This sort of crudely formulated working hypothesis could be very helpful, if it were correct. In this problem it is not correct, and it can be distinctly deleterious for getting the necessary ideas.

As usual, an ounce of prevention is worth a pound of cure. If you are careful to note the working assumptions you make, it will be easy to question those assumptions and think of other ideas (working assumptions) that violate them. However, sometimes even the most analytical problem solvers make unconscious working assumptions and then find themselves going around in circles, that is, repeatedly trying out the same incorrect solutions within a limited set that does not contain the correct solution. If you are aware of this possibility, then you can try to characterize your implicit assumptions and make one or more contrary assumptions.

COMPUTER PROGRAMMING

Computer programming problems provide particularly good examples of subgoals, the representation of information (naming), inference (representation of implicit information), analogy, and special case.
Computer programming problems frequently involve the solution of one or more mathematical problems, such as deriving an algorithm for the solution of an equation, in addition to the definition of a sequence of instructions to achieve the solution of the problem by the computer. This problem solving already provides an example of the subgoal method, with one subgoal being the mathematical solution of one or more problems and a second subgoal being the representation of this solution in a programming language.

Another basic application of the subgoal method to virtually all computer programming problems is the division of the problem into three parts: input, computation, and output. In addition to the input of the program itself, most programming problems require the values of certain variables to be input to the machine. The computer must be told from what source to expect the input (cards, magnetic tape, paper tape, or whatever) and the format of the input (alphabetic, alphanumeric, numerical, two-column fields, three-column fields, and so on). In addition, the computer must be told where to store this data and what names to give to the various subsets of input data. These instructions represent further subsubgoals of the input phase of the programming problem. Similarly, the computer must be told what values from what arrays to output, on what output medium (printer, cards, magnetic tape, paper tape, and so on), the output format, and alphanumeric headings for various portions of the output.

The computational portion of any large program must frequently be divided further into subgoals, the solution to each of these subgoals being called a subroutine. For example, it may be that a portion of the computation involved in a computer program is to find the value of a function such as $f(x, y) = a \log x + b \sin x + cy$. Without considering exactly where in the program this subroutine will be used, a computer programmer might write up a program for the computation of this function and give that portion of the program (subroutine) a name, so that it might be called at any point during the execution of the main program. The computation of values of functions constitutes but one relatively trivial example of the application of the subgoal method to computer programming problems. Other examples of subroutines include random-number generators, shuffling programs, finding the maximum value in an array of values, ordering or ranking a set of numbers, and searching for a particular alphanumeric label. Frequently, the programmer knows a number of subroutines that will be required to solve a computer programming problem, and programs to achieve these subroutines can be developed relatively independently of one another and of the main program. Because of the large independence that can be achieved in writing a computer program to achieve various subgoals, it is possible for a team of programmers to divide the work of writing a large program. However, it is frequently necessary for certain common naming conventions to be observed and for the writer of the main program to specify to the writers of each of the component subroutines the form and location of the input to their subroutines and the desired form and location for the output from the subroutines.

The importance of giving names to all the important concepts in a programming problem is so obviously forced upon any programmer by the necessity of representing every important aspect of the problem in a computer program that it bears no extensive discussion.
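The division into input, computation, and output, with the function evaluation packaged as a named subroutine, might look like the following Python sketch. Everything specific here is an assumption made for illustration: the exact form of the function, the coefficient values, the hard-coded input list standing in for cards or tape, and the output layout.

```python
import math


def f(x, y, a=1.0, b=1.0, c=1.0):
    """Subroutine: a*log(x) + b*sin(x) + c*y (form and coefficients assumed)."""
    return a * math.log(x) + b * math.sin(x) + c * y


def read_input():
    """Input phase: a hard-coded list stands in for cards or tape."""
    return [(1.0, 0.0), (2.0, 1.0), (4.0, -1.0)]


def compute(pairs):
    """Computation phase: call the subroutine once per input pair."""
    return [f(x, y) for x, y in pairs]


def write_output(pairs, values):
    """Output phase: a heading line followed by one formatted line per result."""
    lines = ["     x      y   f(x,y)"]
    for (x, y), value in zip(pairs, values):
        lines.append(f"{x:6.2f} {y:6.2f} {value:8.4f}")
    return "\n".join(lines)


pairs = read_input()
values = compute(pairs)
print(write_output(pairs, values))
```

Because `f` only needs its arguments and returns a value, it could be written and tested by someone who never sees the other three functions, which is the independence the passage describes.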
The frequent need to represent implicit information in the solution of programming problems is almost, but not quite, as obvious as the necessity of naming important concepts. For example, assume that one required subroutine is to sample randomly from a set without replacement. We might achieve this by simulating card shuffling in the computer. To simulate card shuffling in a computer it is necessary to represent explicitly what we know implicitly to be involved in shuffling a deck of cards. Shuffling a deck of cards involves first making a single partition at some random point near the middle of the deck in order to divide the set into two subsets (two interval subsets). The cut might easily be achieved in a computer by picking a random number between 0 and 10 and adding this to a number that is 5 less than half of the number of cards in the deck.

Having simulated the cut, we are now faced with simulating the actual shuffle. What is evidently involved in this shuffle is that the top card from one of the two subsets is inserted at some random point within the top few cards of the other subset and the second card is inserted randomly a few cards below the first card, and so on. This can be easily simulated on a computer by picking a random number between 0 and 2 for the number of cards from the other subset to intervene between any two adjacent cards from the first subset. This is not necessarily the best shuffling routine, but it vividly illustrates the process of explicitly representing implicit information as a component to the solution of a programming problem.
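A minimal Python sketch of this shuffling routine follows. Several details are assumptions rather than part of the description: the random offset for the cut is taken to run from 0 to 10 so that the cut is centered on the middle of the deck, the deck is assumed to hold at least a dozen cards, and the number of passes used in `sample_without_replacement` is an arbitrary illustrative choice.

```python
import random


def cut_point(n_cards, rng):
    """Cut near the middle: 5 less than half the deck, plus a random 0..10 (assumed range)."""
    return n_cards // 2 - 5 + rng.randint(0, 10)


def riffle(deck, rng):
    """One simulated shuffle: cut the deck, then interleave the two subsets."""
    cut = cut_point(len(deck), rng)
    first, other = deck[:cut], deck[cut:]
    shuffled = []
    while first:
        gap = rng.randint(0, 2)          # cards from the other subset that intervene
        shuffled.extend(other[:gap])
        other = other[gap:]
        shuffled.append(first.pop(0))    # next card from the first subset
    shuffled.extend(other)               # whatever is left of the other subset
    return shuffled


def sample_without_replacement(items, k, rng, passes=7):
    """Shuffle several times, then deal the top k cards (passes is an arbitrary choice)."""
    deck = list(items)
    for _ in range(passes):
        deck = riffle(deck, rng)
    return deck[:k]


rng = random.Random(0)
print(sample_without_replacement(range(52), 5, rng))
```

As the text says of the routine it describes, this is not necessarily a good shuffle; in practice a library routine such as Python's `random.sample` would normally be used instead.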
Analogy is widely used in the solution of programming problems in the relatively trivial sense that whenever a problem can be identified as being essentially identical to a problem for which a program already exists, a programmer will obtain that program from some library and incorporate it into his own program to solve that portion of the problem. This method plays an important role in solving computer programming problems, but its use is so widely understood that it hardly deserves much comment here.
Finally, the method of special case often plays an important role in the solution of computer programming problems. Programming problems frequently involve doing certain computational jobs over and over again for some multidimensional array of values or vectors of values as the input. It simplifies the problem greatly to first write a program to solve the problem in a special case and then extend this solution to the entire multidimensional array. Very frequently, this method amounts to little more than getting a subroutine for doing a particular job (such as computing the value of a function) and then embedding that subroutine within a set of control loops that iterate the subroutine through all of the values in an input matrix and output the results of the computation into the proper places in an output matrix. The method of special case is sometimes equivalent to the subgoal method.

References

Bartlett, F. Thinking: An experimental and social study. New York: Basic Books, 1958.
Chessin, P. L. Problem for solution. American Mathematical Monthly, 1954, 61, 258-59.
Duncker, K. On problem solving. Psychological Monographs, 1945, 58 (5, Whole No. 270).
Feller, W. An introduction to probability theory and its applications. (3rd ed.) Vol. I. New York: John Wiley & Sons, 1957.
Newell, A., Shaw, J. C., & Simon, H. A. The processes of creative thinking. In H. E. Gruber, G. Terrell, & M. Wertheimer (Eds.), Contemporary Approaches to Creative Thinking. New York: Atherton Press, 1962. Pp. 63-110.
Polya, G. How to solve it. Garden City, N.Y.: Doubleday & Company, 1957.
Polya, G. Mathematical discovery. Vol. I. On understanding, learning, and teaching problem solving. New York: John Wiley & Sons, 1962.
Simon, H. A., & Newell, A. Human problem solving. American Psychologist, 1971, 26, 145-159.

