A Hybrid Method For Solving A Single Nonlinear Equation

by

Jonathan H. Whitacre

Submitted in Partial Fulfillment of the Requirements for the Degree of Masters in the Mathematics Program

YOUNGSTOWN STATE UNIVERSITY
December, 2010

A Hybrid Method For Solving A Single Nonlinear Equation

Jonathan H. Whitacre

I hereby release this thesis to the public. I understand that this thesis will be made available from the OhioLINK ETD Center and the Maag Library Circulation Desk for public access. I also authorize the University or other individuals to make copies of this thesis as needed for scholarly research.

Signature: Jonathan H. Whitacre, Student / Date

Approvals:
Jozsi Jalics, Ph.D., Thesis Advisor / Date
Richard Burden, Ph.D., Committee Member / Date
J. D. Faires, Ph.D., Committee Member / Date
Peter J. Kasvinsky, Dean of School of Graduate Studies & Research / Date

ABSTRACT

The purpose of this paper is to develop a root finding method for nonlinear functions. The problem, f(x) = 0 where x ∈ ℝ, is common in many areas of mathematics and can be traced back as far as 1700 B.C. A cuneiform table in the Yale Babylonian Collection dating from that period gives a base-60 number equivalent to 1.414222 as an approximation to √2, a result accurate to within 10^(-5) (√2 ≈ 1.414214). [4] We wanted to develop a hybrid method that quickly produces a small interval containing the solution and then switches to a method with faster convergence. We have created a method to solve equations whose exact roots are not easy to find using common techniques learned in algebra and calculus courses. We have compiled test functions, some of our own and some from other works on the same topic. We have also compared our method with several other methods: the Secant method, False Position, a modified version of Modified False Position, Inverse Quadratic Interpolation, Bisection, and a few other hybrid methods. Our method begins with the modified version of Modified False Position, which will be discussed in more detail later, then switches to Müller's method once a certain tolerance is reached. In certain instances, our method switches back to the modified version of Modified False Position. We found that our method outperformed the non-hybrid methods in most cases and was competitive with the other hybrid methods, in many cases outperforming them as well.

ACKNOWLEDGMENTS

I would like to give special thanks to Dr. Jozsi Jalics for coordinating my thesis. I had him for the first time when I took his Ordinary Differential Equations class in Spring 2010, found his teaching style to be quite pleasant, and decided to write my thesis under him since he was knowledgeable in the topics I wished to write my thesis on. I also did quite well in his class and wish I had been able to take another class with him; however, due to the close proximity to my graduation and the limited classes I was eligible to take, the opportunity was not available.

I would also like to thank Dr. Richard Burden for doing most of the work in coordinating my thesis by suggesting the project that my thesis covers, meeting with me several times a week and even a few days over the summer, and giving much assistance with the code for our hybrid method. I first had him in Fall 2009 for Numerical Analysis, and I also enjoyed his class and his teaching style and desired to study the subject further. Again, due to the limited classes he taught, I was unable to take any more classes with him, so I decided to do my thesis in Numerical Analysis. Dr.
Burden is a co-author of a Numerical Analysis book, currently in its 9th edition, that was one of our sources in several sections of this thesis.

Further thanks go to Dr. J. D. Faires for his assistance with inserting the images into my thesis and other assistance with formatting these pages and with how to use LaTeX. I had seen him many times but never had the privilege to sit under his teaching through my years at Youngstown State University, since the classes he taught didn't fit into my schedule. I first met him in Fall 2010, halfway through the semester, when I had LaTeX problems that my fellow graduate teaching assistants were unable to decipher. He is a very prestigious man, has been in the mathematics field for many years, and is the other co-author of the Numerical Analysis book with Dr. Burden.

Next I would like to thank my fellow graduate teaching assistant, Moriah Wright, for her patience and assistance with several LaTeX questions and problems for which I sought out her help. She is a very kind and helpful person and has transitioned from an acquaintance to a friend over the several years I have known her.

Also, I would like to thank another fellow graduate teaching assistant, Damon Haught, for his help with the proper coding to give the proper layout of the title, signature and copyright pages of my thesis. I have had several classes with him over the past several years, and I graduated with my undergraduate degree, and now my master's degree, during the same semesters as he did. He is a very friendly man and very pleasant to be around. I may never know how he is able to balance college classes, teaching and life at home as a father of three, husband and part-time dance instructor. He is an inspiration to those who have had the privilege of knowing him.

Before too long, I need to thank my parents. Other than all the obvious reasons, I would like to thank them for their financial support throughout my undergraduate semesters. It was wonderful to graduate with my Bachelor's degree without any student loans. I would like to thank them and all other family members for all they have done throughout the years and continue to do.

Last, but definitely not least, I would like to thank my wonderful fiancée, Kirsten Anderson, who will be my wife on 7-9-11 (July 9th, 2011), a date that is suitable for and can be appreciated by any mathematician. I appreciate her patience with me through the times I have worked on this paper and not been able to spend as much time as desired with her. She is an amazing woman who has helped me through some difficult issues in life in the year and eight or more months we've known each other, and has loved me unconditionally since we met. There is so much I could say about her here, but since this part is not the essence of this paper, I will say no more than that she has kept me going and I can't wait to have her as my wife!

Contents

1 Methods of Solution
2 Measuring Speed of Convergence
3 Non-bracketing Methods
  3.1 Newton's Method
  3.2 Secant Method
  3.3 Müller
  3.4 Inverse Quadratic Interpolation
4 Bracketing Methods
  4.1 Bisection Method
  4.2 False Position (Regula Falsi)
  4.3 Modified False Position
  4.4 Illinois Method
5 Theorems
6 Hybrid Methods
  6.1 Brent's Method
  6.2 Our Method
7 Error Analysis
8 Examples
9 Results
10 Discussion
11 References
12 Appendix

1 Methods of Solution

To solve f(x) = 0, where x ∈ ℝ and f is a continuous function, we first find an interval [a,b] for which f(a) and f(b) are of opposite sign. This ensures that a solution p to f(x) = 0 lies in [a,b], by the Intermediate Value Theorem. We will discuss two classes of methods for finding a solution to the equation by generating sequences of approximations {p_i}.

We begin by letting a_1 = a, b_1 = b and calculating p_1 ∈ [a,b]. Then we determine [a_2,b_2], where a_2 = p_1 and b_2 = b_1, or a_2 = a_1 and b_2 = p_1, so that f(a_2) and f(b_2) are of opposite sign. This is a bracketing method. In bracketing methods, we generate a sequence of closed intervals {[a_i,b_i]} which bracket the root p and contain the approximations p_i. Thus |p − p_i| ≤ b_i − a_i for all i.

The second class of methods does not generate intervals [a_i,b_i] at each iteration. In the non-bracketing methods we select p_0 ∈ [a,b] and generate a sequence of approximations {p_i}. There is no guarantee that all approximations p_i are in the interval [a,b], but if lim_{i→∞} p_i = p, then for i sufficiently large both p and p_i will be in [a,b]. This uncertainty is a disadvantage in comparison to bracketing methods. In both bracketing and non-bracketing methods, a given tolerance ε must be supplied so that once |p − p_i| < ε, the computations cease at the i-th iteration.

2 Measuring Speed of Convergence

Suppose {p_i} is a sequence that converges to p, with p_i ≠ p for all i. If positive constants K and α exist with

  lim_{i→∞} |p_{i+1} − p| / |p_i − p|^α = K,

then {p_i} converges to p with order α and asymptotic error constant K. An iterative technique of the form p_i = g(p_{i−1}) is said to be of order α if the sequence {p_i} converges to the solution p = g(p) with order α. [4]

In general, a sequence with a high order of convergence converges more rapidly than a sequence with a lower order. The asymptotic constant affects the speed of convergence but is not as important as the order. Two cases of order get special attention.

1. If α = 1 (and K < 1), the sequence is linearly convergent.
2. If α = 2, the sequence is quadratically convergent. [4]

Finding α for a given method requires a careful analysis of the iteration. For this reason, some methods, especially hybrid ones that switch between iterations, do not have a single value of α describing their speed of convergence.

3 Non-bracketing Methods

3.1 Newton's Method

To find successive approximations to the root p of f(x) = 0, where f is differentiable, with initial approximation p_0, Newton's method has the following formula:

  p_i = p_{i−1} − f(p_{i−1}) / f'(p_{i−1}),  for i ≥ 1. [4]

Here p_i is formed by calculating the x-intercept of the tangent line to f(x) at (p_{i−1}, f(p_{i−1})). Since we are interested in derivative-free calculations, this method will not be discussed further. The reason for mentioning it is that the next method, the Secant method, uses a similar technique but, instead of requiring a derivative, uses the approximation for the derivative that is evident in its formula.
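To make the iteration concrete, here is a minimal Maple sketch of Newton's method; the procedure name newton and its argument list are our own illustration, not part of the thesis code.

  newton := proc(f, p0, tol, maxN)
    local pOld, pNew, i;
    # f must be an arrow operator, e.g. f := x -> x^2 - 2, so that D(f) is f'
    pOld := evalf(p0);
    for i from 1 to maxN do
      pNew := pOld - f(pOld)/D(f)(pOld);     # x-intercept of the tangent at pOld
      if abs(pNew - pOld) < tol then return pNew end if;
      pOld := pNew;
    end do;
    return pNew;
  end proc:

  # Example: newton(x -> x^2 - 2, 1.5, 10^(-10), 50) approximates sqrt(2).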
For a graphical interpretation of Newton's method, see Figure 1 below.

Figure 1: This figure is a model that demonstrates how Newton's method generates successive iterates, following the tangent lines at p_0 and p_1.

The order α from above for Newton's method is 2 (unless f'(p) = 0).

3.2 Secant Method

Suppose p_0 and p_1 are initial approximations to the root of f(x) = 0 such that the root p ∈ (p_0,p_1). The approximation p_2 is the x-intercept of the line joining (p_0,f(p_0)) and (p_1,f(p_1)). To find p_i for all i ≥ 2,

  p_i = p_{i−1} − f(p_{i−1})(p_{i−1} − p_{i−2}) / (f(p_{i−1}) − f(p_{i−2})).

The order α from above for the Secant method is 1.618, which we show in Theorem 1 of Section 5. For a graphical interpretation of the Secant method, see Figure 2 below.

Figure 2: This figure is a model that demonstrates how the Secant method generates successive iterates.

3.3 Müller

Müller's method uses three initial approximations p_0, p_1, and p_2 to the root p and determines the next approximation p_3 by considering the intersection of the x-axis with the parabola through (p_0,f(p_0)), (p_1,f(p_1)), and (p_2,f(p_2)).

To begin, one must consider the quadratic polynomial P(x) = a(x − p_2)² + b(x − p_2) + c. It can be shown that the constants a, b and c should have the following values:

  c = f(p_2),

  b = [(p_0 − p_2)²(f(p_1) − f(p_2)) − (p_1 − p_2)²(f(p_0) − f(p_2))] / [(p_0 − p_2)(p_1 − p_2)(p_0 − p_1)],

  a = [(p_1 − p_2)(f(p_0) − f(p_2)) − (p_0 − p_2)(f(p_1) − f(p_2))] / [(p_0 − p_2)(p_1 − p_2)(p_0 − p_1)].

To find p_i for all i ≥ 3, with a, b and c computed from the three most recent approximations,

  p_i = p_{i−1} − 2c / (b + sgn(b)√(b² − 4ac)).

Once p_i is found, the procedure is re-initialized using p_{i−2}, p_{i−1}, and p_i in place of p_{i−3}, p_{i−2}, and p_{i−1} to determine the next approximation p_{i+1}. Note: it can happen that b² − 4ac < 0, giving complex roots. This created a problem in our method, and we shall discuss why later. See Figure 3 below for a graphical interpretation.

Figure 3: This figure is a model that demonstrates how Müller's method differs from the Secant method in generating successive iterates; A denotes the p_2 that the Secant method would form.

The order of convergence, α, for Müller's method is 1.84, which we show in Theorem 2.

3.4 Inverse Quadratic Interpolation

First suppose f ∈ C¹[a,b], f'(x) ≠ 0 on [a,b], and f has one zero p in [a,b]. Let x_0, x_1, and x_2 be three distinct numbers in [a,b] with f(x_i) = y_i for each i = 0, 1, 2. To approximate p, construct the interpolating polynomial of degree 2 on the nodes y_0, y_1, and y_2 for f⁻¹, the inverse function of f, and evaluate it at 0 to get the next iterate. Since y_i = f(x_i) and 0 = f(p), it follows that f⁻¹(y_i) = x_i and p = f⁻¹(0). To find x_i for all i ≥ 3,

  x_i = f_{i−2} f_{i−1} x_{i−3} / [(f_{i−3} − f_{i−2})(f_{i−3} − f_{i−1})]
      + f_{i−3} f_{i−1} x_{i−2} / [(f_{i−2} − f_{i−3})(f_{i−2} − f_{i−1})]
      + f_{i−3} f_{i−2} x_{i−1} / [(f_{i−1} − f_{i−3})(f_{i−1} − f_{i−2})],

where f_j = f(x_j) for all j. This differs from Müller's method, since Müller's method always looks for the x-intercept of its parabola closest to p_2.
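Since our hybrid method of Section 6.2 relies on Müller's method near the root, a minimal Maple sketch of a single Müller step may be helpful; the procedure name mullerStep is our own illustration, assuming f is an arrow operator.

  mullerStep := proc(f, p0, p1, p2)
    local c, b, a, denom, E;
    denom := (p0 - p2)*(p1 - p2)*(p0 - p1);
    c := f(p2);
    b := ((p0 - p2)^2*(f(p1) - f(p2)) - (p1 - p2)^2*(f(p0) - f(p2)))/denom;
    a := ((p1 - p2)*(f(p0) - f(p2)) - (p0 - p2)*(f(p1) - f(p2)))/denom;
    # b + signum(b)*sqrt(b^2 - 4*a*c) is the larger-magnitude denominator,
    # chosen to avoid cancellation; it becomes complex when b^2 - 4*a*c < 0
    E := b + signum(b)*sqrt(b^2 - 4*a*c);
    return p2 - 2*c/E;
  end proc:

  # One step toward the root of x^3 - 2x^2 - 5 from three nearby approximations:
  # mullerStep(x -> x^3 - 2*x^2 - 5, 2.5, 2.75, 2.7);

When b² − 4ac < 0, the returned value is complex; this is exactly the situation that forces our hybrid method back to its bracketing phase.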
4 Bracketing Methods

4.1 Bisection Method

Suppose f is a continuous function defined on the interval [a,b], with f(a) and f(b) of opposite sign. By the Intermediate Value Theorem, there exists a number p ∈ (a,b) with f(p) = 0. Also suppose f has an odd number of zeros in (a,b). The method now requires repeatedly halving subintervals of [a,b] and, at each step, locating the half containing p.

To begin, set a_1 = a and b_1 = b, and let p_1 = (a_1 + b_1)/2, the midpoint of [a,b]. If f(p_1) = 0, then p = p_1, and we are done. If f(p_1) ≠ 0, then f(p_1) has the same sign as either f(a_1) or f(b_1). When f(p_1) and f(a_1) have the same sign, p ∈ (p_1,b_1), and we set a_2 = p_1 and b_2 = b_1. When f(p_1) and f(a_1) have opposite signs, p ∈ (a_1,p_1), and we set a_2 = a_1 and b_2 = p_1. We then reapply the process to the interval [a_2,b_2], and so on. For a graphical interpretation, see Figure 4.

Figure 4: This figure is a model that demonstrates how the Bisection method generates successive iterates.

4.2 False Position (Regula Falsi)

This method generates approximations in a similar manner as the Secant method, but also includes a test to ensure that the root is always bracketed between successive iterations. Start by choosing p_0 and p_1 with f(p_0)f(p_1) < 0. The approximation p_2 is chosen in the same manner as the Secant method, as the x-intercept of the line joining (p_0,f(p_0)) and (p_1,f(p_1)). To decide which secant line to use to compute p_3, we check f(p_1)f(p_2). If this value is negative, then p_1 and p_2 bracket a root, and we choose p_3 as the x-intercept of the line joining (p_1,f(p_1)) and (p_2,f(p_2)). If not, we choose p_3 as the x-intercept of the line joining (p_0,f(p_0)) and (p_2,f(p_2)), and then interchange the indices of p_0 and p_1. Similarly, once p_3 is found, the sign of f(p_3)f(p_2) determines whether we use p_2 and p_3 or p_3 and p_1 to compute p_4. In the latter case a relabeling of p_2 and p_1 is performed. The relabeling ensures that the root is bracketed between successive iterations. For a graphical interpretation, see Figure 5 below.

Figure 5: This figure is a model that demonstrates how False Position generates successive iterates.

The order of convergence α for False Position is 1 if the function is concave or convex on the bracketing interval; otherwise it can behave like the Secant method.

4.3 Modified False Position

Modified False Position is similar to False Position, except that here we multiply f(p_{i−1}) or f(p_i) by a factor λ when calculating p_{i+1}, to avoid the retention of the left or right end point over successive iterations. For a graphical interpretation, see Figure 6 below.

Figure 6: This figure shows how Modified False Position generates successive iterates, scaling down f(p_{i−1}) by a factor of λ to avoid the end point retention mentioned in the False Position section. Note: the same situation occurs if the graph is reversed; in that case f(p_i) is scaled down by the factor λ instead.

4.4 Illinois Method

This method follows the False Position method, but uses λ = 1/2. The estimates chosen for the next iteration are selected according to the following rules:

1. If f(p_{i+1})f(p_i) < 0, then (p_i, f(p_i)) and (p_{i+1}, f(p_{i+1})) are retained.
2. If f(p_{i+1})f(p_i) > 0, then (p_{i−1}, f(p_{i−1})/2) and (p_{i+1}, f(p_{i+1})) are retained.

The function values used at each iteration will have opposite signs, and the introduction of the value f(p_{i−1})/2 in place of f(p_{i−1}) is a modification designed to speed convergence by eventually stopping the retention of one of the end points.
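The following minimal Maple sketch shows one bracketing update of this modified type; the procedure name illinoisStep and its list-valued return are our own illustration. The divisor den is 2 for the Illinois method, while our method in Section 6.2 uses 1.5. For brevity this sketch scales the retained end point's function value on every retention, whereas the full method in the Appendix scales only when the same end point survives two steps in a row.

  illinoisStep := proc(f, p0, q0, p1, q1, den)
    local p2, q2;
    # x-intercept of the secant line through (p0, q0) and (p1, q1)
    p2 := (q1*p0 - q0*p1)/(q1 - q0);
    q2 := f(p2);
    if q2*q1 < 0 then
      return [p1, q1, p2, q2];        # root between p1 and p2: plain False Position
    else
      return [p0, q0/den, p2, q2];    # p0 retained: scale its function value down
    end if;
  end proc:

  # Iterating illinoisStep on f := x -> x^2 - 2 with the bracket [1, 2]
  # shrinks the bracket around sqrt(2) without end point stagnation.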
5 Theorems

Theorem 1: If f(x) is a function such that:

1. f'(x) ≠ 0 on an interval [a,b],
2. f'' exists on [a,b],
3. f(a)f(b) < 0,
4. the Secant method converges to a root p ∈ (a,b),

then the order of convergence is α = (1 + √5)/2.

Proof. The Secant method is a linear interpolation based on (p_i,f(p_i)) and (p_{i−1},f(p_{i−1})), so

  f(x) = P_1(x) + (f''(ξ_i)/2)(x − p_{i−1})(x − p_i),

where P_1(x) is the linear interpolating polynomial. Using Newton's divided difference form for P_1(x) gives

  f(x) = f(p_i) + f[p_i,p_{i−1}](x − p_i) + (1/2)f''(ξ_i)(x − p_{i−1})(x − p_i).

Set f(x) = 0 and solve for x, ignoring the error term, to get x = p_{i+1}:

  0 = f(p_i) + f[p_i,p_{i−1}](p_{i+1} − p_i).   (1)

We note that

  p_{i+1} = p_i − f(p_i)/f[p_i,p_{i−1}] = p_i − f(p_i)(p_i − p_{i−1})/(f(p_i) − f(p_{i−1})).

The actual solution p to f(p) = 0 satisfies

  0 = f(p_i) + f[p_i,p_{i−1}](p − p_i) + (1/2)f''(ξ_i)(p − p_{i−1})(p − p_i).   (2)

Subtracting equation (1) from equation (2), we get

  0 = f[p_i,p_{i−1}](p − p_{i+1}) + (1/2)f''(ξ_i)(p − p_{i−1})(p − p_i).

Let e_i = p − p_i for every i. Then

  f[p_i,p_{i−1}] e_{i+1} = −(1/2)f''(ξ_i) e_i e_{i−1},

so

  e_{i+1} = −(f''(ξ_i)/(2 f[p_i,p_{i−1}])) e_i e_{i−1}.

By the Mean Value Theorem, f[p_i,p_{i−1}] = (f(p_i) − f(p_{i−1}))/(p_i − p_{i−1}) = f'(ξ'_i) for some ξ'_i between p_{i−1} and p_i, so

  e_{i+1} = −(f''(ξ_i)/(2 f'(ξ'_i))) e_i e_{i−1}.

Now suppose the Secant method converges and the order of convergence is α. Then for large i,

  |e_{i+1}|/|e_i|^α ≈ K.

Thus |e_{i+1}| ≈ K|e_i|^α and |e_i| ≈ K|e_{i−1}|^α, which implies |e_{i−1}| ≈ K^{−1/α}|e_i|^{1/α}. Since the relation above gives |e_{i+1}| ≈ c|e_i||e_{i−1}| for the constant c = |f''/(2f')| near p, we get

  K|e_i|^α ≈ c|e_i| K^{−1/α}|e_i|^{1/α},

so

  |e_i|^{α − 1 − 1/α} ≈ c K^{−1/α − 1}.

The left-hand side depends on i and the right-hand side is independent of i. Thus we must have α − 1 − 1/α = 0, so α² − α − 1 = 0 and, taking the positive root,

  α = (1 + √(1 + 4))/2 = (1 + √5)/2 = 1.6180339....
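As a quick empirical check of Theorem 1 (our own illustration, not part of the thesis), the following Maple fragment runs the Secant method on f(x) = x² − 2 and prints the ratios |e_{i+1}|/|e_i|^1.618, which should settle near a constant K:

  Digits := 30:
  f := x -> x^2 - 2:
  r := sqrt(2.0):                      # known root, used only to measure the errors
  pts := [1.0, 2.0]:                   # p0 and p1
  for i from 1 to 6 do
    pts := [op(pts), pts[-1] - f(pts[-1])*(pts[-1] - pts[-2])/(f(pts[-1]) - f(pts[-2]))];
  end do:
  for i from 2 to nops(pts) - 1 do
    printf("%2d  %.3e\n", i, abs(r - pts[i+1])/abs(r - pts[i])^1.618);
  end do:

With Digits set high enough that roundoff does not dominate, the printed ratios approach a constant, consistent with α ≈ 1.618.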
Theorem 2: If f(x) ∈ C³[a,b], f(a)f(b) < 0, and Müller's method converges to a simple root p ∈ [a,b], then the order of convergence is α ≈ 1.84.

Proof. We assume f(x) ∈ C³[a,b] with f(a)f(b) < 0. Since f'''(x) is continuous on [a,b], the second degree interpolating polynomial for any x_0, x_1 and x_2 ∈ [a,b] satisfies

  f(x) = P_2(x) + (f'''(ξ)/6)(x − x_0)(x − x_1)(x − x_2)

for any x, where ξ depends on x. Assume Müller's method converges to a simple zero p ∈ (a,b), and let {p_i} denote the Müller sequence. For i sufficiently large, p_{i−1}, p_i, p_{i+1} and p_{i+2} are all contained in [a,b]. We seek the rate of convergence α of {p_i}, that is,

  lim_{i→∞} |p − p_{i+1}|/|p − p_i|^α = λ.

We shall show that α ≈ 1.84. We assume that for large i, |p_i| ≤ max(|a|,|b|).

Consider the divided difference form of the interpolating polynomial of degree 2 on p_{i+2}, p_{i+1} and p_i:

  P_2(x) = f(p_{i+2}) + f[p_{i+2},p_{i+1}](x − p_{i+2}) + f[p_{i+2},p_{i+1},p_i](x − p_{i+2})(x − p_{i+1}).

Now p_{i+3} is a zero of P_2(x). Letting a_1 = f[p_{i+2},p_{i+1}] and a_2 = f[p_{i+2},p_{i+1},p_i], we have

  0 = f(p_{i+2}) + a_1(p_{i+3} − p_{i+2}) + a_2(p_{i+3} − p_{i+2})(p_{i+3} − p_{i+1}).   (3)

Let f(p) = 0. There exists a number ξ in the smallest interval containing p_i, p_{i+1} and p_{i+2} such that

  0 = P_2(p) + (f'''(ξ)/6)(p − p_{i+2})(p − p_{i+1})(p − p_i).

Letting P_2(p) = f(p_{i+2}) + a_1(p − p_{i+2}) + a_2(p − p_{i+2})(p − p_{i+1}) and e_i = p − p_i for all i, we get

  0 = f(p_{i+2}) + a_1(p − p_{i+2}) + a_2(p − p_{i+2})(p − p_{i+1}) + (f'''(ξ)/6) e_{i+2} e_{i+1} e_i.   (4)

Subtracting equation (3) from equation (4) and expanding gives

  0 = a_1[p − p_{i+3}] + a_2[p² − p²_{i+3} + (p − p_{i+3})(−p_{i+2} − p_{i+1})] + (f'''(ξ)/6) e_{i+2} e_{i+1} e_i,

hence

  0 = a_1 e_{i+3} + a_2 e_{i+3}[p + p_{i+3} − p_{i+2} − p_{i+1}] + (f'''(ξ)/6) e_{i+2} e_{i+1} e_i,

and finally

  0 = e_{i+3}[a_1 + a_2(p + p_{i+3} − p_{i+2} − p_{i+1})] + (f'''(ξ)/6) e_{i+2} e_{i+1} e_i.

So we have K_1 e_{i+3} = K_2 e_{i+2} e_{i+1} e_i, where

  K_1 = a_1 + a_2(p + p_{i+3} − p_{i+2} − p_{i+1}),  K_2 = −f'''(ξ)/6.

Suppose that i is sufficiently large that we have both

  K_3 ≈ |e_{i+2}|/|e_{i+1}|^α  and  K_3 ≈ |e_{i+3}|/|e_{i+2}|^α.

Thus |e_{i+3}| = K_3|e_{i+2}|^α and |e_{i+2}| = K_3|e_{i+1}|^α, which implies

  |e_{i+1}| = (|e_{i+2}|/K_3)^{1/α}.

Then |e_{i+1}| = K_3|e_i|^α implies

  |e_i| = (|e_{i+1}|/K_3)^{1/α} = ((|e_{i+2}|/K_3)^{1/α}/K_3)^{1/α} = |e_{i+2}|^{1/α²} K_3^{−1/α − 1/α²}.

Now

  K_1|e_{i+3}| = K_2|e_{i+2}| (|e_{i+2}|/K_3)^{1/α} |e_{i+2}|^{1/α²} K_3^{−1/α − 1/α²} = (K_2/K_3^{2/α + 1/α²}) |e_{i+2}|^{1 + 1/α + 1/α²}.

Since also |e_{i+3}| = K_3|e_{i+2}|^α, collecting the constants into K_4 gives

  |e_{i+2}|^{α − 1 − 1/α − 1/α²} = K_4.

The left-hand side depends on i and the right-hand side is independent of i, so

  α − 1 − 1/α − 1/α² = 0,  that is,  α³ − α² − α − 1 = 0.

Using Newton's method, we find that the real root of this equation is

  α = 1.839286755... ≈ 1.84

(as suggested in Atkinson [8]).

6 Hybrid Methods

6.1 Brent's Method

Brent's method [6] is a hybrid method that seeks to combine the Bisection method, the Secant method, which Brent calls linear interpolation, and sometimes Inverse Quadratic Interpolation. Brent has shown that the method converges at least as fast as the Bisection method. It also exhibits super-linear convergence to a simple zero of a continuously differentiable function, since in that case only linear interpolation or better is used. We note that linear interpolation gives

  P_1(x) = ((x − p_1)/(p_0 − p_1)) f(p_0) + ((x − p_0)/(p_1 − p_0)) f(p_1).

Solving P_1(x) = 0 for x = p_2 gives the Secant method, or linear interpolation,

  p_2 = p_1 − f(p_1)(p_1 − p_0)/(f(p_1) − f(p_0)).

Brent uses three values of x, called a, b and c, where initially a = c and f(b)f(c) ≤ 0 with |f(b)| ≤ |f(c)|, and the initial interval is [b,c] or [c,b]. The values a, b and c are changed at each step so that b is the better approximation and the former b replaces a. If f(b) = 0, then we are done. If f(b) ≠ 0, let m = (1/2)(c − b). Thus

  b + m = b + (1/2)c − (1/2)b = (b + c)/2,

which is a bisection step. For the Secant method, let

  i = b + f(b)(b − a)/(f(b) − f(a)),

where the calculations are arranged to minimize rounding error. However, if a, b and c are distinct, we can do inverse quadratic interpolation. In summary, we use Bisection if |f(b)| ≥ |f(a)|; otherwise |f(b)| < |f(a)| ≤ |f(c)|, so let

  r_1 = f(a)/f(c),  r_2 = f(b)/f(c),  r_3 = f(b)/f(a),

  p = r_3[(c − b)r_1(r_1 − r_2) − (b − a)(r_2 − 1)],

  q = −(r_1 − 1)(r_2 − 1)(r_3 − 1).

Now i = b + p/q.

6.2 Our Method

We created a method that begins with a modified version of the Modified False Position method, in which the retained function value is divided by 1.5 instead of by 2 as in the Illinois method; we found this to be more efficient through several test cases. Once |p_{i+1} − p_i| < ε, our method switches to that of Müller. However, in Müller's method the three values f(p_{i−1}), f(p_i), f(p_{i+1}) may be of the same sign. This may cause the method to produce complex values. In this case, we jump back into the modified version of Modified False Position; a condensed sketch of this switching logic is given below.

We are unable to calculate an α for our method since we change methods within our calculations, but when p_n is far from p, the method is strictly the modified version of Modified False Position and has convergence order α = 1 or α ≈ 1.6. When p_n is close to p, the method is strictly Müller's method and has convergence order α ≈ 1.84.
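The following condensed Maple sketch (our own illustration; the complete driver appears in the Appendix) shows the switching logic, reusing the mullerStep procedure sketched in Section 3.3:

  hybridSketch := proc(f, a0, b0, tol, maxN)
    local p0, p1, q0, q1, p2, q2, xs, i, mu;
    p0 := evalf(a0); p1 := evalf(b0);
    q0 := f(p0); q1 := f(p1);
    xs := [p0, p1];
    # Phase 1: modified False Position with divisor 1.5
    for i from 1 to maxN do
      p2 := (q1*p0 - q0*p1)/(q1 - q0);
      q2 := f(p2);
      if abs(q2) < tol then return p2 end if;
      if q2*q0 < 0 then
        p1 := p2; q1 := q2; q0 := q0/1.5;    # p0 retained: scale its value
      else
        p0 := p2; q0 := q2; q1 := q1/1.5;    # p1 retained: scale its value
      end if;
      xs := [op(xs), p2];
      if abs(xs[-1] - xs[-2]) < 0.1 then break end if;   # close enough: switch
    end do;
    # Phase 2: Mueller steps on the three latest approximations
    for i from 1 to maxN do
      mu := mullerStep(f, xs[-3], xs[-2], xs[-1]);
      if Im(mu) <> 0 then break end if;   # complex step: the full method returns
                                          # to Phase 1 here; this sketch just stops
      xs := [op(xs), mu];
      if abs(xs[-1] - xs[-2]) < tol then return xs[-1] end if;
    end do;
    return xs[-1];
  end proc:

Unlike the full method, this sketch scales the retained function value at every step and does not re-enter the bracketing phase after a failed Müller step; the Appendix listing keeps the extra bookkeeping (the side and sw2 flags) that handles both.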
7 Error Analysis

We will begin our discussion of error analysis with the Secant method, which generates approximations p_i by the formula

  p_i = p_{i−1} − f(p_{i−1})(p_{i−1} − p_{i−2})/(f(p_{i−1}) − f(p_{i−2})).

We assume f''(x) is continuous with f''(x) ≠ 0, and f'(x) ≠ 0, on an interval [a,b] containing p, the solution to f(x) = 0. We have shown that the Secant method gives, for large i, the relation

  |e_{i+1}| ≈ K|e_i|^α,  where α = (1 + √5)/2 ≈ 1.618.

We now discuss False Position versus the Secant method. False Position always converges if f(x) is continuous and f(a)f(b) < 0, where p_0 = a and p_1 = b. The worst case is when f(x) is concave or convex on [a,b]. Suppose we have generated p_{j−1} and p_j, where f(p_{j−1})f(p_j) < 0. We generate p_{j+1} using the Secant method formula. If f(p_j) and f(p_{j+1}) are of opposite sign, we use p_j and p_{j+1} to generate p_{j+2}, as in a Secant method step. However, if f(p_{j−1}) and f(p_{j+1}) have opposite signs, we discard p_j and use p_{j−1} and p_{j+1} to generate p_{j+2}. With a convex or concave function the process could continue with p_{j−1} being retained and p_{j+k} replacing p_{j+k−1} for k = 1, 2, 3, .... For large i + 1 = j + k + 1 we could have |e_{i+1}| ≈ c|e_i||e_{j−1}|, so

  |e_{i+1}|/|e_i| ≈ c|e_{j−1}|  and  lim_{i→∞} |e_{i+1}|/|e_i| = c|e_{j−1}|.

This makes False Position a linearly convergent method.

We again use the interpolation polynomial to obtain

  f(x) = f(p_i) + f[p_i,p_{i−1}](x − p_i) + (1/2)f''(ξ_i)(x − p_{i−1})(x − p_i),

and for x = p,

  f(p) = 0 = f(p_i) + f[p_i,p_{i−1}](p − p_i) + (1/2)f''(ξ_i)(p − p_{i−1})(p − p_i),   (5)

but the Secant method gives

  0 = f(p_i) + (p_{i+1} − p_i) f[p_i,p_{i−1}].   (6)

Subtracting equation (6) from equation (5) gives

  0 = f[p_i,p_{i−1}](p − p_i − p_{i+1} + p_i) + (1/2)f''(ξ_i)(p − p_{i−1})(p − p_i).

Simplifying,

  0 = f[p_i,p_{i−1}] e_{i+1} + (1/2)f''(ξ_i) e_i e_{i−1},

so

  e_{i+1} = −e_i e_{i−1} f''(ξ_i)/(2 f[p_{i−1},p_i]) = −e_i e_{i−1} f''(ξ_i)/(2 f'(ξ'_i)).

Assume f'' and f' have constant sign on (a,b); then e_{i+1}/(e_i e_{i−1}) has constant sign. If f(p_0)f(p_1) < 0, then e_0 = p − p_0 > 0 and e_1 = p − p_1 < 0.

Case 1: we keep p_1 and p_2, so e_2 > 0 and e_2/(e_1 e_0) < 0. For e_3/(e_2 e_1) < 0 we need e_3 > 0. For e_4/(e_3 e_2) < 0 we need e_4 < 0. This leads to a sign pattern for the e_i of + − + + − + + − ....

Case 2: if e_0 < 0 and e_1 > 0, the sign pattern is − + − − + − − + ....

For False Position, p lies between p_{i−1} and p_i, so the sign pattern is − + − + ... or + − + − .... So in this case, for the Secant method, every third step is not a False Position step.

In False Position, suppose that for a certain i, f(p_{i+1})f(p_i) > 0 and f(p_{i+1})f(p_{i−1}) < 0, so we keep p_{i−1} as the new p_i and throw out p_i. Now p_{i+2} is chosen by the secant line through (p_{i−1},f(p_{i−1})) and (p_{i+1},f(p_{i+1})).

Illinois-type modification: take p_{i+2} as the root of the secant line through (p_{i−1},(1/2)f(p_{i−1})) and (p_{i+1},f(p_{i+1})). It can still happen that f(p_{i+2})f(p_{i+1}) > 0, and we could then use in place of 1/2 the values 1/4, 1/8, 1/16, ... until there is a sign change. [7]

We seek improvements in False Position other than the Illinois method. In Figure 5 of Section 4.2, f(x) is concave, and this presents difficulties for the False Position method. The intervals bracketing p are [p_0,p_1], [p_0,p_2], [p_0,p_3], .... The sequence {p_i} will converge only linearly to p, since p_0 is never discarded; thus the lengths of the bracketing intervals do not converge to zero as in the Bisection method. To remedy this problem, we consider the situation where f(p_{i−1})f(p_i) < 0 and f(p_{i+1})f(p_i) > 0, so we would retain p_{i−1} and discard p_i. Instead of using (p_{i−1},f(p_{i−1})) and (p_{i+1},f(p_{i+1})) to generate p_{i+2}, we consider using (p_{i−1},λf(p_{i−1})) and (p_{i+1},f(p_{i+1})). Optimally, we choose λ so that p_{i+2} = p. By similar triangles,

  λf(p_{i−1}) / (−f(p_{i+1})) = (p − p_{i−1}) / (p_{i+1} − p).

Solving for λ and absorbing the minus sign into the numerator,

  λ = f(p_{i+1})(p_{i−1} − p) / (f(p_{i−1})(p_{i+1} − p)) = [f(p_{i+1})/(p_{i+1} − p)]·[(p_{i−1} − p)/f(p_{i−1})].

Since f(p) = 0, we have

  λ = [(f(p_{i+1}) − f(p))/(p_{i+1} − p)]·[(p_{i−1} − p)/(f(p_{i−1}) − f(p))] = f[p,p_{i+1}]/f[p,p_{i−1}] = f[p_{i+1},p]/f[p_{i−1},p],

using the divided difference notation.
Since p is unknown, we need to approximate

  λ = f[p_{i+1},p]/f[p_{i−1},p].

The Illinois method makes the simple choice λ = 1/2. If we let e_i = p_i − p, it can be shown that for large i,

  e_{i+1} ≈ (f''(p)/(2f'(p))) e_i e_{i−1}

for the Secant method, which uses p_{i−1} and p_i to get p_{i+1}. If f(p_{i+1})f(p_i) < 0, we continue with the Secant method, resulting in

  e_{i+2} ≈ (f''(p)/(2f'(p))) e_{i+1} e_i.

However, if f(p_{i+1})f(p_{i−1}) < 0, leaving f(p_{i+1})f(p_i) > 0, we use the Illinois modified step with λ = 1/2, which gives e_{i+2} ≈ −e_{i+1}. So asymptotically this gives a sign change. If e_{i+1} > 0, that is p_{i+1} > p, we must have had p_{i−1} < p. Then e_{i+2} < 0 implies p_{i+2} < p also. Consequently, p_{i+2} replaces p_{i−1} and we continue.

Letting U stand for an unmodified step (a Secant step) and M stand for a modified step, we have the sequences of steps UUM, UUM, UUM, ... or UM, UUM, UUM, ... as basic patterns.

However, there are other choices for λ. First we can approximate f[p_{i−1},p] with the following choices:

  A1: f[p_{i−1},p_i]
  A2: f[p_{i−1},p_{i+1}]
  A3: f[p_{i+1},p_{i−1}] + f[p_i,p_{i−1}] − f[p_{i+1},p_i]

and similarly approximate f[p_{i+1},p] with

  B1: f[p_{i+1},p_i]
  B2: f[p_{i+1},p_{i−1}]
  B3: f[p_{i+1},p_i] + f[p_{i+1},p_{i−1}] − f[p_i,p_{i−1}].

The reasons for A1, A2, B1 and B2 are a matter of convenient use of what has already been computed. The choices A3 and B3 correspond to the derivative P_2'(x) of the interpolating polynomial P_2(x) based on p_{i−1}, p_i and p_{i+1}, evaluated at x = p_{i−1} (A3) or x = p_{i+1} (B3), respectively.

The method of Anderson and Björck has the choice of A1 and B1 together. In the paper by Ford, the methods used are

  Method 1: A3, B3
  Method 2: A2, B1
  Method 3: A2, B3
  Method 4: A1, B3
  Method 5: A3, B1.

Method 5 was discarded since it was never better than Anderson-Björck, and often worse. The Pegasus method uses

  λ = f(p_i)/(f(p_i) + f(p_{i+1})).

In all cases above, we have a modified step described by

  p_{i+2} = (f(p_{i+1}) p_{i−1} − λ f(p_{i−1}) p_{i+1}) / (f(p_{i+1}) − λ f(p_{i−1})),

as sketched below.
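To summarize Section 7 in code, here is a minimal Maple sketch of this parameterized step; the procedure name modifiedStep and its argument names are our own illustration. Passing lambda = 1/2 gives the Illinois step, while the Pegasus and Anderson-Björck variants compute lambda from already-available function values as described above:

  modifiedStep := proc(pm, fm, pp, fp, lambda)
    # root of the secant line through (pm, lambda*fm) and (pp, fp), where
    # pm = p[i-1] is the retained end point and pp = p[i+1] is the newest one
    return (fp*pm - lambda*fm*pp)/(fp - lambda*fm);
  end proc:

  # Illinois:          modifiedStep(pm, fm, pp, fp, 1/2)
  # Pegasus:           modifiedStep(pm, fm, pp, fp, fi/(fi + fp))   # fi = f(p[i])
  # Anderson-Bjorck:   modifiedStep(pm, fm, pp, fp, f1b/f1a)
  #   with f1b = f[p[i+1], p[i]] (choice B1) and f1a = f[p[i-1], p[i]] (choice A1)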
8 Examples

In this section, we have listed the various test functions we have found and have compared our method against those mentioned in previous sections. In the following section, we have included charts that list the number of iterations each method took to arrive at the approximation of the root p, and we analyze the results. The results of each set of test functions are listed in the chart with the name(s) of those who used them in their work.

The example problems used by Anderson and Björck are:

1. f(x) = sin(x) − 0.5 with x_0 = 0, x_1 = 1.5
2. f(x) = 2x e^(−n) + 1 − 2e^(−nx) with x_0 = 0, x_1 = 1
3. f(x) = (1 + (1 − n)²)x − (1 − nx)² with x_0 = 0, x_1 = 1
4. f(x) = x² − (1 − x)^n with x_0 = 0, x_1 = 1
5. f(x) = 1 + (1 + (1 − n)⁴)x − (1 − nx)⁴ with x_0 = 0, x_1 = 1
6. f(x) = e^(−x)(x − 1) + x^n with x_0 = 0, x_1 = 1
7. f(x) = (nx − 1)/((n − 1)x) with x_0 = 0.01, x_1 = 1

Calculations were performed to a precision of 6 × 10^(−8). [3] (Note: we omit example 5 because x_0 = 0 is already a zero of that function.)

The example problems used by Dowell and Jarratt are the same as those used by Anderson and Björck, except that the calculations were performed to a precision of 5 × 10^(−20). [2]

The example problems used by Ford are:

1. f(x) = 4cos(x) − e^x with root p = 0.904788
3. f(x) = 2x e^(−20) + 1 − 2e^(−20x) with root p = 0.034657
4. f(x) = e^(1/x − 25) − 1 with root p = 0.04
6. f(x) = 10^10 x^(1/x) − 1 with root p = 0.1
7. f(x) = x^20 − 1 with root p = 1.0
8. f(x) = e^(21000/x)/(1.11 × 10^11 x²) − 1 with root p = 551.774
9. f(x) = 1/x + ln(x) − 100 with root p = 0.009556
10. f(x) = e^(e^x) − e^e with root p = 1.0
11. f(x) = sin(0.01/x) − 0.01 with root p = 0.999983

Calculations were performed to a precision of 10^(−14). [5] (Note: for chart simplicity, we denote "Anderson-Björck" as "A-B" and "Method 1" through "Method 4" as "M1" through "M4". Also note that we omitted Ford's examples 2 and 5, since we were unable to get them to run in our Maple program.)

The example problems used in the Various Common Methods chart below are:

1. f(x) = x³ − 2x² − 5 with root p = 2.69064744802861
2. f(x) = x³ + 2x² − 1 with root p = −1.61803398874989
3. f(x) = 2x cos(2x) − (x − 2)² with root p = 3.72211277310179
4. f(x) = 3x tan(2x) − (x − 2)² with root p = 0.495135510634739

Calculations were performed to a precision of 10^(−10). [4] (Note: for chart simplicity, we denote "False Position" as "FP", "Modified False Position" as "MFP", and "Inverse Quadratic Interpolation" as "Inv. Quad. Int.".)

9 Results

The following are tables comparing the number of iterations required by the various methods mentioned in previous sections with that of our method. We will highlight some of the tests where our method outperformed the others, where it was "competitive" (within 2 iterations), and where it did much worse.

Anderson-Björck v. Our Method

  Example   n    Illinois   Brent's   Algorithm "A"   Pegasus   Our Method
  1              6          6         5               6         6
  2         1    6          5         4               5         5
            5    9          8         8               7         7
            15   9          9         10              8         9
            20   9          9         10              8         9
  3         2    7          7         6               6         5
            5    7          6         6               6         5
            15   5          5         5               6         5
            20   4          5         4               5         5
  4         2    1          1         1               1         2
            5    6          6         6               7         7
            15   10         8         9               9         11
            20   10         10        9               10        11
  6         1    7          6         5               6         5
            5    7          6         6               11        8
            10   11         6         7               11        12
            15   >16        9         7               15        15
            20   >16        10        8               >16       >16
  7         2    11         3         4               12        13
            5    13         10        5               11        14
            15   12         9         4               11        11
            20   13         11        4               8         11

Overall, our method fared quite well in all seven examples. Brent's method and Algorithm "A" seem to have performed the best, but in examples 2 and 3 our method matched or outperformed both methods for most n values. In example 7, our method did worse in comparison to all but the Illinois method.

Dowell-Jarratt v. Our Method

  Example   n    Bisection   False Position   FP and Bisection   Illinois   Our Method
  1                                                              6          7
  2         1    64          23               19                 9          6
            5    64          40               21                 10         9
            15   67          41               23                 11         10
            20   67          42               23                 11         10
  3         2    64          25               21                 9          5
            5    62          16               17                 9          5
            15   71          11               15                 7          5
            20   71          10               15                 7          5
  4         2    1           1                1                  1          2
            5    64          54               19                 8          8
            15   61          179              23                 11         15
            20   62          245              23                 12         12
  6         1    64          26               19                 9          7
            5    63          114              21                 9          12
            10   57          1,286            23                 13         14
            15   55          10,000           21                 16         18
  7         2    59          2008             2                  14         14
            5    56          809              25                 14         16
            15   61          262              25                 14         12
            20   58          192              25                 15         13

In examples 2 and 3, our method slightly outperforms the Illinois method while simultaneously outperforming the other methods significantly. In all the other examples, our method was competitive with or slightly outperformed the Illinois method, depending on the value of n. With the exception of examples 4 and 7 with n = 2, our method significantly outperformed the methods other than Illinois.
Ford v. Our Method

  Example   Initial Bracket   Illinois   Pegasus   A-B    M1     M2     M3   M4   Our Method
  1         [0,1.5]           8          7         6      8      7      6    7    7
            [-1,3]            11         10        10     13     11     10   10   10
            [-1.5,6]          20         20        17     19     18     17   17   20
  3         [0,1]             9          10        11     10     14     9    10   10
            [-0.1,1.5]        15         14        43     11     46     12   12   15
            [-0.5,2]          33         31        200+   16     200+   14   13   35
            [-1,4]            47         44        200+   20     200+   16   16   48
  4         [0.035,0.05]      14         14        18     9      21     12   10   18
            [0.03,0.09]       27         26        200+   15     200+   7    13   28
            [0.025,0.5]       49         46        200+   19     200+   17   15   50
            [0.02,1]          58         55        200+   19     200+   18   18   60
  6         [0.095,1.0]       35         33        12     17     14     12   13   39
            [0.075,0.15]      23         23        200+   17     200+   16   16   27
            [0.08,0.5]        38         35        200+   31     200+   22   21   40
            [0.05,0.2]        36         34        200+   21     200+   15   18   38
  7         [0.9,1.05]        8          8         9      9      10     7    8    8
            [0.7,1.2]         15         14        23     13     24     11   11   17
            [0,2.5]           42         42        200+   19     200+   16   16   44
            [-0.5,5]          60         59        200+   21     200+   19   19   62
  8         [550,560]         7          5         5      6      6      5    6    12
            [525,590]         11         9         9      9      10     9    9    11
            [400,600]         29         27        18     15     20     12   12   31
            [350,850]         44         42        200+   15     200+   15   19   46
  9         [0.005,0.02]      10         7         7      7      8      9    8    8
            [0.001,0.05]      13         13        7      12     9      11   10   16
            [0.0001,0.1]      17         16        8      11     11     12   12   20
            [0.001,100]       21         18        17     13     23     17   22   19
  10        [0,2]             13         14        12     11     13     11   11   17
            [-4,2]            19         17        32     14     36     10   12   20
            [-10,3]           43         42        200+   16     200+   14   14   47
            [0.5,3.5]         51         48        11     13     12     10   12   53
  11        [0.5,2]           10         6         5      5      8      7    8    9
            [0.2,6]           12         10        5      10     9      9    10   12
            [0.05,20]         15         11        5      12     9      10   11   16
            [0.004,200]       21         18        7      13     10     13   15   20

In most problems, our method was competitive with or not too far behind all the methods except Method 1, Method 3 and Method 4, which seem to be the best in these tests. One thing to notice is example 10, where our method failed to beat any other method for any of the initial brackets.

Various Common Methods v. Our Method

  Example   Bracket      Secant   FP    MFP   Inv. Quad. Int.   Bisection   Our Method
  1         [1,4]        13       34    10    8                 36          9
  2         [-3,-1.3]    12       102   11    12                35          11
  3         [3,4]        9        14    9     8                 35          7
  4         [0,π/6]      8        11    8     7                 34          6

In these examples, our method proved to be the best in every case, except that Inverse Quadratic Interpolation outperformed ours by one iteration in example 1 and our modified version of Modified False Position tied our method in example 2.

10 Discussion

In summary, we have created a hybrid method by using our modified version of the Modified False Position method, which changes the divisor of 2 to 1.5, until the estimate p_i of the root p of f(x) = 0 is within a certain tolerance, and then switching to Müller's method. We compared this method with some of the well-known root finding methods studied in the field of numerical analysis as well as some lesser known methods. The lesser known methods include Brent's method, the Illinois method, the Pegasus method, the one created by Anderson-Björck and the four created by Ford. We have found our method to be quite competitive in many of the twenty-five examples considered, better than all the other methods in two examples, and worse than the others in just one example.

There are other methods that we haven't considered. In Section 7, we discussed five methods that Ford used. Ford introduced three possible denominators (A1, A2 and A3) and numerators (B1, B2 and B3) for new methods of solving equations. Clearly there are 9 possible combinations. Ford considered the combination of A3 and B3 as "Method 1", A2 and B1 as "Method 2", A2 and B3 as "Method 3", A1 and B3 as "Method 4" and A3 and B1 as "Method 5". Anderson and Björck's method uses A1 and B1. This leaves the combinations of A1 and B2, A2 and B2, and A3 and B2.
We didn’t consider these either in the interest of time as well as the excellent results of our hybrid method that took several months to perfect. 11 References [1] Dowell, M. and Jarratt, P. A modified Regula Falsi method for computing the root of an equation, BIT, Volume 11, pp 168-174 (1971). [2] Dowell, M. and Jarratt, P. The Pegasus method for computing the root of an equation, BIT, Volume 12, pp 503-508 (1972). [3] Anderson, N. and Bj¨orck. A. A new high order method of Regula Falsi type for computing the root of an equation, BIT, Volume 813, pp 253-264 (1973). [4] Burden, R. and Faires, J. Numerical Analysis (8th edition), Thomson, Brooks/Cole, pp 46, 63-64, 67-70, 75, 92-93, 108, 117 (2005). [5] Ford, J.A. Improved algorithms of Illinois-type for the numerical solution of nonlinear equations, Technical Report CSM-257, University of Essex (1995) [6] Brent, R. Algorithms for minimization without derivatives, Englewood Cliffs, NJ, pp 21-22, 48-53 (1973). 25 [7] Dahlquist, G. and Bj¨orck, A. (Translated by Ned Anderson) Numerical Methods, Prentice-Hall, Englewood Cliffs, NJ, pp 229-232 (1974). [8] Atkinson, K. An Introduction to Numerical Analysis (2nd edition), Wiley and Sons, pp 73-75 (1989). 12 Appendix The following is the code of our hybrid method written in Maple. restart; Digits := 15; f := x → f(x); F := x → D(f)(x); p0 := γ; p1 := δ; p2 := (p0+p1)2.0 ; # Note: here f(x) is the function in consideration, γ is a lower bound approximation to the root p of f(x) = 0 and δ is an upper bound approximation to p so that p ∈ [γ,δ]. TOL := 10−10; TOL1 := .1; den := 2.0; eps := evalf(TOL + 2−53 ∗max(abs(p0),abs(p1),1)); eps1 := .95∗eps; # Note: here eps is the tolerance discussed by Ford [5], # Input the maximum number of iterations desired. N := 200; p[0] := p0; q[0] := evalf(f(p[0])); q0 := q[0]; p[1] := p1; q[1] := evalf(f(p[1])); q1 := q[1]; p[2] := p2; q[2] := evalf(f(p[2])); q2 := q[2]; side := 0; a[1] := p0; b[1] := p1; sw := 1; sw2 := 0; for i from 2 to N do if sw = 1 then p2 := (q1∗p0−q0∗p1)(q1−q0) ; print(i,p0,p1,p2,q0,q1,q2,side,sw) : print(abs(p2−p1)) : q2 := f(p2) : p[i] := p2 : q[i] := q2 : if abs(p2−p1) < eps1 or abs(q2) < eps then break: end if: if q2∗q0 > 0 then p0 := p2 : q0 := q2 : if side = −1 then q1 := q1den : end if: a[i] := p2 : b[i] := b[i−1] : side := −1 : else if q2∗q1 > 0 then p1 := p2 : q1 := q2 : b[i] := p2;a[i] := a[i−1] : if side = 1 then q0 := q0den : end if: side := 1 else break: end if: end if: 26 if abs(p[i]−p[i−1]) < TOL1 then sw := 0 : e1 := abs(p[i]−p[i−1]) end if: if sw2 = 1 then sw := 1: end if: else x2 := p[i−1] : x1 := p[i−2] : x0 := p[i−3] : f2 := f(x2) : f1 := f(x1) : f0 := f(x0) : h0 := x1−x0 : h1 := x2−x1 : del10 := (f1−f0)h0 : del11 := (f2−f1)h1 : del2 := (del11−del10)(h1+h0) : B := del11 + h1∗del2 : D2 := B ∗B −4∗f2∗del2 : D1 := sqrt(D2) : E1 := B + D1 : if abs(B −D1) > abs(E1) then E1 := B −D1 end if: h2 := −2∗f2E1 : p[i] := x2 + h2 : p3 := p[i] : q3 := f(p3) : q[i] := q3 : e2 := abs(p[i]−p[i−1]) : print(i,p[i],sw) : print(e2) : if (e2 < eps1,abs(q2) < eps) then break: end if: if (e2 > e1,abs(q3) > abs(q2)) then p0 := a[i−1] : p1 := b[i−1] : q0 := f(p0) : q1 := f(p1) : sw := 1 : sw2 := 1 : a[i] := a[i−1] : b[i] := b[i−1] : else if q3∗f(a[i−1]) > 0 then a[i] := p3 : b[i] := b[i−1] : else b[i] := p3 : a[i] := a[i−1] : end if: if b[i] < a[i] then cc := a[i] : a[i] := b[i] : b[i] := cc: end if: end if: end if: end: for j from 1 to i do print(j, a[j], b[j]): end: 27