SAS for Forecasting Time Series

SAS for Forecasting Time Series, Second Edition
John C. Brocklebank, Ph.D.
David A. Dickey, Ph.D.

The correct bibliographic citation for this manual is as follows: Brocklebank, John C., and David A. Dickey. 2003. SAS for Forecasting Time Series, Second Edition. Cary, NC: SAS Institute Inc.

Copyright © 2003 by SAS Institute Inc., Cary, NC, USA. Jointly co-published by SAS Institute and Wiley 2003. SAS Institute Inc. ISBN 1-59047-182-2; John Wiley & Sons, Inc. ISBN 0-471-39566-8. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987). SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513. 1st printing, May 2003.

SAS Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hardcopy books, visit the SAS Publishing Web site at support.sas.com/pubs or call 1-800-727-3228. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. Other brand and product names are trademarks of their respective companies.

Contents

Preface
Acknowledgments

Chapter 1: Overview of Time Series
1.1 Introduction
1.2 Analysis Methods and SAS/ETS Software
1.2.1 Options
1.2.2 How SAS/ETS Software Procedures Interrelate
1.3 Simple Models: Regression
1.3.1 Linear Regression
1.3.2 Highly Regular Seasonality
1.3.3 Regression with Transformed Data

Chapter 2: Simple Models: Autoregression
2.1 Introduction
2.1.1 Terminology and Notation
2.1.2 Statistical Background
2.2 Forecasting
2.2.1 Forecasting with PROC ARIMA
2.2.2 Backshift Notation B for Time Series
2.2.3 Yule-Walker Equations for Covariances
2.3 Fitting an AR Model in PROC REG

Chapter 3: The General ARIMA Model
3.1 Introduction
3.1.1 Statistical Background
3.1.2 Terminology and Notation
3.2 Prediction
3.2.1 One-Step-Ahead Predictions
3.2.2 Future Predictions
3.3 Model Identification
3.3.1 Stationarity and Invertibility
3.3.2 Time Series Identification
3.3.3 Chi-Square Check of Residuals
3.3.4 Summary of Model Identification
3.4 Examples and Instructions
3.4.1 IDENTIFY Statement for Series 1-8
3.4.2 Example: Iron and Steel Export Analysis
3.4.3 Estimation Methods Used in PROC ARIMA
3.4.4 ESTIMATE Statement for Series 8
3.4.5 Nonstationary Series
3.4.6 Effect of Differencing on Forecasts
3.4.7 Examples: Forecasting IBM Series and Silver Series
3.4.8 Models for Nonstationary Data
3.4.9 Differencing to Remove a Linear Trend
3.4.10 Other Identification Techniques
3.5 Summary

Chapter 4: The ARIMA Model: Introductory Applications
4.1 Seasonal Time Series
4.1.1 Introduction to Seasonal Modeling
4.1.2 Model Identification
4.2 Models with Explanatory Variables
4.2.1 Case 1: Regression with Time Series Errors
4.2.2 Case 1A: Intervention
4.2.3 Case 2: Simple Transfer Function
4.2.4 Case 3: General Transfer Function
4.2.5 Case 3A: Leading Indicators
4.2.6 Case 3B: Intervention
4.3 Methodology and Example
4.3.1 Case 1: Regression with Time Series Errors
4.3.2 Case 2: Simple Transfer Functions
4.3.3 Case 3: General Transfer Functions
4.3.4 Case 3B: Intervention
4.4 Further Examples
4.4.1 North Carolina Retail Sales
4.4.2 Construction Series Revisited
4.4.3 Milk Scare (Intervention)
4.4.4 Terrorist Attack

Chapter 5: The ARIMA Model: Special Applications
5.1 Regression with Time Series Errors and Unequal Variances
5.1.1 Autoregressive Errors
5.1.2 Example: Energy Demand at a University
5.1.3 Unequal Variances
5.1.4 ARCH, GARCH, and IGARCH for Unequal Variances
5.2 Cointegration
5.2.1 Introduction
5.2.2 Cointegration and Eigenvalues
5.2.3 Impulse Response Function
5.2.4 Roots in Higher-Order Models
5.2.5 Cointegration and Unit Roots
5.2.6 An Illustrative Example
5.2.7 Estimating the Cointegrating Vector
5.2.8 Intercepts and More Lags
5.2.9 PROC VARMAX
5.2.10 Interpreting the Estimates
5.2.11 Diagnostics and Forecasts

Chapter 6: State Space Modeling
6.1 Introduction
6.1.1 Some Simple Univariate Examples
6.1.2 A Simple Multivariate Example
6.1.3 Equivalence of State Space and Vector ARMA Models
6.2 More Examples
6.2.1 Some Univariate Examples
6.2.2 ARMA(1,1) of Dimension 2
6.3 PROC STATESPACE
6.3.1 State Vectors Determined from Covariances
6.3.2 Canonical Correlations
6.3.3 Simulated Example

Chapter 7: Spectral Analysis
7.1 Periodic Data: Introduction
7.2 Example: Plant Enzyme Activity
7.3 PROC SPECTRA Introduced
7.4 Testing for White Noise
7.5 Harmonic Frequencies
7.6 Extremely Fast Fluctuations and Aliasing
7.7 The Spectral Density
7.8 Some Mathematical Detail (Optional Reading)
7.9 Estimating the Spectrum: The Smoothed Periodogram
7.10 Cross-Spectral Analysis
7.10.1 Interpreting Cross-Spectral Quantities
7.10.2 Interpreting Cross-Amplitude and Phase Spectra
7.10.3 PROC SPECTRA Statements
7.10.4 Cross-Spectral Analysis of the Neuse River Data
7.10.5 Details on Gain, Phase, and Pure Delay

Chapter 8: Data Mining and Forecasting
8.1 Introduction
8.2 Forecasting Data Model
8.3 The Time Series Forecasting System
8.4 HPF Procedure
8.5 Scorecard Development
8.6 Business Goal Performance Metrics
8.7 Graphical Displays
8.8 Goal-Seeking Model Development
8.9 Summary

References
Index
Preface

A time series is a set of ordered observations on a quantitative characteristic of a phenomenon at equally spaced time points. The goal of univariate time series analysis is to forecast values of a single historical series. The goal of multivariate time series analysis can be to model the relationships among component series as well as to forecast those components. Time series analysis can be accomplished most effectively by the SAS procedures ARIMA, STATESPACE, SPECTRA, and VARMAX. To use these procedures properly, you must (1) understand the statistics you need for the analysis and (2) know how to invoke the procedures. SAS for Forecasting Time Series, Second Edition, makes it easier for you to apply these procedures to your data analysis problems.

Chapter 1, "Overview of Time Series," reviews the goals and key characteristics of time series. The analysis methods available through SAS/ETS software are presented, beginning with the simpler procedures FORECAST, AUTOREG, and X11 and continuing with the more powerful SPECTRA, ARIMA, and STATESPACE. This chapter shows the interrelationships among the various procedures. It ends with a discussion of linear regression, seasonality in regression, and regression with transformed data.

Chapter 2, "Simple Models: Autoregression," presents the statistical background necessary to model and forecast simple autoregressive (AR) processes. A three-part forecasting strategy is used with PROC ARIMA to identify, estimate, and forecast. The backshift notation is used to write a time series as a weighted sum of past shocks and to compute covariances through the Yule-Walker equations. The chapter ends with an example in which an AR process is fit with regression techniques, using overfitting to check the order.

Chapter 3, "The General ARIMA Model," extends the class of models to include moving averages and mixed ARMA models. Each model is introduced with its autocovariance function. Estimated autocovariances are used to determine a model to be fit, after which PROC ARIMA is used to fit the model, forecast future values, and provide forecast intervals. A section on time series identification defines the autocorrelation function, partial autocorrelation function, and inverse autocorrelation function. Newer identification techniques are also discussed. A catalog of examples is developed, and properties useful for associating different forms of these functions with the corresponding time series are described. This chapter includes the results of 150 observations generated from each of eight sample series. Stationarity and invertibility, nonstationarity, and differencing are discussed.

Chapter 4, "The ARIMA Model: Introductory Applications," describes the ARIMA model and its introductory applications. Seasonal modeling and model identification are explained, with Box and Jenkins's popular airline data modeled. The chapter combines regression with time series errors to provide a richer class of forecasting models.
Three cases are highlighted: Case 1 is a typical regression, case 2 is a simple transfer function, and case 3 is a general transfer function. New in Chapter 4 for the second edition are several interesting intervention examples involving analyses of the effect on calls of charging for directory assistance, the effect on milk purchases of publicity about tainted milk, and the effect on airline stock volume of the September 11, 2001, terrorist attacks.

Chapter 5, "The ARIMA Model: Special Applications," extends the regression with time series errors class of models to cases where the error variance can change over time—the ARCH and GARCH class. Multivariate models in which individual nonstationary series vary together over time are referred to as "cointegration" or "error correction" models. These are also discussed and illustrated. This chapter presents new developments since the first edition of the book.

Chapter 6, "State Space Modeling," uses the AR model to motivate the construction of the state vector. Next, the equivalence of state space and vector ARMA models is discussed. Examples of multivariate processes and their state space equations are shown. The STATESPACE procedure is outlined, and a section on canonical correlation analysis and Akaike's information criterion is included. The chapter ends with the analysis of a bivariate series exhibiting feedback, a characteristic that cannot be handled with the general ARIMA transfer function approach.

Chapter 7, "Spectral Analysis," describes the SPECTRA procedure and how spectral analysis is used to detect sinusoidal components in time series models. In periodogram analysis, regressions are run on a sequence of values to find hidden periodicities. Spectra for different series, smoothing the periodogram, Fourier coefficients, and white noise tests are covered. The chapter ends with a discussion of cross-spectral analysis. New for the second edition is more in-depth discussion of tests for white noise and the ideas behind spectral analysis.

Chapter 8, "Data Mining and Forecasting," deals with the process of forecasting many time series with little intervention by the user. The goal of the chapter is to illustrate a modern automated interface for a collection of forecasting models, including many that have been discussed thus far. Chapter 8 also examines the SAS/ETS Time Series Forecasting System (TSFS), which provides a menu-driven interface to SAS/ETS and SAS/GRAPH procedures in order to facilitate quick and easy analysis of time series data. The chapter also includes a discussion detailing the use of PROC HPF, an automated high-performance forecasting procedure that is designed to forecast thousands of univariate time series.

Acknowledgments

For the 1986 edition we owe a great debt to students of the SAS Applied Time Series Analysis and Forecasting course, who made valuable suggestions on the subject material and generally stimulated our interest. We are also indebted to Alice T. Allen, Kathryn A. Council, Stephen Ewing, Wayne A. Fuller, Francis G. Giesbrecht, Robert P. Hastings, Herbert Kirk, Stephenie Joyner, Ann A. Lehman, Larry Stewart, and Houston Stokes. David M. DeLong and Bart Killam enthusiastically reviewed and offered comments on several chapters. For the production of the manuscript, we owe special thanks to Deborah S. Blank for her patience and diligence.

For the second edition we gratefully acknowledge the SAS technical reviewers, Brent Cohen, Evan Anderson, Gul Ege, Bob Lucas, Sanford Gayle, and Youngjin Park.
We are also grateful to the two outside reviewers, Houston Stokes of the University of Illinois at Chicago and David Booth of Kent State University. We would like to acknowledge several people at SAS whose efforts have contributed to the completion of the second edition: Keith Collins, Tom Grant, Julie Platt, John West, Sam Pipkin, Ed Huddleston, Candy Farrell, Patricia Spain, and Patrice Cherry. Finally, we would like to thank our wives, Vicki H. Brocklebank and Barbara S. Dickey, who were understanding and supportive throughout the writing of this book.

Chapter 1: Overview of Time Series

1.1 Introduction

This book deals with data collected at equally spaced points in time. The discussion begins with a single observation at each point. It continues with k series being observed at each point and then analyzed together in terms of their interrelationships. One of the main goals of univariate time series analysis is to forecast future values of the series. For multivariate series, relationships among component series, as well as forecasts of these components, may be of interest. Secondary goals are smoothing, interpolating, and modeling of the structure.

Three important characteristics of time series are often encountered: seasonality, trend, and autocorrelation. Seasonality occurs, for example, when data are collected monthly and the value of the series in any given month is closely related to the value of the series in that same month in previous years. Seasonality can be very regular or can change slowly over a period of years.

A trend is a regular, slowly evolving change in the series level. Changes that can be modeled by low-order polynomials or low-frequency sinusoids fit into this category. For example, if a plot of sales over time shows a steady increase of $500 per month, you may fit a linear trend to the sales data. A trend is a long-term movement in the series.

In contrast, autocorrelation is a local phenomenon. When deviations from an overall trend tend to be followed by deviations of a like sign, the deviations are positively autocorrelated. Autocorrelation is the phenomenon that distinguishes time series from other branches of statistical analysis. For example, consider a manufacturing plant that produces computer parts. Normal production is 100 units per day, although actual production varies from this mean of 100. Variation can be caused by machine failure, absenteeism, or incentives like bonuses or approaching deadlines. A machine may malfunction for several days, resulting in a run of low productivity. Similarly, an approaching deadline may increase production over several days. This is an example of positive autocorrelation, with data falling and staying below 100 for a few days, then rising above 100 and staying high for a while, then falling again, and so on. Another example of positive autocorrelation is the flow rate of a river. Consider variation around the seasonal level: you may see high flow rates for several days following rain and low flow rates for several days during dry periods.

Negative autocorrelation occurs less often than positive autocorrelation. An example is a worker's attempt to control temperature in a furnace.
The autocorrelation pattern depends on the worker's habits, but suppose he reads a low value of the furnace temperature and turns up the heat too far, and similarly turns it down too far when readings are high. If he reads and adjusts the temperature each minute, you can expect a low temperature reading to be followed by a high reading. As a second example, an athlete may follow a long workout day with a short workout day and vice versa. The time he spends exercising daily displays negative autocorrelation.

1.2 Analysis Methods and SAS/ETS Software

1.2.1 Options

When you perform univariate time series analysis, you observe a single series over time. The goal is to model the historic series and then to use the model to forecast future values of the series. You can use some simple SAS/ETS software procedures to model low-order polynomial trends and autocorrelation.

PROC FORECAST automatically fits an overall linear or quadratic trend with autoregressive (AR) error structure when you specify METHOD=STEPAR. As explained later, AR errors are not the most general types of errors that analysts study. For seasonal data you may want to fit a Winters exponentially smoothed trend-seasonal model with METHOD=WINTERS. If the trend is local, you may prefer METHOD=EXPO, which uses exponential smoothing to fit a local linear or quadratic trend.

For higher-order trends or for cases where the forecast variable Yt is related to one or more explanatory variables Xt, PROC AUTOREG estimates this relationship and fits an AR series as an error term. Polynomials in time and seasonal indicator variables (see Section 1.3.2) can be computed as far into the future as desired. If the explanatory variable is a nondeterministic time series, however, actual future values are not available. PROC AUTOREG treats future values of the explanatory variable as known, so user-supplied forecasts of future values with PROC AUTOREG may give incorrect standard errors of forecast estimates. More sophisticated procedures like PROC STATESPACE, PROC VARMAX, or PROC ARIMA, with their transfer function options, are preferable when the explanatory variable's future values are unknown.

One approach to modeling seasonality in time series is the use of seasonal indicator variables in PROC AUTOREG to model a highly regular seasonality. Also, the AR error series from PROC AUTOREG or from PROC FORECAST with METHOD=STEPAR can include some correlation at seasonal lags (that is, it may relate the deviation from trend at time t to the deviation at time t−12 in monthly data). The WINTERS method of PROC FORECAST uses updating equations similar to exponential smoothing to fit a seasonal multiplicative model.

Another approach to seasonality is to remove it from the series and to forecast the seasonally adjusted series with other seasonally adjusted series used as inputs, if desired. The U.S. Census Bureau has adjusted thousands of series with its X-11 seasonal adjustment package. This package is the result of years of work by census researchers and is the basis for the seasonally adjusted figures that the federal government reports. You can seasonally adjust your own data using PROC X11, which is the census program set up as a SAS procedure. If you are using seasonally adjusted figures as explanatory variables, this procedure is useful.
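For instance, here is a minimal sketch of a seasonal adjustment run; the data set SALES and its variables DATE and SALES are assumed for illustration:

PROC X11 DATA=SALES;
   MONTHLY DATE=DATE;
   VAR SALES;
   OUTPUT OUT=ADJUSTED B1=ORIGINAL D11=FINALADJ;
RUN;

Table D11 holds the final seasonally adjusted series, which could then serve as an explanatory variable in a later regression.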
An alternative to using X-11 is to model the seasonality as part of an ARIMA model or, if the seasonality is highly regular, to model it with indicator variables or trigonometric functions as explanatory variables. A final introductory point about the PROC X11 program is that it identifies and adjusts for outliers. (Recently the Census Bureau has upgraded X-11, including an option to extend the series using ARIMA models prior to applying the centered filters used to deseasonalize the data. The resulting X-12 is incorporated as PROC X12 in SAS software.)

If you are unsure about the presence of seasonality, you can use PROC SPECTRA to check for it; this procedure decomposes a series into cyclical components of various periodicities. Monthly data with highly regular seasonality have a large ordinate at period 12 in the PROC SPECTRA output SAS data set. Other periodicities, like multiyear business cycles, may appear in this analysis. PROC SPECTRA also provides a check on model residuals to see if they exhibit cyclical patterns over time. Often these cyclical patterns are not found by other procedures. Thus, it is good practice to analyze residuals with this procedure. Finally, PROC SPECTRA relates an output time series Yt to one or more input or explanatory series Xt in terms of cycles. Specifically, cross-spectral analysis estimates the change in amplitude and phase when a cyclical component of an input series is used to predict the corresponding component of an output series. This enables the analyst to separate long-term movements from short-term movements.

Without a doubt, the most powerful and sophisticated methodology for forecasting univariate series is the ARIMA modeling methodology popularized by Box and Jenkins (1976). A flexible class of models is introduced, and one member of the class is fit to the historic data. Then the model is used to forecast the series. Seasonal data can be accommodated, and seasonality can be local; that is, seasonality for month t may be closely related to seasonality for this same month one or two years previously but less closely related to seasonality for this month several years previously. Local trending and even long-term upward or downward drifting in the data can be accommodated in ARIMA models through differencing. Explanatory time series as inputs to a transfer function model can also be accommodated. Future values of nondeterministic, independent input series can be forecast by PROC ARIMA, which, unlike the previously mentioned procedures, accounts for the fact that these inputs are forecast when you compute prediction error variances and prediction limits for forecasts.

A relatively new procedure, PROC VARMAX, models vector processes with possible explanatory variables, the X in VARMAX. As in PROC STATESPACE, this approach assumes that at each time point you observe a vector of responses, each entry of which depends on its own lagged values and lags of the other vector entries; but unlike STATESPACE, VARMAX also allows explanatory variables X as well as cointegration among the elements of the response vector. Cointegration is an idea that has become quite popular in recent econometrics. The idea is that each element of the response vector might be a nonstationary process, one that has no tendency to return to a mean or deterministic trend function, and yet one or more linear combinations of the responses are stationary, remaining near some constant. An analogy is two lifeboats adrift in a stormy sea but tied together by a rope. Their location might be expressible mathematically as a random walk with no tendency to return to a particular point. Over time the boats drift arbitrarily far from any particular location.
Nevertheless, because they are tied together, the difference in their positions would never be too far from 0. Prices of two similar stocks might, over time, vary according to a random walk with no tendency to return to a given mean, and yet if they are indeed similar, their price difference may not get too far from 0.

1.2.2 How SAS/ETS Software Procedures Interrelate

PROC ARIMA emulates PROC AUTOREG if you choose not to model the inputs. ARIMA can also fit a richer error structure. Specifically, the error structure can be an autoregressive (AR), moving average (MA), or mixed-model structure. PROC ARIMA can emulate PROC FORECAST with METHOD=STEPAR if you use polynomial inputs and AR error specifications. However, unlike FORECAST, ARIMA provides test statistics for the model parameters and checks model adequacy. PROC ARIMA can emulate PROC FORECAST with METHOD=EXPO if you fit a moving average of order d to the dth difference of the data. Instead of arbitrarily choosing a smoothing constant, as necessary in PROC FORECAST METHOD=EXPO, the data tell you what smoothing constant to use when you invoke PROC ARIMA. Furthermore, PROC ARIMA produces more reasonable forecast intervals. In short, PROC ARIMA does everything the simpler procedures do and does it better. However, to benefit from this additional flexibility and sophistication in software, you must have enough expertise and time to analyze the series. You must be able to identify and specify the form of the time series model using the autocorrelations, partial autocorrelations, inverse autocorrelations, and cross-correlations of the time series. Later chapters explain in detail what these terms mean and how to use them. Once you identify a model, fitting and forecasting are almost automatic.

The identification process is more complicated when you use input series. For proper identification, the ARIMA methodology requires that inputs be independent of each other and that there be no feedback from the output series to the input series. For example, if the temperature Tt in a room at time t is to be explained by current and lagged furnace temperatures Ft, lack of feedback corresponds to there being no thermostat in the room. A thermostat causes the furnace temperature to adjust to recent room temperatures. These ARIMA restrictions may be unrealistic in many examples.

You can use PROC STATESPACE and PROC VARMAX to model multiple time series without these restrictions. Although PROC STATESPACE and PROC VARMAX are sophisticated in theory, they are easy to run in their default mode. The theory allows you to model several time series together, accounting for relationships of individual component series with current and past values of the other series. Feedback and cross-correlated input series are allowed. Unlike PROC ARIMA, PROC STATESPACE uses an information criterion to select a model, thus eliminating the difficult identification process in PROC ARIMA. For example, you can put data on sales, advertising, unemployment rates, and interest rates into the procedure and automatically produce forecasts of these series, as sketched below. It is not necessary to intervene, but you must be certain that you have a property known as stationarity in your series to obtain theoretically valid results. The stationarity concept is discussed in Chapter 3, "The General ARIMA Model," where you will learn how to make nonstationary series stationary.
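A minimal sketch of that default-mode run follows; the data set ECON and its variable names are assumptions for illustration:

PROC STATESPACE DATA=ECON OUT=FORECASTS LEAD=12 INTERVAL=MONTH;
   VAR SALES ADVERTISING UNEMPLOYMENT INTEREST;
   ID DATE;
RUN;

With no other options, the procedure chooses the state vector and model automatically using Akaike's information criterion and writes 12 months of forecasts for all four series to the OUT= data set.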
Although the automatic modeling in PROC STATESPACE sounds appealing, two papers in the Proceedings of the Ninth Annual SAS Users Group International Conference (one by Bailey and the other by Chavern) argue that you should use such automated procedures cautiously. Chavern gives an example in which PROC STATESPACE, in its default mode, fails to give as accurate a forecast as a certain vector autoregression. (However, the stationarity of the data is questionable, and stationarity is required to use PROC STATESPACE appropriately.) Bailey shows a PROC STATESPACE forecast considerably better than its competitors in some time intervals but not in others. In SAS Views: SAS Applied Time Series Analysis and Forecasting, Brocklebank and Dickey generate data from a simple MA model and feed these data into PROC STATESPACE in the default mode. The dimension of the model is overestimated when 50 observations are used, but the procedure is successful for samples of 100 and 500 observations from this simple series. Thus, it is wise to consider intervening in the modeling procedure through PROC STATESPACE's control options. If a transfer function model is appropriate, PROC ARIMA is a viable alternative.

This chapter introduces some techniques for analyzing and forecasting time series and lists the SAS procedures for the appropriate computations. As you continue reading the rest of the book, you may want to refer back to this chapter to clarify the relationships among the various procedures. Figure 1.1 shows the interrelationships among the SAS/ETS software procedures mentioned. Table 1.1 lists some common questions and answers concerning the procedures.

Figure 1.1 How SAS/ETS Software Procedures Interrelate
[Diagram linking PROC FORECAST METHOD=STEPAR (autocorrelated residuals) with PROC AUTOREG; PROC FORECAST METHOD=EXPO (exponential smoothing models) with PROC ARIMA (time series errors, intervention models, and transfer function models); and PROC STATESPACE and PROC VARMAX (multivariate models with random inputs).]

Table 1.1 Selected Questions and Answers Concerning SAS/ETS Software Procedures

Questions
1. Is a frequency domain analysis (F) or time domain analysis (T) conducted?
2. Are forecasts automatically generated?
3. Do predicted values have 95% confidence limits?
4. Can you supply leading indicator variables or explanatory variables?
5. Does the procedure run with little user intervention?
6. Is minimal time series background required for implementation?
7. Does the procedure handle series with embedded missing values?

Procedure                          1    2    3    4    5    6    7
FORECAST                           T    Y    Y    N    Y    Y    Y
AUTOREG                            T    Y*   Y    Y    Y    Y    Y
X11                                T    Y*   N    N    Y    Y    N
X12                                T    Y*   Y    Y    Y    N    Y
SPECTRA                            F    N    N    N    Y    N    N
ARIMA                              T    Y*   Y    Y    N    N    N
STATESPACE                         T    Y    Y*   Y    Y    N    N
VARMAX                             T    Y    Y    Y    Y    N    N
MODEL                              T    Y*   Y    Y    Y    N    Y
Time Series Forecasting System     T    Y    Y    Y    Y    Y    Y

Y = yes   N = no   T = time domain analysis   F = frequency domain analysis
* = requires user intervention
1.3 Simple Models: Regression

1.3.1 Linear Regression

This section introduces linear regression, an elementary but common method of mathematical modeling. Suppose that at time t you observe Yt. You also observe explanatory variables X1t, X2t, and so on. For example, Yt could be sales in month t, X1t could be advertising expenditure in month t, and X2t could be competitors' sales in month t. Output 1.1 shows a simple plot of monthly sales versus date.

Output 1.1 Producing a Simple Plot of Monthly Data
[Plot of monthly sales against date.]

A multiple linear regression model relating the variables is

Yt = β0 + β1X1t + β2X2t + εt

For this model, assume that the errors εt

• have a normal distribution
• have the same variance at all times t
• are uncorrelated with each other (εt and εs are uncorrelated for t different from s).

These assumptions allow you to use standard regression methodology, such as PROC REG or PROC GLM. For example, suppose you have 80 observations and you issue the following statements:

TITLE "PREDICTING SALES USING ADVERTISING";
TITLE2 "EXPENDITURES AND COMPETITORS' SALES";
PROC REG DATA=SALES;
   MODEL SALES=ADV COMP / DW;
   OUTPUT OUT=OUT1 P=P R=R;
RUN;

Output 1.2 shows the estimates of β0, β1, and β2. The standard errors are incorrect if the assumptions on εt are not satisfied. You have created an output data set called OUT1 and have called for the Durbin-Watson option to check on these error assumptions.

Output 1.2 Performing a Multiple Regression
[Listing of the analysis of variance and parameter estimates for the regression of SALES on ADV and COMP: overall F = 51.140 (p = 0.0001), both slopes significant at p = 0.0001, with the Durbin-Watson D statistic and a first-order autocorrelation of 0.283 for the 80 observations.]

The test statistics produced by PROC REG are designed specifically to detect departures from the null hypothesis H0: the εt are uncorrelated, in favor of alternatives H1 of the form

εt = ρεt−1 + et

where |ρ| < 1 and et is an uncorrelated series. This type of error term, in which εt is related to εt−1, is called an AR (autoregressive) error of the first order.

The Durbin-Watson option in the MODEL statement produces the Durbin-Watson test statistic

d = Σ (ε̂t − ε̂t−1)² / Σ ε̂t²

where the numerator sum runs from t=2 to n, the denominator sum runs from t=1 to n, and

ε̂t = Yt − β̂0 − β̂1X1t − β̂2X2t

If the actual errors are uncorrelated, the numerator of d has an expected value of about 2(n−1)σ² and the denominator has an expected value of approximately nσ². Thus, if the errors are uncorrelated, the ratio d should be approximately 2. Positive autocorrelation means that ε̂t is closer to ε̂t−1 than in the independent case, so (ε̂t − ε̂t−1)² should be smaller. It follows that d should also be smaller. The smallest possible value for d is 0. If d is significantly less than 2, positive autocorrelation is present.

When is a Durbin-Watson statistic significant? The answer depends on the number of coefficients in the regression and on the number of observations. In this case, you have k=3 coefficients (for the intercept, ADV, and COMP) and n=80 observations. In general, if you want to test for positive autocorrelation at the 5% significance level, you must compare d=1.349 to a critical value. Even with k and n fixed, the critical value can vary depending on actual values of the independent variables. The results of Durbin and Watson imply that if k=3 and n=80, the critical value must be between dL=1.59 and dU=1.69. Since d is less than dL, you would reject the null hypothesis of uncorrelated errors in favor of the alternative: positive autocorrelation.
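As an aside, you can verify exactly what the DW option computes from the residuals saved in OUT1. The following steps are a sketch, not from the book:

DATA DW;
   SET OUT1;
   RLAG = LAG(R);           /* previous residual */
   NUM = (R - RLAG)**2;     /* numerator terms of d */
   DEN = R*R;               /* denominator terms of d */
   CROSS = R*RLAG;          /* terms for the lag-1 autocorrelation */
RUN;

PROC SQL;
   SELECT SUM(NUM)/SUM(DEN) AS D, SUM(CROSS)/SUM(DEN) AS RHO_HAT
   FROM DW;
QUIT;

SUM ignores the missing value that LAG creates on the first observation, so the two sums automatically have the right ranges.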
If d>2, which is evidence of negative autocorrelation, compute d′ = 4 − d and compare the result to dL and dU. Specifically, if d′ (1.954) were greater than 1.69, you would be unable to reject the null hypothesis of uncorrelated errors. If d′ were less than 1.59, you would reject the null hypothesis of uncorrelated errors in favor of the alternative: negative autocorrelation. Note that if 1.59 < d < 1.69, you cannot be sure whether d is to the left or right of the actual critical value c, because you know only that 1.59 < c < 1.69.

Durbin and Watson have constructed tables of bounds for the critical values. Most tables use k′ = k − 1, which equals the number of explanatory variables excluding the intercept, and n (the number of observations) to obtain the bounds dL and dU for any given regression (Draper and Smith 1998). (Exact p-values for d are now available in PROC AUTOREG, as will be seen in Output 1.2A later in this section.)

Three warnings apply to the Durbin-Watson test. First, it is designed to detect first-order AR errors. Although this type of autocorrelation is only one possibility, it seems to be the most common. The test has some power against other types of autocorrelation. Second, the Durbin-Watson bounds do not hold when lagged values of the dependent variable appear on the right side of the regression. Thus, if the example had used last month's sales to help explain this month's sales, you would not know correct bounds for the critical value. Third, if you incorrectly specify the model, the Durbin-Watson statistic often lies in the critical region even though no real autocorrelation is present. Suppose an important variable, such as X3t=product availability, had been omitted in the sales example. This omission could produce a significant d. Some practitioners use d as a lack-of-fit statistic, which is justified only if you assume a priori that a correctly specified model cannot have autocorrelated errors and, thus, that significance of d must be due to lack of fit.

The output also produced a first-order autocorrelation, 0.283, denoted ρ̂ and computed as

ρ̂ = Σ ε̂t ε̂t−1 / Σ ε̂t²

When n is large and the errors are uncorrelated, n^(1/2) ρ̂ / (1 − ρ̂²)^(1/2) is approximately distributed as a standard normal variate. Thus, a value of n^(1/2) ρ̂ / (1 − ρ̂²)^(1/2) exceeding 1.645 is significant evidence of positive autocorrelation at the 5% significance level. This is especially helpful when the number of observations exceeds the largest in the Durbin-Watson table—for example,

80^(1/2) (.283) / (1 − .283²)^(1/2) = 2.639

You should use this test only for large n values. It is subject to the three warnings given for the Durbin-Watson test. Because of the approximate nature of the n^(1/2) ρ̂ / (1 − ρ̂²)^(1/2) test, the Durbin-Watson test is preferable. In general, d is approximately 2(1 − ρ̂), as is easily seen by expanding the numerator of d.

Durbin and Watson also gave a computer-intensive way to compute exact p-values for their test statistic d. This has been incorporated in PROC AUTOREG. For the sales data, you issue this code to fit a model for sales as a function of this-period and last-period advertising:

PROC AUTOREG DATA=NCSALES;
   MODEL SALES=ADV ADV1 / DWPROB;
RUN;

The resulting Output 1.2A shows a significant d=.5427 (p-value .0001 < .05).
Could this be because of an omitted variable? Try the model with competitor's sales included:

PROC AUTOREG DATA=NCSALES;
   MODEL SALES=ADV ADV1 COMP / DWPROB;
RUN;

Now, in Output 1.2B, d=1.8728 is insignificant (p-value .2239 > .05). Note also the increase in R-square (the proportion of variation explained by the model) from 39% to 82%. What is the effect of an increase of $1 in advertising expenditure? It gives a sales increase estimated at $6.04 this period but a decrease of $5.18 next period. You wonder if the true coefficients on ADV and ADV1 are the same with opposite signs; that is, you wonder if these coefficients add to 0. If they do, then the increase we get this period from advertising is followed by a decrease of equal magnitude next period.
From Output 1.2C the forecasting equation is seen to be PREDICTED SALES = 35967 – 0.563227COMP + 6.038203ADV – 5.188384ADV1 Output 1.2A borP xorppA oitaR t 1000.0 1000.0 0490.0 1000.0 6683.0 576.1761 557.9818 77 122.5408.6 596.1 setamitsE serauqS tsaeL yranidrO SELAS = elbairaV tnednepeD WDborP 67 :FD 55257591 :rotanimoneD :FD 768.38830136:rotaremuN RPMET :tseT SELAS :elbairaV tnednepeD SELAS S'ROTITEPMOC DNA SERUTIDNEPXE GNISITREVDA GNISU SELAS GNITCIDERP 440.0 08 378.1 1000.0 1000.0 1000.0 1000.0 |T| > borP 499.9265.11 507.31783.7 0=retemaraP :0H rof T 47521915.0 48242225.0 50690140.0 8768400.9684 rorrE dradnatS noitalerrocotuA redrO ts1 ).sbO fo rebmuN roF( D nostaW-nibruD 483881.5302830.6 722365.076953 etamitsE retemaraP 1 1 1 1 FD 1VDA VDA PMOC PECRETNI elbairaV setamitsE retemaraP 3618.0 3328.0 qs-R jdA erauqs-R 30239.41 05212.03692 61393.4244 .V.C naeM peD ESM tooR 1000.0 F>borP 820.811 eulaV F 4.9534898148 97 548.45257591 2.8639177841 67 7.3661240132 2.1994621396 3 erauqS serauqS FD naeM latoT C rorrE ledoM ecruoS fo muS ecnairaV fo sisylanA SELAS :elbairaV tnednepeD SELAS S'ROTITEPMOC DNA SERUTIDNEPXE GNISITREVDA GNISU SELAS GNITCIDERP 1000.0 1000.0 1000.0 1000.0 499.9265.11 507.31783.7 1915.0 2225.0 1140.0 0.9684 rorrE dtS WD rP 03.568.21 85.2 eulaV t 34511.0 98511.0 83012.03 rorrE dradnatS 44116.078094.1 27359.77 etamitsE retemaraP 1 1 1 FD 2REVLISL 1REVLISL tpecretnI elbairaV setamitsE retemaraP 5998.0 6309.0 qS-R jdA erauqS-R 55310.5 00067.246 21522.23 raV ffeoC naeM tnednepeD ESM tooR 1000.< F > rP 62.022 eulaV F 05854.8301 727822 erauqS naeM 162605 80884 454754 serauqS fo muS 94 74 2 FD latoT detcerroC rorrE ledoM ecruoS ecnairaV fo sisylanA REVLIS :elbairaV tnednepeD 1LEDOM :ledoM erudecorP GER ehT REVLIS FO SKCOTS DNE HTNOM 09265.995 03539.1601 27482 592783 34507491 SS I epyT 3364.0 0279.0 1790.0 1000.< 5900.0 |t| > rP 47.040.0 07.141.9 27.2 eulaV t 58151.0 73162.0 87062.0 65151.0 40958.73 rorrE dradnatS 63211.012900.0 13244.098583.1 62148.201 etamitsE retemaraP 1 1 1 1 1 FD 4REVLISL 3REVLISL 2REVLISL 1REVLISL tpecretnI elbairaV setamitsE retemaraP 2988.0 6898.0 qS-R jdA erauqS-R 56591.5 38598.636 09090.33 raV ffeoC naeM tnednepeD ESM tooR 1000.< F > rP 03.59 eulaV F 56700.5901 753401 erauqS naeM 415464 58074 924714 serauqS fo muS 74 34 4 FD latoT detcerroC rorrE ledoM ecruoS ecnairaV fo sisylanA REVLIS :elbairaV tnednepeD 1LEDOM :ledoM erudecorP GER ehT REVLIS FO SKCOTS DNE HTNOM Output 2.5 Using PROC PRINT to List the Data and PROC REG to Fit an AR Process (continued) Chapter 2: Simple Models: Autoregression 47 48 SAS for Forecasting Time Series Output 2.5 shows that lags 3 and 4 may not be needed because the overall F statistic for these two as lags is computed ((1062 + 600) / 2) / 1095 = .76 34 2 You have identified the model through overfitting, and now the final estimated model t which becomes All parameters are significant according to their t statistics. The fact that M=1 almost solves the characteristic equation 2 M16. suggests that this series may be nonstationary. In Chapter 3, you extend your class of models to include moving averages and mixed ARMA models. These models require more sophisticated fitting and identification techniques than the simple regression with overfitting used in the silver example. t e e + + ) 746 2 − tY4116. − 2 − tY ( 4116. − 1− tY9094.1 − F This is insignificant compared to the to produce F. distribution. Alternatively, a TEST statement could be used is ) 746 + + 7359.77 − M94. 
1− tY ( 9094.1 1 − 1 = t Y = 746 − t Y 821 seuqinhceT noitacifitnedI rehtO 01.4.3 321 dnerT raeniL a evomeR ot gnicnereffiD 9.4.3 311 ataD yranoitatsnoN rof sledoM 8.4.3 501 seireS revliS dna seireS MBI gnitsaceroF :selpmaxE 7.4.3 401 stsaceroF no gnicnereffiD fo tceffE 6.4.3 201 seireS yranoitatsnoN 5.4.3 79 8 seireS rof tnemetatS ETAMITSE 4.4.3 59 AMIRA CORP ni desU sdohteM noitamitsE 3.4.3 09 sisylanA tropxE leetS dna norI :elpmaxE 2.4.3 18 8–1 seireS rof tnemetatS YFITNEDI 1.4.3 97 noitacifitnedI ledoM fo yrammuS 97 slaudiseR fo kcehC erauqS-ihC 65 noitacifitnedI seireS emiT 55 ytilibitrevnI dna ytiranoitatS 4.3.3 3.3.3 2.3.3 1.3.3 08 snoitcurtsnI dna selpmaxE 4.3 55 noitacifitnedI ledoM 3.3 15 noitciderP 2.3 25 snoitciderP erutuF 2.2.3 15 snoitciderP daehA-petS-enO 1.2.3 05 noitatoN dna ygolonimreT 2.1.3 94 dnuorgkcaB lacitsitatS 1.1.3 3.1 Introduction Chapter The general class of autoregressive moving average (ARMA) models is developed in this chapter. As each new model is introduced, its autocovariance function is given. This helps you use the estimated autocovariances C(j) that PROC ARIMA produces to select an appropriate model for the data. Using estimated autocovariances to determine a model to be fit is called model identification. Once you select the model, you can use PROC ARIMA to fit the model, forecast future values, and provide forecast intervals. ) j (γ 3.1.1 Statistical Background 041 yrammuS 5.3 94 noitcudortnI 1.3 3 The General ARIMA Model 50 SAS for Forecasting Time Series 3.1.2 Terminology and Notation The moving average of order 1 is given by (3.1) Clearly, .2σ 1− te )j ( and 0 for j>1. dealing with an MA process of order 1 because .08 and 0= )j ( for j>1. Note that 2 /1 The general moving average of order q is written as q − te q Similarly, you write the mixed autoregressive moving average model ARMA(p,q) as ) ρ µ− ρ −t ( ρ Y t e) q B q β − . . . − )j ( γ and is characterized by the fact that 2 B2 α −. β − ... − . . q − te qβ − . . . − 1− te 1β − te = − ) µ − 1− tY ( 1α − ) µ − tY ( .β regardless of the value 0. In the example, the autocorrelations for lags 0 through 4 are 1, .4, 0, 0, and and are 0 for j>q. In backshift notation you write . )j ( ρ , )0 ( γ If each autocovariance is divided by For a moving average like equation 3.1, )j ( = 2σ 5.− = β so and The model is the resulting sequence of autocorrelations is 04 = σβ − 2 ,0 = )3 ( γ ,0 = )2 ( γ ,04 = )1( γ ,001 = )0 ( γ If you observe the autocovariance sequence ,001 = 2 σ ) 2 β + 1( for j>1. Also, you know that 2 σβ − = )) 2− eβ − 0 = ) j (γ t 1− te() 1− te β− ) t 2 e((E = )1(γ β + 1( 2 σ = ) 0(γ = ) Y( rav ≤ ) 2 β + 1( / β − ≤ 2 / 1 − γ β − B 1β − 1( + µ = 1− te 1 1− te5. ) 2 β + 1( / β− = )1( ρ β− = ) 1− Y , Y( voc =) + e+µ= j − tY , tY( t te t +µ= 1 = )0 ( ρ t t t e where β− e+ µ = t voc t tY tY t Y Y ρ is a white noise (uncorrelated) sequence with mean 0 and variance . . . , you are and Chapter 3: The General ARIMA Model 51 or in backshift notation as te ) B q q For example, the model t is an ARMA(1,1) with mean In practice, parameters are estimated and then used to estimate prediction error variances for several periods ahead. PROC ARIMA provides these computations. 3.2 Prediction 3.2.1 One-Step-Ahead Predictions You can further clarify the example above by predicting sequentially one step at a time. Let n denote the number of available observations. The next (n+1) observation in the sequence satisfies ne4. First, predict Yn+1 by ne4. 
so predict Yn+2 by removing “future es” (subscripts greater than n): 1+ nY6. and 0 + 2 + nY6. ˆ so the prediction error is 2 + ne4. This example shows that you can readily compute and the prediction error variance is predictions and associated error variances after model parameters or their estimates are available. . 2 σ63.2 . 2 σ2 The prediction error is en+1+ en+2, which has variance 2 + ne4. 2 + ne4. + + 1+ ne4. + 2 +ne + ) ne4. + 1+ ne + nY 6.( 6. = + 2 + ne + 1+ nY6. = 3 + ne 3 + ne + + ) 2 + nY ) 2 +n ˆ e 1+ n e4. − + + 2 + nY ( 6. 1+ ne ( 6. 3 + ne ˆ .2σ with error variance 2 +nY Next, β− ... − B1β − 1( = ) µ − tY ( ) p B p α − .0 =µ e) B4. + 1( = + 1+ ne + nY6. = = + = = ne42. 2 + nY 6. + nY6. = 1+nY ˆ 3 + nY ˆ ... t Y ) B6. + − B1α − 1( − = = nY63. 3 +nY 3 +nY 3 + nY 1+nY − 1( ˆ Finally, 52 SAS for Forecasting Time Series The predictions for the model 1− te4. 2 can be computed recursively as follows: — — — 6 1 8– 3– 5 01 noitavre sbO and by and by 3.656. These are one-step-ahead predictions for the historic data. For example, you use only the data up through (and the assumed ) to predict . The sum of squares of these residuals, is called the conditional sum of squares associated with the parameters 0.6 and 0.4. If you search over AR and MA parameters to find those that minimize this conditional sum of squares, you are performing conditional least squares estimation, the default in PROC ARIMA. An estimate of the white noise variance is given by dividing the conditional sum of squares by n minus the number of estimated parameters; that is, for this ARMA(1,1) with mean 0. 4 3.2.2 Future Predictions Predictions into the future are of real interest, while one-step-ahead computations are used to start the as 0 for t beyond n ( observations in the process. Continuing the process as shown, estimate example); that is, estimate future Ys by their predictions. The next three predictions are as follows: 7e is 0.6(6) + 0.4(2.344) = 4.538 with error is 0.6(4.538) + 0.4(0) = 2.723 with error is 0.6(2.723) + 0.4(0) = 1.634 with error PROC ARIMA provides these computations for you. The illustration simply shows what PROC ARIMA is computing. j in the example). The prediction error variance increases from var(et) to var(Yt). In a practical application, the form 7Y )6.( ˆ j j +7 Y Note that the prediction of is just and thus declines exponentially to the series mean (0 Y 6=n . 7e6.0 + 8e + 9e + 8e 01 = 1e 7e ,420.622 s te = 2 443.2 4.3 4 Y 3= t − = )4− (4.0 + )3− (6.0 = 3r4.0 + 3Y6.0 + + 52 + 001 .4 − = 1 − 3Y = 3r The residual is Then predict by Y 2r .01 .5− = 01 − 5 = 2 r = 1e Start by assuming the mean (0) as a prediction of with implied error using the assumed The residual is predict by estimate of 1 436.1 — 327.2 — 835.4 — 443.2 656.3 + e + 1− Y6. = t Y 46.6– 46.7 t .01 1 = )5− (4.0 + )5(6.0 = 2r4.0 + 2Y6.0 6.4– 4.3– = 1e t Y 4– 1 5– 01 )0( 01 3 Y 6 Y 7Y 8Y 9Y noitciderP ˆ ˆ ˆ laudiseR Predict by Using as an ,01 46.6 4 =2−6=2−n = 1e4.0 + 1Y6.0 ,2 e − 5 Y Chapter 3: The General ARIMA Model 53 and parameter values 1− te4. are not known. They can be determined through PROC ARIMA. In practice, estimated parameters are used to compute predictions and standard errors. This procedure requires sample sizes much larger than those in the example above. Although they would not have to be, the forecasting methods used in PROC ARIMA are tied to the method of estimation. If you use conditional least squares, the forecast is based on the expression of as an infinite autoregression. 
For example, suppose , a simple MA(1). Note that so at time you have substituting this second ∞ t t t When ML or ULS estimation is used, optimal linear forecasts based on the finite past are computed. ;2 and are just coefficients; they do not represent autoregressive or moving average parameters. Using calculus you find )2( γ     )1(γ  1−    )4 (γ     )3( γ   )2 (γ     )1( γ  1−   )0(γ   )1(γ  )2( γ   )3(γ sφ j )2 (γ )0( γ )1( γ )1( γ ,4 Y 3Y 2Y 1Y )2 ( γ )0(γ )1(γ )1(γ    )3(γ   4φ     )2( γ   3φ  = )1( γ   2φ     1  φ )0( γ  5 Y Likewise, to forecast using , , , the four are computed as : 2 Y 1 Y φ )µ − 1Y ( 2φ + )µ − 2Y( 1 + µ 3 Y This would give the best forecast of based on a linear combination of )0(γ   )1(γ t Y that is, you want the minimum variance forecast of 2 based on its two predecessors. Note that here and φ 1 φ } ] ) µ − 2 − tY( 2φ − ) µ − 1− tY ( 1φ − ) µ − tY ( [{ E Suppose you want to minimize by finding .µ − t Y necessarily the best linear combination of lagged Y values for predicting 2 and ) µ− j − Y( t j β ∑ and, of course, the parameters are replaced by estimates. The truncated sum is not : prior to time 1, the “infinite past.” PROC ARIMA assumes Y values before time 1 are just equal to 1= j 1− t ) µ− j −t Y( j β ∑ 1= j )0( γ  )1( γ   2φ  =   1  φ .) µ− j −t Y( = 3Y ˆ j β ∑−µ= 1= j t Y ˆ 1− t up to time is The expression depends on Y values t Y . te + )µ − ∞ j −t Y( j β ∑− = µ − 1= j ) t Y( can alternatively be expressed as ∞ Thus the forecast of given data ,) µ− j − Y( t j β ∑= e 0= j t ∞ j − te j β 1< β assuming that so that converges to 0 as j gets large, you find Y( = t e expression into the first, you have Continuing in this fashion, and which 1− te . 2 − e β + ) µ − 1− Y (β + ) µ − 2 ; 2 − eβ + µ − 1− Y = 1− e β− e+µ = Y t t t t t 1− t 1− te , 1− t β− + e= eβ + µ − t t e = 1− tYα − 1− tY 6. t Y= φ − µ− t t t Y e Y t Y 1 φ 54 SAS for Forecasting Time Series For reasonably long time series whose parameters are well inside the stationarity and invertibility regions, the best linear combination forecast used when ML or ULS is specified does not differ by much from the truncated sum used when CLS is specified. (See Section 3.3.1.) ahead, are listed below. The top row shows the first 14 weights assuming infinite past the next two rows show finite past weights for and past observations. ,8.0 Despite the fairly large and small n values, the weights are quite similar. Increasing to 25 produces weights indistinguishable, out to 2 decimal places, from those for the infinite past. the series is said to be “noninvertible,” indicating that you cannot get a nice, If convergent series representation of as a function of current and lagged Y values. Not only does this negate the discussion above, but since a reasonable estimate of cannot be extracted from the data, it eliminates any sensible model-based forecasting. In the moving average of order has an associated polynomial equation in the algebraic variable M, Note the analogy with the characteristic equation computed from the autoregressive coefficients. Fortunately in practice it is rare to encounter a naturally measured series that appears to be noninvertible. However, when differences are taken, noninvertibility can be artificially induced. For example, the time series is a simple linear trend plus white noise. Some practitioners have the false impression that any sort of trend in a time series should be removed by taking differences. 
If that is done, one sees that 1− te on this being a fundamental result of basic statistical ordinary least squares regression of theory. The practitioner perhaps was confused by thinking in a very narrow time series way. ,t t t e + t1 α + 0 α= Y t average has been produced. Note that the parameters of 1 α t1 α + 0α so that in the process of reducing the trend 1> M whose roots must satisfy in order for the series to be invertible. to a constant , a noninvertible moving are best estimated by the ,) )8.0( ( j 41− t 40. 20. — , Y − − q − n 31− t 50. 20. — Y − − 12.− − − 12. 80. 7− t Y t e 62.− − − 62. 61. 41 = n 6− t Y − 33. 23.− − − 52. 5− t t e + 1α = ) 1− e + )1 − t ( 1α + Y 9=n 14.− − − 14. 53. 4− t =β For an MA(1) process with lag 1 parameter the weights on past Y, used in forecasting 1 step and ,β structure and any number of lags. For the MA(1) example with parameter you have    0     0   0    β−  1− j φ   β +1 2  β−  0   0 β− β +1 2 β− 0 0 β− β +1 2 β− Y t t e + t1 α + 15.− − − 15. 74. 3− t 2 Y    0   4φ     0   3φ  = β−   2φ     β + 1  1φ  Y t e 16. 46.− − − 46. 2− t 1− te 0 α= Y Y 0 1− e = α ( − ) e + t1α + 0 α ( = 1− Y − t q − te q t β 08.− − − 08. 97. ,0 1− t t β− Y = q M qβ − t Y tsap etinif , 41 = n 9=n gal tsap etinifnI tsap etinif , − 1− e1β − e = t )h( γ , t Here is the autocovariance at lag h and the equations for the s can be set up for any ARMA to predict the fifth Y − M1β − 1 1=β t t t Y Y Chapter 3: The General ARIMA Model 55 3.3 Model Identification 3.3.1 Stationarity and Invertibility Consider the ARMA model ) ... ... The model is stationary if all values of M such that 0= p Mp are larger than 1 in absolute value. Stationarity ensures that early values of e have little influence on the current value of Y. It also ensures that setting a few values of e to 0 at the beginning of a series does not affect the predictions very much, provided the series is moderately long. In the ARMA(1,1) example, the prediction of Y6 with 0 as an estimate of e1 differs from the prediction using the true e1 by the quantity .01 e1. Any MA process is stationary. One AR example is t which is not stationary (the roots of 1 1.3M+.3M2=0 are M=1 and M=10/3). Another example is which is stationary (the roots of 1 1.3M+.42M2=0 are M=10/7 and M=10/6). A series satisfies the invertibility condition if all Ms for which 0= q Mq The invertibility condition ensures that Yt can be expressed in terms of et and are such that an infinite weighted sum of previous Ys. In the example, t and + 1− tY so + 4− tY460. The decreasing weights on lagged values of Y allow you to estimate et from recent values of Y. Note that in Section 3.2.1 the forecast of Yn+1 was .6Yn+.4en, so the ability to estimate en from the data was crucial. − 4− tY460. µ − tY ( ) p B p α − . . . − 2B 2 α − B1α − 1( te ) − − + q B q β − . . . − 2B 2 β − B1β − 1( = − 3− tY61. 3− tY61. α − ... − β − ... − + − Y) t e= e= 2− tY 4. 2− tY 4. B6. t Y) 2 − 1( 1−)B4. + 1( = e t Y) 2 .1 2 2 − M2 > M2 B24. B3. 1− tY M α − M 1α − 1 β − M 1β − 1 + B3.1 − 1( + B3.1 − 1( + e= − t t Y = e t Y t t 56 SAS for Forecasting Time Series 3.3.2 Time Series Identification You need to identify the form of the model. You can do this in PROC ARIMA by inspecting dataderived estimates of three functions: These functions are defined below. A short catalog of examples is developed, and properties useful for associating different forms of these functions with the corresponding time series forms are summarized. 
3.3.2 Time Series Identification

You need to identify the form of the model. You can do this in PROC ARIMA by inspecting data-derived estimates of three functions:

   the autocorrelation function (ACF)
   the inverse autocorrelation function (IACF)
   the partial autocorrelation function (PACF).

These functions are defined below. A short catalog of examples is developed, and properties useful for associating different forms of these functions with the corresponding time series forms are summarized.

In PROC ARIMA, an IDENTIFY statement produces estimates of all three of these functions. For example, the following SAS statements produce lists and plots of all three of these functions for the variable Y in the data set SERIES:

PROC ARIMA DATA=SERIES;
   IDENTIFY VAR=Y;
RUN;

3.3.2.1 Autocovariance Function

Recall that γ(j) is the covariance between Yt and Yt-j, which is assumed to be the same for every t (stationarity). See the listing below of autocovariance functions for Series 1-8 (in these examples, et is white noise with variance σ² = 1):

   Series   Model                              Type
   1        Yt = .8Yt-1 + et                   AR(1)
   2        Yt = -.8Yt-1 + et                  AR(1)
   3        Yt = .3Yt-1 + .4Yt-2 + et          AR(2)
   4        Yt = .7Yt-1 - .49Yt-2 + et         AR(2)
   5        Yt = et + .8et-1                   MA(1)
   6        Yt = et - .3et-1 - .4et-2          MA(2)
   7        Yt = et                            white noise
   8        Yt = .6Yt-1 + et + .4et-1          ARMA(1,1)

For an AR(1) series Yt = ρYt-1 + et (such as Series 1 and 2), the covariance sequence is γ(j) = σ²ρ^j/(1 - ρ²). For an AR(2) series Yt = α1Yt-1 + α2Yt-2 + et (such as Series 3 and 4), the covariance sequence begins with values γ(0) and γ(1) that satisfy the Yule-Walker equations, followed by values that satisfy

   γ(j) - α1γ(j-1) - α2γ(j-2) = 0

The covariances may oscillate with a period depending on α1 and α2 (such as Series 4). Beginning values are determined from the Yule-Walker equations. For a general AR(p) series

   Yt - α1Yt-1 - ... - αpYt-p = et

the beginning values are γ(0), γ(1), ..., γ(p-1), from which γ(j) satisfies

   γ(j) - α1γ(j-1) - α2γ(j-2) - ... - αpγ(j-p) = 0

The fact that γ(j) satisfies the same difference equation as the series ensures that |γ(j)| < Hλ^j, where 0 < λ < 1 and H is some finite constant. In other words, γ(j) may oscillate, but it is bounded by a function that decreases exponentially to zero.

For MA(1), Yt - µ = et - βet-1, and

   γ(0) = (1 + β²)σ²,  γ(1) = γ(-1) = -βσ²

with γ(j) = 0 for |j| > 1. For a general MA(q)

   Yt - µ = et - β1et-1 - β2et-2 - ... - βqet-q

the q beginning values γ(1), ..., γ(q) are determined by the βs, and γ(j) = 0 for |j| > q. For an ARMA(1,1) process

   (Yt - µ) - α(Yt-1 - µ) = et - βet-1

there is a dropoff from γ(0) to γ(1) determined by α and β. For j > 1, the pattern γ(j) = αγ(j-1) occurs. Thus, an apparently arbitrary drop followed by exponential decay characterizes the ARMA(1,1) covariance function. For the ARMA(p,q) process

   (Yt - µ) - α1(Yt-1 - µ) - ... - αp(Yt-p - µ) = et - β1et-1 - ... - βqet-q

there are r = max(p - 1, q) beginning values followed by behavior characteristic of an AR(p); that is,

   γ(j) - α1γ(j-1) - ... - αpγ(j-p) = 0

for j > r. For a white noise sequence, γ(j) = 0 for j ≠ 0.

3.3.2.2 ACF

Note that the pattern, rather than the magnitude, of the γ(j) sequence is associated with the model form. Normalize the autocovariance sequence by computing autocorrelations ρ(j) = γ(j)/γ(0). Note that ρ(0) = 1 and ρ(-j) = ρ(j) for all series. The ACFs for the eight series previously listed are as follows:

   Series 1:  ρ(j) = .8^j
   Series 2:  ρ(j) = (-.8)^j
   Series 3:  ρ(0) = 1, ρ(1) = .5000, ρ(j) = .3ρ(j-1) + .4ρ(j-2) for j > 1
   Series 4:  ρ(1) = .4698, ρ(j) = .7ρ(j-1) - .49ρ(j-2) for j > 1
   Series 5:  ρ(1) = .4878, ρ(j) = 0 for j > 1
   Series 6:  ρ(1) = -.144, ρ(2) = -.32, ρ(j) = 0 for j > 2
   Series 7:  ρ(0) = 1, ρ(j) = 0 for j > 0
   Series 8:  ρ(1) = .7561, ρ(j) = .6ρ(j-1) for j > 1
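As an illustration of how quickly such a catalog entry can be tabulated, the Series 3 recursion can be iterated in a short DATA step (a sketch; the first two values come from the Yule-Walker equations as noted above):

DATA ACF3;
   RETAIN RLAG1 0.5 RLAG2 1;            /* rho(1) = .5 and rho(0) = 1        */
   LAG = 1; RHO = 0.5; OUTPUT;
   DO LAG = 2 TO 10;
      RHO = 0.3*RLAG1 + 0.4*RLAG2;      /* rho(j) = .3rho(j-1) + .4rho(j-2)  */
      OUTPUT;
      RLAG2 = RLAG1; RLAG1 = RHO;
   END;
   KEEP LAG RHO;
RUN;

PROC PRINT DATA=ACF3; RUN;

The printed values .5, .55, .365, .33, ... are the Series 3 autocorrelations plotted in Output 3.1.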
The ACFs are plotted in Output 3.1.

Output 3.1 Plotting Actual Autocorrelations for Series 1-8
(plots of the theoretical ACFs for the eight series; not reproduced here)

3.3.2.3 PACF

The PACF is motivated by the regression approach to the silver example in Chapter 2, "Simple Models: Autoregression." First, regress Yt on Yt-1 and call the coefficient π̂1. Next, regress Yt on Yt-1, Yt-2 and call the coefficient on Yt-2 π̂2. Continue in this manner, regressing Yt on Yt-1, Yt-2, ..., Yt-j and calling the last coefficient π̂j. The π̂j values are the estimated partial autocorrelations. The partial autocorrelations πj estimated by the π̂j are obtained by solving equations similar to the regression normal equations:

   [ γ(0)    γ(1)    ...  γ(j-1) ] [ b1 ]   [ γ(1) ]
   [ γ(1)    γ(0)    ...  γ(j-2) ] [ b2 ] = [ γ(2) ]
   [  .       .      ...    .    ] [ .  ]   [  .   ]
   [ γ(j-1)  γ(j-2)  ...  γ(0)   ] [ bj ]   [ γ(j) ]

For each j, let πj = bj. (A new set of equations is needed for each j.) In an autoregression of order p, the coefficients π̂j estimate 0s for all j > p. As with autocorrelations, the theoretical partial autocorrelation sequence is useful for identifying the form of a time series model. The PACF is most useful for identifying AR processes because, for an AR(p), the PACF is 0 beyond lag p. For MA or mixed (ARMA) processes, the theoretical PACF does not become 0 after a fixed number of lags. You can solve the previous set of equations for the catalog of series. When you observe an estimated PACF, compare its behavior to the behavior shown next to choose a model. The following is a list of actual partial autocorrelations for Series 1-8:

                                                          Lag
   Series   Model                           1        2        3        4        5
   1        Yt = .8Yt-1 + et                0.8      0        0        0        0
   2        Yt = -.8Yt-1 + et              -0.8      0        0        0        0
   3        Yt = .3Yt-1 + .4Yt-2 + et       0.5      0.4      0        0        0
   4        Yt = .7Yt-1 - .49Yt-2 + et      0.4698  -0.4900   0        0        0
   5        Yt = et + .8et-1                0.4878  -0.3123   0.2215  -0.1652   0.1267
   6        Yt = et - .3et-1 - .4et-2      -0.144   -0.3480  -0.1304  -0.1634  -0.0944
   7        Yt = et                         0        0        0        0        0
   8        Yt = .6Yt-1 + et + .4et-1       0.7561  -0.2756   0.1087  -0.0434   0.0173

Plots of these values against lag number, with A used as a plot symbol for the ACF and P for the PACF, are given in Output 3.2. A list of actual autocorrelations for Series 1-8 follows:

                                                          Lag
   Series   Model                           1        2        3        4        5
   1        Yt = .8Yt-1 + et                0.8      0.64     0.512    0.410    0.328
   2        Yt = -.8Yt-1 + et              -0.8      0.64    -0.512    0.410   -0.328
   3        Yt = .3Yt-1 + .4Yt-2 + et       0.5      0.550    0.365    0.330    0.245
   4        Yt = .7Yt-1 - .49Yt-2 + et      0.4698  -0.161   -0.343   -0.161    0.055
   5        Yt = et + .8et-1                0.4878   0        0        0        0
   6        Yt = et - .3et-1 - .4et-2      -0.144   -0.32     0        0        0
   7        Yt = et                         0        0        0        0        0
   8        Yt = .6Yt-1 + et + .4et-1       0.7561   0.454    0.272    0.163    0.098
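The normal-equations definition of the PACF is easy to check numerically. Here is a PROC IML sketch (assuming SAS/IML is licensed) that solves the system for j = 1, ..., 5 using the Series 3 autocorrelations and picks off the last coefficient each time:

PROC IML;
   RHO = {0.5, 0.55, 0.365, 0.3295, 0.24485};   /* rho(1)-rho(5) for Series 3     */
   ACOV = 1 // RHO;                             /* gamma(0)-gamma(5), gamma(0)=1  */
   PACF = J(5, 1, .);
   DO J = 1 TO 5;
      GAM = TOEPLITZ(ACOV[1:J]`);   /* J x J matrix with gamma(0) on the diagonal */
      B = SOLVE(GAM, RHO[1:J]);     /* normal equations for lag J                 */
      PACF[J] = B[J];               /* pi(J) is the last coefficient              */
   END;
   PRINT (PACF`) [FORMAT=8.4];      /* 0.5 0.4 0 0 0, as in the table above       */
QUIT;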
Output 3.2 shows the plots.

Output 3.2 Plotting Actual Autocorrelations and Actual Partial Autocorrelations for Series 1-8
(plots of the theoretical ACFs and PACFs for the eight series; not reproduced here)

3.3.2.4 Estimated ACF

Begin the PROC ARIMA analysis by estimating the three functions defined above. Use these estimates to identify the form of the model. Define the estimated autocovariance C(j) as

   C(j) = Σ (Yt - Ȳ)(Yt+j - Ȳ)/n

where the summation is from 1 to n - j and Ȳ is the mean of the entire series. Define the estimated autocorrelation by

   r(j) = C(j)/C(0)

Compute standard errors for autocorrelations in PROC ARIMA as follows: for autocorrelation r(j), assign a variance (Σ r²(i))/n, where the summation runs from -j+1 to j-1. The standard error is the square root of this variance. This is the appropriate variance under the hypothesis that γ(i) = 0 for i ≥ j while γ(i) ≠ 0 for i < j.

The group of plots in Output 3.3 illustrates the actual (A) and estimated (E) ACFs for the series. Each data series contains 150 observations. The purpose of the plots is to indicate the amount of sampling error in the estimates.

Output 3.3 Plotting Actual and Estimated Autocorrelations for Series 1-8
(plots; not reproduced here)

3.3.2.5 Estimated PACF

The partial autocorrelations are defined in Section 3.3.2.3 as solutions to equations involving the covariances γ(j). To estimate these partial autocorrelations, substitute estimated covariances C(j) for the actual covariances and solve. For j large enough that the actual partial autocorrelation is 0 or nearly 0, an approximate standard error for the estimated partial autocorrelation π̂j is n^(-1/2).

The next group of plots, in Output 3.4, illustrates the actual (A) and estimated (E) PACFs for the series.

Output 3.4 Plotting Actual and Estimated Partial Autocorrelations for Series 1-8
(plots; not reproduced here)

3.3.2.6 IACF

The IACF of an ARMA(p,q) model is defined as the ACF of the ARMA(q,p) model you obtain if you switch sides with the MA and AR operators. Thus, the inverse autocorrelation of

   (1 - .8B)Yt = et

is defined as the ACF of

   Yt = (1 - .8B)et

that is, of Yt = et - .8et-1. In the catalog of Series 1-8, for example, the IACF of Series 3 is the same as the ACF of Series 6 and vice versa.

3.3.2.7 Estimated IACF

Suppose you know that a series comes from an AR(3) process. Fit an AR(3) model to obtain estimated coefficients, for example,

   Yt - µ = .300(Yt-1 - µ) + .340(Yt-2 - µ) - .120(Yt-3 - µ) + et

The inverse model is the moving average

   Yt - µ = et - .300et-1 - .340et-2 + .120et-3

The inverse autocovariances are estimated by (1 + .300² + .340² + .120²)σ² at lag 0, (-.300 + (.300)(.340) - (.340)(.120))σ² at lag 1, (-.340 - (.300)(.120))σ² at lag 2, and .120σ² at lag 3.
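These are just the MA(3) autocovariance formulas applied to the fitted coefficients, so they can be checked with a few DATA step assignments (a sketch, taking σ² = 1):

DATA IACF3;
   A1 = 0.300; A2 = 0.340; A3 = -0.120;   /* fitted AR(3) coefficients            */
   /* inverse MA(3): Yt - mu = et - A1*e(t-1) - A2*e(t-2) - A3*e(t-3)             */
   G0 = 1 + A1**2 + A2**2 + A3**2;        /* lag 0 inverse autocovariance         */
   G1 = -A1 + A1*A2 + A2*A3;              /* lag 1                                */
   G2 = -A2 + A1*A3;                      /* lag 2                                */
   G3 = -A3;                              /* lag 3                                */
   R1 = G1/G0; R2 = G2/G0; R3 = G3/G0;    /* inverse autocorrelations             */
RUN;

PROC PRINT DATA=IACF3; RUN;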
In general, you do not know the order p of the process, nor do you know the form (it may be MA or ARMA). Use the fact (see Section 3.3.1) that any invertible ARMA series can be represented as an infinite-order AR and therefore can be approximated by an AR(p) with p large. Set p to the minimum of the NLAG value and one-half the number of observations after differencing. Then do the following:

   1. Fit AR(p) to the data.
   2. Using the estimated coefficients, compute covariances for the corresponding MA series, as illustrated above for p = 3.
   3. Assign standard errors of n^(-1/2) to the resulting estimates.

3.3.3 Chi-Square Check of Residuals

In the identification stage, PROC ARIMA uses the autocorrelations to form a statistic whose approximate distribution is chi-square under the null hypothesis that the series is white noise. The test is the Ljung modification of the Box-Pierce Q statistic. Both Q statistics are described in Box, Jenkins, and Reinsel (1994) and the Ljung modification in Ljung and Box (1978, p. 297). The formula for this statistic is

   Q = n(n + 2) Σ(j=1 to k) r²(j)/(n - j)

where r(j) is the estimated autocorrelation at lag j and k can be any positive integer. In PROC ARIMA several ks are used. Later in the modeling stage, PROC ARIMA calculates the same statistic on the model residuals to test the hypothesis that they are white noise. The statistic is compared to critical values from a chi-square distribution. If your model is correct, the residuals should be white noise and the chi-square statistic should be small (the PROB value should be large). A significant chi-square statistic indicates that your model does not fit well.
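A sketch of the Q computation, with n = 150 and a hypothetical set of first six autocorrelations (the r values below are made up purely for illustration):

DATA QSTAT;
   ARRAY R{6} R1-R6 (0.30 -0.18 0.10 -0.05 0.04 0.02);  /* hypothetical r(1)-r(6) */
   N = 150;
   Q = 0;
   DO J = 1 TO 6;
      Q + R{J}**2/(N - J);             /* accumulate r(j)**2/(n-j)                */
   END;
   Q = N*(N + 2)*Q;                    /* Ljung-Box statistic                     */
   PROB = 1 - PROBCHI(Q, 6);           /* compare to chi-square with 6 df         */
RUN;

PROC PRINT DATA=QSTAT; RUN;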
3.3.4 Summary of Model Identification

At the identification stage, you compute the ACF, PACF, and IACF. Behavior of the estimated functions is the key to model identification. The behavior of the functions for different processes is summarized in the following table:

Table 3.1 Summary of Model Identification

           MA(q)    AR(p)    ARMA(p,q)    White noise
   ACF     D(q)     T        T            0
   PACF    T        D(p)     T            0
   IACF    T        D(p)     T            0

where

   D(q)  means the function drops off to 0 after lag q
   T     means the function tails off exponentially
   0     means the function is 0 at all nonzero lags.

3.4 Examples and Instructions

The following pages contain results for 150 observations generated from each of the eight sample series discussed earlier. Thus, the ACFs correspond to the Es in Output 3.3. Even with 150 observations, considerable variation occurs. To obtain all of the output shown for the first series Y1, use these SAS statements:

PROC ARIMA DATA=SERIES;
   IDENTIFY VAR=Y1 NLAG=10;
RUN;

The VAR= option is required. The NLAG= option gives the number of autocorrelations to be computed and defaults to 24. When you fit an ARIMA(p,d,q), NLAG+1 must be greater than p+d+q to obtain initial parameter estimates. For the ARMA(p,q) models discussed so far, d is 0. The following options can also be used:

   NOPRINT          suppresses printout. This is useful because you must use an IDENTIFY
                    statement prior to an ESTIMATE statement. If you have seen the output
                    on a previous run, you may want to suppress it with this option.
   CENTER           subtracts the series mean from each observation prior to the analysis.
   DATA=SASdataset  specifies the SAS data set to be analyzed (the default is the most
                    recently created SAS data set).

3.4.1 IDENTIFY Statement for Series 1-8

The following SAS statements, when used on the generated data, produce Output 3.5:

PROC ARIMA DATA=SERIES;
   IDENTIFY VAR=Y1 NLAG=10;
   IDENTIFY VAR=Y2 NLAG=10;
   more SAS statements
   IDENTIFY VAR=Y8 NLAG=10;
RUN;

Try to identify all eight of these series. The models are presented in Section 3.3.2.1, so you can check your diagnosis against the actual model. For example, look at Y6. First, observe that the calculated Q statistic is 17.03, which would be compared to a chi-square distribution with six degrees of freedom. The 5% critical value is 12.59, so you have significant evidence against the null hypothesis that the considered model is adequate. Because no model is specified, this Q statistic simply tests the hypothesis that the original data are white noise. The number 0.0092 is the area under the chi-square distribution to the right of the calculated 17.03. Because 0.0092 is less than .05, you see, without recourse to a chi-square table, that 17.03 is to the right of the 5% critical value. Either way, you decide that Y6 is not a white noise series. Contrast this with Y7, where the calculated statistic 2.85 has an area 0.8269 to its right; 2.85 is far to the left of the critical value and nowhere near significance. Therefore, you decide that Y7 is a white noise series.

A model is needed for Y6. The PACF and IACF are nonzero through several lags, which means that an AR diagnosis requires perhaps seven lags. A model with few parameters is preferable. The ACF is near 0 after two lags, indicating that you may choose an MA(2). Because an MA model has a persistently nonzero PACF and IACF, the MA(2) diagnosis seems appropriate. At this stage, you have identified the form of the model and can assign the remainder of the analysis to PROC ARIMA. You must identify the model because PROC ARIMA does not do it automatically. The generated series has 150 observations; note the width of the standard error bands on the autocorrelations. Even with 150 observations, reading fine detail from the ACF is unlikely. Your goal is to use these functions to limit your search to a few plausible models rather than to pinpoint one model at the identification stage.
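The chi-square comparisons quoted above are easy to reproduce with the SAS probability functions (a quick check, not part of the original program; in older releases CINV gives the same critical value as QUANTILE):

DATA CHECK;
   CRIT = QUANTILE('CHISQ', 0.95, 6);   /* 5% critical value: 12.59          */
   P_Y6 = 1 - PROBCHI(17.03, 6);        /* area to the right of 17.03: .0092 */
   P_Y7 = 1 - PROBCHI(2.85, 6);         /* area to the right of 2.85: .8269  */
RUN;

PROC PRINT DATA=CHECK; RUN;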
Output 3.5 Using the IDENTIFY Statement for Series 1-8: PROC ARIMA
(IDENTIFY listings for Y1 through Y8: for each series, the variable name, the mean and standard deviation of the working series, the number of observations (150), the autocorrelation, inverse autocorrelation, and partial autocorrelation plots with two-standard-error marks, and the autocorrelation check for white noise; not reproduced here.)
3.4.2 Example: Iron and Steel Export Analysis

The U.S. iron and steel export yearly series (Fairchild Publications 1981) graphed in Output 3.6 is a good illustration of model identification.

Output 3.6 Plotting a Yearly Series
(plot of iron and steel exports excluding scraps, weight in million tons, 1937-1980; not reproduced here)

The following statements produce the results in Output 3.7:

PROC ARIMA DATA=STEEL;
   IDENTIFY VAR=EXPORT NLAG=10;
RUN;

Output 3.7 Identifying a Model Using the IDENTIFY Statement
(IDENTIFY listing for EXPORT, 44 observations, with the ACF, IACF, PACF, and the autocorrelation check for white noise; not reproduced here)

Although the Q statistic fails by a slim margin to be significant, the lag 1 autocorrelation 0.47193 is beyond the two standard error bands. Thus, you want to fit a model despite the Q value. From the ACF, it appears that an MA(1) is appropriate. From the PACF and IACF, an AR(1) also appears consistent with these data. You can fit both and select the one with the smallest error mean square. To fit the MA(1) model, use the statement

ESTIMATE Q=1;

For the AR(1) model use the statement

ESTIMATE P=1;
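Putting the pieces together, here is a sketch of a run that fits both candidates after a single IDENTIFY statement (each ESTIMATE statement uses the most recent IDENTIFY), so the two error variance estimates can be compared side by side:

PROC ARIMA DATA=STEEL;
   IDENTIFY VAR=EXPORT NOPRINT;
   ESTIMATE Q=1;    /* the MA(1) candidate */
   ESTIMATE P=1;    /* the AR(1) candidate */
RUN;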
Suppose you overfit, using an MA(2) as an initial step. Specify these statements:

PROC ARIMA DATA=STEEL;
   IDENTIFY VAR=EXPORT NOPRINT;
   ESTIMATE Q=2;
RUN;

Any ESTIMATE statement must be preceded with an IDENTIFY statement. In this example, NOPRINT suppresses the printout of the ACF, IACF, and PACF. Note that the Q statistics in Output 3.8 are quite small, indicating a good fit for the MA(2) model. However, when you examine the parameter estimates and their t statistics, you see that more parameters were fit than necessary. An MA(1) model is appropriate because the t statistic for the lag 2 parameter is only 0.85. Also, it is wise to ignore the fact that the previous Q was insignificant, in light of the large t value, 3.60, associated with the lag 1 coefficient. In Output 3.7 the Q was calculated from six autocorrelations, and the large lag 1 autocorrelation's effect was diminished by the other five small autocorrelations.

Output 3.8 Fitting an MA(2) Model with the ESTIMATE Statement
(conditional least squares estimates MU 4.43400 (t = 11.33), MA1,1 0.56028 (t = 3.60), MA1,2 0.13242 (t = 0.85), with insignificant residual autocorrelation checks; not reproduced here.)
−=β ˆ that β − β ˆ 124 .4 = µ ˆ µ ˆ where and .)LM( doohilekil mumixam )SLU( serauqs tsael lanoitidnocnu )SLC( serauqs tsael lanoitidnoc are parameter estimates. Begin by assuming e0=0. ARIMA computations indicate provide the minimum for the iron export data. and from the data given below: Sum of squares 7 9 4 7 7.57 7.57 301.62 300.13 β+µ− ˆ ˆ − t t Y e t Y = )0( γ ˆ = )1( γ ˆ = = t = ˆ e Σ − Y t t e − − 96 SAS for Forecasting Time Series and thus 26.103 6ˆ 2ˆ = 2 e + . . . + 2 e + 21e ˆ but by how The larger gives a smaller sum of squares, so you would like to continue increasing much? Letting be the true value of the parameter, you can use Taylor's series expansion to write You need to compute the derivative Wt. Derivatives are defined as limits—for example, δ / ))β( e − ) δ + β( e( 0→iδl = )β( ˆ ˆ ˆ m t t t You have now computed et( .29) and et( .30), so you can approximate Wt by otherwise specified. Now, regressing et( .30) on Wt( .30) gives a coefficient ) ) 0 ( ) 75.7 ( This is an estimate of 03. β so you compute a new estimate of 7516. by 1 00. = δ .03.− =β ˆ − − ) 6 ( )19.1( − ) 6 ( )59.6 ( − ) 4 ( )28.6 ( − )21( ) 06.( ( 1 . 7513.0 0 =δ as in the third row of the table above, where − = ))0 ( + − − 2 )6( 1 . / − 0 + 2 )6 ( )) 03. + − = 7513.0 − 03.− −( e − )92. −( e( − 2 )4 ( t + 0β = β − 0β ˆ + ) 0 ( te 2 β Because is white noise, this looks like a regression equation that you can use to estimate β where –Wt is the derivative of et with respect to 3.2 and ignoring the remainder yields )0 ( and Rt is a remainder term. Rearranging equation and In PROC ARIMA, ,β ˆ ,03. − =β ˆ For example, using Perhaps you can improve upon et values to the previous list and thus compute .92. − =β ˆ t R + )β − 0β()β( W − )β( e = ) 0 β( ˆ ˆ ˆ t e ˆ Starting with e0=0, values of 21 are listed under the Yt values. Thus, 31.003 β t 84. e + )β − ˆ 68.6 = )21 −(92. − 3 − = 2ˆ e 6ˆ 2ˆ = 2 e + . . . + 2 e + 21e ˆ t − = )0(92. − 0 () ( tW ˆ = )84.(92. − 7 = 3ˆ e ββ 0 β t β ˆ ) 21( ( / 1Y = )β( ˆ W t = 1ˆ e you can add a second row of (3.2) .β − 0 β ˆ t t − e e unless Chapter 3: The General ARIMA Model 97 You can extend this method to higher-order and mixed processes. The technique used in PROC ARIMA is more sophisticated than the one given here, but it operates under the same principle. The METHOD=ULS technique more accurately computes prediction error variances and finite sample predictions than METHOD=CLS. METHOD=CLS assumes a constant variance and the same linear combination of past values as the optimum prediction. Also, when you specify METHOD=ML, the quantity to be minimized is not the sum of squares; instead, it is the negative log of the likelihood function. Although CLS, ULS, and ML should give similar results for reasonably large data sets, studies comparing the three methods indicate that ML is the most accurate. Initial values are computed from the Yule-Walker equations for the first round of the iterative procedure as in the example above. See also Section 2.2.1. 3.4.4 ESTIMATE Statement for Series 8 Finally, reexamine the generated series Y8, 1− te4. The following statements produce Output 3.10: PROC ARIMA IDENTIFY ESTIMATE ESTIMATE RUN; DATA=SERIES; VAR=Y8 NOPRINT; P=1 Q=1 PRINTALL GRID; P=2 Q=2; The PRINTALL option shows the iterations. Because the iterations stop when the changes in parameter estimates are small, you have no guarantee that the final parameter estimates have minimized the residual sum of squares (or maximized the likelihood). 
To check this, use the GRID option to evaluate the sum of squares (or likelihood) on a grid surrounding the final parameter estimates. Examine the grids in Output 3.10 and verify that the middle sum of squares, 164.77, is the smallest of the nine tabulated values. For example, increasing the AR estimate .52459 to .52959 and decreasing the MA estimate –.32122 to –.32622 increases the sum of squares from 164.77 to 164.79. A message associated with the last command indicates that the procedure could not find estimates that minimized the error sum of squares because excess lags are specified on both sides of the ARMA model. 8166. −=β ˆ β∆ ˆ estimation improvement technique until the changes appears to minimize the sum of squares at 271.153. + e= t 7516. 1− tY 6. −=β ˆ Using as an initial value, you can again compute an improvement. Continue iterating the become small. For this data set, 78.172 = )7516. −( 2 e Σ ˆ − t β This estimate of results in a lower sum of squares, t Y 4 100.0 9-E1 6177.461 slaudiseR derauqS fo muS 151000.0 158020.0 8-E287.4 noitcnuF evitcejbO ni egnahC evitaleR 406000.0 100.0 setamitsE ni egnahC evitaleR mumixaM 3 serauqS tsaeL lanoitidnoC snoitaretI atleD noitabrutreP evitavireD laciremuN tneiciffeoC adbmaL s'tdrauqraM eulaV noitcnuF evitcejbO noitcnuF evitcejbO noitaretI tsaL morf egnahC erauqS-R tneidarG fo eulaV etulosbA mumixaM eulaV airetirC etanretlA airetirC etanretlA eulaV airetirC eulaV gnippotS noitaretI airetirC noitanimreT detamitsE sretemaraP dohteM noitamitsE yrammuS noitazimitpO noitamitsE AMIRA 151000.0 458000.0 915500.0 43540.0 1 tirC R 9-E1 8-E1 7-E1 6-E1 10000.0 adbmaL 17631.017631.098631.014931.058271.0tnatsnoC 95425.0 96425.0 91525.0 16625.0 88535.0 1,1RA 22123.020123.089913.051413.071272.01,1AM 65782.026782.003882.005492.024273.0UM 77.461 77.461 77.461 87.461 61.561 ESS 4 3 2 1 0 noitaretI noitamitsE serauqS tsaeL lanoitidnoC 310711.1 58271.0tsE ecnairaV esioN etihW etamitsE mreT tnatsnoC 1 71272.0etamitsE setamitsE egarevA gnivoM laitinI 88535.0 etamitsE setamitsE evissergerotuA laitinI noitamitsE yranimilerP erudecorP AMIRA ehT 1 Output 3.10 Using the ESTIMATE Statement for Series 8: PROC ARIMA 98 SAS for Forecasting Time Series 87.461 87.461 87.461 22613.0- 77.461 77.461 77.461 22123.0- 87.461 87.461 87.461 22623.0- 65282.065782.065292.0)8y( UM )8y( 1,1AM :setamitsE raeN dirG no ecafruS ESS 430.0041.0400.0 510.0220.0110.0100.0 200.0 430.0 280.0 430.0 700.0 830.0 730.0760.0400.0 050.0150.0 010.0 930.0 950.0990.0 601.0 920.0300.0 340.0530.0400.0 950.0600.00299.0 3269.0 2099.0 8579.0 7127.0 qSihC > rP 82 22 61 01 4 FD 91.31 57.11 97.5 22.3 80.2 erauqS -ihC 03 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- slaudiseR fo kcehC noitalerrocotuA 000.1 096.0 450.0 1,1RA 096.0 000.1 610.0 1,1AM 450.0 610.0 000.1 UM 1,1RA 1,1AM UM retemaraP setamitsE retemaraP fo snoitalerroC .tnanimreted gol edulcni ton od CBS dna CIA * 051 slaudiseR fo rebmuN 2208.454 CBS 3077.544 CIA 427850.1 etamitsE rorrE dtS 598021.1 etamitsE ecnairaV 17631.0etamitsE tnatsnoC 1 1 0 gaL 1000.< 5300.0 7422.0 |t| > rP xorppA 93.5 79.222.1eulaV t 92790.0 01801.0 78532.0 rorrE dradnatS 95425.0 22123.065782.0etamitsE 1,1RA 1,1AM UM retemaraP noitamitsE serauqS tsaeL lanoitidnoC Output 3.10 Using the ESTIMATE Statement for Series 8: PROC ARIMA (continued) Chapter 3: The General ARIMA Model 99 .degrevnoc evah ton yam setamitsE 02 100.0 10000.0 7346.461 slaudiseR derauqS fo muS 660811.0 1563.441 942726.1 100.0 setamitsE ni egnahC evitaleR mumixaM 5 serauqS tsaeL 
Output 3.10 (continued)
(fitted ARMA(1,1) model for Y8: estimated mean -0.28756, autoregressive factor 1 - 0.52459B, moving average factor 1 + 0.32122B. The ARMA(2,2) run ends with the messages
WARNING: The model defined by the new estimates is unstable. The iteration process has been terminated.
WARNING: Estimates may not have converged.)

To understand the failure to converge, note that

   Yt = .6Yt-1 + et + .4et-1

implies that

   (1 - .6B)Yt = (1 + .4B)et

Now multiply this last equation on both sides by (1 - ϕB):

   (1 - ϕB)(1 - .6B)Yt = (1 - ϕB)(1 + .4B)et

that is,

   Yt = (.6 + ϕ)Yt-1 - .6ϕYt-2 + et + (.4 - ϕ)et-1 - .4ϕet-2

Every ϕ yields a different ARMA(2,2), each equivalent to the original Y8. Thus, the procedure could not find one ARMA(2,2) model that seemed best. Although you sometimes overfit and test coefficients for significance to select a model (as illustrated with the iron and steel data), the example above shows that this method fails when you overfit on both sides of the ARMA equation at once. Notice that the fitted ARMA(2,2) model

   (1 - 1.525B + .525B²)(Yt - µ) = (1 - .6803B - .3197B²)et

is the same as

   (1 - B)(1 - .525B)(Yt - µ) = (1 - B)(1 + .3197B)et

or, eliminating the common factor (1 - B),

   (1 - .525B)(Yt - µ) = (1 + .3197B)et

The overfit example thus ended when the common factor neared an "unstable" value, the unit root factor (1 - B).

3.4.5 Nonstationary Series

The theory behind PROC ARIMA requires that a series be stationary. Theoretically, the stationarity of a series

   (1 - α1B - α2B² - ... - αpB^p)(Yt - µ) = et

hinges on the solutions M of the characteristic equation

   1 - α1M - α2M² - ... - αpM^p = 0

If all Ms that satisfy this equation have |M| > 1, the series is stationary. For example, the series

   (1 - 1.3B + .42B²)(Yt - µ) = et

is stationary, but the following series is not:

   (1 - 1.5B + .5B²)(Yt - µ) = et

The characteristic equation for the nonstationary example above is

   1 - 1.5M + .5M² = 0

with solutions M = 1 and M = 2. These solutions are called roots of the characteristic polynomial, and because one of them is 1, the series is nonstationary.
This unit root nonstationarity has several implications, which are explored below.

First, expanding the model gives

   Yt = 1.5Yt-1 - .5Yt-2 + µ(1 - 1.5 + .5) + et = 1.5Yt-1 - .5Yt-2 + et

which shows that µ drops out of the equation. As a result, series forecasts do not tend to return to the historic series mean. This is in contrast to stationary series, where µ is estimated and where forecasts always approach this estimated mean. In the nonstationary example, Yt is the series level and Wt = Yt - Yt-1 is the first difference, or change, in the series. By substitution,

   Wt = Yt - Yt-1 = .5(Yt-1 - Yt-2) + et = .5Wt-1 + et

so when the levels Yt satisfy an equation with a single unit root nonstationarity, the first differences Wt satisfy a stationary equation, often with mean 0. Similarly, you can eliminate a double unit root, as in

   (1 - 2B + B²)(Yt - µ) = (1 + .8B)et

by computing and then analyzing the second difference

   Wt = (Yt - Yt-1) - (Yt-1 - Yt-2) = Yt - 2Yt-1 + Yt-2

Because you do not know the model, how do you know when to difference? You decide by examining the ACF or performing a test as in Section 3.4.8. If the ACF dies off very slowly, a unit root is indicated. The slow dying off may occur after one or two substantial drops in the ACF. Note that the sequence 1, .50, .48, .49, .45, .51, .47, ... is considered to die off slowly in this context even though the initial drop from 1 to .5 is large and the magnitude of the autocorrelation is not near 1.

Using the IDENTIFY statement, you can accomplish differencing easily. The statement

IDENTIFY VAR=Y(1);

produces the correlation function for Wt, where Wt = Yt - Yt-1. A subsequent ESTIMATE statement operates on Wt, so the NOCONSTANT option is normally used. The statement

IDENTIFY VAR=Y(1,1);

specifies analysis of the second difference

   Wt = (Yt - Yt-1) - (Yt-1 - Yt-2)

The default is no differencing for the variables. Assuming a nonzero mean in the differenced data is equivalent to assuming a deterministic trend in the original data because

   (α + βt) - (α + β(t-1)) = β

You can fit this easily by omitting the NOCONSTANT option. The first and second differences are often written ∇Yt and ∇²Yt. For nonseasonal data, you rarely difference more than twice.

3.4.6 Effect of Differencing on Forecasts

PROC ARIMA provides forecasts and 95% upper and lower confidence bounds for predictions for the general ARIMA model. If you specify differencing, modeling is done on the differenced series, but predictions are given for the original series levels. Also, when you specify a model with differencing, prediction error variances increase without bound as you predict further into the future. In general, by using estimated parameters and by estimating σ² from the model residuals, you can easily derive the forecasts and their variances from the model. PROC ARIMA accomplishes this task for you automatically. For example, in the model

   Yt = 1.5Yt-1 - .5Yt-2 + et

note that

   Yt - Yt-1 = .5(Yt-1 - Yt-2) + et

Thus, the first differences Wt = Yt - Yt-1 are stationary. Given data Y1, Y2, ..., Yn from this series, you predict future values by first predicting future values of Wn+j, using .5ʲWn as the prediction. Now

   Yn+j = Yn + Wn+1 + Wn+2 + ... + Wn+j

so the forecast of Yn+j is

   Ŷn+j = Yn + Σ(i=1 to j) (.5)^i Wn

To illustrate further, the following computation of forecasts shows a few values of Yt, Wt, and predictions Ŷt:

                     Actual                 Forecast
   t          98      99    100(n)    101    102    103    104
   Yt        475     518     550
   Wt         28      43      32
   Ŷt                                  566    574    578    580

Note that Σ(i=1 to j) (.5)^i approaches 1 as j increases, so the forecasts converge to 550 + (1)(32) = 582.
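Here is a DATA step sketch that reproduces the forecast recursion and also accumulates the sums of squared ψ weights that govern the forecast-error variances discussed next (VARMULT is the multiplier of σ²; its second value is the 3.25 derived below):

DATA FCST;
   YHAT = 550;  W = 32;                /* Y(100) = 550 and W(100) = 32         */
   PSI = 1; PSILAG = 0; VARMULT = 0;
   DO LEAD = 1 TO 10;
      W = 0.5*W;                       /* predicted W(n+j) = .5**j * W(n)      */
      YHAT = YHAT + W;                 /* levels forecast: 566, 574, 578, ...  */
      VARMULT = VARMULT + PSI**2;      /* sum of squared psi weights           */
      OUTPUT;
      PSINEW = 1.5*PSI - 0.5*PSILAG;   /* psi recursion for 1 - 1.5B + .5B**2  */
      PSILAG = PSI;  PSI = PSINEW;
   END;
   KEEP LEAD YHAT VARMULT;
RUN;

PROC PRINT DATA=FCST; RUN;

The YHAT column converges to 582, and VARMULT grows without bound, illustrating the widening prediction limits for differenced models.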
Forecast errors can be computed from the forecast errors of the Ws. For example,

   Yn+2 = Yn + Wn+1 + Wn+2

and

   Ŷn+2 = Yn + (.5 + .25)Wn

Rewriting

   Wn+1 + Wn+2 = (.5Wn + en+1) + (.25Wn + .5en+1 + en+2)

yields the forecast error

   Yn+2 - Ŷn+2 = 1.5en+1 + en+2

with the variance 3.25σ².

3.4.7 Examples: Forecasting IBM Series and Silver Series

An example that obviously needs differencing is the IBM stock price series reported by Box and Jenkins (1976). In this example, the data are analyzed with PROC ARIMA and are forecast 15 periods ahead. Box and Jenkins report values of daily closing prices of IBM stock. You read in the series and check the ACF:

DATA IBM;
   INPUT PRICE @@;
   T+1;
CARDS;
data lines
;
RUN;

PROC ARIMA DATA=IBM;
   IDENTIFY VAR=PRICE CENTER NLAG=15;
   IDENTIFY VAR=PRICE(1) NLAG=15;
RUN;

The plot of the original data is shown in Output 3.11, and the IDENTIFY results in Output 3.12.

Output 3.11 Plotting the Original Data
(plot of IBM daily stock prices, Series B, 17 May 1961 to 2 November 1962; not reproduced here)
Output 3.12 Identifying the IBM Price Series
(IDENTIFY listings for PRICE, 369 observations, with autocorrelations near 1 that decline very slowly, and for the differenced series PRICE(1), 368 observations, with autocorrelations inside the two-standard-error marks; not reproduced here.)

The ACF dies off very slowly. The PACF indicates a very high coefficient, 0.99340, in the regression of Yt on Yt-1. The ACF of the differenced series looks like white noise.
In fact, the Q statistics 9.98 and 17.42 are not significant. For example, the probability of a value larger than 9.98 in a chi-square distribution with 6 degrees of freedom is .126, so 9.98 is to the left of the critical value and, therefore, is not significant. The Q statistics are computed with the first six (9.98) and first twelve (17.42) autocorrelations of the differenced series. With a first difference, it is common to find an indication of a lag 1 MA term. The first autocorrelation is -0.08558 with a standard error of about 1/(368)^(1/2) = .052.

Next, suppress the printout with the IDENTIFY statement (you have already looked at it but still want PROC ARIMA to compute initial estimates) and estimate the model:

PROC ARIMA DATA=IBM;
   IDENTIFY VAR=PRICE(1) NOPRINT;
   ESTIMATE Q=1 NOCONSTANT;
RUN;

The results are shown in Output 3.13.

Output 3.13 Analyzing Daily Series with the ESTIMATE Statement: PROC ARIMA
(conditional least squares estimate MA1,1 -0.08658 with standard error 0.05203 and t value -1.66; residual autocorrelation checks; model for PRICE with period of differencing 1, no mean term, and moving average factor 1 + 0.08658B; not reproduced here.)
PROC ARIMA, however, unlike PROC FORECAST with METHOD=EXPO, estimates the parameters from the data. Dickey and Fuller (1979) give a formal test of the null hypothesis that an AR series has a unit root nonstationarity versus the alternative that it is stationary. Said and Dickey (1984) extend the test to ARIMA models. The test involves a regression of where p is at least as large as the order of the AR process or, in the case of the mixed process, is large enough to give a good is called because it does not have approximation to the model. The t test on µτ p − tY 7583.383 0723.873 2665.173 7267.614 2827.414 7816.214 stimiL ecnedifnoC %59 26VON2 OT 16YAM71 SECIRP KCOTS YLIAD MBI B SEIRES 7183.133 3044.633 2102.343 7400.892 2930.003 7841.203 1 )1(**B 85680.0 + 1 :1 rotcaF .ledom siht ni mret naem oN ECIRP elbairav rof stsaceroF ECIRP elbairav rof ledoM srotcaF egarevA gnivoM gnicnereffiD fo )s(doireP rorrE dtS 6662.31 6586.01 1632.7 0692.03 9752.92 6181.82 erudecorP AMIRA ehT Output 3.14 Forecasting Daily Series: PROC ARIMA ∇ , . . . , 1− tY ]senil tuptuo erom[ + ) 3− Y − tsaceroF 7383.753 7383.753 7383.753 7383.753 7383.753 7383.753 Y − 1− Y t ∇ t 2 − tY( ,Y − 1−  Y 2 β + 2 − Yβ + 1− Y ) β − 1( + e 2 β + ) 2− Y − + sbO 273 173 073 483 383 283 t Y 3 − tY no 2 t β+ t ) 1− tY 2 − tY 1− tY( − t β+ β + ) 1− Y − t Y = tY β− 1− tY() t e = 1− tY − t ∇ β − 1( = erehw ( t t Y( = = t t t Y Y ˆ Y t e t Y∇ Chapter 3: The General ARIMA Model 113 a Student's t distribution and must be compared to tables provided by Fuller (1996, p. 642). The silver series from Chapter 2, “Simple Models: Autoregression,” is used as an illustration in the next section. 3.4.8 Models for Nonstationary Data You can formally test for unit root nonstationarity with careful modeling and special distributions. can be Any autoregressive model like the AR(2) model t Stationarity depends on the roots of the characteristic equation so if is a root, then So the term drops out of the model and forecasts do not revert to the mean. This discussion suggests a least squares regression of on and with an intercept and the use of the resulting coefficient or t test on the term as a test of the null hypothesis that the series has a unit root nonstationarity. If all roots M exceed 1 in magnitude, the coefficient of will be negative, suggesting a one-tailed test to the left if stationarity is the alternative. There is, however, one major problem with this idea: nor its t test has a standard distribution, even when the neither the estimated coefficient of sample size becomes very large. This does not mean the test cannot be done, but it does require the tabulation of a new distribution for the test statistics. Dickey and Fuller (1979, 1981) studied the distributions of estimators and t statistics in autoregressive models with unit roots. The leftmost column of the following tables shows the denotes a first difference. regressions they studied. Here ) 1= M ,0 = 2M 2 α − M1α − 1 µ − 1− Y t t e + ))1 − t(β − α − 1− tY (ρ = e + ) 2 − tY − 1− tY ( 2 α − ) µ − 1− tY () 2 α − 1α − 1( − = t written in terms of differences and the lagged level term ( becomes 1− t ). 
With a little algebra, the AR(2) e + )µ − 2− t Y( 2 α + ) µ − 1− Y( 1α = µ − t t e µ − 1− Y ( + β = Y∇ t e + ) µ − 1− tY (ρ = µ − mrof snoitaived ni )1(RA 1 = ρ : 0H t t e = Y∇ = Y∇ e t t e + 1− t t t Y t t Yρ = Y t Y∇ = 1− Y − Y ) ) µ − 1− Y ( µ − 1− Y ( t t t t t e + 1− tY )1 − ρ( + β + ) tβ + α()ρ − 1( = t β− α− t Y ) 2− tY t t e + 1− tY)1 − ρ( + µ)1 − ρ( = − 1− Y ( t Y .0 mrof noisserger ni )1(RA t = ) 2 α − 1α − 1( k − tY k − tY t :eseht no Y∇ e + 1− tY)1 − ρ( = k − tY ∇ 1− ∇ 1− Y ∇ 1− ∇ 1− Y ∇ 1− ∇ 1− Y 1− t Y t t Y Y− Y t Y t Y 1− Y , t ,1, ssergeR t 1− t t ,1, , Y− Y t t t t Y∇ Y∇ Y∇ t t t 114 SAS for Forecasting Time Series The lagged differences are referred to as “augmenting lags” and the tests as “Augmented DickeyFuller” or “ADF” tests. The three regression models allow for three kinds of trends. For illustration a lag 1 autoregressive model with autoregressive parameter is shown in the preceding table both in deviations form and in the algebraically equivalent regression form. The deviations form is most and if we have appropriate starting values, then the expected value instructive. It shows that if of is 0, , or depending on which model is assumed. Fit the first model only if you know the mean of your data is 0 (for example, might already be a difference of some observed variable). Use the third model if you suspect a regular trend up or down in your data. If you fit the third model when is really 0, your tests will be valid, but not as powerful as those from the second Note that for known parameters and n data points, the forecast of 2 would be However, if the L step ahead forecast is with forecast error variance so that the error variance increases without bound in this case. In both cases, the forecasts have a component that increases at the linear rate For the regression under discussion, the distributions for the coefficients of , 1, and t are all nonstandard. Tables of critical values and discussion of the theory are given in Fuller (1996). One very nice feature of these regressions is that the coefficients of the lagged differences have normal distributions in the limit. Thus a standard F test to see if a set of these lagged differences can be omitted is justified in large samples, as are the t statistics for the individual lagged difference coefficients. They converge to standard normal distributions. The coefficients of Y –1 and the associated t tests have distributions that differ among the three regressions and are nonstandard. Fortunately, however, the t test statistics have the same limit distributions no matter how many augmenting lags are used. As an example, stocks of silver on the New York Commodities Exchange were analyzed in Chapter 2 of this book. We reanalyze the data here using DEL to denote the difference, DELi for its ith lag, and LSILVER for the lagged level of silver. The WHERE PART=1; statement restricts analysis to the data used in the first edition. PROC REG DATA=SILVER; MODEL DEL=LSILVER DEL1 DEL2 DEL3 DEL4 /NOPRINT; TEST DEL2=0, DEL3=0, DEL4=0; WHERE PART=1; RUN; PROC REG DATA=SILVER; MODEL DEL=LSILVER DEL1; WHERE PART=1; RUN; t j Some output follows. First you have the result of the test statement for the model with four augmenting lags in Output 3.15. −t Y∇ , 2 σL 1− t Y ,) Lβ + nY ρ − 1(/ 2 σ increases, the forecast error variance approaches the variance of Y around the trend. 2 σ) 2 −L2ρ + + 2ρ + 1( 1< ρ for with forecast error variance 1=ρ L+n Y 1< ρ . β )n β − α − nY ( Lρ + )L + n(β + α β model. 
The parameter represents a trend slope when ρ t Y 1< ρ t β+ α β 1=ρ µ t Y and is called a “drift” when . . As L Chapter 3: The General ARIMA Model 115 Because this test involves only the lagged differences, the F distribution is justified in large samples. Although the sample size here is not particularly large, the p-value 0.2803 is not even close to 0.05, thus providing no evidence against leaving out all but the first augmenting lag. The second PROC REG produces Output 3.16. Output 3.16 PROC REG on Silver Data FD elbairaV Because the printed p-value 0.0079 is less than 0.05, the uninformed user might conclude that there is strong evidence against a unit root in favor of stationarity. This is an error because all p-values from PROC REG are computed from the t distribution whereas, under the null hypothesis of a unit root, this statistic has the distribution tabulated by Dickey and Fuller. The appropriate 5% left tail critical (Fuller 1996, p. 642), so the statistic is not far enough below 0 value of the limit distribution is to reject the unit root null hypothesis. Nonstationarity cannot be rejected. This test is also available in PROC ARIMA starting with Version 6 and can be obtained as follows PROC ARIMA DATA=SILVER; I VAR = SILVER STATIONARITY=(ADF=(1)) OUTCOV=ADF; RUN; 68.2 Output 3.17 contains several tests. stseT tooR tinU relluF-yekciD detnemguA F > rP F 4843.0 92.4 7911.0 68.3 uaT < rP uaT 7962.0 9860.0 0085.0 36.287.282.0- ohR < rP ohR 3831.0 1210.0 2326.0 2011.51- 1 5497.71- 1 1642.0- 1 sgaL dnerT naeM elgniS naeM oreZ epyT Output 3.17 Unit Root Tests, Silver Data |t| > rP 1000.< 9700.0 2800.0 F > rP 3082.0 LED elbairaV tnednepeD rof stluseR 1 tseT eulaV t 12.6 87.267.2 eulaV F 23.1 rorrE dradnatS 60801.0 61240.0 59363.72 erauqS naeM 08715.178 11791.2511 setamitsE retemaraP etamitsE retemaraP 51176.0 30711.037085.57 FD 14 3 rotanimoneD rotaremuN − 1 1 1 ecruoS 1LED REVLISL tpecretnI Output 3.15 Test of Augmenting Lags 116 SAS for Forecasting Time Series Every observed data point exceeds 400, so any test from a model that assumes a 0 mean can be ignored. Also, the PROC REG output strongly indicated that one lagged difference was required. Thus the tests with no lagged differences can also be ignored and are not requested here. The output in an shows coefficient (or “normalized bias”) unit root tests that would be computed as For the AR(2) model with roots and m, the regression model AR(1) model with coefficient form becomes )1 − ρ(n ˆ t so that the coefficient of is in terms of the roots. If it is seen that the coefficient of , 0.671152 in the silver example, is an estimate of m, so it is not surprising under that an adjustment using that statistic is required to get a test statistic that behaves like Specifically you divide the lag 1 coefficient by , then multiply by n. Similar adjustments can be made in higher-order processes. For the silver data, is shown in the printout and has a p-value (.0121) less than 0.05. However, based on simulated size and power results (Dickey 1984), the tau tests are preferable to these normalized bias tests. Furthermore, the adjustment for lagged differences is motivated by large sample theory and is not particularly large. The associated tau test, , has a p-value exceeding 0.05 and hence fails to provide significant evidence at the usual 0.05 level against the unit root null hypothesis. The F type statistics are discussed in Dickey and Fuller (1981). 
If interest lies there is no advantage to using the F statistics, which include restrictions only in inference about Simulations indicate that the polynomial deterministic on the intercept and trend as a part of trend should have as low a degree as is consistent with the data, in order to get good power. The 50 observations studied thus far do not display any noticeable trend, so the model with a constant mean seems reasonable, although tests based on the model with linear trend would be valid and would guard against any unrecognized linear trend. These tests are seen to provide even less evidence against the unit root. In summary, then, getting a test with validity and good statistical power requires appropriate decisions about the model, in terms of lags and trends. This is no surprise, as any statistical hypothesis test requires a realistic model for the data. )1 − ρ(n ˆ 87.2 )251176. − , 1=ρ t − 1( e + ) 2 − Y − 1− Y (ρm + ) µ − 1− Y ()ρ − 1()m − 1( − = 1− Y − Y e + ) 2 − tY − 1− tY ( 2 α − ) µ − 1− tY () 2 α − 1α − 1( − = t )430711.0 ( ρ t − t ) m − 1()ρ − 1( − . 0H 5497.71 − = )251176. − 1(/ )430711.0− (05 . ρ 05 = n 1− t 1− t Y− Y t Y ) 2− tY , ρ t t − 1− Y ( t .1 = ρ : 0H Chapter 3: The General ARIMA Model 117 The data analyzed here were used in the first edition of this book. Since then, more data on this series have been collected. The full set of data make it clear that the series is not stationary, in agreement with the tau statistic. In Output 3.18, the original series of 50 is plotted along with forecasts and confidence bands from an AR(2) that assumes stationarity in levels (solid lines), and an AR(1) fit to the differenced data (dashed lines). The more recent data are appended to the original 50. It is seen that for a few months into the forecast the series stays within the solid line bands, and it appears that the analyst who chooses stationarity is the better forecaster. He also has much tighter forecast bands. However, a little further ahead, the observations burst through his bands, never to return. The unit root forecast, though its bands may seem unpleasantly wide, does seem to give a more realistic assessment of the uncertainty inherent in this series. Output 3.18 Silver Series, Stationary and Nonstationary Models To illustrate the effects of trends, Output 3.19 shows the logarithm of the closing price of Amazon.com stock. The data were downloaded from the stock reports available through the Web search engine Yahoo! The closing prices are fairly tightly clustered around a linear trend as displayed in the top part of the figure. The ACF, IACF, and PACF of the series are displayed just below the series plot and those of the differenced series just below that. Notice that the ACF of the original series dies off very slowly. This could be due to a deterministic trend, a unit root, or both. The three plots along the bottom seem to indicate that differencing has reduced the series to stationarity. 118 SAS for Forecasting Time Series Output 3.19 Amazon Closing Prices Chapter 3: The General ARIMA Model 119 In contrast, Output 3.20 shows the volume of the same Amazon.com stocks. These too show a trend, but notice the IACF of the differenced series. If a series has a unit root on the moving average side, the IACF will die off slowly. This is in line with what you've learned about unit roots on the autoregressive side. 
For the model the dual model obtained by switching the backshift so that if is (near) 1 you expect the IACF to behave like operator to the AR side is the ACF of a (near) unit root process—that is, to die off slowly. Output 3.20 Amazon Volume ρ , 1− t eρ − te = , te = t Y t Y )B ρ − 1( 120 SAS for Forecasting Time Series This behavior is expected anytime is the difference of an originally stationary series. Chang and Dickey (1993) give a detailed proof of what happens to the IACF when such overdifferencing occurs. They find that an essentially linear descent in the IACF is consistent with overdifferencing. This can follow an initial drop-off, as appears to happen in the volume data. Notice that a linear trend is reduced to a constant by first differencing so such a trend will not affect the behavior of the IACF of the differenced series. Of course a linear trend in the data will make the ACF of the levels appear to die off very slowly, as is also apparent in the volume data. The apparent mixed message-differencing indicated by the levels’ ACF and too much differencing indicated by the differences’ IACF is not really so inconsistent. You just need to think a little outside the class of ARIMA models to models with time trends and ARIMA errors. Regression of differences on 1, t, a lagged level, and lagged differences indicated that no lagged differences were needed for the log transformed closing price series and two were needed for volume. Using the indicated models, the parameter estimates from PROC REG using the differenced series as a response, DATE as the time variable, LAGC and LAGV as the lag levels of closing price and volume, respectively, and lagged differences DV1 and DV2 for volume are shown in Output 3.21. Output 3.21 Closing Price and Volume— Unit Root Test t 10520.0 52225000.0 26420.0 SS I epyT 20561.4 51340.1 40296.52 94300.0 88510.0 SS I epyT |t| > rP 9900.0 1410.0 6410.0 |t| > rP 2000.0 6200.0 1000.< 1000.< 1000.< eulaV t 95.264.2 54.2- eulaV t 08.330.393.608.5 06.5- setamitsE retemaraP setamitsE retemaraP rorrE dradnatS 42110.0 27460000.0 34378.0 rorrE dradnatS 77340.0 52640.0 99430.0 81352000.0 09511.3 etamitsE retemaraP 01920.005951000.0 93931.2- etamitsE retemaraP 12661.069931.045322.074100.0 36434.71- Y FD 1 1 1 FD 1 1 1 1 1 CGAL etad tpecretnI 2VD 1VD VGAL etad tpecretnI elbairaV elbairaV Chapter 3: The General ARIMA Model 121 As before, these tests can be automated using the IDENTIFY statement in PROC ARIMA. For these examples, clearly only the linear trend tests are to be considered. Although power is gained by using a lower-order polynomial when it is consistent with the data, the assumption that the trend is simply a constant is clearly inappropriate here. (LAGC) for closing price and (LAGV) for The tau statistics (see Fuller 1996) are volume. Using the large n critical values at significance level 0.10, at 0.05, and at 0.01, it is seen that unit roots are rejected even at the 0.01 level for volume. Thus the volume series displays stationary fluctuations around a linear trend. There is not evidence for stationarity in closing prices even at the 0.10 level, so even though the series seems to hug the linear trend line pretty closely, the deviations cannot be distinguished from a unit root process whose variance grows without bound. An investment strategy based on an assumption of reversion of log transformed closing prices to the linear trend line does not seem to be supported here. 
That is not to refute the undeniable upward trend in the data—it comes out in the intercept or “drift” term (estimate 0.0068318) of the model for the differenced series. The model (computations not shown) is 1− te74540.0 69. have this positive drift term as their average, so it implies a positive change on The differences, average with each passing unit of time. A daily increase of 0.0068318 in the logarithm implies a multiplicative or 0.68% daily increase, which compounds to a -fold increase over the roughly 260 trading days in a year. This was a period of phenomenal growth for many such technology stocks, with this data going from about 3.5 to about 120 over two years’ time, roughly the predicted 36-fold increase. The top panel of Output 3.22 shows closing price forecasts and intervals for the unit root with drift model (forecast rising almost linearly from the last observation and outermost bands) and for a model with stationary residuals from a linear trend (forecast converging to trend line and interior bands) for the log scale data. The plot below, in which each of these has been transformed back to the original scale by exponentiation, deserves some comments. First, note the strong effect of the logarithmic transformation. Any attempt to model on the original scale would have to account for the obviously unequal variation in the data and would require a somewhat complex trend function, whereas once logs are taken, a rather simple model, random walk with drift, seems to suffice. There is a fairly long string of values starting around January 1999 that are pretty far above the trend curve. Recall that this trend curve is simply an exponentiation of the linear trend on the log scale and hence approximates a median, not a mean. This 50% probability number, the median, may be a more easily understood number for an investment strategist than the mean in a highly skewed distribution such as this. Also note that the chosen model, random walk with drift, does not even use this curve, so a forecast beginning on February 1, 1999, for example, would emanate from the February 1, 1999, data point and follow a path approximately parallel to this trend line. The residuals from this trend line would not represent forecasting errors from either model. Even for the model that assumes stationary but strongly correlated errors, the forecast consists of the trend plus an adjustment based on the error correlation structure. 68600. 3− 93.6 14. − 3− + t e + 8138600.0 = 31.3 95.2 − − t Y∇ 1= 8138600. ,t 6= Y∇ 87.1 0 e e= )8138600.0 (062 e 122 SAS for Forecasting Time Series Output 3.22 Amazon Closing Price (two Models, two Scales) Chapter 3: The General ARIMA Model 123 In fact the plot actually contains forecasts throughout the historic series from both models but they overlay the data so closely as to be hardly distinguishable from it. Note also that the combination of logs and differencing, while it makes the transformed series behave nicely statistically, produces very wide forecast intervals on the original scale. While this may disappoint the analyst, it might nevertheless be a reasonable assessment of uncertainty, given that 95% confidence is required and that this is a volatile series. In summary, ignorance of unit roots and deterministic trends in time series can lead to clearly inappropriate mean reverting forecasts, while careful modeling of unit roots and deterministic trends can lead to quite reasonable and informative forecasts. 
Note that p-values produced under the assumption of stationarity can be quite misleading when unit roots are in fact present as shown in the silver and stock closing price examples. Both of these show inappropriately small p-values when the p-values are computed from the t rather than from the Dickey-Fuller distributions. In the regression of differences on trend terms, lagged level, and lagged differences, the usual (t and F) distributions are appropriate in large samples for inference on the lagged differences. To get tests with the proper behavior, carefully deciding on the number of lagged differences is important. Hall (1992) studies several methods and finds that overfitting lagged differences then testing to leave some out is a good method. This was illustrated in the silver example and was done for all examples here. Dickey, Bell, and Miller (1986) in their appendix show that the addition of seasonal dummy variables to a model does not change the large sample (limit) behavior of the unit root tests discussed here. Some practitioners are under the false impression that differencing is justified anytime data appear to have a trend. In fact, such differencing may or may not be appropriate. This is discussed next. 3.4.9 Differencing to Remove a Linear Trend Occasionally, practitioners difference data to remove a linear trend. Note that if Yt has a linear trend then the differenced series 1− t t involves only the constant. For example, suppose t where et is white noise. Then which does not have a trend but, unfortunately, is a noninvertible moving average. Thus, the data have been overdifferenced. Now the IACF of W looks like the ACF of a time series with a unit root nonstationarity; that is, the IACF of W dies off very slowly. You can detect overdifferencing this way. The linear trend plus white noise model presented above is interesting. The ACF of the original data dies off slowly because of the trend. You respond by differencing, and then the IACF of the differenced series indicates that you have overdifferenced. This mixed signaling by the diagnostic functions simply tells you that the data do not fit an ARMA model on the original levels scale or on the differences scale. You can obtain the correct analysis in this particular case by regressing Y on t using PROC REG or PROC GLM. The situation is different if the error series et is not white noise but is instead a nonstationary time series whose difference 1− te 1− te e+ − e+β= Y− Y= t t t β+α= β+α − e t t W W t t Y 124 SAS for Forecasting Time Series Output 3.23 Plotting the Original Series The ACF shown in Output 3.24 is obtained by specifying the following statements: PROC ARIMA DATA=WORKERS; IDENTIFY VAR=NONPUB(1) NLAG=10; TITLE 'PUBLISHING AND PRINTING NONPRODUCTION WORKERS'; TITLE2 '1944-1977'; RUN; Because the ACF looks like that of an MA(1) and because it is very common to fit an MA(1) term when a first difference is taken, you do that fitting by specifying these statements: PROC ARIMA DATA=WORKERS; IDENTIFY VAR=NONPUB(1) NOPRINT; ESTIMATE Q=1; FORECAST LEAD=10; RUN; . is stationary. In that case, a model in the differences is appropriate and has an intercept estimating This scenario seems to hold in the publishing and printing data that produce the plot (U.S. Bureau of Labor 1977) shown in Output 3.23. The data are the percentages of nonproduction workers in the industry over several years. β 551991.0 718791.0 890791.0 743591.0 978491.0 626491.0 719191.0 477191.0 582091.0 870471.0 0 rorrE dtS | . | | . **| | . |* | . 
|** | . |* | . *| | . ***| | . *| | . |** | .******| |********************| . . . . . . . . . . | | | | | | | | | | | 02600.26390.0 84860.74601.19450.23040.0 44131.0 90030.0 68690.61213.0 00000.1 noitalerroC 3736100.0807420.0 470810.0101820.0294410.0146010.0 196430.0 2249700.0 565520.0783280.0 039362.0 ecnairavoC 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA 1 33 147315.0 30303.0 1 gnicnereffid yb detanimile )s(noitavresbO snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM gnicnereffiD fo )s(doireP BUPNON = elbairaV fo emaN erudecorP AMIRA ehT 7791-4491 SREKROW NOITCUDORPNON GNITNIRP DNA GNIHSILBUP Output 3.24 Modeling and Forecasting with the IDENTIFY, ESTIMATE, and FORECAST Statements: PROC ARIMA The output shows a good fit based on the Q statistics the parameter estimates, and their t statistics . Note that the MU (0.3033) estimate is statistically significant and is roughly the slope in the plot of the data. Also, the MA coefficient is not near 1; in fact, it is a negative number. Thus, you have little evidence of overdifferencing. With only 33 observations, you have a lot of sampling variability (for example, look at the two standard error marks on the ACF). The number 0.3033 is sometimes called drift. Chapter 3: The General ARIMA Model 125 550.0- 040.0 131.0 030.0 790.0- 213.0 6175.0 qSihC > rP 6 FD 97.4 erauqS -ihC 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA | | | | | | | | | | . |** . **| . |* . |** . |* . | . *| . ***| . |**** .******| . . . . . . . . . . | | | | | | | | | | 85001.074221.0 83720.003690.089830.079900.091250.0 37551.0 82512.061213.0 noitalerroC 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP | | | | | | | | | | . **| . . |*** . . ***| . . |* . . *| . . | . . |* . . |** . . *****| . . 
|********* | | | | | | | | | | 59670.0 94651.072841.0 95340.079230.0 68910.0 20160.061180.066622.0 44934.0noitalerroC 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI Output 3.24 Modeling and Forecasting with the IDENTIFY, ESTIMATE, and FORECAST Statements: PROC ARIMA (continued) 126 SAS for Forecasting Time Series 1 3303.0 gnicnereffiD fo )s(doireP naeM detamitsE BUPNON elbairaV rof ledoM 750.0 580.0 750.0630.0321.0720.0270.0 230.0300.0801.0390.0911.0 760.0 571.0751.0 020.0 320.0100.0 411.0980.0700.0 460.0 450.0330.09099.0 6779.0 4579.0 9169.0 qSihC > rP 32 71 11 5 FD 70.01 14.7 08.3 10.1 erauqS -ihC 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- slaudiseR fo kcehC noitalerrocotuA 000.1 600.0 1,1AM 600.0 000.1 UM 1,1AM UM retemaraP setamitsE retemaraP fo snoitalerroC .tnanimreted gol edulcni ton od CBS dna CIA * 33 slaudiseR fo rebmuN 2762.15 CBS 91472.84 CIA 482884.0 etamitsE rorrE dtS 224832.0 etamitsE ecnairaV 3303.0 etamitsE tnatsnoC 1 0 gaL 0700.0 3910.0 |t| > rP xorppA 98.274.2 eulaV t 84161.0 49221.0 rorrE dradnatS 62664.003303.0 etamitsE 1,1AM UM retemaraP noitamitsE serauqS tsaeL lanoitidnoC erudecorP AMIRA ehT 7791-4491 SREKROW NOITCUDORPNON GNITNIRP DNA GNIHSILBUP Output 3.24 Modeling and Forecasting with the IDENTIFY, ESTIMATE, and FORECAST Statements: PROC ARIMA (continued) Chapter 3: The General ARIMA Model 127 128 SAS for Forecasting Time Series Output 3.24 Modeling and Forecasting with the IDENTIFY, ESTIMATE, and FORECAST Statements: PROC ARIMA (continued) srotcaF egarevA gnivoM 3.4.10 Other Identification Techniques In addition to the ACF, IACF, and PACF, three methods called ESACF, SCAN, and MINIC are available for simultaneously identifying both the autoregressive and moving average orders. These consist of tables with rows labeled AR 0, AR 1, etc. and columns MA 0, MA 1, etc. You look at the table entries to find the row and column whose labels give the correct p and q. Tsay and Tiao (1984, 1985) develop the ESACF and SCAN methods and show they even work when the autoregressive operator has roots on the unit circle, in which case rather than p is found. For ESACF and SCAN should give 3 as the autoregressive order. The key to showing their results is that standard estimation techniques give consistent estimators of the autoregressive operator coefficients even in the presence of unit roots. These methods can be understood through an ARMA(1,1) example. Suppose you have the where is the deviation from the mean at time t. The ARMA(1,1) process )1 is minimized. This is sometimes referred to as “performing on , or “projecting” onto the space spanned by a theoretical regression” of , It is accomplished by solving the matrix system of equations    kk    ) k (ρ   φ   1          )2(ρ  =  2 kφ   )2 − k (ρ      )1(ρ   1kφ   )1 − k (ρ t Z jk φ k − tZ , … , 2− t )2 − k (ρ )1 − k(ρ      1 )1(ρ   Z } ) k − tZ kk )1( 2 1− t ρ Z φ− t Z 1 − 2 − tZ 2 kφ − 1− tZ1kφ − k − tZ , . k − tZ , , 1− t … Z based on t . That is, you want to find coefficients for which t Z The partial autocorrelations are motivated by the problem of finding the best linear predictor of − j (ρα = ) j (ρ ] 2 2 )j ( ρ autocorrelations for . 
are 4016.05 7270.05 6025.94 3159.84 5063.84 9147.74 5580.74 3373.64 4565.54 6025.44 stimiL ecnedifnoC %59 ) α − β( + α − 1[/ ])βα − 1()β − α([ = )1(ρ ,1 = )0 (ρ Z , 1− eβ − e = 1− Z α − 1679.14 2709.14 6258.14 4518.14 6997.14 6118.14 3168.14 9669.14 3861.24 5606.24 )1(**B 62664.0 + 1 BUPNON elbairav rof stsaceroF d+p t rorrE dtS 7202.2 1380.2 2659.1 4028.1 7376.1 9215.1 7233.1 1421.1 6668.0 3884.0 :1 rotcaF t t tsaceroF 2392.64 9989.54 6686.54 3383.54 0080.54 7677.44 4374.44 1071.44 8668.34 5365.34 t t t e = ) 3− tY − 1− tY (7.0 − ) 2 − tY − Z sbO 44 34 24 14 04 93 83 73 63 53 … , 2− t 1> j Z Z ({E 1− t t Y( , and Z F T AM E Y L in the ARMA(1,1) example, you note that k Chapter 3: The General ARIMA Model 129 which does not in general equal Therefore Z – 11Z –1 is not and thus does not equal The autocorrelations of would not drop to 0 beyond the moving average order. Increasing k beyond 1 will not solve the problem. Still, it is clear that there is some linear combination of and namely whose autocorrelations theoretically identify the order of the moving average part of your model. In general neither the sequence nor any sequence contains the autoregressive coefficients unless the process is a pure autoregression. You are looking for a linear combination whose autocorrelation is 0 for j exceeding the moving average j or    1 − 121φ   α   1  11     φ − α  =  1  1 − 11φ 2  α   α  22C   11 − φ   11 =   φ 1  12C   1 , 11 φ 11 −  11 11  22  φ  )1 − 121φ( φ =  C  φ  12C   0 = )1(ρ Relating this to the Cs and noting that 11  φ      22    α  )1(ρ =  φ   1 1  12φ   )1(ρ  0  1 1   11φ )1( 1  ρ  1 j 2 φ )1( ρα = )2(ρ Noting that , the coefficients satisfy  22   11  C   φ−  12C   1 0   22φ  =    12φ  1 . 22 C11φ− = 22φ and In matrix form you have . 2 − tZ 22 φ+ 1 − tZ12φ =) 2 − tZ11 − 1− tZ ( 22C φ + 1 − tZ12C = 1 − t,1R 22C + 1 − tZ12 C That is, 22 Thus it must be that 1− t Z t Z , 22 C 12 C coefficients, say and which give the same fit, or projection, as regressing 1 − t,1 R 1− t Z t Z , 2−t Z 1− t Z linear combination of and so regressing on and produces regression on and , 2 − tZ11 φ− 1 − tZ = 1 − t,1R 1− t Z t Z The lagged residual from the theoretical regression of on C order q (1 in our example). The trick is to discover p and the s from the data. is which is a 1− t , 1− t Z11φ − Z Zα − Z t t .α , 1− t Z t Z .1− t eβ − te ,] ) 2 1− Z α − Z φ α − β( + α − 1[/ ])βα − 1()β − α([ = )1(ρ = 1π = 11φ j k φ 2 p −t Z p C − − 2− Z2C − 1− Z1C − Z t k π t t C + 12C = 12φ t t 1= k At π ,2 , 1= k kk φ = kπ Letting 3.3.2.3.) for … produces the sequence of partial autocorrelations. (See Section . 2− t t t Z 130 SAS for Forecasting Time Series You now “filter” Z using only that is, you compute which is just and this in turn is a moving average of order 1. Its lag 1 autocorrelation (it is nonzero) will appear in the AR 1 row and MA 0 column of the ESACF table. Let the residual from this regression be denoted . The next step is to regress on and In this regression, the theoretical . The lag 2 autocorrelation of which is 0, will be written in the MA 1 column of the AR 1 row. For the ESACF of a general ARMA in the AR p row, once your coefficients and the filtered series will be a MA(q), so its autocorrelations will be 0 beyond lag q. The entries in the AR k row of the ESACF table are computed as follows: t, j (and ) Second-round coefficients: Record in MA 0 column, the lag 1 autocorrelation of (3) Regress on 1 etc. 
Notice that at each step, you lag all residuals that were previously included as regressors and add the lag of the most recent residual to your regression. The estimated C coefficients and resulting filtered theoretically row p should series differ at each step. Looking down the ESACF table of an AR be the first row in which a string of 0s appears and it should start at the MA q column. Finding that row and the first 0 entry in it puts you in row p column q of the ESACF. The model is now identified. Here is a theoretical ESACF table for an ARMA(1,1) with “X” for nonzero numbers: ,) q ,p ( 0 0 0 0 X 5 AM 0 0 0 0 X 4 AM 2 + k,3C , 1+ k,3 (and Third-round coefficients: Record in MA 1 column the lag 2 autocorrelation of ) t, 3 R with residual t, 2 R + k,2 X 0 0 0 X 3 AM C C 1 − t,2 R , 2 − t,1R , k − tZ , , k2 C , 1 X X 0 0 X 2 AM − t,1R , k− tZ , , k3C , k−t k−t … Z 2C − − 2− Z 22C − 1− Z12C − Z Z 3C − − 2− Z 23C − 1− Z13C − Z … k , 12 k , 13 C C X X X *0 X 1 AM … , 2 − tZ , 1− t , 2 − tZ , 1− t t t X X X X X 0 AM Z Z t t Z (2) Regress on k1C , … , 21C , 11 C Coefficients: with residual 1 R k − tZ , , 2 − tZ , 1− t … Z t Z (1) Regress on with residual k C regression has at least q lagged residuals, the first p theoretical ) q ,p ( , 1− t Zα − Z t t, 2 R previously. Notice the use of the lagged value of 3 … −t Z φ− −Z = 1− t 11 Z t t Z coefficient of will again be but its estimate may differ somewhat from the one obtained and the second lag of the first round residual will be the p autoregressive , 1− t Zα − Z t , 1− tZ12 C − Z t .1− t,2 R , 2 − t,1 R , 1− t Z ;α = 12C ,α Z t 2 4 RA 3 RA 2 RA 1 RA 0 RA t t t 2 − t,1 t, 2 R R Chapter 3: The General ARIMA Model 131 The string of 0s slides to the right as the AR row number moves beyond p, so there appears a combination. triangular array of 0s whose “point” 0* is at the correct In practice, the theoretical regressions are replaced by least squares regressions, so the ESACF table will only have numbers near 0 where the theoretical ESACF table has 0s. A recursive algorithm is used to quickly compute the needed coefficients without having to compute so many actual regressions. PROC ARIMA will also use asymptotically valid standard errors based on Bartlett’s formula to deliver a table of approximate p-values for the ESACF entries and will suggest values of p and q as a tentative identification. See Tsay and Tiao (1984) for further details. Tsay and Tiao (1985) suggest a second table called SCAN. It is computed using canonical correlations. For the ARMA(1,1) model, recall that the autocovariances are 5− tY , , 1− tY , t The entries in square brackets form the submatrix of covariances between the vectors and That submatrix A, the variance matrix of and the variance of are matrix ,) 1− tY , tY ( then H is called Recall that, for a square matrix M, if a column vector H exists such that you an eigenvector and the scalar b is the corresponding eigenvalue of matrix M. Using . A CA C 22 1− ′ 11 1− of 0 eigenvalues of A is the same as the number of 0 eigenvalues of general time series covariance matrices. 2 A12−C A11−1C ′ 22 11 ;H0 = HA 1−C′A1−C ,′ )0 ,0 ( = HA see that so that is, has an eigenvalue 0. The number This is true for , ) ′ α − ,1( = H , Hb = HM .′ ) 3− tY , 2 − tY ( 2 R is analogous to a regression matrix and canonical correlations between ′ ) 1− tY , tY ( statistic. Its eigenvalues are called squared ,) 1− tY , tY ( 11 C . 
A12−2C A − 11C ′ error variance matrix 2 A12−C A11−1C ′ Because matrix represents the variance of ′ ) 3− tY , 2 − tY( 22 1− CA ′ ′ ) 3− tY , 2 − tY ( ′ ) 1− tY , tY ( The best linear predictor of based on is Y etc., so the covariance matrix of  )1( γ α 3   )1(γ α 4 with prediction the … ,)1( γ ,)0( γ   )0(γ   )1( γ  )1( γα   )1(γ 2 α   )0(γ  )1(γ 11C )1( γ )1( γ )1( γα )0( γ )1( γ )1( γ 3 2 )0(   )1( γ   = 22C = 11C  α 2 α  )1(γ = A 1 α  γ α α ) q ,p ( is )1( γ )1( γα )1( γα )0(γ )1(γ )1(γ 2 α )1( γ )1( γα )1( γα )0( γ )1( γ )1( γ 2 α 2×2 ,)1( γ ])1( γα[ )1( γ )1( γ ])1( γ [ )0( γ )1( γ 3 α = )4( γ 2 3 α α )1( γ α   4  )1( γ α  3 ])1( γ α[  2 =Γ ])1( γα[  )1( γ    )0( γ ,)1( γ .) 3− tY , 2 − tY ( ) 3− tY , 2 − tY ( 2 α = )3( γ ,)1( γα 22 C ) 1− tY , tY ( = )2( γ 132 SAS for Forecasting Time Series The fourth of these statements is easily seen. The linear combinations and its second lag have correlation 0 because each is an MA(1). The smallest canonical correlation is and and finding the pair with obtained by taking linear combinations of correlation closest to 0. Since there exist linear combinations in the two sets that are uncorrelated, the smallest canonical correlation must be 0. Again you have a method of finding a linear combination whose autocorrelation sequence is 0 beyond the moving average lag q. matrix whose upper-left element is in row Again there is a full-rank )1( γ 2 α  , column 1 of the original matrix. In this has all 0 elements, namely  0  0  0  0   α−   )1( γ α  =   0   1   )1(γ α ) α − ,1( = H 2 2 ,3 A corresponding eigenvectors. Similarly, using )1(γ )1( γ α  = α 2 3 H A 2 ,3 3, 3 A showing that matrix has (at least) 2 eigenvalues that are 0 with the columns of H being the and   0   α− 0 =  1    0 0 H 3A 3,  0   )1(γ α  α−  )1( γα  1   )1( γ 2 )1( γ α )1( γ α )1( γα 2×3 2 3  )1( γ α  )1( γ α  = 3 4  H A 3,3 . 3,3 A is matrix H for which 3×3 2, 2 A notation, the A with square bracketed elements is denoted and the bottom left matrix of m ,j A , , 2 − tY , 1− tY , t Y In general, construct an arbitrarily large covariance matrix of 1− t Yα − … t Y ) 3− tY , 2 − tY ( 1+ j ) 1− tY , tY ( ) 3− tY , 2 − tY ( ) 1− tY , tY ( (4) The vectors 2 A12−C A11−1C and ′ 2×2 (3) The 2×2 (2) The 2×2 (1) The matrix A is not of full rank (its rank is 1). matrix A has at least one eigenvalue 0. matrix has at least one eigenvalue 0. have at least one squared canonical correlation that is 0. α The matrix A has first column that is times the second, which implies these equivalent statements: 3− t Yα − 2− Y m×m t and let be the Γ Chapter 3: The General ARIMA Model 133 with and have at least one 0 eigenvalue. This provides the key to the SCAN table. If row, column entry is the smallest canonical correlation derived you make a table whose from you have this table for the current example: where the Xs represent nonzero numbers. Relabeling the rows and columns with and gives the SCAN (smallest canonical correlation) table. It has a rectangular array of 0s whose upper-left corner is at the p and q corresponding to the correct model, ARMA(1,1) for the current example. The first column of the SCAN table consists of the autocorrelations and the first row consists of the partial autocorrelations. 
) q ,p ( is zero for Therefore, in the variance covariance matrix for such a process, any submatrix with whose upper-left element is at row j, column 1 of the original matrix will have at least one 0 eigenvalue with eigenvector if Hence 0 will appear in the theoretical table whenever and . Approximate standard errors are obtained by applying Bartlett's formula to the series filtered by the autoregressive coefficients, which in turn can be extracted from the H matrix (eigenvectors). An asymptotically valid test, again making use of Bartlett's formula, is available and PROC ARIMA displays a table of the resulting p-values. p >m .q >j m×m − j ( γ p α − − )2 − j ( γ 2 α − )1 − j ( γ1α − ) j ( γ )p .q >j ′ )0 , … ,0 ,0 , p ,i α− , α coefficients are you notice from the Yule-Walker equations that Γ In PROC ARIMA, entries of the variance-covariance matrix estimated autocovariances. To see why the 0s appear for an ARMA above would be replaced by whose autoregressive m, j A 1− j = q ) q ,p ( have at least one 0 eigenvalue for this example. For general ARIMA htj models, all 1> j m, j A . 1> j 2, j 4=m 3=p A X X 0 0 0 0 0 0 3=m 2=p 6×6 X X 0 0 0 0 0 0 htm 2=m 1= p X X 0 0 0 0 0 0 p >m 1= m 0=p q>j p >m α− ,1α− ,1( X X X X X X X X … ,2 , m, j q>j 1− m = p 4=j 3=j 2=j 1= j 3=q 2=q 1= q 0=q A 2, 3 1> m A so has (at least) one 0 eigenvalue, as does for all In fact all with and 134 SAS for Forecasting Time Series The MINIC method simply attempts to fit models over a grid of p and q choices, and records the SBC information criterion for each fit in a table. The Schwartz Bayesian Information Criterion is where p and q are the autoregressive and moving average orders of the candidate model and is an estimate of the innovations variance. Some sources refer to Schwartz’s criterion, perhaps normalized by n, as BIC. Here, the symbol SBC is used so that Schwartz’s criterion will not be confused with the BIC criterion of Sawa (1978). Sawa’s BIC, used as a model selection tool in PROC REG, is for a full regression model with n observations and k parameters. The MINIC technique chooses p and q giving the smallest SBC. It is possible, of course, that the fitting will fail due to singularities in which case the SBC is set to missing. The fitting of models in computing MINIC follows a clever algorithm suggested by Hannan and Rissanen (1982) using ideas dating back to Durbin (1960). First, using the Yule-Walker equations, a long autoregressive model is fit to the data. For the ARMA(1,1) example of this section it is seen that t truncated version of this infinite autoregression will approximate the process well. To the extent that this is true, the Yule-Walker equations for a length k (k large) autoregression can be solved to of the coefficients of the terms and a residual series give estimates, say t selected by minimizing the AIC criterion. PROC I I I RUN; ARIMA VAR=Y VAR=Y VAR=Y DATA=A; NLAG=1 MINIC P=(0:5) Q=(0:5); NLAG=1 ESACF P=(0:5) Q=(0:5); NLAG=1 SCAN P=(0:5) Q=(0:5); 4. = β 8. = α To illustrate, 1000 observations on an ARMA(1,1) with analyzed. The following code generates Output 3.25: and are generated and t e ˆ squares for this regression, pick p and q to minimize the SBC criterion The length of the autoregressive model for the .)n (nl )q n /1 qp ˆ 2 σ . q − tˆ , e , 2 − tˆ , 1− tˆ , p − tY , e e , 1− tY t Y + p ( + ) qp σ(nl n = CBS ˆ 2 , q ,p of order regress on Letting e that is close to the actual series. 
Next, for a candidate model be times the error sum of series can be t e … j −t Y … k − tY k b ˆ − − 2 − Y 2ˆ − 1− Y1ˆ − b b b , jˆ t , 1< β and as long as the coefficients on lagged Y will die off quite quickly, indicating that a ] ) k −n ( 2 e+] n − k− n + 3− Y2β + 2− Yβ + 1− Y[)β − α( = Y n )2 t + k ([2 + ) 2 s (nl n t t ,)n (nl )q t 2 s + p ( + ) 2 s (nl n = CBS t t Y = t e ˆ 2 3 1 q 5 4 1 d+p FCASE )leveL ecnacifingiS %5( stseT noitceleS redrO evitatneT )q,d+p(AMRA 1588.0 2969.0 4789.0 3006.0 1594.0 0100.0 5 AM 0882.0 0857.0 4367.0 3975.0 3622.0 1000.0 4 AM 9873.0 5447.0 8959.0 3106.0 9213.0 1000.0 3 AM 1380.0 1000.0 1000.0 0465.0 5401.0 1000.0 2 AM 1000.0 1000.0 1000.0 1000.0 9841.0 1000.0 1 AM 1000.0 1000.0 1000.0 1000.0 1000.0 1000.0 0 AM 5 4 3 2 1 0 RA RA RA RA RA RA sgaL seulaV ytilibaborP FCASE 1500.05100.0 6000.0 7120.02420.07651.0 5 AM 8830.0 9110.0 6110.0 6810.0 7140.0 4812.0 4 AM 6330.0 2310.00200.0 4810.0 0630.05752.0 3 AM 3370.02491.0 2091.0 7910.0 4650.0 7043.0 2 AM 7771.0 3571.07532.03992.04150.04493.0 1 AM 6384.0 4704.07021.04754.06233.05505.0 0 AM 5 4 3 2 1 0 RA RA RA RA RA RA sgaL noitcnuF noitalerrocotuA elpmaS dednetxE 73840.0- = )0,3(CIB :eulaV elbaT muminiM )9(RA :ledom seires rorrE 79400.035110.029710.082420.082030.0778300.0 5 AM 67110.011710.066320.072030.056530.0751820.0 4 AM 7710.027320.045820.02530.03040.0353950.0 3 AM 16420.074030.073530.012140.020540.0165711.0 2 AM 54130.069630.086140.040440.035740.0205771.0 1 AM 33830.068340.073840.085930.08800.065482.0 0 AM 5 4 3 2 1 0 RA RA RA RA RA RA sgaL noiretirC noitamrofnI muminiM Output 3.25 ESACF, SCAN and MINIC Displays Chapter 3: The General ARIMA Model 135 136 SAS for Forecasting Time Series Output 3.25 ESACF, SCAN and MINIC Display (continued) The tentative order selections in ESACF and SCAN simply look at all triangles (rectangles) for which every element is insignificant at the specified level (0.05 by default). These are listed in descending order of size (below the tables), size being the number of elements in the triangle or rectangle. In our example ESACF (previous page) and SCAN (above) list the correct (1,1) order at a preliminary AR(9) model, to create the the top of the list. The MINIC criterion uses estimated white noise series, then selects as the order, this also being one choice given by the SCAN option. The second smallest SBC, occurs at the correct As a check on the relative merits of these methods, 50 ARMA(1,1) series each of length 500 are pairs obtained by choosing and from such generated for each of the 12 that This gives 600 series. For each, the ESACF, SCAN, and MINIC methods are used, the results are saved, and the estimated p and q are extracted for each method. The whole experiment is repeated with series of length 50. A final set of 600 runs for using gives the last three columns. Asterisks indicate the correct model. .)1,1( 05 = n }9. ,3. ,3. ,9. { = )q ,p ( 5 AM 5 AM 4000.0 2000.0 1000.0 0100.0 3000.0 9420.0 4006.0 0807.0 7789.0 5543.0 5395.0 0100.0 − 1− te3. − + t e+ 4 AM 4 AM 2000.0 2000.0 2000.0 2000.0 1100.0 3840.0 8786.0 1337.0 6676.0 0227.0 3153.0 1000.0 β 4 − tY 5. )leveL ecnacifingiS %5( stseT noitceleS redrO evitatneT )q,d+p(AMRA setamitsE noitalerroC lacinonaC derauqS seulaV ytilibaborP ]1[erauqS-ihC NACS = α t Y 3 AM 3 AM 1000.0 2000.0 1000.0 2000.0 8000.0 0760.0 1119.0 1366.0 1379.0 7676.0 7904.0 1000.0 ,35740. 
, )0 ,3( = ) q , p ( 9= k − 2 AM 2 AM 2000.0 2000.0 2000.0 2000.0 1200.0 0711.0 3147.0 5976.0 0766.0 1417.0 5491.0 1000.0 0 1 q 3 1 d+p NACS ) , 1 AM 1 AM 1000.0 0100.0 7000.0 3200.0 8100.0 3651.0 8558.0 2293.0 0824.0 9481.0 3622.0 1000.0 β α( 0 AM 0 AM 5000.0 8000.0 2000.0 0410.0 7430.0 7652.0 3394.0 1473.0 7646.0 2000.0 1000.0 1000.0 .β ≠ α sgaL sgaL 5 4 3 2 1 0 5 4 3 2 1 0 RA RA RA RA RA RA RA RA RA RA RA RA Chapter 3: The General ARIMA Model 137 It is reassuring that the methods almost never underestimate p or q when n is 500. For the ARMA(1,1) with parameters in this range, it appears that SCAN does slightly better than ESACF, with both being superior to MINIC. The SCAN and ESACF columns do not always add to 600 because, for some cases, no rectangle or triangle can be found with all elements insignificant. Because SCAN compares the smallest normalized squared canonical correlation to a distribution ( ) that is appropriate for a randomly selected one, it is also very conservative. By analogy, even if 5% of men exceed 6 feet in height, finding a random sample of 10 men whose shortest member exceeds 6 feet in height would be extremely rare. Thus the appearance of a significant bottom-rightcorner element in the SCAN table, which would imply no rectangle of insignificant values, happens rarely—not the 30 times you would expect from .03 006 0 0 0 0 0 5 0 0 0 0 * 0 *** 89 0 0 2 7 72 12 0 0 8 121 52 91 0 0 3 03 87 83 2 53 2 11 33 53 |>- 05=n )1,4(AMRA -- 05=n )1,1(AMRA -- 005=n )1,1(AMRA - rP 63 03 42 81 21 6 FD 88.673 81.692 13.512 50.991 45.571 30.911 erauqS -ihC 63 03 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA | | | | | | | | | | | | | | | . *| . . |** . . |********* . |* . . **| . . ***| . . *| . .****| . . **| . . **| . . | . . |* . . |* . . |***** *****************| . | | | | | | | | | | | | | | | 41130.0 32111.060364.034750.062311.0 83931.0 75460.0 89671.0 26480.0 19811.0 01600.025230.024450.022942.057268.0 noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP )deunitnoc( AMIRA CORP :tnemetatS YFITNEDI eht htiw FCA eht gnitupmoC 2.4 tuptuO 148 SAS for Forecasting Time Series | | | | | | | | | | | | | | | . | . . |* . . |****. ***********| . *****| . . |* . . |** . . |* . . |*** . . |*** . . |****. . |** . . |* . . |** . ********| . | | | | | | | | | | | | | | | 67700.049750.057602.072065.0 87932.0 34960.070711.070530.053351.025341.069612.071001.044260.040670.072293.0 noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP | | | | | | | | | | | | | | | . . . . . . . . . . . . . . . |* . **| . *| . |********* **| . *| . *| . |** . *| . | . *| . **| . |* . |** . |****. | | | | | | | | | | | | | | | 34640.035680.0 53860.0 72744.062870.0 17660.0 19860.0 24970.057070.0 38020.053720.0 51380.0 98720.098480.085422.0noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI 823812.0 948712.0 015112.0 478571.0 692961.0 472961.0 955861.0 508661.0 478951.0 213151.0 176341.0 638141.0 426141.0 467041.0 190321.0 0 rorrE dtS | . | . | . **| . | . ******| . | *************| . | . *****| . | . | . | . |** . | . |*** . | . |*****. | . |****** | . |*****. | . |*** . | . |* . | . **| . | ********| . 
|********************| | | | | | | | | | | | | | | | | 62420.20380.0 86992.0 59476.0 47372.0 04510.82980.33931.63372.15692.37272.74131.65440.45980.0 72293.0 00000.1 noitalerroC 615182.9594767.13 176.411 862.852 647.401 103298.5811361.43355513.35995.401954.311953.401087703.05887940.71456162.43 101.051 546.283 ecnairavoC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA )deunitnoc( AMIRA CORP :tnemetatS YFITNEDI eht htiw FCA eht gnitupmoC 2.4 tuptuO Chapter 4: The ARIMA Model: Introductory Applications 149 122541.0 384441.0 746341.0 688141.0 048141.0 923141.0 892141.0 436041.0 375041.0 201931.0 077831.0 234631.0 824631.0 773631.0 380631.0 0 rorrE dtS | . **| | . **| | . |** | . |** | . | | . *| | . | | . |* | . | | . |** | . |* | . |*** | . | | . | | . |* |********************| . . . . . . . . . . . . . . . | | | | | | | | | | | | | | | | 28570.0 99570.0 46080.35611.77810.0 54260.0 75510.0 70170.84120.0 24501.58940.28131.54500.04910.0 94640.00000.1 noitalerroC 327220.7 939830.7 715964.7198397.01804837.1 240587.5 289144.1 150385.6604989.1 278467.9469716.4598012.21671505.0041797.1 534603.4927926.29 ecnairavoC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA 31 45 434426.9 62907.121,1 gnicnereffid yb detanimile )s(noitavresbO snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM gnicnereffiD fo )s(doireP TCRTSNOC = elbairaV fo emaN 143.0 271.0244.0 622.0576.0 792.0- 041.0 841.0181.0 202.0472.0 372.0- 521.0610.0850.0170.0510.0131.0- 711.0220.0180.0420.0980.0540.0- 161.0190.0 421.0380.0 931.0090.0 422.0412.0 112.0003.0 372.0293.0 1000.< 1000.< 1000.< 1000.< 1000.< 4000.0 qSihC > rP 63 03 42 81 21 6 FD 32.171 60.731 07.421 66.39 44.67 36.42 erauqS -ihC 63 03 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA )deunitnoc( AMIRA CORP :tnemetatS YFITNEDI eht htiw FCA eht gnitupmoC 2.4 tuptuO 150 SAS for Forecasting Time Series 530.0660.0 042.0040.0 711.0501.0- 480.0 550.0 942.0090.0910.0 050.0- 830.0350.0 540.0 730.0260.0 231.0- 120.0 150.0 110.0670.0 610.0 500.0- 720.0281.0480.0 670.0 170.0910.0 710.0531.0 401.0 180.0120.0 640.0- 4318.0 4246.0 6917.0 3699.0 3889.0 9419.0 qSihC > rP 63 03 42 81 21 6 FD 83.82 36.62 95.91 79.5 07.3 50.2 erauqS -ihC 63 03 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA | | | | | | | | | | | | | | | . . . . . . . . . . . . . . . **| *| |** |*** | *| | |** | |** |* |*** | | |* . . . . . . . . . . . . . . . | | | | | | | | | | | | | | | 67401.0 63760.0 39701.056251.016410.0 28230.0 08110.082880.068900.0 16801.080360.019231.077300.082710.0 94640.0noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP | | | | | | | | | | | | | | | . |** . |** . **| .****| . **| . |* . |* . **| . *| . ***| . **| . **| . |* . |** . ***| . . . . . . . . . . . . . . . 
| | | | | | | | | | | | | | | 61670.016890.076380.0 93022.0 19690.0 58440.019330.057670.0 41550.0 78521.0 93211.0 98111.0 64130.043290.071131.0 noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI )deunitnoc( AMIRA CORP :tnemetatS YFITNEDI eht htiw FCA eht gnitupmoC 2.4 tuptuO Chapter 4: The ARIMA Model: Introductory Applications 151 7769.906 3243.695 2580.095 3232.774 7752.964 8419.864 7168.33 1024.23 4119.03 0006.345 0008.235 0005.925 97 87 77 ]senil tuptuo erom[ 9387.706 5490.216 7850.806 1614.145 5509.755 3147.965 9039.61 0428.31 0577.9 rorrE dtS 0006.475 0000.585 0009.885 tsaceroF 07 96 86 sbO stimiL ecnedifnoC %59 TCRTSNOC elbairaV rof stsaceroF .ledom siht ni mret naem oN 21,1 gnicnereffiD fo )s(doireP TCRTSNOC elbairaV rof ledoM 591.0870.0 970.0160.0902.0450.0050.0 110.0270.0 300.0390.0 980.0020.0 201.0 250.0 130.0 211.0 801.0 330.0840.0 531.0 640.0750.0 910.05968.0 5899.0 7799.0 0889.0 qSihC > rP 42 81 21 6 FD 94.61 22.5 26.2 39.0 erauqS -ihC 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- slaudiseR fo kcehC noitalerrocotuA 45 2764.993 2764.993 430577.9 3155.59 slaudiseR fo rebmuN CBS CIA etamitsE rorrE dtS etamitsE ecnairaV erudecorP AMIRA ehT SDNASUOHT NI SREKROW NOITCURTSNOC WEIVER NOITCURTSNOC 152 SAS for Forecasting Time Series Output 4.3 Forecasting Seasonal Data with the IDENTIFY, ESTIMATE, and FORECAST Statements: PROC ARIMA is known as the airline model. Its popularity started when Box and Jenkins (1976) used it to model sales of international airline tickets on a logarithmic scale. Output 4.4 shows plots of the original and log scale data from Box and Jenkins’s text. t e ) B1θ − 1( )B1θ − 1( = tY ) B − 1( )B − 1( 21 21 The model The results are shown in Output 4.3. To forecast the seasonal data, use the following statements: PROC ARIMA DATA=CONST; IDENTIFY VAR=CONSTRCT(1,12) NOPRINT; ESTIMATE NOCONSTANT METHOD=ML; FORECAST LEAD=12 INTERVAL=MONTH ID=DATE OUT=OUTF; RUN; Chapter 4: The ARIMA Model: Introductory Applications 153 Output 4.4 Plotting the Original and Log Transformed Box and Jenkins Airline Data 154 SAS for Forecasting Time Series Now analyze the logarithms, which have the more stable seasonal pattern, using these SAS statements: PROC ARIMA DATA=AIRLINE; IDENTIFY VAR=LAIR; IDENTIFY VAR=LAIR(1); TITLE 'SERIES G'; TITLE2 'INTERNATIONAL AIRLINES PASSENGERS'; RUN; The results are shown in Output 4.5. It is hard to detect seasonality in the ACF of the original series because all the values are so near 1. The slow decay is much more evident here than in the construction example. Once you take the first difference, you obtain the ACF . Looking at the seasonal lags (12,24), you see little decay, indicating you should consider a span 12 difference. To create the variable ) 31− t and its ACF, inverse autocorrelation function (IACF), and partial autocorrelation function (PACF), issue the following SAS statements: PROC ARIMA DATA=AIRLINE; IDENTIFY VAR=LAIR(1,12); The model is identified from the autocorrelations. Identification depends on pattern recognition in the plot of the ACF values against the lags. The nonzero ACF values are called spikes to draw to mind and the plots PROC ARIMA produces in the IDENTIFY stage. 
For the airline model, if the theoretical autocorrelations of the series 0 > 1θ t should have 1 gal ta e kips ) e v itag en( a e kips 21 g al e ht fo se bol e d is d e llac 31 dna 11 sgal ta se k ips ) e v iti sop d na( lauq e 21 gal ta e kips ) e v itag en( a .0 sn o italerro c gal re h to lla Y − 21− Y( − ) 1− Y − Y( = V e ) B 2 θ − 1( )B1θ − 1( = 21 t t t t t V ,0 > 2θ | | | | | | | | | | | | | | | . |* . .**| . ***| . . |******* ***| . . | . . | . . | . . | . . | . . | . . | . . *| . . | . . |********** | | | | | | | | | | | | | | | 62660.099670.0 61041.0 38143.044351.0 02800.081010.0 22910.0 07900.007010.0 73220.093210.067920.0 92300.078305.0noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI . . . . . . . . . . . . . . . 574453.0 657543.0 682533.0 830323.0 044013.0 197792.0 369482.0 377172.0 694752.0 275142.0 254322.0 321202.0 994571.0 819931.0 333380.0 0 rorrE dtS | . ************| | .*************| | **************| | ***************| | ***************| | ***************| | ***************| | ***************| | ***************| | ***************| | ****************| | ****************| | *****************| | ******************| | *******************| |********************| | | | | | | | | | | | | | | | | 63816.0 40366.0 05617.0 49167.0 30857.0 62447.0 56337.0 31727.0 06737.0 44657.0 09877.0 34808.0 08058.0 29898.0 07359.0 00000.1 noitalerroC 276911.0 913821.0 566831.0 954741.0 107641.0 630441.0 389141.0 227041.0 847241.0 593641.0 147051.0 554651.0 656461.0 869371.0 175481.0 035391.0 ecnairavoC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA 441 129934.0 671245.5 snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM RIAL = elbairaV fo emaN erudecorP AMIRA ehT SREGNESSAP SENILRIA LANOITANRETNI G SEIRES Output 4.5 Identifying the Logarithms with the IDENTIFY Statement: PROC ARIMA Chapter 4: The ARIMA Model: Introductory Applications 155 961251.0 272151.0 811941.0 850111.0 753801.0 485701.0 217601.0 300990.0 031890.0 280890.0 875790.0 738980.0 050880.0 798680.0 426380.0 0 rorrE dtS | . |** . | . |*** . | . ****| . | *****************| . | ****| . | . |** . | . |** . | . |******* | . |** . | . *| . | . |** . | . |****** | . |***. | . |**. | ****| . |********************| | | | | | | | | | | | | | | | | 00611.55931.90512.0 34148.0 58502.0 72901.95511.27633.69011.87520.0 79380.70223.77051.01021.57991.0 00000.1 noitalerroC 8703100.04375100.01524200.0 0784900.0 9023200.0 0232100.02303100.05697300.01152100.056092000.0 8649000.03136300.09996100.02453100.02252200.0 572110.0 ecnairavoC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA 1 341 381601.0 44900.0 1 gnicnereffid yb detanimile )s(noitavresbO snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM gnicnereffiD fo )s(doireP RIAL = elbairaV fo emaN 025.0 915.0 267.0 657.0 715.0 445.0 857.0 977.0 605.0 675.0 447.0 808.0 894.0 816.0 437.0 158.0 094.0 366.0 727.0 998.0 105.0 717.0 837.0 459.0 1000.< 1000.< 1000.< 1000.< qSihC > rP 42 81 21 6 FD 23.5871 49.1251 26.7511 73.836 erauqS -ihC 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA | . *| . | . |* . | . |********** | . |* . | .**| . | . *| . | ****| . | .**| . | . *| . | . *| . | .**| . | . | . | . *| . | . |**. | *******************| . 
[Output 4.5 (continued): partial and inverse autocorrelations of LAIR and of LAIR(1); then the listing for LAIR(1,12), with periods of differencing 1,12, mean of working series 0.000291, standard deviation 0.045673, and 131 observations (13 eliminated by differencing), its autocorrelations, with −0.34112 at lag 1, and a significant autocorrelation check for white noise (Pr < .0001); listing not reproduced]
[Output 4.5 (continued): partial and inverse autocorrelations of LAIR(1,12); listing not reproduced]

The pattern that follows represents the ACF of

Vt = (1 − θ1B)(1 − θ2B^12)et

or the IACF of

(1 − θ1B)(1 − θ2B^12)Vt = et

[Schematic plot: the theoretical pattern, with a spike at lag 1 and a spike at lag 12 flanked by side lobes at lags 11 and 13; not reproduced]

When you compare this pattern to the ACF of the LAIR(1,12) variable, you find reasonable agreement. If the signs of the parameters are changed, the spikes and side lobes have different signs but remain at the same lags. The spike and side lobes at the seasonal lag are characteristic of seasonal multiplicative models. Note that if the multiplicative factor is on the AR side, this pattern appears in the IACF instead of in the ACF. In that case, the IACF and PACF behave differently, and the IACF is easier to interpret. If the model is changed to

Vt − αVt−12 = (1 − θ1B)(1 − θ2B^12)et

the spike and side lobes are visible at the seasonal lag (for example, 12) and its multiples (24, 36, and so on), but the magnitudes of the spikes at the multiples decrease exponentially at rate α. If the decay is extremely slow, an additional seasonal difference is needed (α = 1). If the pattern appears in the IACF, the following model is indicated:

(1 − θ1B)(1 − θ2B^12)Vt = (1 − αB^12)et

The SAS code for the airline data is

PROC ARIMA DATA=AIRLINE;
   IDENTIFY VAR=LAIR(1,12) NOPRINT;
   ESTIMATE Q=(1)(12) NOCONSTANT;
   FORECAST LEAD=12 OUT=FORE ID=DATE INTERVAL=MONTH;
RUN;

PROC GPLOT DATA=FORE(FIRSTOBS=120);
   PLOT (LAIR FORECAST L95 U95)*DATE / OVERLAY HMINOR=0;
   SYMBOL1 V=A L=1 I=JOIN C=BLACK;
   SYMBOL2 V=F L=2 I=JOIN C=BLACK;
   SYMBOL3 V=L C=BLACK I=NONE;
   SYMBOL4 V=U C=BLACK I=NONE;
RUN;

DATA FORE;
   SET FORE;
   IF RESIDUAL NE .;
RUN;

PROC SPECTRA P WHITETEST DATA=FORE OUT=RESID;
   VAR RESIDUAL;
RUN;

PROC GPLOT DATA=RESID;
   PLOT P_01*FREQ / HMINOR=0;
   SYMBOL1 F=TRIPLEX V=* I=JOIN C=BLACK;
RUN;
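Because the model is fit to the logged series, the FORECAST, L95, and U95 variables in the OUT=FORE data set are on the log scale. If you back-transform before the DATA FORE step above (which deletes the future observations), a minimal sketch is as follows; plain exponentiation returns the median, not the mean, of the implied lognormal forecast distribution, and the new variable names are arbitrary:

DATA BACK;
   SET FORE;
   PASS    = EXP(FORECAST);  /* point forecast on the original scale */
   PASSL95 = EXP(L95);       /* lower 95% limit */
   PASSU95 = EXP(U95);       /* upper 95% limit */
RUN;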
The results are shown in Output 4.6 and Output 4.7.

Output 4.6 Fitting the Airline Model: PROC ARIMA

[Output 4.6: conditional least squares estimates for SERIES G (MA1,1 0.37727, t = 4.60; MA2,1 0.57236, t = 7.34), the fitted moving average factors (1 − 0.37727B)(1 − 0.57236B^12) with periods of differencing 1,12 and no mean term, an insignificant autocorrelation check of residuals, PROC SPECTRA's tests for white noise on the residuals (Fisher's kappa 3.655039; Bartlett's Kolmogorov-Smirnov statistic 0.089019, approximate p-value 0.6816), and forecasts for LAIR with standard errors and 95% confidence limits; listing not reproduced]

Output 4.7 Plotting the Forecasts and the Periodogram: PROC ARIMA and PROC SPECTRA

[Output 4.7: plots of the forecasts and of the residual periodogram; not reproduced]

PROC SPECTRA is also used to search for hidden periodicities in the airline residuals. No periodicities are indicated in the periodogram plot or in the white noise tests produced by PROC SPECTRA. Refer to Chapter 7, "Spectral Analysis," for more information on PROC SPECTRA.

4.2 Models with Explanatory Variables

Sometimes you can improve forecasts by relating the series of interest to other explanatory variables. Obviously, forecasting in such situations requires knowledge (or at least forecasts) of future values of those variables. The nature of the explanatory variables, and of the model relating them to the target series, determines the optimal forecasting method. Explanatory variables are addressed in Chapter 2, "Simple Models: Autoregression." There, they are deterministic, meaning that their future values are determined without error; seasonal indicator variables and time t are deterministic. Explanatory variables like interest rates and unemployment are not deterministic, because their future values are unknown. Chapter 2 assumes that the relationship between the target series Yt and the explanatory series X1t, X2t, . . .
, Xkt satisfies the usual regression model assumptions

Yt = β0 + β1X1t + . . . + βkXkt + et

where et is white noise. The Durbin-Watson statistic is used in Chapter 2 to detect departures from the assumptions on et. The following methods are appropriate when the Durbin-Watson statistic from PROC REG or PROC GLM shows significant autocorrelation. Recall that if the regression analysis from PROC REG or PROC GLM shows no autocorrelation and if known future values (as opposed to forecasts) are available for all Xs, you can forecast with appropriate prediction intervals by supplying future Xs and missing values (.) for future Ys and regressing Y on the Xs with the CLI option in the MODEL statement or the keywords U95= and L95= in the OUTPUT statement.

This chapter combines regression with time series errors to provide a richer class of forecasting models. Three cases are delineated below, presented in order of increasing complexity. Examples are included, and special cases are highlighted.

4.2.1 Case 1: Regression with Time Series Errors

The model is

Yt = β0 + β1X1t + β2X2t + . . . + βkXkt + Zt

where Zt is an ARIMA time series. This is a typical regression except that you allow for autocorrelation in the error term Zt. The Y series does not depend on lagged values of the Xs. If the error series is purely autoregressive of order p, the SAS code

PROC AUTOREG DATA=EXAMP;
   MODEL Y=X1 X2 X3 / NLAG=P;
RUN;

properly fits a model to k=3 explanatory variables. Because PROC ARIMA can do this and can also accommodate mixed models and differencing, it is used instead of PROC AUTOREG in the analyses below. In case 1, forecasts of Y and forecast intervals are produced whenever future values of the Xs are supplied. If these future Xs are user-supplied forecasts, the procedure cannot incorporate the uncertainty of those future Xs into the intervals around the forecasts of Y; thus, the Y forecast intervals are too narrow. Valid intervals are produced when you supply future values of deterministic Xs or when PROC ARIMA forecasts the Xs in a transfer function setting, as in cases 2 and 3.

4.2.2 Case 1A: Intervention

If one of the X variables is an indicator variable (each value 1 or 0), the modeling above is called intervention analysis. The reason for this term is that X usually changes from 0 to 1 during periods of expected change in the level of Y, such as strikes, power outages, and war. For example, suppose Y is the daily death rate from automobile accidents in the United States. Suppose that on day 50 the speed limit is reduced from 65 mph to 55 mph, and suppose you have another 100 days of data after this intervention. In that case, designate Xt as 0 before day 50 and as 1 on and following day 50. The model

Yt = β0 + β1Xt + Zt

explains Y in terms of two means (plus the error term). Before day 50 the mean is β0 + β1(0) = β0, and on and following day 50 the mean is β0 + β1(1) = β0 + β1. Thus, β1 is the effect of a lower speed limit, and its statistical significance can be judged based on the t test for H0: β1 = 0. If the model is fit by ordinary regression but the Zs are autocorrelated, this t test is not valid. Using PROC ARIMA to fit the model allows a valid test; supplying future values of the deterministic X produces forecasts with valid forecast intervals. The 1s and 0s can occur in any meaningful place in X. For example, if the speed limit reverts to 65 mph on day 70, you set X back to 0 starting on day 70.
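A minimal sketch of the indicator construction for the speed limit example; the data set and DAY variable names are assumptions:

DATA ACCIDENT;
   SET ACCIDENT;
   X = (DAY >= 50 AND DAY < 70);  /* 1 while the 55 mph limit is in force, 0 otherwise */
RUN;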
If a data point is considered an outlier, you can use an indicator variable that is 1 only for that data point in order to eliminate its influence on the ARMA parameter estimates. Deleting the point instead results in a missing value (.) in the series, and closing the gap with a DELETE statement makes the lags across the gap incorrect. You can avoid these problems with the indicator variable approach. PROC ARIMA also provides an outlier detection routine.

4.2.3 Case 2: Simple Transfer Function

In this case, the model is

Yt = β0 + β1Xt + Zt

where Xt and Zt are independent ARIMA processes. Because X is an ARIMA process, you can estimate a model for X in PROC ARIMA and use it to forecast future Xs. The algorithm allows you to compute forecast error variances for these future Xs, which are automatically incorporated later into the Y forecast intervals. First, however, you must identify a model and fit it to the Z series. You accomplish this by studying the ACF, IACF, and PACF of residuals from a regression of Y on X. In fact, you can accomplish this entire procedure within PROC ARIMA. Once you have identified and fit models for X and Z, you can produce forecasts and associated intervals easily.

You can use several explanatory variables, but for proper forecasting they should be independent of one another. If the explanatory variables contain arbitrary correlations, use the STATESPACE procedure, which takes advantage of these correlations to produce forecast intervals.

4.2.4 Case 3: General Transfer Function

In case 3, you allow the target series Yt to depend on current and past values of the explanatory variable X. The model is

Yt = α + Σ(j=0 to ∞) βjXt−j + Zt

where X and Z are independent ARIMA time series. Because it is impossible to fit an infinite number of unrestricted βjs to a finite data set, you restrict the βjs to have certain functional forms depending on only a few parameters. The appropriate form for a given data set is determined by an identification process for the βjs that is very similar to the usual identification process with the ACFs. Instead of inspecting autocorrelations, you inspect cross-correlations, but you are looking for the same patterns as in univariate ARIMA modeling. The βjs are called transfer function weights or impulse-response weights. You can use several explanatory Xs, but they should be independent of one another for proper forecasting and identification of the βjs. Even if you can identify the model properly, correlation among explanatory variables causes incorrect forecast intervals, because the procedure assumes independence when it computes forecast error variances. Because you need forecasts of explanatory variables to forecast the target series, it is crucial that X does not depend on past values of Y. Such a dependency is called feedback. Feedback puts you in a circular situation where you need forecasts of X to forecast Y and forecasts of Y to forecast X. You can use PROC STATESPACE to model a series with arbitrary forms of feedback and cross-correlated inputs. Strictly AR models, including feedback, can be fit by multiple regression as proved by Fuller (1996). A general approach to AR modeling by nonlinear regression is also given by Fuller (1986).
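In operator notation, restricting the βjs to a few parameters amounts to writing their generating polynomial as a ratio of low-order factors, which is the form the INPUT= option specifies later in this chapter. The ω and δ labels below are generic names, not the text's notation:

$$Y_t=\alpha+\sum_{j=0}^{\infty}\beta_j X_{t-j}+Z_t,\qquad
\sum_{j=0}^{\infty}\beta_j B^{j}=\frac{\omega(B)}{\delta(B)}\,B^{s}$$

where ω(B) and δ(B) are low-order polynomials and s is a pure delay (shift).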
4.2.5 Case 3A: Leading Indicators

Suppose in the model above you find that β0 = β1 = 0 and β2 ≠ 0. Then Y responds two periods later to movements in X. X is called a leading indicator for Y because its movements allow you to predict movements in Y two periods ahead. The lead of two periods is also called a shift or a pure delay in the response of Y to X. Such models are highly desirable for forecasting.

4.2.6 Case 3B: Intervention

You can use an indicator variable as input in case 3B, as was suggested in case 1A. However, you identify the pattern of the βjs differently than in case 3. In case 3, cross-correlations are the key to identifying the pattern, but in case 3B cross-correlations are virtually useless.

4.3 Methodology and Example

4.3.1 Case 1: Regression with Time Series Errors

In this example, a manufacturer of building supplies monitors sales (S) for one of his product lines in terms of disposable income (D), U.S. housing starts (H), and mortgage rates (M). The data are obtained quarterly. Plots of the four series are given in Output 4.8. The first task is to determine the differencing desired. Each series has a fairly slowly decaying ACF, and you decide to use a differenced series. Each first differenced series has an ACF consistent with the assumption of stationarity. The D series has differences that display a slight upward trend. This trend is not of concern unless you plan to model D; currently, you are using it just as an explanatory variable. The fact that you differenced all the series (including sales) implies an assumption about the error term. Your model in the original levels of the variables is

St = β0 + β1Dt + β2Ht + β3Mt + ηt

When you lag by 1, you get

St−1 = β0 + β1Dt−1 + β2Ht−1 + β3Mt−1 + ηt−1

When you subtract, you get

∇St = β1∇Dt + β2∇Ht + β3∇Mt + ∇ηt

Output 4.8 Plotting Building- and Manufacturing-Related Quarterly Data

[Output 4.8: plots of the four quarterly series; not reproduced]

Thus, differencing implies that ηt had a unit root nonstationarity, so the differenced error series ∇ηt is stationary. This assumption, unlike the assumptions about the explanatory series, is crucial. If you do not want to make this assumption, you can model the series in the original levels. Also, in the development above, you assume a simple intercept β0 that canceled out of the differenced model. If, in fact, a trend β0 + ψt is present, the differenced series has intercept ψ. If you had decided to fit the model in the original levels and to allow only AR error structures, PROC AUTOREG or Fuller's PROC NLIN method (1986) would have been an appropriate tool for the fitting.

Assuming differencing is appropriate, your next task is to output the residuals from regression and to choose a time series model for the error structure. To accomplish this in PROC ARIMA, you must modify your IDENTIFY and ESTIMATE statements. The IDENTIFY statement is used to call in all explanatory variables of interest and to declare the degree of differencing for each. The CROSSCOR= option accomplishes this goal. You specify the following SAS statements:

PROC ARIMA DATA=HOUSING;
   TITLE 'MODEL IN FIRST DIFFERENCES';
   IDENTIFY VAR=SALES(1) CROSSCOR=(MORT(1) DPIC(1) STARTS(1)) NOPRINT;
RUN;

The NOPRINT option eliminates the printing of the cross-correlation function. Because you assume a contemporaneous relationship between sales and the explanatory variables, you do not check the cross-correlation function for dependence of sales on lagged values of the explanatory variables.
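The (1) specifications in the VAR= and CROSSCOR= options request exactly the first differences derived above. If you wanted the differenced series as ordinary variables, a DATA step sketch using the DIF function would be:

DATA DIFFS;
   SET HOUSING;
   DSALES  = DIF(SALES);   /* DIF(x) returns x minus its lagged value */
   DDPIC   = DIF(DPIC);
   DSTARTS = DIF(STARTS);
   DMORT   = DIF(MORT);
RUN;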
If you want to check for lagged dependencies, you need to model the explanatory series to perform prewhitening; this is the only way you can get clear information from the cross-correlations. To run a regression of SALES(1) on MORT(1), DPIC(1), and STARTS(1), add the following statement to your PROC ARIMA code:

ESTIMATE INPUT=(MORT DPIC STARTS) PLOT METHOD=ML;
RUN;

The INPUT= option denotes which variables in the CROSSCOR= list are to be used in the regression. Specifying differencing in the INPUT= option is not allowed; the order of differencing in the CROSSCOR= list is the order used. The PLOT option creates and plots the ACF, IACF, and PACF of the residuals. The results are shown in Output 4.9.

Output 4.9 Using the INPUT= Option of the ESTIMATE Statement to Run a Regression: PROC ARIMA

[Output 4.9: maximum likelihood regression estimates (MU 170.03865, t = 0.94; MORT −151.07400, t = −1.35; DPIC −1.00212, t = −0.24; STARTS 4.93009, t = 12.37), correlations of the estimates, an autocorrelation check of residuals, and the residual autocorrelation plot, whose lag 1 value is −0.39698; listing not reproduced]
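As an aside, the prewhitening route mentioned at the start of this passage would look like the following minimal sketch. The input model orders P=(4) and Q=(3) are taken from Section 4.3.2; treating them as already known is the only assumption here:

PROC ARIMA DATA=HOUSING;
   IDENTIFY VAR=STARTS(1) NOPRINT;
   ESTIMATE P=(4) Q=(3) METHOD=ML NOCONSTANT;  /* model for the input series */
   IDENTIFY VAR=SALES(1) CROSSCOR=(STARTS(1)); /* cross-correlations now computed on prewhitened series */
RUN;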
[Output 4.9 (continued): inverse and partial autocorrelations of the regression residuals; listing not reproduced]

Output from the ESTIMATE statement for the sales data indicates that sales are positively related to housing starts but negatively related to mortgage rates and disposable personal income. In terms of significance, only the t statistic for housing starts exceeds 2. However, unless you fit the correct model, the t statistics are meaningless. The correct model includes specifying the error structure, which you have not yet done; for the moment, ignore these t statistics.

You may argue based on the chi-square checks that the residuals are not autocorrelated. However, because the first chi-square statistic uses six correlations, the influence of a reasonably large correlation at lag 1 may be lessened to such an extent by the other five small correlations that significance is lost. Look separately at the first few autocorrelations, and remember that differencing is often accompanied by an MA term. Thus, you fit a model to the error series and wait to judge the significance of your t statistics until all important variables (including lagged error values) have been incorporated into the model. You use the same procedure here as in regression settings, where you do not use the t statistic for a variable in a model with an important explanatory variable omitted.

Based on the ACF of the differenced series, you fit an MA(1) model to the errors. You interpret the ACF of the differenced series as having a nonzero value (−0.39698) at lag 1 and near-zero values at the other lags. Also, check the IACF to see if you have overdifferenced the series; if you have, the IACF dies off very slowly. Suppose you decide the IACF dies off rapidly enough and that you were correct to difference. Note that if

Yt = α + βXt + ηt

where Xt and ηt are unit root processes, regression of Y on X produces an inconsistent estimate of β. This makes it impossible for you to use the PLOT option in a model in the original levels of the series to determine if you should difference: residuals from the model may not resemble the true errors in the series because the estimate of β is inconsistent. Because the explanatory series seem to require differencing, you decide to model the SALES series in differences also and then to check for overdifferencing with the PLOT option. Overdifferencing also results in an MA coefficient that is an estimate of 1. The next step, then, is to fit the regression model with an MA error term. You can accomplish this in PROC ARIMA by replacing the ESTIMATE statement above with

ESTIMATE INPUT=(MORT DPIC STARTS) Q=1 METHOD=ML;
RUN;

The results are shown in Output 4.10.
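In symbols, the specification just submitted is the differenced regression with an MA(1) disturbance. This display is a restatement for clarity, with ψ the intercept that PROC ARIMA labels MU and θ the Q=1 parameter:

$$\nabla S_t=\psi+\beta_1\nabla M_t+\beta_2\nabla D_t+\beta_3\nabla H_t+(1-\theta B)e_t$$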
Output 4.10 Fitting the Regression Model with an MA Error Term: PROC ARIMA

[Output 4.10: maximum likelihood estimation for the model in first differences, including a warning that the estimates may not have converged; parameter estimates MU 91.38149 (t = 1.93), MA1,1 0.99973 (t = 0.03), MORT −202.26240 (t = −3.34), DPIC 0.89566 (t = 0.78), and STARTS 5.13054 (t = 18.27), with correlations of the estimates and 40 residuals; listing not reproduced]

You have used the generally more accurate maximum likelihood (ML) method of estimation on the differenced series. Remember that the IDENTIFY statement determines the degree of differencing used. You should note that the MA parameter estimate 0.99973 is not significant (p-value > .05). The calculated t statistics on the explanatory variables have changed from the values they had in the regression with no model for the error series. Also note that PROC AUTOREG, another SAS procedure for regression with time series errors, cannot be used here, because it does not allow for differencing (a problem that can be alleviated in the DATA step but could be very cumbersome for handling the forecasts and standard errors) and because it works only with AR error terms.

Something has happened here that can happen in practice and is worth noting. The moving average parameter estimate is almost 1. A moving average parameter of 1 is exactly what would be expected if the original regression model in series levels had a white noise error term. This in turn indicates that just an ordinary regression would suffice to fit the model, without any differencing being required.
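The algebra behind that last remark takes one line; this is a standard identity, added for clarity:

$$\eta_t=e_t\ \text{(white noise in levels)}\quad\Longrightarrow\quad \nabla\eta_t=e_t-e_{t-1}=(1-B)e_t$$

that is, an MA(1) error with parameter exactly 1, which is what the estimate 0.99973 is trying to report.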
Further inspection of the printout, however, reveals that this number may in fact not be a good estimate of the true moving average parameter, this coming from the message about estimates not converging. Decisions made on the basis of this number thus cannot be supported.

[Output 4.10 (continued): the autocorrelation check of residuals and the model summary for SALES, with moving average factor 1 − 0.99973B**(1), period of differencing 1, estimated intercept 91.38149, and the overall regression factors for the inputs MORT, DPIC, and STARTS; listing not reproduced]

It is worth noting that since the first edition of this book, in which the example first appeared, some relevant developments have taken place. If a regression model with stationary errors is appropriate for data in which the variables themselves appear to be nonstationary, then these errors are a stationary linear combination of nonstationary variables. The variables, independent and dependent, are then said to be cointegrated. Tests for cointegration are available in PROC VARMAX, discussed in Chapter 5, Section 5.2. It will be seen that elimination of some seemingly unimportant input variables in the example results in a model that does not show this problem, and this is the route taken here. However, a test for cointegration could also be used to make a more informed decision as to whether the differencing was appropriate. Any model that can be fit in PROC AUTOREG can also be fit in PROC ARIMA, which makes PROC ARIMA more generally applicable than PROC AUTOREG. The only advantages of PROC AUTOREG in this setting are its automatic selection of an AR model and, starting with Version 8 of SAS, its ability to handle strings of missing data.

A final modeling step is to delete insignificant explanatory variables. Do not calculate SALES forecasts based on forecasts of unrelated series; if you do, the forecast error variance is unnecessarily large, because the forecast then responds to fluctuations in irrelevant variables. Is it acceptable to eliminate simultaneously all variables with insignificant t statistics? No, it is not. Eliminating a single insignificant regressor, like DPIC, can change the t statistics on all remaining parameters. In the example above, DPIC drifts upward along with SALES. A nonzero MU in the differenced model also corresponds to drift in the original levels. The t statistic on MU is currently insignificant because DPIC takes over as the explainer of drift if MU is removed; similarly, MU takes over if DPIC is removed. However, if you remove both terms from the model, the fit deteriorates significantly. DPIC and MU have the lowest t statistics. Remove DPIC and leave MU in the model, because MU is much easier to forecast than DPIC. When DPIC is removed from the INPUT= list in your ESTIMATE statement, what happens to the t test for MU? Omitting the insignificant DPIC results in a t statistic of 3.86 (not shown) on MU.
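A minimal sketch of that reduction step; it is the ESTIMATE statement above with DPIC dropped from the INPUT= list, and nothing else changed:

ESTIMATE INPUT=(MORT STARTS) Q=1 METHOD=ML;
RUN;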
Also note that the other t statistics change but that the mortgage rates are still not statistically significant. Removing the mortgage rates from the INPUT= list results in a fairly simple model. Review the progression of your modeling thus far:

You noticed that the inputs and the dependent variable SALES were nonstationary.
You checked the residuals from a regression of differenced SALES on differenced DPIC, STARTS, and MORT.
The residuals seemed stationary and reasonably invertible (in other words, the IACF died down reasonably fast).
You used the PLOT option to identify an error term model that was MA(1). This term was problematic in that its estimate was near 1, it had a huge standard error, and the estimation procedure may not have converged.
You used t statistics to sequentially remove insignificant terms and obtain

∇St = ψ + β∇Ht + et − θet−1

where

∇ indicates a first difference
St is sales at time t
Ht is U.S. housing starts at time t
ψ is a constant (drift) that corresponds to the slope in a plot of the undifferenced series against time.

The final MA estimate 0.60397 is not particularly close to 1, giving you some confidence that you have not overdifferenced. No convergence problems remain at this point. (See Output 4.11.)

Consider two scenarios for forecasting this series. First, suppose you are supplied with future values of housing starts from some source. You incorporate these into your data set along with missing values for the unknown future values of SALES, and you call for a forecast. You do not supply information about the forecast accuracy of the future housing start values, nor can the procedure use such information; it simply treats these futures as known values. In the second scenario, you model housing starts and then forecast them from within PROC ARIMA. This, then, provides an example of a case 2 problem.

For the first scenario, imagine you have been given future values of U.S. housing starts (the values are actually those that would be forecast from PROC ARIMA, giving you an opportunity to see the effect of treating forecasts as perfectly known values). The first step is to create a data set with future values for DATE and STARTS and missing values for SALES. This data set is then concatenated to the original data set. The combined data set COMB has eight values of future STARTS. Use the following SAS statements:

PROC ARIMA DATA=COMB;
   IDENTIFY VAR=SALES(1) CROSSCOR=(STARTS(1)) NOPRINT;
   ESTIMATE Q=1 INPUT=(STARTS) METHOD=ML;
   FORECAST LEAD=8 ID=DATE INTERVAL=QTR OUT=FOR1;
   TITLE 'DATA WITH FORECASTS OF STARTS APPENDED AND SALES=.';
RUN;

The results are shown in Output 4.11.

Output 4.11 Forecasting with Future Input Values and Missing Future Sales Values

[Output 4.11: maximum likelihood estimates (MU 91.99669, t = 3.59; MA1,1 0.60397, t = 3.97; STARTS 5.45100, t = 20.90), with 40 residuals; listing not reproduced]

The estimation is exactly the same as in the original data set, because SALES has missing values for all future quarters, and thus these points cannot be used in the estimation. Because future values are available for all inputs, forecasts are generated.
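For reference, a minimal sketch of how the combined data set COMB above might be assembled. The starting date literal and the constant future STARTS value are placeholders, not values from the text:

DATA FUTURE;
   DO I = 1 TO 8;
      DATE = INTNX('QTR', '01JAN1990'D, I);  /* next eight quarters; start date assumed */
      STARTS = 1370;                         /* supplied future housing starts (placeholder) */
      SALES = .;                             /* missing: to be forecast */
      OUTPUT;
   END;
   DROP I;
RUN;

DATA COMB;
   SET HOUSING FUTURE;  /* concatenate the futures to the original data */
RUN;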
A request of LEAD=10 also gives only eight forecasts, because only eight future STARTS are supplied. Note that future values were supplied to, and not generated by, the procedure. Forecast intervals are valid if you can guarantee the future values supplied for housing starts; otherwise, they are too small. Section 4.3.2 displays a plot of the forecasts from this procedure and also displays a similar plot in which PROC ARIMA is used to forecast the input variable (see Output 4.13). Predicted SALES are the same (recall that future values of STARTS in this example are the same as those produced in PROC ARIMA), but the forecast intervals differ considerably. Note that the general increase in predicted SALES is caused by including the drift term ψ.

[Output 4.11 (continued): the model for SALES, with period of differencing 1, estimated intercept 91.99669, input STARTS, and moving average factor 1 − 0.60397B**(1); an insignificant autocorrelation check of residuals; and forecasts for SALES for observations 42 through 49 with standard errors and 95% confidence limits; listing not reproduced]

4.3.2 Case 2: Simple Transfer Functions

In case 2, housing starts Ht are used as an explanatory variable for a company's sales. Using fitting and diagnostic checking, you obtain the model

∇St = ψ + β∇Ht + ηt

where ηt = et − θet−1 is the moving average error term. In case 1, you supplied future values of Ht to PROC ARIMA and obtained forecasts and forecast intervals. The forecasts were valid, but the intervals were not large enough, because the future values of housing starts were themselves forecasts. In addition, you have the problem of obtaining these future values for housing starts. PROC ARIMA correctly incorporates the uncertainty of future housing start values into the sales forecast.

Step 1 in this methodology identifies and estimates a model for the explanatory variable Ht, U.S. housing starts. The data are quarterly and, based on the usual criteria, the series should be differenced. The differenced series shows some correlation at lag 4, but not enough to warrant a span 4 difference; use an AR factor to handle the seasonality of this series. Diagnostic checking was done on the STARTS series Ht. The model

(1 − αB^4)∇Ht = (1 − θB^3)et

fits well. In Section 4.3.1, the series was forecast eight periods ahead to obtain future values. You do not need to request forecasts of your inputs (explanatory series) if your goal is only to forecast the target series (SALES, in this case). The procedure automatically generates forecasts of the inputs it needs, but you do not see them unless you request them. In step 2, an input series is used in an INPUT= option to identify and estimate a model for the target series St. This part of the SAS code is the same as that in the previous example.
The two steps must be together in a single PROC ARIMA segment. The entire set of code is shown below, and some of the output is shown in Output 4.12. Some of the output has been suppressed (it was displayed earlier). Also, forecast intervals are wider than in case 1, where forecasts of Ht were taken from this run and concatenated to the end of the data set instead of being forecast by the procedure; that made it impossible to incorporate forecast errors for Ht into the forecast of St. The SAS code follows:

PROC ARIMA DATA=HOUSING;
   TITLE 'FORECASTING STARTS AND SALES';
   IDENTIFY VAR=STARTS(1) NOPRINT;
   ESTIMATE P=(4) Q=(3) METHOD=ML NOCONSTANT;
   FORECAST LEAD=8;
   IDENTIFY VAR=SALES(1) CROSSCOR=(STARTS(1)) NOPRINT;
   ESTIMATE Q=1 INPUT=(STARTS) METHOD=ML NOPRINT;
   FORECAST LEAD=8 ID=DATE INTERVAL=QTR OUT=FOR2 NOPRINT;
RUN;

Output 4.12 Estimating Using Maximum Likelihood: PROC ARIMA

[Output 4.12: maximum likelihood estimates for STARTS(1), MA1,1 0.42332 (t = 2.77, lag 3) and AR1,1 0.28500 (t = 1.83, lag 4), with an insignificant autocorrelation check of residuals; the model for variable STARTS (no mean term, period of differencing 1, autoregressive factor 1 − 0.285B**(4), moving average factor 1 − 0.42332B**(3)); and forecasts for STARTS for observations 42 through 49 with standard errors and 95% confidence limits; listing not reproduced]

You can now merge the data sets FOR1 and FOR2 from the previous two examples and plot the forecasts and intervals on the same graph. This is illustrated in Output 4.13 to indicate the difference in interval widths for these data. The first graph gives forecast intervals that arose from using PROC ARIMA to forecast housing starts. The second plot gives these forecast intervals as a solid line along with intervals from the previous analysis (broken line), where the same future values for housing starts are read into the data set rather than being forecast by PROC ARIMA. Note how the broken line drastically underestimates the uncertainty in the forecasts. The narrower interval is questionable in light of the downturn in SALES at the end of the series.

Output 4.13 Plotting Forecast Intervals

[Output 4.13: the two forecast-interval plots just described; not reproduced]
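A minimal sketch of the merge-and-plot step described above; the RENAME= choices are arbitrary and are needed only because both OUT= data sets use the same variable names:

DATA BOTH;
   MERGE FOR2
         FOR1(KEEP=DATE L95 U95 RENAME=(L95=L95FIX U95=U95FIX));
   BY DATE;
RUN;

PROC GPLOT DATA=BOTH;
   PLOT (SALES FORECAST L95 U95 L95FIX U95FIX)*DATE / OVERLAY HMINOR=0;
RUN;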
4.3.3 Case 3: General Transfer Functions

4.3.3.1 Model Identification

You have specified the ARMA model with backshift operators. For example, you can write the ARMA(1,1) model

Yt − αYt−1 = et − θet−1

as

(1 − αB)Yt = (1 − θB)et

or as

Yt = ((1 − θB)/(1 − αB))et

or, finally, as

Yt = et + (α − θ)et−1 + α(α − θ)et−2 + α^2(α − θ)et−3 + . . .

The pattern of the weights (the coefficients on the ets) determines that the process has one AR and one MA parameter, in the same way the ACF does. For example, if

Yt = et + 1.2et−1 + .6et−2 + .3et−3 + .15et−4 + .075et−5 + . . .

the weights are 1, 1.2, .6, .3, .15, .075, . . . . The pattern is characterized by one arbitrary change (from 1 to 1.2) followed by exponential decay at the rate .5 (.6 = (.5)(1.2), .3 = (.5)(.6), . . .). The exponential decay tells you to put a factor (1 − .5B) in the denominator of the expression multiplying et; in other words,

Yt = ((1 + .7B)/(1 − .5B))et

because α = .5 and α − θ = 1.2 imply θ = −.7. What have you learned from this exercise? First, you see that you can write any ARMA model by setting Yt equal to a ratio of polynomial factors in the backshift operator B operating on et. Next, you see that if you can estimate the sequence of weights on the ets, you can determine how many AR and MA lags you need. Finally, in this representation, you see that the numerator polynomial corresponds to MA factors and the denominator corresponds to AR factors.

If you can apply a ratio of backshift polynomials to an unobserved error series et, why not apply one to an observable input? This is exactly what you do in case 3. For example, suppose you write

Yt = 3((1 − .4B)/(1 − .8B))Xt−1 + ((1 + .6B)/(1 − .8B))et

You then obtain

(1 − .8B)Yt = 3(1 − .4B)Xt−1 + (1 + .6B)et

or

Yt = .8Yt−1 + 3(Xt−1 − .4Xt−2) + et + .6et−1

This is called a transfer function. Yt is modeled as a function of lagged values of the input series Xt and current and lagged values of the shocks et. Usually the intercept is not 0, although for simplicity 0 is used in the preceding example. You now have a potentially useful model, but how is it used? With real data, how will you know the form of the backshift expression that multiplies Xt−1? The answer is in the cross-correlations. Define the cross-covariance as

γYX(j) = cov(Yt, Xt+j)

and

γXY(j) = cov(Xt, Yt+j) = γYX(−j)

Define the cross-correlation as

ρXY(j) = γXY(j) / (γXX(0)γYY(0))^.5

Estimate this by

rXY(j) = CXY(j) / (CXX(0)CYY(0))^.5

where CXY(j) is the sample cross-covariance. To illustrate the theoretical cross-covariances for a transfer function, assume that Xt is a white noise process independent of the error series ηt. The cross-covariances are computed below and are direct multiples of γXX(0), the variance of X (this holds only when X is white noise). You have

Yt = 0 + 3((1 − .4B)/(1 − .8B))Xt−1 + ηt = 3Xt−1 + 1.2Xt−2 + .96Xt−3 + .768Xt−4 + . . . + noiset

Multiplying both sides by Xt−j, j = 0, 1, 2, 3, and computing expected values gives

γXY(0) = E(XtYt) = 0
γXY(1) = E(XtYt+1) = 3γXX(0)
γXY(2) = E(XtYt+2) = 1.2γXX(0)
γXY(3) = E(XtYt+3) = .96γXX(0)

so if X is white noise, the cross-covariances are proportional to the transfer function weights βj. When you divide each term in the cross-covariance sequence by γXX(0), you obtain the weights:

LAG J    WEIGHT βj
  -1     0
   0     0
   1     3
   2     1.2
   3     .96
   4     (1.2)(.8)^2
   5     (1.2)(.8)^3

These weights are also known as the impulse-response function. The reason for this name is clear if you ignore the error term in the model and let Xt be a pulse; that is, Xt = 0 except at t = 10, where X10 = 1. Ignoring the white noise term, you have

Y10 = 0X10 = 0
Y11 = 3X10 = 3
Y12 = 3X11 + 1.2X10 = 1.2
Y13 = 3X12 + 1.2X11 + .96X10 = .96

and so on. The weights are the expected responses to a pulse input. The pulse is delayed by one period. Its effect continues to be felt starting with t = 11 but diminishes quickly because of the stationary denominator (in other words, the AR-type operator (1 − .8B)^−1 on Xt−1). The crucial point is that if you can obtain the cross-correlations, you have the impulse-response weight pattern, which you can then analyze by the same rules used for the ACFs. In the example above, the 0 weight on Xt indicates a pure delay. The arbitrary jump from 3 to 1.2, followed by exponential decay at rate .8, indicates that the multiplier on Xt−1 has one numerator (MA) lag and one denominator (AR) lag. The only problem is the requirement that Xt be white noise, which is addressed below.
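All of the expansions above come from the geometric series in the backshift operator, a standard identity stated here for reference:

$$\frac{1-\theta B}{1-\alpha B}=(1-\theta B)\sum_{k=0}^{\infty}\alpha^{k}B^{k}
=1+(\alpha-\theta)B+\alpha(\alpha-\theta)B^{2}+\alpha^{2}(\alpha-\theta)B^{3}+\cdots,\qquad |\alpha|<1$$

With α = .8 and θ = .4, this generates the 3, 1.2, .96, .768, . . . pattern once the leading constant 3 and the one-period delay are applied.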
Suppose you have the same transfer function, but Xt is AR(1) with parameter α:

Yt = 0 + 3((1 − .4B)/(1 − .8B))Xt−1 + ((1 + .6B)/(1 − .8B))et

and

Xt = αXt−1 + εt

where the Xts are independent of the ets and where et and εt are two (independent) white noise sequences. Note that

Yt = 3Xt−1 + 1.2Xt−2 + .96Xt−3 + . . . + noiset

so

αYt−1 = 3αXt−2 + 1.2αXt−3 + .96αXt−4 + . . . + α(noiset−1)

and

Yt − αYt−1 = 3(Xt−1 − αXt−2) + 1.2(Xt−2 − αXt−3) + .96(Xt−3 − αXt−4) + . . . + N′t

where N′t is a noise term. Set

Y′t = Yt − αYt−1

and note that

X′t = Xt − αXt−1 = εt

is a white noise sequence, so the expression above becomes

Y′t = 3εt−1 + 1.2εt−2 + .96εt−3 + . . . + N′t

The impulse-response function is exactly what you want, and εt is a white noise sequence. You want to model X and use that model to estimate Y′t and εt. This process is known as prewhitening, although it really only whitens X. Next, compute the cross-correlations of the prewhitened X and Y (in other words, the estimated εt and Y′t). Note that the prewhitened variables are used only to compute the cross-correlations; the parameter estimation in PROC ARIMA is always performed on the original variables.

4.3.3.2 Statements for Transfer Function Modeling in the IDENTIFY Stage

Use the IDENTIFY and ESTIMATE statements in PROC ARIMA to model X. A subsequent IDENTIFY statement for Y with the CROSSCOR=(X) option automatically prewhitens X and Y, using the previously estimated model for X. For this example, you specify the following SAS statements:

PROC ARIMA DATA=TRANSFER;
   TITLE 'FITTING A TRANSFER FUNCTION';
   IDENTIFY VAR=X;
   ESTIMATE P=1;
   IDENTIFY VAR=Y CROSSCOR=(X) NLAG=10;
RUN;

The results are shown in Output 4.14.
|******** | | | | | | | | | | | | | | | 37930.0 21500.0 50430.034580.0 57120.041820.058110.0 86430.0 02060.002110.098520.0 87650.083530.0 60300.047193.0noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI 567060.0 285060.0 973060.0 169950.0 258950.0 738950.0 018950.0 808950.0 846950.0 004950.0 951950.0 915850.0 385750.0 580550.0 127440.0 0 rorrE dtS | . |*. | . |*. | . |** | . |** | . |*. | . | . | . |*. | . | . | .*| . | **| . | **| . | ***| . | ***| . | *****| . | **********| . |********************| | | | | | | | | | | | | | | | | 23660.94470.33870.31211.52750.23120.91820.63700.32960.0 89580.0 44480.0 62731.0 08461.0 62562.0 15805.0 00000.1 noitalerroC 236090.0887101.0830701.0332351.0032870.0921920.0715830.0260010.0995490.0 494711.0 383511.0 965781.0 102522.0 484263.0 698496.0 915663.1 ecnairavoC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA 005 289861.1 951000.5 snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM X = elbairaV fo emaN erudecorP AMIRA ehT NOITCNUF REFSNART A GNITTIF AMIRA CORP :stnemetatS ETAMITSE dna YFITNEDI eht htiw noitcnuF refsnarT a gnittiF 41.4 tuptuO Chapter 4: The ARIMA Model: Introductory Applications 187 .tnanimreted gol edulcni ton od CBS dna CIA * 005 slaudiseR fo rebmuN 798.7341 CBS 864.9241 CIA 75800.1 etamitsE rorrE dtS 312710.1 etamitsE ecnairaV 490064.2 etamitsE tnatsnoC 1 0 gaL 1000.< 1000.< |t| > rP xorppA 81.31 17.45 eulaV t 85830.0 94190.0 rorrE dradnatS 45805.0 96500.5 etamitsE 1,1RA UM retemaraP noitamitsE serauqS tsaeL lanoitidnoC 660.0 330.0 211.0680.0 001.0 150.0 750.0480.0 511.0 110.0120.0731.0 401.0 660.0820.0561.0 170.0 470.0700.0562.0 140.0 870.0960.0 905.0 1000.< 1000.< 1000.< 1000.< qSihC > rP 42 81 21 6 FD 24.142 07.712 14.702 61.691 erauqS -ihC 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA | | | | | | | | | | | | | | | . | . . |*. . | . . |*. . |*. . | . . | . . |** . | . .*| . . | . .*| . .*| . . | . **********| . | | | | | | | | | | | | | | | 77000.060920.080420.0 66470.003950.087000.058400.029770.096200.0 20840.0 42610.049150.0 18530.0 00900.0 15805.0 noitalerroC 51 41 31 21 11 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP )deunitnoc( AMIRA CORP :stnemetatS ETAMITSE dna YFITNEDI eht htiw noitcnuF refsnarT a gnittiF 41.4 tuptuO 188 SAS for Forecasting Time Series 005 165141.6 51950.01 snoitavresbO fo rebmuN noitaiveD dradnatS seireS gnikroW fo naeM Y = elbairaV fo emaN )1(**B 45805.0 - 1 :1 rotcaF srotcaF evissergerotuA 196500.5 naeM detamitsE X elbairaV rof ledoM 170.0 940.0 110.0 550.0 800.0 700.0790.0140.0 830.0 900.0540.0 500.0640.0 270.0 600.0010.0- 420.0 710.0 320.0110.0550.0 800.0220.0 260.0 810.0 210.0 410.0240.0 050.0 450.0820.0400.0 120.0340.0 030.0050.0 020.0 720.0140.0210.0- 520.0 330.0610.0 310.0100.0500.0360.0 500.0- 8419.0 5509.0 6058.0 4417.0 8726.0 6425.0 7214.0 0217.0 qSihC > rP 74 14 53 92 32 71 11 5 FD 73.43 76.92 44.62 92.42 32.02 99.51 73.11 29.2 erauqS -ihC 84 24 63 03 42 81 21 6 gaL oT --------------------snoitalerrocotuA-------------------- slaudiseR fo kcehC noitalerrocotuA 000.1 500.0 1,1RA 500.0 000.1 UM 1,1RA UM retemaraP setamitsE retemaraP fo snoitalerroC )deunitnoc( AMIRA CORP :stnemetatS ETAMITSE dna YFITNEDI eht htiw noitcnuF refsnarT a gnittiF 41.4 tuptuO Chapter 4: The ARIMA Model: Introductory Applications 189 | | | | | | | | | | . | . .*| . . |** . |*. . | . 
. |*. . | . . | . . |*. *****************| . | | | | | | | | | | 51910.0 73130.0 12490.017640.098420.0 55150.091600.0 87800.0 46840.092658.0 noitalerroC 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA laitraP | | | | | | | | | | . | . . | . .*| . . |** . | . .*| . . |** .*| . . | . . |********** | | | | | | | | | | 04010.076100.0 22270.0 34180.050900.099960.0 79570.074240.0 44910.0 94125.0noitalerroC 01 9 8 7 6 5 4 3 2 1 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA esrevnI 442601.0 888501.0 651501.0 556301.0 552101.0 086790.0 901290.0 417380.0 532070.0 127440.0 0 rorrE dtS | . **| . | .***| . | ****| . | ******| . | *******| . | ********| . | **********| . | ************| . | **************| . | *****************| . |********************| | | | | | | | | | | | 49890.0 53731.0 85691.0 29972.0 95053.0 66124.0 61415.0 44706.0 62027.0 92658.0 00000.1 noitalerroC 949137.3 605081.5 747414.7 451855.01 778322.31 334409.51 805393.91 619119.22 193761.72 813892.23 377817.73 ecnairavoC 01 9 8 7 6 5 4 3 2 1 0 gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1snoitalerrocotuA )deunitnoc( AMIRA CORP :stnemetatS ETAMITSE dna YFITNEDI eht htiw noitcnuF refsnarT a gnittiF 41.4 tuptuO 190 SAS for Forecasting Time Series | | | | | | | | | | | | | | | | | | | | | .*| . ***| . ***| . ***| . ****| . ****| . *****| . *******| . ****************| . . | . . | . . | . .*| . . | . .*| . .*| . . |*. . |*. . | . . |*. . |** | | | | | | | | | | | | | | | | | | | | | 41660.0 20641.0 50051.0 24131.0 16902.0 89102.0 60932.0 64133.0 72397.0 08800.0 13320.0 85320.0 19460.0 31600.0 02830.0 71530.0 53060.07730.02710.22830.04770.noitalerroC 285452.0 140265.0 045775.0 438505.0 808608.0 024777.0 671029.0 997572.1 143350.3 788330.0 437980.0 347090.0 658942.0 585320.0 920741.0 483531.0 803232.0411541.0991660.0321741.0539792.0ecnairavoC 01 9 8 7 6 5 4 3 2 1 0 12345678901gaL 1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1- snoitalerrocssorC .denetihwerp neeb evah seires htoB 251310.1 60326.41 005 X seires demrofsnart fo ecnairaV Y seires demrofsnart fo ecnairaV snoitavresbO fo rebmuN X dna Y fo noitalerroC 153.0 224.0 415.0 706.0 027.0 658.0 1000.< qSihC > rP 6 FD 30.3011 erauqS -ihC 6 gaL oT --------------------snoitalerrocotuA-------------------- esioN etihW rof kcehC noitalerrocotuA )deunitnoc( AMIRA CORP :stnemetatS ETAMITSE dna YFITNEDI eht htiw noitcnuF refsnarT a gnittiF 41.4 tuptuO Chapter 4: The ARIMA Model: Introductory Applications 191 192 SAS for Forecasting Time Series Data for this example are generated from the model t where t The cross-correlations are near 0 until you reach lag 2. You now see a spike (0.79327) followed by an arbitrary drop to 0.33146 followed by a roughly exponential decay. The one arbitrary drop and the exponential decay to one denominator (AR) corresponds to one numerator (MA) lag Note the pure delay of two periods. The default in PROC ARIMA is to estimate the model with the C multiplied through the numerator as shown on the right. The ALTPARM option gives the factored C form as on the left. Now review the PROC ARIMA instructions needed to run this example. In INPUT=(form1 variable1 form2 variable2. . . ), the specification for the transfer function form is ) where S lag polynomials variable j For example, INPUT=(2$(1,3)(1)/(1)X) ALTPARM; is the shift or pure delay (2 in the example) are written in multiplicative form is not followed by differencing numbers (this is done in CROSSCOR). 2− t X) B . . .( . . . ) . . . α − 1( / )B)θC( − C( = 1,1 + k L( / ) ... 
Several numerator and denominator factors can be multiplied together. Note the absence of a transfer function form in the sales and housing starts example, which assumes that only contemporaneous relationships exist among sales, St, and the input variables. For the current (generated data) example, the transfer function form should indicate a pure delay of two (2$), one numerator (MA) lag (2$(1)), and one denominator lag (2$(1)/(1)). Use the PLOT option to analyze the residuals and then estimate the transfer function with the noise model. To continue with the generated data, add these SAS statements to those used earlier to identify and estimate the X model and to identify the Y model:

ESTIMATE INPUT=(2$(1)/(1)X) MAXIT=30 ALTPARM PLOT METHOD=ML;
RUN;

The code above produces Output 4.15. Note the AR(1) nature of the autocorrelation plot of residuals. Continue with the following code to produce Output 4.16:

ESTIMATE P=1 INPUT=(2$(1)/(1)X) PRINTALL ALTPARM METHOD=ML;
FORECAST LEAD=10 OUT=OUTDATA ID=T;
RUN;

DATA NEXT;
   SET OUTDATA;
   IF T>480;
RUN;

PROC PRINT DATA=NEXT;
   TITLE 'FORECAST OUTPUT DATA SET';
RUN;

PROC GPLOT DATA=NEXT;
   PLOT L95*T U95*T FORECAST*T Y*T / OVERLAY HMINOR=0;
   SYMBOL1 V=L I=NONE C=BLACK;
   SYMBOL2 V=U I=NONE C=BLACK;
   SYMBOL3 V=F L=2 I=JOIN C=BLACK;
   SYMBOL4 V=A L=1 I=JOIN C=BLACK;
   TITLE 'FORECASTS FOR GENERATED DATA';
RUN;

4.3.3.3 Model Evaluation

The estimated model, as shown in Output 4.16, is

Yt = −32.46 + 2.99(1 − .37B)(1 − .78B)^−1 Xt−2 + (1 − .79B)^−1 et

Standard errors are (1.73), (.05), (.01), (.02), and (.03). In the autocorrelation and cross-correlation checks of residuals and input, note the following facts:

Chi-square statistics automatically printed by PROC ARIMA are like the Q statistics discussed earlier for standard PROC ARIMA models.
Cross-correlation of residuals with input implies improper identification of the transfer function model. This is often accompanied by autocorrelation in residuals.
Autocorrelation of residuals not accompanied by cross-correlation of residuals with X indicates that the transfer function is right but that the noise model is not properly identified. See Output 4.15, from ESTIMATE INPUT=(2$(1)/(1)X) . . . ;, versus Output 4.16, from ESTIMATE P=1 INPUT=(2$(1)/(1)X) . . . ;.

Output 4.15 Fitting a Transfer Function: PROC ARIMA

[Output 4.15: maximum likelihood estimates (MU −33.99761, t = −43.84; SCALE1 3.06271, t = 43.53; NUM1,1 0.39865, t = 16.17; DEN1,1 0.79069, t = 99.67, each with shift 2 on input X), with variance estimate 2.702153, standard error estimate 1.643823, AIC 1908.451, SBC 1925.285, and 497 residuals; listing not reproduced]

Neither cross-correlation check indicates any problem with the transfer specification. First, the inputs are forecast and then used to forecast Y. In an example without prewhitening, future values of X must be in the original data set.
4.3.3.3 Model Evaluation

The estimated model, as shown in Output 4.16, is

   Y_t = -32.46 + 2.99(1 - .37B)/(1 - .78B) X_{t-2} + (1 - .79B)^{-1} η_t

Standard errors are (1.73), (.05), (.01), (.02), and (.03). In the autocorrelation and cross-correlation checks of residuals and input, note the following facts:

- Chi-square statistics automatically printed by PROC ARIMA are like the Q statistics discussed earlier for standard PROC ARIMA models.
- Cross-correlation of residuals with input implies improper identification of the transfer function model. This is often accompanied by autocorrelation in residuals.
- Autocorrelation of residuals not accompanied by cross-correlation of residuals with X indicates that the transfer function is right but that the noise model is not properly identified. (See Output 4.15, from ESTIMATE INPUT=(2$(1)/(1)X) . . . ;, versus Output 4.16, from ESTIMATE P=1 . . . ;.)
- Neither cross-correlation check indicates any problem with the transfer specification.

First, the inputs are forecast and then used to forecast Y. In an example without prewhitening, future values of X must be in the original data set.

Output 4.15 Fitting a Transfer Function: PROC ARIMA
[Maximum likelihood estimates MU -33.9976 (0.77553), SCALE1 3.06271 (0.07035), NUM1,1 0.39865 (0.02465), and DEN1,1 0.79069 (0.00793); the residual autocorrelation check is highly significant at every lag grouping, while the cross-correlation check with input X shows no problems. Tables not legible in this extraction.]
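The fitted transfer function can be unpacked into its impulse-response weights. The short DATA step below is our illustration, not code from the text; it uses the rounded estimates from Output 4.16.

DATA WEIGHTS;
   C=2.99; THETA=.37; DELTA=.78;  /* rounded estimates from Output 4.16 */
   V=0;
   DO LAG=0 TO 12;
      IF LAG<2 THEN V=0;                    /* pure delay of two periods */
      ELSE IF LAG=2 THEN V=C;               /* initial spike             */
      ELSE IF LAG=3 THEN V=C*(DELTA-THETA); /* one arbitrary drop        */
      ELSE V=DELTA*V;                       /* geometric (AR) decay      */
      OUTPUT;
   END;
RUN;

The weights are 0 at lags 0 and 1, spike to 2.99 at lag 2, make one arbitrary move at lag 3, and then decay geometrically, the same shape seen in the prewhitened cross-correlations.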
Output 4.15 Fitting a Transfer Function: PROC ARIMA (continued)
[Inverse and partial autocorrelations of the residuals, and the model summary for variable Y: estimated intercept -33.9976, overall regression factor 3.062708 with shift 2, numerator factor 1 - 0.39865B, and denominator factor 1 - 0.79069B. Tables not legible in this extraction.]

Output 4.16 Modeling and Plotting Forecasts for Generated Data
[Preliminary estimation (initial autoregressive estimate 0.85629, white noise variance estimate 10.06195, constant term estimate 1.445571), the conditional least squares and maximum likelihood iteration histories, and the optimization summary. Tables not legible in this extraction.]
Output 4.16 Modeling and Plotting Forecasts for Generated Data (continued)
[Maximum likelihood estimates MU -32.4632 (1.72566), AR1,1 0.79263 (0.02739), SCALE1 2.99340 (0.04554), NUM1,1 0.37258 (0.02286), and DEN1,1 0.77872 (0.01369), all with p-values < .0001; correlations of the estimates; residual autocorrelation and cross-correlation checks showing no lack of fit; the model summary for variable Y; forecasts for observations 501-510 with standard errors and 95% confidence limits; and the printed FORECAST OUTPUT DATA SET. Tables not legible in this extraction.]

In addition to generated data, logarithms of flow rates for the Neuse River at Goldsboro, North Carolina, and 30 miles downstream at Kinston, North Carolina, are analyzed. These data include 400 daily observations. Obviously, the flow rates develop a seasonal pattern over the 365 days in a year, causing the ACF to die off slowly. Taking differences of the logarithmic observations produces ACFs that seem well behaved. The goal is to relate flow rates at Kinston to those at Goldsboro. The differenced data should suffice here even though the nonstationarity is probably caused by the 365-day seasonal periodicity in flows. You can obtain a model for the logarithms of the Goldsboro flow rates by using the following SAS statements:

PROC ARIMA DATA=RIVER;
   IDENTIFY VAR=LGOLD(1) NOPRINT;
   ESTIMATE Q=1 P=3 METHOD=ML NOCONSTANT MAXIT=100;
   IDENTIFY VAR=LKINS(1) CROSSCOR=(LGOLD(1));
   TITLE 'FLOW RATES OF NEUSE RIVER AT GOLDSBORO AND KINSTON';
RUN;

The results are shown in Output 4.17.
Output 4.17 Analyzing Logarithms of Flow Data with the IDENTIFY and ESTIMATE Statements: PROC ARIMA
[Maximum likelihood estimates for the differenced LGOLD series, all significant: the fitted model has no mean term, period of differencing 1, autoregressive factor 1 - 1.24083B + 0.29074B**2 + 0.11724B**3, and moving average factor 1 - 0.87394B. Also shown are the residual autocorrelation check (no lack of fit) and the autocorrelations of the differenced LKINS series. Tables not legible in this extraction.]
Output 4.17 Analyzing Logarithms of Flow Data with the IDENTIFY and ESTIMATE Statements: PROC ARIMA (continued)
[Inverse and partial autocorrelations and the white noise check for the differenced LKINS series, and the cross-correlations of the prewhitened LKINS and LGOLD series, which are sizable only at lags 1 and 2. Tables not legible in this extraction.]

The output from the ESTIMATE statement shows a reasonable fit. Cross-correlations from the second IDENTIFY statement show that a change in flow rates at Goldsboro affects the flow at Kinston one and two days later, with little other effect. This suggests C(1 - θB)X_{t-1} as a transfer function. Add the following SAS statements to the code above:

ESTIMATE INPUT=(1$(1)LGOLD) PLOT METHOD=ML;
RUN;

Results are shown in Output 4.18. Diagnostics from the PLOT option are used to identify an error model. The cross-correlation check looks reasonable, but the autocorrelation check indicates that the error is not white noise. This also implies that the t statistics for the model parameters are computed from improper standard errors.
Output 4.18 Modeling Flow Rates: Identifying an Error Model through the PLOT Option
[Maximum likelihood estimates MU 0.0018976 (t = 0.44, not significant), NUM1 0.43109, and NUM1,1 -0.22837 (both highly significant); the prewhitening filter; the residual autocorrelation check, which is significant at every lag grouping; and the cross-correlation check of residuals with input LGOLD, which shows no problems. Tables not legible in this extraction.]
Output 4.18 Modeling Flow Rates: Identifying an Error Model through the PLOT Option (continued)
[The autocorrelation plot, inverse autocorrelations, and partial autocorrelations of the residuals, whose pattern suggests a mixed autoregressive moving average error. Tables not legible in this extraction.]

An ARMA(2,1) model fits the error term. Make the final estimation of the transfer function with noise by replacing the ESTIMATE statement (the one with the PLOT option) with

ESTIMATE P=2 Q=1 INPUT=(1$(1)LGOLD) METHOD=ML NOCONSTANT ALTPARM;
RUN;

Output 4.19 shows the results, and the model becomes

   ∇LKINS_t = .49539(1 + .55026B) ∇LGOLD_{t-1} + [(1 - .88776B)/(1 - 1.16325B + .47963B**2)] e_t

Because you encountered a pure delay, this is an example of a leading indicator, although this term is generally reserved for economic data. More insight into the effect of this pure delay is obtained through the cross-spectral analysis in Chapter 7.
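As a back-of-the-envelope reading of the fitted model (our arithmetic, not the text's): because both series are differenced logarithms, the long-run gain of the transfer function is the sum of its weights, .49539(1 + .55026) = .768. A sustained 1% rise in flow at Goldsboro is thus eventually matched by roughly a 0.77% rise at Kinston, arriving over the two days that follow the one-day pure delay.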
Output 4.19 Estimating the Final Transfer Function: PROC ARIMA
[Maximum likelihood estimates MA1,1 0.88776, AR1,1 1.16325, AR1,2 -0.47963, SCALE1 0.49539, and NUM1,1 -0.55026, all with p-values < .0001; correlations of the estimates; residual autocorrelation and cross-correlation checks showing no lack of fit; and the model summary for variable LKINS (period of differencing 1 for both LKINS and LGOLD, shift 1, numerator factor 1 + 0.55026B, autoregressive factor 1 - 1.16325B + 0.47963B**2, moving average factor 1 - 0.88776B). Tables not legible in this extraction.]

4.3.3.4 Summary of Modeling Strategy

Follow these steps in case 3 to complete your modeling:

1. Identify and estimate a model for input X (IDENTIFY, ESTIMATE).
2. Prewhiten Y and X using the model from item 1 (IDENTIFY).
3. Compute cross-correlations r_XY(j) to identify the transfer function form (IDENTIFY).
4. Fit the transfer function and compute and analyze residuals (ESTIMATE, PLOT).
5. Fit the transfer function with a noise model (ESTIMATE).
6. Forecast X and Y (FORECAST).
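As a schematic only (our sketch; the data set name, lag choices, and transfer form are placeholders, not values from the text), the six steps map onto statements in a single PROC ARIMA run:

PROC ARIMA DATA=MYDATA;
   IDENTIFY VAR=X;                              /* step 1: study the input           */
   ESTIMATE P=1 METHOD=ML;                      /* step 1: fit an ARMA model to X    */
   IDENTIFY VAR=Y CROSSCOR=(X);                 /* steps 2-3: prewhiten and
                                                   cross-correlate                   */
   ESTIMATE INPUT=(2$(1)/(1)X) PLOT METHOD=ML;  /* step 4: transfer function plus
                                                   residual diagnostics              */
   ESTIMATE P=1 INPUT=(2$(1)/(1)X) METHOD=ML;   /* step 5: add the noise model       */
   FORECAST LEAD=10 OUT=FCST ID=T;              /* step 6: forecast X and Y          */
RUN;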
4.3.4 Case 3B: Intervention

Suppose you use as an input X_t a sequence that is 0 through time 20 and 1 from time 21 onward. If the model is

   Y_t = α + βX_t + noise

you have Y_t = α + noise through time 20 and Y_t = α + β + noise after time 20. Thus, Y experiences an immediate level shift (from α to α + β) at time 21. Now change the model to

   Y_t - ρY_{t-1} = α + βX_t + noise,  or  Y_t = α' + (β/(1 - ρB))X_t + noise

where α' = α/(1 - ρ) (the expected value of Y when X is 0). At time 21, X_21 = 1 and the previous Xs are 0, so Y_21 = α' + β + noise. At time 22 you get Y_22 = α' + β(1 + ρ) + noise. Y_t eventually approaches α' + β(1 + ρ + ρ**2 + . . .) = α' + β/(1 - ρ) if you ignore the noise term. Thus, you see that ratios of polynomials in the backshift operator B can provide interesting approaches to new levels.

When you use an indicator input, you cannot prewhiten. Therefore, impulse-response weights are not proportional to cross-covariances. You make the identification by comparing the behavior of Y_t near the intervention point with a catalog of typical behaviors for various transfer function forms. Several such response functions for X_t = 1 when t > 20 and 0 otherwise are shown in Output 4.20.

Output 4.20 Plotting Intervention Models
[Several pages of plots of typical intervention response shapes; the graphs are not reproducible in this extraction.]

Output 4.21 shows calls for directory assistance in Cincinnati, Ohio (McSweeny, 1978).

Output 4.21 Plotting the Original Data

Prior to March 1974 directory assistance was free, but from that point on a charge was imposed. The data seem to show an initial falling off of demand starting in February, which may be an anticipation effect. The data clearly show an upward trend. You check the pre-intervention data for stationarity with this code:

PROC ARIMA DATA=CALLS;
   IDENTIFY VAR=CALLS STATIONARITY=(ADF=(2,3,12,13));
   IDENTIFY VAR=CALLS(1);
   ESTIMATE P=(12) METHOD=ML;
   WHERE DATE < '01FEB74'D;
RUN;

Some of the results are shown in Output 4.22. Only the trend tests are of interest since there is clearly a trend; however, none of the other tests could reject a unit root either. Tests with 12 and 13 lagged differences are requested in anticipation of seasonality. Below this are the chi-square checks for a seasonal AR model for the first differences. The fit is excellent, and the seasonal AR parameter 0.5693 is not too close to 1. With this information you see that only the unit root tests with 12 or more lags are valid.

Output 4.22 Unit Root Tests, Pre-intervention Calls Data
[Augmented Dickey-Fuller tests with 2, 3, 12, and 13 lagged differences (trend-case tau statistics between about -2.5 and -3.2 with p-values between about 0.09 and 0.34, so the unit root is not rejected) and the chi-square checks for the seasonal AR(1) fit to the first differences; the fitted model for CALLS has period of differencing 1, estimated mean 1.077355, and autoregressive factor 1 - 0.56934B**12. Tables not legible in this extraction.]

A first difference will reduce a linear trend to a constant, so calls tend to increase by 1.077 per month. The intervention variable IMPACT is created, having value 1 from February 1974 onward. Since the majority of the drop is seen in March, you fit an intervention model of the form (β0 - β1B)X_t, where X_t is the IMPACT variable at time t. The first time X_t is 1, the effect is β0; after that, both X_t and X_{t-1} will be 1, so that the effect is β0 - β1. You anticipate a negative β0 and a larger-in-magnitude and positive β1. A test that β0 = 0 is a test for an anticipation effect. Motivated by the pre-intervention analysis, you try the same seasonal AR(1) error structure and check the diagnostics to see if it suffices.
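The DATA step that creates IMPACT is not reproduced in the text; a minimal sketch (ours), assuming the data set CALLS carries a SAS date variable DATE:

DATA CALLS;
   SET CALLS;
   IMPACT = (DATE >= '01FEB74'D);  /* 1 from February 1974 onward, 0 before */
RUN;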
The code is as follows:

PROC ARIMA;
   IDENTIFY VAR=CALLS(1) CROSSCOR=(IMPACT(1)) NOPRINT;
   ESTIMATE INPUT=((1)IMPACT) P=(12) METHOD=ML;
RUN;

Output 4.23 PROC ARIMA for Calls Data
[Maximum likelihood estimates MU 2.32863 (t = 0.81, not significant), NUM1 -123.18861, NUM1,1 400.69122, and AR1,1 0.45054, the last three all significant; the residual autocorrelation check shows no lack of fit. Tables not legible in this extraction.]

It is seen in Output 4.23 that all terms except mu are significant. The trend part of the fitted model is overlaid on the data in Output 4.24. Because the model has a unit root, the data can wander fairly far from this trend, and this indeed happens. It also explains why the standard error for mu is so large; that is, with random walk errors it is difficult to accurately estimate the drift term. Despite this, the model seems to capture the intervention well and seems poised to offer an accurate forecast of the next few values. The drop of -123 in calls the month prior to the charge is significant, so there was an anticipation effect. An additional drop of 400 leaves the calls at 523 below the previous levels.

Output 4.24 Effect of Charge for Directory Assistance

To forecast the next few months, you extend the data set with missing values for calls and set the intervention variable to 1, assuming the charge will remain in effect. The code below produces the plot in Output 4.25. Note how the forecasts and intervals for the historical data have been deleted from the plot. The intervals are quite wide due to the unit root structure of the errors. Recall that even the historical data have produced some notable departures from trend. Adding other predictor variables, like population or new phone installations, might help reduce the size of these intervals, but the predictors would need to be extrapolated into the future.
DATA EXTRA;
   DO T=1 TO 24;
      DATE=INTNX('MONTH','01DEC76'D,T);
      IMPACT=1;
      OUTPUT;
   END;
RUN;
DATA ALL;
   SET CALLS EXTRA;
RUN;
PROC ARIMA;
   IDENTIFY VAR=CALLS(1) CROSSCOR=(IMPACT(1)) NOPRINT;
   ESTIMATE INPUT=((1)IMPACT) P=(12) METHOD=ML NOPRINT;
   FORECAST LEAD=24 OUT=GRAPH ID=DATE INTERVAL=MONTH;
RUN;
DATA GRAPH;
   SET GRAPH;
   IF CALLS NE . THEN DO;
      FORECAST=.; U95=.; L95=.;
   END;
RUN;
PROC GPLOT DATA=GRAPH;
   PLOT (CALLS FORECAST U95 L95)*DATE/OVERLAY;
   SYMBOL1 V=NONE I=JOIN R=4;
   TITLE 'FORECASTED CALLS';
RUN;

Output 4.25 Forecasts from Intervention Model

4.4 Further Examples

4.4.1 North Carolina Retail Sales

Consider again the North Carolina retail sales data investigated in Chapter 1. Recall that there the quarterly sales increases were modeled using seasonal dummy variables; that is, seasonal dummy variables were fit to the first differences of quarterly sales. The models discussed in this section potentially provide an alternative approach. Here the full monthly data (from which the quarterly numbers were computed as averages) will be used. This is an example in which the airline model seems a good choice at first but later runs into some problems. Recall that when a first difference is taken, a moving average at lag 1 is often appropriate. Likewise, a multiplicative moving average structure, specified by ESTIMATE Q=(1)(12), often works well when the first and span 12 difference, (Y_t - Y_{t-1}) - (Y_{t-12} - Y_{t-13}), has been taken. You can think of these moving average terms as somewhat mitigating the impact of the rather heavy-handed differencing operator. As in the IBM example in Section 3.4.7, the fitting of these moving average terms causes forecasts to be weighted averages of seasonal patterns over all past years, where the weights decrease exponentially as you move further into the past. Thus the forecast is influenced somewhat by all past patterns but most substantially by those of the most recent years.

The airline model just discussed will be written here as

   (1 - B)(1 - B**12)Y_t = (1 - θ1,1 B)(1 - θ2,1 B**12)e_t

introducing double subscripts to indicate which factor and which lag within that factor is being modeled. This double-subscript notation corresponds to PROC ARIMA output. The airline model is often a good first try when seasonal data are encountered. Now if, for example, θ2,1 = 1, there is cancellation on both sides of the model and it reduces to (1 - B)Y_t = (1 - θ1,1 B)e_t. Surprisingly, this can happen even with strongly seasonal data. If it does, as it will for the retail sales, it suggests considering a model outside the ARIMA class. Consider a model

   Y_t = μ + S_t + Z_t

where S_t = S_{t-12} and Z_t has some ARIMA structure, perhaps even having unit roots. Note that S_t forms an exactly repeating seasonal pattern, as would be modeled using dummy variables. Because S_t = S_{t-12}, the autocorrelation function will have spikes at lag 12, as will that of the ordinary first differences, since S_t - S_{t-1} is also periodic. However, the span 12 difference will involve (1 - B**12)Z_t, and unless Z_t has a unit root at lag 12, estimates of the lag 12 moving average coefficient will be forced toward the moving average boundary. This overdifferencing often results in failure to converge. You issue the following SAS statements to plot the data and compute the ACF of the original series, the first differenced series, and the first and seasonally differenced series.
PROC GPLOT DATA=NCRETAIL;
   PLOT SALES*DATE/HMINOR=0
        HREF='01DEC83'D '01DEC84'D '01DEC85'D '01DEC86'D '01DEC87'D '01DEC88'D
             '01DEC89'D '01DEC90'D '01DEC91'D '01DEC92'D '01DEC93'D '01DEC94'D;
   TITLE 'NORTH CAROLINA RETAIL SALES';
   TITLE2 'IN MILLIONS';
RUN;
PROC ARIMA DATA=NCRETAIL;
   IDENTIFY VAR=SALES OUTCOV=LEVELS NLAG=36;
   IDENTIFY VAR=SALES(1) OUTCOV=DIFF NLAG=36;
   IDENTIFY VAR=SALES(1,12) OUTCOV=SEAS NLAG=36;
RUN;

The data plot is shown in Output 4.26. The ACF, IACF, and PACF have been saved with the OUTCOV= option. Output 4.27 uses this with SAS/GRAPH and a template to produce a matrix of plots, with rows representing the original, (1), and (1,12) differenced data and columns representing, from left to right, the ACF, IACF, and PACF.

Output 4.26 Plotting the Original Data

Output 4.27 Computing the ACF with the IDENTIFY Statement: PROC ARIMA
[The plot matrix is not reproducible in this extraction.]

The plot of the data displays nonstationary behavior (nonconstant mean). The original ACF shows slow decay, indicating a first difference. The ACF of the differenced series shows somewhat slow decay at the seasonal lags, indicating a possible span 12 difference. The Q statistics and ACF on the SALES(1,12) differenced variable indicate that some MA terms are needed, with the ACF spikes at 1 and 12 indicating MA terms at lags 1 and 12. Heeding the remarks at the beginning of this section, you try a multiplicative structure even though the expected side lobes at 11 and 13 (that such a structure implies) are not evident in the ACF. Such a structure also serves as a check on the differencing, as you will see. Adding

ESTIMATE Q=(1)(12) METHOD=ML;

to the above code requests that maximum likelihood estimates of the multiplicative MA be fitted to the first and span 12 differenced data. The results are in Output 4.28.

Output 4.28 Fitting the Multiplicative MA Structure
[WARNING: The model defined by the new estimates is unstable. The iteration process has been terminated. WARNING: Estimates may not have converged. The maximum likelihood estimates are MU 0.65905 (1.52987, not significant), MA1,1 0.74136 (0.05997), and MA2,1 0.99979 with the enormous standard error 83.86699; the fitted moving average factors are 1 - 0.74136B and 1 - 0.99979B**12. Remaining tables not legible in this extraction.]

In Output 4.28, you see that there seems to be a problem. The procedure had trouble converging, the standard error on the lag 12 coefficient is extremely large, and the estimate itself is almost 1, indicating a possibly noninvertible model. You can think of a near 1.00 moving average coefficient at lag 12 as trying to undo the span 12 differencing.
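To spell out the mechanism (a short derivation of ours, consistent with the discussion at the start of this section): apply the span 12 difference to the components model Y_t = μ + S_t + Z_t. Because μ is constant and S_t = S_{t-12},

   (1 - B**12)Y_t = (1 - B**12)μ + (1 - B**12)S_t + (1 - B**12)Z_t = (1 - B**12)Z_t

so the deterministic seasonal is annihilated and the factor (1 - B**12) is left on the moving average side of the differenced series. Unless Z_t itself has a seasonal unit root, that factor is a unit root (coefficient 1) at lag 12 in the MA operator, and estimation chases θ2,1 toward 1, producing exactly the boundary estimate and convergence trouble seen in Output 4.28.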
Of course, trying to make inferences when convergence has not been verified is, at best, questionable. Returning to the discussion at the opening of this section, a possible explanation is that the seasonality S_t is regular enough to be accounted for by seasonal dummy variables. That scenario is consistent with all that has been observed about these data. The first difference plus dummy variable model of Chapter 1 did seem to fit the data pretty well. The dummy variables can be incorporated in PROC ARIMA using techniques in Section 4.2. Letting S_{1t} through S_{11,t} denote monthly indicator variables (dummy variables), your model is

   Y_t = α + βt + δ1 S_{1t} + δ2 S_{2t} + . . . + δ11 S_{11,t} + Z_t

where, from your previous modeling, Z_t seems to have a (nonseasonal) unit root. You interpret α + βt as a "December line" in that, for December, each S_{jt} is 0 and the expected value of Y_t is α + βt. For January, the expected value is (α + δ1) + βt; that is, δ1 is a shift in the trend line that is included for all January data, and similar values δ2 through δ11 allow shifts for the other 10 months up through November. Because Christmas sales are always relatively high, you anticipate that all these δj, and especially δ1, will be negative.

Using ∇ to denote a first difference, so that ∇Y_t = Y_t - Y_{t-1}, write the model at time t and at time t-1, then subtract to get

   ∇Y_t = ∇α + β∇(t) + δ1 ∇S_{1t} + δ2 ∇S_{2t} + . . . + δ11 ∇S_{11,t} + ∇Z_t

Now ∇Z_t is stationary if Z_t has a unit root, ∇α = α - α = 0, and ∇(t) = t - (t-1) = 1. Since errors should be stationary for proper modeling in PROC ARIMA, the model will be specified in first differences as

   ∇Y_t = β + δ1 ∇S_{1t} + δ2 ∇S_{2t} + . . . + δ11 ∇S_{11,t} + ∇Z_t

The parameters have the same interpretations as before. This code fits the model with ∇Z_t specified as ARMA(2,1) and plots forecasts. The data set had 24 missing values for sales at the end, with the seasonal indicator variables nonmissing there. Note that the seasonal indicator variables can be generated without error and so are valid deterministic inputs. The following code produces Output 4.29 and Output 4.30.
PROC ARIMA DATA=NCRETAIL;
   IDENTIFY VAR=SALES(1) CROSSCOR=(S1(1) S2(1) S3(1) S4(1) S5(1) S6(1)
      S7(1) S8(1) S9(1) S10(1) S11(1)) NOPRINT;
   ESTIMATE INPUT=(S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11) P=2 Q=1 METHOD=ML;
   FORECAST LEAD=24 OUT=OUT1 ID=DATE INTERVAL=MONTH;
RUN;
PROC GPLOT DATA=OUT1;
   PLOT (SALES L95 U95 FORECAST)*DATE/OVERLAY HREF='01DEC94'D;
   WHERE DATE > '01JAN90'D;
   SYMBOL1 V=NONE I=JOIN C=BLACK L=1 R=1 W=1;
   SYMBOL2 V=NONE I=JOIN C=BLACK L=2 R=2 W=1;
   SYMBOL3 V=NONE I=JOIN C=BLACK L=1 R=1 W=2;
RUN;
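The indicators S1 through S11 are assumed to already exist in NCRETAIL, including the 24 months beyond the end of the sales data. Their construction is not shown in the text; a minimal sketch (ours), assuming a SAS date variable DATE:

DATA NCRETAIL;
   SET NCRETAIL;
   ARRAY S{11} S1-S11;
   DO J=1 TO 11;
      S{J} = (MONTH(DATE) = J);  /* 1 in month J, 0 otherwise; December is the base */
   END;
   DROP J;
RUN;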
9340.9077 9882.9057 2238.2867 9631.7667 9758.4077 0316.6877 7204.7857 3148.7257 2331.8347 6674.7766 7767.9676 tsaceroF 861 761 661 561 461 361 261 161 061 951 851 751 651 551 451 351 251 151 051 941 841 741 641 541 sbO stimiL ecnedifnoC %59 SELAS elbairaV rof stsaceroF 845.7641 11S 805.2461 01S 297.0441 9S 756.4241 8S 303.1731 7S 236.4621 6S rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 11 rebmuN tupnI rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 01 rebmuN tupnI rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 9 rebmuN tupnI rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 8 rebmuN tupnI rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 7 rebmuN tupnI rotcaF noissergeR llarevO gnicnereffiD fo )s(doireP elbairaV tupnI 6 rebmuN tupnI )deunitnoc( selaS liateR aniloraC htroN rof ledoM lanosaeS 92.4 tuptuO 230 SAS for Forecasting Time Series Chapter 4: The ARIMA Model: Introductory Applications 231 Output 4.30 shows the resulting graph. Output 4.30 Forecasts from Seasonal Model 4.4.2 Construction Series Revisited Returning to the construction worker series at the beginning of Section 4.1.2, you can fit two models both having a first difference. Let one incorporate a seasonal difference and the other incorporate seasonal dummy variables S1 through S12 to model the seasonal pattern. This code produces two forecast data sets, OUTDUM and OUTDIF, that have 24 forecasts from the two models. The data set ALL has the original construction data along with seasonal dummy variables S1 through S12 that extend 24 periods into the future. In Section 4.4.1 the December indicator S12 was dropped to avoid a collinearity problem involving the intercept. An equally valid approach is to drop the intercept (NOCONSTANT) and retain all 12 seasonal indicators. That approach is used here. PROC ARIMA DATA=ALL; IDENTIFY VAR=CONSTRCT NLAG=36 NOPRINT; IDENTIFY VAR=CONSTRCT(1) STATIONARITY=(ADF=(1,2,3) DLAG=12); IDENTIFY VAR=CONSTRCT(1) NOPRINT CROSSCOR = (S1(1) S2(1) S3(1) S4(1) S5(1) S6(1) S7(1) S8(1) S9(1) S10(1) S11(1) S12(1) ); ESTIMATE INPUT = (S1 S2 S3 S4 S5 S6 S7 S8 S9 S10 S11 S12 ) NOCONSTANT METHOD=ML NOPRINT; FORECAST LEAD=24 ID=DATE INTERVAL=MONTH OUT=OUTDUM; IDENTIFY VAR=CONSTRCT(1,12) NOPRINT; ESTIMATE NOCONSTANT METHOD=ML NOPRINT; FORECAST LEAD=24 INTERVAL=MONTH ID=DATE OUT=OUTDIF NOPRINT; RUN; 232 SAS for Forecasting Time Series In Output 4.31 the forecast data sets have been merged and forecasts 24 periods ahead have been plotted. The forecasts and intervals for the span 12 differenced series are shown as darker lines labeled “D,” and those for the dummy variable model are shown as lighter lines with a dot label on the far right. The forecasts are quite different. The seasonally differenced series gives much wider intervals and a general pattern of decline. The seasonal dummy variables produce forecast intervals that are less pessimistic and, 24 periods into the future, are about half the width of the others. Of course, wide intervals are expected with differencing. Is there a way to see which model is more appropriate? The chi-square statistics for both models show no problems with the models. Note the code STATIONARITY=(ADF=(1,2,3) DLAG=12) for the first differenced series. This DLAG=12 option requests a seasonal unit root test. Dickey, Hasza, and Fuller (1984) develop this and other seasonal unit root tests. 
Output 4.32 shows the results, and the tau statistics give some evidence against the null hypothesis of a seasonal unit root.

Output 4.31 Seasonal Dummy and Seasonal Difference Forecasts

The seasonal dummy variable model does not lose as much data to differencing, is a little easier to understand, has narrower intervals, and does more averaging of past seasonal behavior. In fact, the first and span 12 difference model has forecast Ŷ_t = Y_{t-1} + (Y_{t-12} - Y_{t-13}), so the forecast for August is just this July's value with last year's July-to-August change added in. The forecast effectively makes a copy of last year's seasonal pattern and attaches it to the end of the series as a forecast. Without moving average terms, last year's pattern alone gives the forecast. For these data, these comments, along with the fact that the data themselves reject the seasonal difference, suggest the use of the dummy variable model.

4.4.3 Milk Scare (Intervention)

Liu et al. (1998) discuss milk sales in Oahu, Hawaii, during a time period in which the discovery of high pesticide levels in milk was publicized. Liu (personal communication) provided the data here. The data indicate April 1982 as the month of first impact, although some tainted milk was found in March. Output 4.33 shows a graph with March, April, and May 1982 indicated by dots. Ultimately eight recalls were issued and publicized, with over 36 million pounds of contaminated milk found. It might be reasonable to expect a resulting drop in milk sales that may or may not have a long-term effect. It appears that, with the multiple recalls and escalating publicity, the full impact was not realized until May 1982, after which recovery began.

Initially a model was fit to the data before the intervention. A seasonal pattern was detected, but no ordinary or seasonal differencing seemed necessary. A P=(1)(12) specification left a somewhat large correlation at lag 2, so Q=(2) was added and the resulting model fit the pre-intervention data nicely. The intervention response seemed to show an arbitrary value after the first drop, in fact another drop, followed by an exponential increase upward. The second drop suggests a numerator lag, and the exponential increase suggests a denominator lag in the transfer function operator. X is a variable that is 1 for April 1982 and 0 otherwise. The following code produces an intervention model with this pattern.

PROC ARIMA DATA=LIU;
   IDENTIFY VAR=SALES NOPRINT CROSSCOR=(X);
   ESTIMATE INPUT=((1)/(1)X) P=(1)(12) Q=(2) METHOD=ML;
RUN;

Output 4.33 and Output 4.34 show the results.
Output 4.32 Seasonal Unit Root Tests for Construction Data
[Dickey-Hasza-Fuller seasonal unit root tests with 1, 2, and 3 lags for the zero-mean and single-mean cases; the tau statistics range from -1.61 to -2.20 with p-values between about 0.02 and 0.10. Full table not legible in this extraction.]

Output 4.33 Effect of Tainted Milk

Output 4.34 Model for Milk Sales Intervention
[Maximum likelihood estimates MU 83.01555, MA1,1 -0.34929, AR1,1 0.53417, AR2,1 0.78929, NUM1 -39.89614, NUM1,1 49.55934, and DEN1,1 0.61051, all significant except as noted in the text. Remaining tables not legible in this extraction.]

By specifying INPUT=((1)/(1)X), where X is 1 for April 1982 and 0 otherwise, you are fitting an intervention model whose form is

   (β0 - β1B)/(1 - α1B) X_t

Filling in the estimates, you have

   (-40 - 50B)/(1 - .61B) X_t = (-40 - 50B)(1 + .61B + .61**2 B**2 + . . .) X_t
                              = -40X_t - 74.4X_{t-1} - 74.4(.61)X_{t-2} - 74.4(.61**2)X_{t-3} - . . .

so when X_t is 1, the estimated effect is -40. The next month, X_{t-1} is 1 and the effect is -74.4. Two months after the intervention the estimated effect is -74.4(.61) as recovery begins. This model forces a return to the original level. In Output 4.33 a horizontal line at the intercept 83 has been drawn and the intervention effects -40, -74.4, -74.4(.61), and so on, have been added in. Notice how the intercept line underestimates the pre-intervention level, and how the estimated recovery seems faster than the data suggest. Had you plotted the forecasts, including the autoregressive components, this failure of the mean structure in the model might not have been noticed. The importance of plotting cannot be overemphasized. It is a critical component of data analysis. Note also that the statistics in Output 4.34 give no warning signs of any problems. Again one might think of the autoregressive structure as compensating for some lack of fit.

Might there be some permanent effect of this incident? The model now under consideration does not allow it. To investigate this, you add a level shift variable. Define the variable LEVEL1 to be 1 prior to April 1982 and 0 otherwise. This will add a constant, the coefficient of the LEVEL1 column, for the pre-intervention period. It represents the difference between the pre-intervention mean and the level to which the post-intervention trend is moving—that is, the level attained long after the intervention. If this shift is not significantly different from 0, then the model shows no permanent effect. If the shift (coefficient) is significantly larger than 0, then a permanent decrease in sales is suggested by the model. If the coefficient happens to be negative, then the pre-intervention level is less than the level toward which the data are now moving.
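The construction of X and LEVEL1 is not shown in the text; a minimal sketch (ours), assuming the monthly data set LIU carries a SAS date variable DATE set to the first of each month:

DATA LIU;
   SET LIU;
   X = (DATE = '01APR82'D);       /* pulse: 1 in April 1982 only        */
   LEVEL1 = (DATE < '01APR82'D);  /* 1 before April 1982, 0 from then on */
RUN;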
You issue the following code to fit a model with both temporary effects (X) and a permanent level shift (LEVEL1):

PROC ARIMA;
   IDENTIFY VAR=SALES NOPRINT CROSSCOR=(X LEVEL1);
   ESTIMATE INPUT=((1)/(1)X LEVEL1) P=(1)(12) Q=(2) METHOD=ML;
RUN;

Output 4.35 and Output 4.36 show the results.

Output 4.35 Model Allowing Permanent Effect
[Residual autocorrelation checks and fit statistics for the two models; adding LEVEL1 lowers the AIC from about 450.6 to about 439.2. Tables not legible in this extraction.]

Output 4.36 Intervention Model with Permanent Shift
[Maximum likelihood estimates including MU 75.95497, AR1,1 0.29442, DEN1,1 0.56565, and NUM2 (the LEVEL1 coefficient) 10.79096 with standard error 2.2268; all estimates are significant, and the residual checks show no lack of fit. Tables not legible in this extraction.]

It appears that the pre-intervention level is about 75.95 + 10.79 = 86.74 and the ultimate level to which sales will return is 75.95, according to this model. All estimates, including the estimated 10.79 permanent loss in sales, are significant. The geometric rate of approach to the new level is 0.56565, indicating a faster approach than that from the first model. Of course, at this point it is clear that the old model was misspecified, as it did not include LEVEL1. The AR1,1 coefficient 0.29 is quite a bit smaller than 0.53 from the first model. That is consistent with the idea that the autoregressive structure there was in part compensating for the poor fit of the mean function. You can add and subtract 1.96(2.2268) from 10.79 to get an approximate 95% confidence interval for the permanent component of the sales loss due to the contamination scare. Other models can be tried. Seasonal dummy variables might be tried in place of the seasonal AR factor. Liu et al. suggest that some sort of trend might be added to account for a decline in consumer preference for milk. A simple linear trend gives a mild negative slope, but it is not statistically significant. The estimated permanent level shift is about the same and still significant in its presence.

4.4.4 Terrorist Attack

On September 11, 2001, terrorists used commercial airliners as weapons to attack targets in the United States, resulting in the collapse of the World Trade Center in New York City. American Airlines flights were among those involved. The stock market was closed following this incident and reopened September 17. In a second incident, an American Airlines jet crashed on November 12, 2001, in Queens, New York.
An intervention analysis of American Airlines stock trading volume (in millions) is now carried out, incorporating a pulse and a level shift intervention for each of these events, defined similarly to those of the milk example in Section 4.4.3. Data through November 19 are used here, so there is not a lot of information about the nature of the response to the second incident. A model that seems to fit the data reasonably well, with parameters estimated from PROC ARIMA, is

   log(Volume) = 0.05 + (2.59 - 2.48B)/(1 - .76B) Xt + 1.49/(1 - .80B) Pt + ((1 - .52B)/(1 - .84B)) et,

where Xt is a level shift variable that is 1 after September 11 and 0 before, while Pt is a pulse variable that is 1 only on the day of the second incident. The p-values for all estimates except the intercept were less than 0.0005, and those for the chi-square check of residuals were all larger than 0.35, indicating an excellent fit for the 275 log transformed volume values in the data set.

This model allows for a permanent effect of the terrorist attack of September 11 but forces the effect of the second incident to decline exponentially to 0 over time. The second incident sparked a log(volume) increase of 1.49 on the day it happened, but j days later, log(volume) is (0.80)^j (1.49) above what it would otherwise have been, according to the model. The permanent effect of the events of September 11 on log volume would be (2.59 - 2.48)/(1 - .76) = 0.46 according to the model. The numerator lag for X allows a single arbitrary change from the initial shock (followed by an exponential approach at rate 0.76 to the eventual new level). In that sense, the inclusion of this lag acts like a pulse variable and likely explains why a pulse variable for September 11 was not needed in the model. The level shift variable for the second incident did not seem to be needed either, but with so little data after November 12, the existence of a permanent effect remains in question. Output 4.37 shows a graph of the data and a forecast from this model.

Output 4.37 American Airlines Stock Volume

Calculations from the log model were exponentiated to produce the graph. The model was fit to the full data set, but the option BACK=42 was used in the FORECAST statement so that the data following September 11 were not used to adjust the forecasts; that is, only the X and P parts of the model are used in the post-September 11 forecasts. With that in mind—that is, with no adjustments based on recent residuals—it is striking how closely these forecasts mimic the behavior of the data after this incident. It is also interesting how similar the decay rates (denominator terms) are for the two incidents. Two horizontal lines, one at the pre-intervention level exp(0.05) = 1.05 and one at the ultimate level exp(0.05 + (2.59 - 2.48)/(1 - .76)) = exp(0.51) = 1.66, are drawn. The permanent effect of the event of September 11 is an increase of (2.59 - 2.48)/(1 - .76) = 0.46 in log transformed volume, according to the model. That becomes a multiplicative increase of exp(0.46) = 1.58, a 58% increase in volume.
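A sketch of PROC ARIMA code that would fit and forecast a model of this form follows. The data set and variable names (AA, LOGVOL, X, P) are illustrative assumptions, not the book's code; the noise model is written as ARMA(1,1) to match the (1 - .52B)/(1 - .84B) factor.

   PROC ARIMA DATA=AA;
      IDENTIFY VAR=LOGVOL NOPRINT CROSSCOR=(X P);
      ESTIMATE INPUT=( (1)/(1) X  /(1) P ) P=1 Q=1 METHOD=ML;  /* (w0-w1*B)/(1-d*B) for X, w0/(1-d*B) for P */
      FORECAST BACK=42 LEAD=42 OUT=FOUT;   /* hold back the last 42 observations, as described above */
   RUN;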
Chapter 5: The ARIMA Model: Special Applications

5.1 Regression with Time Series Errors and Unequal Variances

5.1.1 Autoregressive Errors

SAS PROC AUTOREG provides a tool to fit a regression model with autoregressive time series errors. Such a model can be written in two steps. With a response Yt related to a single input Xt and with an AR(1) error Zt, you can write

   Yt = beta0 + beta1*Xt + Zt, where Zt = rho*Zt-1 + et.

In PROC AUTOREG, the model for the error is written with plus rather than minus signs—that is, (1 + alpha1*B + alpha2*B^2 + ... + alphap*B^p)Zt = et—so the signs of the reported autoregressive coefficients are the opposite of those in the subtracted form.

5.1.2 Example: Energy Demand at a University

Output 5.2 OLS Regression

Output 5.2 shows the ordinary least squares regression of energy demand (DEMAND) on the temperature and calendar variables TEMP, TEMPSQ, TEACH, WORK, and TEMP1. The Durbin-Watson statistic computed from the OLS residuals rt, DW = sum((rt - rt-1)^2)/sum(rt^2), is 0.5331, significantly less than 2, indicating positive autocorrelation in the errors. In the output, Pr < DW is the p-value for testing positive autocorrelation, and Pr > DW is the p-value for testing negative autocorrelation. PROC AUTOREG then estimates autoregressive parameters at lags 1, 5, 7, 12, and 14; in the subtracted form the fitted error model is

   Zt = 0.58Zt-1 + 0.15Zt-5 + 0.16Zt-7 - 0.13Zt-12 + 0.12Zt-14 + et.

Output 5.5 PROC ARIMA for Energy Data

The effect on energy demand of teaching classes is estimated from PROC ARIMA as 838 with standard error 112, somewhat different from PROC AUTOREG and quite different from the OLS estimates. The purely autoregressive model from PROC AUTOREG and the mixed ARMA error model can both be estimated in PROC ARIMA. Doing so (not shown here) will show the AIC and SBC criteria to be smaller (better) for the model with the mixed ARMA error. The chi-square white noise tests, while acceptable in both, have higher (better) p-values for the mixed ARMA error structure.

5.1.3 Unequal Variances

The models discussed thus far involve white noise innovations, or shocks, that are assumed to have constant variance. For long data sets, it can be quite apparent just from a graph that this constant variance assumption is unreasonable. PROC AUTOREG provides methods for handling such situations. In Output 5.6 you see graphs of 8892 daily values (from January 1, 1920, with Y1 = 108.76, to December 31, 1949, with Y8892 = 200.13) of the Dow Jones Industrial Average Yt, of Lt = log(Yt), and of Dt = log(Yt) - log(Yt-1). Clearly the log transformation improves the statistical properties and gives a clearer idea of the long-term increase than does the untransformed series.
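A minimal DATA step sketch for building the log and log-difference series, assuming a data set DOW with the daily average in a variable Y (both names are placeholders):

   DATA DOW2;
      SET DOW;
      L = LOG(Y);    /* log scale                                 */
      D = DIF(L);    /* daily log difference, log(Yt) - log(Yt-1) */
   RUN;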
Many macroeconomic time series are better understood on the logarithmic scale over long periods of time. By the properties of logarithms, Dt = log(Yt/Yt-1). To demonstrate what this measures, let Delta = (Yt - Yt-1)/Yt-1, which is essentially the overnight return on a $1 investment, so that Yt/Yt-1 = 1 + Delta. Using a Taylor series expansion of log(X) at X=1, where (d/dX)log(X) = 1/X and log(1) = 0, you can represent log(1 + Delta) as approximately Delta. Thus if Yt/Yt-1 is near 1, then Dt = log(Yt/Yt-1) is approximately the overnight return, and 100Dt is approximately the daily percentage change in the Dow Jones average.

The mean of the Dt values is

   Dbar = (D2 + D3 + ... + Dn)/(n - 1) = (log(Yn) - log(Y1))/(n - 1)

because the sum of the differences telescopes, so the mean daily log change measures the increase in the series over the entire time period. For the data at hand, the ratio of the last to the first data point is 200.13/108.76 = 1.84, so the series did not quite double over this 30-year period.

The graph of Dt shows some periods of high volatility. The five vertical graph lines represent, from left to right, Black Thursday (October 24, 1929, when the stock market crashed), the inauguration of President Franklin D. Roosevelt (FDR), the start of World War II, the bombing of Pearl Harbor, and the end of World War II. Note especially the era from Black Thursday until a bit after FDR assumed office, known as the Great Depression. You might argue that subperiods like the depression, in which extreme volatility is present, are not typical and should be ignored or at least downweighted in computing a rate of return that has some relevance for future periods.

You decide to take a look at the variability of the series. Because there is so much data, the reduction of each month's Dt values to a standard deviation still leaves a relatively long time series of 360 monthly numbers. These standard deviations have a histogram with a long tail to the right. Again a logarithmic transform is used to produce a monthly series St = log(standard deviation) that has a more symmetric distribution. Thus St measures the volatility in the series, and a plot of St versus time is the fourth graph in Output 5.6.
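One way the monthly volatility series might be constructed is sketched below; the month and year variables, data set names, and the PROC MEANS approach are assumptions about mechanics the book does not spell out.

   DATA DOW2;  SET DOW2;  YR = YEAR(DATE);  MO = MONTH(DATE);  RUN;

   PROC SORT DATA=DOW2;  BY YR MO;  RUN;

   PROC MEANS DATA=DOW2 NOPRINT;
      BY YR MO;
      VAR D;
      OUTPUT OUT=MSTD STD=SD;   /* one standard deviation per month */
   RUN;

   DATA MSTD;  SET MSTD;  S = LOG(SD);  RUN;   /* St = log(monthly standard deviation) */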
Output 5.6 Dow Jones Industrial Average on Several Scales

Now apply a time series model to the St series. The tau test for stationarity suggests a unit root process when six augmenting lags are used. The reason for choosing six lags is that the partial autocorrelation function for St - St-1 is near 0 after lag 6 and, furthermore, a regression of St - St-1 on St-1 and 20 lagged differences (St-j - St-j-1 for j=1 to 20) in PROC REG gave an insignificant F test for lags 7 through 20. A similar regression using six lagged differences showed all six to be significant according to their t tests. Dickey and Fuller show that such t tests on lagged differences are valid in large samples—only the test for the coefficient on the lagged level has a nonstandard distribution. That test cannot reject the unit root hypothesis, and so a model in first differences is suggested for the log transformed standard deviation series St. The above results are not displayed.

At this point you are ready to model St. You have seen that a lag 6 autoregressive model for the differences seems to provide an adequate fit. Perhaps this long autoregression is an approximation of a mixed model. The following code, using LSTD as the variable name for St, seems to provide a reasonable ARMA(1,1) model for the differenced series:

PROC ARIMA DATA=OUT1;
   I VAR=LSTD(1) STATIONARITY=(ADF=(6));
   E P=1 Q=1 ML NOCONSTANT;
RUN;

The constant was suppressed (NOCONSTANT) after an initial check showed it to be insignificant. The tau test for unit roots suggests stationarity of the differenced series (p=0.0001) when six lagged differences are used. That is, no further differencing seems to be needed. Said and Dickey (1984) show that even for mixed models, these stationarity tests are valid as long as sufficient lagged differences are included in the model. In summary, the S series appears to be well modeled as an ARIMA(1,1,1) series with parameters as shown in Output 5.7: the moving average estimate is 0.82365 and the autoregressive estimate is 0.32328, both highly significant.

Output 5.7 ARIMA Model for S

The model suggests the predicting equation

   Shat_t = St-1 + 0.32328(St-1 - St-2) - 0.82365*ehat_t-1,

where ehat_t-1 would be replaced by the residual St-1 - Shat_t-1. Exponentiation of Shat_t gives a conditional standard deviation for month t. Notice that because Shat_t is a logarithm, the resulting standard deviations will all be positive regardless of the sign of Shat_t. This allows the variance to change over time in a way that can be predicted from the most recent few variances. The theory underlying ARIMA models is based on large sample arguments and does not require normality, so the use of log transformed standard deviations as data does not necessarily invalidate this approach. However, there are at least two major problems with approaching heterogeneous variation in the manner just used with the Dow Jones series. First, you will not often have so much data to start with, and second, the use of a month as a period for computing a standard deviation is quite arbitrary. A more statistically rigorous approach is now presented.
The discussion thus far has been presented as a review of unit root test methodology as well as a motivation for fitting a nonconstant variance model that might involve a unit root. An analyst likely would use the more sophisticated approach shown in the next section.

5.1.4 ARCH, GARCH, and IGARCH for Unequal Variances

The series Dt, whose variability is measured by St, has nonconstant conditional variance. Engle (1982) introduced a model in which the variance at time t is modeled as a linear combination of past squared residuals and called it an ARCH (autoregressive conditionally heteroscedastic) process. Bollerslev (1986) introduced a more general structure in which the variance model looks more like an ARMA than an AR and called this a GARCH (generalized ARCH) process. Thus the usual approach to modeling ARCH or GARCH processes improves on the method just shown in substantial ways. The purpose of the monthly standard deviation approach was to illustrate the idea of an ARMA type of structure for standard deviations or variances.

The usual approach to GARCH(p,q) models is to model an error term et* in terms of a standard white noise et ~ N(0,1) as et* = et*sqrt(ht), where ht satisfies the type of recursion used in an ARMA model:

   ht = omega + sum(i=1 to q) alpha_i (e*t-i)^2 + sum(j=1 to p) gamma_j ht-j.

In this way, the error term has a conditional variance ht that is a function of the magnitudes of past errors. Engle's original ARCH structure has p=0. Because ht is the variance rather than its logarithm, certain restrictions must be placed on the alpha and gamma values to ensure positive variances. For example, if these are all restricted to be positive, then positive initial values of ht will ensure all ht are positive. For this reason, Nelson (1991) suggested replacing ht with log(ht) and an additional modification; he called the resulting process EGARCH. These approaches allow the standard deviation to change with each observation. Nelson and Cao (1992) give constraints on the alpha and gamma values that ensure nonnegative estimates of ht. These are the default in PROC AUTOREG. More details are given in the PROC AUTOREG documentation and in Hamilton (1994), which is a quite detailed reference for time series.

Recall that PROC AUTOREG will fit a regression model with autoregressive errors using the maximum likelihood method based on a normal distribution. In place of the white noise shocks in the autoregressive error model you can specify a GARCH(p,q) process. If it appears, as suggested by your analysis of the Dow Jones standard deviations, that the process describing the error variances is a unit root process, then the resulting model is referred to as integrated GARCH or IGARCH. If the usual stationarity conditions are satisfied, then for a GARCH process, forecasts of ht will revert to a long-run mean. In an IGARCH model, mean reversion is no longer a property of ht, so forecasts of ht will tend to reflect the most recent variation rather than the average historical variation.
You would expect the variation during the Great Depression to have little effect on future values of ht in an IGARCH model of the Dow Jones data.

To investigate models of the daily percentage change in the Dow Jones Industrial Average, you will use Dt = log(Yt) - log(Yt-1). Calling this variable DDOW, you issue this code:

PROC AUTOREG DATA=MORE;
   MODEL DDOW = / NLAG=2 GARCH=(P=2,Q=1,TYPE=INTEG,NOINT);
   OUTPUT OUT=OUT2 HT=HT P=F LCLI=L UCLI=U;
RUN;

PROC AUTOREG allows the use of regression inputs; however, here there is no apparent time trend or seasonality and no other regressors are readily available. The model statement DDOW = (with no inputs) specifies that the regression part of your model is only a mean. Note the way in which the ht sequence, predicted values, and default upper and lower forecast limits have been requested in the data set called OUT2.

Output 5.8 IGARCH Model for Dow Jones

In Output 5.8, the estimate of the mean is seen to be 0.000363. Since DDOW is a difference, a mean is interpreted as a drift in the data, and since the data are log differences, the number exp(0.000363) = 1.0003631 is an estimate of the long-run daily growth over this time period. With 8892 days in the study period, the number exp((0.0003631)(8891)) = 25 represents a 25-fold increase, roughly an 11.3% yearly growth rate! This is not remotely like the rate of growth seen, except in certain portions of the graph. PROC AUTOREG starts with OLS estimates, so the average DDOW over the period is the OLS intercept 0.0000702 from Output 5.8. This gives exp((0.0000702)(8891)) = 1.87, indicating 87% growth for the full 30-year period. This has to be more in line with the graph because, as you saw earlier, except for rounding error it is Yn/Y1.

Note also the strong rejection of normality. The normality test used here is that of Jarque and Bera (1980). This is a general test of normality based on a measurement of skewness and one of kurtosis using residuals rt. The raw skewness is b1 = (sum(rt^3)/n)/(sum(rt^2)/n)^(3/2), and the raw kurtosis is b2 = (sum(rt^4)/n)/(sum(rt^2)/n)^2; in general the expression sum(rt^j)/n is called the jth (raw) moment of r. These fractions involve third and fourth moments scaled by the sample variance. The numerators are sums of approximately independent terms and thus satisfy a central limit theorem, and both b1 and b2 - 3 have, approximately, mean 0 when the true errors are normally distributed. Approximate variances of the skewness and kurtosis measures are 6/n and 24/n. Odd and even powers of normal errors are uncorrelated, so squaring each of these approximately normal variates and dividing by its variance produces a pair of squares of approximately independent N(0,1) variates. The sum of these squared variates, therefore, follows a chi-square distribution with two degrees of freedom under the normality null hypothesis. The Jarque-Bera test JB = n(b1^2/6 + (b2 - 3)^2/24) has (approximately) a chi-square distribution with two degrees of freedom under the null hypothesis.
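As an illustration of the arithmetic, a sketch that computes b1, b2, and JB from a column of residuals follows; the data set and variable names (RESID, R) are hypothetical, and the raw-moment formulas assume, as in the text, residuals centered near 0.

   PROC SQL;
      CREATE TABLE JB AS
      SELECT COUNT(R)                      AS N,
             MEAN(R**3)/(MEAN(R**2)**1.5)  AS B1,   /* raw skewness */
             MEAN(R**4)/(MEAN(R**2)**2)    AS B2,   /* raw kurtosis */
             CALCULATED N*(CALCULATED B1**2/6 + (CALCULATED B2-3)**2/24) AS JB
      FROM RESID;
   QUIT;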
Why is the IGARCH model giving a 25-fold increase? It seems unreasonable. The model indicates, and the data display, large variability during periods when there were steep drops in the Dow Jones average. A method that accounts for different variances tends to downweight observations with high variability. In fact there are some periods in which the 11.3% annual rate required for a 25-fold increase (1.113^30 = 25) was actually exceeded, such as in the periods leading up to the Great Depression, after FDR assumed office, and toward the end of WWII. The extremely large variances associated with periods of decrease or slow growth give them low weight, and that would tend to increase the estimated growth rate, but it is still not quite enough to explain the results.

Perhaps more importantly, the rejection of normality by the Jarque-Bera test introduces the possibility of bias in the estimated mean. In an ordinary least squares (OLS) regression of a column Y of responses on a matrix X of explanatory variables, the model is Y = X*beta + e, and the estimated parameter vector

   betahat = (X'X)^(-1)(X'Y) = beta + (X'X)^(-1)(X'e)

is unbiased whenever the random vector e has mean 0. In regard to bias, it does not matter if the variances are unequal or even if there is correlation among the errors. These features only affect the variance of the estimates, causing biases in the standard errors for betahat but not in the estimates of beta themselves. In contrast to OLS, GARCH and IGARCH models are fit by maximum likelihood assuming a normal distribution. Failure to meet this assumption could produce bias in parameter estimates such as the estimated mean.

As a check to see if bias can be induced by nonnormal errors, data from a model having the same ht sequence as that estimated for the Dow Jones log differences were generated for innovations et ~ N(0,1) and again for innovations (et^2 - 1)/sqrt(2), so this second set of innovations used the same normal variables in a way that gave a skewed distribution still having mean 0 and variance 1. The mean was set at 0.00007 for the simulation, and 50 such data sets were created. For each data set, IGARCH models were fit for each of the two generated series and the estimated means were output to a data set.
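A sketch of how one of the skewed-innovation series might be generated is given below (the summary of the 50 simulated estimates follows). The startup variances, the seed, and the reuse of the fitted variance recursion from Output 5.8 are illustrative assumptions, not the authors' simulation code.

   DATA SIM;
      H1 = 1E-4;  H2 = 1E-4;  H3 = 1E-4;  E1 = 0;   /* arbitrary startup values */
      DO T = 1 TO 8892;
         H = H1 + 0.7078*(H1-H2) + 0.2224*(H2-H3) + 0.0698*E1**2;  /* variance recursion */
         Z = RANNOR(12345);
         E = ((Z*Z - 1)/SQRT(2)) * SQRT(H);   /* skewed shock with mean 0, variance H */
         DDOW = 0.00007 + E;                  /* drift plus shock */
         OUTPUT;
         H3 = H2;  H2 = H1;  H1 = H;  E1 = E;
      END;
      KEEP T DDOW;
   RUN;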
The overall mean and standard deviation of each set of 50 means were as follows:

                         Normal Errors    Skewed Errors
   Mean                  0.000071         0.000358
   Standard Deviation    0.0000907        0.0001496

Thus it seems that finding a factor of 5 bias in the estimate of the mean (of the differenced logs) could be simply a result of error skewness; in fact the factor of 5 is almost exactly what the simulation shows. The simulation results also show that if the errors had been normal, good estimates of the true value, known in the simulation to be 0.00007, would have resulted.

Using TYPE=INTEG specifies an IGARCH model for ht which, like any unit root model, will have a linearly changing forecast if an intercept is present. You thus use NOINT to suppress the intercept. Using P=2 and Q=1, your ht model has the form

   ht = ht-1 + 0.7078(ht-1 - ht-2) + 0.2224(ht-2 - ht-3) + 0.0698(e*t-1)^2.

You can look at ht as a smoothed local estimate of the variance, computed by adding to the previous smoothed value (ht-1) a weighted average of the two most recent changes in these smoothed values and the square of the most recent shock.

By default, PROC AUTOREG uses a constant variance to compute prediction limits; however, you can output the ht values in a data set as shown and then, recalling that ht is a local variance, add and subtract 1.96*sqrt(ht) from your forecast to produce forecast intervals that incorporate the changing variance. Both kinds of prediction intervals are shown in Output 5.9, where the more or less horizontal bands are the AUTOREG defaults and the bands based on ht form what looks like a border to the data. The data set MORE used for Output 5.8 and Output 5.9 has the historical data and 500 additional days with dates but no values of Dt. PROC AUTOREG will produce ht values and default prediction limits for these. In general, future values of all inputs need to be included for this to work, but here the only input is the intercept.
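The ht-based limits are one DATA step away from the OUT2 data set requested earlier; this minimal sketch assumes the names given in the OUTPUT statement (F for the predicted value, HT for the conditional variance).

   DATA OUT2;
      SET OUT2;
      LHT = F - 1.96*SQRT(HT);   /* lower limit based on the local variance */
      UHT = F + 1.96*SQRT(HT);   /* upper limit based on the local variance */
   RUN;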
The problem is not with IGARCH versus GARCH; in fact a GARCH(2,1) model also fits the series quite nicely but still gives an unacceptable estimate of the t mean 0 and variance 1. The AR(2) error series maximum likelihood estimates of are reasonable under rather mild assumptions for ARCH models even when the errors are not normal. Also the graphical evidence indicates that the estimated series has captured the variability in the data nicely. Proceeding on that basis, you sum the estimated series and use estimated autoregressive coefficients to estimate the standard deviation of the mean In this way you get 70000.0 which is not significant at any reasonable level. Interestingly, and despite the comments above, a simple t test on the data, ignoring all of the variance structure, gives about the same t. A little thought shows that this could be anticipated for the special case of this model. The summing of and division by n yields what might be thought of as an average variance over the period. Because the s are small here, the average of divided by n is To the extent that the squared a reasonable approximation of the variance of and thus of provide approximate estimates of the corresponding conditional variances residuals , )1 t mean. Additional care would be required, such as consideration of the assumed unit root structure for and the error introduced by ignoring the s, to make this into a rigorous argument. However, this line of reasoning does suggest that the naive t test, produced by PROC MEANS, for example, might be reasonable for these particular data. There is no reason to expect the naive approach to work well in general. − n(/ 2 )D − α t D( ∑n 1= t 1B the usual OLS formula, gives an estimate of the standard error of the ,t h h .5841000.0 t D = 64855.1 2− )3230.0 + 8680. − 1( . D α Z t h 1B 2988 as t h .h t ∑ 1= t 2− 2 B n ) 2 α − 1α − 1( t h n 5.0 t h = Z and that of is thus n t eh t = ∑ ) α − α − 1( 1= t n 5841000.0 t 2 ε )D 2 2 − t D( 1 .ε equal to t From it follows that the (approximate) variance of Hamilton (1994, p. 663) indicates that Z) 2 α − 1 α − 1( Z) 2 α − 1 α − 1( can be summed from 1 to n on both sides and divided by n to get n approximately is e t h ∑ 1= t 2B n t ε t ε+ 2− t Z 2 α + 1− Z1α = Z t .t D t mean of Note that the average of n independent values of has variance n if has =t h ∑ 1= t 2B 1B t t n n h h 256 SAS for Forecasting Time Series This example serves to illustrate several important points. One is that careful checking of model implications against what happens in the data is a crucial component of proper analysis. This would typically involve some graphics. Another is that failure to meet assumptions is sometimes not so important but at other times can render estimates meaningless. Careful thinking and a knowledge of statistical principles are crucial here. The naive use of statistical methods without understanding the underlying assumptions and limitations can lead to ridiculous claims. Computational software is not a replacement for knowledge. 5.2 Cointegration 5.2.1 Introduction In this section you study a dimension k vector of time series. The model (2 lags). It is assumed is called a vector autoregression, a “VAR,” of dimension k and order has a multivariate normal distribution with k dimensional mean vector 0, a vector of 0s, and that variance matrix The element of is the time series so the deviation of each series from its mean is expressed by the model as a linear function of previous deviations of all series from their means. 
This example serves to illustrate several important points. One is that careful checking of model implications against what happens in the data is a crucial component of proper analysis. This would typically involve some graphics. Another is that failure to meet assumptions is sometimes not so important but at other times can render estimates meaningless. Careful thinking and a knowledge of statistical principles are crucial here. The naive use of statistical methods without understanding the underlying assumptions and limitations can lead to ridiculous claims. Computational software is not a replacement for knowledge.

5.2 Cointegration

5.2.1 Introduction

In this section you study a dimension k vector Vt of time series. The model

   Vt = A1*Vt-1 + A2*Vt-2 + et

is called a vector autoregression, a "VAR," of dimension k and order p=2 (2 lags). It is assumed that et has a multivariate normal distribution with k dimensional mean vector 0, a vector of 0s, and k-by-k variance matrix Sigma. The ith element of Vt is the time series Yit - mu_i, so the deviation of each series from its mean is expressed by the model as a linear function of previous deviations of all series from their means. For example, the upper-left panel of Output 5.10 shows the logarithms of some high and low prices for stock of the electronic retailer Amazon.com, extracted by the Internet search engine Yahoo!

Output 5.10 Amazon.com Data with Cointegrating Plane

One way of fitting a vector model is to simply regress each Yit on lags of itself and the other Ys, thereby getting estimates of row i of the A coefficient matrices. Using just one lag you specify

PROC REG DATA=AMAZON;
   MODEL HIGH LOW = HIGH1 LOW1;
RUN;

where HIGH1 and LOW1 are lagged values of the log transformed high and low prices. The partial output in Output 5.11 shows the estimates: for dependent variable high, the coefficients of HIGH1 and LOW1 are 0.88411 and 0.11583; for dependent variable low, they are 0.45231 and 0.54209.

Output 5.11 PROC REG on Amazon.com Data

The estimated model becomes

   [Y1t - mu1]   [.8841  .1158] [Y1,t-1 - mu1]   [e1t]
   [Y2t - mu2] = [.4523  .5421] [Y2,t-1 - mu2] + [e2t].

Recall that in a univariate AR(1) process, Yt = alpha*Yt-1 + et, the requirement |alpha| < 1 was imposed so that the expression for Yt in terms of past shocks, Yt = sum(j=0 to infinity) alpha^j et-j, would "converge"—that is, it would have weights on past shocks that decay exponentially as you move further into the past. What is the analogous requirement for the vector process Vt = A*Vt-1 + et? The answer lies in the "eigenvalues" of the coefficient matrix A.

5.2.2 Cointegration and Eigenvalues

Any k-by-k matrix A has k complex numbers called eigenvalues or roots that determine certain properties of the matrix. The eigenvalues of A are defined to be the roots of the polynomial |mI - A|, where A is the k-by-k coefficient matrix, I is a k-by-k identity, and | | denotes a determinant. For the fitted 2-by-2 A matrix you find

   |mI - A| = (m - .8841)(m - .5421) - (.4523)(.1158) = m^2 - (.8841 + .5421)m + .4269 = (m - .9988)(m - .42740),

so the roots of this matrix are the real numbers 0.9988 and 0.42740. A matrix with unique eigenvalues can be expressed as A = ZDZ^(-1), where D is a matrix with the eigenvalues of A on the main diagonal and 0 everywhere else and Z is the matrix of eigenvectors of A. Note that A^2 = (ZDZ^(-1))(ZDZ^(-1)) = ZD^2 Z^(-1), so that A^L = ZD^L Z^(-1); with D = diag(0.9988, 0.4274), the elements of D^L, and hence of A^L, all converge to 0 as L grows. By the same reasoning as in the univariate case, predicted deviations from the means L steps into the future are Vhat_n+L = A^L Vn, where Vn is the last observed vector of deviations.
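The eigenvalues can be verified directly in SAS/IML (assuming that product is available); this is a check of the arithmetic above, not code from the book.

   PROC IML;
      A = {0.8841 0.1158,
           0.4523 0.5421};     /* fitted VAR(1) coefficient matrix */
      PRINT (EIGVAL(A));       /* returns 0.9988 and 0.4274        */
   QUIT;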
It is seen that at time responded to the jump in decreased, following the initial jump, to about 0.88. and increased to about 0.45 while Continuing through time, the two series come close together then descend very slowly toward 0. This demonstrates the effect of a unit shock to the log of high price. The equilibrium, 0 deviations of both series from their mean, is approached slowly due to the eigenvalue 0.9988 being so close to 1. would not decrease at all and the forecasts would not Clearly, if it were exactly 1.000, then converge to the mean (0). Similarly, any attempt to represent the vector of deviations from the mean in terms of an infinite weighted sum of past error vectors will fail (i.e., not converge) if the eigenvalues or roots of the coefficient matrix A are one—that is, if A has any unit roots. When all the eigenvalues of A are less than 1, we say that the vector autoregressive process of order 1, or VAR(1), is stationary, following the terminology from univariate processes. When the true A has unit roots, nonstandard distributions of estimates will arise just as in the univariate case. Note is uncomfortably close to 1, and that the largest eigenvalue of the estimated matrix here, it would not be at all surprising to find that the true A matrix has a unit root. The roots here are analogous to the reciprocals of the roots you found for univariate series, hence the requirement that these roots be less than 1, not greater in magnitude. L 000. 5.2.4 Roots in Higher-Order Models The requirement that the roots are all less than 1 in magnitude is called the stationarity condition. Series satisfying this requirement are said to be stationary, although technically, certain conditions on the initial observations are also required to ensure constant mean and covariances that depend only on the time separation of observations (this being the mathematical definition of stationarity). In higher-order vector processes, it is still the roots of a determinantal equation that determine the characteristic polynomial is stationarity. In an order 2 VAR, process satisfies the stationarity condition. ,t Regressing as above on lag 1 and 2 terms in the Amazon.com high and low price series, this estimated model  V 72931.0− 37180.0−  is found, where the matrix entries are estimates coming from two PROC REG outputs: 1< m and if all values of m that make this determinant 0 satisfy then the vector ,11 1 11 = t Y =t To illustrate, Output 5.12 shows a bivariate series with both and being 0 up to time mimicking constant high and low stock price (log transformed and mean corrected). At time is shifted to 1 with remaining at 0, thus representing a shock to the high price; that is, L +11 t 2 Y V ˆ t, 2 Y ,8899.0 2− t ,2 t 1 Y 1= t  41460.0− =ρ ˆ e + 2− V2 A + 1− V1A = V ;′ )0 ,1( t  85256.0 41522.0−   + 1− t   54532.0 L A = 11V LA = L+11V ˆ t 1 t,1 Y t V t 2 Y 70136.0  68489.0  , 2 A − m1A − I2m = . )0 ,1( = 11 V Y ′ t V ˆ t 1 Chapter 5: The ARIMA Model: Special Applications 261 Inclusion of lag 3 terms seems to improve the model even further, but for simplicity of exposition, the lag 2 model will be discussed here. Keeping all the coefficient estimates, the characteristic equation, whose roots determine stationarity, is matrices has a unit root. Fountis and Dickey (1989) show that if a vector AR process has a single unit has the same limit distribution as in the root, then the largest estimated root, normalized as univariate AR(1) case. 
Comparing to the 5% critical value the unit root hypothesis is not rejected. This provides a test for one versus no unit roots and hence is not as general as tests to be discussed later. Also, no diagnostics have been performed to check the model adequacy, a prerequisite for validity of any statistical test. is generated, then Using this vector AR(2) model, a bivariate vector of 0 deviations up to time a unit shock is imposed on the first component, the one corresponding to the high price, and the AR(2) used to extrapolate into the future. The code is as follows: 11 = t 80.1 0= − = )1 − 78799.0(905 = )1 − ρ(n ˆ 2 A − 1Am − I2m surprising to find that the characteristic equation ,)1 − ρ(n ˆ ,78799.0 =ρ ˆ Note that again, the largest eigenvalue, is very close to 1, and it would not be at all using the true coefficient |t| > rP 7530.0 8252.0 6000.0 1000.< 0340.0 |t| > rP 2133.0 7100.0 1000.< 1000.< 6832.0 eulaV t 11.241.164.3 82.41 30.2 eulaV t 79.061.306.9 81.9 81.1- rorrE dradnatS 41660.0 83170.0 41860.0 69860.0 33700.0 rorrE dradnatS 59560.0 81170.0 59760.0 77860.0 13700.0 hgih :elbairaV tnednepeD  41460.0−   72931.0− wol :elbairaV tnednepeD )48771. + m()76762.0 − m()47945.0 − m()78799.0 − m ( = 37180.0 41522.0−   85256.0 −  54532.0 etamitsE retemaraP 72931.037180.054532.0 68489.0 78410.0 etamitsE retemaraP 41460.041522.085256.0 70136.0 26800.0- − FD 1 1 1 1 1 FD 1 1 1 1 1 2wol 2hgih 1wol 1hgih tpecretnI 2wol 2hgih 1wol 1hgih tpecretnI elbairaV elbairaV 1 70136.0  m −  0 68489.0  0 1 2 31.5 tuptuO  fo ssecorP m 2 redrO ,3. 11− 262 SAS for Forecasting Time Series DATA SHOCK; Y12=0; Y22=0; Y11=0; Y21=0; DO T=1 TO 100; Y1 = .98486*Y11 + .23545*Y21 - .08173*Y12 - .13927*Y22; Y2 = .63107*Y11 + .65258*Y21 - .22514*Y12 - .06414*Y22; IF T=11 THEN Y1=1; OUTPUT; Y22=Y21; Y21=Y2; Y12=Y11; Y11=Y1; END; RUN; PROC GPLOT DATA=SHOCK; PLOT (Y1 Y2)*T/OVERLAY HREF=11; SYMBOL1 V=DOT I=JOIN C=RED; SYMBOL2 V=DOT I=JOIN C=GREEN; RUN; QUIT; The graph of this impulse response function is shown in Output 5.14. 41.5 tuptuO ledoM 2 gaL ,esnopseR eslupmI Chapter 5: The ARIMA Model: Special Applications 263 The addition of the second lag produces a more interesting pattern immediately following the shock to the high price logarithm series, but in the long run the series again approach each other and equilibrium deviation from the mean. descend in tandem to the The forecasts might not have returned to the equilibrium point if the true coefficient matrices rather than estimates had been used. The behavior in the estimated model could simply be the result of the highest estimated root 0.99787 being a slight underestimate of a root that is really 1. Notice that a number even slightly smaller than 1 will reduce to nearly 0 when raised to a large exponent, as happens when the impulse response is extrapolated into the future. Models that allow exact unit roots in vector processes will be discussed next. )0 ,0 ( )0 ,0 ( 5.2.5 Cointegration and Unit Roots An interesting class of models with exact unit roots is the class of cointegrated vector processes that can be represented in a type of model called the error correction model. Cointegration refers to a case in which a vector process, like the one with logarithms of high and low prices currently under discussion, has individually nonstationary components but there is some linear combination of them that is stationary. 
To make things a little clearer, suppose it is hypothesized that the ratio of high to low prices is stable; specifically, the daily price ratio series log(high/low) = log(high) – log(low) is stationary even though the log(high) and log(low) series each have unit roots. In this case, a shock to the high price series will result in an impulse response in which both series move as before, but they will not move back toward any historical mean values. Rather they will move toward some equilibrium pair of values for which log(high) – log(low) equals its long-term mean. You can check spread = log(high) – log(low) for stationarity with no new tools—simply create the daily spread series and perform a unit root test on it. Here is some code to do the test and to check to see if 3 autoregressive lags (and hence 2 lagged differences) are sufficient to reduce the errors to white noise. PROC ARIMA DATA=AMAZON; I VAR=SPREAD STATIONARITY = (ADF=(2)); E P=3; RUN; As shown in Output 5.15, the tests strongly reject the unit root null hypothesis and thus indicate stationarity. The zero mean test would be useful only if one is willing to assume a zero mean for log(high) – log(low), and since high > low always, such an assumption is untenable for these data. Also shown are the chi-square tests for a lag 3 autoregression. They indicate that lagged differences beyond the second, are unnecessary and the fit appears to be excellent. This also suggests that an increase in the bivariate system to 3 lags might be helpful, as has previously been mentioned. , 3− t Y − 2− Y t .90. 1= ) 783400.0 ()69.1( 25670.0 + e e 70. 1= ) 783400.0 () 69.1( 25670.0 − e 80. 1= 25670. 0 205 = n Output 5.15 Stationary Test for High-Low Spread 264 SAS for Forecasting Time Series is stationary according to the unit roots tests. That It appears that means standard distribution theory should provide accurate tests since the sample size is not too small. In that light, notice that the mean estimate 0.07652 for spread is significantly different from 0. An estimate of the number toward which the ratio of high to low prices tends to return is with a 95% confidence interval extending from to You conclude that the high tends to be 7% to 9% higher than the low in the long run. 
You see that testing for cointegration is easy if you can prespecify the linear combination—e.g., spread = log(high) - log(low). Often one only suspects that some linear combination Y1t - beta*Y2t is stationary, where (Y1t, Y2t) is a bivariate time series, so the problem involves estimating beta as well as testing the resulting linear combination for stationarity. Engle and Granger (1987) argue that if you use regression to estimate beta, your method is somewhat like sorting through all linear combinations of log(high) and log(low) to find the most stationary-looking linear combination. Therefore if you use the standard critical values for this test as though you knew beta from some external source, your nominal level 0.05 would understate the true probability of falsely rejecting the unit root null hypothesis. Their solution was to compute residuals rhat_t = Y1t - betahat*Y2t from a least squares regression of Y1t on Y2t and run a unit root test on these residuals, but then to compare the test statistic to special critical values that they supplied. This is a relatively easy and intuitively pleasing approach; however, it is not clear which of two or more series to use as the dependent variable in such a regression. More symmetric approaches were suggested by Stock and Watson (1988) and Johansen (1988, 1991). Stock and Watson base their approach on a principal components decomposition of the vector time series, and Johansen's method involves calculating standard quantities, canonical correlations, from a multivariate multiple regression and then figuring out what distributions these would have in the vector time series case with multiple unit roots. Both strategies allow testing for multiple unit roots. For further comparisons among these approaches and an application to a macroeconomic vector series, see Dickey, Janssen, and Thornton (1991).

5.2.6 An Illustrative Example

To get a little better feeling for cointegration, consider a bivariate VAR(2) system with known coefficients in which Y1t has mean 15 and Y2t has mean 10. Suppose Y1t = 15 and Y2t = 10 up to time 11, where a shock takes place. What happens after time 11 if no further shocks come along? That is, what does the impulse response function look like?
Output 5.16 Impulse Responses
Using the A matrices ,)31,51( ,) 7 ,51( 11 = t ,11 ,t 1= m 0 = 2 A − 1A − I e + 1− V∇2A − 1− V) 2A − 1A − I(− = V∇ =t t ) 2Y , 1Y ( .2 )01,51( Y t = ) 2Y , 1Y( − 70.0 88.0   01 − − t Y   66.1 +   51 − − t 1   42.0− β t 01 51 01 51 α 1 1 5 1− t 1 1  10.0   10.0 =  40.0−   40.0− ,2 , Y t,2 t, 1 V 1 1 Y t ,2 t ,1 t, 2 Y Y Y 40.0 2 A − m1A − mI ) 2 A − 1A − I ( 2 1− V − V = V∇ e + 2− V2 A + 1− V1A = V ,′ ,1 10.0 1 βα = Π Y 1 t,1 Y 1( , 2Y , 1  01 − 60.0−  =   51 − ,t  (     48.1 10.0 40.0 40.0 40.0 10.0 10.0 Y  = −  = −    −  ∇ − =  ∇  10.0−  − = Π  40. t 0 t t2 t1 Y Y t2 t1 t Y Y   t .)21,71( kborP eulaV F 4-1RA 2815.0 18.0 6662.0 13.1 F>borP 1000.< 1000.< F>borP 1000.< 1000.< 905 = n eulaV F 1HCRA 74.441 60.91 F>borP eulaV F 3-1RA 7485.0 56.0 7504.0 79.0 52.81 19.4 skcehC citsongaiD ledoM etairavinU eulaV F skcehC citsongaiD ledoM etairavinU skcehC citsongaiD ledoM etairavinU 04.9 qSihC >borP 1000.< 1000.< = )85810.0(605 = ])3021.0 − 1(gol − )8301.0 − 1(gol[)3 − n( 3450.0 6450.0 veDdtS F>borP eulaV F 2-1RA 5617.0 33.0 4929.0 70.0 H∇ qSihC ytilamroN 54.964 39.28 . tL10.1 erauqs-R 0081.0 8550.0 − t H F>borP eulaV F 1RA 1789.0 00.0 9478.0 20.0 )1(WD 89.1 89.1 elbairaV t wol hgih L∇ t L− H elbairaV elbairaV t )1− ,1(φ wol hgih wol hgih = ′β )1− ,1(φ = ′β 280 SAS for Forecasting Time Series The fitted model implies one common trend that is a unit root with drift process and one cointegrating vector. The last bit of code requests forecasts using the VAR(3) in rank 1 error correction form. These are put into an output data set, a few observations from which are shown in Output 5.17f. An additional complication with these data is that the market is closed on the weekends, so the use of the actual dates as ID variables causes a missing data message to be number as an ID variable, thus making the produced. An easy fix here is to use implicit assumption that the correlation between a Monday and the previous Friday is the same as between adjacent days. A portion of these data, including standard errors and upper and lower 95% confidence limits, is shown. Output 5.17f VARMAX on Amazon.com Data, Last Part noitavresbo = t 40710.5 04669.4 92898.4 50169.4 27989.4 1ICU 04829.4 73578.4 48018.4 47568.4 56819.4 2ICU qSihC >borP 2200.0 40795.4 02436.4 22486.4 89647.4 56577.4 1ICL ATEB tneiciffeoC nuR-gnoL detcirtseR rof tseT 88205.4 60835.4 99795.4 88256.4 08507.4 2ICL FD 1 51701.0 57480.0 16450.0 16450.0 16450.0 erauqS -ihC 04.9 1DTS 55801.0 50680.0 03450.0 03450.0 03450.0 1 ymmuD 00000.100000.1 1 ymmuD 00000.100000.1 1 ymmuD 68782.0 64770.02DTS 1SER ATEB no sisehtopyH ot tcepser htiw AHPLA tneiciffeoC tnemtsujdA ATEB ot tcepser htiw H xirtaM noitcirtseR ATEB no sisehtopyH ot tcepser htiw ATEB tneiciffeoC nuR-gnoL . . . 391750.0079910.0eulavnegiE 3021.0 . . . 604640.0436850.02SER elbairaV elbairaV elbairaV 40708.4 03008.4 52197.4 20458.4 86288.4 wol hgih wol hgih wol hgih 1ROF tcirtseRnO eulavnegiE 8301.0 46517.4 17607.4 24407.4 13957.4 22218.4 2ROF . . . 28697.4 27268.4 hgih . . . 09217.4 95357.4 wol xednI 1 215 115 015 905 805 t sbO 215 115 015 905 805 sbO 215 115 015 905 805 Output 5.17e VARMAX on Amazon.com Data, Part 6 Chapter 5: The ARIMA Model: Special Applications 281 You can observe the quick spreading of confidence intervals, typical of data whose logarithms contain a unit root. The fact that the unit root is in some sense shared between the two series does not do much to narrow the intervals. 
The drift in the underlying unit root process, or common trend, is apparent in the forecasts. The short-term dynamics do not seem to contribute much to the forecasts, suggesting that the last few observations were quite near the cointegrating plane. (See Output 5.18.)

Output 5.18 Forecasts Using Cointegration

Chapter 6: State Space Modeling

6.1 Introduction

In ARIMA modeling, one of the difficult tasks is to select a model. Also, if you have several related time series, they must satisfy some restrictive conditions in order to justify the kind of transfer function modeling that is available in PROC ARIMA. There must be no feedback, and, for proper identification and forecast intervals, multiple inputs must be independent of each other and enough differencing must be specified to render the series stationary. PROC STATESPACE allows estimation under less restrictive conditions and provides some automatic model specification ability, although the user is still responsible for making the series stationary. In Chapter 5, another procedure, PROC VARMAX, was discussed. This procedure also handles multiple series and, unlike STATESPACE, can perform cointegration analysis, which is appropriate when your series display unit root nonstationarity but some linear combination of the series is stationary. In other words, the transformation to stationarity is not just differencing.

The basic idea in state space modeling is to discover the "state vector." The state vector consists of the current values of all series under investigation plus enough forecasts into the future so that all forecasts, no matter how far away, are linear combinations of these.

6.1.1 Some Simple Univariate Examples

To get started, here are some models, all with mean 0, and their forecasting equations. As is customary in discussing state space models, the symbol Yhat_t+L|t denotes a forecast of Yt+L using information available at time t. In model discussions in this section, the default assumption is that the mean has already been subtracted.

Table 6.1 One-, Two-, and Three-Step-Ahead Prediction for Different Models

   Name        Formula                                Yhat_t+1|t           Yhat_t+2|t                      Yhat_t+3|t (and beyond)
   AR(1)       Yt = a*Yt-1 + et                       a*Yt                 a^2*Yt                          a^3*Yt; a^L*Yt for lead L
   AR(2)       Yt = a1*Yt-1 + a2*Yt-2 + et            a1*Yt + a2*Yt-1      a1*Yhat_t+1|t + a2*Yt           a1*Yhat_t+2|t + a2*Yhat_t+1|t
   MA(2)       Yt = et + b1*et-1 + b2*et-2            b1*et + b2*et-1      b2*et                           0 for L > 2
   ARMA(1,1)   Yt = a*Yt-1 + et + b*et-1              a*Yt + b*et          a(a*Yt + b*et)                  a^(L-1)(a*Yt + b*et)

Numerical examples and further discussion of models like these appear in Section 6.2. A "linear combination" of a set of variables is a sum of constant coefficients times variables. For example, 2X + 3Y and 5X - 2Y are linear combinations of X and Y. Notice that 6(2X + 3Y) - 4(5X - 2Y) is automatically also a linear combination of X and Y; that is, linear combinations of linear combinations are themselves also linear combinations. Note that 0X + 0Y = 0 is also a valid linear combination of X and Y. Considering Yt, Yhat_t+1|t, ..., Yhat_t+k|t to be the variables and considering functions of model parameters, like a1*a2, to be constants, the state vector is defined to be

   Zt = (Yt, Yhat_t+1|t, Yhat_t+2|t, ..., Yhat_t+k|t)',

with k the smallest value such that all remaining forecasts are linear combinations of the state vector elements. For the AR(1) model, all forecasts are linear combinations (multiples) of Yt, so the state vector is just (Yt). For the AR(2), the state vector is (Yt, Yhat_t+1|t)'. It can't be just (Yt) because Yhat_t+1|t = a1*Yt + a2*Yt-1 involves Yt-1, whose value cannot be determined from Yt. However, Yhat_t+2|t = a1*Yhat_t+1|t + a2*Yt is a linear combination of state vector entries, and Yhat_t+3|t = a1*Yhat_t+2|t + a2*Yhat_t+1|t is a linear combination of them too. The expressions get more complicated, but by the "linear combination of linear combinations" argument it is clear that all forecasts are linear combinations of Yt and Yhat_t+1|t. You can see that for an AR(p) the state vector will have p elements.
For moving averages it is assumed that current and past s can be well approximated from the observed data—that is, MA models need to be invertible. Acting as though s that have already occurred are known, it is clear from the MA(2) example that for an MA(q) model, forecasts more than q steps ahead are trivial linear combinations (0) of state vector elements. Finally, for mixed models the forecasts are eventually determined through autoregressive type recursions and, by the linear combination of linear combinations argument, must be linear combinations of state vector elements from that point on. The state vector contains all the information needed to forecast into the infinite future. During an early space shuttle mission in which the landing was broadcast, the mission control engineers were heard to say, “Your state vector is looking good.” What did that mean? Numerical measurements of height, velocity, deceleration, and so forth were being taken, and from them, forecasts of the flight path into the future were being computed. Of course these state vector entries were being updated quickly and state space forecasting is based on this updating idea. At time the state vector will STATESPACE and it is the thing that you are trying to estimate from the data. In the space shuttle example, if the elements of the state vector included height, deceleration, and location information, then a state vector that “looked good” would be one whose projections forecast a landing on the runway. .) 1+ t 1+ k + tY , ˆ , 1+ t 3 + tY , 1+ t 2 + tY , 1+ tY ( ˆ ˆ t be updated to The updating equation is the model in PROC ,1 +t e , t 1+ t Y2 α + ) Y2α + ˆ t . t 1+ tY ˆ t 1+ t Y1α( 1α ˆ t Y … t 3 + tY ˆ and is a linear combination, of them too. The expressions get more , Y2α + t t 1+ t Y1α t 2 + tY ˆ .t Y determined from However, is a linear combination, 1− t Y t |1+ tY ˆ t Y .) t 1+ tY , tY ( ˆ state vector is It can’t be just because involves whose value cannot be of state vector entries, Y ,t Y forecasts are linear combinations (multiples) of so the state vector is just ( ). For the AR(2) the 0 = Y0 + X0 t 1+ t Y2α + ˆ t ) eβ + Yα( α t t 2+ t … t Y1α ˆ Y 3α 2 ,2 ,1 3 + tY ˆ 0 =L t Y2α + ) eβ + Yα(α t t L + tY ˆ t 1+ t t t t Y1α ˆ Y2α t 2 + tY e2 β , Lα ˆ t Y 1− Y 2 α + Y1α = 1+ Y ˆ Yα t 1− te2 t t eβ + Yα β + e1β t t 1+ tY t t t t t ˆ e+ 2 − te2 Y2 − X5 1− e β + e + 1− Yα = β + 1− e1β + e = … 2− t t Y 2 α + 1− Y1α = ) Y2 t t t t t t e + 1− tYα = − X5(4 − ) Y3 + X2(6 alumroF Y3 + X2 k >L t t t t Y Y Y Y )1,1(AMRA )2(AM )2(RA L + tY ˆ t e )1(RA emaN Chapter 6: State Space Modeling 285 6.1.2 A Simple Multivariate Example Now suppose a vector process is of interest. An easy case to consider is  t e   −t  +  t e   −t 1 2 2 ,1 This is a vector autoregressive model, VAR, of dimension 2 (2 elements) and of order 2 (maximum lag is 2). The state vector is  t +t  Y  ˆ    t =  t1    To see why this is the case, first note from the bottom row in the model equation that the one-stepahead predictor is clearly a linear combination of state vector elements and thus does not need to be included in the state vector. Next, note that if the best predictor is used and the coefficients are known as is assumed here, then the forecast will differ from only by ,t which is the error term that, at time t, has yet to be realized. 
The same is true for you thus have t2  9.0−  +t e    1  +t e   0 1 1 ,2 ,1 .1+ t Z  t +t  Y   3.1 36.0− 3.1  ˆ      0 + t Y  0 7.0    t   0  Y  1 1  t Z These three equations show how to update from  + t +t Y  94.0−   ˆ      1.0 =  +t Y    +t 1     0 to You have , 2 + t,1 e+ t1 1+ t Y4.0 Noting from the top row of the model equation that seen that forecasting one step ahead using information available up through time − 1 + t,2Y9.0 t1 − Y4.0 − ) 1+ t,2e + 1 + t,1Y3.1 1 + t,2e9.0 = − 2 + t,1 1 + t,1e3.1 Y t2 Y 7.0 + t1 1.0(9.0 − ) 1+ t,1e + Y + 1 t1 ,2 ,1 + t,1Y3.1 ,1 t1 Y4.0 − 1+ t,2Y9.0 − 1+ t,1 3.1 = 1+ t 2 + t,1Y Y ˆ ˆ + 1 + t,2e t2 Y36.0 − t1Y94.0− + 1+ t t2 Y7.0 + t1Y1.0 Y 7.0 + t1Y1.0 EG + ZF = 1+ Z 1 + t,1e t1 + t,1Y(3.1 + ˆ t t1 mrof eht sah hcihw + t,1Y = = ˆ 1 = 1 1 = = t 2 t1 ,2 , 1 1 Y + t,1 Y + t,2Y + t,2Y ˆ using it is would produce 1 + t,1 2Y Y t1 + t,1Y ˆ 2 ,2 Y  0 Y  0 4.0 0 − t2   − t Y   7.0 +   − t 1   9.0− Y7.0 + t1Y1.0 1 1 ,2 , Y = t1 + t,2Y ˆ  1.0  =  1 2 3.1  Y Y ,1 t2 t1 ,1 Y Y t Z  , 1+ t,1 and e 286 SAS for Forecasting Time Series This looks quite a bit like a vector autoregressive model, and you might think of the state space approach as an attempt to put all vector ARMA processes in a canonical form that looks like an AR(1), because it happens that every possible vector ARMA process of any dimension can be cast While this eliminates the problem of identifying the into the state space form autoregressive and moving average orders, it introduces a new problem—namely, deciding from the observed data what elements are needed to construct the state vector. Prior to discussing this new problem, a simulation of 2000 values from this bivariate VAR model is used to produce some state space output. The data set TEST contains the variables Y and X, corresponding to Y2t and Y1t , respectively. The code PROC STATESPACE DATA=TEST; VAR Y X; RUN; .1+ t is all that is needed to produce the results in Output 6.1. Output 6.1 PROC STATESPACE on Generated Data eulaV t 50.8356.06 68.43 16.0121.4136.067.61 40.4 ledoM dettiF dna mroF ecapsetatS detceleS erudecorP ECAPSETATS ehT )T;1+T(Y 86443.1 2810.01 xirtaM noitisnarT fo etamitsE noitavonnI rof xirtaM ecnairaV noitavonnI rof xirtaM tupnI 40858.01 0 7179.0 564810.0 rorrE dradnatS 055220.0 908120.0 975830.0 053450.0 332730.0 469820.0 347040.0 869720.0 setamitsE retemaraP rotceV etatS 4675.0630386.0 0 )T;T(X etamitsE 40858.0147223.1 086443.1 04675.057525.002810.0630386.0 888211.0 147223.1 0 1 564810.0 814830.1 57525.0888211.0 0 retemaraP )T;T(Y EG + Z F = 1+ Z )2,3(G )1,3(G )3,3(F )2,3(F )1,3(F )3,2(F )2,2(F )1,2(F t t Chapter 6: State Space Modeling 287 to force a one-step-ahead Y predictor and no predictions of future X to enter the state vector. Of course this assumes the unlikely scenario that the user has some prior knowledge of the state vector’s true form. The matrix F is referred to as the transition matrix and G as the input matrix in the output. Comparing the true and estimated F matrices you see that and for the input matrix G Entries of 0 or 1 are known once the state vector has been determined. They are structural parameters that do not require estimation. No elements of F or G are more than 2 standard errors away from the true values, and all estimates are quite close to the true values both numerically and statistically. 
Knowing that the estimate –0.02 is in fact an estimate of 0, you would expect its t statistic to be smaller than 2 in magnitude, which it is You might want to drop that term from your model by forcing its coefficient to 0, using the statement RESTRICT F(2,3)=0; . )36.0 in the PROC STATESPACE step to restrict that row 2, column 3 element to 0. Doing so produces the results in Output 6.2.   43.1  20.0−   1 85.0− 86.0 0  35.0−   =F 11.0  ˆ    68.0B  1   0 0 − = t( 23.1   = 0 dna  ˆ )T;1+T(Y )T;T(X )T;T(Y 1  dna notation user could specify using the X Y variable names. Had this not been the case, the , t 1+ t,1Y ˆ , t2Y , t1 Y The state vector has been correctly identified as containing and as is seen in the 1 X 2 Y MROF   3.1  0   1 G 36.0− 7.0 0   9.0−  1   0  94.0−   = 1.0   3.1   =  1  0 0 G F Output 6.2 RESTRICT Statement in PROC STATESPACE 288 SAS for Forecasting Time Series The estimated elements of F and G are again close to their true values. Plots of both series and their forecasts are seen in Output 6.3. 30.8326.06 74.44 18.3106.7170.75 15.72 eulaV t 255220.0 618120.0 678920.0 223340.0 799820.0 593210.0 964300.0 rorrE dradnatS 25758.0406223.1 525823.1 62895.082015.0324707.0 944590.0 etamitsE )2,3(G )1,3(G )3,3(F )2,3(F )1,3(F )2,2(F )1,2(F retemaraP setamitsE retemaraP 288179.0 244810.0 244810.0 714830.1 noitavonnI rof xirtaM ecnairaV 25758.01 0 406223.1 0 1 noitavonnI rof xirtaM tupnI 525823.1 0 1 62895.0324707.0 0 82015.0944590.0 0 xirtaM noitisnarT fo etamitsE )T;1+T(Y )T;T(X rotceV etatS ledoM dettiF dna mroF ecapsetatS detceleS erudecorP ECAPSETATS ehT )T;T(Y Chapter 6: State Space Modeling 289 Output 6.3 Forecasts for Generated Data The forecasts seem to have a little more interesting structure than some you have previously seen. This has to do with the nature of the roots of the characteristic equation. As in a univariate series, the behavior of a VAR of the form t t is determined by the roots of a “characteristic equation,” and the same is true for a vector ARMA. is a vector of random normal variables that can be contemporaneously correlated in any Here arbitrary way, but must be uncorrelated across time. is a dimension k vector of deviations from means. For these might be the time t deviations of GDP, unemployment, and interest rates from their long-term means, and each is a matrix of parameters to be estimated. The characteristic equation involves the determinant p Y k× k E + p − tYp A + j A A− − 2 A 2 − p m − 1A1− p m − I p m + 2 − Y 2A + 1− Y1A = t 3= k t t Y t E 290 SAS for Forecasting Time Series and the values of m that make this 0 are the roots. In the VAR example currently under study,  m 7.0 − m 2   m9.0 4.0 damped sinusoidal component with damping rate and period time periods as you forecast L periods ahead. This seems consistent with the graph. At a lead of around the forecasts hit local low points as they did at the end of the data. Each low is about or about 1/3 of what it was then. All of these roots are less than 1 in magnitude, this being the stationarity condition. Some sources write the characteristic equation in ascending powers of m, namely, 1.41 whose roots they then require to all be greater than 1 in magnitude for stationarity. In a VAR of order 2 you have t which is sometimes written in a matrix form like this:  I   1− Y   1A 2 A   Y    =  E  0  +  2−     Y I 0   1− Y  t t This simply says that and so it consists of a trivial identity and the original AR(2). 
If you substitute the A matrices of the current example, you have ,t which looks somewhat similar to the state space representation; however, this system has dimension been of full rank, the system would have had full 4, not 3 as you had previously. Had the matrix rank, the state vector would have had dimension 4, and the above representation would be another type of state space representation of the vector process. It would differ from the process used in PROC STATESPACE in that current and lagged Ys, rather than current Ys and predictions, constitute the state vector elements. As it stands, the system is not full rank and can be reduced by simply eliminating the second row and second column of the coefficient matrix. That second row produces the trivial identity which, of course, is true whether you put it in the system 2 = 5.52 / 063 5.52 = )71538. / 7893.( natA = θ .i7893.0 E + 1− Y1A + 2− Y2A = Y L 53948.0 ± 71538.0 whose roots are 0, 0.32966 and the complex pair representation so ,) ) 53948.0 + + m3.1 − m1.0− t m6633076.1 − 2m ( )28787324366923.0 − m ( m = 2 t    e      te   + 0     0   m 2 1  0  =  0 A t 2 2 1 1 2 m90.0 + ) m7.0 − 2m ( )4. + m3.1 − 2m ( = − t Y   7.0   − t Y  9.0−  −t Y   1  − t,1 ,2 ,2 ,1 4.0 0 t Y  , 1− t,2 0= t − θ nis i ± θ soc ( 53948.0 Y=   7.0 −  9.0− 0 p )82.0 − Ap m − 1 − t,2 1− t 1.0 E + 2− Y2 A + 1− Y1A = Y 3.1 Y = 1− Y Y 0 1 t m4.1 + 2m2 − 3m ( m = 1 1.0  m −  0 0 0 0 0 t t 3.1  − 2 A 2 m − 1Am − I  t  0    Y     4.0−   tY   =  −t  0 Y      −t  0 ,23.0 t = 1 2 0 1 1  41 2 1 ,2 t ,1 53948.0 Y m t The complex pair of roots has degrees, implying a ,41 =L Chapter 6: State Space Modeling 291 or not. The second column is all 0s, so leaving out the second row and column makes no real change in the system. The resulting reduction gives  t   −t  e   t   −t e +     0   −t 2 ,1 again having a familiar form The first row gives a trivial identity, but it is needed to make the system square. This system is observationally equivalent to the system in that, for any e sequence and any given initial values, the two systems will produce exactly the same sequence of Ys. In the theoretical research on STATESPACE methods there are several ways to formulate the state vector, as has been demonstrated. The size of the state vector Z and the general form of the updating recursion is the same for all ways of writing a state vector. The entries of Z and of the matrices F and G depend on the particular formulation. that arises from a vector ARMA satisfies a recursive relationship of the form In PROC STATESPACE the state vector always consists of the current and if you have 3 observed series at each time t—along with observations—say, predictions into the future. For example might contain and predictors up to two steps ahead. t2 t, t How do you decide what to put in the state vector? Returning to the bivariate VAR of order 2 that is being used as an example, it is possible from the model to compute the autocorrelation between any and for the same or different vector elements at the same or different times. Σ + )2(Γ2A + )1(Γ1A = )0(Γ 0=j . 0>j for For you find )2 + j −( Γ2A + )1 + j −( Γ1A = } − ′ j t Y 2− t Y { E2A + } − ′ j t Y j −t ′ Y Multiplying the AR(2) on both sides by 1− t and taking expected values, you see that t Y } j +t ′ Y Y{E = )j (Γ t defined as an expected value, namely, you define (assuming has mean 0). 
,) j ( Γ , j +t ′ Y t Y The covariance matrix between column vector and row vector Σ The data were generated using the innovations variance matrix 1  0 defined as symbolized as is 1 Y There will be no “gaps”; that is, if the two-step-ahead predictor is included, so must be the . t2 + t,1 Y , t1 + t,2 Y , )s t1 + t,1 = t( Y 3Y ,t + 1Y Y 1Y , t2 t Z ,t t Z Y { E1A t 3 Y )j = = i( , t2 } −t ′ j Y 1Y Y Y{E 0   2e  =  raV = Σ 1   1e  ,t Z Every state vector 4×4 2 1 .t EG + 1− Z F = Z 1 1 ,2 ,1  Y  7.0    9.0− Y  t Y  3×3 0 t 1.0 3.1 1  t 0    Y   t 4.0− = Y     −t 1 0 t .1+ t t t EG + Z F = 1+ Z = )j 1 sj 1 2 − (Γ t Y , Y     ti t Y 292 SAS for Forecasting Time Series These, then, constitute the multivariate Yule-Walker equations that can be solved to give all the covariances from the known A coefficient matrices and the innovations variance matrix Thus it covariance matrix M between the vector would be possible to compute, say, the these identifying the columns of M, and the vector these identifying the rows. That matrix would have what is known as a block Hankel form:   )4(  )3(   )2( . State space researchers describe M as the covariance matrix between a set of current and lagged Ys and a set of current and future Ys. For such a matrix M, the following numbers are all the same: 1. The size of the state vector 2. The rank of the covariance matrix M 3. The number of nonzero canonical correlations between the set of current and lagged Ys and the set of current and future Ys. Items 2 and 3 are always the same for any covariance matrix. (See the PROC CANCOR documentation for more information on canonical correlations.) Thus the size of the state vector and the nature of the corresponding state space equations can be deduced by studying the covariance matrix M. With only data, rather than a model with known coefficients, the covariances must be estimated. The strategy used is to fit a long vector autoregression whose length is determined by some information criterion, and then to use the fitted model as though it were the true structure to construct an estimate of M. The initial autoregressive approximation provides an upper bound for the size of M. Returning to the order 2 VAR with known coefficient matrices, by substitution, you can see that these matrices satisfy the multivariate Yule-Walker equations. )j ( Γ These in turn lead to a matrix M formed by stacking together the form previously suggested, namely, matrices in the block Hankel Γ Σ 83.1  70.7 24.6−   = )2( 68.52  Γ  26.0−  85.6 73.2  40.5 37.8−   = )4 ( 41.0−  18.2−   = )1( 74.53  6×6 ,) 2 − t,2Y ,) 2 + t,2Y 2 ′ Α)2 Γ Γ 0>j − j 2 2 − t,1Y + t,1Y ( Γ Γ Γ Γ 1A)1  23.0 45.7 71.3   85.1 + ′ 1 1 )3(Γ )2 (Γ Γ )1( − t,2Y + t,2Y 64.8−  ,) j (′ − j 92.93  60.31  85.1 )2(Γ    )1(Γ = M  Γ )0( 1 1 ( Γ = ) j − (Γ − t,1Y + t,1Y Now so for you have Γ = ) j (Γ   = )0(  = )3( t2 t2 Y Y Γ Γ t t1 1Y( Y( Chapter 6: State Space Modeling 293 You can diagnose the column dependencies and rank of matrix M using a clever trick. If any column of a matrix is a linear combination of some others, then a regression of that first column on those others (no intercept) will fit perfectly. Regressing column 2 of M on column 1 and column 3 on columns 1 and 2, you find nonzero error sums of squares, indicating that columns 1, 2, and 3 form a linearly independent set. Regressing any other column on columns 1, 2, and 3 gives a perfect fit and so shows that the rank of matrix M is 3. 
For example, you find by regression that so that the covariance between the column 4 variable, and any future Y is the same as that between and that same future Y. Linear forecasts such as we are considering are functions only of the covariances. Therefore, adding to a set of predictors that already contains and does not add any more prediction accuracy. Note that the second row of the state space transition matrix F is 0.1, 0.7, 0, so the same regression that displayed the dependency gives the corresponding row of F. On the other hand, column 3 is not a perfect linear combination of columns 1 and 2. You get a positive error mean square when regressing column 3 on columns 1 and 2. Regression reveals that (column 5)= –0.49(column 1)–0.63(column 2)+1.3(column 3) with 0 error sum of squares. Again note that the coefficients give a row of F. Column 5 is associated with so even though you columns, 4 and 5, thus far considered are the first columns associated with series 1 . t2 t that show dependencies. These dependencies reveal the number of forecasts of each series that appear in the state vector (one less than the lag number associated with the dependent column) and the row of the F matrix associated with the last occurrence of that series in the state vector. Once the first dependency in each variable has been discovered, the state vector has been completely determined and no further investigation is needed. Column 6 is automatically a linear combination of columns 1,2, and 3 at this point. A perfectly fitting regression corresponds to a canonical correlation 0 in matrix M. In particular, you can build a sequence of matrices by sequentially appending columns of M. When you use the first four columns of M you will get a 0 canonical correlation, but not before. That tells you the fourth column, and hence is redundant information. Leave out that redundant fourth column and consider a matrix consisting of column 1, 2, 3, and 5 of M. If that matrix had no 0 canonical correlations, then (associated with column 5) would have been included in the state vector, but in this example, the addition of column 5 also produces a 0 canonical correlation. Since dependencies for both series have been discovered, you need not look any further. When estimated covariances are used to get an estimated M matrix, that matrix, will almost certainly be of full rank, possibly with some small but nonzero canonical correlations. What is is consistent with the needed is a statistic to decide if a small estimated canonical correlation in , ˆ t2 2 Y M + t,1 Y M ˆ , t1 + t,2 + t,1 Y Y t1 + t,1 Y needed in the state vector, there is nothing to be gained by including   26.0−   85.6  23.0   45.7  83.1   70.7 , 2 + t,1 2Y 7.0 Y 37.8− 24.6 41.0− 64.8− 60.31 68.52 + − t 1Y1. 23.0 45.7 83.1 40.5 73.2 70.7 0 t 2 Y 18.2 60.31 64.8− 24.6− 68.52 74.53 − t 1 , )3 nmuloc ( 0 + )2 nmuloc ( 7. + )1 nmuloc (1.0 = )4 nmuloc ( Y 83.1 40.5 85.1 70.7 73.2 71.3    )4 ( 18.2−    = )3( 74.53     )2 (  68.52 85.1 24.6−  92.93    Γ Γ Γ )3(Γ )2 (Γ Γ )1( )0( )2(Γ    )1(Γ = M  Γ , 1+ t,2 Y t1 + t,2 Y Y The dependent and with series 294 SAS for Forecasting Time Series hypothesis that M has corresponding true canonical correlation 0. A criterion DIC to do so has been proposed by Akaike. If you build matrices as described above by appending columns of then the DIC criterion is expected to be negative when the column just added introduces an approximate dependency. 
That column would then be omitted from the matrix being built, as would all columns to that correspond to lagged values of that series. Then the appending would continue, its right in using only columns of the other series, until dependencies have been discovered in each of the series. Like any statistical criterion, the DIC is not infallible and other tests, such as Bartlett's test for canonical correlations, could also be used to test the hypothesis that the newly added column has introduced a dependency in the system. Now, if there are moving average components in the series, things become a little more complicated and, of course, estimates of the elements of G are also needed. But if you have followed the example, you have the idea of how the STATESPACE procedure starts. The long autoregression is run, the estimated M matrix is computed from it, the rank is diagnosed, and initial elements of F and G are computed by treating estimated covariances as though they were the true ones. Thus the initial estimates of F and G fall into the “method of moments” category of estimates. Such estimates are approximate and are often used, as is the case here, as starting values for more accurate methods such as maximum likelihood. Another nice feature of the maximum-likelihood method is that large sample approximate standard errors, based on the derivatives of the likelihood function, can be computed. Examples of these standard errors and t tests were seen in Output 6.1 and Output 6.2. Additional numerical examples and discussion are given in Section 6.2. Some ideas are reiterated there and some details filled in. The reader who feels that Section 6.1 has provided enough background may wish to move directly to Section 6.3. The following section is for those interested in a more general theoretical discussion. , ˆ t 6.1.3 Equivalence of State Space and Vector ARMA Models A general discussion of the state space model, under the name “Markovian representation,” is given by Akaike (1974). The following summarizes a main idea from that paper. represent a dimension k vector ARMA process with mean vector 0, and let be an Let uncorrelated sequence of multivariate, mean 0 normal variables with variance matrix The ARMA process is so substituting in the original expression gives ) q −1− t , −E j t j j ∑= Y t substitution produces a convergent expression only in terms of the E vectors, say, ψ 0= ∞ . 1− t which involves current and lagged E vectors and Y vectors prior to time q −1− t q − tE q B − − 1− tE1B − tE + p − tY p A + EqB − − 2− tE1B − 1− tE + p −1− tY p A + E qB − − 2 − tE1B − 1− tE + p −1− tY p A + + 2 − Y1A ( 1A = + 2− tY 2 A + + 2 − Y1A = 1− Y t t 1− t At time , Repeated back E M . Σ q− t E qB − ) q ,p ( − 1− tE1B − t E + p − tY p A + + 1− Y1A = t M ˆ ) q ,p ( t t t Y Y t Y Chapter 6: State Space Modeling 295 What should be the size of the state vector—that is, what should you use as the subscript M? To the model becomes answer that, you look for dependencies. At time q−L +t For leads L exceeding the moving average length q, this becomes )q 1+ t E  1− M   2− M    1   I ψ ψ ψ    t 1− M + t     t 2− M + t   +    t 1+ t     1+ t Y Y Y A Y I   1        0 0 , p >j 2− M A 1− M A MA 0 0 I 0 0= j 0 I 0 A coefficient matrices with 1+ t E1−Mψ + 0 0 t M +t    1+ t M + t    1+ t 1−M + t     1+ t 2 + t    1+ t Y = 1+ M+ Y Y Y t Y Y          t with gives the full set of equations. 
Expanding the set of autoregressive when you have the complete set of equations , t p −M+ t Yp A + + t 1−M + tY1A = t M+t Y Finally, to replace the “?” in F and G, note that M>L t L+t Y forecast with will be a linear combination of these. This establishes M. which combined ,)1 + q ,p (xam , t M+ Y , t , t 1+ t Y Y t t Z and so if the state vector contains , where M is then every . 0 L rof( … t p − L+ t Yp A + 0>j + t 1− L + tY1A = t| j t L+t +t E where is 0 for and is the vector of one-step-ahead forecast errors at time t q−L+ t EqB − L +t − Y If there were only information up to time t, the forecast of t 1− L + t would be . t EqB − t      t 1-M + tY   ?  ?     2− M ψ   t 2− M + tY   I     1+ t E +   1    t 1+ t ψ Y  0      I   t  0 − 1− L + tE1B − L + tE + p − L + tY p A + t (the dimension of Y) rows of the state space equation question mark (?) for items not yet discussed, you have In particular, using a , 1+ t 1+ EG + 1+ Z F = 1+ Z E1− Lψ + L+ Y = 1+ L+ Y t t t t E1B − t L+tE + t p −L+ tY p A + L+t Y I 0 0  0 I 0 0 0 ? ? + t 1− L + tY1A = + 1− L + Y1A =    ?   1+ t M + tY    0   1+ t 1−M + tY     =   1+ t 2 + t  Y    1+ t  , 1+ t L + t t Y will contain and the relationship between these, will provide k if t . t t , t L+t Y t Z So if the state vector contains then it will also contain the predecessor Thus .1+ t 1+ Z E1− Lψ + t L+t 1− L + Y Y = 1+ L+ Y t t 1+ t the forecast of the same thing at time just adds one more term so that , j −L+ t E j j ∑= t L+t Y time t would just be the part of this sum that is known at time t, namely, ψ L= ∞ j −L + t j j t provided the series is stationary. The forecast of any E ψ 0=∞ ∑ = L+ Y using information up to and Y t L+t L+t Y Y Y 1 6. 27. 8. 1 92. 850.− 85. 0 1 2 1 ,2 1 1 2 1 ,   1.   + t e   6.−     + t 1e  1   0   t +t 2.  X 0       t +t 8.  Y 0  + 0 0 t   X    t 0 1 Y  + t + tX  22.−        +t +t 440.  Y  =  1+ t  44.− X      1+ t  0 Y 1 ,2  6.−  +t e     1  +t e   0 1 ,1 8.    0 )8.  1 85.  t +t  Y  27.    44.−( +  tX  8.   t  Y  1 1 296 SAS for Forecasting Time Series Suppose you want to find the equivalent bivariate ARMA representation for you have so that .′ ) tY t X( 1 ,2 1 ,  6.−  +t e     1  + t 1e    0 Inserting this extra line into the state space equations you get 1 1 ,2  +t e    )1.  + t 1e  1 ,  t +t  Y   2.( +  tX  )06.0   t  Y  850.− 85. 0 1 92.0 22.− ( = 440.    44.− )8.   0 2 85. 44.− ( =  +t +t  Y    1+ t X  )8.   1+ t   Y 85. 44.− ( = 1+ t 2 + tX t 1+ tY8. + t X85. + t Y44. −= t 1+ tX where the system arising from the current discussion was not block identifiable. It had a linear dependency, and a reduction in the size of the system was called for, ultimately producing a state space representation with dimension 3. 0 4.0  t2e   2 − t X   0  +   t1e   2 − t   0 Y This would be the final state space form if the system were of full rank. Such a full-rank system is called “block identifiable.” In this case the link between the ARIMA representation and the state space representation is relatively easy to see, unlike the Section 6.2.2 example Consider the state space representation From row 2 1 850.− 85. 0 1 2 1  t +t  8.  Y  27.      0 +  t X  8.      tY   1 1  +t + t  440. 
 Y     + tX  44.− =       1+ t  0 Y 4×4 −   1− tX   7.0 +   1− t   9.0− Y  tX  1.0  =  t  3.1 Y 1 6. 27. 0 1 92. 850.− 0 0 1 ,2 1 1 ,   1.   + t e   6.−     + t 1e  1   0  t +t  2.  X 0       t +t 8.  Y 0  + 1 0 t   X    t 0 1 Y t Now anytime you see you can replace it with , t 1+ tX 1+ tY8. + t X85. + t Y44. − replaced by Anytime you see a multiple of row using Row 3 gives So you have recovered the original ARMA(1,1) model. A useful feature of such a structure is that it sometimes gives a nice interpretation; for example, it is clear from the ARMA model that lagged shocks to Y do not have any effect on Y or X, while that is not so clear from the state space representation. 2. 8. 4.  5.  1. 0 =  6.− 0 5.0 2.  −  1.− 8.  0  = 1 B 1  B−  5.0 1.− 2. 8.0   = 1  B− A =  1  1. 2.  8.  6.−  = 1ψ     1. 2.  1 = ψ dna 6.− 8.  1− t     5.0 2.  1 = A 1.− 8.0  t t t E1B − E + 1− Y1A = Y 1 2 1 1 ,2 1 1 ,   1.   + t e   6.−     + t 1e  1   0  t +t  2.  X   5.       t +t 8.  Y  1.−  +  1 0 t   X    t  0 1 Y t 1+ tY8.0 + t 1+ tX1.0 −= t t t 1+ tY8.0 +) t 1+ tY8. + t X85. + t Y44. (1.0 − − = 1+ 2 + Y . t 1+ t X t X85. + t Y 44. − 1 1  + t +t  22.−  X      + t + tY 440.    = 0 1+ t X       1+ t  2 2 0 .1 000 and row 4 is . t 1+ tY2.0 It is seen to be an ARMA(2,1) with with ; 0 = 2A 2. 8. 0 1 0 0 0 0 1 2 0  + t +t  0 X      0   + t + tY  =  1+ t  0 X     1+ t  Y + t 1+ t X5.0 = 1+ 2 + X t t Y and setting . It can be expressed as , + 3− E3ψ + 2− E2ψ + 1− E) 1B − 1A ( + E = Y t t t t t ) tX you find that t Y( = t Y The result is This system results: in other words it is the vector ARMA(1,1) as the leading term in a row, you can re-express that Chapter 6: State Space Modeling so row 2 of F can be and 297 298 SAS for Forecasting Time Series 6.2 More Examples 6.2.1 Some Univariate Examples Univariate models can also be expressed in state space form, and doing so provides some insights. Consider the AR(1) model where t Suppose you are given Y1, Y2, Y3, . . . , Y100. You can forecast Y101, Y102, Y103, . . . . In fact, the only . . . . If you value you need to know is Y100. If Y100=150, then observe Y101=140 at time 101, the forecasts of Y102, Y103, Y104, . . . change to , . . . . The point is that given the model, you need to know only the last Y, Yn to forecast as far into the future as you like. The forecasts are updated as new information is obtained. Consider the AR(2) model t Again, suppose you know Y1, Y2, . . . , Y100 with Y100=150. Knowing Y100 is not enough to forecast Y101. You need more information. If you know Y99=110, then 4.651 and ... ,86.941 In this example, you need to know two pieces of information: Y99=110 and Y100=150, or Y100=150 and Either pair of numbers allows you to forecast the series as far into the future as you like. The vector with the information you need is the state vector Zt. For the AR(2) model, the state vector is ′ where the prime symbol ( ) indicates the transpose of the row vector. Recall that Yt+k|t denotes the forecast of Yt+k given the data Y1, Y2, . . . , Yt. ′ 4.411 = 301Y ,421 ˆ ,8.011 = 201Y ˆ = 301Y ,811 = 201Y ˆ ˆ ,031 = 101Y ˆ e + )001 − = ) 05 ( 63. 2− tY(63. = )01(63. − )05(2.1 + 001 = e + )001 − 1− tY(6. 
= 001 − − )001 − − )001 − ) 4.65 ( 2.1 t |1+ tY 1− tY(2.1 ,001 − + 001 = 001 − t Y( = .4.651 = 101Y ˆ 201 Y t t t Y Y Z ˆ = 101Y ˆ 1+ t EG + ZF = 1+ Z t t t ′ )001 − t |1+ tY ,001 − Y( = 1+ t Z 1+ te + t Z = 1+ Z 6. t t t 001 − Y = Z Y 001 − 1+ t Y= 1+ t e + 001 − t |1 + t t t 1+ te2.1 )001 − + )001 − t |1+ tY(2.1 1+ Y(2.1 + )001 − + )001 − t Y(63. Y(63. − = − = 001 − 1+ t |2 + tY 1+ te + )001 − 1− tY(63. − )001 − t Y(2.1 = 001 − 1+ tY 1+ te + 001 − t |1+ tY = 001 − 1+ tY 1+ te  t 2.1 1  2.1 +  1  Z 63. −   = 1+ t  0 Z 1+ te + )001 − t |1+ tY( = 001 − 1+ tY ) )001 − ′ t Y ( 63. − )001 − 1+ tY ( 2.1 ,001 − 1+ tY ( = )001 − 1+ t|2 + tY ,001 − 1+ t ′ Y( = 1+ t Z )001 − 1− tY(63. − )001 − t Y(2.1 = 001 − t |1+ tY and In the second example, an AR(2) model, the state vector was and Two examples have been discussed thus far. In the first, an AR(1) model, the state vector was because The last line of the matrix equation becomes or The first line of the matrix equation is simply you can write Because for the AR(2) model. When the data at time t+1 become available, the state vector changes to Now Chapter 6: State Space Modeling 299 300 SAS for Forecasting Time Series where 2.1   =  1  Every ARMA model has an associated state vector Zt and an updating equation of the form 1+t EG + tZF = To determine the state vector for a univariate ARMA time series Yt, consider the sequence Yt, Yt+1|t, Yt+2|t, . . . . At some point, such as k+1, you find that Yt+k+1|t is a linear combination of the previous sequence of elements Yt, Yt+1|t, . . . Yt+k|t; that is, t| k + tY k In the AR(2) example, ) 001 so k=1. This determines the state vector as ′ Furthermore, any prediction of Yt+R|t with R>k is also a linear combination of state vector elements. Think of constructing the state vector by sequentially including forecasts Yt+j|t into Zt until you reach the first forecast that is linearly dependent on forecasts already in Zt. At that point, stop expanding Zt. Section 6.1.2 shows how this can be accomplished using canonical correlations. One more univariate example follows. Suppose 1–te8. + t Then, because t and, for j>1, 0= t| j and  1+ te8.   = 1+ t  1+ t  0= 1 α = 0α  te8.  t 1+ tY  = =  tY   tY  t| 1+ t Y 1α + Y 0 α Y | t (which is with ), the state vector has the following form: − t Y ( 63. α + − . . . )001 ) t k + tY , . . . , t 1+ tY , t + − | t|1+ t G t |1+ tY ( 2.1 Yα dna 1 + 2.1   1 = | t Yα= 001 0 63.−   =  e8. = t|1+tY 0 Y( = Z − t|1+ k + t e = tY t |2 + tY 1 +t + tY t t Y Z Z F Z Chapter 6: State Space Modeling 301 Note that 1+ te  which is equivalent to the equation t along with the identity 1+ te8. Thus, 8.  = 1 for the moving average (MA) model. The truly useful fact is that all multivariate ARMA models have state space equations. To construct the state vector for a multivariate process, you forecast each element of the process. For the bivariate process (Xt, Yt), for example, consider the sequence Xt, Yt, Xt+1|t, Yt+1|t, Xt+2|t, Yt+2|t, . . . . When you first reach a forecast of X (or Y) that is a linear combination of elements currently in the state vector, do not include that forecast or any future forecasts of that variable in the state vector. Continue including forecasts of the other variable in Zt until you reach a point of linear dependence in that variable. You have seen an AR(2) of dimension 2 in Section 6.1.2. A bivariate ARMA(1,1) is shown next. 
6.2.2 ARMA(1,1) of Dimension 2 Consider the model    1 or 1 and t,2 from which t ,2 1 − t,2 − t,1ε  ε  0   1. − t,2 ε1. − 2. t   0  −  ε +    t 1ε  1 − t,1ε2. ε1. ,2 − , − t ,1 t, 8.  te8. 0 +   1   tY  1 ε2. 1ε ε + + 1 − 1 − tX − tY    5. 3. 1− tY3. t Y5. 1− tY5. t Y3. G dna  + + e8. + 1+ te = 1+ tY + + t t X3. X5. 1− tX5.  1+ te8. 0 =   1+ tY  1− t X3. 0 t  3.  =  Y  tX 5. 1 0 = = = 0 0 = = 1+ te8. = t| t| 1+ tX 1+ tY t t X Y F .noissergerotua rotcev yranimilerp eht fo tif eht tsniaga ledom hcae fo tif eht ssessa ot desu si noitalerroc lacinonac tsellams ehT .k redro fo noissergerotua rotcev a yb detamixorppa ylbanosaer eb nac taht sledom AMRA rotcev lla fo ssalc eht hcraes ,sdrow rehto nI .secirtam A eht fo stne mele eht fo emos 0 ot gnittes dna smret AM gniwolla yb devorpmi eb nac tif eht rehtehw semoceb noitseuq ehT .tif sti fo ,CIA ,erusaem a evah uoy dna ,k redro fo noissergerotua rotcev a dellac won si ledom ehT .2 .CIA eziminim ot k tceleS .tif taht niatbo ot desu sretemarap fo rebmun eht tsniaga tif fo noisicerp ffo sedart ti ,suhT .noitcnuf doohilekil eht ni esaercni na ro sretemarap ledom fo rebmun eht ni esaerced a yb rellams edam si CIA taht etoN )ledom eht ni sretemarap fo rebmun(2 + )doohilekil dezimixam(GOL2– = CIA ,3 2 ,2 ,1 3 2 1 ,3 2 ,3 ,2 2 si noiretirc sihT .)CIA( noiretirc noitamrofni s’ekiakA fo noisrev a esu uoy ,k no ediced oT .ECAPSETATS CORP ni enod si tahw yllaitnesse si sihT .secirtam A eht fo swor driht dna dnoces eht secudorp noisserger eht ni selbairav tnedneped sa t X dna t X gnisU .kA , . . . , A , A , A secirtam fo swor pot eht secudorp k–t X , . . ., –t X , –t X , –t X , –t X , –t X , – X no , 1X fo noisserger eht ,si tahT .wor yb wor siht od nac uoY 1 ,3 1 ,2 1 t, 1 t, t E+ k–t XkA + . . . + –tX A + – X A = X 2 2 1 t 1 t ledom RA etairavitlum a tif ′) X , X , 1X( = X ,elpmaxe rof ,seires rotcev ro etairavitlum a roF .1 t, 3 t, 2 t, t | 1 1 ,2   + t ε 2.   1  + t 1ε   0 1 ,     t + tX  3.   t  0  +  Y   tX  1 t Y51. + t X90. + t |1+ tX5. 6.3 PROC STATESPACE 51. 5. 0 1 |2 302 SAS for Forecasting Time Series The general outline of PROC STATESPACE is as follows:  5.  0 1 1     + t + tX 90.   +t  3. Y  =    1+ t X  0  = t |1+ tY3. Finally, the state space form is ′ ) t 1+ t X , tY , t X( | = t Z so + t |1+ tX5. = t |2 + tX and Chapter 6: State Space Modeling 303 For example, a vector ARMA(1,1) of dimension 3 can be  1− t,3e    1− t,2e    1− t,1e   4.  1.  1. 3. 2. 0 4. 0 Check to see if this model fits as well as the original vector autoregression. If k=4, for example, the original autoregression contains four A matrices, each with nine parameters. If the vector ARMA(1,1) fits about as well as the vector AR(4) in likelihood, the inherent penalty in the information criterion for a large number of parameters can make the information criterion for the vector ARMA(1,1) smaller than for the vector AR(4), and thus the difference will be negative. An information criterion for model selection based on this idea is called DIC in PROC STATESPACE. The comparison in step 2 is easier than it first appears. All vector ARMA models can be expressed in state space form. 
Thus, comparing state space models and determining the best model is equivalent to finding the dimension of the best model's state vector Zt, because all state space models have the same basic form, t The key to this decision is an organized sequential formulation of the state vector. Start by including X1,t, X2,t, and X3,t. Next, check X1,t+1|t to see if it is a linear combination of X1,t, X2,t, and X3,t. If it is, it provides no new information and is not added to the state vector. Otherwise, the state vector is augmented to (X1,t, X2,t, X3,t, X1,t+1|t). The next question is whether X2,t+1|t should be included in the state vector. Include it only if it cannot be written as a linear combination of elements already in the state vector. The state vector is formulated sequentially in this fashion. Suppose X1,t+1|t is included and both X2,t+1|t and X3,t+1|t have been tested. Next, consider testing X1,t+2|t for inclusion in the state vector. If X1,t+2|t is not included in the state vector, pioneering work by Akaike shows that X1,t+j|t is not included for any j>2. That is, if a forecast X1,t+k|t is a linear combination of elements already in the state vector, X1,t+j|t also is such a linear combination for any j>k. At this point, stop considering the forecast of X1, but continue to consider forecasts of X2 and X3 (unless X2,t+1|t or X3,t+1|t was found earlier to be a linear combination of elements already in the state vector) and continue in this fashion .   t,3e  1− t,3 X  0      −  t,2e +  1− t,2 X       t,1e   1− t,1X  1− t EB − E +  8.  4.  0 t EG 6. 4. 0 + 1 –t 1− t 27. 0 0 XA = X      t,3 X   =  t,2 X     t,1X  ZF Z = t t ro .3 .rotcev etats eht etupmoc ot )2.3.6 noitceS ees( hcaorppa noitalerroc lacinonac eht esu tsum uoy dna ,mrof ledom AMRA rotcev eht wonk ton od uoy ,ecitcarp nI .sledom nwonk rof ylno devired neeb evah srotcev etats ,tniop siht ot pu taht etoN .enod eb nac siht woh dewohs )6791( ekiakA .3 pets hsilpmocca ot snoitalerroc lacinonac fo esu eht si lacitcarp eb ot ssecorp siht swolla taht tluser laciteroeht ehT .slavretni tsacerof gniyalpsid rof lufesu noitamrofni rehto dna ,srorre dradnats tsacerof ,stsacerof sniatnoc taht detaerc si tes atad tuptuo na ,ECAPSETATS CORP nI .6 t t t t .Σ dna ,G ,F fo se tam itse m orf d e tam itse eb nac tI .xirtam ecnairavoc-ecnairav ,rorre-noitciderp eht sniatnoc sesehtnerap ni ytitnauq eht fo xirtam ecnairavoc-ecnairav ehT . Z3F fo stne mele eerht tsrif eht era 3+ ,3X dna ,3+ ,2X ,3+ ,1X stsacerof eht ,erofeb deredisnoc ,3X dna , ,2X , ,1X stne mele eerht eht sah ssecorp rotcev lanigiro eht fI t t t t t t t 3+tEG )3+ EG + 2+ EGF + 1+ EG2F( + Z3F = + 2+ EGF + 1+ Z2F = 3+ EG + 2+ ZF = 3+ Z t t t t t woN .daeha sdoirep eerht gnitsacerof ,elpmaxe rof ,redisnoC .secnairav rorre noitciderp eht sdleiy noitauqe ecaps etats eht ,noitidda nI .seires etairavitlum eht fo stsacerof eht era esehT .stnemele tsrif eht tcartxe neht dna k Z seulav erutuf tsacerof ot noitauqe ecaps etats eht esu ,tsacerof eb ot seires etairavitlum eht pu ekam rotcev etats eht fo stnemele wef tsrif eht esuaceB .5 +t .ecnairav rorre noitciderp fo erusaem lareneg a si ti esuaceb deziminim eb dluohs tnanimreted sihT .ECAPSETATS CORP ni tuo detnirp eb nac hcihw , Σ tnanimreted ˆ sti si Σ fo ezis eht fo erusaem lareneg A .) 
Σ fo stnemele lanogaid eht era eseht( secnairav ˆ rorre lareves sah gnittes etairavitlum a hcuS .seires hcae fo srorre tsacerof daeha-pets -eno eht fo stcudorpssorc dna serauqs fo smus eht morf Σ fo ,Σ ,etamitse na niatbo nac uoY ˆ .srorre dradnats dilav yllacitotpmysa rieht dna sretemarap eht fo setamitse )LM( doohilekil-mumixam eht dnif ot enituor hcraes raenilnon a esu dna ,setamitse laitini eht htiw tratS .noitcnuf doohilekil eht tuo etirw nac uoy ,Σ xirtam ecnairavoc-ecnairav dna 0 naem htiw srotcev modnar lamron tnednepedni fo ecneuqes a si E gnimussA .ledom AMRA rotcev etairavitlum gniylrednu yna rof t 1 +t EG + ZF = Z t 1 +t si ,mrof ecaps etats ni ,ledom eht taht llaceR .G dna F fo stnemele eht fo setamitse laitini etupmoc ot snoitauqe reklaW-eluY eht ot ralimis rennam a ni eseht sesu tI . Z dna Z fo stnemele eht neewteb secnairavoc etamitse ot noitamixorppa )k(RA rotcev laitini eht sesu ECAPSETATS CORP .4 t 1 +t ni tneserp ydaerla stnemele fo snoitanibmoc raenil era ′ t| 2 + t , 3 X dna , t| 1 + t , 1 t| 1 + t , 2 X, t| 3 + t , 1 .rotcev etats eht X ,esac siht nI t, t ) t| 2 + t , 1 X, t| 1 + t , 3 X, X , X , X , 1X( = Z t, 3 t, 2 eb yam rotcev etats eht ,elpmaxe siht roF 304 SAS for Forecasting Time Series Chapter 6: State Space Modeling 305 6.3.1 State Vectors Determined from Covariances PROC STATESPACE computes the sequence of information criterion values for k = 1, 2, . . . , 10 (or ARMAX) and selects the model that gives the minimum. This vector AR model for the time series is used to compute a variance-covariance matrix M between the set of current and past values and between the set of current and future values. These two facts are relevant: 1. All predictions are linear combinations of the observations of Yt, Yt–1, Yt–2, . . . where for practical purposes this list can be truncated at Yt–k, as determined by the initial vector autoregression. 2. The covariance between a prediction Yt+j|t and a current or past value Yt–i is the same as that between Yt+j and Yt–i. Akaike uses these facts to show that analyzing the covariances in the matrix M is equivalent to determining the form of the state vector. Canonical correlations are used in this case, and the elements of M are replaced by their sample values. 6.3.2 Canonical Correlations Suppose the covariance matrix between the set of current and past values Yt, Yt–1, Yt–2 and the set of current and future values Yt, Yt+1, Yt+2 for a univariate series is given by  5.  1  2 ′ ′ Note that there are no zero correlations. (You will find, however, that some canonical correlations are zero.) Canonical correlation analysis proceeds as follows: 1. Find the linear combination of elements in the first vector (Yt, Yt–1, Yt–2) and second vector (Yt, Yt+1, Yt+2), with maximum cross-correlation. This canonical correlation is the largest, and the linear combinations are called canonical variables. In the example, Yt and Yt are perfectly correlated, so (1, 0, 0)(Yt, Yt–1, Yt–2) and (1, 0, 0)( Yt, Yt+1, Yt+2) are the canonical variables. 2. Now consider all linear combinations of elements in the original vectors that are not correlated with the first canonical variables. Of these, the two most highly correlated give the secondhighest canonical correlation. In this case, you can show that the next-highest canonical correlation is, in fact, zero. 3. At each stage, consider linear combinations that are uncorrelated with the canonical variables found thus far. Pick the two (one for each vector being analyzed) with the highest crosscorrelation. 
4 2 1  2  =  4 8 M 306 SAS for Forecasting Time Series Akaike establishes that the following numbers are all the same for general vector ARMA models: When you look at the example matrix M, you see that the covariance between Yt–k and Yt+j is always j+k 8(.5 ) for j,k 0. Thus, M is the covariance matrix of an AR(1) process. All rows of M are direct multiples of the first, so M has rank 1. Finally, the canonical correlations computed from M are 1,0,0. When the general sample covariance matrix M is used, the analysis proceeds as follows (illustration for bivariate series (Xt, Yt) ): 1. Determine the number of lags into the past (Xt, Yt, Xt–1, Yt–1, . . . , Yt–k ). 2. Do a canonical correlation of (Xt, Yt) with current and past values. This produces correlations 1,1. 3. Next, do a canonical correlation analysis of (Xt, Yt, Xt +1|t) with current and past values of Xt, Yt from step 1. 4. a) If the smallest canonical correlation is not close to zero, include Xt +1|t in the state vector and analyze (Xt, Yt, Xt +1|t, Yt +1|t). b) If the smallest canonical correlation is close to zero, exclude from consideration Xt+1|t and all Xt+j|t for j>1. In this case, the analysis of (Xt, Yt, Yt+1|t) is next. 5. Continue until you have determined the first predictions, Xt+j|t and Yt+s|t, that introduce zero canonical correlations. Then Xt+j–1|t and Yt+s–1|t are the last predictions of X and Y to be included in the state vector. PROC STATESPACE executes this procedure automatically. The sample canonical correlations are judged by the aforementioned DIC. A chi-square test statistic attributed to Bartlett (1947) is computed. The significance of Bartlett’s statistic indicates a nonzero canonical correlation. Robinson (1973) suggests potential problems with Bartlett’s test for MA models. Thus, DIC is used as the default criterion. PROC STATESPACE uses the estimated covariance matrix and the identified state vector to compute initial estimates of matrices F and G in the state space representation. The advantage of PROC STATESPACE is its automatic identification of a model and preliminary parameter estimation, but the user is responsible for any transformation necessary to produce stationarity and approximate normality. Also note that the STATESPACE theory does not include deterministic components like polynomials in time. Use the NOEST option to view the preliminary model before fine-tuning the parameter estimates through nonlinear iterative least squares (LS) or ML estimation. You may want to use the RESTRICT statement to set certain elements of F and G to zero. (You have seen several cases where F and G contain zeros.) .rotcev etats eht fo noisnemid eht snoitalerroc lacinonac oreznon fo rebmun eht M etairporppa na fo knar eht ′ ≥ Chapter 6: State Space Modeling 307 6.3.3 Simulated Example To see how PROC STATESPACE works with a known univariate model, consider 100 observations from an MA(1) model Yt = et + .8et–1 Note that the model can be re-expressed as Yt = (.8Yt–1 .64Yt–2 (.8) Yt–3 3 (.8) Yt–4. . . ) + et 4 Thus, the initial AR approximation should have coefficients near .8, .64, .512, .4096, . . . . Use the following SAS statements for the analysis: PROC STATESPACE CANCORR ITPRINT DATA=TEST2; VAR Y; RUN; As shown in Output 6.4, the CANCORR option displays the sequential construction of the state vector. The ITPRINT option shows the iterative steps of the likelihood maximization. , and standard deviation and the sequence of AICs In Output 6.4, observe the sample mean, for up to ten AR lags. 
The smallest AIC in the list is 9.994428 , which occurs at lag 4. Thus, the initial AR approximation involves four lags and is given by Yt = .79Yt–1 .58Yt–2 + .31Yt–3 .24Yt–4 + et This corresponds reasonably well with the theoretical results. Schematic representations of the autocorrelation function (ACF) and partial autocorrelation function (PACF) are also given. A plus sign (+) indicates a value more than two standard errors above 0, a period (.) indicates a value within two standard errors of 0, and a minus sign (-) indicates a value more than two standard errors below 0. Based on results from Chapter 3, “The General ARIMA Model,” you would expect the following sequences of + and - signs in the theoretical ACF and PACF plots. LAG ACF PACF 0 + + 1 + 2 . + 3 . 4 . + 5 . 6 . + 7 . 8 . + 9 . 10 . + You also would expect the estimated PACF to drop within two standard errors of 0 after a few lags. The estimated functions correspond fairly well with the theoretical functions. Note the canonical correlation analysis . Initially, consideration is given to adding Yt+1|t to the state vector containing Yt. The canonical correlation, 0.454239 , is an estimate of the second-largest canonical correlation between the set of variables (Yt, Yt+1) and the set of variables (Yt, Yt–1, Yt–2, Yt–3, Yt–4). The first canonical correlation is always 1 because both sets of variables contain Yt. The question is whether 0.4542 is an estimate of 0. PROC STATESPACE concludes that a correlation is 0 if DIC si + 2 + 1 Y gaL/emaN snoitalerrocotuA laitraP fo noitatneserpeR citamehcS neewteb si . . 01 . 9 + 8 . 7 ,rorre dts*2- < si . 6 . 5 . 4 . 3 ,rorre dts*2 > si + . 2 + 1 + 0 Y gaL/emaN snoitalerroC fo noitatneserpeR citamehcS 75984.61 01=gaL sledoM evissergerotuA rof noiretirC noitamrofnI 8155.41 9=gaL 30746.21 49017.01 2857.21 8=gaL 7=gaL 6=gaL 12059.11 824499.9 8788.31 5=gaL 4=gaL 3=gaL 42725.31 73950.72 78468.55 2=gaL 1=gaL 0=gaL sledoM evissergerotuA rof noiretirC noitamrofnI 632223.1 rorrE dradnatS 99404.0naeM Y elbairaV 001 snoitavresbO fo rebmuN erudecorP ECAPSETATS ehT snoitpO TNIRPTI dna RROCNAC eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM 4.6 tuptuO 308 SAS for Forecasting Time Series 70542020.1 72542020.1 12503020.1 58640520.1 33162040.1 )1,1(amgiS 41939048.0 4609048.0 49951248.0 65392338.0 69170987.0 )1,2(G 4998911.0 58978911.0 81835621.0 50594102.0 25290092.0 )2,2(F 1215150.0494150.03900750.02327511.02479261.0)1,2(F 100.0 1000.0 100.0 10.0 1.0 adbmaL noitamitsE doohilekiL mumixaM :gnittiF evitaretI 441020.1 noitavonnI rof xirtaM ecnairaV 270987.0 1 noitavonnI rof xirtaM tupnI 390092.0 1 79261.00 xirtaM noitisnarT fo etamitsE rotceV etatS setamitsE yranimilerP dna mroF ecapsetatS detceleS FD erauqS ihC noiretirC noitamrofnI )T;2+T(Y FD )deunitnoc( snoitpO TNIRPTI dna RROCNAC eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM Chapter 6: State Space Modeling 11 4 89646.22 erauqS ihC 61901.51 noiretirC noitamrofnI 932454.0 )T;1+T(Y sisylanA snoitalerroC lacinonaC 12 13 14 15 3 19083.4 83255.1- 175802.0 697984.0 )T;1+T(Y 16 )T;1+T(Y )T;T(Y 22 17 21 542020.1 542020.1 503020.1 740520.1 162040.1 tnanimreteD 2 0 0 0 0 flaH 4 3 2 1 0 retI 1 )T;T(Y 1 )T;T(Y 4.6 tuptuO 309 310 SAS for Forecasting Time Series Now consider the portion of Yt+2|t that you cannot predict from Yt and Yt+1|t. If this portion is correlated with the past of the series, you can produce a better predictor of the future than one that uses only Yt and Yt+1|t. 
Add Yt+2|t to the state vector unless the third-highest canonical correlation between the set (Yt, Yt+1, Yt+2) and the set (Yt, Yt–1, Yt–2, . . . , Yt–4) is 0. The estimate of the thirdhighest canonical correlation is 0.208571 12 . PROC STATESPACE assumes that 0.208571 is just an estimate of 0 because DIC is negative (–1.55238) 13 . This means that once you have predicted Yt+2 from Yt and Yt+1|t, you have the best predictor available. The past data do not improve the forecast. Thus, Yt+2|t is not added to the state vector. Bartlett’s test statistic, 4.38091 14 , is not significant compared to a chi-squared table with three degrees of freedom 15 (with a critical value of 7.81). )deunitnoc( snoitpO TNIRPTI dna RROCNAC eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM eulaV t 24.8 97.0 93.0ledoM dettiF dna mroF ecapsetatS detceleS .degrevnoc sah noitamitse doohilekil mumixaM noitavonnI rof xirtaM tupnI 19 xirtaM noitisnarT fo etamitsE 18 noitavonnI rof xirtaM ecnairaV 998911.0 1 rorrE dradnatS 658990.0 432151.0 797231.0 )T;1+T(Y setamitsE retemaraP rotceV etatS 939048.0 1 542020.1 etamitsE 939048.0 998911.0 15150.0- 15150.00 )T;T(Y retemaraP )1,2(G )2,2(F )1,2(F 4.6 tuptuO 20 Chapter 6: State Space Modeling 311 Again, the two tests agree that Yt+2|t is a linear combination of Yt and Yt+1|t. Thus, the only information you need to predict arbitrarily far into the future is in ′ Zt = (Yt, Yt+1|t) When you compare this to the theoretical analysis of an MA(1), you see that PROC STATESPACE has correctly identified the state vector as having two elements. The theoretical analysis gives the state space representation as 1+ te  PROC STATESPACE estimates these matrices to be 97.    1  initially 16 17 and 148.    1  finally 18 19 . Note that the t statistics 20 on F (2,1) and F (2,2) are, as expected, not significant. The true entries of F are zeros in those positions. Finally, observe the nonlinear search 21 beginning with the initial values in Output 6.4 16 17 and then moving to the final values in Output 6.4 18 19 . Note that decreases at each step 22 . To force the correct form on the matrix F, use the RESTRICT statement: PROC STATESPACE ITPRINT COVB DATA=TEST2; RESTRICT F(2,1)=0 F(2,2)=0; VAR Y; RUN; The RESTRICT statement may also include restrictions on the entries of G. (See Output 6.5.) As requested, the bottom row of F has been set to 0 0 . The initial G matrix and the final G matrix are close to the theoretical matrix, namely G=(1 .8) . The COVB option requests the variance-covariance matrix of the parameter estimates , which is a scalar in the case of a single parameter estimate. ′ Σ ˆ =G =G dna 8. + 1 dna 092.   1 21.   1 t  Z 0 1 150. 361. 0  0  0 0  = 1+ t  −  − =F =F Z 32932.0Y -4=gaL-- 989803.0 Y -3=gaL-- 52285.0Y -2=gaL-- 270987.0 Y -1=gaL-- Y CIA muminiM rof setamitsE reklaW-eluY neewteb si . . 01 . 9 . 8 ,rorre dts*2- < si . 7 . 6 . 5 4 . 3 ,rorre dts*2 > si + 2 + 1 Y gaL/emaN snoitalerrocotuA laitraP fo noitatneserpeR citamehcS neewteb si . . 01 . 9 + 8 . 7 ,rorre dts*2- < si . 6 . 5 . 4 . 3 ,rorre dts*2 > si + . 
2 + 1 + 0 Y gaL/emaN snoitalerroC fo noitatneserpeR citamehcS 75984.61 01=gaL sledoM evissergerotuA rof noiretirC noitamrofnI 8155.41 9=gaL 30746.21 49017.01 2857.21 8=gaL 7=gaL 6=gaL 12059.11 824499.9 8788.31 5=gaL 4=gaL 3=gaL 42725.31 73950.72 78468.55 2=gaL 1=gaL 0=gaL sledoM evissergerotuA rof noiretirC noitamrofnI 632223.1 rorrE dradnatS 99404.0naeM Y elbairaV 001 snoitavresbO fo rebmuN erudecorP ECAPSETATS ehT tnemetatS TCIRTSER eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM 5.6 tuptuO 312 SAS for Forecasting Time Series 544620.1 noitavonnI rof xirtaM ecnairaV 172087.0 1 noitavonnI rof xirtaM tupnI 0 1 0 0 xirtaM noitisnarT fo etamitsE )T;1+T(Y rotceV etatS ledoM dettiF dna mroF ecapsetatS detceleS .demussa neeb sah ecnegrevnoC .sgnivlah pets 01 retfa tnemevorpmi oN :GNINRAW 22544620.1 85374620.1 3084620.1 )1,1(amgiS 47072087.0 39962977.0 69170987.0 )1,2(G 100.0 10.0 1.0 adbmaL 544620.1 474620.1 84620.1 tnanimreteD 0 0 0 flaH 2 1 0 retI )T;T(Y noitamitsE doohilekiL mumixaM :gnittiF evitaretI 441020.1 noitavonnI rof xirtaM ecnairaV 270987.0 1 noitavonnI rof xirtaM tupnI 0 1 0 0 xirtaM noitisnarT fo etamitsE )T;1+T(Y )T;T(Y rotceV etatS setamitsE yranimilerP dna mroF ecapsetatS detceleS )deunitnoc( tnemetatS TCIRTSER eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM 5.6 tuptuO Chapter 6: State Space Modeling 313 314 SAS for Forecasting Time Series You can find other options for PROC STATESPACE in the SAS/ETS User's Guide. It is dangerous to ignore the autocorrelations. The theory behind PROC STATESPACE assumes the input series are stationary. You have no guarantee of a reasonable result if you put nonstationary series into PROC STATESPACE. Often, you see almost all plus signs in the ACF diagram, which indicates a very slow decay and, consequently, possible nonstationarity. Differencing is specified exactly as in PROC ARIMA. For example, the following SAS statements specify a first and span 12 difference to be applied to Y: PROC STATESPACE; VAR Y(1,12); RUN; The FORM statement is used to specify a form for the state vector. This statement can be helpful if you want to specify a state vector different from what DIC automatically chooses (for example, Bartlett's test may give a different result than DIC, and you may prefer Bartlett's test). For example, the statements PROC STATESPACE; VAR X Y; FORM X 2 Y 1; RUN; specify the state vector as ′ Zt = (Xt, Yt, Xt+1|t) Now consider an interesting data set that cannot be modeled correctly as a transfer function because of feedback. The data are counts of mink and muskrat pelts shipped to Europe from Canada by the Hudson's Bay Company. The logarithms are analyzed, and both the logarithms and the original data are plotted in Output 6.6. )deunitnoc( tnemetatS TCIRTSER eht htiw ECAPSETATS CORP ni ataD detalumiS gniledoM eulaV t 64.21 setamitsE retemaraP fo ecnairavoC setamitsE retemaraP fo noitalerroC )1,2(G 4429300.0 rorrE dradnatS 546260.0 )1,2(G 00000.1 setamitsE retemaraP etamitsE 172087.0 )1,2(G )1,2(G retemaraP )1,2(G 5.6 tuptuO Chapter 6: State Space Modeling 315 Output 6.6 Plotting Original and Logged Data 316 SAS for Forecasting Time Series Output 6.6 Plotting Original and Logged Data (continued) Chapter 6: State Space Modeling 317 You have an increasing, seemingly linear trend in the data plots. PROC REG is appropriate for detrending the data if the trend is due to increased trapping and does not reveal anything about the relationship between these two species. 
In that case, the dynamic relationship between mink (predator) and muskrat (prey) is best displayed in residuals from the trend. Another approach is to difference the two data series and analyze the resulting changes in pelt numbers. The approach you choose depends on the true nature of the series: the question is whether each series is a unit root process or a time trend plus stationary error process. The regression detrending approach is used here simply to display the technique and is not necessarily recommended over differencing. The following SAS code detrends the logged data and submits the detrended data (residuals) to PROC STATESPACE for analysis:

DATA DETREND;
   SET MINKMUSK;
   T+1;
RUN;
PROC REG DATA=DETREND NOPRINT;
   MODEL LMINK LMUSKRAT=T;
   OUTPUT OUT=RESID R=RMINK RMUSKRAT;
RUN;
PROC STATESPACE NOCENTER DATA=RESID;
   VAR RMINK RMUSKRAT;
   TITLE 'HUDSON''S BAY FUR TRAPPING RECORDS 1842-1890';
   TITLE2 'RESIDUALS FROM LINEAR TREND';
RUN;

The results are shown in Output 6.7. Because the data are detrended, you do not need to subtract the mean; thus, you can specify NOCENTER. Note that the ACF schematic plot shows several plus and minus signs but not enough to indicate nonstationarity. (However, it is notoriously difficult to detect nonstationarity visually in a series that has been detrended.) Note that the ACF at lag 1 is represented by a matrix of plus and minus signs because you have a bivariate series. If you consider a bivariate series in general as (Xt, Yt)' with the lag 1 schematic matrix

   | +  + |
   | -  + |

then the + in the upper-left corner indicates a positive covariance between Xt and Xt-1. The + in the upper-right corner indicates a positive covariance between Xt and Yt-1. The - in the lower-left corner indicates a negative covariance between Yt and Xt-1 and, finally, the + in the lower-right corner indicates a positive covariance between Yt and Yt-1. In terms of the current example, Xt represents RMINK and Yt represents RMUSKRAT, so the signs make sense with respect to the predator-prey relationship. The PACF looks like that of a vector AR of dimension 2 and order 1 (one lag). Thus, you expect the initial AR approximation to have only one lag and to be very close to the final model chosen by PROC STATESPACE. This is, in fact, the case here.

The state vector is simply the vector of inputs, so the vector ARMA model is easily derived from the state space model. When Xt=RMINK (mink residuals at time t) and Yt=RMUSKRAT (muskrat residuals at time t) are used, the state vector is simply

   Zt' = (Xt, Yt)

and the model is

   | Xt |   |  .569   .298 | | Xt-1 |   | e1t |
   | Yt | = | -.468   .627 | | Yt-1 | + | e2t |

Here, for example, the number -.468 indicates that large mink values (predator) at time t are associated with small muskrat values (prey) at time t+1. Xt and Yt are not related by a transfer function because you can use the t statistic to reject the hypothesis that -.468 is an estimate of 0. That is, each series is predicted by using lagged values of the other series. The transfer function methodology in PROC ARIMA is not appropriate.
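Because the fitted model is just a bivariate vector AR(1), it could equally well be estimated with PROC VARMAX, discussed in Chapter 5. A minimal sketch, assuming the detrended residuals are in the data set RESID created above:

PROC VARMAX DATA=RESID;
   MODEL RMINK RMUSKRAT / P=1 NOINT;  /* vector AR(1); no intercept for detrended data */
RUN;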
Output 6.7 Using PROC REG to Detrend the Data and PROC STATESPACE to Analyze the Residuals (for the 49 detrended observations, the information criterion for autoregressive models is minimized at lag 1; the Yule-Walker estimates for the minimum AIC model match the transition matrix above; the selected state space form has state vector (RMINK(T;T), RMUSKRAT(T;T)); the fitted transition matrix entries .569, .298, -.468, and .627 have t values 5.21, 2.95, -4.80, and 6.94, so all are significant; and the listing ends with "Maximum likelihood estimation has converged.")

If you had (mistakenly) decided to fit a transfer function, you could have fit an AR(2) to the mink series and computed the prewhitened cross-correlations.

Output 6.8 Cross-Correlations (prewhitened cross-correlations between the mink and muskrat residuals at lags -10 through 10; both variables have been prewhitened by the autoregressive filter 1 - 0.78452 B**(1) + 0.29134 B**(2), and a "." marks values within two standard errors of 0)
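The code behind Output 6.8 is not shown; a sketch of how such prewhitened cross-correlations can be produced with PROC ARIMA, assuming the detrended residuals are in RESID, is:

PROC ARIMA DATA=RESID;
   IDENTIFY VAR=RMINK NOPRINT;
   ESTIMATE P=2;                              /* AR(2) prewhitening filter for mink */
   IDENTIFY VAR=RMUSKRAT CROSSCORR=(RMINK);   /* cross-correlations after prewhitening */
RUN;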
You observe a somewhat subtle warning in the cross-correlations plot: namely, that there are nonzero correlations at both positive and negative lags, as shown in Output 6.8. Correlations at negative lags indicate feedback from the output series to the input series, which a one-directional transfer function cannot accommodate.

Chapter 7: Spectral Analysis

7.1 Periodic Data: Introduction

The modeling of time series data using sinusoidal components is called spectral analysis. The main tool here is the periodogram. A very simple model appropriate for spectral analysis is a mean plus a sinusoidal wave plus white noise:

   Yt = μ + α sin(ωt + δ) + et
      = μ + α( sin(δ)cos(ωt) + cos(δ)sin(ωt) ) + et

where the formula for the sine of the sum of two angles,

   sin(A + B) = sin(A)cos(B) + cos(A)sin(B)

has been applied. The function μ + α sin(ωt + δ) oscillates between μ - α and μ + α in a smooth and exactly periodic fashion. The number α is called the amplitude. The number δ, in radians, is called the phase shift or phase angle. The number ω is called the frequency and is also measured in radians. If an arc of length r is measured along the circumference of a circle whose radius is r, then the angle obtained by connecting the arc's ends to the circle center is one radian. There are 2π radians in a full 360-degree circle, and one radian is thus 360/(2π) = 360/6.2832 = 57.3 degrees. A plot of μ + α sin(ωt + δ) versus t is a sine wave that repeats every 2π/ω time units; that is, the period is 2π/ω. A sinusoid of period 12 would "go through" ω = 2π/12 = 0.52 radians per observation.

Letting A = α sin(δ) and B = α cos(δ), we see that

   Yt = μ + A cos(ωt) + B sin(ωt) + et

This is a very nice expression in that, if ω is known, the variables cos(ωt) and sin(ωt) can be constructed in a DATA step and the parameters μ, A, and B can be estimated by ordinary least squares as in PROC REG. From the expressions for A and B it is seen that A² + B² = α²(sin²(δ) + cos²(δ)) = α² and tan(δ) = A/B, so phase angle and amplitude estimates can be constructed from estimates of A and B.

7.2 Example: Plant Enzyme Activity

As an example, Chiu-Yueh Hung, in the Department of Genetics at North Carolina State University, collected observations on leaf enzyme activity Y every 4 hours over 5 days. There are 6 observations per day and 30 observations in all. Each observation is an average of several harvested leaves. The researcher anticipated a 12-hour enzyme cycle, which corresponds to 3 observations. To focus this discussion on periodic components, the original data have been detrended using linear regression.
First read in the data, creating the sine and cosine variables for a period of 3 (frequency 2π/3 per observation), and then regress Y on these two variables.

DATA PLANTS;
   TITLE "ENZYME ACTIVITY";
   TITLE2 "(DETRENDED)";
   DO T=1 TO 30;
      INPUT Y @@;
      PI=3.1415926;
      S1=SIN(2*PI*T/3);
      C1=COS(2*PI*T/3);
      OUTPUT;
   END;
CARDS;
265.945 290.385 251.099 285.870 379.370 283.096
306.199 341.696 246.352 310.648 234.870 314.744
261.363 321.780 313.289 307.988 303.909 284.128
252.886 317.432 213.168 308.458 296.351 283.666
333.544 301.173 276.348 253.460 287.160 316.998
;
RUN;
PROC REG DATA=PLANTS;
   MODEL Y = S1 C1/SS1;
   OUTPUT OUT=OUT1 PREDICTED=P RESIDUAL=R;
RUN;

The analysis of variance table is shown in Output 7.1.

Output 7.1 Plant Enzyme Sinusoidal Model (the regression of Y on S1 and C1: the model sum of squares is 11933 with 2 degrees of freedom, F = 7.09 with p = 0.0034; the dependent mean is 291.44583; the estimates are s1 = -27.848 with t = -3.72 and c1 = -4.468 with t = -0.60)

The sum of squares for the intercept is nȲ² = 30(291.44583)² = 2548220, and the sum of squares for the model, which is the sum of squares associated with frequency ω = 2π/3, is 11933 and has 2 degrees of freedom. It is seen to be statistically significant based on the F test (F = 7.09, p = 0.0034). It appears that the sine term is significant but not the cosine term; however, such a splitting of the two degree of freedom sum of squares is not meaningful in that, if t = 0 had been used as the first time index rather than t = 1, both would have been significant. The sum of squares 11933 would not change with any such time shift. The sum of squares 11933 associated with frequency ω = 2π/3 is called the periodogram ordinate at that frequency. A given set of data may have important fluctuations at several frequencies. Output 7.2 shows the actual and fitted values for the plant enzyme data.

Output 7.2 Data and Sinusoidal Predictions (plot of the observed enzyme series and the fitted period 3 sinusoid)
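As noted in Section 7.1, amplitude and phase estimates follow directly from the fitted cosine and sine coefficients. A minimal sketch of the conversion (the values shown are the rounded Output 7.1 estimates, with A the cosine and B the sine coefficient; substitute your own estimates):

DATA AMPPHASE;
   A = -4.468;                   /* assumed: cosine coefficient from PROC REG */
   B = -27.848;                  /* assumed: sine coefficient from PROC REG */
   ALPHA = SQRT(A**2 + B**2);    /* amplitude estimate */
   DELTA = ATAN2(A, B);          /* phase angle, since tan(delta) = A/B */
   PUT ALPHA= DELTA=;
RUN;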
7.3 PROC SPECTRA Introduced

Periodogram ordinates are calculated for a collection of frequencies known as the Fourier frequencies. With n = 2m + 1 observations Y, there are m of these, each with 2 degrees of freedom, so that a multiple regression of Y on the sine and cosine columns fits the data perfectly; that is, there are no degrees of freedom for error. The Fourier frequencies are 2πj/n, where j runs from 1 to m. For each j, two columns, sin(2πjt/n) and cos(2πjt/n), are created. The model sum of squares, when the data are regressed on these two columns, is the jth periodogram ordinate. At the jth Fourier frequency, the sine and cosine run through j cycles in the time period covered by the data. If n = 2m is an even number, there are still m periodogram ordinates and j still runs from 1 to m, but when j = m the frequency becomes 2πm/(2m) = π, and cos(πt) = (-1)^t while sin(πt) = 0. Thus for even n, the last Fourier frequency has only one degree of freedom associated with it, arising from the cosine term only. It does not matter whether a multiple regression using all the Fourier sine and cosine columns or m bivariate regressions, one for each j, are run. The columns are all orthogonal to each other and the sums of squares (periodogram ordinates) are the same either way.

PROC SPECTRA calculates periodogram ordinates at all the Fourier frequencies. With the 30 plant enzyme measurements there are 15 periodogram ordinates, the last having 1 degree of freedom and the others 2 each. Since the Fourier frequency for j = 10 is 2π(10)/30 = 2π/3, the period 3 ordinate should equal the previously computed model sum of squares, 11933. You might expect the other periodogram ordinates to add to 22724, the error sum of squares. However, PROC SPECTRA associates twice the correction term, 2nȲ², with frequency 0 and twice the sum of squares at frequency π (when n is even) with that frequency, so one must divide the frequency π ordinate by 2 to get its contribution to the error sum of squares from regression. This doubling is done because, after doubling some ordinates, division of all ordinates by 2 becomes the same as dividing unadjusted numbers by their degrees of freedom. The frequency 0 ordinate is replaced with 0 when the option ADJMEAN is used in PROC SPECTRA. These ideas are illustrated here for the plant enzyme data:

PROC SPECTRA DATA=PLANTS OUT=OUT2 COEFF;
   VAR Y;
RUN;
DATA OUT2;
   SET OUT2;
   SSE = P_01;
   TITLE J=CENTER "ENZYME DATA";
   IF PERIOD=3 OR PERIOD=. THEN SSE=0;
   IF ROUND(FREQ, .0001) = 3.1416 THEN SSE = .5*P_01;
RUN;
PROC PRINT DATA=OUT2;
   SUM SSE;
RUN;

The option COEFF in PROC SPECTRA adds the regression coefficients COS_01 and SIN_01 to the output data set. Looking at the period 3 line of Output 7.3, you see the regression sum of squares P_01 = 11933, which matches the regression output. The coefficients COS_01 = -21.884 and SIN_01 = 17.7941 are those that would have been obtained if time t had been labeled as 0, 1, . . . , 29 (as PROC SPECTRA does) instead of 1, 2, . . . , 30. Any periodogram ordinate with 2 degrees of freedom can be computed as (n/2)(A² + B²), where A and B are its Fourier coefficients; here (30/2)((-21.884)² + (17.7941)²) = 11933. (See Output 7.3.)

Output 7.3 OUT Data Set from PROC SPECTRA (the 16 observations list FREQ, PERIOD, COS_01, SIN_01, P_01, and SSE; the frequency 0 ordinate is 5096440.43 = 2nȲ², and the SSE column, which zeroes out the frequency 0 and period 3 lines and halves the frequency π line, sums to 22723.78, the error sum of squares)

PROC SPECTRA automatically creates the column FREQ of Fourier frequencies equally spaced in the interval 0 to π and the column PERIOD of corresponding periods. It is customary to plot the periodogram versus frequency or period, omitting frequency 0.

Output 7.4 Periodogram with a Single Important Frequency

Output 7.4 shows the unusually large ordinate, 11933, at the anticipated frequency of one cycle per 12 hours, that is, one cycle per 3 observations.

7.4 Testing for White Noise

For a normal white noise series with variance σ², the periodogram ordinates are independent and, when divided by σ², have chi-square distributions with 2 degrees of freedom (df). These properties lead to tests of the white noise null hypothesis. You are justified in using an F test for the single sinusoid plus white noise model when the appropriate ω is known in advance, as in Section 7.2. You would not be justified in testing the largest observed ordinate (just because it is the largest) with F.
If you test for a period 3 component in multiple sets of white noise data (your null hypothesis), the F test statistic will have an F distribution. However, if you always test the largest ordinate whether or not it occurs at period 3, then this new F statistic will never be less than the F for period 3 and will usually be larger. Clearly this new "F" statistic cannot have the same F distribution. Fisher computed the distribution, under the white noise null hypothesis, of the largest periodogram ordinate divided by the mean of all the 2 df ordinates. In the plant enzyme data, omission of the 1 df ordinate 132.6 gives Fisher's kappa test statistic

   κ = 11933 / [ (22723.8 - 132.6/2 + 11933)/14 ] = 4.83

Fuller (1996) discusses this test along with the cumulative periodogram test. The latter uses C_k, the ratio of the sum of the first k periodogram ordinates to the sum of all the ordinates (again dropping any 1 df ordinate). The set of these C_k should behave like an ordered sample from a uniform distribution if the data are white noise. Therefore a standard distributional test, like those in PROC UNIVARIATE, can be applied to these cumulative ratios, resulting in a test of the white noise null hypothesis. Traditionally the Kolmogorov-Smirnov test is applied. (See Fuller, page 363, for more details.)

Interpolating in Fuller's table of critical values for Fisher's kappa with 14 ordinates gives 4.385 as the 10% and 4.877 as the 5% critical value. Our value 4.83 is significant at 10% but not quite at 5%. Therefore, if you were just searching for a large ordinate rather than focusing from the start on a 12-hour cycle, your evidence for a 12-hour cycle would be nowhere near as impressive. This illustrates the increase in statistical power that can be obtained when you know something about your subject matter. You obtain both white noise tests using the WHITETEST option, as shown in Output 7.5.

PROC SPECTRA DATA=PLANTS WHITETEST;
   VAR Y;
RUN;

Output 7.5 Periodogram-Based White Noise Tests (Fisher's kappa, (M-1)*Max(P(*))/Sum(P(*)), is 4.829682 with M-1 = 14, Max(P(*)) = 11932.89, and Sum(P(*)) = 34590.36; the Kolmogorov-Smirnov statistic, the maximum absolute difference between the standardized partial sums of the periodogram and the CDF of a uniform(0,1) random variable, is 0.255984)

For 14 periodogram ordinates, tables of the Kolmogorov-Smirnov (K-S) statistic indicate that a value larger than about 0.36 would be needed for significance at the 5% level, so 0.256 is not big enough. Fisher's test is designed to detect a single sinusoid buried in white noise and so would be expected to be more powerful under the model proposed here than the K-S test, which is designed to have some power against any departure from white noise.
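Fisher's kappa is easy to reproduce from the OUT2 data set created earlier. A sketch (dropping the frequency 0 line and the 1 df ordinate at frequency π; data set names are arbitrary):

DATA ORD;
   SET OUT2;
   IF PERIOD = . THEN DELETE;                   /* frequency 0 line */
   IF ROUND(FREQ,.0001) = 3.1416 THEN DELETE;   /* 1 df ordinate at pi */
RUN;
PROC MEANS DATA=ORD NOPRINT;
   VAR P_01;
   OUTPUT OUT=STATS MAX=MAXP SUM=SUMP N=M;
RUN;
DATA KAPPA;
   SET STATS;
   KAPPA = M*MAXP/SUMP;   /* M = 14 retained 2 df ordinates; matches 4.83 above */
   PUT KAPPA=;
RUN;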
7.5 Harmonic Frequencies

Just because a function is periodic does not necessarily mean it is a pure sinusoid. For example, the sum of a sinusoid of period k and another of period k/2 is a periodic function of period k but is not expressible as a single sinusoid. On the other hand, any periodic function of period k defined on the integers can be represented as the sum of sinusoids of periods k, k/2, k/3, etc. For a fundamental period k, the periods k/j for j = 2, 3, . . . are called "harmonics." Harmonics affect the wave shape but not the period. A period of 2 is the shortest period detectable in a periodogram, and its associated frequency, π, is sometimes called the Nyquist frequency. Thus the plant enzyme measurements were not taken frequently enough to investigate harmonics of the fundamental frequency 2π/3 (period 3). Even the first harmonic has period 3/2 < 2 and frequency 4π/3, which exceeds the Nyquist frequency π.

To further illustrate the idea of harmonics, imagine n = 36 monthly observations where there is a fundamental frequency 2π/12 and possibly contributions from the harmonic frequencies 2(2π)/12 and 3(2π)/12 plus white noise. To fit the model you create three sine and three cosine columns. The sine column for the fundamental frequency would have tth entry sin(2πt/12) and would go through 3 cycles in 36 observations. Now look at Output 7.6.

Output 7.6 Fundamental and Harmonic Sinusoids

Output 7.6 is a schematic representation of the regression X matrix just described and is interpreted as follows. On the left, a vertical column of dots represents the intercept column, a column of 1s. Just to its right is a wave that represents sin(2πt/12), and to its right is another wave representing cos(2πt/12). Run your finger down one of these two waves. Your finger cycles between one unit left and one unit right of the wave center line. There are three cycles in each of these two columns. Writing the deviations of dots from centers as numbers supplies the entries of the corresponding column of the X matrix. The two waves, or columns of X, currently under discussion will have regression coefficients A1 and B1. By proper choice of these, the regression will exactly fit any sinusoid of frequency 2π/12 regardless of its amplitude and phase. Similar comments apply to the other two pairs of waves, but note that, as you run your finger down any of these, the left-to-right oscillation is faster and thus there are more cycles: 36/6 = 6 for the middle pair and 36/4 = 9 for the rightmost pair, where 6 and 4 are the periods corresponding to the two harmonic frequencies. Three more pairs of columns, with periodicities 12/4 = 3, 12/5, and 12/6 = 2, fill out a full set of harmonics for a period 12 function measured at integer time points. They would add 6 more columns for a total of 12 waves, seeming to contradict the fact that a period 12 function has 11, not 12, degrees of freedom. However, at period 12/6 = 2 the sine column becomes sin(πt) = 0 for all t. Such a column of 0s would, of course, be omitted, leaving 11 columns (11 degrees of freedom) plus an intercept column associated with the period 12 function.
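A sketch of a DATA step that builds this X matrix of fundamental and harmonic regressors (the data set name, seed, and response coefficients are arbitrary assumptions; a real analysis would merge in the observed Y instead):

DATA HARMONICS;
   PI = 4*ATAN(1);
   ARRAY S{6} S1-S6;
   ARRAY C{6} C1-C6;
   DO T = 1 TO 36;
      DO J = 1 TO 6;
         S{J} = SIN(2*PI*J*T/12);   /* sine of period 12/J */
         C{J} = COS(2*PI*J*T/12);   /* cosine of period 12/J */
      END;
      /* illustrative response: fundamental plus first harmonic plus noise */
      Y = 10 + 5*S1 + 4*S2 + 2*C2 + RANNOR(314159);
      OUTPUT;
   END;
   DROP J;
RUN;
PROC REG DATA=HARMONICS;
   MODEL Y = S1-S5 C1-C6 / SS1;   /* S6 = sin(pi*t) = 0 is omitted */
RUN;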
If 36 consecutive observations from any period 12 function were regressed on this 12-column X matrix, the fit would be perfect at the observed points but would not necessarily interpolate well between them. A perfect fit at the observation times would result even if the sequence were repeated sets of six 1s followed by six -1s. The fitted values would exactly match the observed -1, 1 pattern at integer values of t, but interpolated values, say at time t = 5.9, would not be restricted to -1 or 1. One might envision the harmonics as fine-tuning the wave shape as you move up through the higher harmonic frequencies (shorter period fluctuations). This motivates the statistical problem of separating the frequencies that contribute to the true process from those that are fitting just random noise, so that a good picture of the wave shape results. Periodograms and associated tests are useful here.

The following outputs are generated from a sinusoid of period 12 plus another at the first harmonic, period 6. Each sinusoid is the sum of a sine and cosine component, thus allowing an arbitrary phase angle. For interpolation purposes, sine and cosine terms are generated for t in increments of 0.1, but Y exists only at integer t.

Output 7.7 Increased Resolution Using Harmonics

Output 7.7 shows three sets of fitted values. The sine and cosine at the fundamental frequency are used to produce the fitted values on the left side. These fitted values do not capture the double peak in each interval of 12 time points, and they miss the low and high extremes. Including the first harmonic gives a better fit and gives an idea of what the data-generating function looks like. The fitted values on the right side are those coming from the fundamental and all harmonic frequencies j(2π/12) for j = 1, . . . , 6, omitting the sine at j = 6. The minor wiggles there are due to the frequencies with j > 2. Adding all those extra parameters does not seem to have produced any useful new features in the fitted values. From PROC REG (see Output 7.9), the F test 1.53 for the frequencies with j > 2 is not significant, and the Type I sums of squares for j = 1 and 2 are large enough that neither the j = 1 nor the j = 2 frequencies can be omitted. Recall that you would not eliminate just a sine or cosine; they are treated as pairs. Rearrangement of terms or deletion of some terms would not affect the sums of squares here because the sine and cosine columns correspond to Fourier frequencies, so they are orthogonal to each other. The following PROC SPECTRA code is used to generate Output 7.8 and Output 7.10.
PROC SPECTRA DATA=COMPRESS P S ADJMEAN OUT=OUTSPECTRA;
   VAR Y;
   WEIGHTS 1 2 3 4 3 2 1;
RUN;

Output 7.9 Regression Estimates and F Test (the test of the harmonics with j > 2 gives F = 1.53 with 7 numerator and 24 denominator degrees of freedom, p = 0.2054; the j = 1 and j = 2 sine and cosine pairs have large, highly significant t statistics, while the higher harmonics do not)

Output 7.8 Periodogram with Two Independent Frequencies

The periodogram, shown in Output 7.8, makes it quite clear that there are two dominant frequencies: 2π/12 and its first harmonic, 4π/12. The last few lines of the program deliver a smoothed version of the periodogram, shown in Output 7.10, that will be discussed in Section 7.9. Smoothing is not helpful in this particular example.

Output 7.10 Smoothed Periodogram

7.6 Extremely Fast Fluctuations and Aliasing

Suppose a series actually has a frequency larger (faster fluctuations) than the Nyquist frequency π, for example, 4π/3 radians per observation. Imagine a wheel with a dot on its edge, and an observer who looks at the wheel each second. If the wheel rotates clockwise 4π/3 radians per second, at the first observation the dot will be 2π/3 radians counterclockwise, i.e., -2π/3 radians, from its previous position, and similarly for subsequent observations. Based on the dot's position, the observer only knows that the frequency of rotation is -2π/3 + 2πj for some integer j. These frequencies are all said to be aliased with -2π/3, where this frequency was selected because it is in the interval [-π, π]. Another alias will be seen to be 2π/3, as though the observer had moved to the other side of the wheel.

Because A cos(ωt) + B sin(ωt) = A cos(-ωt) - B sin(-ωt), it is not possible to distinguish a cycle of ω from one of -ω using the periodogram. Thus it is sufficient and customary to compute periodogram ordinates at the Fourier frequencies ω = 2πj/n with 0 ≤ 2πj/n ≤ π. Recall that the number of periodogram ordinates m is either (n - 1)/2 if n is odd or n/2 if n is even. Imagine a number line with reference points at jπ for all integers j, positive, negative, and zero. Folding that line back and forth in accordion fashion at these reference points maps the whole line into the interval [0, π]. The set of points that map into any ω are its aliases. For that reason, the Nyquist frequency is also referred to as the folding frequency. The reason that this frequency has names instead of always being called π is that some people prefer radians or cycles per second, per hour, per day, etc., rather than radians per observation as a unit of measure. If observations are taken every 15 minutes, the Nyquist frequency π radians per observation would convert to 4π radians per hour, or 2 cycles per hour. In this book, radians per observation and the Nyquist frequency π will be the standard.
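A quick numerical check of the aliasing just described (not from the book): at integer observation times, a sinusoid of frequency 4π/3 is indistinguishable from one at its alias -2π/3.

DATA ALIAS;
   PI = 4*ATAN(1);
   DO T = 1 TO 12;
      FAST = SIN(4*PI*T/3);    /* frequency 4*pi/3, beyond the Nyquist frequency */
      SLOW = SIN(-2*PI*T/3);   /* the aliased frequency -2*pi/3 */
      DIFF = FAST - SLOW;      /* 0 (up to rounding) at every integer t */
      OUTPUT;
   END;
RUN;
PROC PRINT DATA=ALIAS; RUN;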
When the periodogram is plotted over [0, π] and there appears to be a cycle at a bizarre frequency, ask yourself if this might be coming from a cycle beyond the Nyquist frequency.

7.7 The Spectral Density

Consider three processes:

   Wt = 10 + (5/3)et
   Yt = 10 + .8(Yt-1 - 10) + et
   Zt = 10 - .8(Zt-1 - 10) + et

where et ~ N(0, 0.36) is white noise. Each process has mean 10 and variance 1.

The spectral density function of a process is defined as

   f(ω) = (1/(2π)) Σ γ(h) cos(ωh)   (sum over h from -∞ to ∞)

where γ(h) is the autocovariance function. The function is symmetric: f(ω) = f(-ω). For a general white noise series Wt with variance σ², γ(0) = σ² and γ(h) = 0 if h is not 0. The spectral density for Wt becomes just f(ω) = σ²/(2π). Sometimes the spectral density is plotted over the interval -π ≤ ω ≤ π. Since for white noise f(ω) is σ²/(2π), the plot is just a rectangle of height σ²/(2π) over an interval of width 2π. The area of the rectangle, 2π(σ²/(2π)) = σ², is the variance of the series, and this (area = variance) will be true in general of a spectral density whether or not it is plotted as a rectangle.

Because the plot of f(ω) for white noise has equal height at each ω, it is said that all frequencies contribute equally to the variance of Wt. This is the same idea as white light, where all frequencies of the light spectrum are equally represented, or white noise in acoustics. In other words, the time series is conceptualized as the sum of sinusoids at various frequencies, with white noise having equal contributions at all frequencies. In general, then, the interpretation of the spectral density is the decomposition of the variance of a process into components at different frequencies.

An interesting mathematical fact is that if the periodogram is computed for data from any ARMA model, the periodogram ordinate at any Fourier frequency estimates 4πf(ω); that is, 4πf(ω) is (approximately) the periodogram ordinate's expected value. Dividing the periodogram ordinate by 4π thus gives an almost unbiased estimate of f(ω). If the plot over -π ≤ ω ≤ π is desired (so that the area under the curve is the variance), use the symmetry of f(ω) and plot the estimate at both ω and -ω. For white noise, of course, the periodogram estimates the same thing at each ω, and so averaging several values gives an even better estimate. You will see that local averaging of estimates often, but not always, improves estimation. Often only the positive frequency half of the estimated spectral density is plotted, and it is left to the reader to remember that the variance is twice the area of such a plot.

What do the spectral densities of Yt and Zt look like? Using a little intuition, you would expect the positively autocorrelated series Yt to fluctuate at a slower rate around its mean than does Wt. Likewise you would expect the negatively autocorrelated series Zt to fluctuate faster than Wt since, for Zt, positive deviations tend to be followed by negative and negative by positive. The slower fluctuation in Yt should show up as longer period waves, that is, higher periodogram ordinates at low frequencies. For Zt you'd expect the opposite: large contributions to the variance from frequencies near π.
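Output 7.11 (discussed next) is based on 1000 simulated values of each process. A sketch of the kind of simulation involved (the seed is arbitrary, and burn-in is ignored for brevity):

DATA WYZ;
   Y = 10; Z = 10;
   DO T = 1 TO 1000;
      E = SQRT(0.36)*RANNOR(20030501);
      W = 10 + (5/3)*E;            /* scaled white noise, variance 1 */
      Y = 10 + 0.8*(Y - 10) + E;   /* positively autocorrelated AR(1) */
      Z = 10 - 0.8*(Z - 10) + E;   /* negatively autocorrelated AR(1) */
      D = Y - LAG(Y);              /* differencing filter on Y (missing at t=1) */
      OUTPUT;
   END;
RUN;
PROC SPECTRA DATA=WYZ P OUT=PGRAM;
   VAR W Y Z D;
RUN;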
The behavior is as expected—high values near indicating low-frequency waves in high values near the extreme s for indicating high-frequency fluctuations, and a flat spectrum for Two other periodograms are shown. The first, in the bottom-left corner, is for Because is a moving linear combination of values, is referred to as a filtered version of Note that if the filter had been applied, the filtered series would just be white noise and the spectral density just a horizontal line. It is seen that linear filtering of this sort is a way of altering the spectral density of a process. The differencing filter has overcompensated for the autocorrelation, depressing the middle (near 0 too much so that instead of being level, the periodogram frequency) periodogram ordinates of dips down to 0 at the middle. 1− tY8. t ω t Z − t Y ω t D . tW . tY .1− tY − , tY t Y = t t D D .π t Z π− t Y fluctuation in should show up as longer period waves—that is, higher periodogram ordinates at t . tW W t Z Z t Y What do the spectral densities of and look like? Using a little intuition, you would expect the )π 2 / σ( π ≤ ω ≤ π− 2 ;)ω( fπ4 )ω( f π j ) ω( I j j n Let denote the periodogram ordinate at Fourier frequency constructed from n ω )ω( f cannot exist either. Despite the fact that ,)ω( *A)ω(A π4 )ω( Y f .π4 π ≤ ω ≤ π− )ω( Y f .0 )h( γ =ω )hω(soc )h(γ ∞− =h ,D t .π ≤ ω ≤ π − ∞ ∑ π12 = )ω( t Z n , tY ω f does not exist for unit where the Chapter 7: Spectral Analysis 341 From the graphs in Output 7.8 and 7.10, it is seen that the sinusoids are indicated more strongly by the unsmoothed P_01 than by the smoothed spectrum S_01. That is because the smoothing spreads the effect of the sinusoid into neighboring frequencies where the periodogram concentrates it entirely on the true underlying Fourier frequency. On the other hand, when the true spectrum is fairly smooth, as with X, Y, Z, and D in Output 7.11, the estimator should be smoothed. This presents a dilemma for the researcher who is trying to discover the nature of the true spectrum: the best way to smooth the spectrum for inspection is not known without knowing the nature of the true spectrum, in which case inspecting its estimate is of no interest. To address this, several graphs are made using different degrees of smoothing. The less smooth ones reveal spikes and the more smooth ones reveal the shape of the smooth regions of the spectrum. Dividing each periodogram ordinate by the corresponding spectral density results in a set of almost independent variables, each with approximately (exactly if the data are normal white noise) a chi-square distribution with 2 degrees of freedom, a highly variable distribution. The weights applied to produce the spectral density lower the variance while usually introducing a bias. The set of weights is called a spectral window, and the effective number of periodogram ordinates involved in an average is called the bandwidth of the window. The estimated spectral density approximates a weighted average of the true spectral density in an interval surrounding the target frequency rather than just at the target frequency. The interval is larger for larger bandwidths and hence the resulting potential for bias increased, whereas the variance of the estimate is decreased by increasing the bandwidth. 
7.10 Cross-Spectral Analysis

7.10.1 Interpreting Cross-Spectral Quantities

Interpreting cross-spectral quantities is closely related to the transfer function model, in which an output time series, Yt, is related to an input time series, Xt, through the equation

   Yt = Σ vj Xt-j + ηt   (sum over j from -∞ to ∞)

where ηt is a time series independent of the input, Xt. For the moment, assume ηt = 0. For example, let Yt and Xt be related by the transfer function

   Yt - .8Yt-1 = Xt

Then

   Yt = Σ (.8)^j Xt-j   (sum over j from 0 to ∞)

which is a weighted sum of current and previous inputs. Cross-spectral quantities tell you what happens to sinusoidal inputs. In the example, suppose Xt is the sinusoid

   Xt = sin(ωt)

where ω = 2π/12. Using trigonometric identities shows that the Yt satisfying Yt - .8Yt-1 = sin(ωt) must be of the form Yt = A sin(ωt - B). That is,

   A sin(ωt - B) - .8A sin(ωt - B - ω)
      = ( A cos(B) - .8A cos(B + ω) ) sin(ωt) + ( -A sin(B) + .8A sin(B + ω) ) cos(ωt)
      = sin(ωt)

Solving

   A cos(B) - .8A cos(B + ω) = 1
   -A sin(B) + .8A sin(B + ω) = 0

you have tan(B) = .8 sin(ω)/(1 - .8 cos(ω)) = 1.3022 and A = 1.9828. The transfer function produces output with amplitude 1.9828 times that of the input; it has the same frequency and a phase shift of arctan(1.3022) = .92 radians (52.5 degrees). These results hold only for ω = 2π/12. The output for any noiseless linear transfer function is a sinusoid of frequency ω when the input X is such a sinusoid; only the amplitude and phase are changed. In cross-spectral analysis, using arbitrary input and its associated output, you simultaneously estimate the gain and phase at all Fourier frequencies. An intermediate step is the computation of quantities called the cospectrum and the quadrature spectrum.

The theoretical cross-spectrum, f_yx(ω), is the Fourier transform of the cross-covariance function

   γ_yx(h) = E{ (Xt - E(Xt)) (Yt+h - E(Yt)) }

The real part of f_yx(ω) is the cospectrum, c(ω), and the imaginary part gives the quadrature spectrum, q(ω), where f_yx(ω) = c(ω) - i q(ω). In the example, multiply both sides of Yt - .8Yt-1 = Xt by Xt-h and take the expected value. You obtain

   γ_yx(h) - .8γ_yx(h-1) = γ_xx(h)

where γ_xx(h), which is assumed absolutely summable, is the autocovariance function for X. Taking Fourier transforms of both sides,

   (2π)^(-1) Σ ( γ_yx(h) - .8γ_yx(h-1) ) e^(-iωh) = (2π)^(-1) Σ γ_xx(h) e^(-iωh)

so, since e^(-iω) = cos(ω) - i sin(ω),

   f_yx(ω) - .8e^(-iω) f_yx(ω) = f_xx(ω)

or

   f_yx(ω)( 1 - .8cos(ω) + .8i sin(ω) ) = f_xx(ω)
Multiplying and dividing the left side by the complex conjugate (1 - .8cos(ω) - .8i sin(ω)), you obtain

   f_yx(ω) = f_xx(ω)( 1 - .8cos(ω) - .8i sin(ω) ) / ( 1.64 - 1.6cos(ω) )

You then have the cospectrum of X by Y (that of Y by X is the same),

   c(ω) = f_xx(ω)( 1 - .8cos(ω) ) / ( 1.64 - 1.6cos(ω) )

and the quadrature spectrum of X by Y (that of Y by X is -q(ω)),

   q(ω) = f_xx(ω){ .8sin(ω) / ( 1.64 - 1.6cos(ω) ) }

In Output 7.12 (pp. 348-349) the cospectrum and quadrature spectrum of Y by X, along with their estimates from PROC SPECTRA, are graphed for the example introduced in Section 7.10.3.

7.10.2 Interpreting Cross-Amplitude and Phase Spectra

The cross-amplitude spectrum is defined as

   A_yx(ω) = ( c²(ω) + q²(ω) )^(1/2)

In this example,

   A_yx(ω) = f_xx(ω)( 1.64 - 1.6cos(ω) )^(-0.5)

The gain is defined as the amplitude divided by the spectral density of X, or

   A_yx(ω)/f_xx(ω) = ( 1.64 - 1.6cos(ω) )^(-0.5)

provided f_xx(ω) ≠ 0. Thus, the gain is the multiplier applied to the sinusoidal component of X at frequency ω to obtain the amplitude of the frequency ω component of Y in a noiseless transfer function, in our case (1.64 - 1.6cos(ω))^(-0.5).

The phase spectrum of X by Y is defined as

   Ψ_xy(ω) = arctan( q(ω)/c(ω) )

and that of Y by X is Ψ_yx(ω) = arctan( -q(ω)/c(ω) ). In this example,

   Ψ_yx(ω) = arctan{ -.8sin(ω) / ( 1 - .8cos(ω) ) }

This is the phase difference between the output and input at frequency ω. The cross-spectrum can be expressed as

   f_yx(ω) = A_yx(ω) exp( iΨ_yx(ω) )

These cross-amplitude and phase spectra are graphed along with their estimates from PROC SPECTRA in Output 7.12. The graphs explain the effect of the transfer function on a sinusoidal input: its amplitude is changed by the gain, and it undergoes a phase shift Ψ_yx(ω). The graphs show how these changes are a function of frequency ω.

Transfer function relationships are not perfect (noiseless), so an error series is introduced into the model as

   Yt = Σ vj Xt-j + ηt

where ηt is uncorrelated with Xt. Now, in analogy to the correlation coefficient, the squared coherency is defined as

   K²_yx(ω) = |f_yx(ω)|² / ( f_xx(ω) f_yy(ω) )

This measures the strength of the relationship between X and Y as a function of frequency. The spectrum of ηt satisfies

   f_η(ω) = f_yy(ω) - f_yx(ω) f_xx^(-1)(ω) f_xy(ω) = f_yy(ω)( 1 - K²_yx(ω) )

To compute the theoretical coherency for the example, you need assumptions on X and ηt. Assume Xt is white noise with variance 1 and that ηt satisfies ηt = .5ηt-1 + et with var(et) = 1. Then

   K²_yx(ω) = { 1 + ( 1.64 - 1.6cos(ω) ) / ( 1.25 - cos(ω) ) }^(-1)

The true squared coherency and its estimate from PROC SPECTRA for the example are also graphed in Output 7.12 (p. 347).

7.10.3 PROC SPECTRA Statements

PROC SPECTRA gives these names to estimates of the cross-spectral quantities for the first two variables in the VAR list:

   Cospectrum                 CS_01_02
   Quadrature spectrum        QS_01_02
   Cross-amplitude spectrum   A_01_02
   Phase spectrum             PH_01_02
   Squared coherency          K_01_02

PROC SPECTRA options for cross-spectral analysis are as follows:

PROC SPECTRA DATA=IN OUT=O1 COEF P S CROSS A K PH WHITETEST ADJMEAN;
   VAR Y1 Y2;
   WEIGHTS 1 1 1 1 1;
RUN;

CROSS indicates that cross-spectral analysis is to be done. It produces the cospectrum CS_01_02 and the quadrature spectrum QS_01_02 when used in conjunction with S. CROSS produces the real part RP_01_02 and the imaginary part IP_01_02 of the cross-periodogram when used in conjunction with P. Thus, RP and IP are unweighted estimates, and CS and QS are weighted and normalized estimates of the cospectrum and quadrature spectrum. A, K, and PH request, respectively, estimation of the cross-amplitude, squared coherency, and phase spectra (CROSS must be specified also). Weighting is necessary to obtain a valid estimate of the squared coherency.
Consider the following 512 observations Yt generated from the model

   Yt = Xt + .5Xt-1 + ηt   (a noiseless transfer function plus a noise term)

where ηt is white noise with variance 1.3333 and where Xt is an autoregressive (AR) process,

   Xt = .8Xt-1 + et

with et white noise with variance 1. The following SAS code produces appropriate spectral estimates:

PROC SPECTRA DATA=A OUT=OOO P S CROSS A K PH;
   WEIGHTS 1 1 1 1 1 1 1 1 1 1 1;
   VAR Y X;
RUN;

Plots of estimated and true spectra are overlaid in Output 7.12.

Output 7.12 Plots of Estimated and True Spectra (overlaid plots of the estimated and true spectral densities, cospectrum, quadrature spectrum, cross-amplitude, phase, and squared coherency, including the panel labeled CROSS-AMPLITUDE Y BY X)

Although the data are artificial, think of X and Y as representing furnace and room temperatures in a building. The phase spectrum shows that long-term fluctuations (ω near zero) and short-term fluctuations (ω near π) for furnace and room temperatures are nearly in phase. The phase spectrum starts at zero and then decreases, indicating that X (the furnace temperature) tends to peak slightly before room temperature at intermediate frequencies. This makes sense if the furnace is connected to the room by a reasonably long pipe. The squared coherency is near one at low frequencies, indicating a strong correlation between room temperature and furnace temperature at low frequencies. The squared coherency becomes smaller at the higher frequencies in this example. The estimated phase spectrum can vary at high frequencies as a result of this low correlation between furnace and room temperatures at high frequencies. Because of mixing as the air travels from the furnace to the room, high-frequency oscillations in furnace temperatures tend not to be strongly associated with temperature fluctuations in the room. The gain, A_01_02/S_02, behaves like the cross-amplitude spectrum A_01_02 for this example. This behavior shows that low-frequency fluctuations in the furnace produce high-amplitude fluctuations in room temperature, while high-frequency fluctuations produce low-amplitude (small variance) fluctuations in room temperature. The transfer function tends to smooth the high-frequency fluctuations. Because of mixing in the pipe leading from the furnace to the room, it is not surprising that high-frequency (fast oscillation) temperature changes in the furnace are not transferred to the room.

7.10.4 Cross-Spectral Analysis of the Neuse River Data

In Chapter 3, "The General ARIMA Model," the differenced log flow rates of the Neuse River at Kinston (Y) and Goldsboro (X) are analyzed with the transfer function model

   Yt = .495Xt-1 + .273Xt-2 + εt

with noise εt satisfying

   ( 1 - 1.163B + .48B² )εt = ( 1 - .888B )vt

where vt is a white noise series with σ²_v = .0058. The input series satisfies

   ( 1 - 1.241B + .291B² + .117B³ )Xt = ( 1 - .874B )et

with σ²_e = .0399. The spectral quantities discussed above are computed and plotted using the estimated model parameters. First, the model-based spectral quantities are developed. Then, the direct estimates (no model) of the spectral quantities from PROC SPECTRA are plotted. When the models above are used, the spectrum of Goldsboro is

   f_XX(ω) = ( .0399/(2π) ) |1 - .874e^(-iω)|² / |1 - 1.241e^(-iω) + .291e^(-2iω) + .117e^(-3iω)|²

Note that the cross-covariance of Yt with Xt-j is the same as the cross-covariance of Xt-j with Yt, so you obtain

   γ_YX(j) = .495γ_XX(j - 1) + .273γ_XX(j - 2)

Thus, the cross-spectrum is

   f_YX(ω) = ( .495e^(-iω) + .273e^(-2iω) ) f_XX(ω)
The real part (cospectrum) is

   c(ω) = ( .495cos(ω) + .273cos(2ω) ) f_XX(ω)

and the quadrature spectrum is

   q(ω) = ( .495sin(ω) + .273sin(2ω) ) f_XX(ω)

The phase spectrum is

   Ψ_YX(ω) = arctan( q(ω)/c(ω) )

The spectrum of Kinston (Y) is

   f_YY(ω) = |.495e^(-iω) + .273e^(-2iω)|² f_XX(ω) + f_εε(ω)

where

   f_εε(ω) = ( .0058/(2π) ) |1 - .888e^(-iω)|² / |1 - 1.163e^(-iω) + .48e^(-2iω)|²

The squared coherency is simply

   K²_YX(ω) = |f_YX(ω)|² / ( f_XX(ω) f_YY(ω) )

Consider this pure delay transfer function model:

   Yt = βXt-c

Using the Fourier transform of the cross-covariance function, you can show the following relationship:

   f_YX(ω) = βf_XX(ω)( cos(ωc) + i sin(ωc) )

Thus, the phase spectrum is

   Ψ_YX(ω) = arctan( tan(ωc) ) = ωc

a linear function of frequency with slope c. When you use the ordinates in the plot of the phase spectrum as dependent variable values and frequency as the independent variable, a simple linear regression using a few low frequencies gives 1.34 as an estimate of c. This indicates a lag of 1.34 days between Goldsboro and Kinston. Because ARIMA models contain only integer lags, this information appears as two spikes, at lags 1 and 2, in the prewhitened cross-correlations. However, with the cross-spectral approach, you are not restricted to integer lags.
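A sketch of the low-frequency regression just described (the data set name RIVEROUT and the frequency cutoff are assumptions; PH_01_02 and FREQ would come from a PROC SPECTRA run on the Goldsboro and Kinston series):

DATA LOWFREQ;
   SET RIVEROUT;
   IF 0 < FREQ < 0.5;               /* keep only a few low frequencies */
RUN;
PROC REG DATA=LOWFREQ;
   MODEL PH_01_02 = FREQ / NOINT;   /* slope estimates the pure delay c */
RUN;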
In Output 7.13 the irregular plots are the cross-spectral estimates from PROC SPECTRA. These are overlaid on the (smooth) plots computed above from the transfer function fitted by PROC ARIMA.

Output 7.13 Overlaying the Smoothed Model-Derived Plots from PROC ARIMA and the Irregular PROC SPECTRA Plots

From one viewpoint, the closeness of the PROC SPECTRA plots to the model-derived plots provides a check on the ARIMA transfer function model and estimates. From another viewpoint, the model-based spectral plots provide a highly smoothed version of the PROC SPECTRA output.

7.10.5 Details on Gain, Phase, and Pure Delay

Suppose Xt is a perfect sine wave, Xt = α sin(ωt + δ). Now suppose Yt = 3Xt-1; then

   Yt = 3Xt-1 = 3α sin(ωt - ω + δ)

is also a perfect sine wave. The phase of Y is ω radians less than the phase of X, and the amplitude of Y is 3 times that of X. You could also say that the phase of X is ω radians more than the phase of Y, and the amplitude of X is 1/3 that of Y. You have seen that the idea of cross-spectral analysis is to think of a general pair of series X and Y as each being composed of sinusoidal terms, then estimating how the sinusoidal components of Y are related, in terms of amplitude and phase, to the corresponding sinusoidal components of X.

With two series, Y and X, there is a phase of Y by X and a phase of X by Y. If Yt = 3Xt-1, then Y is behind X by 1 time unit; that is, the value of X at time t is a perfect predictor of Y at time t + 1. Similarly, X is ahead of Y by 1 time unit. This program creates Yt = 3Xt-1 with Xt = et ~ N(0,1), so it is an example of a simple noiseless transfer function. The spectrum of X is f_XX(ω) = 1/(2π) = 0.1592 at all frequencies ω, and Y has spectrum 9/(2π) = 1.4324.

DATA A;
   PI = 4*ATAN(1);
   X=0;
   DO T = 1 TO 64;
      Y = 3*X;   *Y IS 3 TIMES PREVIOUS X*;
      X=NORMAL(1827655);
      IF T=64 THEN X=0;
      OUTPUT;
   END;
RUN;
PROC SPECTRA DATA=A P S CROSS A K PH OUT=OUT1 COEFF;
   VAR X Y;
RUN;
PROC PRINT LABEL DATA=OUT1;
   WHERE PERIOD > 12;
   ID PERIOD FREQ;
RUN;

Since no weights were specified, no smoothing has been done. Only a few frequencies are printed out.

Output 7.14 X and Y Series (the printed lines for periods 64, 32, 21.33, 16, and 12.8 list the Fourier cosine and sine transforms, periodograms, and spectral density estimates of X and Y; the real and imaginary parts of the cross-periodogram; the cospectrum, quadrature spectrum, and cross-amplitude; the squared coherency, which is identically 1; and the phase of X by Y)

It is seen that at period 64, X has a component 0.16213cos(2πt/64) - 0.09548sin(2πt/64) = 0.188156sin(2πt/64 + 2.10302), and Y has a component 0.51212cos(2πt/64) - 0.23739sin(2πt/64) = 0.564465sin(2πt/64 - 2.00486), where 0.564465/0.188156 = 3 is the amplitude increase in going from X to Y. The phase shift is 2.10302 - 2.00486 = 0.09817 radians. Each periodogram ordinate is (n/2) times the sum of squares of the two coefficients, (64/2)[(0.16213)² + (0.09548)²] = 1.13278 for X at period 64, for example. Each Y periodogram ordinate is 9 times the corresponding X periodogram ordinate. This exact relationship would not hold if noise were added to Y. Within the class of ARMA models, the periodogram divided by 2πf(ω) (where f(ω) is the true spectral density of the process) has approximately a chi-square distribution with 2 degrees of freedom, a distribution with mean 2. This motivates the periodogram ordinate divided by 4π as an estimator of f(ω) for both Y and X. Each spectral density estimator is the corresponding periodogram ordinate divided by 4π: for example, 1.13278/(4π) = 0.0902 for X at period 64.

In the VAR statement of PROC SPECTRA, the order of variables is X Y, and you see that this produces the phase of X by Y, not Y by X. The phase of X is ω radians more than the phase of Y, as was shown above. Thus the entries in the phase column are exactly the same as the frequencies.
The plot of phase by frequency is a straight line with slope 1, and this slope gives the pure delay d for Yt = βXt-d, so d = 1. Had the variables been listed in the order Y X, the negative of this phase would have appeared as the phase spectrum estimate. The slope of the phase plot near the origin gives some idea of the lag relationship between Y and X in a transfer function model with or without added noise, as long as the coherency there is reasonably strong. The delay need not be an integer, as was illustrated with the river data earlier. The phase plot of the generated data that simulated furnace and room temperatures had a negative slope near the origin. The room temperature Y is related to lagged furnace temperature X, and with the variables listed in the order Y X, the phase of Y by X is produced, giving the negative slope. Had the order been X Y, the plot would be reflected about the horizontal line at 0, and an initial positive slope would have been seen. For the river data, you see that the sites must have been listed in the order Goldsboro Kinston in PROC SPECTRA, since the phase slope is positive and Goldsboro (X) is upstream from Kinston (Y).

If Yt = 3Xt-1 and if Xt has an absolutely summable covariance function γ_XX(h), then Y also has a covariance function,

   γ_YY(h) = E{ Yt Yt+h } = 9E{ Xt-1 Xt+h-1 } = 9γ_XX(h)

The summability assumption ensures the existence of the theoretical spectral densities. By definition, the theoretical spectral density f_XX(ω) of X is the Fourier transform of the covariance sequence:

   f_XX(ω) = (1/(2π)) Σ γ_XX(h) e^(-iωh)

The processes also have a cross-covariance function,

   γ_YX(h) = E{ Xt Yt+h } = 3E{ Xt Xt+h-1 } = 3γ_XX(h - 1)

whose Fourier transform is the cross-spectral density of Y by X:

   f_YX(ω) = (1/(2π)) Σ γ_YX(h) e^(-iωh)
           = (1/(2π)) Σ 3γ_XX(h - 1) e^(-iωh)
           = 3e^(-iω) (1/(2π)) Σ γ_XX(h - 1) e^(-iω(h-1))
           = 3[ cos(ω) - i sin(ω) ] f_XX(ω)

Writing f_YX(ω) as c(ω) - i q(ω), the real part c(ω) = 3cos(ω) f_XX(ω) is the cospectrum, and the coefficient of -i, q(ω) = 3sin(ω) f_XX(ω), is the quadrature spectrum. For example, at period 32 you find 3cos(2π/32) = 2.9424 and 3sin(2π/32) = 0.5853.
Multiplying these by the estimated X spectral density gives (2.9424)(0.20305) = 0.5974, the estimated cospectrum of X by Y for period 32, and similarly (0.5853)(0.20305) = 0.1188, the estimated quadrature spectrum of X by Y on the printout. The phase and amplitude spectra are transformations of c(ω) and q(ω) and are often easier to interpret. The phase of X by Y is ω, and that of Y by X is -ω, as would be expected from the previous discussion of phase diagrams. The phase shows you the lag relationship between the variables, as has been mentioned several times. The amplitude of the frequency ω component is

   A(ω) = ( c²(ω) + q²(ω) )^(1/2) = ( 9cos²(ω) + 9sin²(ω) )^(1/2) f_XX(ω) = 3f_XX(ω)

This is called the amplitude of X by Y, and in the printout each of these entries is the corresponding spectral density estimate of X multiplied by 3. The quantity A²(ω)/f_XX(ω) is the spectral density for that part of Y that is exactly related to X, without any added noise. Since Y is related to X by a noiseless transfer function, the spectral density of Y should be A²(ω)/f_XX(ω) = 9f_XX(ω); for example, at period 32 you find (0.60916)²/0.20305 = 1.82749. Recall that the quantity A(ω)/f_XX(ω) has been referred to earlier as the "gain." It represents the amplitude multiplier for the frequency ω component in going from X to Y in a model where Y is related to X without noise. In our case the gain is 3.

A more realistic scenario is that an observed series Wt consists of Yt plus an added noise component Nt independent of X (and thus of Y). Here the phase, amplitude, and gain using W and X as data have their same interpretation, but refer to relationships between X and Y—that is, between X and the part of W that is a direct transfer function of X. You can think of fluctuations in X over time as providing energy that is transferred into Y, such as vibrations in an airplane engine transferred to the wing or fuselage. The fluctuations in that object consist of the transferred energy plus independent fluctuations such as wind movements while flying. The spectral density of W, f_WW(ω), will no longer be A²(ω)/f_XX(ω) but will be this plus the noise spectrum. In a system with noise, A²(ω)/( f_XX(ω) f_WW(ω) ) provides an R² measure as a function of frequency. Its symbol is κ²(ω), and it is called the squared coherency. In a noiseless transfer function like Yt = 3Xt-1, the squared coherency between Y and X would be 1 at all frequencies because A²(ω)/( f_XX(ω) f_YY(ω) ) = 9f_XX²(ω)/( 9f_XX²(ω) ) = 1 in that case. This appears in the output; however, in the absence of smoothing weights, the squared coherency is really meaningless, as would be an R² of 1 in a simple linear regression with only 2 points.

This small example without smoothing is presented to show and interpret the cross-spectral calculations. In practice, smoothing weights are usually applied so that more accurate estimates can be obtained. Another practical problem arises with the phase. The phase is usually computed as the angle in [-π/2, π/2] whose tangent is q(ω)/c(ω). If a phase angle a little less than π/2 is followed by one just a bit bigger than π/2, the interval restriction will cause this second angle to be reported as an angle just a little bigger than -π/2. The phase diagram can thus show phases jumping back and forth between -π/2 and π/2 when in fact they could be represented as not changing much at all. Some practitioners choose to add and subtract multiples of π from the phase at selected frequencies in order to avoid excessive fluctuations in the plot.
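A sketch of that add-and-subtract adjustment, often called phase unwrapping (the data set and variable names are assumptions, with the PROC SPECTRA output sorted by frequency; the π/2 jump threshold is an arbitrary choice):

DATA UNWRAP;
   SET OUT1;
   RETAIN ADJ 0 PREV .;
   PI = 4*ATAN(1);
   IF PREV NE . AND PH_01_02 + ADJ - PREV >  PI/2 THEN ADJ = ADJ - PI;
   IF PREV NE . AND PH_01_02 + ADJ - PREV < -PI/2 THEN ADJ = ADJ + PI;
   PHASE = PH_01_02 + ADJ;   /* adjusted phase for plotting */
   PREV = PHASE;
RUN;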
Chapter 8 Data Mining and Forecasting

8.1 Introduction
8.2 Forecasting Data Model
8.3 The Time Series Forecasting System
8.4 HPF Procedure
8.5 Scorecard Development
8.6 Business Goal Performance Metrics
8.7 Graphical Displays
8.8 Goal-Seeking Model Development
8.9 Summary

8.1 Introduction

This chapter deals with the process of forecasting many time series with little intervention by the user. The goal is to illustrate a modern automated interface for a collection of forecasting models, including many that have been discussed so far. Most models herein, such as damped trend exponential smoothing and the Winters method, are equivalent to specific ARIMA models. Some of these were developed in the literature without using ARIMA ideas and were later recognized as being ARIMA models.

The examples focus on Web traffic data that accumulate very quickly over time and require a demanding warehousing and analytics strategy to automate the process. Analysis of such large amounts of data is often referred to as “data mining.” In this chapter SAS Web Analytics are used to read the Web traffic data, summarize the information for detailed and historical analyses, and organize the information into a data warehouse. The SAS Web Analytics reports provide important details about your Web traffic—who is visiting your site, how long they stay, and what material or pages they are viewing. This information can then be accumulated over time to construct a set of metrics that enables you to optimize your e-business investment. Results are displayed on the Web and accessed by an Internet browser.

In addition, the SAS/ETS software Time Series Forecasting System (TSFS) is examined. This system provides a menu-driven interface to SAS/ETS and SAS/GRAPH procedures to facilitate quick and easy analysis of time series data. The HPF (High Performance Forecasting) procedure is used here to provide an automated way to generate forecasts for many time series in one step. All parameters associated with the forecast model are optimized based on the data.

Finally, the chapter uses a scorecard to integrate, distribute, and analyze the information enterprisewide to help make the right decisions. This interface helps business users analyze data in new and different ways to anticipate business trends and develop hypotheses.
They can receive automated alerts to early indicators of excellent or poor performance. The interface enables IT (information technology) professionals to fully automate and personalize the collection and distribution of knowledge across the organization. The application presented here is available through the SAS IntelliVisor for Retail service. The delivery mechanism is provided through an ASP (application service provider) infrastructure.

8.2 Forecasting Data Model

Under the ASP framework, each night we receive customer Web logs after 12:00 AM local time. The Web logs are unzipped, placed in a file directory, and analyzed using SAS Web Analytics. The data examine key metrics used to describe activity during the 24 hours of e-retailing in a given day. One company using this approach is the online retailer the Vermont Country Store. (They provided a modified version of their data for illustration here. See www.vermontcountrystore.com.) The variables and their descriptions are provided in Table 8.1, followed by a listing of some of the data.

Table 8.1 Variables and Descriptions

Variable                     Description
date                         SAS Date variable formatted in DATE9.
revenue                      Revenue (TARGET)
buyer                        Number of Purchasing Sessions
dollars_per_purch_session    Average Order Value
items_per_purch_session      Average Items per Purchasing Session
catalog_quick_purch_perc     %CQS Buyers
perc_abandon_carts           Abandon Carts %
num_session                  Number of Sessions
requestcatalog_con           Number of Catalog Requests
productsuggestion_pages      Number of Product Suggestion Pages Viewed
new_cust_perc                New/Total Sessions × 100
purch_perc                   Purchase Response Rate
new_buy_perc                 New/Total Buyers × 100

8.3 The Time Series Forecasting System

Open the TSFS and select the data set to be accessed. The TSFS automatically identifies the Time ID variable DATE and recognizes that the data are at daily intervals. Since revenue is the target or main response variable of interest, select the graph button to evaluate revenue behavior over time. Select the Revenue variable and then select the Graph button. The Revenue variable shows a decrease in variability over time with some periodic tendencies. This is not unusual. Retail sales over the Web tend to show a daily cycle over time. (Again, this graph represents a display that does not reflect the true revenue at the Vermont Country Store.)

The series looks nonstationary, and examining the autocorrelation plots suggests the need to difference. By selecting the p=.05 button you can access the Dickey-Fuller unit root test. This test, described previously, fails to reject the null hypothesis of nonstationarity only with four augmenting lags. The TSFS employs ordinary unit root tests for the factor $(1 - \varphi B)$ and unit root tests for the seasonal polynomial $(1 - \varphi B^s)$ using k lagged differences as augmenting terms. That is, these are factors in an autoregressive polynomial of order k + 1, and $H_0{:}\ \varphi = 1$ is tested. The user should always entertain the possibility of fitting a model outside the class of models considered here. For example, had the pre-Christmas surge in sales been modeled, say, with a separate mean, the residuals might look more stationary. The display below only goes up through 5 augmenting terms.
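The unit root tests that the TSFS runs interactively can also be reproduced in batch code with the IDENTIFY statement of PROC ARIMA. This is a minimal sketch under the assumption that the daily metrics reside in a data set named DAILY; ADF=(0,1,2,3,4) requests augmented Dickey-Fuller tests with 0 through 4 augmenting lags, and DLAG=7 gives a seasonal unit root test for the day-of-week polynomial.

   proc arima data=daily;
      /* ordinary unit root tests with 0 through 4 augmenting lags */
      identify var=revenue stationarity=(adf=(0,1,2,3,4));
      /* seasonal unit root test for the (1 - phi*B**7) factor */
      identify var=revenue stationarity=(adf=(4) dlag=7);
   run;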
Go back to the main window and request that the TSFS automatically fit models for every series. We're notified that 12 models will be fit for each series. Notice the TSFS selects a seasonal exponential smoothing model for revenue. The TSFS provides an assortment of different seasonal and nonseasonal models and chooses the “best” model based on an information criterion, in this case minimizing the root mean square error. The user has some control over the list of potential models and simple features of the data that are used initially to pare down the list, in this case, to 12 models that might fit well.

Select the Graph button to see the forecasts, and then select the forecast graph button to see the forecasts and confidence intervals. The graph and a review of the data and forecasts using the data button suggest the seasonal exponential smoothing model does not fit the larger revenue spikes very well, although it does a reasonable job overall. Because exponential smoothing is analogous to fitting a unit root model, the typical fast-spreading prediction intervals are seen as the forecast goes beyond one or two steps. You can also go back to the Automatic Fitting Results screen to evaluate the forecasts for each series individually. The TSFS can also be further automated by using the forecast command and the SAS/AF Forecast Application Command Builder.

8.4 HPF Procedure

The HPF procedure can forecast millions of time series at a time, with the series organized into separate variables or across BY groups. You can use the following forecasting models:

Smoothing Models:
   Simple
   Double
   Linear
   Damped Trend
   Seasonal
   Winters Method (additive and multiplicative)

Additionally, transformed versions of these models are provided:
   Log
   Square Root
   Logistic
   Box-Cox

For intermittent time series (series where a large number of values are zero values), you can use Croston's method (Croston 1977). All parameters associated with the forecast model are optimized based on the data. The HPF procedure writes the time series with extrapolated forecasts, the series summary statistics, the forecast confidence limits, the parameter estimates, and the fit statistics to output data sets.

The HPF procedure step below examines the application of the automatic forecasting technique to the evaluation of the seven different forecasting methods described above. The program creates a Forecasts data set that contains forecasts for seven periods beyond the end of the input data set VC_DATA.DAILY_STATS_09AUG02. The data represent daily values for Revenue, a variable describing the total number of purchasing dollars for a given day. The daily variable indicator, date, is formatted DATE9. The GPLOT procedure is used to display the actual values, predicted values, and upper and lower confidence limits overlaid on the same graph.
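A minimal sketch of such a step, assuming the data set and variable names described above, follows; MODEL=BEST lets PROC HPF choose among the candidate smoothing models, LEAD=7 produces the seven forecasts, and the OUTFOR= data set feeds the GPLOT overlay.

   proc hpf data=vc_data.daily_stats_09aug02 out=forecasts
            outfor=outfor lead=7;
      id date interval=day;          /* daily Time ID variable        */
      forecast revenue / model=best; /* choose best smoothing model   */
   run;

   proc gplot data=outfor;
      /* actual values, predictions, and confidence limits overlaid */
      plot (actual predict lower upper)*date / overlay;
   run;

The OUT= data set holds the extrapolated series, while OUTFOR= holds the ACTUAL, PREDICT, LOWER, and UPPER values used in the plot.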
The HPF procedure describes the input data set WORK.DAILY_STATS_09AUG02 and the Time ID variable DATE. There are 268 observations in the data set and no missing observations. The descriptive statistics are also provided. The Winters additive method seasonal exponential smoothing model fits best based on the RMSE statistic, and only the level weight is statistically different from 0. When performing these operations in an automatic fashion on many series, it is often found that the models tend to be overparameterized.

The forecasts for the next seven days are displayed below, in addition to the standard errors and upper and lower 95% prediction intervals. The lower 95% confidence interval falls below 0 as you extend well beyond the end of the historical data. The statistics of fit for the selected model are given as a reference for model comparison. As noted, these are calculated based on the full range of data. A detailed description of these summary statistics can be found by consulting the SAS System 9 documentation. A forecast summary shows values for the next seven days, and a sum forecast for the seven-day total is displayed at the bottom. The graph below suggests a drop in purchasing sessions in early January. The Winters additive method of seasonal exponential smoothing does a nice job of tracking the historical data shown by the heavy middle graph line.

8.5 Scorecard Development

Each day the Vermont Country Store is provided with a report called a “scorecard” that examines its key metrics (variables of interest). The revenue is denoted Revenue (TARGET). The actual value for the day is removed and then forecasted using the HPF procedure. Since the current day's value is removed (9Aug02 in this case), the standard error and forecast estimate are independent of today's observed value. Standardized differences, denoted “Difference” and computed as $(Y_t - \hat{Y}_t)/s_{\hat{y}}$, are also displayed for each metric.

8.6 Business Goal Performance Metrics

From a retailing business perspective, often you would like the actual values of a metric like Buyer Percent to be larger than the Forecast value so that you are doing better than expected. For a metric like Error Page Percent, smaller values are preferred. For each metric a directional business performance measure is computed for the day. If the preferred direction is greater than the forecast, the calculation is

\[ \left(\frac{\Phi(x)}{2} + 0.5\right)\times 100
\quad\text{where}\quad x = \frac{Y - \hat{Y}}{s_{\hat{Y}}}
\;\text{and}\;
\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt \]

Thus the Business performance measure has a minimum value of 50% (when Y is small). When Y matches the prediction, the Business performance statistic has a value of 75%; it increases toward 100% as Y gets larger than the prediction. When the preferred direction of the business movement is less than the prediction, the Business performance measure is calculated as

\[ \left(1 - \frac{\Phi(x)}{2}\right)\times 100 \]

Using this approach, each metric in the table has a Business performance measure. The AUTOREG procedure is then applied by regressing the target (Revenue in this example) on the other metrics and treating 1 − p-values as weight statistics. The sum of products of weight statistics and Business performance measures gives an overall daily mean score as shown in the previous display.
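The two formulas are straightforward to compute in a DATA step with the PROBNORM function, which returns the standard normal distribution function $\Phi$. This is a sketch only; the data set SCORECARD and the variables actual, predict, std, and direction are hypothetical names, not the scorecard's actual layout.

   data perf;
      set scorecard;                   /* one row per metric (hypothetical) */
      x = (actual - predict) / std;    /* standardized difference           */
      if direction = 'UP' then         /* larger than forecast preferred    */
         bperf = (probnorm(x)/2 + 0.5) * 100;
      else                             /* smaller than forecast preferred   */
         bperf = (1 - probnorm(x)/2) * 100;
   run;

At x = 0 both expressions give 75%, and they approach the 50% and 100% bounds as x moves in the unfavorable or favorable direction.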
8.7 Graphical Displays

You can go to the scorecard table and select each metric to display the predictions and limits in a graphical format. In the following display, the scattered black dots represent the observed data, and the dots connected by lines represent the predictions. On the target day (9Aug02) the observed value is removed, so we designate the forecasts and upper and lower 95% prediction intervals with plus signs. Throughout the other historical data, the forecasts and forecast bounds are based on a model developed from the full data set that includes 9Aug02. The same is true of the forecasts and bounds beyond 9Aug02. If the HPF procedure selects a seasonal model, you will see a display of the daily averages, as shown below. By clicking on a given day of the week, you can also see the associated history for that day over the past history of the data. The drop in revenue is also displayed in the chart of past Sunday revenues.

The scorecard also supports the output from a regression with autocorrelation and the ability to solve for inputs one at a time when seeking input values that deliver a specified level of a target. This is done using the SOLVE statement in PROC MODEL. By simply selecting the Goal Seeking Scorecard, you can find values of the inputs that satisfy increasing values of the target Revenue. An example of fitting a model and using it to later solve for values of the inputs is illustrated below. We restrict the explanatory variables to Purchasing Sessions, Average Order Value, and Product Suggestions to illustrate how the back solution is obtained.

The 0 Percent column indicates the current daily settings for the metrics on 09Aug02. Increasing the target by 5% would set revenue at $199,905.09. To achieve this goal would require 2769 purchasing sessions, assuming all the other inputs remain at their 0 percent level (i.e., the 9Aug02 value). It is interesting to note that the number of product suggestions would need to drop to 54.5 to achieve this 5% increase. In other words, fewer visitors would be suggesting alternative products to the site and would be more apt to purchase the observed products. Based on the regression results, the number of product suggestions becomes negative (unreasonable) as revenue increases beyond 5%. The display uses metadata (data that characterize positive and negative business directions, acceptable ranges, etc.) that describe reasonable values and set the corresponding negative values to missing. The increasing values for purchasing sessions and average order size provide reasonable results.

8.8 Goal-Seeking Model Development

The MODEL procedure analyzes models in which the relationships among the variables comprise a system of one or more nonlinear equations. The %AR macro can be used to specify models with autoregressive error processes similar to the AUTOREG procedure. In this case we are regressing revenue on buyer, dollars_per_purch_session, and productsuggestion_pages. The SOLVE data set is created to view values of the input variables that satisfy the 5% increase for the target variable Revenue. The output below examines the parameter estimates and test statistics. Lags 1, 5, and 7 for the autoregressive errors are statistically different from 0. The signs of the coefficients are positive for purchasing sessions and average order value and negative for product suggestions. The R square and the significant parameters and AR terms suggest a reasonable model.
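A sketch of such a fit-then-solve step is given below. It is an outline under stated assumptions, not the program in the display: the parameter names b0 through b3 and the data set GOAL, which holds the 5% revenue target with the other inputs fixed at their current values, are hypothetical.

   proc model data=vc_data.daily_stats_09aug02;
      /* regression of the target on the three chosen inputs */
      revenue = b0 + b1*buyer + b2*dollars_per_purch_session
                   + b3*productsuggestion_pages;
      %ar(revenue, 7)   /* autoregressive error process through lag 7 */
      fit revenue;      /* estimates b0-b3 and the AR parameters      */
      /* goal seeking: solve for the buyer value implied by the target */
      solve buyer / data=goal out=solve5;
   run;

Naming a different input in the SOLVE list produces the corresponding back solution for that input, holding the others fixed, which is how the individual changes discussed below are obtained one at a time.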
Observation 1 in the SOLVE data set shows the current values of the four variables for 09AUG2002. Using the fitted model with autoregressive errors, observations 2 through 4 demonstrate the changes in each individual input required to achieve a 5% increase in revenue, assuming the other inputs are at their current levels. These match the Goal Seeking Scorecard results.

8.9 Summary

This example illustrates how you can apply automated forecasting techniques in a data mining environment. SAS IntelliVisor for Retail through the ASP delivery channel requires the ability to construct analytic results quickly in a batch environment without user intervention. The use of a daily scorecard allows the consumer to focus on what's important and how things are changing over time. By focusing on a goal-seeking report, you can set goals and determine the changes required to produce increasing returns on investment.

References

Akaike, H. 1974. “Markovian Representation of Stochastic Processes and Its Application to the Analysis of Autoregressive Moving Average Processes.” Annals of the Institute of Statistical Mathematics 26:363–386.

Akaike, H. 1976. “Canonical Correlations of Time Series and the Use of an Information Criterion.” In Advances and Case Studies in System Identification, ed. R. Mehra and D. G. Lainiotis. New York: Academic Press.

Akdi, Y., and D. A. Dickey. 1997. “Periodograms of Unit Root Time Series: Distributions and Tests.” Communications in Statistics 27:69–87.

Anderson, T. W. 1971. The Statistical Analysis of Time Series. New York: Wiley.

Bailey, C. T. 1984. “Forecasting Industrial Production 1981–1984.” Proceedings of the Ninth Annual SAS Users Group International Conference, Hollywood Beach, FL, 50–57.

Bartlett, M. S. 1947. “Multivariate Analysis.” Supplement to the Journal of the Royal Statistical Society, Series B, IX:176–197.

Bartlett, M. S. 1966. An Introduction to Stochastic Processes. 2d ed. Cambridge: Cambridge University Press.

Bollerslev, Tim. 1986. “Generalized Autoregressive Conditional Heteroskedasticity.” Journal of Econometrics 31:307–327.

Box, G. E. P., and D. R. Cox. 1964. “An Analysis of Transformations.” Journal of the Royal Statistical Society B26:211.

Box, G. E. P., and G. M. Jenkins. 1976. Time Series Analysis: Forecasting and Control. Rev. ed. Oakland: Holden-Day.

Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. 1994. Time Series Analysis: Forecasting and Control. 3d ed. Englewood Cliffs, NJ: Prentice Hall.

Brillinger, D. R. 1975. Time Series: Data Analysis and Theory. New York: Holt, Rinehart & Winston.

Brocklebank, J., and D. A. Dickey. 1984. SAS Views: SAS Applied Time Series Analysis and Forecasting. Cary, NC: SAS Institute Inc.

Chang, M. C., and D. A. Dickey. 1993. “Recognizing Overdifferenced Time Series.” Journal of Time Series Analysis 15:1–8.

Chavern, J. 1984. “On the Limitations of Akaike's Information Criterion and Its Use in PROC STATESPACE.” Proceedings of the Ninth Annual SAS Users Group International Conference, Hollywood Beach, FL, 106–111.

Cohen, H., ed. 1981. Metal Statistics. New York: Fairchild Publications.

Croston, J. D. 1977. “Forecasting and Stock Control for Intermittent Demands.” Operations Research Quarterly 23, no. 3.

Davis, H. T. 1941. The Analysis of Economic Time Series. Chicago: Principia Press.

Dickey, D. A., W. R. Bell, and R. B. Miller. 1986. “Unit Roots in Time Series Models: Tests and Implications.” American Statistician 40:12–26.

Dickey, D. A., and W. A. Fuller. 1979. “Distribution of the Estimators for Autoregressive Time Series with a Unit Root.” Journal of the American Statistical Association 74:427–431.
Dickey, D. A., and W. A. Fuller. 1981. “Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root.” Econometrica 49:1057–1072.

Dickey, D. A., D. P. Hasza, and W. A. Fuller. 1984. “Testing for Unit Roots in Seasonal Time Series.” Journal of the American Statistical Association 79:355–367.

Dickey, D. A., D. W. Janssen, and D. L. Thornton. 1991. “A Primer on Cointegration with an Application to Money and Income.” Federal Reserve Bank of St. Louis Review 73:58–78.

Draper, N., and H. Smith. 1998. Applied Regression Analysis. 3d ed. New York: Wiley.

Durbin, J. 1960. “The Fitting of Time Series Models.” International Statistical Review 28:233–244.

Engle, Robert. 1982. “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50:987–1007.

Engle, R. F., and C. W. J. Granger. 1987. “Cointegration and Error Correction: Representation, Estimation, and Testing.” Econometrica 55:251–276.

Evans, B. 1998. “Estimation and Hypothesis Testing in Nonstationary Time Series Using Frequency Domain Methods.” Ph.D. diss., North Carolina State University.

Fountis, N. G., and D. A. Dickey. 1989. “Testing for a Unit Root Nonstationarity in Multivariate Autoregressive Time Series.” Annals of Statistics 17:419–428.

Fuller, W. A. 1976. Introduction to Statistical Time Series. New York: Wiley.

Fuller, W. A. 1986. “Using PROC NLIN for Time Series Prediction.” Proceedings of the Eleventh Annual SAS Users Group International Conference, Atlanta, GA, 63–68.

Fuller, W. A. 1996. Introduction to Statistical Time Series. 2d ed. New York: Wiley.

Hall, A. 1992. “Testing for a Unit Root in Time Series with Data-Based Model Selection.” Journal of Business and Economic Statistics 12:461–470.

Hamilton, J. D. 1994. Time Series Analysis. Princeton, NJ: Princeton University Press.

Hannan, E. J., and J. Rissanen. 1982. “Recursive Estimation of Mixed Autoregressive-Moving Average Order.” Biometrika 69, no. 1 (April): 81–94.

Harvey, A. C. 1981. Time Series Models. Oxford: Philip Allan Publishers.

Jarque, C. M., and A. K. Bera. 1980. “Efficient Tests for Normality, Homoskedasticity and Serial Independence of Regression Residuals.” Economics Letters 6:255–259.

Jenkins, G. M., and D. G. Watts. 1968. Spectral Analysis and Its Applications. Oakland: Holden-Day.

Johansen, S. 1988. “Statistical Analysis of Cointegrating Vectors.” Journal of Economic Dynamics and Control 12:231–254.

Johansen, S. 1991. “Estimation and Hypothesis Testing of Cointegrating Vectors in Gaussian Vector Autoregressive Models.” Econometrica 59:1551–1580.

Johansen, S. 1994. “The Role of the Constant and Linear Terms in Cointegration Analysis of Non-Stationary Variables.” Econometric Reviews 13:205–230.

Johnston, J. 1972. Econometric Methods. 2d ed. New York: McGraw-Hill.

Jones, R. H. 1974. “Identification and Autoregressive Spectrum Estimation.” IEEE Transactions on Automatic Control AC-19:894–897.

Liu, Shiping, Ju-Chin Huang, and Gregory L. Brown. 1998. “Information and Risk Perception: A Dynamic Adjustment Process.” Risk Analysis 18:689–699.

Ljung, G. M., and G. E. P. Box. 1978. “On a Measure of Lack of Fit in Time Series Models.” Biometrika 65:297–303.

McSweeny, A. J. 1978. “Effects of Response Cost on the Behavior of a Million Persons: Charging for Directory Assistance in Cincinnati.” Journal of Applied Behavior Analysis 11:47–51.
Nelson, D. B. 1991. “Conditional Heteroskedasticity in Asset Returns: A New Approach.” Econometrica 59:347–370.

Nelson, D. B., and C. Q. Cao. 1992. “Inequality Constraints in the Univariate GARCH Model.” Journal of Business and Economic Statistics 10:229–235.

Pham, D. T. 1978. “On the Fitting of Multivariate Process of the Autoregressive-Moving Average Type.” Biometrika 65:99–107.

Priestley, M. B. 1980. “System Identification, Kalman Filtering, and Stochastic Control.” In Directions in Time Series, ed. D. R. Brillinger and G. C. Tiao. Hayward, CA: Institute of Mathematical Statistics.

Priestley, M. B. 1981. Spectral Analysis and Time Series. Volume 1: Univariate Series. New York: Academic Press.

Robinson, P. M. 1973. “Generalized Canonical Analysis for Time Series.” Journal of Multivariate Analysis 3:141–160.

Said, S. E., and D. A. Dickey. 1984. “Testing for Unit Roots in Autoregressive Moving Average Models of Unknown Order.” Biometrika 71, no. 3: 599–607.

SAS Institute Inc. 1984. SAS/ETS User's Guide, Version 5 Edition. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 1985. SAS/GRAPH User's Guide, Version 5 Edition. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 1985. SAS Introductory Guide. 3d ed. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 1985. SAS User's Guide: Basics, Version 5 Edition. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 1985. SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc.

Singleton, R. C. 1969. “An Algorithm for Computing the Mixed Radix Fast Fourier Transform.” IEEE Transactions on Audio and Electroacoustics AU-17:93–103.

Stock, J. H., and M. W. Watson. 1988. “Testing for Common Trends.” Journal of the American Statistical Association 83:1097–1107.

Tsay, Ruey S., and George C. Tiao. 1984. “Consistent Estimates of Autoregressive Parameters and Extended Sample Autocorrelation Function for Stationary and Nonstationary ARMA Models.” Journal of the American Statistical Association 79, no. 385 (March): 84–96.

Tsay, Ruey S., and George C. Tiao. 1985. “Use of Canonical Analysis in Time Series Model Identification.” Biometrika 72, no. 2 (August): 299–315.

U.S. Bureau of the Census. 1982. “Construction Workers in Thousands.” Construction Review.

U.S. Department of Labor. 1977. “Publishing and Printing Nonproduction Workers 1944–1977.” Handbook of Labor Statistics.

Whittle, P. 1963. “On the Fitting of Multivariate Autoregressions and the Approximate Canonical Factorization of a Spectral Density Matrix.” Biometrika 50:129–134.
Der The Next Step: Integrating the Software Life Cycle with SAS ® Programming by Paul Gill Output Delivery System: The Basics by Lauren E. Haworth SAS ® Macro Programming Made Easy by Michele M. Burlew Painless Windows: A Handbook for SAS ® Users by Jodie Gilmore (for Windows NT and Windows 95) SAS ® Programming by Example by Ron Cody and Ray Pass Painless Windows: A Handbook for SAS ® Users, Second Edition by Jodie Gilmore (updated to include Version 7 features) SAS ® Programming for Researchers and Social Scientists, Second Edition by Paul E. Spector PROC TABULATE by Example by Lauren E. Haworth SAS ® Software Roadmaps: Your Guide to Discovering the SAS ® System by Laurie Burch and SherriJoyce King Professional SAS Programmer’s Pocket Reference, Fourth Edition ® SAS ® Software Solutions: Basic Data Processing by Thomas Miron by Rick Aster Professional SAS ® Programmer’s Pocket Reference, Second Edition by Rick Aster SAS ® Survival Analysis Techniques for Medical Research, Second Edition by Alan B. Cantor Professional SAS ® Programming Shortcuts by Rick Aster SAS ® System for Elementary Statistical Analysis, Second Edition by Sandra D. Schlotzhauer and Ramon C. Littell Programming Techniques for Object-Based Statistical Analysis with SAS® Software by Tanya Kolosova and Samuel Berestizhevsky SAS ® System for Forecasting Time Series, 1986 Edition by John C. Brocklebank and David A. Dickey Quick Results with SAS/GRAPH ® Software by Arthur L. Carpenter and Charles E. Shipp SAS ® System for Mixed Models by Ramon C. Littell, George A. Milliken, Walter W. Stroup, and Russell D. Wolfinger Quick Results with the Output Delivery System by Sunil K. Gupta SAS ® System for Regression, Third Edition by Rudolf J. Freund and Ramon C. Littell Quick Start to Data Analysis with SAS ® by Frank C. Dilorio and Kenneth A. Hardy SAS ® System for Statistical Graphics, First Edition by Michael Friendly Reading External Data Files Using SAS®: Examples Handbook by Michele M. Burlew The SAS ® Workbook and Solutions Set (books in this set also sold separately) by Ron Cody Regression and ANOVA: An Integrated Approach Using SAS ® Software by Keith E. Muller and Bethel A. Fetterman Selecting Statistical Techniques for Social Science Data: A Guide for SAS® Users by Frank M. Andrews, Laura Klem, Patrick M. O’Malley, Willard L. Rodgers, Kathleen B. Welch, and Terrence N. Davidson Reporting from the Field: SAS ® Software Experts Present Real-World Report-Writing Applications SAS Applications Programming: A Gentle Introduction ® Solutions for Your GUI Applications Development Using SAS/AF ® FRAME Technology by Don Stanley by Frank C. Dilorio SAS ® for Forecasting Time Series, Second Edition by John C. Brocklebank, and David A. Dickey Statistical Quality Control Using the SAS ® System by Dennis W. King SAS for Linear Models, Fourth Edition ® A Step-by-Step Approach to Using the SAS ® System for Factor Analysis and Structural Equation Modeling by Larry Hatcher by Ramon C. Littell, Walter W. Stroup, and Rudolf J. Freund SAS ® for Monte Carlo Studies: A Guide for Quantitative Researchers by Xitao Fan, Ákos Felsovályi, Stephen A. Sivo, ˝ and Sean C. Keenan A Step-by-Step Approach to Using the SAS ® System for Univariate and Multivariate Statistics by Larry Hatcher and Edward Stepanski Step-by-Step Basic Statistics Using SAS®: Student Guide and Exercises (books in this set also sold separately) by Larry Hatcher JMP® Books Basic Business Statistics: A Casebook by Dean P. Foster, Robert A. Stine, and Richard P. 
Waterman Strategic Data Warehousing Principles Using SAS ® Software by Peter R. Welbrock Business Analysis Using Regression: A Casebook by Dean P. Foster, Robert A. Stine, and Richard P. Waterman Survival Analysis Using the SAS ® System: A Practical Guide by Paul D. Allison JMP® Start Statistics, Second Edition by John Sall, Ann Lehman, and Lee Creighton Table-Driven Strategies for Rapid SAS ® Applications Development by Tanya Kolosova and Samuel Berestizhevsky Regression Using JMP® by Rudolf J. Freund, Ramon C. LIttell, and Lee Creighton Tuning SAS ® Applications in the MVS Environment by Michael A. Raithel Univariate and Multivariate General Linear Models: Theory and Applications Using SAS ® Software by Neil H. Timm and Tammy A. Mieczkowski Using SAS ® in Financial Research by Ekkehart Boehmer, John Paul Broussard, and Juha-Pekka Kallunki Using the SAS ® Windowing Environment: A Quick Tutorial by Larry Hatcher Visualizing Categorical Data by Michael Friendly Working with the SAS ® System by Erik W. Tilanus Your Guide to Survey Research Using the SAS® System by Archer Gravely WILEY SERIES IN PROBABILITY AND STATISTICS Established by WALTER A. SHEWHART and SAMUEL S. WILKS Editors: David J. Balding, Peter Bloomfield, Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Louise M. Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels Editors Emeriti: Vic Barnett, J. Stuart Hunter, David G. Kendall A complete list of the titles in this series appears at the end of this volume. BRUNNER, DOMHOF, and LANGER · Nonparametric Analysis of Longitudinal Data in Factorial Experiments BUCKLEW · Large Deviation Techniques in Decision, Simulation, and Estimation CAIROLI and DALANG · Sequential Stochastic Optimization CHAN · Time Series: Applications to Finance CHATTERJEE and HADI · Sensitivity Analysis in Linear Regression CHATTERJEE and PRICE · Regression Analysis by Example, Third Edition CHERNICK · Bootstrap Methods: A Practitioner’s Guide CHERNICK and FRIIS · Introductory Biostatistics for the Health Sciences CHILÈS and DELFINER · Geostatistics: Modeling Spatial Uncertainty CHOW and LIU · Design and Analysis of Clinical Trials: Concepts and Methodologies CLARKE and DISNEY · Probability and Random Processes: A First Course with Applications, Second Edition *COCHRAN and COX · Experimental Designs, Second Edition CONGDON · Bayesian Statistical Modelling CONOVER · Practical Nonparametric Statistics, Second Edition COOK · Regression Graphics COOK and WEISBERG · Applied Regression Including Computing and Graphics COOK and WEISBERG · An Introduction to Regression Graphics CORNELL · Experiments with Mixtures, Designs, Models, and the Analysis of Mixture Data, Third Edition COVER and THOMAS · Elements of Information Theory COX · A Handbook of Introductory Statistical Methods *COX · Planning of Experiments CRESSIE · Statistics for Spatial Data, Revised Edition ´´ CSÖRGO and HORVÁTH · Limit Theorems in Change Point Analysis DANIEL · Applications of Statistics to Industrial Experimentation DANIEL · Biostatistics: A Foundation for Analysis in the Health Sciences, Sixth Edition *DANIEL · Fitting Equations to Data: Computer Analysis of Multifactor Data, Second Edition DASU and JOHNSON · Exploratory Data Mining and Data Cleaning DAVID · Order Statistics, Second Edition *DEGROOT, FIENBERG, and KADANE · Statistics and the Law DEL CASTILLO · Statistical Process Adjustment for Quality Control DETTE and STUDDEN · The Theory of Canonical Moments with Applications in Statistics, Probability, and 

