Wednesday, December 3, 2014

Kazakhstani income and oil prices (wonkish)

In Kazakhstan, a 10% change in the annual spot price of Brent oil seems to lead to roughly a 5% change in average real income in the same direction. 

Since oil and natural gas exports make up more than a third of Kazakhstan’s economy, a simple model based on global oil prices explains changes in gross domestic product (GDP) per capita well.  GDP per capita is measured in international dollars adjusted for inflation.

The table below reports the results of an Ordinary Least Squares (OLS) regression of real GDP per capita on the spot price of Brent oil (a benchmark for the global market) for 1999 through 2013.  Both variables are annual and in log form. 

In general, the model performs fairly well.  R-squared indicates that the model explains 92.5% of the variation in average income from year to year.  The large F statistic (159.52) suggests that the model almost certainly predicts more accurately than one that assumes that GDP per capita is constant over time.  Since GDP is in log form, the root mean squared error (.085) suggests that the model’s typical annual mistake in predicting GDP is roughly 8.5%.

The t statistic for the coefficient on OilPrice is large (12.63).  Setting aside shocks to the world economy that are unexpected and extraordinary, we can be virtually certain that global oil prices will continue to dominate Kazakhstan’s economy in the next few years at least. 
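The fit statistics quoted here can be cross-checked against one another with a few lines of arithmetic.  The sketch below is in Python and uses only the rounded figures from the regression output in this post; the variable names are mine.  It relies on two standard identities: R-squared is the model sum of squares divided by the total sum of squares, and in a regression with a single independent variable the F statistic equals the square of the t statistic on that variable.

```python
# Rounded values copied from the regression output in this post.
ss_model = 1.147   # sum of squares explained by the model
ss_total = 1.240   # total sum of squares
t_oil = 12.63      # t statistic on OilPrice

# R-squared is the share of total variation that the model explains.
r_squared = ss_model / ss_total

# With one independent variable, F equals the squared t statistic.
f_from_t = t_oil ** 2

print(f"R-squared from sums of squares: {r_squared:.3f}")
print(f"F implied by the t statistic:   {f_from_t:.2f}")
```

Both results should agree with the reported values up to rounding.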

The coefficient on OilPrice is the elasticity of average real income with respect to the oil price.  A 1% increase in that price leads to a rise in income in the same year of .46 of a percent.
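For price changes larger than 1%, the elasticity applies to the ratio of prices rather than to the percentage itself.  The sketch below, in Python, assumes only the log-log form of the model and the reported coefficient of .456; the function name is hypothetical.

```python
ELASTICITY = 0.456  # coefficient on OilPrice from the output in this post


def income_change_pct(oil_change_pct, elasticity=ELASTICITY):
    """Percent change in income implied by a percent change in the oil price.

    In a log-log model, income scales with (price ratio) ** elasticity.
    """
    price_ratio = 1.0 + oil_change_pct / 100.0
    return (price_ratio ** elasticity - 1.0) * 100.0


for change in (1, 10, -10):
    print(f"oil {change:+d}% -> income {income_change_pct(change):+.2f}%")
```

The implied response to a 10% move in either direction comes out a little under 5%, and the response to a fall is slightly larger in absolute value than the response to a rise of the same percentage.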

The constant in the model (7.845) suggests that 2,553 international dollars of annual real income do not depend on oil prices.  In particular, if oil prices fall to $1 per barrel, then the model predicts an income per capita of 2,553 dollars, about one-seventh of actual income averaged over the period of 1999 through 2013. 

Output for the OLS model
      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(1, 13)      =  159.52
       Model |    1.147        1    1.147              Prob > F      =   0.000
    Residual |     .093       13     .007              R-squared     =    .925
-------------+------------------------------
       Total |    1.240       14     .089              Root MSE      =    .085

------------------------------------------------------------------------------
         GDP |      Coef.   Std. Err.        t    P>|t|   [95% Conf. Interval]
-------------+----------------------------------------------------------------
    OilPrice |       .456        .036    12.63    0.000        .378       .534
    Constant |      7.845        .145    54.28    0.000       7.533      8.158


The estimated model is:

LN (GDP per capita) = 7.845 + .456 * LN (OilPrice)

where LN denotes a natural log.
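To turn the estimated equation into a prediction in dollars, exponentiate both sides.  The Python sketch below does this with the reported coefficients; the function name is hypothetical, and the $100 price is only an illustrative input, not a forecast.

```python
import math

CONSTANT = 7.845  # intercept from the output in this post
SLOPE = 0.456     # coefficient on OilPrice


def predicted_income(oil_price):
    """Predicted real GDP per capita (international dollars)
    for a Brent spot price in dollars per barrel."""
    return math.exp(CONSTANT + SLOPE * math.log(oil_price))


# At $1 per barrel, LN(OilPrice) is zero, so only the constant remains.
print(round(predicted_income(1.0)))

# An illustrative (hypothetical) price of $100 per barrel.
print(round(predicted_income(100.0)))
```

The first line printed reproduces the 2,553 international dollars implied by the constant.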

Caveats.  A plain-vanilla OLS model assumes that the strength of the relationship between the dependent variable and an independent variable is the same whether the latter rises or falls.  Thus our model predicts that if oil prices rise 10%, then income will rise about 5%; and if oil prices fall 10%, income will fall 5%.  In reality, annual oil prices have fallen sharply only once since 1998; and income has fallen only once in that period.  Both declines occurred in 2009, during the Great Recession.  At that time, oil prices fell 36.3%; income, only 1.4%, at least partly because the government stepped up spending to cover the loss of private consumption.  So experience suggests that the model may overstate the loss of income due to a large decline in oil prices.
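The size of the gap in 2009 can be illustrated by comparing the actual change in income with the change that the estimated elasticity alone would imply.  The Python sketch below uses the figures quoted above; it holds everything but the oil price constant, so it is a rough illustration rather than a formal out-of-sample test.

```python
ELASTICITY = 0.456                # coefficient on OilPrice
oil_change_2009 = -36.3           # percent change in the oil price, 2009
actual_income_change_2009 = -1.4  # percent change in income, 2009

# Income change implied by the log-log model for this fall in oil prices.
price_ratio = 1.0 + oil_change_2009 / 100.0
implied_income_change = (price_ratio ** ELASTICITY - 1.0) * 100.0

print(f"implied by the model: {implied_income_change:.1f}%")
print(f"actual:               {actual_income_change_2009:.1f}%")
```

The implied decline is many times larger than the actual one.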

The overstatement occurs because OLS is a linear model and because it assigns the same weight to each observation.  Of the 15 observations, only one is of a decline in average income.  –Leon Taylor, tayloralmaty@gmail.com 

Technical notes

OLS assumes that the independent variables (which are on the right-hand side) capture all important systematic influences on the dependent variable (on the left-hand side).  Unimportant or random influences show up in the error term, which is the difference between the actual value of the dependent variable and the predicted value.  

OLS also assumes that the errors are independent of one another and have the same variance.  This assumption might often be wrong.  The variance of the error term might depend on the independent variables (heteroskedasticity).  Or the error term may correlate with itself; e(t), for example, might correlate with e(t-1) (serial correlation).  Let’s check our model for these possibilities.

Heteroskedasticity.  I ran the Breusch-Pagan/Cook-Weisberg test.  Its null hypothesis is homoskedasticity (that is, the variance of the error is the same for each observation, which is what OLS assumes).  The p-value is .119: if the errors were homoskedastic, a test statistic at least this large would occur 11.9% of the time.  So I did not reject the null.

Serial correlation.  I ran the Breusch-Godfrey (LM) test.  Its null hypothesis is no serial correlation.  The p-value is .364, so I did not reject the null.

Nonstationarity.  A model built on variables whose statistical properties change over time may predict poorly because it is based on obsolete data.  Dickey-Fuller tests found that the GDP variable was stationary but the oil-price variable was not.  (The test statistics were -4.24 and -1.35 respectively.  Both variables were in log form.)

When rewritten, nonstationary variables may have a stable – that is, stationary – relationship to one another.  In that case, the revision may predict well.  To check for this possibility of cointegration, I ran the Dickey-Fuller test on the error term of a model regressing GDP on oil prices.  The test statistic was -2.81.  Given an indefinitely large sample, the critical value under the Engle-Granger procedure at the 10% level of significance is -3.04.  (The level of significance is the largest probability of error that one is willing to tolerate by rejecting the null hypothesis.  In this case, the null is nonstationarity.) 
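The comparison of the test statistic with the critical value can be spelled out.  Tests of the Dickey-Fuller type are left-tailed, so one rejects nonstationarity only when the statistic is more negative than the critical value.  The Python sketch below uses the two numbers given above; the variable names are mine.

```python
test_statistic = -2.81        # Dickey-Fuller statistic on the residuals
critical_value_10pct = -3.04  # Engle-Granger critical value, 10% level

# Left-tailed test: reject the null of nonstationarity only if the
# statistic lies below (is more negative than) the critical value.
reject_nonstationarity = test_statistic < critical_value_10pct

print(f"Reject nonstationarity at the 10% level: {reject_nonstationarity}")
```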

The statistic does not quite reach that critical value, and my sample is small, so I concluded only that the model may border on cointegration.  I preferred this model to one that first-differences the oil-price variable in order to get stationarity, since differencing would eliminate an observation.  But one should bear in mind that in the model I use, nonstationarity may affect forecasts.

Omitted variables.  If we fail to control for a systematic influence with an independent variable, then it will show up in the error term.  If it also correlates with an independent variable, then it may bias the coefficient estimate for that variable.  This problem is more serious than are serial correlation and heteroskedasticity since these distort the standard errors but do not bias the coefficient estimates.

I applied the Ramsey RESET test, which examines the possibility of omitted variables that are higher-order forms of the independent variables already in the model.  The null hypothesis is that there are no omitted variables.  The p-value is .72, so I did not agonize over the problem.
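The three diagnostic tests in these notes share one decision rule: reject the null only if the p-value falls below the chosen level of significance.  The Python sketch below applies that rule to the reported p-values.  The 10% level is my assumption for illustration; any conventional level at or below it gives the same decisions here.

```python
# p-values reported in these notes.
P_VALUES = {
    "Breusch-Pagan/Cook-Weisberg (heteroskedasticity)": 0.119,
    "Breusch-Godfrey (serial correlation)": 0.364,
    "Ramsey RESET (omitted variables)": 0.72,
}

ALPHA = 0.10  # assumed level of significance, chosen for illustration

# Reject the null only when the p-value is below the chosen level.
decisions = {name: p < ALPHA for name, p in P_VALUES.items()}

for name, reject in decisions.items():
    verdict = "reject the null" if reject else "do not reject the null"
    print(f"{name}: p = {P_VALUES[name]:.3f} -> {verdict}")
```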
