Monday, September 26, 2016

Fascinating Conference at Chicago

I just returned from the University of Chicago conference, "Machine Learning: What's in it for Economics?"  Lots of cool things percolating.  I'm teaching a Penn Ph.D. course later this fall on aspects of the ML/econometrics interface.  Feeling really charged.

By the way, I hadn't yet been to the new Chicago economics "cathedral" (Saieh Hall for Economics) and Becker-Friedman Institute.  Wow.  What an institution, both intellectually and physically.

Tuesday, September 20, 2016

On "Shorter Papers"

Journals should not corral shorter papers into sections like "Shorter Papers".  Doing so sends a subtle (actually unsubtle) message that shorter papers are basically second-class citizens, somehow less good, or less important, or less something -- not just less long -- than longer papers.  If a paper is above the bar, then it's above the bar, and regardless of its length it should be published simply as a paper, not a "shorter paper", or a "note", or anything else.  There are numerous examples of "shorter papers" that are much more important than the vast majority of "longer papers".

Monday, September 12, 2016

Time-Series Econometrics and Climate Change

It's exciting to see time-series econometrics contributing to the climate change discussion.

Check out the upcoming CREATES conference, "Econometric Models of Climate Change", here.

Here are a few good examples of recent time-series climate research, in chronological order.  (There are many more.  Look through the reference lists, for example, in the 2016 and 2017 papers below.)

Jim Stock et al. (2009) in Climatic Change.

Pierre Perron et al. (2013) in Nature.

Peter Phillips et al. (2016) in Nature.

Proietti and Hillebrand (2017), forthcoming in Journal of the Royal Statistical Society.

Tuesday, September 6, 2016

Inane Journal "Impact Factors"

Why are journals so obsessed with "impact factors"? (The five-year impact factor is average citations/article in a five-year window.)  They're often calculated to three decimal places, and publishers trumpet victory when they go from (say) 1.225 to 1.311!  It's hard to think of a dumber statistic, or dumber over-interpretation.  Are the numbers after the decimal point anything more than noise, and for that matter, are the numbers before the decimal much more than noise?

Why don't journals instead use the same citation indexes used for individuals? The leading index seems to be the h-index, which is the largest integer h such that an individual has h papers, each cited at least h times. I don't know who cooked up the h-index, and surely it has issues too, but the gurus love it, and in my experience it tells the truth.
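
For concreteness, here's a minimal sketch of the h-index calculation just defined, using made-up citation counts:

```python
# A quick sketch of the h-index as defined above: the largest integer h such
# that at least h papers have at least h citations each. Counts are made up.
def h_index(citations):
    """Compute the h-index from a list of per-paper citation counts."""
    cites = sorted(citations, reverse=True)
    h = 0
    while h < len(cites) and cites[h] >= h + 1:
        h += 1
    return h

print(h_index([50, 18, 7, 5, 4, 1]))  # 4: four papers each have at least 4 citations
```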

Even better, why not stop obsessing over clearly insufficient statistics of any kind? I propose instead looking at what I'll call a "citation signature plot" (CSP), simply plotting the number of cites for the most-cited paper, the number of cites for the second-most-cited paper, and so on. (Use whatever window(s) you want.) The CSP reveals everything, instantly and visually. How high is the CSP for the top papers? How quickly, and with what pattern, does it approach zero? etc., etc. It's all there.

Google Scholar CSPs are easy to make for individuals, and they're tremendously informative. They'd be only slightly harder to make for journals. I'd love to see some.
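
To make the idea concrete, here's a minimal sketch of a CSP using made-up citation counts; real Google Scholar counts for a person or journal would drop straight in:

```python
# A minimal sketch of a citation signature plot (CSP): sort per-paper citation
# counts in descending order and plot them against rank. Counts are made up.
import matplotlib.pyplot as plt

citations = [412, 305, 180, 122, 96, 70, 44, 31, 20, 12, 8, 5, 3, 1, 0]
csp = sorted(citations, reverse=True)

plt.plot(range(1, len(csp) + 1), csp, marker="o")
plt.xlabel("Paper rank (most-cited first)")
plt.ylabel("Citations")
plt.title("Citation signature plot")
plt.show()
```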

Monday, August 29, 2016

On Credible Cointegration Analyses

I may not know whether some \(I(1)\) variables are cointegrated, but if they are, I often have a very strong view about the likely number and nature of cointegrating combinations. Single-factor structure is common in many areas of economics and finance, so if cointegration is present in an \(N\)-variable system, for example, a natural benchmark is 1 common trend (\(N-1\) cointegrating combinations).  Moreover, the natural cointegrating combinations are almost always spreads or ratios (which of course are spreads in logs). For example, log consumption and log income may or may not be cointegrated, but if they are, then the obvious benchmark cointegrating combination is \((\ln C - \ln Y)\). Similarly, the obvious benchmark for \(N\) government bond yields \(y\) is \(N-1\) cointegrating combinations, given by term spreads relative to some reference yield; e.g., \(y_2 - y_1\), \(y_3 - y_1\), ..., \(y_N - y_1\).
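
Here's a minimal sketch of that benchmark, using simulated yields driven by a single random-walk trend (not real data): form the prespecified spreads relative to a reference yield and check each for stationarity with an ADF test.

```python
# A minimal sketch: N simulated yields sharing one I(1) common trend, so the
# N-1 prespecified spreads vs. the reference yield y_1 should be stationary.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
T, N = 500, 4

common_trend = np.cumsum(rng.normal(size=T))        # single random-walk factor
yields = np.column_stack(
    [common_trend + rng.normal(scale=0.5, size=T) for _ in range(N)]
)

# Prespecified cointegrating combinations: term spreads relative to y_1
for i in range(1, N):
    spread = yields[:, i] - yields[:, 0]
    stat, pval, *_ = adfuller(spread)
    print(f"ADF on y_{i + 1} - y_1: stat = {stat:.2f}, p-value = {pval:.3f}")
```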

There's not much literature exploring this perspective. (One notable exception is Horvath and Watson, "Testing for Cointegration When Some of the Cointegrating Vectors are Prespecified", Econometric Theory, 11, 952-984.) We need more.

Sunday, August 21, 2016

More on Big Data and Mixed Frequencies

I recently blogged on Big Data and mixed-frequency data, arguing that Big Data (wide data, in particular) leads naturally to mixed-frequency data.  (See here for the tall data / wide data / dense data taxonomy.)  The obvious just occurred to me, namely that it's also true in the other direction. That is, mixed-frequency situations also lead naturally to Big Data, and with a subtle twist: the nature of the Big Data may be dense rather than wide. The theoretically pure way to set things up is as a state-space system laid out at the highest observed frequency, appropriately treating most of the lower-frequency data as missing, as in ADS.  By construction, the system is dense if any of the series are dense, as the system is laid out at the highest frequency.
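
As a minimal illustration of the "lay it out at the highest frequency" idea (a sketch, not the ADS model itself), here's a toy example that puts a simulated quarterly series on a monthly grid, treats the intra-quarter months as missing, and lets the Kalman filter handle the gaps:

```python
# A minimal sketch (not ADS itself): lay a quarterly series out on a monthly
# grid, treat the intra-quarter months as missing, and let the Kalman filter
# handle the missing observations. The data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

n_months = 240                                   # 20 years at the monthly frequency
quarterly_on_monthly_grid = np.full(n_months, np.nan)

# Observe the series only in the last month of each quarter
quarterly_on_monthly_grid[2::3] = np.cumsum(rng.normal(size=n_months // 3))

# statsmodels' state-space machinery treats NaNs as missing, so the model is
# laid out at the monthly frequency even though most months are unobserved.
mod = sm.tsa.UnobservedComponents(quarterly_on_monthly_grid, level="local level")
res = mod.fit(disp=False)
print(res.summary())
```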

Wednesday, August 17, 2016

On the Evils of Hodrick-Prescott Detrending


Jim Hamilton has a very cool new paper, "Why You Should Never Use the Hodrick-Prescott (HP) Filter".

Of course we've known of the pitfalls of HP ever since Cogley and Nason (1995) brought them into razor-sharp focus decades ago.  The title of the even-earlier Nelson and Kang (1981) classic, "Spurious Periodicity in Inappropriately Detrended Time Series", says it all.  Nelson-Kang made the spurious-periodicity case against polynomial detrending of I(1) series.  Hamilton makes the spurious-periodicity case against HP detrending of many types of series, including I(1).  (Or, more precisely, Hamilton adds even more weight to the Cogley-Nason spurious-periodicity case against HP.)

But the main contribution of Hamilton's paper is constructive, not destructive.  It provides a superior detrending method, based only on a simple linear projection. 

Here's a way to understand what "Hamilton detrending" does and why it works, based on a nice connection to Beveridge-Nelson (1981) detrending not noticed in Hamilton's paper.  

First consider Beveridge-Nelson (BN) trend for I(1) series.  BN trend is just a very long-run forecast based on an infinite past.  [You want a very long-run forecast in the BN environment because the stationary cycle washes out from a very long-run forecast, leaving just the forecast of the underlying random-walk stochastic trend, which is also the current value of the trend since it's a random walk.  So the BN trend at any time is just a very long-run forecast made at that time.]  Hence BN trend is implicitly based on the projection: \(y_t ~ \rightarrow ~ c, ~ y_{t-h}, ~...,~ y_{t-h-p} \), for \(h \rightarrow \infty \) and \(p \rightarrow \infty\).

Now consider Hamilton trend.  It is explicitly based on the projection: \(y_t ~ \rightarrow ~ c, ~ y_{t-h}, ~...,~ y_{t-h-p} \), for \(p = 3 \).  (Hamilton also uses a benchmark of  \(h = 8 \).)

So BN and Hamilton are both "linear projection trends", differing only in choice of \(h\) and \(p\)!  BN takes an infinite forecast horizon and projects on an infinite past.  Hamilton takes a medium forecast horizon and projects on just the recent past.

Much of Hamilton's paper is devoted to defending the choice of \(p = 3 \), which turns out to perform well for a wide range of data-generating processes (not just I(1)).  The BN choice of \(h = p = \infty \), in contrast, although optimal for I(1) series, is less robust to other DGP's.  (And of course estimation of the BN projection as written above is infeasible, which people avoid in practice by assuming low-ordered ARIMA structure.)
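
Here's a minimal sketch of the Hamilton projection for quarterly data, using a simulated series as a stand-in for real data: regress \(y_t\) on a constant and \(y_{t-8}, ..., y_{t-11}\) (\(h = 8\), \(p = 3\)), take the fit as the trend and the residual as the cycle.

```python
# A minimal sketch of Hamilton-style detrending via the projection above:
# y_t on a constant and y_{t-h}, ..., y_{t-h-p}, with h = 8 and p = 3.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=300))      # simulated I(1) series (stand-in for real data)

h, p = 8, 3                              # Hamilton's quarterly benchmark
T = len(y)

Y = y[h + p:]                            # left-hand side: y_t
X = sm.add_constant(
    np.column_stack([y[p - j : T - h - j] for j in range(p + 1)])  # column j is y_{t-h-j}
)
res = sm.OLS(Y, X).fit()

hamilton_trend = res.fittedvalues        # the long-horizon linear projection
hamilton_cycle = res.resid               # what the projection misses
print(hamilton_cycle[:5])
```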

Monday, August 15, 2016

More on Nonlinear Forecasting Over the Cycle

Related to my last post, here's a new paper that just arrived from Rachidi Kotchoni and Dalibor Stevanovic, "Forecasting U.S. Recessions and Economic Activity". It's not non-parametric, but it is non-linear. As Dalibor put it, "The method is very simple: predict turning points and recession probabilities in the first step, and then augment a direct AR model with the forecasted probability." Kotchoni-Stevanovic and Guerron-Quintana-Zhong are usefully read together.
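
Here's a minimal sketch of that two-step idea (my own toy version with simulated data and hypothetical variable names, not the authors' code): a first-step probit for the recession probability, then a direct AR regression augmented with the forecasted probability.

```python
# A minimal two-step sketch: (1) probit for the recession probability h steps
# ahead, (2) direct AR(1) regression augmented with that probability.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T, h = 400, 4                                   # sample size, forecast horizon

# Simulated stand-ins: a "spread" that leads a growth series by h periods
spread = rng.normal(size=T)
growth = 0.5 * np.concatenate([np.zeros(h), spread[:-h]]) + rng.normal(size=T)
recession = (growth < -1.0).astype(int)         # crude stand-in recession indicator

# Step 1: probit forecast of the recession probability h steps ahead
Xp = sm.add_constant(spread[:-h])
probit = sm.Probit(recession[h:], Xp).fit(disp=0)
prob = probit.predict(Xp)                       # forecasted recession probability

# Step 2: direct AR(1) regression for growth h steps ahead,
# augmented with the first-step recession probability
Xd = sm.add_constant(np.column_stack([growth[:-h], prob]))
direct = sm.OLS(growth[h:], Xd).fit()
print(direct.params)
```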

Sunday, August 14, 2016

Nearest-Neighbor Forecasting in Times of Crisis

Nonparametric K-nearest-neighbor forecasting remains natural and obvious and potentially very useful, as it has been since its inception long ago.

[Most crudely: Find the K-history closest to the present K-history, see what followed it, and use that as a forecast. Slightly less crudely: Find the N K-histories closest to the present K-history, see what followed each of them, and take an average. There are many obvious additional refinements.]
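
Here's a crude sketch of that recipe, with a simulated series and made-up choices of K and N:

```python
# A crude sketch of nearest-neighbor forecasting as described above: find the
# N past K-histories closest to the current K-history (Euclidean distance)
# and average the values that followed them. Data and tuning choices are made up.
import numpy as np

def knn_forecast(y, K=4, N=5):
    """One-step-ahead nearest-neighbor forecast of a univariate series y."""
    y = np.asarray(y, dtype=float)
    current = y[-K:]                                    # the present K-history
    # All earlier K-histories together with the value that followed each
    candidates = [(np.linalg.norm(y[t - K:t] - current), y[t])
                  for t in range(K, len(y) - 1)]
    candidates.sort(key=lambda pair: pair[0])           # closest first
    return np.mean([followed for _, followed in candidates[:N]])

rng = np.random.default_rng(0)
y = np.sin(np.arange(300) / 6.0) + 0.1 * rng.normal(size=300)
print(knn_forecast(y, K=4, N=5))
```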

Overall, nearest-neighbor forecasting remains curiously under-utilized in dynamic econometrics. Maybe that will change. In an interesting recent development, for example, new Federal Reserve System research by Pablo Guerron-Quintana and Molin Zhong puts nearest-neighbor methods to good use for forecasting in times of crisis.

Monday, August 8, 2016

NSF Grants vs. Improved Data

Lots of people are talking about the Cowen-Tabarrok Journal of Economic Perspectives piece, "A Skeptical View of the National Science Foundation’s Role in Economic Research". See, for example, John Cochrane's insightful "A Look in the Mirror".

A look in the mirror indeed. I was a 25-year ward of the NSF, but for the past several years I've been on the run. I bolted in part because the economics NSF reward-to-effort ratio has fallen dramatically for senior researchers, and in part because, conditional on the ongoing existence of NSF grants, I feel strongly that NSF money and "signaling" are better allocated to young assistant and associate professors, for whom the signaling value from NSF support is much higher.

Cowen-Tabarrok make some very good points. But I can see both sides of many of their issues and sub-issues, so I'm not taking sides. Instead let me make just one observation (and I'm hardly the first).

If NSF funds were to be re-allocated, improved data collection and dissemination looks attractive. I'm not talking about funding cute RCTs-of-the-month. Rather, I'm talking about funding increased and ongoing commitment to improving our fundamental price and quantity data (i.e., the national accounts and related statistics). They desperately need to be brought into the new millennium. Just look, for example, at the wealth of issues raised in recent decades by the Conference on Research in Income and Wealth.

Ironically, it's hard to make a formal case (at least for data dissemination as opposed to creation), as Chris Sims has emphasized with typical brilliance. His "The Futility of Cost-Benefit Analysis for Data Dissemination" explains "why the apparently reasonable idea of applying cost-benefit analysis to government programs founders when applied to data dissemination programs." So who knows how I came to feel that NSF funds might usefully be re-allocated to data collection and dissemination. But so be it.