Note: Quandl is now Nasdaq Data Link. Visit data.nasdaq.com for more information.
This is the second part of our interview with a senior quantitative portfolio manager at a large hedge fund. In the first part, we covered the theoretical phase of creating a quantitative trading strategy. In this part, we cover the transition into “production.” We’ve also published a third part with answers to readers’ questions.
You can read the first part of the interview here.
You can read the third part of the interview here.
What does moving into production entail?
For starters, I now have to worry about the “real world” — nuances like day-count conventions, settlement dates and holidays. When calibrating on historical data, you can get away with approximations for these. But when it comes to individual live trades, you can’t be sloppy; you have to be exact.
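To make the day-count point concrete, here is a minimal Python sketch (an illustration, not the interviewee's actual code) of two common conventions, actual/360 and 30/360 (US). The same six-month period produces slightly different accrual fractions, exactly the kind of discrepancy that is harmless in a backtest but unacceptable on a live trade. The 30/360 rule shown is simplified and ignores end-of-February adjustments.

```python
from datetime import date

def year_fraction_act_360(start: date, end: date) -> float:
    """Actual/360: actual calendar days elapsed, over a 360-day year."""
    return (end - start).days / 360.0

def year_fraction_30_360(start: date, end: date) -> float:
    """30/360 (US), simplified: every month is treated as 30 days."""
    d1 = min(start.day, 30)                        # a 31st counts as the 30th
    d2 = min(end.day, 30) if d1 == 30 else end.day
    days = ((end.year - start.year) * 360
            + (end.month - start.month) * 30
            + (d2 - d1))
    return days / 360.0

start, end = date(2014, 1, 31), date(2014, 7, 31)
print(year_fraction_act_360(start, end))   # ~0.5028 (181 actual days / 360)
print(year_fraction_30_360(start, end))    # 0.5     (six 30-day months / 360)
```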
Another aspect of production is that speed is critical. I can't fit my model to market data in real time (gradient descent is slow!), so instead I have to reduce everything to linear approximations of changes. This entails a lot of matrix manipulation.
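As an illustration of the linearization idea (a sketch under our own assumptions, not the interviewee's actual system): instead of refitting a nonlinear model on every tick, you can precompute the Jacobian of model outputs with respect to market inputs at the last full calibration, then map small market moves to approximate output changes with a single matrix multiply.

```python
import numpy as np

def full_model(market: np.ndarray) -> np.ndarray:
    """Stand-in for an expensive nonlinear model (illustrative only)."""
    return np.array([np.sin(market).sum(), (market ** 2).sum()])

def jacobian(f, x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Finite-difference Jacobian of f at x, computed once, off-line."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - fx) / eps
    return J

x0 = np.array([0.02, 0.025, 0.03])     # last full calibration point
f0 = full_model(x0)
J = jacobian(full_model, x0)           # slow, but done off-line

dx = np.array([1e-4, -2e-4, 5e-5])     # small intraday market move
approx = f0 + J @ dx                   # fast linear update: one matmul
exact = full_model(x0 + dx)
print(np.max(np.abs(approx - exact)))  # first-order error, tiny for small dx
```

The full fit happens off-line; intraday, everything reduces to `f0 + J @ dx`, which is why the production work is mostly matrix manipulation.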
I usually build an execution prototype that does everything “correctly” but inefficiently. I then hand that over to my engineering colleagues, who build a performant version in Python or even C using the market libraries they’ve built over the years. That version pipes its output into my trading station, and I can finally start executing the strategy.
And then, hopefully, I start making money.
How long does this entire process take?
It typically takes months of work to bring a new strategy from the drawing board to production – and that’s for the strategies that actually work. Most don’t. And even the successful strategies have a shelf life of a couple of years before they get arbitraged away, so the process repeats itself all the time. I have to reinvent my approach to trading every few years.
Are you ever worried that the model-based opportunities you strive to capture will disappear for good?
Of course. In my experience, all opportunities eventually go away. Indeed, one of the biggest challenges in the type of modeling I do is knowing when a live model is obsolete. All models lose money on some days and weeks, and it is very difficult to tell whether those losses are just noise from a model that is still working, or the first signal that the model has died.
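One textbook way to frame this monitoring problem (the interviewee doesn't say what he actually uses) is as change-point detection on daily P&L. Below is a minimal one-sided CUSUM sketch; the healthy-model parameters `mu` and `sigma`, the slack `k`, and the alarm threshold `h` are all illustrative assumptions.

```python
import numpy as np

def cusum_alarm(pnl, mu, sigma, k=0.25, h=5.0):
    """
    One-sided CUSUM on standardized daily P&L.

    mu, sigma : mean and std-dev of daily P&L expected while the model works
    k         : slack (in std-devs); conventionally half the shift to detect
    h         : alarm threshold; larger h = fewer false alarms, slower detection
    Returns the index of the first day the alarm fires, or None.
    """
    s = 0.0
    for t, x in enumerate(pnl):
        z = (x - mu) / sigma       # standardize the day's P&L
        s = max(0.0, s - z - k)    # accumulate shortfall vs. expectation
        if s > h:
            return t               # evidence of a structural break
    return None

rng = np.random.default_rng(0)
healthy = rng.normal(0.2, 1.0, 250)   # model earning ~0.2 sd per day
dead = rng.normal(-0.3, 1.0, 250)     # same noise, but the edge is gone
# Typically fires shortly after the break at day 250:
print(cusum_alarm(np.concatenate([healthy, dead]), mu=0.2, sigma=1.0))
```

The trade-off the interviewee describes lives in `h`: set it too low and you kill healthy models on ordinary drawdowns; set it too high and you bleed for months before conceding the model is dead.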
Where do you get ideas for new models or trading strategies?
Anywhere I can! But there are a few avenues I return to pretty consistently.
First, data. If you have a new or obscure source of data that anticipates the market in some way, that’s the easiest way to generate alpha. Especially these days, there are a ton of interesting new data sources out there: startups collecting new datasets; analytics firms with predictive indicators; large corporations with “data exhaust” that we can mine; and aggregators like Quandl that bring them all together. I’m always on the lookout for interesting, unusual and predictive datasets.
Second, the markets themselves. Bankers are always inventing new instruments, and new instruments typically bring new inefficiencies with them. If you keep your finger on the pulse of the market, it’s pretty easy to find model-driven opportunities that less sophisticated participants might miss.
Third, global patterns. History may not repeat, but it certainly rhymes. For instance, if you want to trade the US interest-rate curve, the Japanese market is a rich source of ideas; Japan went through zero interest rates years before the US did. In fact, I built some very successful models of the US Treasury market based purely on the behavior of JGBs a decade earlier.
Fourth, analogies. Some of the best trades occur because I apply thinking from regime A within regime B. Different asset classes have differing degrees of sophistication; you can arbitrage this difference.
Fifth, I just keep my eyes and ears open. The world is a pretty inefficient place; if you’re inquisitive and keep asking “why?” and “why not?”, you will always find opportunities.
It sounds like you use a lot of tools – Mathematica, Matlab, Python, Excel, C. Is that deliberate?
Absolutely. Different stages in the pipeline require different tools. I’d be an idiot to build performant real-time systems in Excel, or to do symbolic manipulation in Python. Not that it can’t be done, but there are better tools for those tasks.
How do you manage the data flow for all these stages and tools?
In the early days, I’m still working on a proof of concept, so stylized data is fine. But the closer the model gets to production, the more granular and “real” the data has to become. There’s a whole separate infrastructure around getting high-quality data and maintaining it – Quandl helps here – which I haven’t talked about, but the truth is, it’s critical. The best model in the world will fail if its data inputs are incorrect. No amount of skill can make up for bad data.
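Since the interviewee stresses data quality without going into detail, here is a minimal sketch of the kind of mechanical sanity checks that catch bad inputs before they reach a model. The thresholds and the assumed DataFrame layout (a DatetimeIndex with a 'close' column) are illustrative assumptions, not a prescription.

```python
import pandas as pd

def validate_prices(df: pd.DataFrame, max_gap_days: int = 5,
                    max_daily_move: float = 0.25) -> list:
    """
    Basic sanity checks on a daily price series with a DatetimeIndex
    and a 'close' column. Returns a list of human-readable problems.
    """
    problems = []
    if df['close'].isna().any():
        problems.append(f"{df['close'].isna().sum()} missing closes")
    if (df['close'] <= 0).any():
        problems.append("non-positive prices")
    gaps = df.index.to_series().diff().dt.days
    if (gaps > max_gap_days).any():
        problems.append("calendar gaps longer than "
                        f"{max_gap_days} days (stale or missing data?)")
    moves = df['close'].pct_change().abs()
    if (moves > max_daily_move).any():
        problems.append("daily moves above "
                        f"{max_daily_move:.0%} (bad ticks or unadjusted splits?)")
    if df.index.duplicated().any():
        problems.append("duplicate timestamps")
    return problems
```

Checks like these won't guarantee good data, but they catch the cheap failures, such as missing rows, bad ticks and unadjusted corporate actions, before they silently poison a calibration.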
(We received so many excellent questions from readers that we published a third part of this series.)