Momentum as a risk premium has been extensively documented in the academic finance literature, with evidence of persistent abnormal returns found ubiquitously across a broad range of asset classes, prediction horizons and time periods.

In time series momentum (TSMOM), an asset’s past returns are used to forecast its future profitability. While conceptually simple, common TSMOM strategies require both a trend estimator and a position sizing rule to be defined explicitly. We remove this need for handcrafting with our proposed Deep Momentum Networks, which use long short-term memory (LSTM) architectures to learn both the trend estimation and the sizing rule in a data-driven manner [1]. In addition, the automatic differentiation built into backpropagation frameworks allows us to optimise the Sharpe ratio directly, improving the overall risk profile of the signal. Backtested on a portfolio of 88 continuous futures contracts, our Sharpe-optimised LSTM model outperformed conventional methods by more than a factor of two in the absence of transaction costs.
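
To make the two handcrafted ingredients concrete, the following is a minimal sketch of a conventional TSMOM baseline, not the Deep Momentum Network of [1]. It assumes a sign-of-past-return trend estimator and volatility-targeted position sizing; the function names, the 252-day lookback, the 60-day volatility window and the 15% volatility target are illustrative assumptions rather than values taken from the paper.

```python
import math
import statistics


def tsmom_positions(returns, lookback=252, vol_window=60, target_vol=0.15):
    """Positions for one asset; positions[t] is decided at the close of day t.

    All parameter defaults are illustrative assumptions.
    """
    positions = [0.0] * len(returns)
    start = max(lookback, vol_window) - 1
    for t in range(start, len(returns)):
        # Trend estimator: sign of the compounded return over the lookback window.
        growth = math.prod(1.0 + r for r in returns[t - lookback + 1 : t + 1])
        trend = (growth > 1.0) - (growth < 1.0)  # +1, 0 or -1
        # Sizing rule: scale the position to a target annualised volatility.
        daily_vol = statistics.stdev(returns[t - vol_window + 1 : t + 1])
        ann_vol = daily_vol * math.sqrt(252)
        positions[t] = trend * target_vol / ann_vol if ann_vol > 0 else 0.0
    return positions


def strategy_returns(returns, positions):
    """The position set at the close of day t earns the return of day t + 1."""
    return [p * r for p, r in zip(positions[:-1], returns[1:])]


def annualised_sharpe(strat_returns, periods_per_year=252):
    """Mean over standard deviation of returns, annualised; 0.0 if there is no variation."""
    sd = statistics.stdev(strat_returns)
    if sd == 0:
        return 0.0
    return statistics.fmean(strat_returns) / sd * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Toy, steadily rising return series purely for illustration.
    rets = [0.01, 0.02] * 50
    pos = tsmom_positions(rets, lookback=20, vol_window=10)
    print(annualised_sharpe(strategy_returns(rets, pos)))
```

In a learned variant, both the trend estimate and the sizing rule would be replaced by the output of a model, and a differentiable version of the Sharpe ratio above (negated) could serve as the training loss in an autodiff framework.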

Cross-sectional momentum (CSMOM) instead focuses on relative performance across a slate of assets at a fixed point in time: after ranking securities by historical returns, the top performers are bought and the bottom performers are sold. The success of such a strategy depends critically on ranking assets accurately before portfolio construction. Current CSMOM strategies perform this step with naive heuristics such as past returns, or by sorting the outputs of standard regression or classical models, approaches that are known to be sub-optimal for ranking in other domains (e.g. information retrieval). By using modern information retrieval algorithms such as ListNet and LambdaMART, which have featured extensively in web and database search, we learn the broader pairwise and listwise group structures across securities and consequently produce better rankings. We show in recent work that this class of Learning to Rank (LTR) algorithms significantly improves strategy performance, approximately tripling Sharpe ratios relative to traditional benchmarks on portfolios composed of US equities [2].

While these algorithms improve ranking accuracy on average, they do not account for the possibility that assets at the extreme ends of the ranked list, which are ultimately used to construct the long/short trading portfolios, can assume different distributions in the input space and thus lead to suboptimal strategy performance. Drawing on studies in ranking refinement, we adapt the Transformer architecture to encode the features of the extreme assets and use it to refine the selection obtained from an initial retrieval [3]. Backtested on a set of 31 currencies, this re-ranking methodology significantly enhances Sharpe ratios, by approximately 20% over the original LTR algorithms and to double those of conventional baselines.
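
The rank-then-trade step common to all of the CSMOM variants above can be sketched as follows. This is an illustrative outline under stated assumptions, not the implementation used in [2] or [3]: the function names are hypothetical, the portfolio is equal-weighted, and the loss shown is the top-one cross-entropy as ListNet is commonly formulated.

```python
import math


def long_short_weights(scores, n_extreme):
    """Equal-weighted long/short portfolio from one cross-section of scores.

    The n_extreme highest-scoring assets are bought and the n_extreme
    lowest-scoring assets are sold; everything in between gets zero weight.
    """
    if n_extreme < 1 or 2 * n_extreme > len(scores):
        raise ValueError("n_extreme must be between 1 and len(scores) // 2")
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending
    weights = [0.0] * len(scores)
    for i in order[:n_extreme]:
        weights[i] = -1.0 / n_extreme  # short the bottom of the ranking
    for i in order[-n_extreme:]:
        weights[i] = 1.0 / n_extreme  # long the top of the ranking
    return weights


def log_softmax(xs):
    """Numerically stable log of the softmax of a list of scores."""
    m = max(xs)
    log_total = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_total for x in xs]


def listnet_top_one_loss(predicted_scores, target_scores):
    """Cross-entropy between the top-one probability distributions implied
    by the target scores and by the predicted scores (a listwise loss)."""
    log_p_pred = log_softmax(predicted_scores)
    p_target = [math.exp(lp) for lp in log_softmax(target_scores)]
    return -sum(t * lp for t, lp in zip(p_target, log_p_pred))


if __name__ == "__main__":
    # Heuristic baseline: use raw past returns as the ranking scores.
    past_returns = [0.05, -0.02, 0.10, 0.01, -0.07]
    print(long_short_weights(past_returns, n_extreme=2))
```

Passing raw past returns as `scores` corresponds to the naive heuristic; an LTR model would instead supply learned scores to the same portfolio construction step, having been trained with a pairwise or listwise objective such as the one sketched in `listnet_top_one_loss`.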

[1] B. Lim, S. Zohren, and S. Roberts, ‘Enhancing Time Series Momentum Strategies Using Deep Neural Networks’, SSRN Electronic Journal, 2019, doi: 10.2139/ssrn.3369195.

[2] D. Poh, B. Lim, S. Zohren, and S. Roberts, ‘Building Cross-Sectional Systematic Strategies by Learning to Rank’, The Journal of Financial Data Science, p. jfds.2021.1.060, Mar. 2021, doi: 10.3905/jfds.2021.1.060.

[3] D. Poh, B. Lim, S. Zohren, and S. Roberts, ‘Enhancing Cross-Sectional Currency Strategies by Ranking Refinement with Transformer-based Architectures’, arXiv:2105.10019 [cs, q-fin], May 2021.