**ML and Quantitative Finance Workshop, Wednesday, 1st June 2022**

**The Mathematical Institute auditorium, University of Oxford**

**Drinks reception and dinner at Somerville College**

Register here

**Note this is an in-person event and online options are not currently available**

9.00am – 9.30am Tea, Coffee, Registration

9.30am – 10.15am **Svetlana Bryzgalova, London Business School**

**Title:** Missing Financial Data

**Abstract**: Missing data is a prevalent, yet often ignored, feature of company fundamentals. In this paper, we document the structure of missing financial data and show how to systematically deal with it. In a comprehensive empirical study we establish four key stylized facts. First, the issue of missing financial data is profound: it affects over 70% of firms that represent about half of the total market cap. Second, the problem becomes particularly severe when requiring multiple characteristics to be present. Third, firm fundamentals are not missing-at-random, invalidating traditional ad-hoc approaches to data imputation and sample selection. Fourth, stock returns themselves depend on missingness. We propose a novel imputation method to obtain a fully observed panel of firm fundamentals. It exploits both time-series and cross-sectional dependency of firm characteristics to impute their missing values, while allowing for general systematic patterns of missing data. Our approach provides a substantial improvement over the standard leading empirical procedures such as using cross-sectional averages or past observations. Our results have crucial implications for many areas of asset pricing.

10.15am – 10.25am Q&A

10.25am – 11.10am **Roel Oomen, Deutsche Bank and LSE**

**Title:** AI driven liquidity provision in OTC markets (paper)

**Abstract:** Providing liquidity in over-the-counter markets is a challenging undertaking, in large part because a market maker does not observe where their competitors quote, nor do they typically know how many rivals they compete with or what the trader’s overall liquidity demand is. Optimal pricing strategies can be derived in theory assuming full knowledge of the competitive environment, but these results do not translate into practice where information is incomplete and asymmetric. This paper studies whether artificial intelligence, in the form of multi-armed bandit reinforcement learning algorithms, can be used by liquidity providers to dynamically set spreads using only information that is commonly available to them. We also investigate whether collusive effects can arise when competing liquidity providers all employ such algorithms. Our findings are as follows. In a single-agent setup where only one liquidity provider is optimising pricing in an otherwise static environment, all the algorithms considered are able to locate the theoretically optimal pricing policy, albeit they do so quite inefficiently when compared to a model-based approach. In a multi-agent setting where competing liquidity providers simultaneously and independently use algorithms to optimise pricing, we demonstrate that for one class of algorithms (pseudo) collusion cannot arise, while for another it can theoretically arise in certain circumstances and we provide examples where it does. The scenarios where collusive effects appear, however, are fragile and sensitive to the specific configuration and exceedingly unlikely to occur in practice. Moreover, with a modest number of competitors, collusive effects that might otherwise arise in some of the most contrived scenarios are largely or entirely eliminated.

11.10am – 11.20am Q&A

11.20am – 11.50am Tea and coffee break

11.50am – 12.35pm **Markus Pelger, Stanford University**

**Title:** Deep Learning Statistical Arbitrage

**Abstract:** Statistical arbitrage identifies and exploits temporal price differences between similar assets. We propose a unifying conceptual framework for statistical arbitrage and develop a novel deep learning solution, which finds commonality and time-series patterns from large panels in a data-driven and flexible way. First, we construct arbitrage portfolios of similar assets as residual portfolios from conditional latent asset pricing factors. Second, we extract the time series signals of these residual portfolios with one of the most powerful machine learning time-series solutions, a convolutional transformer. Last, we use these signals to form an optimal trading policy that maximizes risk-adjusted returns under constraints. We conduct a comprehensive empirical comparison study with daily large cap U.S. stocks. Our optimal trading strategy obtains a consistently high out-of-sample Sharpe ratio and substantially outperforms all benchmark approaches. It is orthogonal to common risk factors, and exploits asymmetric local trend and reversion patterns. Our strategies remain profitable after taking into account trading frictions and costs. Our findings suggest a high compensation for arbitrageurs to enforce the law of one price.

12.35pm – 12.45pm Q&A

12.45pm – 1.30pm **Patrick Chang, OMI DPhil student**

**Title:** Algorithmic Collusion in Electronic Markets: The Impact of Tick Size (paper)

**Abstract:** We characterise the stochastic interaction of independent learning algorithms as a deterministic system of ordinary differential equations and use it to understand the long-term behaviour of the algorithms in a repeated game. In a symmetric bimatrix repeated game, we prove that the dynamics of many learning algorithms converge to a pure strategy Nash equilibrium of the stage game. In contrast, we prove that competition between Q-learning algorithms does not always lead to a Nash equilibrium of the stage game. We apply these results to study how the size of the tick in a limit order book facilitates or obstructs tacit collusion among algorithms that compete to provide liquidity. We characterise the set of pure strategy Nash equilibria from the market making stage game with a discrete action space (e.g., the price grid of a limit order book) and its relation to the Bertrand–Nash equilibrium of the game with a continuous action space (i.e., an idealised limit order book with a zero tick size). We derive the bounds that define the set of Nash equilibria of the stage game when the action space is discrete and show that the bounds converge to the Bertrand–Nash equilibrium as the tick size of the limit order book tends to zero. For all the algorithms considered, our findings show that a large tick size obstructs competition; a smaller tick size lowers trading costs for liquidity takers, but slows down the speed of convergence to a rest point. For the algorithms with theoretical guarantees to reach a Nash equilibrium, there is no assurance that the Nash equilibrium reached is the most competitive outcome. Indeed, we show that tacit collusion can and does arise. However, the excess profits are bounded by the range of possible Nash equilibria, which shrinks with the tick size. Finally, for Q-learning, many of the outcomes are sub-optimal for both the market makers and the liquidity takers.

1.30pm – 1.40pm Q&A

1.40pm – 2.40pm Lunch

2.40pm – 3.25pm **Roman Kozhan, Warwick Business School**

**Title:** International Asset Pricing and Segmentation Across Asset Classes

**Abstract:** This paper studies international market integration across three major asset classes: equities, sovereign bonds, and currencies. Using the IPCA methodology, we document significant market segmentation across asset classes. In particular, factors that are optimally chosen to price risks in either currency or sovereign bond markets do not price international equity indices. At the same time, equity factors price bond and currency returns. We show that the cross-section of returns across all three asset classes can be described by a global conditional factor model with six factors and time-varying factor loadings. The optimal IPCA model greatly reduces dimensionality of the global factor model and significantly outperforms models based on traditional factors.

3.25pm – 3.35pm Q&A

3.35pm – 4.20pm **Petter Kolm, New York University**

**Title:** Deep Order Flow Imbalance: Extracting Alpha at Multiple Horizons from the Limit Order Book

**Abstract:** We describe how deep learning methods can be applied to forecast stock returns from high frequency order book states. We review the literature in this area and describe a study where we evaluate return forecasts for several deep learning models for a large subset of symbols traded on the Nasdaq exchange. We investigate whether transformations of the order book states are necessary and relate the performance of deep learning models to the stocks’ microstructural properties. In addition, we provide some color on hyperparameter sensitivity for the problem of high frequency return forecasting. This is joint work with Jeremy Turiel and Nicholas Westray.

4.20pm – 4.30pm Q&A

4.30pm – 5.00pm Tea and coffee break

5.00pm – 5.45pm **Dacheng Xiu, University of Chicago Booth School of Business**

**Title:** The Statistical Limit of Arbitrage, jointly written with Rui Da and Stefan Nagel

**Abstract:** In the context of a linear asset pricing model, we document a statistical limit to arbitrage due to the fact that arbitrageurs are incapable of learning a large cross-section of alphas with sufficient precision given a limited time span of data. Consequently, the optimal Sharpe ratio of arbitrage portfolios developed under rational expectation in the classical arbitrage pricing theory (APT) is overly exaggerated, even as the sample size increases and the investment opportunity set expands. We derive the optimal Sharpe ratio achievable by any feasible arbitrage strategy, and illustrate in a simple model how this Sharpe ratio varies with the strength and sparsity of alpha signals, which characterize the difficulty of arbitrageurs’ learning problem. Furthermore, we design an “all-weather” arbitrage strategy that achieves this optimal Sharpe ratio regardless of the conditions of alpha signals. We also show how arbitrageurs can adopt multiple-testing, LASSO, and Ridge methods to achieve optimality under distinct conditions of alpha signals, respectively. Our empirical analysis of more than 50 years of monthly US individual equity returns shows that all strategies we consider achieve a moderately low Sharpe ratio out of sample, in spite of a considerably higher yet infeasible one, suggesting the empirical relevance of the statistical limit of arbitrage and the empirical success of APT.

5.45pm – 5.55pm Q&A

6.30pm Drinks reception – Brittain Williams Room, Somerville College

7.45pm Dinner, main Dining Hall, Somerville College