This page lists my research activities. My research interests center on data- and AI-centric topics, particularly methods that make social science data more useful for analysis. I also referee for the Journal of Financial Services Research, the Journal of Financial Stability, the Spring Meeting of Young Economists, and the IEEE International Conference on Big Data.
Multilingual De-Duplication Strategies (joint with Stefan Pasch and Dimitrios Petridis), submitted
[arXiv] [Replication Files] [Slides] [Media Coverage]
Text Classification with Limited Training Data (joint with Stefan Pasch), IEEE International Conference on Big Data, Washington D.C., 2024, pp. 8506-8511
[Accepted Version] [Replication Files] [Slides]
Debt Derisking (joint with Gianpaolo Parise and Andreas Schrimpf), Management Science, pre-print available online (2023)
[Accepted Version] [Replication Files] [BIS Working Paper No. 868] [CEPR Discussion Paper 14817]
Breaking the U: Asymmetric U-Net for Object Detection (joint with Stefan Pasch), IEEE International Conference on Big Data, Sorrento, Italy, 2023, pp. 6069-6075
[Accepted Version] [Replication Files] [Slides]
Econometrics at Scale: Spark Up Big Data In Economics (joint with Benjamin Bluhm), Journal of Data Science, Volume 20, Issue 3 (2022), pp. 413–436
[Accepted Version] [Replication Files] [SAFE Working Paper No. 262]
Incentive Effects from Write-down CoCo Bonds - An Empirical Analysis (joint with Hennig Hesse), Journal of Financial Regulation, Volume 8, Issue 2 (2022)
[Accepted Version] [SAFE Working Paper No. 212]
Debt holder monitoring and implicit guarantees: Did the BRRD improve market discipline? (solo-authored) Journal of Financial Stability, Volume 54, June 2021
[Accepted Version] [SAFE Working Paper No. 232] [ESRB Working Paper No. 111]
Banks' Dollar Funding: A Source of Financial Vulnerability (joint with Adolfo Barajas, John Caparusso, Yingyuan Chen, Andrea Deghi, Zhi Ken Gan, Oksana Khadarina, Dulani Seneviratne, Peichu Xie, Yizhi Xu, Xinze Juno Yao), IMF Global Financial Stability Report, Analytical chapter, October 2019
[Accepted Version] [Replication Files]
2024: IEEE International Conference on Big Data 2024 (Washington D.C.)
2023: IEEE International Conference on Big Data 2023 (Sorrento), Workshops for Ukraine Working with Big Data with Hadoop and Spark
2022: 7th Banco de Portugal Microdata Research Laboratory (BPLIM) Empirical Research with Large Datasets
2021: Bank of Finland Data Science Community
2019: Columbia University Data Science Institute Financial and Business Analytics Center poster session, Bank of England Conference on Modelling with Big Data and Machine Learning poster session*, CESifo Banking and Institutions Workshop, American Finance Association (AFA) Ph.D. Poster Session
2018: 23rd Spring Meeting of Young Economists (SMYE), Deutsche Bundesbank Financial Cycles and Regulation Conference, Superintendencia de Bancos e Instituciones Financieras (SBIF) 4th Conference on Banking Development, Stability and Sustainability on Financial System Architecture
2017: 15th Corporate Finance Day Antwerp, Bank for International Settlements (BIS) Brown Bag*
2016: Goethe University Finance Brown Bag
* presented by coauthor
2024: 1st place reproducibility award and 2nd place accuracy award Eurostat Deduplication Challenge (14,000 EUR), 5th place IEEE Big Data Challenge suicide risk detection
2023: 4th place IEEE Big Data 2023 Challenge Muon Tomography
2019: Bank of England travel grant, Ifo travel grant, job market paper short-listed for the Ieke van den Burg Prize for research on systemic risk
2018: Inter-American Development Bank travel grant (1,400 USD)
2017: American Finance Association Ph.D. travel grant (1,000 USD)
2016: SAFE Ph.D. scholarship (11,000 EUR)
2012: Baden-Württemberg Stipendium for academic exchange in the US (8,000 EUR)
2010: Stipendium der Begabten Förderung der Friedrich-Naumann-Stiftung für die Freiheit (21,600 EUR), e-fellows.net scholarship
joint with Stefan Pasch, IEEE International Conference on Big Data, Washington D.C., 2024, pp. 8506-8511
In this work, we evaluate the effectiveness of several machine learning models for text classification on small datasets, focusing on a collection of Reddit posts labeled for suicidal behavior. Unlike with larger datasets, where fine-tuning complex models is typically effective, we demonstrate that carefully engineered prompts can achieve superior classification accuracy when training data is limited. Our findings highlight the potential of prompt-based approaches for effective use in resource-constrained scenarios, offering insights for researchers tackling similar small dataset challenges in text classification.
joint with Gianpaolo Parise and Andreas Schrimpf, Management Science, preprint (2023), https://doi.org/10.1287/mnsc.2022.03406
We examine how corporate bond fund managers manipulate portfolio risk in response to incentives. We find that liquidity risk concerns drive the allocation decisions of underperforming funds, whereas tournament incentives are of secondary importance. This leads laggard fund managers to trade off yield for liquidity while holding the exposure to other sources of risk constant. The documented derisking is stronger for managers with shorter tenure and is reinforced by a more concave flow to performance sensitivity and by periods of market stress. Derisking meaningfully supports ex-post laggard fund returns. Flexible net asset values (swing pricing) may, however, reduce derisking incentives and create moral hazard.
joint with Stefan Pasch, IEEE International Conference on Big Data, Sorrento, Italy, 2023, pp. 6069-6075, 10.1109/BigData59044.2023.10386932
We utilize a U2 net-inspired deep learning model for object detection within a scattering muon tomography setup. Capturing muon readings from scintillators situated above and beneath the examination area, we discern the angular disparities between inbound and outbound muon paths. These angular measurements serve as inputs for a neural network, proficient in forecasting the contours of objects and, to an extent, their constituent materials within the research zone. Our model handles the inherent complexity of the input data, where the angular measurements map onto a 200 x 200 space, while producing an output image of a smaller 40 x 40 resolution, resulting in an asymmetric U-Net architecture that diverges from conventional semantic segmentation models which typically maintain the same input and output dimensions.
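The asymmetry described above (a 200 x 200 input mapped to a 40 x 40 output) can be illustrated by tracing spatial resolution through the network stages. This is only a sketch: the helper `trace_resolution` and the down- and upsampling factors are hypothetical choices that happen to reach 40 x 40, not the architecture used in the paper.

```python
def trace_resolution(size, down_factors, up_factors):
    """Return the spatial side length after each down- and upsampling stage."""
    sizes = [size]
    for factor in down_factors:
        if size % factor != 0:
            raise ValueError(f"{size} is not divisible by {factor}")
        size //= factor  # encoder stage: pooling or strided convolution
        sizes.append(size)
    for factor in up_factors:
        size *= factor  # decoder stage: upsampling or transposed convolution
        sizes.append(size)
    return sizes


# Hypothetical factors: three encoder stages but only two decoder stages,
# so the output ends up smaller than the input.
sizes = trace_resolution(200, down_factors=[5, 2, 2], up_factors=[2, 2])
print(" -> ".join(str(s) for s in sizes))
```

A conventional symmetric U-Net would use matching factors on both sides and return to the input resolution; using fewer or smaller upsampling steps than downsampling steps is one way to end at a lower output resolution.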
joint with Benjamin Bluhm, Journal of Data Science, Volume 20, Issue 3 (2022): Special Issue: Data Science Meets Social Sciences, pp. 413–436, https://doi.org/10.6339/22-JDS1035
This paper provides an overview of how to use “big data” for social science research (with an emphasis on economics and finance). We investigate the performance and ease of use of different Spark applications running on a distributed file system to enable the handling and analysis of data sets that were previously not usable due to their size. More specifically, we explain how to use Spark to (i) explore big data sets that exceed retail-grade computers' memory size and (ii) run typical statistical/econometric tasks including cross-sectional, panel data, and time series regression models which are prohibitively expensive to evaluate on stand-alone machines. By bridging the gap between the abstract concept of Spark and ready-to-use examples which can easily be altered to suit the researcher's need, we provide economists and social scientists more generally with the theory and practice to handle the ever-growing datasets available. The ease of reproducing the examples in this paper makes this guide a useful reference for researchers with a limited background in data handling and distributed computing.
joint with Hennig Hesse, Journal of Financial Regulation, Volume 8, Issue 2 (2022): Pages 162–186, https://doi.org/10.1093/jfr/fjac005
Departing from the principle of absolute priority, contingent convertible (CoCo) bonds are particularly exposed to bank losses despite not having ownership rights. In this article we show the link between adverse CoCo bond design and their yields, confirming the existence of market discipline in designated bail-in debt. Specifically, focusing on the write-down feature as a loss-absorption mechanism in CoCo debt, we find a yield premium on this feature relative to equity-conversion CoCo bonds as predicted by theoretical models. Moreover, and consistent with theories on moral hazard, we find this premium to be largest when existing incentives for opportunistic behaviour are largest, while the premium is non-existent if moral hazard is perceived to be small. Overall, our findings support the notion of market discipline through monitoring debt investors and have important implications for the optimal design of CoCos from a regulatory perspective.
Journal of Financial Stability, Volume 54, June 2021, https://doi.org/10.1016/j.jfs.2021.100879
This paper argues that the European Union’s Bank Recovery and Resolution Directive (BRRD) has improved market discipline in the European bank market for unsecured debt. The different impact of the BRRD on bank bonds provides a quasi-natural experiment that allows us to study the effects of the BRRD within banks using a difference-in-differences approach. Identification is based on the fact that (otherwise identical) bonds of a given bank maturing before 2016 are explicitly protected from BRRD bail-in. The empirical results are consistent with the hypothesis that debt holders actively monitor banks and that the BRRD diminished bailout expectations after its enactment. Bank bonds subject to BRRD bail-in carry a 13-basis-point bail-in premium in terms of the yield spread, driven by low capitalization. Banks that respond to market pressure by de-risking their portfolios are able to secure cheaper funding for instruments that are subject to bail-in.
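The within-bank comparison described above can be illustrated with a minimal 2x2 sketch of the estimator. The yield spreads below are made-up numbers, not data from the paper, and `did_estimate` is a hypothetical helper; the paper's actual specification (controls, fixed effects, standard errors) is not reproduced here.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Simple 2x2 difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up yield spreads (basis points) for bonds of one hypothetical bank.
# Treated: bonds subject to bail-in. Control: bonds maturing before 2016,
# which are protected from bail-in.
bail_in_pre = [100.0, 110.0, 105.0]
bail_in_post = [122.0, 130.0, 126.0]
protected_pre = [98.0, 108.0, 103.0]
protected_post = [110.0, 118.0, 114.0]

premium = did_estimate(bail_in_pre, bail_in_post, protected_pre, protected_post)
print(f"Illustrative bail-in premium: {premium:.1f} bp")
```

Subtracting the change in the protected bonds' spread removes shocks that hit all bonds of the bank alike, so the remaining difference is attributed to bail-in exposure.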
with Adolfo Barajas, John Caparusso, Jannic Alexander Cutura, Yingyuan Chen, Andrea Deghi, Zhi Ken Gan, Oksana Khadarina, Dulani Seneviratne, Peichu Xie, Yizhi Xu, Xinze Juno Yao, IMF Global Financial Stability Report, Analytical chapter, October 2019, https://www.imf.org/-/media/Files/Publications/GFSR/2019/October/English/ch5.ashx
In the run-up to the global financial crisis, lending in US dollars by global banks headquartered outside the United States (global non-US banks), together with their reliance on short-term and volatile wholesale funding, became crucial transmission mechanisms for shocks that originated in the major funding markets for US dollars. Whereas regulation following the crisis has improved the resilience of banking sectors in many dimensions, these mechanisms remain a source of vulnerability for the global financial system. This chapter constructs three measures to gauge the degree of US dollar funding fragility of global non-US banks and describes their evolution in recent years. Empirical results show that an increase in US dollar funding costs leads to financial stress in the economies that are home to global non-US banks and to spillovers through a cutback in loans to recipient economies, those that borrow US dollars. US dollar funding fragility and the share of US dollar assets to total assets amplify these negative effects. However, some policy-related factors can mitigate them, such as swap line arrangements between central banks and international reserve holdings by home economy central banks. Furthermore, this chapter finds that emerging markets that are recipient economies are particularly susceptible to declines in US dollar cross-border lending because they have limited ability to turn to other sources of US dollar borrowing or to replace dollars with other currencies. These results highlight the importance of controlling vulnerabilities arising from the US dollar funding of non-US banks. The US dollar funding fragility measures constructed in this chapter can help improve their monitoring.