Download my resume or academic CV.
European Central Bank
Frankfurt, Hessen, Germany
Staff Data Engineer
Aug 2025 - Present
Technical Product Ownership – Guiding a team of 15 data engineers and analysts on Agora, bridging technical execution with business strategy across multiple divisions and two directorates.
Strategic Roadmap Development – Collaborating with business owners and stakeholders to define and prioritize the platform roadmap, balancing immediate needs with a long-term architectural vision.
Enterprise Standards & Governance – Ensuring all solutions adhere to enterprise-wide standards for security, compliance, data governance, and operational excellence throughout the development lifecycle.
Senior Data Engineer
Jul 2024 - Jul 2025
Agora Data Platform – Core contributor to Agora, the European banking supervision's unified data lake providing single-stop access to all prudential information for SSM staff across Europe. Built on AWS using Apache Spark, Cloudera stack, Python, and Kubernetes.
Platform Modernization – Led migration from Cloudera Data Platform and legacy Workload Automation to Airflow and AWS Glue, significantly reducing operational costs while improving performance and scalability.
Engineering Standards – Implemented new PySpark framework with clear separation between framework code and business logic. Established comprehensive testing practices (unit and integration tests) and an infrastructure-as-code approach using Terraform.
Software Developer
Aug 2022 - Jun 2024
As a software engineer in the agile/Scrum analytics team, I developed a complex bespoke SupTech application (data quality checks/risk models) using Python, Spark, and SQL (Exadata).
Translated business requirements into user stories while managing relationships with banking supervision to ensure a shared understanding of the product vision, with a focus on business value. Presented solutions to broad, non-technical audiences.
Troubleshot failures and unexpected behavior in large-scale data processing pipelines for stress test calculations by analyzing logs and writing complex queries against the relational database to identify issues.
Streamlined the deployment process through automation using GitLab CI/CD pipelines
Developed a data pipeline to export data from Oracle to Cloudera (DEVO) using cluster computing (Spark via AWS Glue), with the infrastructure set up in Terraform and the Spark application code deployed through GitLab
Automated several data processing tasks using orchestration and data integration tools (mostly Kubernetes cron jobs and Camunda)
Led a workstream in the STAR cloud migration, focusing on optimizing the execution of analytics code on AWS.
For example, I drafted a technical proposal for an efficient, serverless, and scalable solution for executing risk models using cloud-native technologies. To validate it, I implemented a proof of concept in which I containerized the code using Docker and executed it on AWS Lambda.
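As a rough sketch, the entry point of such a proof of concept could look like this (the handler structure, model names, and payload fields are hypothetical illustrations, not the actual ECB code):

```python
# handler.py -- illustrative entry point for a containerized AWS Lambda
# function that executes a risk model. All names and fields are hypothetical.

def run_risk_model(model_name: str, params: dict) -> dict:
    # Placeholder for the actual model execution logic.
    return {"model": model_name, "status": "completed", "inputs": params}

def lambda_handler(event, context):
    # AWS Lambda invokes this function with the request payload in `event`.
    model_name = event.get("model", "default")
    params = event.get("params", {})
    return run_risk_model(model_name, params)
```

Packaged into a Docker image based on a Lambda Python base image, a handler like this lets the same containerized code run both locally and as a serverless function.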
Data Analyst
Apr 2020 - Jul 2022
Systemic Risk Infrastructure – Enhanced the division's contagion model for systemic footprint assessment, developing new diagnostic tools and risk monitoring mechanisms for the banking system. Co-authored an occasional paper on interconnectedness modeling at the ECB, incorporating feedback from multiple stakeholders and domain experts.
Data Infrastructure Modernization – Led migration of critical databases and ETL pipelines from legacy systems (Excel, Stata, FSSDB) to modern cloud infrastructure (Python, PySpark, Hive, Hadoop). Initiatives included AnaCredit data infrastructure, capital buffer requirements database, and Bank Lending Survey (BLS), all designed with OOP principles, automated quality checks, and Airflow compatibility.
Banking Group Structure Solutions – Developed consolidated banking group structure tables combining RIAD and ROSSI data, enabling aggregation of individual MFIs to highest prudential consolidation levels. Created tools for debtor consolidation through iterative parent/child relationship mapping and integrated market data from Orbis and Reuters via CSDB/GLEIF.
Technical Leadership & Knowledge Transfer – Coordinated cross-functional teams (economists, RAs, trainees) on infrastructure projects. Established GitLab version control and Confluence documentation practices. Delivered training sessions to the division and broader ECB community (ML community series, DGE Data Integration sessions), promoting modern development practices including unit/integration testing and modular design principles.
DSTI School of Engineering
Paris, France
Research Fellow & Lecturer
Jul 2024 - Present
Research on AI topics (see also research section)
Teaching "Data Pipeline II" (demo slides)
Code is publicly available here
International Monetary Fund
Washington D.C., United States
Graduate Research Fellow (Fund Internship Program)
Jun 2019 - Sep 2019
Participant in the 2019 Fund Internship Program (FIP), working in the Monetary and Capital Markets (MCM) division. Co-authored the 2019 Global Financial Stability Report (GFSR) analytical chapter “Banks' Dollar Funding: A Source of Financial Vulnerability” by providing state-of-the-art data preprocessing and analysis. The GFSR is the IMF's flagship policy document on financial stability and systemic risk.
The internship led me to pursue a follow-up project in which I developed a cloud-based web crawler. This application, powered by AWS, extracted data from the Securities and Exchange Commission's online reporting engine (EDGAR) for the research community. It used a serverless, managed Spark solution (AWS Glue) for efficient data processing, a NoSQL database (AWS DynamoDB) for storing metadata, and cloud object storage (AWS S3) for persisting large data files. The design followed cloud-native principles, prioritizing lightweight serverless managed services. The resulting application, as well as the data, was open-sourced, promoting collaboration and accessibility. In a related research project, the information on fund ownership structures was modeled in a graph database (Neo4j) to study the impact of fund mergers, for which accurate information on time-varying organization (ownership) structures is crucial.
Bank for International Settlements
Basel, Switzerland
Graduate Research Fellow
Dec 2016 - May 2017
Implementing the entire data ingestion and analysis pipeline for a research project on corporate bond funds:
Using exploratory data science techniques to understand and structure a large dataset received from Reuters (Lipper eMAXX)
Constructing a graph-database-like master/reference dataset on mutual fund ownership structures (mapping fund families, fund portfolios, and fund share classes across time), later remodeled in Neo4j for a subsequent project, by implementing Python fuzzy string matching on fund names and querying reference data via the SEC's EDGAR API
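A minimal sketch of the fuzzy name-matching idea, using only the Python standard library (the similarity threshold and fund names are illustrative; the original implementation is not shown on this page):

```python
# Illustrative fuzzy matching of fund names against a reference list,
# using difflib from the standard library.
from difflib import SequenceMatcher

def best_match(name: str, candidates: list[str], threshold: float = 0.8):
    """Return the candidate most similar to `name`, or None if all fall below the threshold."""
    def similarity(a: str, b: str) -> float:
        # Case-insensitive similarity ratio in [0, 1].
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    score, match = max((similarity(name, c), c) for c in candidates)
    return match if score >= threshold else None

# Example: resolve an abbreviated fund name against reference names.
funds = ["Vanguard Total Bond Market", "Fidelity Contrafund"]
print(best_match("Vanguard Totl Bond Mkt", funds))
```

The threshold trades off false matches against missed matches and would typically be tuned on a manually checked sample.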
Cleaning the 45GB+ TRACE dataset by setting up the required data manipulation tasks (“Dick-Nielsen deduplication”) on the WRDS cloud using SAS
Implementing bond liquidity measures according to business specifications
Implementing and running regression models to support the data analysis of the business side
Ingesting the pre-processed CSV data dumps into a relational database system (Oracle)
Drafting technical documentation for the eMAXX database, taking into account the different needs and diverse backgrounds of other analysts and economists
The data analysis resulted in the research paper Debt Derisking, published in Management Science.
Goethe University
Frankfurt, Hessen, Germany
Ph.D. Candidate (Research & Teaching Assistant)
Jun 2017 - Jul 2020
Providing research assistance for various policy-oriented projects, e.g. a report for the German Federal Ministry of Finance
Teaching both undergraduate and graduate-level classes
Supervising Bachelor's and Master's theses
SAFE
Frankfurt, Germany
Dataroom Supervisor
Aug 2015 - Jun 2016 (part time)
Providing level-1 support for dataroom users (students/researchers)
Contributing to a natural language processing project based on Java and ANTLR
Coding parts of the basic data infrastructure for the System Financial Risk Platform (SFRP) using the Thomson Reuters Datastream Advance Nightshift Server and Python
Supervising undergraduate and graduate students' usage of the SAFE dataroom
German Institute of Economic Research
Berlin, Germany
Intern
Aug 2014 - Oct 2014
Data collection, verification, and examination; Excel programming and report writing
Using this data to estimate input-output models in Excel for impact analyses of companies in the healthcare sector and the renewable energy industry.