HKUST

MAFS 5440. Artificial Intelligence in Fintech
Fall 2025


Course Information

Synopsis

This course offers a comprehensive exploration of the fundamental concepts and underlying principles of artificial intelligence (AI). It delves into the core principles of machine learning and provides valuable insights through case studies of relevant technologies. By providing opportunities for hands-on experimentation with machine learning applications, the course aims to inspire students to devise innovative approaches to address real-life problems in fintech using readily available AI technologies.
Prerequisite: A preliminary course on (statistical) machine learning, applied statistics, or deep learning will be helpful.

Instructors:

Yuan Yao

Time and Place:

Wednesday 19:30-22:20, G010, CYT Bldg (140)

References

An Introduction to Statistical Learning, with Applications in R / Python, by James, Witten, Hastie, and Tibshirani

ISLR-python, by Jordi Warmenhoven.

ISLR-Python: Labs and Applied, by Matt Caudill.

Manning: Deep Learning with Python, by Francois Chollet [GitHub source in Python 3.6 and Keras 2.0.8]

MIT: Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Tutorials: preparation for beginners

Python-Numpy Tutorials by Justin Johnson

scikit-learn Tutorials: An Introduction to Machine Learning in Python

Jupyter Notebook Tutorials

PyTorch Tutorials

Deep Learning: Do-it-yourself with PyTorch, a course at ENS

TensorFlow Tutorials

MXNet Tutorials

Theano Tutorials

The Elements of Statistical Learning (ESL), 2nd Ed., by Hastie, Tibshirani, and Friedman

statlearning-notebooks, by Sujit Pal: Python implementations of the R labs for the StatLearning: Statistical Learning online course from Stanford, taught by Profs. Trevor Hastie and Rob Tibshirani.

Teaching Assistant:


Mr. FU, Xiaoyi. Email: < aifin.hkust (add "AT gmail DOT com" afterwards) >

Schedule

Date Topic Instructor Scribe
03/09/2025, Wed Lecture 01: Overview and History of Artificial Intelligence in Fintech. [ slides ] Y.Y.
10/09/2025, Wed Lecture 02: Supervised Learning: Linear Regression and Classification [ slides ] Y.Y.
11/09/2025, Thu Seminar.
  • Title: PKU Quest: AI-Powered Math Education Practice at Peking University [ announcement ] [ slides ]
  • Speaker: Leheng Chen and Zihao Liu, Peking University
  • Time: Thursday Sep 11, 2025, 3:30pm
  • Venue: Room 2612B (near Lift 31 & 32)
  • Abstract: The advent of Generative AI necessitates a paradigm shift in higher education, calling for new, diverse models of interaction between students, teachers, and AI. In response to this challenge, Peking University has developed PKU Quest, an AI-assisted platform designed to explore these new pedagogical frontiers. PKU Quest focuses on optimizing for the unique demands of mathematics education, and has developed the "Math Tutor," a tool specifically designed for math problem-solving support. Instead of providing direct answers, the Math Tutor engages students in a heuristic and exploratory dialogue, guiding them to develop independent thinking and problem-solving skills. This application has now been implemented across all foundational mathematics courses at Peking University. This presentation will share our journey in developing PKU Quest, discussing the motivations, challenges, and practical outcomes of what we consider a first step in exploring the vast potential of AI in education.
  • Bio: Leheng Chen is a Ph.D. student at the Beijing International Center for Mathematical Research (BICMR), Peking University, advised by Professor Bin Dong. He has broad interests in the application of artificial intelligence. Previously, he explored research directions in AI for Science, such as thermodynamic modeling and foundation models for partial differential equations, with his work published in Physical Review E and at an ICLR Workshop. He has since shifted his research focus to the practical application of AI in Education, where he designed and developed "PKU Quest," an AI-assisted teaching and learning platform for Peking University.
    Zihao Liu (Leo) is a Ph.D. student in Applied Mathematics and Artificial Intelligence at the School of Mathematical Sciences, Peking University. His interests span the application of AI to education and scientific understanding, with recent work focusing on improving the pedagogical effectiveness of AI-powered educational agents and building benchmark datasets for evaluating AI capabilities. As the founder and lead developer of PKU Quest and AKIS (AI Knowledge Intelligent Solution), he focuses on the practical deployment of AI-in-education systems and has helped design and develop “AIBOOKS,” an intelligent digital-textbook platform, and “Math Tutor,” a guided problem-solving assistant for students. He is deeply committed to advancing the integration of AI and education.
Y.Y.
17/09/2025, Wed Lecture 03: Model Assessment and Selection: Subset, Ridge, Lasso, and PCR [ slides ]
    [ Seminar ]
  • Speaker: QRT guest speakers [ poster ]
Y.Y.
24/09/2025, Wed Lecture 04: Project 1 (via Canvas Zoom due to typhoon) [ pdf ]
    [ Reference ]:
  • Kaggle: Home Credit Default Risk [ link ]
  • Kaggle: M5 Forecasting - Accuracy, Estimate the unit sales of Walmart retail goods. [ link ]
Y.Y.
27/09/2025, Sat Lecture 05: Decision Trees, Bagging, Random Forests and Boosting [ slides ]
Y.Y.
08/10/2025, Wed Lecture 06: Support Vector Machines [ slides ]
    [ Reference ]:
  • To view .ipynb files below, you may try [ Jupyter NBViewer ]
  • Python Notebook for Support Vector Machines [ MAFS6010_svm.ipynb ]

  • Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. The Implicit Bias of Gradient Descent on Separable Data. [ arXiv:1710.10345 ]. ICLR 2018. Gradient descent on logistic regression leads to max margin; see the numerical sketch after this list.
  • Matus Telgarsky. Margins, Shrinkage, and Boosting. [ arXiv:1303.4172 ]. ICML 2013. An earlier paper showing that gradient descent on the exponential/logistic loss leads to max margin.
  • Yuan Yao, Lorenzo Rosasco and Andrea Caponnetto. On Early Stopping in Gradient Descent Learning. Constructive Approximation, 2007, 26 (2): 289-315. [ link ]
  • Jingfeng Wu, Peter L. Bartlett, Jason D. Lee, Sham M. Kakade, and Bin Yu. Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization. [ arXiv:2509.17251 ]
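The Soudry et al. result above is easy to preview numerically. Below is a minimal sketch (my own illustration, not part of the lecture materials): gradient descent on the logistic loss over synthetic separable data, whose normalized iterate is compared against a hard-margin direction from scikit-learn's LinearSVC with a large C. The data, step size, and iteration count are arbitrary choices for the demo.

```python
# Toy check that GD on logistic loss converges in direction to max margin.
# Synthetic data and hyperparameters are arbitrary illustrative choices.
import numpy as np
from scipy.special import expit          # sigmoid, numerically stable
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, d = 100, 2
X = rng.normal(size=(n, d))
y = np.sign(X @ np.array([1.0, -1.0]))   # linearly separable labels in {-1, +1}

w, lr = np.zeros(d), 0.1
for _ in range(200000):
    margins = y * (X @ w)
    # gradient of the mean logistic loss log(1 + exp(-y x^T w))
    grad = -(X * (y * expit(-margins))[:, None]).mean(axis=0)
    w -= lr * grad

gd_dir = w / np.linalg.norm(w)
# large C approximates the hard-margin SVM on separable data
svm = LinearSVC(C=1e6, loss="hinge", fit_intercept=False, max_iter=100000).fit(X, y)
svm_dir = svm.coef_.ravel() / np.linalg.norm(svm.coef_)
print("cosine(GD direction, SVM direction) =", gd_dir @ svm_dir)  # close to 1
```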
Y.Y.
11/10/2025, Sat Lecture 06+: Seminar
  • Title: A Statistical View on Implicit Regularization: Gradient Descent Dominates Ridge [ slides ]
  • Speaker: Dr. Jingfeng WU, UC Berkeley
  • Time: 10:30am, LTF
  • Abstract: A key puzzle in deep learning is how simple gradient methods find generalizable solutions without explicit regularization. This talk discusses the implicit regularization of gradient descent (GD) through the lens of statistical dominance. Using linear regression as a clean proxy, we present three surprising findings. First, GD dominates ridge regression: with comparable regularization, the excess risk of GD is always within a constant factor of ridge, but ridge can be polynomially worse even when tuned optimally. Second, GD is incomparable with online stochastic gradient descent (SGD). While it is known that for certain problems GD can be polynomially better than SGD, the reverse is also true: we construct problems, inspired by benign overfitting theory, where optimally stopped GD is polynomially worse. Finally, GD dominates SGD for a significant subclass of problems -- those with fast and continuously decaying covariance spectra -- which includes all problems satisfying the standard capacity condition. This is joint work with Peter Bartlett, Sham Kakade, Jason Lee, and Bin Yu.
  • Bio: Jingfeng Wu is a postdoctoral fellow at the Simons Institute for the Theory of Computing at UC Berkeley. His research focuses on deep learning theory, optimization, and statistical learning. He earned his Ph.D. in Computer Science from Johns Hopkins University in 2023. Prior to that, he received a B.S. in Mathematics (2016) and an M.S. in Applied Mathematics (2019), both from Peking University. In 2023, he was recognized as a Rising Star in Data Science by the University of Chicago and UC San Diego.
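The GD-versus-ridge comparison in the abstract can be previewed on a toy problem. The sketch below (my own illustration, not the speaker's code) compares the excess risk of early-stopped full-batch GD against the ridge path on a Gaussian linear model, where the stopping time plays the role of the inverse regularization strength; it assumes isotropic Gaussian design so the excess risk reduces to ||w - w*||^2, and all problem sizes and grids are arbitrary.

```python
# Toy comparison of early-stopped GD vs. the ridge path on a Gaussian
# linear model; sizes, noise level, and grids are arbitrary choices.
import numpy as np

rng = np.random.default_rng(1)
n, d, sigma = 200, 50, 1.0
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d) / np.sqrt(d)
y = X @ w_star + sigma * rng.normal(size=n)

def excess_risk(w):            # E[(x^T (w - w*))^2] = ||w - w*||^2 for isotropic x
    return float(np.sum((w - w_star) ** 2))

eta = 0.5 / np.linalg.norm(X, 2) ** 2     # step size below 1/L for least squares
w, gd_path = np.zeros(d), []
for t in range(2000):
    w -= eta * X.T @ (X @ w - y)          # full-batch GD on 0.5 * ||Xw - y||^2
    gd_path.append(excess_risk(w))

ridge_path = [excess_risk(np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y))
              for lam in np.logspace(-3, 3, 60)]
print("best early-stopped GD risk:", min(gd_path))
print("best ridge risk           :", min(ridge_path))
```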
Y.Y.
15/10/2025, Wed Lecture 07: An Introduction to Convolutional Neural Networks [ slides ] and EasyChair Instructions for Project 1 [ slides ]
Y.Y.
01/11/2025, Sat Lecture 08: An Introduction to Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Attention and Transformer [ slides ], Rm 5560, 14:00.
Y.Y.
05/11/2025, Wed Lecture 09: Transformer and Applications [ slides ]
    [ Seminar ]
  • Title: Transformers As Statisticians: Provable In-Context Learning with In-Context Algorithm Selection. [ slides ] [ video ]
  • Speaker: Prof. Song MEI, University of California at Berkeley.
  • Abstract: Neural sequence models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) abilities, where they can perform new tasks when prompted with training and test examples, without any parameter update to the model. This work first provides a comprehensive statistical theory for transformers to perform ICL. Concretely, we show that transformers can implement a broad class of standard machine learning algorithms in context, such as least squares, ridge regression, Lasso, learning generalized linear models, and gradient descent on two-layer neural networks, with near-optimal predictive power on various in-context data distributions. Using an efficient implementation of in-context gradient descent as the underlying mechanism, our transformer constructions admit mild size bounds, and can be learned with polynomially many pretraining sequences. Building on these "base" ICL algorithms, intriguingly, we show that transformers can implement more complex ICL procedures involving in-context algorithm selection, akin to what a statistician can do in real life -- a single transformer can adaptively select different base ICL algorithms -- or even perform qualitatively different tasks -- on different input sequences, without any explicit prompting of the right algorithm or task. We both establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: pre-ICL testing, and post-ICL validation. As an example, we use the post-ICL validation mechanism to construct a transformer that can perform nearly Bayes-optimal ICL on a challenging task -- noisy linear models with mixed noise levels. Experimentally, we demonstrate the strong in-context algorithm selection capabilities of standard transformer architectures.
  • Bio: Song Mei is an Assistant Professor in the Department of Statistics and the Department of Electrical Engineering and Computer Sciences at UC Berkeley. In June 2020, he received his Ph.D. from Stanford, advised by Prof. Andrea Montanari. Song's research is motivated by data science and AI, and lies at the intersection of statistics, machine learning, information theory, and computer science. His current research interests include language models and diffusion models, theory of deep learning, theory of reinforcement learning, high-dimensional statistics, quantum algorithms, and uncertainty quantification. Song received a Sloan Research Fellowship in 2025 and an NSF CAREER Award in 2024.
  • Reference: Yu Bai, Fan Chen, Huan Wang, Caiming Xiong, and Song Mei. Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection. NeurIPS, 2023 (Oral). [ arXiv:2306.04637 ] A toy sketch of the in-context GD mechanism follows.
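The core mechanism in the paper is that a transformer can emulate gradient descent on the prompt's examples. The sketch below is an illustrative plain-numpy reimplementation of that predictor, not the paper's transformer construction: it makes an in-context prediction by running a few GD steps on the (x_i, y_i) pairs in the prompt. The task distribution and hyperparameters are invented for the demo.

```python
# In-context learning via in-context gradient descent: the predictor the
# paper shows transformers can implement, demonstrated without a transformer.
import numpy as np

def icl_gd_predict(X_ctx, y_ctx, x_query, steps=50, lr=0.1):
    """Fit least squares to the prompt's examples by GD, then predict at x_query."""
    w = np.zeros(X_ctx.shape[1])
    for _ in range(steps):
        w -= lr * X_ctx.T @ (X_ctx @ w - y_ctx) / len(y_ctx)
    return x_query @ w

rng = np.random.default_rng(2)
d, n_ctx = 8, 32
w_task = rng.normal(size=d)                   # a fresh task for this "prompt"
X_ctx = rng.normal(size=(n_ctx, d))
y_ctx = X_ctx @ w_task + 0.1 * rng.normal(size=n_ctx)
x_q = rng.normal(size=d)
print("in-context prediction:", icl_gd_predict(X_ctx, y_ctx, x_q))
print("ground truth         :", x_q @ w_task)
```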
Y.Y.
12/11/2025, Wed Lecture 10: An Introduction to Reinforcement Learning with Applications in Quantitative Finance [ slides ] and Final Project Initialization [ project2.pdf ]
    [ Reference ]:
  • Google DeepMind's Deep Q-learning playing Atari Breakout: [ youtube ]
  • To view .ipynb files below, you may try [ Jupyter NBViewer ]
  • Deep Q-Learning PyTorch Tutorial: [ link ] (a minimal tabular Q-learning sketch follows this reference list)
  • A Tutorial of Reinforcement Learning for Quantitative Trading: [ Tutorial ] [ Replicate ]
  • FinRL: Deep Reinforcement Learning for Quantitative Finance [ GitHub ]
  • Reinforcement Learning and Supervised Learning for Quantitative Finance: [ link ]
  • Hierarchical Reinforced Trader (HRT): A Bi-Level Approach for Optimizing Stock Selection and Execution, by Zijie Zhao, Roy E. Welsch. [ arXiv:2410.14927 ]
  • Prof. Michael Kearns, University of Pennsylvania, Algorithmic Trading and Machine Learning, Simons Institute at Berkeley [ link ]
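For intuition on the Q-learning referenced above, here is a self-contained tabular sketch on an invented two-state "trend" market (stay flat vs. go long). It is a toy for the update rule only, not a trading strategy; the deep RL links above replace the table with a neural network. All dynamics and rewards are made up for illustration.

```python
# Tabular Q-learning on a toy market: state 0 = down-trend, state 1 = up-trend;
# action 0 = stay flat, action 1 = go long. Entirely synthetic dynamics.
import numpy as np

rng = np.random.default_rng(3)
P = np.array([[0.8, 0.2],          # P[s, s']: trends tend to persist
              [0.3, 0.7]])
def reward(s, a):                  # long earns the trend's drift, flat earns 0
    return a * (1.0 if s == 1 else -1.0)

Q = np.zeros((2, 2))
alpha, gamma, eps = 0.1, 0.95, 0.1  # learning rate, discount, exploration
s = 0
for _ in range(50000):
    a = rng.integers(2) if rng.random() < eps else int(Q[s].argmax())
    s_next = rng.choice(2, p=P[s])
    # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[s, a] += alpha * (reward(s, a) + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
print("greedy policy per state (0=flat, 1=long):", Q.argmax(axis=1))  # expect [0 1]
```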
    [ Reference ]
  • Kaggle: Home Credit Default Risk [ link ]
  • Kaggle: G-Research Crypto Forecasting. [ link ]
  • Kaggle: Jane Street Real-Time Market Data Forecasting. [ link ]
  • Kaggle: M5 Forecasting - Accuracy, Estimate the unit sales of Walmart retail goods. [ link ]
  • Kaggle: M5 Forecasting - Uncertainty, Estimate the uncertainty distribution of Walmart unit sales. [ link ]
    [ Paper Replication ]
  • Shihao Gu, Bryan Kelly and Dacheng Xiu
    "Empirical Asset Pricing via Machine Learning", Review of Financial Studies, Vol. 33, Issue 5, (2020), 2223-2273. Winner of the 2018 Swiss Finance Institute Outstanding Paper Award.
    [ link ]

  • Jingwen Jiang, Bryan Kelly and Dacheng Xiu
    "(Re-)Imag(in)ing Price Trends", Chicago Booth Report, Aug 2021
    [ link ]

Y.Y.
19/11/2025, Wed Lecture 11: Seminar and Student Presentations.
    [ Seminar ]
  • Title: Application of AI Technology in the Securities and Finance Industry.
  • Speaker: Dr. WANG Ying, Guosen Securities Co., Ltd.
  • Abstract: This talk focuses on AI applications in the securities and financial industry. First, we introduce the development of AI technologies. Second, we discuss the mainstream approaches to AI application in Fintech. Next, we will elaborate on Guosen's AI architecture design, including AI platform construction and core AI scenario applications, with a focus on how AI can help solve core challenges in these scenarios. Finally, we will explore the potential risks and future prospects of AI applications in finance.
  • Bio: Dr. WANG Ying holds a PhD in Computer Science and Engineering from the Hong Kong University of Science and Technology and a Bachelor's degree in Computer Science and Technology from the University of Science and Technology of China. She is currently an AI Architect at the Financial Technology Headquarters of Guosen Securities Co., Ltd., researching the application of AI in securities scenarios. Previously, during her doctoral studies she worked with Huawei's 2012 Lab, focusing on network resource allocation algorithms based on reinforcement learning.
    [ Selected Presentations ]
  • Ming Mei, Li Xuzhi, Chen Yilin and Xie Xiaoxiao. Predicting Home Credit Default by Using Light Gradient Boosting Model. [ slides (pptx) ]
  • Kuo Yang. MAFS5440 Project 1: Home Credit Default Risk [ slides (pdf) ]
  • Xinyu Xu, Zilong Pan and Hongen Tang. Home Credit Default Risk Assessment Based on LightGBM. [ slides (pdf) ]
  • Kedeng Qiu, Zewen Wan, Yifei Song and Jiarui Jiang. Project 1: Home Credit Default Risk. [ slides (pptx) ]
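Several of the presentations above build on LightGBM. For readers new to it, here is a minimal, hypothetical baseline in that spirit, shown on synthetic stand-in data (the real competition tables must be downloaded from Kaggle, and the feature engineering the teams describe is the part that actually matters).

```python
# Minimal gradient-boosting baseline in the spirit of the presentations;
# synthetic stand-in data and arbitrary hyperparameters.
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
n, d = 5000, 20
X = rng.normal(size=(n, d))
y = (X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(int)  # stand-in default label

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)
model = lgb.LGBMClassifier(n_estimators=500, learning_rate=0.05, num_leaves=31)
model.fit(X_tr, y_tr, eval_set=[(X_va, y_va)],
          callbacks=[lgb.early_stopping(stopping_rounds=50)])
print("validation AUC:", roc_auc_score(y_va, model.predict_proba(X_va)[:, 1]))
```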
Y.Y.

by YAO, Yuan.