Sikun Xu

I am a Ph.D. candidate at Olin Business School, Washington University in St. Louis. I’m fortunate to be advised by Prof. Dennis Zhang and Prof. Raphael Thomadsen. I study how firms can make reliable decisions using modern machine learning and AI systems when the available data is noisy and high-dimensional.

I’m on the 2026-2027 job market!

Contacts

sikun [at] wustl [dot] edu
xusikun96 [at] gmail [dot] com

Education

Olin Business School, Washington University in St. Louis (2021-present)
- Ph.D. Candidate
- Dissertation Title: Data-Driven Business Decision-Making with Causal Inference and Machine Learning
Columbia University in the City of New York (2019-2020)
- M.S. in Operations Research
- Data Science Institute Scholar
Shanghai Jiao Tong University (2015-2019)
- B.S. in Industrial Engineering

Job Market Papers

The Winner's Curse in Data-Driven Decision-Making: Evidence and Solutions
Sikun Xu, Raphael Thomadsen and Dennis Zhang
Submitted. Accepted to SICS 2025.
SSRN

Abstract
Data-driven decision making involves estimating the value of each potential option and selecting the one with the highest estimated efficacy. This approach underpins a wide array of modern marketing, operations, and AI applications, including A/B testing, advertising and bidding, pricing, and personalized targeting. However, several papers have shown that the estimated effectiveness of the chosen options will be systematically over-optimistic (Smith and Winkler 2006, Efron 2011, Andrews et al. 2024), even when the estimated outcomes are themselves unbiased and efficient. Using simulations calibrated to realistic parameter values from recent marketing studies, we first demonstrate that the magnitude of the winner’s curse is often high in relevant marketing contexts, and that the severity of the winner’s curse depends on the true performance difference between options relative to the level of noise in the data, the number of alternatives under consideration, and the number of observations per tested condition. We further show that using machine learning methods to evaluate what treatment to give to individual consumers can lead to extremely high levels of winner’s curse, especially if the machine learning functional form is very flexible. We propose a correction method based on a non-continuous bootstrap, and benchmark our method against several existing proposed solutions across many common marketing scenarios. We demonstrate that our bootstrap approach generally performs well, and usually outperforms the solutions that have been previously proposed in the literature.
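The selection effect described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's calibrated simulation or its bootstrap correction; the function name `winners_curse_gap`, the normal noise model, and all parameter values are assumptions chosen for illustration. Each option's estimate is unbiased, yet the estimate attached to the selected option is too high on average.

```python
import numpy as np

def winners_curse_gap(true_means, noise_sd, n_obs, n_sims=20000, seed=0):
    """Average (estimated - true) value of the option picked as the winner.

    Hypothetical helper: assumes each option's sample mean is normal and
    unbiased, with standard error noise_sd / sqrt(n_obs).
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    se = noise_sd / np.sqrt(n_obs)
    # One row per simulated experiment, one column per option.
    estimates = true_means + rng.normal(0.0, se, size=(n_sims, true_means.size))
    winners = estimates.argmax(axis=1)  # pick the highest estimate
    estimated_value = estimates[np.arange(n_sims), winners]
    true_value = true_means[winners]
    return float(np.mean(estimated_value - true_value))

if __name__ == "__main__":
    # More alternatives widen the gap; more observations per option shrink it.
    for k, n in [(2, 100), (10, 100), (10, 10000)]:
        gap = winners_curse_gap(np.zeros(k), noise_sd=1.0, n_obs=n)
        print(f"options={k:2d} n_obs={n:5d} gap={gap:.4f}")
```

In this sketch the gap grows with the number of alternatives and shrinks with the number of observations per option, which matches the qualitative dependence the abstract describes. With ten equally good options and a standard error of 0.1, the gap is roughly 0.15.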

SOTA or Luck? The Winner’s Curse in LLM Leaderboards
Sikun Xu, Raphael Thomadsen, Dennis Zhang, and Heng Zhang
Submitted.

Abstract
Public LLM leaderboards are now central to how models are evaluated, compared, and publicized, yet we show that the reported performance of top-ranked models can be systematically overstated. The reason is a leaderboard winner’s curse (WC): benchmark scores are noisy averages over finite task sets, so ranking models in decreasing order of score rewards sampling luck alongside genuine ability. This overstatement is large enough to make the apparent winner statistically fragile: in four of the five SWE-bench variants we study, the published rank-1 model has less than a 50% bootstrap probability of remaining rank 1 under task resampling. We develop an $m$-out-of-$N$ task bootstrap that assumes i.i.d. benchmark tasks while allowing arbitrary within-task correlation among models. We prove that the worst-case rank-induced bias (of which the WC is the special case at rank 1) is $\Theta(1/\sqrt{N})$ in the number of tasks $N$, and that our correction reduces it to $o(1/\sqrt{N})$. We also introduce a rank probability matrix that replaces a single deterministic ranking with a distribution over plausible rankings. Empirically, rank-1 inflation reaches 1–3 percentage points across eight benchmark settings spanning code generation and agentic customer service. On SWE-bench Verified, selection bias explains 47% of the winner’s apparent top-five lead, and the top 28 models are statistically indistinguishable at the 5% level. Our results suggest that benchmark designers should either enlarge task pools substantially to suppress the WC or report bias-corrected scores and ranking uncertainty alongside raw rankings.
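A rank probability matrix of the kind mentioned above can be sketched with a plain task-level bootstrap. This is a minimal illustration under stated assumptions, not the paper's estimator or bias correction: the function name `rank_probability_matrix`, the default `m = N`, and the tie-breaking rule (lower model index wins) are all hypothetical choices. Resampling whole task rows keeps each task's scores for all models together, so within-task correlation among models is preserved.

```python
import numpy as np

def rank_probability_matrix(scores, m=None, n_boot=2000, seed=0):
    """Estimate P[model i is ranked r] by resampling tasks.

    scores: array of shape (n_tasks, n_models), one row per benchmark task.
    m: tasks drawn per resample (hypothetical default: all n_tasks).
    Returns an (n_models, n_models) matrix; entry [i, r] is the share of
    resamples in which model i holds rank r (r = 0 is the top rank).
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    n_tasks, n_models = scores.shape
    m = n_tasks if m is None else m
    counts = np.zeros((n_models, n_models))
    for _ in range(n_boot):
        idx = rng.integers(0, n_tasks, size=m)      # draw m tasks with replacement
        means = scores[idx].mean(axis=0)            # resampled leaderboard scores
        order = np.argsort(-means, kind="stable")   # order[r] = model at rank r
        counts[order, np.arange(n_models)] += 1
    return counts / n_boot

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Three synthetic models with close pass rates on 300 synthetic tasks.
    demo = rng.binomial(1, [0.62, 0.60, 0.55], size=(300, 3))
    print(np.round(rank_probability_matrix(demo, m=150), 3))
```

Each row and each column of the returned matrix sums to one, so the first column can be read as every model's bootstrap probability of holding the top rank.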

Working Papers

A Causal Approach to Representation Learning for Unstructured Data
Sikun Xu, Zhenling Jiang, Zhengling Qi, and Dennis Zhang
Major revision at Management Science. Accepted to 19th Annual Bass FORMS Conference (2025).
SSRN

Abstract
The increasing availability of unstructured data (e.g., images) in business and economics research has created new opportunities to control for confounders. A common approach is embedding-then-inference, where unstructured data is compressed into low-dimensional embeddings and incorporated into causal models. However, we show that this method can introduce significant bias because representation learning models optimized for reconstruction may miss relevant confounders. To address this, we propose causal embeddings, which explicitly align the objective of representation learning with the causal task by jointly predicting both treatment and outcome variables. This approach captures confounding information while maintaining low-dimensional efficiency and accommodates various embedding methods, including fine-tuned pretrained models. Simulations demonstrate that causal embeddings outperform both embedding-then-inference and direct adjustment with double machine learning (DML) in subsequent causal inference tasks. A real-world application further highlights the practical importance of properly accounting for unstructured data in causal models.
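The failure mode of embedding-then-inference described above can be reproduced in a toy setting. The sketch below is not the paper's causal-embedding method: it swaps in PCA as the reconstruction-optimized embedding and ordinary least squares as the causal model, and the data-generating process, the function names, and all coefficients are assumptions for illustration. The confounder is a low-variance direction of the high-dimensional input, so a one-dimensional reconstruction-focused embedding discards it.

```python
import numpy as np

def ols_effect(t, controls, y):
    """Coefficient on treatment t from OLS of y on [1, t, controls]."""
    design = np.column_stack([np.ones_like(t), t, controls])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[1])

def simulate_embedding_bias(n=5000, d=10, tau=1.0, seed=0):
    """Return (effect adjusting for a PCA embedding, effect adjusting for
    the true confounder). The true treatment effect is tau."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                    # confounder
    x = 0.1 * rng.normal(size=(n, d))         # stand-in for unstructured data
    x[:, 0] += 5.0 * rng.normal(size=n)       # high-variance nuisance direction
    x[:, 1] += c                              # low-variance confounder direction
    t = c + rng.normal(size=n)                # treatment depends on c
    y = tau * t + 2.0 * c + rng.normal(size=n)
    # One-dimensional PCA embedding: best linear reconstruction of x.
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    embedding = xc @ vt[0]
    return ols_effect(t, embedding, y), ols_effect(t, c, y)

if __name__ == "__main__":
    naive, oracle = simulate_embedding_bias()
    print(f"PCA-embedding adjustment: {naive:.2f}; oracle adjustment: {oracle:.2f}")
```

In this setup the PCA embedding tracks the nuisance direction, so adjusting for it leaves the confounding in place and the estimated effect lands near 2 rather than the true value of 1. An embedding trained to predict treatment and outcome would instead be pushed toward the confounder direction, which is the intuition behind aligning the representation with the causal task.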

Conference Proceedings

Verifying Global Optimality of Candidate Solutions to Polynomial Optimization Problems using a Determinant Relaxation Hierarchy
Sikun Xu, Ruoyi Ma, Daniel K. Molzahn, Hassan Hijazi, and Cédric Josz
60th IEEE Conference on Decision and Control (2021).
IEEE

Abstract
We propose an approach for verifying that a given feasible point for a polynomial optimization problem is globally optimal. The approach relies on the Lasserre hierarchy and the result of Lasserre regarding the importance of the convexity of the feasible set as opposed to that of the individual constraints. By focusing solely on certifying global optimality and relaxing the Lasserre hierarchy using necessary conditions for positive semidefiniteness based on matrix determinants, the proposed method is implementable as a computationally tractable linear program. We demonstrate this method via application to several instances of polynomial optimization, including the optimal power flow problem used to operate electric power systems.

Work in Progress

Policy Learning with Noncompliant AI Agents
Sikun Xu
Exploration Without Noise: Quality-Gated Learning in Platform Recommendations
Sikun Xu, with industry partners
Generative Learning-to-Rank in Recommendation Systems
Sikun Xu, with industry partners

Conference Presentations

The Winner’s Curse in Data-Driven Decision-Making: Evidence and Solutions
- 2025 INFORMS Annual Meeting (Atlanta)
- 2025 INFORMS Marketing Science Conference (Washington, D.C.)
- 2024 Conference on Artificial Intelligence, Machine Learning, and Business Analytics (Yale)
A Causal Approach to Representation Learning for Unstructured Data
- 2025 ISMS Marketing Science Conference (Washington, D.C.)
- 2024 INFORMS Annual Meeting (Seattle, Session Chair)
- 2024 POMS Annual Conference (Minneapolis)
- 2023 INFORMS Annual Meeting (Phoenix)
Data-Driven Security Selection for Wealth Management
- 2023 POMS Annual Conference (Orlando)
- 2022 INFORMS Annual Meeting (Indianapolis)
Verifying Global Optimality of Candidate Solutions to Polynomial Optimization Problems using a Determinant Relaxation Hierarchy [slide]
- 2021 INFORMS Annual Meeting (Virtual)

Teaching

Guest Lecturer

Washington University in St. Louis (MGT680E, 2024 Fall)
Columbia University (IEOR4721, 2022 Spring)
Columbia University (IEOR4721, 2021 Summer)

Teaching Assistant

Columbia University in the City of New York

IEOR4742 Deep Learning; FL2020
IEOR4525 Machine Learning; SP2020

Washington University in St. Louis

SCOT519E Revenue Management; FL2022
SCOT5704 Operations Management; FL2022
SCOT500D Project Management; FL2022, SP2023
SCOT500M Supply Chain Analytics: Stochastic Models; SP2023, SP2024
SCOT400D Supply Chain Analytics; SP2023
SCOT558 Advanced Operations Strategy; FL2023
SCOT356 Operations and Manufacturing Management; FL2024
MGT680E AI & Machine Learning Business Applications; FL2024

Academic Services

Session Chair at INFORMS Annual Meeting (2024)
Reviewer for Journal of Investment Strategies

Sikun Xu (徐思坤)