Revealed Preference and Adaptive Inverse Reinforcement Learning

Vikram Krishnamurthy (Cornell)

Inverse reinforcement learning (IRL) concerns the problem of inferring an agent’s objective from observations of its behavior. This talk has two related parts.

In the first part, we discuss revealed preference theory and generalizations of Afriat’s theorem, which provide necessary and sufficient conditions for the existence of a utility function consistent with observed datasets. The second part discusses adaptive IRL using passive Langevin dynamics. We show how the resulting occupancy measure is linked to the underlying utility function, and how estimation efficiency can be improved via a Nadaraya–Watson regression interpretation. Finally, we discuss Malliavin derivative–based estimators as an alternative to kernel smoothing. Malliavin calculus provides weak sensitivity representations that avoid bandwidth selection and naturally extend to continuous-time stochastic systems, offering a principled approach to passive IRL.
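
As a rough illustration of the revealed preference test from the first part, the sketch below checks the Generalized Axiom of Revealed Preference (GARP) on a finite dataset of prices and chosen bundles. In the classical linear-budget setting, GARP is equivalent to the existence of a non-satiated utility function that rationalizes the data. This is only the textbook case, not the generalizations covered in the talk; the function name `satisfies_garp`, the tolerance `tol`, and the toy data are assumptions made for illustration.

```python
def satisfies_garp(prices, bundles, tol=1e-9):
    """Return True if the dataset {(p_t, x_t)} satisfies GARP.

    prices[t] and bundles[t] are equal-length sequences of numbers.
    Illustrative sketch for linear budgets only.
    """
    n = len(prices)
    # cost[t][s] = p_t . x_s, the cost of bundle s at the prices of observation t
    cost = [
        [sum(p * x for p, x in zip(prices[t], bundles[s])) for s in range(n)]
        for t in range(n)
    ]
    # Direct revealed preference: x_t R x_s if x_s was affordable when x_t was chosen
    R = [[cost[t][t] >= cost[t][s] - tol for s in range(n)] for t in range(n)]
    # Transitive closure of R (Warshall's algorithm)
    for k in range(n):
        for i in range(n):
            if R[i][k]:
                for j in range(n):
                    if R[k][j]:
                        R[i][j] = True
    # GARP is violated if x_t R x_s while x_s is strictly directly preferred to x_t
    for t in range(n):
        for s in range(n):
            if R[t][s] and cost[s][s] > cost[s][t] + tol:
                return False
    return True


# Toy data (made up): two goods, two observations
consistent = satisfies_garp([[1, 2], [2, 1]], [[2, 1], [1, 2]])
inconsistent = satisfies_garp([[1, 2], [2, 1]], [[1, 2], [2, 1]])
```

In the first toy dataset neither bundle was affordable when the other was chosen, so no cycle can arise. In the second, each bundle was chosen when the other was strictly cheaper, which no utility-maximizing agent would do.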

This talk focuses on the underlying fundamental ideas emerging from two industrial research collaborations: predicting user engagement in online multimedia and identifying adaptive radar systems from observed emissions.

Bio. Vikram Krishnamurthy is a Professor in the School of Electrical and Computer Engineering at Cornell University, with affiliations in applied mathematics and mechanical engineering. His research spans statistical signal processing, stochastic control, and reinforcement learning, with applications to social networks, sensing systems, and decision-making under uncertainty. He is the author of the widely used monograph Partially Observed Markov Decision Processes: Filtering to Controlled Sensing. Krishnamurthy is an IEEE Fellow, recognized for contributions to statistical signal processing and controlled sensing, and has held prestigious positions including a Canada Research Chair prior to joining Cornell.