{"id":2397,"date":"2025-02-20T12:34:34","date_gmt":"2025-02-20T12:34:39","guid":{"rendered":"https:\/\/meyn.ece.ufl.edu\/?page_id=2397"},"modified":"2026-03-22T08:06:37","modified_gmt":"2026-03-22T13:06:37","slug":"efficientlysearchingpomdps","status":"publish","type":"page","link":"https:\/\/faculty.eng.ufl.edu\/meyn\/c3\/c3-9\/efficientlysearchingpomdps\/","title":{"rendered":"Efficiently searching for good agent state based policies in Dec-POMDPs"},"content":{"rendered":"<h3><a href=\"https:\/\/www.mcgill.ca\/engineering\/aditya-mahajan\"><span style=\"font-weight: 400\">Aditya Mahajan<\/span><\/a> (<span style=\"font-weight: 400\">McGill University<\/span>)<\/h3>\n<span style=\"font-weight: 400\">Decentralized partially observable Markov decision processes (Dec-POMDPs) are becoming increasingly popular in various applications ranging from decentralized control of a fleet of autonomous vehicles to that of smart grids. Optimally solving Dec-POMDPs is notoriously hard, as illustrated by various counterexamples, including Witsenhausen&#8217;s counterexample and the Whittle and Rudge counterexample. The complexity of finding the best history-based policies is NEXP-complete. Agent-state based policies have emerged as a popular paradigm to address some of these challenges.<\/span>\n\n<span style=\"font-weight: 400\">In this talk, we review the existing solution approaches for finding optimal agent-state based policies and present a novel policy search algorithm which has a monotonic improvement guarantee and converges to a locally optimal solution. 
We conclude by presenting experimental results that show that the proposed algorithm identifies close-to-optimal policies in various POMDP and Dec-POMDP benchmarks.<\/span>\n\n<span style=\"font-weight: 400\">Joint work with Amit Sinha and Matthieu Geist.<\/span>\n\n<img loading=\"lazy\" decoding=\"async\" class=\"wp-image-2693 alignleft\" src=\"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2026\/03\/portrait-of-aditya-mahajan.jpeg.webp\" alt=\"Aditya Mahajan\" width=\"217\" height=\"271\" srcset=\"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2026\/03\/portrait-of-aditya-mahajan.jpeg.webp 748w, https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2026\/03\/portrait-of-aditya-mahajan.jpeg-240x300.webp 240w\" sizes=\"auto, (max-width: 217px) 100vw, 217px\" \/>\n\n<b>Bio:<\/b><span style=\"font-weight: 400\"> Aditya Mahajan is Professor of Electrical and Computer Engineering at McGill University, Montreal, Canada. He is a member of the McGill Center of Intelligent Machines (CIM), Mila &#8211; Qu\u00e9bec AI Institute, International Laboratory for Learning Systems (ILLS), and Groupe d\u2019\u00e9tudes et de recherche en analyse des d\u00e9cisions (GERAD).<\/span>\n\n<span style=\"font-weight: 400\">He is the recipient of the 2015 George Axelby Outstanding Paper Award, the 2016 NSERC Discovery Accelerator Award, the 2014 CDC Best Student Paper Award (as supervisor), and the 2016 NecSys Best Student Paper Award (as supervisor). His principal research interests include decentralized stochastic control, team theory, reinforcement learning, multi-armed bandits and information theory.<\/span>\n\n","protected":false},"excerpt":{"rendered":"<p>Aditya Mahajan (McGill University) Decentralized partially observable Markov decision processes (Dec-POMDPs) are becoming increasingly popular in various applications ranging from decentralized control of a fleet of autonomous vehicles to that of smart grids. 
Optimally solving Dec-POMDPs is notoriously hard, as illustrated by various counterexamples, including Witsenhausen&#8217;s counterexample and the Whittle and Rudge counterexample. The complexity of [&hellip;]<\/p>\n","protected":false},"author":1347,"featured_media":0,"parent":2631,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/page-section-nav.php","meta":{"_acf_changed":false,"inline_featured_image":false,"featured_post":"","footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-2397","page","type-page","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/2397","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/users\/1347"}],"replies":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/comments?post=2397"}],"version-history":[{"count":8,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/2397\/revisions"}],"predecessor-version":[{"id":2809,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/2397\/revisions\/2809"}],"up":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/2631"}],"wp:attachment":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/media?parent=2397"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}