{"id":1461,"date":"2022-08-22T17:27:10","date_gmt":"2022-08-22T17:27:10","guid":{"rendered":"https:\/\/meyn.ece.ufl.edu\/?page_id=1461"},"modified":"2022-08-22T17:27:10","modified_gmt":"2022-08-22T17:27:10","slug":"csrl-resources","status":"publish","type":"page","link":"https:\/\/faculty.eng.ufl.edu\/meyn\/control-systems-and-reinforcement-learning\/csrl-resources\/","title":{"rendered":"Resources"},"content":{"rendered":"<h4>Lectures and Video from DeepLearn 2022<\/h4>\n<table>\n<tbody>\n<tr>\n<td>\n<h5>Link to Slides:<\/h5>\n<\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/www.slideshare.net\/spmeyn\/deeplearn2022-1-goals-algorithmdesignpdf\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1485 size-full\" src=\"http:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2022\/08\/DeepLearn2022_1_GoalsAndAlgorithmDesign.png\" alt=\"\" width=\"3226\" height=\"2419\" \/><\/a><\/td>\n<td><a href=\"https:\/\/www.slideshare.net\/spmeyn\/deeplearn2022-2-variance-matters\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1483 size-full\" src=\"http:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2022\/08\/DeepLearn2022_2_VarianceMatters.png\" alt=\"\" width=\"3226\" height=\"2419\" \/><\/a><\/td>\n<td><a href=\"https:\/\/www.slideshare.net\/spmeyn\/deeplearn2022-3-td-and-q-learning\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-1481 size-full\" src=\"http:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2022\/08\/DeepLearn2022_3_TDQ.png\" alt=\"\" width=\"3226\" height=\"2419\" \/><\/a><\/td>\n<\/tr>\n<tr>\n<td>\n<h5>Link to Video:<\/h5>\n<\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><a href=\"https:\/\/www.dropbox.com\/s\/pqolgdt16tnbr9q\/SeanMeyn_1_QSA%2BESC.mp4?dl=0\">Part 1, Goals and Challenges <em>and 
solutions!<\/em><\/a>\n<p>Introduction to the ODE method in a simple deterministic setting, with applications to extremum seeking control (a class of algorithms for online optimization, with applications to reinforcement learning).<\/p>\n<p>Much is taken from Chapter 4 and from 2022 publications available on arXiv, such as<\/p>\n<ul>\n<li class=\"title mathjax\"><a href=\"https:\/\/arxiv.org\/abs\/2207.06371\">Markovian Foundations for Quasi-Stochastic Approximation with Applications to Extremum Seeking Control<\/a><\/li>\n<\/ul>\n<\/td>\n<td><a href=\"https:\/\/www.dropbox.com\/s\/83osh169uyzsrt9\/SeanMeyn_2_SA.mp4?dl=0\">Part 2, Variance Matters<\/a>\n<p>The theoretical side of reinforcement learning has focused almost entirely on stochastic models for algorithm design and analysis. This talk surveys techniques for algorithm design and testing, building on Part 1.<\/p>\n<p>The material is taken from Chapter 8 and recent tutorials and articles, including<\/p>\n<ul>\n<li class=\"title is-5 mathjax\"><a href=\"https:\/\/arxiv.org\/abs\/2110.14427\">The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning<\/a><\/li>\n<\/ul>\n<\/td>\n<td><a href=\"https:\/\/www.dropbox.com\/s\/aoexrbqqsowa632\/SeanMeyn_3_TD.mp4?dl=0\">Part 3, TD and Q-Learning<\/a>\n<p>Covers the final two chapters: all about algorithm design for TD- and Q-learning in a stochastic environment.<\/p>\n<div class=\"slideshow-description-text-wrapper\">\n<p>Much of Part II of CS&amp;RL is based on handouts created over the years, some of which evolved to become<\/p>\n<ul>\n<li><a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/978-3-030-60990-0_4\">Fundamental Design Principles for Reinforcement Learning Algorithms<\/a><\/li>\n<\/ul>\n<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>Lectures and Video from 
DeepLearn 2022 Link to Slides: Link to Video: Part 1, Goals and Challenges and solutions! Introduction to the ODE method in a simple deterministic setting, with applications to extremum seeking control (a class of algorithms for online optimization, with applications to reinforcement learning). Much is taken from Chapter 4, and 2022 [&hellip;]<\/p>\n","protected":false},"author":1347,"featured_media":0,"parent":1207,"menu_order":2,"comment_status":"closed","ping_status":"closed","template":"page-templates\/page-section-nav.php","meta":{"_acf_changed":false,"inline_featured_image":false,"featured_post":"","footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-1461","page","type-page","status-publish","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1461","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/users\/1347"}],"replies":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/comments?post=1461"}],"version-history":[{"count":0,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1461\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1207"}],"wp:attachment":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/media?parent=1461"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}