{"id":1269,"date":"2022-04-05T19:43:04","date_gmt":"2022-04-05T19:43:04","guid":{"rendered":"https:\/\/meyn.ece.ufl.edu\/?page_id=1269"},"modified":"2026-04-14T09:45:34","modified_gmt":"2026-04-14T14:45:34","slug":"control-systems-reinforcement-learning","status":"publish","type":"page","link":"https:\/\/faculty.eng.ufl.edu\/meyn\/courses\/control-systems-reinforcement-learning\/","title":{"rendered":"Control Systems &amp; Reinforcement Learning"},"content":{"rendered":"<p class=\"p1\"><em>Reinforcement learning<\/em> is a collection of tools for the design of decision and control algorithms. What makes RL different from traditional control is that the modelling step is avoided, and instead the control design is based on observations of the system to be controlled.<\/p>\n<table>\n<tbody>\n<tr>\n<td><p><a href=\"http:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2022\/04\/RL2021_Info-1.pdf\">Course information from Spring 2021<\/a>. Covers Part I of the <a href=\"https:\/\/meyn.ece.ufl.edu\/control-systems-and-reinforcement-learning\/\">new monograph<\/a> of the same name:<\/p>\n<ol>\n<li class=\"p1\">Introduction<\/li>\n<li class=\"p1\">Control Crash Course<\/li>\n<li class=\"p1\">Optimal Control<\/li>\n<li class=\"p1\">ODE Methods for Algorithm Design\u00a0 \u2192 Actor-only methods, and Actor-Critic in Part II<\/li>\n<li class=\"p1\">Value Function Approximations\u00a0 \u00a0\u2192 TD and Q-learning<\/li>\n<\/ol>\n<\/td>\n<td style=\"text-align: center\" valign=\"middle\"><a href=\"https:\/\/meyn.ece.ufl.edu\/publications\/current\/control-systems-and-reinforcement-learning\/\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-971 size-medium\" src=\"http:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-207x300.png\" alt=\"Book site\" width=\"207\" height=\"300\" srcset=\"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-207x300.png 207w, 
https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-707x1024.png 707w, https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-768x1113.png 768w, https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-1060x1536.png 1060w, https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-1413x2048.png 1413w, https:\/\/faculty.eng.ufl.edu\/meyn\/wp-content\/uploads\/sites\/671\/2021\/08\/SunsetCoverNoBird-scaled.png 1767w\" sizes=\"auto, (max-width: 207px) 100vw, 207px\" \/><\/a><a href=\"https:\/\/meyn.ece.ufl.edu\/publications\/current\/control-systems-and-reinforcement-learning\/\">CS&amp;RL<\/a><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Intended for graduate students and senior undergraduates <em>without the usual background in stochastic processes<\/em> (though such background is desirable).\u00a0 Experience with Matlab or Python is essential, as is the signals &amp; systems mathematical toolbox.<\/p>\n<p>The origins of RL go all the way back to Claude Shannon in the 1950s, and the field made headlines in the public press more recently following the success of AlphaGo and other RL algorithms that beat grand masters at complex games like Go and chess. Today it is hoped that RL will be an engine behind autonomous cars, as well as better decision making in fields ranging from medicine to finance. This course provides an introduction to RL through the lens of control theory. We will find that the DQN algorithm behind AlphaGo is related to classical control concepts going back to the 1960s. Given this intuition, we will discover techniques to create new and potentially more reliable algorithms for decision and control.<\/p>\n<h4><strong>More Resources<\/strong><\/h4>\n<ul>\n<li>R. Sutton and A. Barto. 
<a href=\"http:\/\/www.cs.ualberta.ca\/~sutton\/book\/the-book.html\">Reinforcement Learning: An Introduction<\/a>. MIT Press, Cambridge, MA, 2nd edition, 2018.<\/li>\n<li>Karl J. \u00c5str\u00f6m and Richard M. Murray. <a href=\"http:\/\/www.cds.caltech.edu\/~murray\/amwiki\/index.php\/Second_Edition\">Feedback Systems: An Introduction for Scientists and Engineers<\/a>,\u00a0and <a href=\"https:\/\/simons.berkeley.edu\/talks\/murray-control-1\">Murray\u2019s crash course<\/a> from 2018.<\/li>\n<li>Csaba Szepesvari. <a href=\"https:\/\/sites.ualberta.ca\/~szepesva\/papers\/RLAlgsInMDPs.pdf\">Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning<\/a>. Morgan &amp; Claypool Publishers, 2010.<\/li>\n<li><a href=\"https:\/\/simons.berkeley.edu\/programs\/rl20\">Theory of Reinforcement Learning: tutorials at the Simons Institute program<\/a>, Aug. 19&#8211;Dec. 18, 2020.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Reinforcement learning is a collection of tools for the design of decision and control algorithms. 
What makes RL different from traditional control is that the modelling step is avoided, and instead the control design is based on observations of the system to be controlled.<\/p>\n","protected":false},"author":1347,"featured_media":1657,"parent":711,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"page-templates\/page-section-nav.php","meta":{"_acf_changed":false,"inline_featured_image":false,"featured_post":"","footnotes":"","_links_to":"","_links_to_target":""},"class_list":["post-1269","page","type-page","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1269","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/users\/1347"}],"replies":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/comments?post=1269"}],"version-history":[{"count":1,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1269\/revisions"}],"predecessor-version":[{"id":2895,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/1269\/revisions\/2895"}],"up":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/pages\/711"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/media\/1657"}],"wp:attachment":[{"href":"https:\/\/faculty.eng.ufl.edu\/meyn\/wp-json\/wp\/v2\/media?parent=1269"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}