
Notes on finite MDPs, Bellman equations and policy improvement à la Sutton/Barto (2024)

Flemming, Jens

The book 'Reinforcement Learning: An Introduction' by Sutton and Barto is the standard textbook for introductory courses on reinforcement learning. Besides concrete algorithms and extensive examples, the book contains several fundamental results related to Markov decision processes (MDPs) and Bellman equations in Chapters 3 and 4. Unfortunately, some proofs are missing, some theorems lack a precise formulation, and for some results the line of argument is quite garbled. In this note we provide all missing proofs, give precise formulations of the theorems, and untangle the line of argument. Further, we avoid using random variables and their expected values. Since we (like Sutton/Barto) restrict our attention to finite MDPs, all expected values can be written out explicitly, which avoids overloaded notation and murky conclusions. This article bridges the gap between introductory literature like Sutton/Barto and research literature, which contains exact formulations and proofs of the relevant results but is less accessible to beginners due to its higher generality and complexity.

Bus stop data quality in OpenStreetMap (2024)

Flemming, Jens

OpenStreetMap (OSM) is a large open database of geographic data created and maintained by volunteers. The main use of OSM data is rendering an extremely detailed map of the world. Data quality is an important issue for applications like routing pedestrians to public transport facilities. In this report we describe different schemes for mapping bus stops in OSM and provide statistics on the usage of those schemes, the good ones and the not so good ones.

JupyterHub and autograding on bare-metal lab servers (2022)

Flemming, Jens

The Jupyter ecosystem, with JupyterHub and JupyterLab as its most prominent members, is the de facto standard for teaching Python programming and also for research in machine learning and data science. Although the Jupyter project is well documented, many settings and situations require deep knowledge of the internal workings of Jupyter, Linux and related software tools. This report describes three problems that arise when installing and configuring a Jupyter-based teaching environment, together with possible solutions. These three problems are the installation and setup of the autograding tool nbgrader, the interplay between JupyterHub and Linux PAM, and providing access to WebDAV resources for users of JupyterHub.

Host-hub communication for LEGO Spike Prime on Linux (2022)

Flemming, Jens

LEGO robotics sets are a well-established tool for teaching programming in undergraduate courses. Starting with the now outdated EV3 set, LEGO provided a Python programming interface and (unofficial) Linux support. The current LEGO Spike Prime set still provides Python programming, but no direct support for Linux. In this report we collect and extend information on controlling Spike Prime robots from Linux hosts. We cover access to a robot's Python interpreter and code transfer as well as bidirectional robot-to-host communication via USB and Bluetooth. Results may be extended to robot-to-robot communication via Bluetooth.

Hochschulforschungsbericht 2014 (2014)

Research reporting, research results 2013; project overviews, short project reports; presentations, events, names

