Distributed Computing
ETH Zurich

Seminar in Deep Reinforcement Learning (FS 2020)

Organization

When & Where: Tuesdays 10:15 @ ETZ G 91
First seminar: 18.02.2020
Last seminar: 26.05.2020
Coordinators: Roger Wattenhofer & Oliver Richter

As a seminar participant, you are invited to attend all talks and to give a presentation in English. The presentation should last 35 minutes, followed by about 10 minutes of discussion.

Disclaimer: This is a seminar, so we will focus on research and skip most of the basics. If you feel that you cannot follow the discussions, we invite you to check out this lecture or at least this talk.

Presentation & Discussion

The seminar will be exciting if the presentations are exciting. Here is a two-page guideline on how to give a great scientific presentation. Here are some additional guidelines: 1, 2 and 3. You can find further guidance on structuring your talk, as well as resources and ideas about what each topic should address, here.

We further expect the presentation to motivate a lively discussion. We encourage discussions during and after the presentations as a main objective of this seminar. It may help discussions if you also try to be critical about the presented work. These are all scientific papers, but if you have been in science long enough...

COVID-19 Situation

Due to the COVID-19 outbreak, we will continue the seminar in digital form, with discussions over Zoom. All talks can be found here (you should have received the corresponding password by email). If you still have a presentation upcoming, please have it ready as a video the day before you would have given it in the seminar.

Grade

Your grade will mostly depend on your presentation. In addition, we also grade how actively you participate in the discussions throughout the whole semester. Further, there will be a programming challenge alongside the seminar, in which you can take part to improve your grade.

Coding Challenge

The idea of the coding challenge is that you implement a deep reinforcement learning algorithm from scratch at least once. You can implement whichever algorithm you like and take inspiration from existing code libraries; however, your agent should be your own implementation. The goal is to learn variations of the card game Blackjack. Further instructions and the environment to train on can be found here. You are expected to hand in your code, including a description of your results, by 19.05. so that we can discuss your implementations in the last seminar.

How To Sign Up

There will be two presentations per week, so there is a limited number of slots (topics), which will be assigned based on preference. If you have not yet received an email confirming your spot in the seminar, write a sentence about your background (courses, projects, ...) in deep reinforcement learning to Oliver Richter to get a spot on the waiting list in case someone cancels.

After You Got Your Topic

We established the following rules to ensure a high quality of the talks and hope that these will result in a good grade for you:

Schedule

Date | Presenter(s) | Title | Mentor | Slides
18.02.2020 | Oliver Richter | Introduction | | [pdf]
25.02.2020 | Zhao Ma, Constantin Le Clei | Deep Learning and Neural Architecture | Zhao Meng | [pdf][pptx]
03.03.2020 | Alexander Nedergaard, Xiang Li | On Policy vs. Off-Policy vs. Batch-Policy Learning | Gino Brunner | [pdf][pdf]
10.03.2020 | Samriddhi Jain, Yunke Ao | Deep reinforcement learning in continuous action spaces | Oliver Richter | [pdf][pptx]
17.03.2020 | Cliff Li, Lucas Brunner | Hierarchical deep reinforcement learning | Béni Egressy | [pdf][pdf]
24.03.2020 | Tommaso Macri, Jérémy Scheurer | Deep reinforcement learning and stochastic planning in games | Kevin Roth | [pdf][pdf]
31.03.2020 | Adrian Hoffmann, Lee Sharkey | Model based vs. model free deep reinforcement learning | Pascal Weber | [pdf][pdf]
07.04.2020 | Thomas Langerak, Sébastien Foucher | Deep reinforcement learning in partial observability | Lukas Faber | [pdf][pdf]
21.04.2020 | Florian Turati, Orhan Saeedi | Multi-Armed Bandits | Ye Wang | [pdf][pdf]
28.04.2020 | Felix Schur | Non-differentiable optimization | Henri Devillez | [pdf]
05.05.2020 | Philippe Blatter, Steven Battilana | Meta-Learning | Giambattista Parascandolo | [pdf][pdf]
12.05.2020 | - | No seminar - free time for coding challenge | |
19.05.2020 | Ramon Witschi, Nicolas Zucchet | Continual Learning | Damian Pascual | [pdf][pdf]
26.05.2020 | Oliver Richter | Discussion and Review | |