Q-learning: car driving narrow track (2)

  719 views

42smurf


A simple, model-free Q-learning (reinforcement learning) agent navigates a car on a track. Code for example (4) is available here: github.com/cpe...
The purpose of this exercise was to test whether/how the agent would follow the track given only coarse sensory information.
The agent was trained on tracks different from the one shown (track shown: length 800 m, width 4 m, curve radii 5.2 to 24 m). Training tracks included: (a) oval tracks with aspect ratio 10:1, (b) elliptical circuits, (c) pentagon to nonagon tracks. Tracks of type (b) and (c) had curve radii of 0 m and track widths of 3.8 to 5.0 m. Tracks of type (a) were 7.6 to 10 m wide with 400 m long straight segments.
Q values are stored in a table for the set of discrete states and actions.
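As a rough illustration of a tabular Q-value store, here is a minimal sketch of the standard Q-learning update. The learning rate `ALPHA`, the discount factor `GAMMA`, and the state and action labels are assumptions for illustration; the description does not give the values used in the video.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate (assumed; not given in the description)
GAMMA = 0.9   # discount factor (assumed; not given in the description)

# Q-table: maps (state, action) -> value; unseen entries default to 0.0
q_table = defaultdict(float)

def q_update(q, state, action, reward, next_state, next_actions):
    """Apply one standard Q-learning update and return the new Q-value."""
    # Value of the best action available in the next state
    best_next = max(q[(next_state, a)] for a in next_actions)
    td_target = reward + GAMMA * best_next
    # Move the stored value a fraction ALPHA towards the target
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]
```

Because states and actions are discrete, any hashable key (a tuple of bin indices, for example) can index the table directly.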
Possible actions are any combination of speed change and wheel rotation: speed-up one step (+), speed-down one step (-), maintain speed (.), turn wheel more left (<), turn wheel more right (>), leave wheel as-is (|).
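Three speed changes combined with three wheel changes give nine actions. The sketch below enumerates them; encoding each action as a two-character string built from the symbols above is an assumption made for illustration, not necessarily how the original code represents actions.

```python
from itertools import product

SPEED_CHANGES = ["+", "-", "."]   # speed up, slow down, maintain speed
WHEEL_CHANGES = ["<", ">", "|"]   # turn more left, turn more right, leave as-is

# Every combination of one speed change and one wheel change
ACTIONS = [s + w for s, w in product(SPEED_CHANGES, WHEEL_CHANGES)]
```

For example, `"+<"` would mean "speed up one step and turn the wheel more to the left".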
The state of the learning agent is the combination of: speed (-0.1, 0, 0.1, 0.3, 1.0, or 3.0 m per time step), nominal wheel direction (-35, -20, -10, -5, -2.5, 0, 2.5, 5, 10, 20, or 35 degrees), clearance at the rear, front left, rear left, front right, and rear right (each either <0.1 m or ≥0.1 m), and clearance at the left front and right front (<0.1 m, <0.3 m, <1 m, <2 m, <3 m, <10 m, or ≥10 m).
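The discretization can be sketched as a binning step. This is a minimal sketch, assuming five two-level clearance sensors and two seven-level ones as read from the description; the function names and the tuple layout are hypothetical.

```python
from bisect import bisect_right

SPEEDS = [-0.1, 0.0, 0.1, 0.3, 1.0, 3.0]                          # m per time step
WHEEL_ANGLES = [-35, -20, -10, -5, -2.5, 0, 2.5, 5, 10, 20, 35]   # degrees
COARSE_EDGES = [0.1]                            # two bins: <0.1 m, >=0.1 m
FINE_EDGES = [0.1, 0.3, 1.0, 2.0, 3.0, 10.0]    # seven bins, last is >=10 m

def clearance_bin(distance_m, edges):
    """Return the index of the bin that a clearance (in metres) falls into."""
    return bisect_right(edges, distance_m)

def encode_state(speed_idx, wheel_idx, coarse_m, fine_m):
    """Build a hashable state tuple from indices and raw clearances."""
    coarse = tuple(clearance_bin(d, COARSE_EDGES) for d in coarse_m)
    fine = tuple(clearance_bin(d, FINE_EDGES) for d in fine_m)
    return (speed_idx, wheel_idx) + coarse + fine

# Size of the state space under the assumed sensor counts (5 coarse, 2 fine)
N_STATES = len(SPEEDS) * len(WHEEL_ANGLES) * 2**5 * 7**2
```

Under these assumptions the table has 103,488 states, which is small enough to store explicitly.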
Rewards are inversely proportional to speed (note: all rewards are negative); a collision with the track boundaries yields -3000.
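One possible reading of this reward scheme is sketched below. Only the collision value of -3000 comes from the description. The constant `K`, the speed floor `MIN_SPEED` (needed because a speed of 0 is a valid state), and the use of the absolute speed for reverse driving are all assumptions.

```python
COLLISION_REWARD = -3000.0   # stated in the description
K = 1.0                      # proportionality constant (assumed)
MIN_SPEED = 0.1              # floor that avoids division by zero (assumed)

def reward(speed, collided):
    """Return a negative per-step reward that shrinks in magnitude with speed."""
    if collided:
        return COLLISION_REWARD
    # Faster driving gives a smaller penalty per time step
    return -K / max(abs(speed), MIN_SPEED)
```

With every step penalized, and slow steps penalized more, the agent is pushed to drive as fast as it can without hitting the boundaries.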
The discrete state is illustrated by a sketch (e.g., yellow zones are unobstructed areas), and the table shows the Q-values associated with the possible actions in the current state. Actions are chosen based on a greedy policy.
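Greedy selection simply picks the action with the highest Q-value in the current state. A minimal sketch follows, assuming the Q-values for one state are held in a dictionary keyed by hypothetical action labels; the numbers in the usage example are made up.

```python
def greedy_action(q_values):
    """Return the action with the highest Q-value; ties go to the first listed."""
    return max(q_values, key=q_values.get)

# Usage example with made-up Q-values for a single state
q_row = {"+|": -12.5, ".|": -10.2, "-<": -3000.0}
best = greedy_action(q_row)
```

Since all rewards are negative, the best action is the one whose value is closest to zero.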

Comments: 1
@yuheng1992, 7 years ago
Hi, are you able to share this code with me?