Q Learning Algorithm Explained

4,513 views

د. محمد زيدان

A day ago

Comments: 17
@roozy_313a2 2 months ago
May God have mercy on your parents, Doctor 🌺
@MK-cu1se 2 months ago
❤❤❤
@shahd.h6430 1 year ago
Episode 1: state = room 4, action = room 0: Q(4,0) = 100 + 0.1*0 = 100
Episode 2: state = 3, action = 4: Q(3,4) = 0 + 0.1*100 = 10
Episode 3: state = 1, action = 3: Q(1,3) = 0 + 0.1*10 = 1
So the path goes from room 1 to room 3 to room 4 to room 0, which is the goal.
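The three updates in the answer above can be reproduced in a few lines of Python. This is a minimal sketch, not the code from the video: it assumes the simplified update Q(s, a) = R(s, a) + gamma * max Q(s', ·) with gamma = 0.1, a reward of 100 for entering room 0, and only the three moves this comment mentions. The names `REWARDS`, `q` and `update` are hypothetical.

```python
GAMMA = 0.1  # discount factor used throughout the comments

# (state, action) -> reward; taking action a means moving to room a.
# Only the moves named in the comment are listed (assumption).
REWARDS = {(4, 0): 100, (3, 4): 0, (1, 3): 0}

q = {}  # Q-table; entries that are missing count as 0


def update(state, action):
    """Apply Q(s,a) = R(s,a) + GAMMA * max_a' Q(s',a') and return the new value."""
    next_state = action
    best_next = max(
        (value for (s, _), value in q.items() if s == next_state),
        default=0.0,
    )
    q[(state, action)] = REWARDS[(state, action)] + GAMMA * best_next
    return q[(state, action)]


# Same order as the three episodes in the comment.
for step in [(4, 0), (3, 4), (1, 3)]:
    print(step, update(*step))
```

The order matters here: Q(3,4) only becomes 10 because Q(4,0) was already set to 100 in the previous episode.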
@youssef-ns9ny 1 year ago
Youssef Mohamed
The quiz consists of three episodes, and the goal is to determine certain Q-values. In episode 1, Q(4,0) is found to be 100, which completes the episode. In episode 2, Q(5,4) is found to be 10, Q(4,0) is still 100, and the episode is completed. Similarly, in episode 3, Q(3,4) is found to be 10, Q(4,0) is still 100, and the episode is completed. The final path to the goal, room 0, goes from room 1 to room 3 to room 4 and then to room 0.
@yosrmahmod5473 1 year ago
Episode 1: initial state at room 1
Q(1,3) = 0 + 0.1*0 = 0
Q(3,4) = 0 + 0.1*100 = 10
Episode 2: initial state at room 4
Q(4,0) = 100 + 0.1*0 = 100
Episode 3: initial state at room 5
Q(5,4) = 0 + 0.1*100 = 10
Path: 5 to 4 to 0
@jawaheralbaddawi 8 months ago
What do we do if we start from room 0?
@user-ie3hr9ez9u 1 year ago
Episode 1: Q(4,0) = 100 + 0.1*0 = 100
Episode 2: Q(3,4) = 0 + 0.1*100 = 10
Episode 3: Q(1,3) = 0 + 0.1*10 = 1
So the path goes from room 1 to 3 to 4 to 0, and that's our goal.
@user-mu7ht8lj1f 1 year ago
Abdallah Ali Ahmed
In episode 1, the quiz begins with the initial state at room 1. Q(1,3) is calculated to be 0, where the reward obtained from moving from (1,3) to (3,4) is 0 and the discount factor is 0.1. Q(3,4), on the other hand, is determined to be 10, where the reward obtained from moving from (3,4) to the terminal move (4,0) is 100 and the discount factor is 0.1.
In episode 2, the quiz starts with the initial state at room 4. Q(4,0) is calculated to be 100, where the reward obtained from moving on from the terminal move (4,0) is 0 and the discount factor is 0.1.
In episode 3, the quiz begins with the initial state at room 5. Q(5,4) is determined to be 10, where the reward obtained from moving from (5,4) to the terminal move (4,0) is 100 and the discount factor is 0.1.
To reach the goal at room 0, the final path involves moving from (5,4) to (3,4) to (4,0).
Final path: from 5 to 4 to 0.
Thank you.
@nadaabdo7830 1 year ago
Nada Abdelregal
The answer to the quiz is:
Episode 1: state: room 4, action: room 0
Q(4,0) = 100, which is the goal
Episode 2: state: room 5, action: room 4
Q(5,4) = 0 + 0.1*max(100, 0, 0) = 10
Episode 3: state: room 3, action: room 4
Q(3,4) = 0 + 0.1*max(100, 0, 0) = 10
So the path goes from room 1 to room 3 to room 4 to room 0, which is the goal.
@AhmedAli-sz9vi 1 year ago
Ahmed Ali
The answer to the quiz:
Episode 1: Q(4,0) = 100 (goal), then finish the episode.
Episode 2: Q(5,4) = 10, Q(4,0) = 100 (goal), then finish the episode.
Episode 3: Q(3,4) = 10, Q(4,0) = 100 (goal), then finish the episode.
The path in the end goes from room 1 to room 3 to room 4 to the goal, room 0.
@mouradfakhfakh4823 5 months ago
Would you please share the Python code? Thanks!
@user-cr3ly1xy8b 1 year ago
Mai Mohamed
The answer to the quiz is:
Episode 1: state: room 4, action: room 0
Q(4,0) = 100, which is the goal
Episode 2: state: room 5, action: room 4
Q(state, action) = R(state, action) + Gamma * max(Q(next state, all actions))
Q(5,4) = 0 + 0.1*max(100, 0, 0) = 10
Episode 3: state: room 3, action: room 4
Q(state, action) = R(state, action) + Gamma * max(Q(next state, all actions))
Q(3,4) = 0 + 0.1*max(100, 0, 0) = 10
So the path goes from room 1 to room 3 to room 4 to room 0, which is the goal.
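In standard notation, the rule quoted in this comment can be written as below. Here s' is the room reached by taking action a from room s, and gamma = 0.1 as in the comments. This is the simplified form with no separate learning rate (equivalently, a learning rate of 1). The values 100, 0, 0 are the commenter's Q-values for the actions available from room 4.

```latex
Q(s, a) = R(s, a) + \gamma \max_{a'} Q(s', a')
\qquad\Rightarrow\qquad
Q(5, 4) = 0 + 0.1 \cdot \max(100, 0, 0) = 10
```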
@user-zu3xc8bn2k 1 year ago
Episode 1: state = room 4, action = room 0: Q(4,0) = 100 + 0.1*0 = 100
Episode 2: state = 5, action = 1: Q(5,1) = 0 + 0.1*0 = 0
Episode 3: state = 3, action = 4: Q(3,4) = 0 + 0.1*100 = 10
Then finish the episode.
@manarmohamed2923 1 year ago
Manar Mohamad
The answer to the quiz is:
Episode 1
Random state: room 3, action 4
Update Q matrix: Q(3,4) = 0 + 0.1*100 = 10
4 is not a goal, so choose state 4, action 0
Q(4,0) = 100 + 0.1*0 = 100
0 is a goal -> Episode 1 finished
--------------------------
Episode 2
Random state: room 5, action 4
Update Q matrix: Q(5,4) = 0 + 0.1*100 = 10
4 is not a goal, so choose state 4, action 0
Q(4,0) = 100 + 0.1*0 = 100
0 is a goal -> Episode 2 finished
--------------------------
Episode 3
Random state: room 4, action 0
Update Q matrix: Q(4,0) = 100 + 0.1*0 = 100
0 is a goal -> Episode 3 finished
--------------------------
Start from state 5 and choose the max value each time.
Optimal path: 5 to 4, then 4 to 0 [5-4-0]
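The last step of this answer, reading the path off the Q matrix by always choosing the max value, can be sketched in Python as follows. This is an illustration under assumptions, not the video's code: the Q-values are copied from the comment, Q(5,1) = 0 is borrowed from another answer in this thread so that room 5 has two candidate moves, and the names `Q` and `greedy_path` are hypothetical.

```python
GOAL = 0

# Q-values as stated in the comment, plus Q(5,1) = 0 from another answer
# in this thread so that room 5 has two candidate moves (assumption).
Q = {(3, 4): 10, (5, 4): 10, (5, 1): 0, (4, 0): 100}


def greedy_path(q, start, goal=GOAL, max_steps=10):
    """Follow the highest-valued action from each room until the goal."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_steps:
        # Collect the known actions from the current room and their values.
        candidates = {a: v for (s, a), v in q.items() if s == state}
        if not candidates:
            break  # no known move from this room
        state = max(candidates, key=candidates.get)  # action = next room
        path.append(state)
    return path


print(greedy_path(Q, 5))  # [5, 4, 0]
```

The `max_steps` guard is there only so the loop stops if the table ever contains a cycle that never reaches the goal.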
@MoatazMEwis 1 year ago
Here is the quiz, prof. Mohamed: drive.google.com/file/d/12cYcErLIkQWl3pyGn-ya5N6hSftSJBTp/view?usp=drivesdk
@gamaladel9308 1 year ago
❤❤