Dec 11, 2023
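Q(s,a) is the expected return of taking action a in state s. See the one-step temporal difference relation, where r is the immediate reward, s' is the next state, and the discount rate (gamma) is between 0 and 1:
Q(s,a) = r + discount rate * V(s')
If the next state is random, take the expectation of the right-hand side over s'.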
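In case a concrete example is useful, here is a minimal sketch of that relation in Python. The state names, rewards, V values, and the function name q_value are all made up for illustration, and it assumes each action leads to a single known next state:

```python
# Hypothetical toy numbers, chosen only to illustrate the formula.
GAMMA = 0.9  # discount rate (assumed value)

# V[s]: current estimate of the value of each state (made-up values)
V = {"s0": 0.0, "s1": 1.0, "s2": 5.0}


def q_value(reward, next_state, values, gamma=GAMMA):
    """One-step estimate: Q(s,a) = r + gamma * V(s')."""
    return reward + gamma * values[next_state]


# Two hypothetical actions available in state s0:
q_left = q_value(reward=0.0, next_state="s1", values=V)   # 0 + 0.9 * 1
q_right = q_value(reward=1.0, next_state="s2", values=V)  # 1 + 0.9 * 5

# A greedy agent would pick the action with the larger Q.
best_action = "right" if q_right > q_left else "left"
```

With stochastic transitions you would average r + gamma * V(s') over the possible next states, weighted by their probabilities.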
Hope this helps!