Not Only Rewards But Also Constraints: Applications on Legged Robot Locomotion

7,038 views

RaiLab Kaist

10 months ago

arxiv.org/abs/2308.12517

Comments: 9
@TextZip 9 months ago
Hi, the video and the paper are really impressive. I wanted to know what simulation platform was used and whether the code will be made public?
@railabkaist9016 9 months ago
You can download the simulator at raisim.com/. The code will be made public after the paper is accepted.
@snuffybox 10 months ago
You should link the paper
@NowayJose14 9 months ago
It's in the description, chief
@snuffybox 9 months ago
@@NowayJose14 it wasn't when I commented
@user-le3uf8hw5d 10 months ago
Hello, I hope you're well. While my question isn't directly related to the paper, may I ask it here? I'm wondering how significant the reality gap is. Can you consistently expect every learned policy to transfer seamlessly to the actual robot? Additionally, would the behavior of the simulation and the real-world robot be nearly identical? Lastly, if one were to forgo the teacher-student structure, might there be a noticeable decrease in performance?
@user-pg6ym1th9 9 months ago
Thank you for your interest in our work. The main claim of this work is that both rewards and constraints should be used when designing learning-based controllers for complex robotic systems. We used the teacher-student learning framework, but the approach is not limited to it; other methods such as vanilla learning [1] or asymmetric learning [2] can also be used. In our experience, the sim-to-real gap depends far more on the characteristics of the robotic system you are working with (e.g., actuator mechanism, system software latency, actuation bandwidth) than on the learning algorithm itself. Based on those characteristics, you should select appropriate methods to close the sim-to-real gap (e.g., domain randomization, domain adaptation, actuator networks [1]). In our case (Raibo, Mini-cheetah), domain randomization was enough.
[1] Hwangbo, Jemin, et al. "Learning agile and dynamic motor skills for legged robots." Science Robotics 4.26 (2019): eaau5872.
[2] Nahrendra, I. Made Aswin, Byeongho Yu, and Hyun Myung. "DreamWaQ: Learning robust quadrupedal locomotion with implicit terrain imagination via deep reinforcement learning." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.
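A minimal, hypothetical sketch of the "rewards plus constraints" idea and of the simple domain randomization mentioned above (not the authors' code; all names, limits, and ranges are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)

# Assumed per-joint torque limit [Nm] and foot-slip limit [m/s]; illustrative only.
TORQUE_LIMIT = 30.0
FOOT_SLIP_LIMIT = 0.05

def step_signals(base_vel, target_vel, joint_torques, foot_slip_vel):
    """Return (reward, costs): the reward encodes the task, the costs encode constraint violations."""
    # Task reward: track the commanded base velocity.
    reward = -np.sum((base_vel - target_vel) ** 2)
    # Constraint costs: positive only when a limit is violated.
    torque_cost = np.maximum(np.abs(joint_torques) - TORQUE_LIMIT, 0.0).sum()
    slip_cost = max(foot_slip_vel - FOOT_SLIP_LIMIT, 0.0)
    return reward, np.array([torque_cost, slip_cost])

def randomize_dynamics(nominal_mass=25.0, nominal_friction=0.8):
    """Simple domain randomization applied at each episode reset (hypothetical ranges)."""
    mass = nominal_mass * rng.uniform(0.9, 1.1)
    friction = nominal_friction * rng.uniform(0.6, 1.2)
    return mass, friction

# Example usage with dummy values:
r, c = step_signals(base_vel=np.array([0.9, 0.0]),
                    target_vel=np.array([1.0, 0.0]),
                    joint_torques=np.array([10.0, 35.0, -5.0]),
                    foot_slip_vel=0.02)
print("reward:", r, "constraint costs:", c)
print("randomized (mass, friction):", randomize_dynamics())

A constrained-RL algorithm would maximize the reward while keeping the expected constraint costs below thresholds, rather than folding everything into a single reward signal.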
@user-le3uf8hw5d 9 months ago
@@user-pg6ym1th9 Thanks for the detailed discussion! I'm working on humanoid RL and was wondering if there could be a way to further reduce the reality gap. Thanks for your kind explanation. :)
@marshallmcluhan33 10 months ago
Oh thank god they don't just care about collecting more meaningless tokens.