
DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

Source: 个人技术集锦 (Personal Technology Collection)
Patent content provided by Intellectual Property Publishing House

Patent title: DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

Inventors: Honglak Lee, Shixiang Gu, Sergey Levine

Application number: US17050546

Filing date: 2019-05-17

Publication number: US20210187733A1

Publication date: 2021-06-24


Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
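The re-labeling step can be illustrated with a small sketch. The abstract does not say how a modified higher-level action is chosen, so the rule below is an assumption made only for illustration: among a few candidate goals, pick the one under which the current lower-level policy comes closest to reproducing the lower-level actions stored in the experience data. The names `relabel_goal` and `make_policy`, the fixed goal per segment, and the toy proportional policy are all hypothetical and are not taken from the patent.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical signature: a lower-level policy maps (state, goal) to an action.
LowerPolicy = Callable[[Vector, Vector], Vector]


def relabel_goal(
    states: Sequence[Vector],
    logged_actions: Sequence[Vector],
    candidate_goals: Sequence[Vector],
    lower_policy: LowerPolicy,
) -> Vector:
    """Return the candidate goal whose induced actions best match the logged ones.

    Assumed selection rule (not stated in the abstract): minimize the summed
    squared error between the logged lower-level actions and the actions the
    current lower-level policy would take for that goal.
    """

    def mismatch(goal: Vector) -> float:
        total = 0.0
        for state, action in zip(states, logged_actions):
            predicted = lower_policy(state, goal)
            total += sum((p - a) ** 2 for p, a in zip(predicted, action))
        return total

    return min(candidate_goals, key=mismatch)


def make_policy(gain: float) -> LowerPolicy:
    """Toy lower-level policy: move toward the goal, scaled by `gain`."""
    return lambda state, goal: [gain * (g - s) for g, s in zip(goal, state)]


# Experience was collected with an older lower-level policy (gain 0.5).
old_policy = make_policy(0.5)
# The lower-level policy has since changed (gain 1.0).
new_policy = make_policy(1.0)

states = [[0.0], [0.5]]
original_goal = [2.0]
logged_actions = [old_policy(s, original_goal) for s in states]

# Candidates include the original higher-level action plus alternatives.
candidates = [original_goal, [1.5], [1.0]]
relabeled_goal = relabel_goal(states, logged_actions, candidates, new_policy)
# relabeled_goal is [1.0]: under the new policy, that goal best explains
# the logged actions, so it replaces [2.0] in the higher-level transition.
```

In this toy setup the stored higher-level action `[2.0]` no longer explains the logged behavior once the lower-level policy has changed, so the higher-level transition would be trained with `[1.0]` instead. A real system would likely use a learned policy and a likelihood-based score rather than the squared error assumed here.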

Applicant: Google LLC

Address: Mountain View CA US

Nationality: US
