DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

Source: Personal Technology Collection

Patent content provided by Intellectual Property Publishing House

Patent title: DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING

Inventors: Honglak Lee, Shixiang Gu, Sergey Levine

Application number: US17050546

Filing date: 2019-05-17

Publication number: US20210187733A1

Publication date: 2021-06-24


Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
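The re-labeling step described in the abstract can be illustrated with a small sketch. The abstract does not say how the modified higher-level actions are chosen, so the selection rule here is an assumption: among a few candidate goals, keep the one under which the current lower-level policy best reproduces the lower-level actions that were actually logged. The names lower_policy, action_log_likelihood and relabel_goal, the toy linear policy, and the fixed goal over the segment are all hypothetical simplifications for illustration, not the patented method.

```python
import random


def lower_policy(state, goal, gain=1.0):
    """Hypothetical deterministic lower-level policy: act proportionally to the goal.

    The state is accepted for interface completeness but unused in this toy policy.
    """
    return [gain * g for g in goal]


def action_log_likelihood(states, actions, goal, policy):
    """Score a goal by how well `policy` reproduces the logged actions.

    Assumption: unit-variance Gaussian action noise, so the log-likelihood is
    (up to a constant) the negative half squared error.
    """
    total = 0.0
    for state, action in zip(states, actions):
        predicted = policy(state, goal)
        total -= 0.5 * sum((a - p) ** 2 for a, p in zip(action, predicted))
    return total


def relabel_goal(states, actions, original_goal, policy,
                 num_samples=8, noise=0.5, rng=None):
    """Pick a modified higher-level action (goal) for an old experience segment.

    Candidates (an assumed scheme): the original goal, the observed state
    change over the segment, and random perturbations of the original goal.
    """
    rng = rng or random.Random(0)
    candidates = [list(original_goal)]
    candidates.append([end - start for start, end in zip(states[0], states[-1])])
    for _ in range(num_samples):
        candidates.append([g + rng.gauss(0.0, noise) for g in original_goal])
    # max() returns the first best candidate, so ties keep the original goal.
    return max(candidates,
               key=lambda g: action_log_likelihood(states, actions, g, policy))


# Experience collected with an older lower-level policy (gain=1.0).
states = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
old_goal = [1.0, 0.0]
actions = [lower_policy(s, old_goal, gain=1.0) for s in states[:-1]]


# The lower-level policy has since changed (gain=2.0), so the goal is re-labeled.
def new_policy(state, goal):
    return lower_policy(state, goal, gain=2.0)


new_goal = relabel_goal(states, actions, old_goal, new_policy)
```

The re-labeled segment (states, new_goal, reward) would then be fed to whatever off-policy learner trains the higher-level policy. If the lower-level policy has not changed since data collection, this sketch leaves the original goal in place.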

Applicant: Google LLC

Address: Mountain View, CA, US

Nationality: US
