Publication:
Towards a common implementation of reinforcement learning for multiple robotic tasks

dc.contributor.authorMartinez-Tenor, Angel
dc.contributor.authorAntonio Fernandez-Madrigal, Juan
dc.contributor.authorCruz-Martin, Ana
dc.contributor.authorGonzalez-Jimenez, Javier
dc.contributor.authoraffiliation[Martinez-Tenor, Angel] ETSI Informat, Inst Invest Biomed Malaga, Dept Ingn Sistemas & Automat, Machine Percept & Intelligent Robot MAPIR, Campus Teatinos,Blvd Luis Pasteur S-N, Malaga 29071, Spain
dc.contributor.authoraffiliation[Antonio Fernandez-Madrigal, Juan] ETSI Informat, Inst Invest Biomed Malaga, Dept Ingn Sistemas & Automat, Machine Percept & Intelligent Robot MAPIR, Campus Teatinos,Blvd Luis Pasteur S-N, Malaga 29071, Spain
dc.contributor.authoraffiliation[Cruz-Martin, Ana] ETSI Informat, Inst Invest Biomed Malaga, Dept Ingn Sistemas & Automat, Machine Percept & Intelligent Robot MAPIR, Campus Teatinos,Blvd Luis Pasteur S-N, Malaga 29071, Spain
dc.contributor.authoraffiliation[Gonzalez-Jimenez, Javier] ETSI Informat, Inst Invest Biomed Malaga, Dept Ingn Sistemas & Automat, Machine Percept & Intelligent Robot MAPIR, Campus Teatinos,Blvd Luis Pasteur S-N, Malaga 29071, Spain
dc.contributor.funderSpanish Government
dc.date.accessioned2023-02-12T02:20:49Z
dc.date.available2023-02-12T02:20:49Z
dc.date.issued2018-06-15
dc.description.abstractMobile robots are increasingly being employed for performing complex tasks in dynamic environments. Those tasks can be either explicitly programmed by an engineer or learned by means of some automatic learning method, which improves the adaptability of the robot and reduces the effort of setting it up. In this sense, reinforcement learning (RL) methods are recognized as a promising tool for a machine to learn autonomously how to do tasks that are specified in a relatively simple manner. However, the dependency between these methods and the particular task to learn is a well-known problem that has strongly restricted practical implementations in robotics so far. Breaking this barrier would have a significant impact on these and other intelligent systems; in particular, having a core method that requires little tuning effort for being applicable to diverse tasks would boost their autonomy in learning and self-adaptation capabilities. In this paper we present such a practical core implementation of RL, which enables the learning process for multiple robotic tasks with minimal per-task tuning or none. Based on value iteration methods, we introduce a novel approach for action selection, called Q-biased softmax regression (QBIASSR), that takes advantage of the structure of the state space by attending the physical variables involved (e.g., distances to obstacles, robot pose, etc.), thus experienced sets of states accelerate the decision-making process of unexplored or rarely-explored states. Intensive experiments with both real and simulated robots, carried out with the software framework also introduced here, show that our implementation is able to learn different robotic tasks without tuning the learning method. They also suggest that the combination of true online SARSA(lambda) (TOSL) with QBIASSR can outperform the existing RL core algorithms in low-dimensional robotic tasks. All of these are promising results towards the possibility of learning much more complex tasks autonomously by a robotic agent. (C) 2017 Elsevier Ltd. All rights reserved.
dc.identifier.doi10.1016/j.eswa.2017.11.011
dc.identifier.essn1873-6793
dc.identifier.issn0957-4174
dc.identifier.unpaywallURLhttp://arxiv.org/pdf/1702.06329
dc.identifier.urihttp://hdl.handle.net/10668/18773
dc.identifier.wosID427665100019
dc.journal.titleExpert systems with applications
dc.journal.titleabbreviationExpert syst. appl.
dc.language.isoen
dc.organizationInstituto de Investigación Biomédica de Málaga-IBIMA
dc.page.number246-259
dc.publisherPergamon-elsevier science ltd
dc.rights.accessRightsopen access
dc.subjectReinforcement learning
dc.subjectRobotics
dc.subjectExploration
dc.titleTowards a common implementation of reinforcement learning for multiple robotic tasks
dc.typeresearch article
dc.type.hasVersionSMUR
dc.volume.number100
dc.wostypeArticle
dspace.entity.typePublication
