ESA GNC Conference Papers Repository
Learning-based motion control of a rover on unknown ground
This paper presents the current status and scientific approaches of the ESA project DeLeMIS, in which e:fs and DLR Robotics investigate state-of-the-art self-learning methods to enable improvements in planetary space missions. The main objective of DeLeMIS is the autonomous navigation of a rover on unknown terrain without human intervention. To this end, algorithms from the latest research in the AI and control engineering communities are used for environmental perception and behaviour control. On the one hand, the rover is supposed to learn the ground conditions of different areas and their boundaries, both from camera inputs and from the interaction of the wheels with the ground. On the other hand, the system is intended to learn the best behaviour and the most appropriate strategy for its motion and navigation across different ground surfaces, taking into account safety and robustness aspects. The overall goal is to achieve better motion behaviour, e.g. regarding the deviation from the desired trajectory, the energy consumption or the safety of the rover on unknown and challenging terrain. The motion control of the rover, the corresponding concepts, learning-based approaches and first results from the project are presented in this paper. Furthermore, the test and validation platform, including a simulation environment, is described; it will also be used to generate data for the ML approaches in future work. The considered scenario is based on DLR's LRU (Lightweight Rover Unit) in DLR's test facilities. A model of the LRU is also implemented in the simulation environment for rapid testing and data generation. The task of the motion control subsystem is to follow a given trajectory on partially known or unknown ground. It is desired that the task is performed better after every iteration, which is taken as an indication of learning. 
To isolate the effect of the learning-based controller, the position and the orientation of the rover are assumed to be exactly known. For easier and faster development during the project, a simulation environment was created, including a model of the LRU rover, a lunar surface and a flexible interface architecture for the integration of the different algorithms. This Gazebo simulator (https://gazebosim.org/), embedded in the ROS (https://www.ros.org/) framework, also serves as a data generator for the ML-based methods and as a test and verification platform for algorithm development. To increase the representativeness and reliability of the simulation, data from DLR's facilities will be gathered (a first test campaign is planned for February) and used to verify the behaviour of the simulated rover in relevant and comparable scenarios. To this end, the rover will drive pre-defined trajectories in DLR's test facility containing a new lunar soil simulant, the same as the one that will be used for the new DLR/ESA lunar test facility LUNA in Cologne. The resulting trajectory data will then be compared to those from the simulator, and the simulation parameters will be adjusted accordingly. The overall closed-loop system consists of several components, for instance the rover or its representing simulation model, sensors and perception, a low-level actuator controller and a high-level motion controller. The perception module is not within the scope of this paper. There are two control modules, each designed to be tested and analysed independently. Furthermore, the algorithms and interfaces are generic, so they are applicable to both the rover and the simulation. Static (non-learning) algorithms are implemented for tasks such as trajectory planning and interfacing the wheel actuators. Also, a non-linear, Lyapunov-based controller is applied for trajectory tracking control. 
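As an illustration of such a Lyapunov-based tracking law, a widely used Kanayama-style kinematic controller for a unicycle-type rover can be sketched as follows. This is a generic textbook formulation, not the project's actual implementation; the gains and the kinematic model are illustrative assumptions:

```python
import math

def tracking_control(pose, ref_pose, v_ref, w_ref, kx=1.0, ky=4.0, kth=2.0):
    """Kanayama-style Lyapunov-based tracking law for a unicycle model.

    pose, ref_pose: (x, y, theta) of the rover and of the reference point.
    v_ref, w_ref:   feed-forward linear/angular velocity of the reference.
    kx, ky, kth:    positive gains (illustrative values, not from the paper).
    Returns the commanded linear and angular velocity (v, w).
    """
    x, y, th = pose
    xr, yr, thr = ref_pose
    # Tracking error expressed in the rover's body frame
    ex = math.cos(th) * (xr - x) + math.sin(th) * (yr - y)
    ey = -math.sin(th) * (xr - x) + math.cos(th) * (yr - y)
    eth = thr - th
    # Control law; stability follows from a standard Lyapunov argument
    v = v_ref * math.cos(eth) + kx * ex
    w = w_ref + v_ref * (ky * ey + kth * math.sin(eth))
    return v, w
```

With zero tracking error the law reduces to the pure feed-forward commands (v_ref, w_ref); a longitudinal error adds a proportional velocity correction.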
In addition, three modules with learning-based components are involved. The perception module includes image recognition for segmentation and classification of the surface. The learning-based kinematics adjustment module manipulates the low-level wheel steering controller, and the learning-based motion control is applied on a high level. Below, first implementations of the two latter algorithms are presented and results are discussed. For the learning-based kinematics adjustment module, a first algorithm is implemented that manipulates the currently available wheel steering controller on the rover via an additive term to the input. For this purpose, an approximation function is trained on the simulated trajectory tracking errors to predict a corrective control action. Simulation results show that Gaussian Process Regression (GPR) is suitable as an approximation function for online learning and leads to improvement. For the learning-based motion controller, a learning-based model predictive controller (LBMPC) with a parametric model is developed for trajectory tracking. Simulation results show that the proposed controller is able to track a given trajectory while considering state and input constraints and improving performance over time. In summary, at the current stage of the project, a solid framework for the development of different learning-based algorithms has been created, including a simulation environment for data generation and verification of the learning-based methods. In addition, first promising approaches have been developed and implemented. The next steps are to replace the parametric approach of the LBMPC with a neural network, for which training data is being generated by the validated simulator. Finally, the results will be verified on the rover and compared to those from the simulation. At the end of the project, a test and demonstration campaign is planned at DLR's facilities.
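The idea of learning an additive steering correction with GPR can be sketched in a minimal form as follows. The class below implements plain GP posterior-mean regression with a fixed RBF kernel; the kernel hyperparameters, the input features and the interface are illustrative assumptions, not the project's implementation:

```python
import numpy as np

class GPRCorrector:
    """Minimal Gaussian Process Regression for learning an additive
    correction to a low-level steering controller from observed tracking
    errors. Fixed RBF kernel; hyperparameters are illustrative."""

    def __init__(self, length_scale=1.0, signal_var=1.0, noise_var=1e-3):
        self.l, self.s2, self.n2 = length_scale, signal_var, noise_var
        self.X = np.empty((0, 1))
        self.y = np.empty(0)

    def _kernel(self, A, B):
        # Squared-exponential (RBF) kernel between two sets of inputs
        d = A[:, None, :] - B[None, :, :]
        return self.s2 * np.exp(-0.5 * np.sum(d**2, axis=-1) / self.l**2)

    def update(self, x, err):
        # Store one (input, tracking error) pair observed online
        self.X = np.vstack([self.X, np.atleast_2d(x)])
        self.y = np.append(self.y, err)

    def correction(self, x):
        # GP posterior mean, used as the additive input term
        if len(self.y) == 0:
            return 0.0
        K = self._kernel(self.X, self.X) + self.n2 * np.eye(len(self.y))
        k = self._kernel(np.atleast_2d(x), self.X)
        return float((k @ np.linalg.solve(K, self.y))[0])
```

In an online setting the stored data set would additionally need to be bounded (e.g. a sliding window or sparse approximation), since the exact posterior scales cubically with the number of samples.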
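The parametric learning inside such an LBMPC can be illustrated, under our own simplifying assumptions, by a recursive least-squares (RLS) update of an additive model residual: the nominal prediction model is corrected by a learned term theta applied to a feature vector of the current state and input. The parameterisation, features and forgetting factor here are hypothetical sketch choices, not the paper's formulation:

```python
import numpy as np

class ParametricModelLearner:
    """Recursive least-squares estimate of an additive model residual,
    e.g. residual = x_next - f_nominal(x, u) ~ phi(x, u) @ theta.
    The MPC would use f_nominal(x, u) + phi(x, u) @ theta as its
    prediction model; the RLS details here are illustrative."""

    def __init__(self, n_feat, lam=0.99):
        self.theta = np.zeros(n_feat)       # learned parameters
        self.P = np.eye(n_feat) * 100.0     # inverse information matrix
        self.lam = lam                      # forgetting factor

    def update(self, phi, residual):
        # Standard RLS update with exponential forgetting
        Pphi = self.P @ phi
        gain = Pphi / (self.lam + phi @ Pphi)
        self.theta = self.theta + gain * (residual - phi @ self.theta)
        self.P = (self.P - np.outer(gain, Pphi)) / self.lam

    def predict(self, phi):
        # Learned residual for the current features
        return float(phi @ self.theta)
```

On consistent data the estimate converges to the true residual parameters, which is one way the prediction model, and hence the closed-loop tracking performance, can improve over time.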