Online-Learning in Humanoid Robots

Conradt, Jörg


Summary

Abstract:
Humanoid robotic systems have gained increasing significance in the research world within the last few years. Just five years ago, there were hardly any human-like robots in the world, and those available did not represent human properties at all: they neither looked nor behaved like human beings. Today, a variety of research groups around the world are starting to work on topics related to humanoid robots, and it is very likely that these robots will become important within the coming decades, even beyond the realm of science.
As a first draft of a definition, robots may be called humanoid if they are able, to some extent, to live in and interact with the everyday human world, and if they exhibit certain human features, such as cognitive abilities or the ability to act. The main strength of such humanoid robots lies in their ability to operate in surroundings that were designed for humans in the first place. Humanoid robots could become useful assistants for everyday life in areas as diverse as:
- Rescue and clearing of dangerous situations.
- Janitorial services, Housekeeping.
- Security services.
- Care-taking in hospitals, recreational facilities.
- Entertainment.
In all these fields, close interaction with humans is a core issue and can be regarded as the common denominator. The interaction happens on many different levels, from physical touch to gesture recognition and the processing of spoken language. On cognitive issues like the last two, much research has been done in the past few years. One has to keep in mind, however, that physical appearance, e.g. smoothness of motion, is also an important issue when designing humanoid robots.

Table of Contents:
Foreword
1. Introduction
1.1 Introducing the Area of Humanoid Robotics
1.2 Mechanical Design for Humanoid Robots
1.3 Controlling Humanoid Robots
1.4 Examples of Today's Humanoid Robots
1.5 The Purpose of the Thesis
2. A Brief Recapitulation of Basic Robot Control
2.1 Introduction to Robot Control
2.2 Controlling the Execution of Desired Trajectories
2.3 The Feedback Control Function
2.4 The Feed-Forward Control Function
2.5 Estimating Dynamics Using Rigid Body Assumptions
2.6 Recapitulation
2.7 Control of Humanoid Robots
3. Introduction to Robot Learning
3.1 General Remarks on Robot Learning
3.2 The Bias / Variance […]

Excerpt

ID 4858

Conradt, Jörg: Online-Learning in Humanoid Robots / Jörg Conradt -

Hamburg: Diplomica GmbH, 2001

Also: Berlin, Technische Universität, diploma thesis, 2001


Diplomica GmbH

http://www.diplom.de, Hamburg 2001

Printed in Germany


Table of Contents

Foreword
1. Introduction
1.1 Introducing the Area of Humanoid Robotics
1.2 Mechanical Design for Humanoid Robots
1.3 Controlling Humanoid Robots
1.4 Examples of Today's Humanoid Robots
1.5 The Purpose of the Thesis
2. A Brief Recapitulation of Basic Robot Control
2.1. Introduction to Robot Control
2.2. Controlling the Execution of Desired Trajectories
2.3. The Feedback Control Function
2.4. The Feed-Forward Control Function
2.5. Estimating Dynamics Using Rigid Body Assumptions
2.6. Recapitulation
2.7. Control of Humanoid Robots
3. Introduction to Robot Learning
3.1 General Remarks on Robot Learning
3.2 The Bias / Variance Tradeoff
3.3 Global versus Local Learning Strategies
3.4 The Curse of Dimensionality
3.5 Online-Learning
3.6 Learning Inverse Dynamics
3.7 Results
4. The Learning Algorithm LWPR
4.1 Advantages of Learning Approaches
4.2 Desired Algorithmic Properties
4.3 Input Data Preprocessing
4.4 Predicting Output Data using LWPR
4.5 Learning Parameters
4.6 The Final LWPR Algorithm
4.7 Properties of LWPR
5. Evaluation of LWPR's Performance on Artificial Data
5.1 Introduction
5.2 One-Dimensional Function Fitting
5.3 Two-Dimensional Function Fitting with Shifting Input Distributions
5.4 Limiting the Computational Complexity
5.5 Discussion
6. The Humanoid Robot
6.1 Hardware for Robot Control
6.2 Software for Robot Control
6.3 The Sarcos Robotic […]
7. Evaluation of the Arm's Motion
7.1 Generating Trajectories for Learning
7.2 Learning Inverse Dynamics
7.3 Learning Results on Sampled Data
7.4 Verifying the Results on Real Motion
8. Conclusion
8.1 Discussion
8.2 Brief Introduction to Using LWPR for Inverse Kinematics Estimation
8.3 Future Research
References
Short German Summary
Statutory Declaration (Eidesstattliche Erklärung)

Foreword

The basis for this thesis was laid in the academic year 1999/2000, which I spent as an

exchange student at the University of Southern California (USC), Los Angeles, and as a

guest at the CLMC laboratory. The Computational Learning and Motor Control

laboratory (CLMC laboratory), headed by Dr. Stefan Schaal, is pursuing research in the

border areas of neuroscience, computer science, statistics, and robotics (control theory).

The focus of the laboratory is to study the principles of biological behavior.

In recent years, researchers have realized that even simple biological systems achieve far superior sensory-motor competence compared to any artificial system today. On this basis, trying to understand and imitate these biological principles in artificial systems seems to be a promising approach. In this thesis, the motor control for a human-like robotic arm was learned based on a biologically plausible model.

During my time at USC, I had the opportunity to participate in the research activities of the CLMC laboratory. I am very thankful for the discussions, the help, and the wealth of interesting information I received there. Dr. Stefan Schaal and Dr. Sethu Vijayakumar in particular have been patient and good-humored, and with their generosity with respect to equipment and other resources they have tremendously eased my start in the new environment. I sincerely hope that the time at the CLMC laboratory has created a link that will last much longer. The students belonging to the CLMC lab, Gaurav Tevatia and Biren Metha, also deserve my gratitude for discussing the first results and offering generous help on various topics.

I am also particularly grateful to the Fulbright Commission for providing funds to

stay for one year at an American university. I could never have studied in the U.S.

without their generous support.

Prof. Dr. Hommel from the TU Berlin was the person who first brought me into

contact with the field of robotics. I especially wish to thank him for the encouragement

and the careful supervision of this thesis. In addition, he kindly offered advice on all

questions of research, by no means limited to the scope of this thesis.

Furthermore, I owe thanks to all the people who have read and commented on

parts of earlier drafts and helped to improve the thesis.

1. Introduction

1.1 Introducing the Area of Humanoid Robotics

Humanoid robotic systems have gained increasing significance in the research world within the last few years. Just five years ago, there were hardly any human-like robots in the world, and those available did not represent human properties at all: they neither looked nor behaved like human beings. Today, a variety of research groups around the world are starting to work on topics related to humanoid robots, and it is very likely that these robots will become important within the coming decades, even beyond the realm of science.

As a first draft of a definition, robots may be called humanoid if they are able, to some extent, to live in and interact with the everyday human world, and if they exhibit certain human features, such as cognitive abilities or the ability to act. The main strength of such humanoid robots lies in their ability to operate in surroundings that were designed for humans in the first place. Humanoid robots could become useful assistants for everyday life in areas as diverse as

· Rescue and clearing of dangerous situations

· Janitorial services, Housekeeping

· Security services

· Care-taking in hospitals, recreational facilities

· Entertainment

In all these fields, close interaction with humans is a core issue and can be regarded as the common denominator. The interaction happens on many different levels, from physical touch to gesture recognition and the processing of spoken language. On cognitive issues like the last two, much research has been done in the past few years. One has to keep in mind, however, that physical appearance, e.g. smoothness of motion, is also an important issue when designing humanoid robots.

1.2 Mechanical Design for Humanoid Robots

Given the close interaction with humans and the potential working spaces listed above,

some core requirements for the design of humanoid robots can be set out:

First of all, humanoid robots need the ability to act in environments tailored for

human needs and to operate devices originally designed for humans, e.g. when turning

knobs. Therefore, they need the basic equipment, e.g. manipulator arms, and legs, which

can perform independent tasks. Their movements have to be fast and accurate, and their

grip fine and powerful. Their links have to be lightweight to reduce the influence of

inertia. It is also desirable to have compliant joints, because only these will allow the

robot to react flexibly to external stimuli. This might be the case when the robot is pushed

aside during the performance of a task. A natural result of building robots according to

these prerequisites is a high degree of redundancy in the whole system, an issue that

has to be handled appropriately.

Moreover, humanoid robots have to be equipped with sensory and processing

capabilities to interact with their environment, and they must be mobile within a wide

range. In general, their mechanical design has to ensure that they can operate in normal

living environments without extensive modifications of these surroundings.

With these requirements, the mechanical design of humanoid robots differs

substantially from that of today's robots, e.g. robots used for manufacturing. Such

industrial robots are obviously designed according to very different needs: One of their

main purposes is to repeat tasks with high accuracy. This requires stiff joints, solid links,

and strong actuators. To allow simple control, they are designed to behave as linearly as

possible.

The amount of compliance in the joints is a particularly important issue for

biologically plausible motion and shall now be set out in some more detail. As has been

said above, compliance in the joints is not only important for the smoothness of motion.

It is above all the central prerequisite for the robot's ability to react appropriately to

external stimuli. Such stimuli occur frequently in natural environments and it seems to be

one of the most remarkable abilities of living organisms to react to such changes and

adapt their own behavior and actions. Such changes might be local obstacles encountered while a task is performed, or pushes. Consider shaking hands with the robot: if the robot has stiff joints, the handshake cannot be performed well, because the human has to follow the robot's desired motion exactly. If the joints are compliant, much more variation in the movement is possible: a human partner can always force the robot to adjust its own position slightly.

On the mechanical level, motion in the joints can be achieved by different means.

The two major approaches are electric motors combined with gearboxes, and hydraulic

actuation. When using electric motors, gearboxes are a necessary complement, because

only they provide sufficient torques for the robot's movements. Gearboxes, however, increase the stiffness in all joints by a considerable amount, since they are not back-drivable. Therefore, motors with gearboxes are unsuitable as actuators for humanoid robots.

Alternatively, robots can be equipped with hydraulics to generate high torques for the joints. When hydraulics are used together with load sensors in every joint, the joints' behavior can be anywhere between very stiff and very compliant, depending only on the controller. This means that the problem of compliance is moved to the software level.

The only problem that remains is that any increase of compliance goes along with

a decrease of accuracy. It is therefore highly important to find a good tradeoff between

accuracy of motion and compliance in the joints. The quality of a controlling method can

be directly correlated with its ability to find such a balance.

1.3 Controlling Humanoid Robots

Let us assume that all the mechanical requirements described in chapter 1.2 can be fulfilled, and that the problems with stiff joints and heavy material used in the links can be solved. There still remains a major problem: by what means could such a robot be controlled? What kind of algorithms allow the robot to use the whole variety of motion that is usually associated with biological motion? For example, how can a humanoid robot be made to give way to external forces, such as pressure exerted through contact with humans? And given that the desired compliance is achieved using lightweight material as described in chapter 1.2: how can we cope with the constraints that such materials impose on the control algorithms? Traditional algorithms, e.g. those based on rigid body dynamics assumptions, are not well suited to control such mechanics. A fairly novel way to solve the problem of controlling the robot is the application of learning approaches. The major advantage of a learned control strategy is that it adapts the control scheme based on how the system actually behaves. This means that a well-suited learning algorithm can stay accurate even when the system's behavior changes.

1.4 Examples of Today's Humanoid Robots

In this chapter, I would like to provide a short overview of today's humanoid robots, and

present some examples.

Probably the best-known robot was built by Honda during the last 10 years (e.g.

Hirai, 1987, Hirai, Hirose, et al., 1998). Figure 1.1 shows a picture of the robot, which

looks very much like a human.

Figure 1.1 The Honda Robot P3: a) Frontal View b) Side View c) Technical Diagram

The main goal of Honda's development was to create an autonomous robot that can move about in a human environment. The robot can walk, climb stairs, and manipulate simple objects. Every single step, however, needs to be carefully programmed in advance. As soon as more complex tasks arise, the robot has to be switched to tele-operation mode and is controlled by a human supervisor. Correspondingly, the robot is highly stiff in all joints and its sensing capabilities are very limited. The size of the stair steps it has to climb, for example, has to match the preprogrammed values exactly. Very little visual feedback or other sensor information is used to correct for unexpected changes in the environment. This robot looks like a human, but does not at all behave like a humanoid robot.

The robot's technical parameters are summarized in the following table:

Weight: 130 kg
Height: 160 cm
Width: 60 cm
Depth: 55.5 cm
Walking speed: 2.0 km/h, approx. 25 minutes of autonomous walking
Actuation: DC servo motors with gears
Lift capacity per hand: 2 kg
Degrees of freedom: 28 (legs: 2x6, arms: 2x7, hands: 2x1)

Table 1.1 Technical Parameters of the Honda Humanoid Robot

Another example of a humanoid robot is COG, a robot torso developed at MIT. A picture

of COG is shown in figure 1.2:

Figure 1.2 COG, the Robot Torso developed at MIT

COG is used to study the question of how far a humanoid robot can become 'cognitive'; therefore, the focus of research is on the robot's interaction with people. The robot recognizes what people want and acts accordingly, sometimes also following its own 'desires'. It is not designed for complicated motion. Though the robot's physical behavior is very different from that of humans, experiments have shown that it can help to understand the way people interact with each other. For the near future, the research objective of the development team is to achieve a robot perception that is comparable to that of a six-month-old baby.

The third example for a humanoid robot is DB (Dynamic Brain), a robot at the

ATR human resources research laboratories in Kyoto, Japan. DB is a hydraulically

actuated anthropomorphic robot with legs, arms (with hand palms but without fingers), a

jointed torso, and a head. It was designed and built by Sarcos, a company that usually

builds tele-operated robots for entertainment purposes like movies and amusement parks.

The robot offers a variety of motions; however, it is mounted on a pelvis, so free standing

or walking experiments cannot be performed. The research performed at the ATR

laboratories focuses on upper-body movement (e.g. arm motion).

DB's technical parameters are summarized in the following table:

Weight: 80 kg
Height: 185 cm
Width: 60 cm
Depth: 35 cm
Actuation: hydraulic actuators with 650 psi pumps
Actuators: 25 linear actuators, 5 rotary actuators
Degrees of freedom: 30 (legs: 2x3, trunk: 3, arms: 2x7, neck: 3, eyes: 2x2)
Sensors: position and load sensors in every degree of freedom (except eyes)
Vision: video cameras providing stereo fovea and panoramic view

Table 1.2 Technical Parameters of the DB robot

Comparing DB's parameters with those of the Honda robot, one can easily recognize that

DB represents human properties much better, especially in terms of the weight and size.

Moreover, being actuated by hydraulic pressure, the joints can be controlled in a

compliant way, as discussed in chapter 1.2. Figure 1.3 shows the robot.

Figure 1.3 a) DB balancing a pole, b) DB reading 'Science'

Other laboratories where humanoid robots are developed and much research is done include

the University of Tokyo, Waseda University, and Vanderbilt University.

Figure 1.4 a) Hadaly and b) Wabian from Waseda University

1.5 The Purpose of the Thesis

The following study focuses on the control problem of compliant joints used in humanoid

robots. As described in chapter 1.2, humanoid robots should be designed and operated

with compliant joints. These joints allow biologically plausible motion, i.e. motion that is

efficient, differentiated, energetically economic, and offers a wide variety of positions. In

addition, it enables the robot to interact in an environment shared with humans. For

example, it minimizes the risk of injuring people during interaction. The problem is that controlling compliant joints is extremely difficult when the motion must be accurate at the same time. Traditional linear high-gain control methods are not suitable for compliant control, and model-based nonlinear control with rigid body dynamics models is often too inaccurate (see the further discussion in chapter 2.5).

My thesis offers a new approach to control the motion of humanoid robots. This

method is based on neural net learning techniques, and leads to superior motion

performance compared to traditional approaches. My test case is learning the inverse

dynamics model of a seven degree-of-freedom robot arm. This anthropomorphic robot

arm was built from lightweight materials to resemble human arm dynamics. It is

hydraulically actuated and has load sensors in each joint to allow compliance.

The learned inverse dynamics model can be used in a computed torque feed-

forward controller. In contrast to traditional feed-forward control, the new controller

increases the accuracy of the arm motion to a significant degree without reducing its

compliance. The results show that with the help of this new approach, natural motion of a

human-like robot can be achieved with high accuracy.

My research aims to help close a gap in the robot control framework. Today, broad research efforts are spent on algorithms that decide what a robot does in a given situation. But even if a robot has decided what to do, it may not know how to execute the desired motion accurately. Motion execution using compliant joints becomes more relevant as robots start to interact with people.

2. A brief Recapitulation of basic Robot Control

2.1. Introduction to Robot Control

To move a manipulator (e.g. a robot arm) from one discrete state to another, desired discrete state, an appropriate control command (e.g. the current for an electric motor) has to be generated continuously during the robot's motion. This is achieved by updating the previous command at discrete time steps whose length equals the reciprocal of the control loop's frequency: $\Delta t = 1/f$. The control loop usually runs at a high frequency to allow fast command updates and accurate motion.

To generate commands that achieve a desired robot behavior, one has to distinguish three different phases: (1) planning a trajectory in end-effector space, (2) translating the desired trajectory into joint space, and (3) executing the planned trajectory. Concerning the first phase, let us assume throughout this thesis that a desired trajectory in end-effector space exists. A desired behavior can be relatively simple to plan, e.g. a point-to-point reaching movement. It can be planned using a straight line as the desired line of motion and adding constraints, such as bell-shaped velocity and acceleration profiles. When these constraints are known, it is possible to determine all coefficients of an n-th order polynomial that describes the desired fingertip position between start and end position. Typically, third or fifth order polynomials are used for planning. They have 4 or 6 coefficients and therefore allow 4 or 6 constraints in total. These constraints are usually given by the start and end positions and the start and end velocities. The start and end accelerations can be added if 6 unknowns are used. A path in end-effector space given by a set of via-points can be decomposed into several short paths, each leading from one via-point to the next.
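The third-order case described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the thesis: the function names `cubic_coefficients` and `evaluate`, the scalar (one-coordinate) setting, and the example boundary values are assumptions made here. The four coefficients follow directly from the four constraints (start and end position, start and end velocity).

```python
def cubic_coefficients(p0, p1, v0, v1, duration):
    """Coefficients a0..a3 of p(t) = a0 + a1*t + a2*t**2 + a3*t**3 on [0, duration].

    The four constraints are the start/end positions (p0, p1) and the
    start/end velocities (v0, v1), as in the third-order case in the text.
    """
    d = p1 - p0
    a0 = p0
    a1 = v0
    a2 = (3.0 * d - (2.0 * v0 + v1) * duration) / duration**2
    a3 = (-2.0 * d + (v0 + v1) * duration) / duration**3
    return a0, a1, a2, a3


def evaluate(coeffs, t):
    """Return (position, velocity) of the cubic polynomial at time t."""
    a0, a1, a2, a3 = coeffs
    position = a0 + a1 * t + a2 * t**2 + a3 * t**3
    velocity = a1 + 2.0 * a2 * t + 3.0 * a3 * t**2
    return position, velocity


# Hypothetical example: move one coordinate from 0 to 1 in 2 seconds,
# starting and ending at rest.
coeffs = cubic_coefficients(0.0, 1.0, 0.0, 0.0, 2.0)
midpoint = evaluate(coeffs, 1.0)
```

A path through several via-points could reuse the same function once per segment, passing the velocity at each via-point as the end velocity of one segment and the start velocity of the next.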

The second part of robot control, transferring the desired trajectory from end-effector space into joint space, is called the inverse kinematics problem. To solve this problem, a coordinate transformation from Cartesian space into joint space is needed. If we define the intrinsic coordinates of a manipulator as the n-dimensional vector $\theta$ and the position and orientation of the manipulator's end-effector as the m-dimensional vector $x$, the kinematics function can be written as $x = f(\theta)$. What we need here is the inverse relationship: $\theta = f^{-1}(x)$.

Footnote: It is also possible to plan a trajectory directly in actuator space, without the need to translate it. However, this is hardly done in practice, as the joint space of a reasonably sized robot is huge and not easy for humans to understand.

There are two general approaches to solving inverse kinematics problems with optimization criteria: (1) global methods, which find an optimal path of $\theta$ with respect to the entire trajectory, and (2) local methods, which only compute an optimal change in $\theta$, $\Delta\theta$, for a small change in $x$, $\Delta x$. In the latter case, one has to integrate $\Delta\theta$ to generate the entire joint space trajectory. An example of a local method is Resolved Motion Rate Control (e.g. Whitney, 1969). It uses the Jacobian $J$ of the forward kinematics to describe a change of the end-effector's position by $\Delta x = J(\theta)\,\Delta\theta$. This equation can be solved for $\Delta\theta$ by taking the inverse of $J$ if it is square, i.e. $m = n$, and non-singular. For redundant manipulators (hence for almost all robots), solutions to the inverse equation are usually non-unique (Craig, J. 1986), so that additional optimization criteria have to be introduced. Alternatively, other methods can be used to invert $J$, such as the pseudo-inverse method (e.g. Liegeois, 1977) or the Extended Jacobian Method (Baillieul, J., 1985). There exists a variety of literature on inverse kinematics.
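The local method can be illustrated on the simplest non-redundant case. The sketch below is an assumption-laden toy example, not the thesis's implementation: it uses a hypothetical planar arm with two joints and unit link lengths, so that the Jacobian is a square 2x2 matrix and can be inverted directly. Each step solves the Jacobian relation for the joint change and adds it to the current joint angles. To limit drift from the first-order approximation, the end-effector change is measured from the arm's actual position to the next waypoint on a straight line.

```python
import math


def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm (assumed toy model)."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y


def jacobian(q, l1=1.0, l2=1.0):
    """2x2 Jacobian of forward_kinematics with respect to the joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))


def rmrc_step(q, dx):
    """Solve dx = J(q) dq for dq by inverting the square Jacobian."""
    (a, b), (c, d) = jacobian(q)
    det = a * d - b * c
    if abs(det) < 1e-9:
        raise ValueError("Jacobian is singular at this configuration")
    dq0 = (d * dx[0] - b * dx[1]) / det
    dq1 = (-c * dx[0] + a * dx[1]) / det
    return (q[0] + dq0, q[1] + dq1)


def follow_line(q_start, target, steps=100):
    """Integrate small joint changes along a straight end-effector path."""
    q = q_start
    start = forward_kinematics(q)
    for k in range(1, steps + 1):
        s = k / steps
        waypoint = (start[0] + s * (target[0] - start[0]),
                    start[1] + s * (target[1] - start[1]))
        current = forward_kinematics(q)
        dx = (waypoint[0] - current[0], waypoint[1] - current[1])
        q = rmrc_step(q, dx)
    return q


# Hypothetical start configuration and reachable target.
q_final = follow_line((0.3, 1.2), (0.5, 1.5))
```

For a redundant arm the Jacobian is not square, so the direct inverse used in `rmrc_step` would have to be replaced, for example by a pseudo-inverse as mentioned in the text.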

The robot used for research at the CLMC lab has seven joints, hence 7 degrees of freedom (DOFs). Our motion algorithms usually calculate a single desired end-effector position in Cartesian space (with no particular constraints on the orientation of the fingertip). This explains our need for an inverse kinematics mapping from $\mathbb{R}^3$ to $\mathbb{R}^7$ (which must be ill-defined, as it maps from a lower-dimensional into a higher-dimensional space). My thesis, however, does not concentrate on the inverse kinematics problem. It only gives an idea of how the learning algorithm LWPR can be applied to solving inverse kinematics in the conclusion in chapter 8.2.

The remaining problem, executing the desired trajectory in joint space, will be

described in more detail in the following section. One of its subcomponents, the Inverse

Dynamics Problem, with a learning approach to solve this problem, will be discussed

carefully in the thesis. Therefore, let us assume that a planned desired trajectory in joint

space is available at any time.

Footnote: For literature on inverse kinematics see e.g. Schwinn, W., 1992; Rieseler, H., 1992; Kovács, P., 1993.

2.2. Controlling the Execution of Desired Trajectories

This section presents a simple method for controlling joint positions during motion. Additional components are added step by step to increase the accuracy:

Figure 2.1 Open Loop Control (block diagram: Controller and Robot, driven by the desired state $\theta_{des}$)

$u = f(\theta_{des}, \alpha, t)$   (eq. 2.1)

As can be seen in figure 2.1 and in the corresponding equation, this is a very simple model, which controls the robot 'in a blind way'. The controller generates commands ($u$) using a function $f(\cdot)$ based on a desired state ($\theta_{des}$), constant parameters of the system ($\alpha$), and the time ($t$). The robot changes its state to a new state ($\theta$) based on the control command it receives. The controller sends a command to the system without monitoring how the system responds. There is no error correction, because the controller does not know what the system really does. To solve this problem, we can include a feedback mechanism:

Figure 2.2 Closed Loop Control (block diagram: Controller and Robot; the robot's state $\theta$ is fed back to the controller)

$u = f(\theta_{des}, \theta, \alpha, t)$   (eq. 2.2)

Footnote: Throughout this chapter, $\theta$ denotes the state of a system, given by its position, velocity, and acceleration $(\theta, \dot{\theta}, \ddot{\theta})$.

Figure 2.2 illustrates that the controller is now able to correct errors, because it knows the state of the system and can verify whether a previously sent command has moved the robot into the expected state. If the result was not satisfactory, the controller can change the subsequent command $u$ appropriately.
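The difference between the two schemes can be made concrete with a small simulation. The following sketch is a toy example under assumptions made here, not a model of the robot in this thesis: a single velocity-commanded joint whose actuator delivers only 80% of the commanded velocity (`true_gain`), a ramp as desired trajectory, and a hypothetical function name `simulate`. With `feedback_gain = 0` the controller is open loop and never notices the mismatch; with a positive gain it corrects the command using the measured state.

```python
def simulate(feedback_gain, steps=100, dt=0.01, true_gain=0.8):
    """Track a unit-velocity ramp with one joint and return the final position error.

    The controller assumes the joint follows velocity commands perfectly,
    but the (assumed) plant only realizes true_gain times the command.
    feedback_gain = 0 corresponds to open loop control.
    """
    q = 0.0          # actual joint position
    q_des = 0.0      # desired joint position
    v_des = 1.0      # desired joint velocity
    for _ in range(steps):
        # Command: planned velocity plus a correction based on the observed error.
        u = v_des + feedback_gain * (q_des - q)
        q += dt * true_gain * u
        q_des += dt * v_des
    return q_des - q


open_loop_error = simulate(feedback_gain=0.0)
closed_loop_error = simulate(feedback_gain=20.0)
```

In this toy setting the open loop error grows steadily along the ramp, while the closed loop error settles at a small value determined by the feedback gain.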

Figure 2.3 Negative Feedback Control (block diagram: Feedback Controller and Robot; the controller input is the difference $\theta_{des} - \theta$)

$u_{fb} = f(\theta_{des} - \theta, \alpha, t)$   (eq. 2.3)

A special case of closed loop control is negative feedback control, illustrated in figure 2.3. The controller does not know the desired state of the robot, but only the difference between the desired and the real state ($\theta_{des} - \theta$). This difference can be interpreted as an error signal of the robot's state. Generating appropriate commands ($u_{fb}$) that minimize the input (and thus the error signal) leads to a control solution. With a proper command generation function $f(\cdot)$, the negative feedback approach can in principle become arbitrarily accurate. The major disadvantage of negative feedback control is the time delay: the controller can only react after an error has already occurred, which causes a significant loss of accuracy in the robot's movements. This control strategy is further explained in chapter 2.3 in the context of PID controllers. To further minimize the error, we can add a feed-forward controller:

Figure 2.4 Negative Feedback and Feed-forward Control (block diagram: Feedforward Controller, Feedback Controller, and Robot)

$u = u_{ff} + u_{fb} = f_{ff}(\theta_{des}, \alpha, t) + f_{fb}(\theta_{des} - \theta, \alpha, t)$   (eq. 2.4)

In figure 2.4, a computed-torque feed-forward controller is introduced to reduce the error and thus improve the robot's performance. The feed-forward controller uses a dynamics model of the robot. It is thereby able to predict the actuator commands that correspond to a desired motion. Intuitively, the feed-forward controller moves the robot towards the desired state in big steps and as accurately as possible. The dynamics model used in $f_{ff}(\cdot)$ to compute the feed-forward commands ($u_{ff}$) will never be absolutely accurate: it leaves a remaining error to be corrected by the feedback controller. The feed-forward controller operates as an add-on module, which generates commands ($u_{ff}$) independently of the negative feedback controller. The final command sent to the robot is thus the sum of both commands, feed-forward and negative feedback: $u = u_{ff} + u_{fb}$.

We will have a closer look at both the negative feedback and the feed-forward control functions in the following two sections.

2.3. The Feedback Control Function

PID-controllers are widely used as control policy in negative feedback controllers. These

controllers correct errors in position and velocity. They are usually based on a linear

control function. The name PID-controller is derived from:

· Proportional Control ('Position Error')

· Integral Control ('Steady State Error')

· Derivative Control ('Damping')

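The Control Function $f_{fb}$ of a PID-Controller is given by

$u_{fb} = k_P (\theta_{des} - \theta) + k_D (\dot{\theta}_{des} - \dot{\theta}) + k_I \int (\theta_{des} - \theta)\, dt$ (eq. 2.5)

with

$u_{fb}$ the (n×1) vector of computed negative-feedback commands

$k_P$, $k_D$, $k_I$ the (n×1) gain vectors for P-, D-, and I-control, applied element-wise

$\theta_{des} - \theta$ the (n×1) vector of errors in positions

$\dot{\theta}_{des} - \dot{\theta}$ the (n×1) vector of errors in velocities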

The term $(\theta_{des} - \theta)$ in equation 2.5 will become non-zero whenever the robot is in a position $\theta$ that differs from the desired position $\theta_{des}$. This position error $(\theta_{des} - \theta)$, multiplied by a fixed gain ($k_P$), yields an appropriate command to compensate for that error, provided that the gain is properly adjusted: gains that are too small cause the system to correct errors only slowly; gains that are too high may lead to overcompensation (and ultimately to unstable systems).

The second term, $(\dot{\theta}_{des} - \dot{\theta})$ in equation 2.5, has the same effect on the velocities: whenever a velocity differs from the desired velocity, a correction command is calculated as the difference multiplied by a gain $k_D$. This part of the feedback controller introduces a damping term.

The integral part additionally corrects very small errors once a steady state is reached. Those errors might be introduced by external forces, e.g. gravity. To understand the term, assume the robot has reached its target, but gravity makes it drop slightly below the desired position. The first term $(\theta_{des} - \theta)$ in equation 2.5 will provide compensation, but only in proportion to the position error. If this error is small, or the arm is heavy, the generated command may not succeed in moving the arm upwards. Ultimately, there will be an equilibrium state below the desired target, where the correction of the PD-controller just compensates for gravity. The arm will not move upwards to the desired target anymore.

The integral controller will track and accumulate the error. The integral will only stop adding to the accumulation when $(\theta_{des} - \theta)$ is zero, i.e. when the robot is exactly at the target position. This means it will continue until the accumulation, scaled by the gain $k_I$, compensates for the position offset. In case the robot overshoots, the accumulation will decrease, as $(\theta_{des} - \theta)$ changes sign. With an appropriate gain $k_I$, the correction introduced by the integral part of equation 2.5 will keep the robot exactly at the desired position.
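The roles of the three terms can be illustrated with a small simulation. The following is a minimal sketch, not the controller used in this thesis: it assumes a single joint modeled as a unit point mass that moves vertically under gravity, simple Euler-style time stepping, and hand-picked gains; all names (`simulate`, `k_p`, `k_d`, `k_i`) are hypothetical. With PD control alone, the mass settles where the proportional command balances gravity, i.e. at an offset of `mass * gravity / k_p` below the target; adding the integral term removes that offset.

```python
def simulate(k_p, k_d, k_i, x_des=1.0, mass=1.0, gravity=9.81,
             dt=0.001, steps=10000):
    """Return the final position of a point mass under PID control and gravity."""
    x, v, integral = 0.0, 0.0, 0.0
    for _ in range(steps):
        err_pos = x_des - x          # position error
        err_vel = 0.0 - v            # velocity error (desired velocity is zero)
        integral += err_pos * dt     # accumulated position error
        u = k_p * err_pos + k_d * err_vel + k_i * integral
        a = u / mass - gravity       # gravity acts as a constant disturbance
        v += a * dt
        x += v * dt
    return x

x_pd = simulate(k_p=100.0, k_d=20.0, k_i=0.0)     # PD only
x_pid = simulate(k_p=100.0, k_d=20.0, k_i=200.0)  # PD plus integral term

# Predicted PD offset: mass * gravity / k_p = 9.81 / 100 = 0.0981
print("PD  final offset below target:", 1.0 - x_pd)
print("PID final offset below target:", 1.0 - x_pid)
```

The gains in this sketch were chosen so that the toy system stays stable; as discussed next, a poorly chosen integral gain can destabilize the loop.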

Finding a gain $k_I$ is probably the most difficult task in this process. A poor choice can easily lead to an unstable system with catastrophic results. Integral controllers compensate for steady states only, but this study concentrates on robot arms in motion. Therefore, integral controllers have not been used as part of the negative feedback control. All feedback controllers were modeled by a proportional and a derivative part only.

When looking for appropriate gains $k_P$ and $k_D$ to design a PD-controller as shown above, one faces a tradeoff between stiff, accurate systems and compliant ones:

· Assume high gains: Whenever the desired state differs from the robot's state, the command sent for compensation is relatively big. This will ensure fast compensation. However, if the robot needs to give way, e.g. due to a push or when hitting an object, a high-gain controller will exert a high force to compensate for this 'position error'.

· Assume low gains: The position error is multiplied by a smaller number. Thus, the command sent to the robot to correct the error is significantly smaller. Hence, the robot is much more flexible when it hits an object or is being pushed aside. On the other hand, it also needs longer to correct for 'real' position errors.

(High gains can also result in unstable systems, as the robot may increase the absolute distance to the desired target with every discrete step. Let us assume that the gains do not exceed the maximal value that guarantees safe operation.)

Especially in the context of Humanoid Robotics, this tradeoff is very difficult. Humanoid robots are supposed to work in close interaction with humans and must not hurt anyone while interacting; but at the same time, they need to be capable of fast and accurate motion. The approach usually taken today is to use low-gain feedback controllers and add a feed-forward path to enhance performance. I will explain this mechanism in more detail in the following chapters.
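Both effects of the gain can be made concrete with a toy calculation. The sketch below is an illustration under simplifying assumptions, not a model of the robot used in this thesis: a single joint is treated as velocity-controlled, so that one discrete control step of length `dt` changes the position by `dt * gain * error`, and the command issued against a push is taken as `gain * displacement`. The names `tracking_errors` and `restoring_command` are hypothetical. Each step multiplies the error by `(1 - gain * dt)`, so the error shrinks for small gains and grows with every discrete step once `gain * dt` exceeds 2.

```python
def tracking_errors(gain, dt=0.01, steps=50, x0=0.0, x_des=1.0):
    """Absolute position error after each discrete proportional-control step."""
    x = x0
    errors = []
    for _ in range(steps):
        x += dt * gain * (x_des - x)   # step proportional to the position error
        errors.append(abs(x_des - x))
    return errors

def restoring_command(gain, displacement):
    """Magnitude of the command issued when the joint is pushed by `displacement`."""
    return gain * abs(displacement)

low = tracking_errors(gain=10.0)        # error factor per step: 1 - 0.1 = 0.9
high = tracking_errors(gain=150.0)      # factor -0.5: fast, overshoots every step
unstable = tracking_errors(gain=250.0)  # factor -1.5: error grows every step

for name, errs in (("low", low), ("high", high), ("unstable", unstable)):
    print(f"{name:9s} error after 1 step: {errs[0]:.3g}, after 50 steps: {errs[-1]:.3g}")

# The same push is resisted 15 times harder by the gain-150 controller.
print(restoring_command(10.0, 0.05), restoring_command(150.0, 0.05))
```

The low-gain controller is compliant but slow, the high-gain controller is fast but stiff, and beyond the stability limit the controller becomes unusable, which mirrors the tradeoff described above.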

2.4. The Feed-forward Control Function

As seen in chapter 2.2, the feed-forward control function uses the dynamics of a system to generate feed-forward commands. These commands are then used to move the system quickly along a desired trajectory:

$u_{ff} = f_{ff}(\theta_{des})$ (eq. 2.7)

The dynamics of the system are hidden in the parameter vector of the function $f_{ff}$.

In general, the dynamics model provides a description of the relationship between the joint actuator torques and the motion of the structure. It relates the vectors of positions ($\theta$), velocities ($\dot{\theta}$), and accelerations ($\ddot{\theta}$) to the torque vector ($\tau$). This last vector is necessary to accomplish the acceleration in order to reach the desired state.

The direct dynamics problem assumes a given initial state of the system (position and velocity) and an incoming stream of joint torques. It will then predict the acceleration in all joints for times $t > t_0$. Updates of velocities and positions can be calculated by integrating the accelerations. The direct dynamics is a useful model for simulations, as it allows predictions of the physical system's motion when the exerted joint torques are known.
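As an illustration of the direct dynamics problem, the following minimal sketch integrates a single-link pendulum forward in time from a given initial state and a stream of joint torques. It is a toy model under stated assumptions, not the simulator used in this thesis: the link is a point mass `MASS` on a massless rod of length `LENGTH`, the angle is measured from the hanging-down position, friction is ignored, and a simple semi-implicit Euler step stands in for the integration techniques mentioned above. All names are hypothetical.

```python
import math

MASS, LENGTH, GRAVITY = 1.0, 0.5, 9.81   # assumed point-mass pendulum parameters

def forward_dynamics(theta, theta_dot, torque):
    """Joint acceleration produced by `torque` in the state (theta, theta_dot)."""
    # theta_dot is unused here: a single frictionless link has no
    # velocity-dependent terms, but a multi-link model would need it.
    inertia = MASS * LENGTH ** 2
    gravity_torque = MASS * GRAVITY * LENGTH * math.sin(theta)
    return (torque - gravity_torque) / inertia

def simulate(theta0, theta_dot0, torques, dt=0.001):
    """Integrate accelerations into velocities and positions for a torque stream."""
    theta, theta_dot = theta0, theta_dot0
    trajectory = [(theta, theta_dot)]
    for torque in torques:
        theta_ddot = forward_dynamics(theta, theta_dot, torque)
        theta_dot += theta_ddot * dt
        theta += theta_dot * dt
        trajectory.append((theta, theta_dot))
    return trajectory

# A constant torque that exactly balances gravity keeps the link where it is ...
hold_torque = MASS * GRAVITY * LENGTH * math.sin(0.3)
held = simulate(0.3, 0.0, [hold_torque] * 1000)
# ... whereas zero torque lets it swing back towards the hanging-down position.
released = simulate(0.3, 0.0, [0.0] * 200)
print("held:", held[-1], "released:", released[-1])
```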

In contrast, the inverse dynamics model predicts the joint torques needed to

generate a specified motion. Solving the inverse dynamics problem will allow the

execution of a desired trajectory, once the trajectory is specified in terms of positions,

velocities, and accelerations in joint space. Usually, this is a result of an inverse

kinematics process (see chapter 2.1). Simulating the trajectory before executing the

motion can be used to verify the trajectory's feasibility: e.g., torques may not exceed a

maximal value, nor may they change abruptly.
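The feasibility check described above can be sketched as follows. This is a hedged illustration, not the procedure used in this thesis: it assumes a single-link pendulum (point mass on a massless rod, angle measured from the hanging-down position, no friction), an arbitrarily chosen smooth joint-space trajectory, and arbitrary torque limits. All names and numbers are hypothetical.

```python
import math

MASS, LENGTH, GRAVITY = 1.0, 0.5, 9.81   # assumed point-mass pendulum parameters

def inverse_dynamics(theta, theta_ddot):
    """Joint torque needed to produce acceleration theta_ddot at angle theta."""
    return MASS * LENGTH ** 2 * theta_ddot + MASS * GRAVITY * LENGTH * math.sin(theta)

def desired_state(t, goal=1.0, duration=2.0):
    """Smooth move from 0 to `goal`: (position, velocity, acceleration) at time t."""
    w = math.pi / duration
    return (goal * (1.0 - math.cos(w * t)) / 2.0,
            goal * w * math.sin(w * t) / 2.0,
            goal * w ** 2 * math.cos(w * t) / 2.0)

def planned_torques(dt=0.01, duration=2.0):
    """Torques the inverse dynamics model predicts along the whole trajectory."""
    steps = int(round(duration / dt))
    torques = []
    for k in range(steps + 1):
        theta, _theta_dot, theta_ddot = desired_state(k * dt, duration=duration)
        torques.append(inverse_dynamics(theta, theta_ddot))
    return torques

def is_feasible(torques, max_torque, max_step):
    """True if no torque exceeds max_torque and no change between samples exceeds max_step."""
    within_limit = all(abs(tau) <= max_torque for tau in torques)
    smooth = all(abs(b - a) <= max_step for a, b in zip(torques, torques[1:]))
    return within_limit and smooth

torques = planned_torques()
print("peak torque:", max(abs(tau) for tau in torques))
print("feasible with a 5.0 limit:", is_feasible(torques, max_torque=5.0, max_step=0.1))
print("feasible with a 2.0 limit:", is_feasible(torques, max_torque=2.0, max_step=0.1))
```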

Using the inverse dynamics model for feed-forward command predictions leads to improved performance of the controller, compared to a 'simple' feedback controller. This combined controller will always move the robot close to the correct position using feed-forward commands. Then, the 'slowly adjusting' feedback part only has to compensate for small position errors. Hence, the overall performance will increase. If we had a perfect robot and knew the dynamics model, no feedback control would be necessary at all. But as our robot is not perfect, we do need to find the inverse dynamics model. Thus, the question of how to estimate the dynamics model remains.
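The benefit of the feed-forward path can be illustrated with a small tracking experiment. The sketch below is not the controller of this thesis; it assumes a single-link pendulum as the 'real' robot (point mass on a massless rod, angle measured from the hanging-down position, no friction), a low-gain PD feedback controller, and an inverse dynamics model whose only error is a wrongly estimated mass. All names and numbers are hypothetical.

```python
import math

LENGTH, GRAVITY, TRUE_MASS = 0.5, 9.81, 1.0   # assumed 'real' robot: point-mass pendulum

def inverse_dynamics(mass, theta, theta_ddot):
    """Torque a pendulum of the given mass needs for acceleration theta_ddot at theta."""
    return mass * LENGTH ** 2 * theta_ddot + mass * GRAVITY * LENGTH * math.sin(theta)

def desired_state(t, goal=1.0, duration=2.0):
    """Smooth move from 0 to `goal`: (position, velocity, acceleration) at time t."""
    w = math.pi / duration
    return (goal * (1.0 - math.cos(w * t)) / 2.0,
            goal * w * math.sin(w * t) / 2.0,
            goal * w ** 2 * math.cos(w * t) / 2.0)

def rms_tracking_error(model_mass, k_p=20.0, k_d=2.0, dt=0.001, duration=2.0):
    """RMS position error along the trajectory; model_mass=None disables feed-forward."""
    theta, theta_dot, squared_sum = 0.0, 0.0, 0.0
    steps = int(round(duration / dt))
    for k in range(steps):
        th_des, thd_des, thdd_des = desired_state(k * dt, duration=duration)
        squared_sum += (th_des - theta) ** 2
        # Low-gain PD feedback on position and velocity errors.
        u_fb = k_p * (th_des - theta) + k_d * (thd_des - theta_dot)
        # Feed-forward command from the (possibly inaccurate) dynamics model.
        u_ff = 0.0 if model_mass is None else inverse_dynamics(model_mass, th_des, thdd_des)
        # The 'real' robot: subtract the true gravity torque, divide by the true inertia.
        gravity_torque = inverse_dynamics(TRUE_MASS, theta, 0.0)
        theta_ddot = (u_ff + u_fb - gravity_torque) / (TRUE_MASS * LENGTH ** 2)
        theta_dot += theta_ddot * dt
        theta += theta_dot * dt
    return math.sqrt(squared_sum / steps)

pd_only = rms_tracking_error(model_mass=None)
imperfect_model = rms_tracking_error(model_mass=0.8)   # mass underestimated by 20 %
perfect_model = rms_tracking_error(model_mass=1.0)
print("PD only:", pd_only)
print("PD + feed-forward (20 % mass error):", imperfect_model)
print("PD + feed-forward (exact model):", perfect_model)
```

In this toy setting the feedback controller has to correct the entire dynamics when used alone, only the model error when the feed-forward model is imperfect, and little more than integration error when the model is exact.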

2.5. Estimating Dynamics using Rigid Body Assumptions

The Rigid Body Dynamics assumptions (e.g. Sciavicco & Siciliano, 2000 or An, Atkeson & Hollerbach, 1988) are frequently used to estimate the dynamics of a robot system (e.g. Sciavicco & Siciliano, 2000 or Pfeiffer & Reithmeier, 1987). The assumptions include that the system to be modeled consists of single stiff bodies. These rigid bodies are assumed to behave like perfect 'single link robots', being connected to form a robot with multiple degrees of freedom. In addition, a link's motion is assumed not to influence any of the other links and joints. Only the link's weight and its position are modeled. The joints connecting two links are assumed to have neither friction nor position inaccuracies. The dynamics of the system is therefore described by the shape and weight of the links, not by their joint behavior.

The general structure of rigid body dynamics, also called the 'Joint space dynamic model', can be estimated in two different ways:

The Lagrange formulation offers the system's dynamics in a closed form based on the Lagrangian of the system's total energy.

The Newton-Euler formulation allows describing the model in a recursive form by enforcing a torque balance on every link. This is computationally more efficient. However, parameter estimation for unknown systems is more difficult compared to a closed form model. For this reason, we will have a closer look at the Lagrangian of a mechanical system. It is defined as:

$L = T - U$, (eq. 2.8)

where $T$ and $U$ denote the total kinetic energy and, respectively, the total potential energy of the system. Lagrange's equation is expressed by

$\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_i} - \frac{\partial L}{\partial \theta_i} = \xi_i, \quad i = 1, \ldots, n$ (eq. 2.9)

where $n$ denotes the number of joints in the robot and $\xi_i$ the generalized force associated with the coordinate $\theta_i$. The forces include all non-conservative forces, such as joint actuator torques, joint friction torques, and the joint torques induced by end-effector forces at the contact with the environment. Equation 2.9 establishes the relation between generalized forces (i.e. the joint torques) and joint positions, velocities and accelerations. Consequently, it is possible to derive the dynamic model once the kinetic and potential energies of all links are known. For details of the computation of these energies, please refer to e.g. Sciavicco & Siciliano, 2000. The general form of the equation of motion derived by the Lagrangian method is

$B(\theta)\ddot{\theta} + C(\theta, \dot{\theta})\dot{\theta} + G(\theta) = \tau$ (eq. 2.10)

with:

$B(\theta)$ (n × n): a positive definite inertia matrix, only depending on the robot's current position

$C(\theta, \dot{\theta})$ (n × n): a matrix containing the centripetal and Coriolis forces, depending on the robot's current position and its current velocities

$G(\theta)$ (n × 1): a vector of gravitational forces, only depending on the robot's current position

$\tau$ (n × 1): a vector of the active torques at the joints 1, ..., n

This thesis provides only a brief overview of the Lagrange method; for further information, please refer to e.g. Sciavicco & Siciliano, 2000 or Pfeiffer & Reithmeier, 1987.
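As a worked illustration (not taken from this thesis), consider the simplest possible case of equations 2.8 to 2.10: a single link ($n = 1$), modeled as a point mass $m$ on a massless rod of length $l$, with the angle $\theta$ measured from the hanging-down position and friction neglected, so that the actuator torque $\tau$ is the only non-conservative force. These modeling choices are assumptions made only for this example.

```latex
\begin{align*}
T &= \tfrac{1}{2}\, m l^2 \dot{\theta}^2, \qquad U = -\, m g l \cos\theta \\
L &= T - U = \tfrac{1}{2}\, m l^2 \dot{\theta}^2 + m g l \cos\theta \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta}
  &= m l^2 \ddot{\theta} + m g l \sin\theta = \tau
\end{align*}
```

Comparing the last line with equation 2.10 gives $B(\theta) = m l^2$, $C = 0$, and $G(\theta) = m g l \sin\theta$ for this one-joint system.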

A physical interpretation of the parameters in equation 2.10 gives an intuitive

understanding of the formula:

· The coefficients $b_{jj}$ (i.e. the elements on the diagonal of the matrix B) represent the moment of inertia at joint axis j in the current manipulator configuration, when all other joints are blocked. The coefficients $b_{ij}$ with $i \neq j$ account for the effects of acceleration from joint i on joint j.

· The coefficients in the C matrix represent the centrifugal effect that the velocity of one joint induces on another, as well as the Coriolis effect induced on a joint by the velocities of two other joints.

· The terms $g_i$ (components of the vector G) represent the moment generated at the joint i axis of the manipulator in the current configuration, due to gravity.
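To make this interpretation tangible, the following sketch evaluates B, C, and G for a standard textbook example, a planar two-link arm. It is an illustration under stated assumptions, not the humanoid model of this thesis: each link is a massless rod with a point mass at its end, the first joint angle is measured from the horizontal, the second relative to the first link, and all parameter values and function names are hypothetical.

```python
import math

M1, M2 = 1.0, 1.0        # assumed point masses at the ends of link 1 and link 2
L1, L2 = 1.0, 1.0        # assumed link lengths
G0 = 9.81                # gravitational acceleration

def inertia_matrix(theta2):
    """B: depends only on the configuration (here: on the elbow angle)."""
    c2 = math.cos(theta2)
    b11 = (M1 + M2) * L1 ** 2 + M2 * L2 ** 2 + 2.0 * M2 * L1 * L2 * c2
    b12 = M2 * L2 ** 2 + M2 * L1 * L2 * c2
    b22 = M2 * L2 ** 2
    return [[b11, b12], [b12, b22]]

def coriolis_matrix(theta2, theta1_dot, theta2_dot):
    """C: depends on the configuration and on the joint velocities."""
    h = -M2 * L1 * L2 * math.sin(theta2)
    return [[h * theta2_dot, h * (theta1_dot + theta2_dot)],
            [-h * theta1_dot, 0.0]]

def gravity_vector(theta1, theta2):
    """G: gravity-induced moments at the two joint axes."""
    g2 = M2 * G0 * L2 * math.cos(theta1 + theta2)
    g1 = (M1 + M2) * G0 * L1 * math.cos(theta1) + g2
    return [g1, g2]

stretched = inertia_matrix(0.0)      # elbow straight
folded = inertia_matrix(math.pi)     # elbow folded back onto link 1
print("b11 stretched:", stretched[0][0], " b11 folded:", folded[0][0])
print("coupling b12 stretched:", stretched[0][1], " folded:", folded[0][1])
print("gravity moments, arm horizontal:", gravity_vector(0.0, 0.0))
print("gravity moments, arm vertical:  ", gravity_vector(math.pi / 2, 0.0))
```

The diagonal entry of B at the first joint is largest when the arm is stretched out and smallest when it is folded, the off-diagonal entries couple the accelerations of the two joints, and the gravity moments vanish when the arm points straight up.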

Some of these parameters in the matrices can be simplified during the design process of the robot, e.g. by using massive and stiff materials for the links. Another simplification would be the use of gearboxes with high transmission ratios in the joints: in this case, all the coefficients in G and C can be neglected, and the B matrix becomes almost diagonal. This means that every joint influences only itself. However, the objective of this work is to find a model for a humanoid robot. Recalling the constraints on humanoid robots from chapter 1.2 (e.g. compliant, lightweight), there are not many possibilities for simplification. Gearboxes, for example, cannot be used, as the robot would become too stiff; massive materials would make the robot unbearably heavy.

No matter how simple the design of the robot setup is, there still remains the problem of estimating appropriate coefficients in B, G, and C. Modern CAD systems provide parameter estimation based on the design data of real systems. But this estimation is not simple, and the results are poor most of the time. Usually, the process relies on idealized assumptions about the system's mechanics and is subject to uncertainties in the manufacturing process. In addition, some properties of the links change during the robot's lifetime because of material wear. So this does not seem to be a useful tool for designs as complex as those of humanoid robots.

Details

Edition: Original edition
Year of publication: 2001
ISBN (eBook): 9783832448585
ISBN (Paperback): 9783838648583
DOI: 10.3239/9783832448585
File size: 2.2 MB
Language: English
Institution / University: Technische Universität Berlin – Informatik
Publication date: 2001 (December)
Grade: 1.0
Keywords: robot-motion-control neuroinformatics robotik
