
Online-Learning in Humanoid Robots

©2001 Diploma thesis, 114 pages

Summary

Abstract:
Humanoid robotic systems have gained increasing significance in the research world within the last few years. Just five years ago, there were hardly any human-like robots in the world, and those available did not represent human properties at all. They neither looked nor behaved like human beings. Today, a variety of research groups around the world are starting to work on topics related to humanoid robots, and it is very likely that these robots will become important within the upcoming decades even beyond the realm of science.
As a first draft of a definition, humanoid robots are robots that are, to some extent, able to live in and interact with the everyday human world and that exhibit certain human features, such as cognitive or acting abilities. The main strength of such humanoid robots lies in their ability to operate in surroundings that have been designed for humans in the first place. Humanoid robots can be imagined as useful assistants for everyday life in areas as diverse as:
- Rescue and clearing of dangerous situations.
- Janitorial services, housekeeping.
- Security services.
- Care-taking in hospitals, recreational facilities.
- Entertainment.
In all these fields, close human interaction is a core issue and can be regarded as the minimum common basis. The interaction happens on many different levels, from physical touch to gesture recognition and the processing of spoken language. On cognitive issues like the last two, much research has been done in the past few years. One has to keep in mind, however, that physical appearance, e.g. smoothness of motion, is also an important issue when designing humanoid robots.


Excerpt


ID 4858
Conradt, Jörg: Online-Learning in Humanoid Robots / Jörg Conradt -
Hamburg: Diplomica GmbH, 2001
Also: diploma thesis, Technische Universität Berlin, 2001
This work is protected by copyright. The rights arising from it, in particular those of
translation, reprinting, public presentation, extraction of figures and tables,
broadcasting, microfilming or reproduction by other means, and storage in data
processing systems, remain reserved, even where only excerpts are used. Reproduction of
this work or of parts of it is permitted, even in individual cases, only within the limits
of the statutory provisions of the Copyright Act of the Federal Republic of Germany in
its currently valid version. It is generally subject to a fee. Violations are subject to the
penal provisions of copyright law.
The reproduction of common names, trade names, product designations, etc. in this
work does not, even without special marking, justify the assumption that such names are
to be regarded as free in the sense of trademark and brand protection legislation and
may therefore be used by anyone.
The information in this work has been compiled with care. Nevertheless, errors cannot be
ruled out completely, and the Diplomarbeiten Agentur, the authors, or translators assume
no legal responsibility or any liability for any remaining incorrect information and its
consequences.
Diplomica GmbH
http://www.diplom.de, Hamburg 2001
Printed in Germany


Table of Contents
Foreword
1. Introduction
1.1 Introducing the Area of Humanoid Robotics
1.2 Mechanical Design for Humanoid Robots
1.3 Controlling Humanoid Robots
1.4 Examples of Today's Humanoid Robots
1.5 The Purpose of the Thesis
2. A brief Recapitulation of basic Robot Control
2.1. Introduction to Robot Control
2.2. Controlling the Execution of Desired Trajectories
2.3. The Feedback Control Function
2.4. The Feed-forward Control Function
2.5. Estimating Dynamics using Rigid Body Assumptions
2.6. Recapitulation
2.7. Control of Humanoid Robots
3. Introduction to Robot Learning
3.1 General Remarks on Robot Learning
3.2 The Bias / Variance Tradeoff
3.3 Global versus Local Learning Strategies
3.4 The Curse of Dimensionality
3.5 Online-Learning
3.6 Learning Inverse Dynamics
3.7 Results
4. The Learning Algorithm LWPR
4.1 Advantages of Learning Approaches
4.2 Desired Algorithmic Properties
4.3 Input Data Preprocessing
4.4 Predicting Output Data using LWPR
4.5 Learning Parameters
4.6 The Final LWPR Algorithm
4.7 Properties of LWPR
5. Evaluation of LWPR's Performance on Artificial Data
5.1 Introduction
5.2 One Dimensional Function Fitting
5.3 Two-Dimensional Function Fitting with Shifting Input Distributions
5.4 Limiting the Computational Complexity
5.5 Discussion
6. The Humanoid Robot
6.1 Hardware for Robot Control
6.2 Software for Robot Control
6.3 The Sarcos Robotic Arm
7. Evaluation of the Arm's Motion
7.1 Generating Trajectories for Learning
7.2 Learning Inverse Dynamics
7.3 Learning Results on Sampled Data
7.4 Verifying the Results on Real Motion
8. Conclusion
8.1 Discussion
8.2 Brief Introduction to Using LWPR for Inverse Kinematics Estimation
8.2 Future Research
References
Short German Summary
Statutory Declaration (Eidesstattliche Erklärung)

Foreword
The basis for this thesis was laid in the academic year 1999/2000, which I spent as an
exchange student at the University of Southern California (USC), Los Angeles, and as a
guest at the CLMC laboratory. The Computational Learning and Motor Control
laboratory (CLMC laboratory), headed by Dr. Stefan Schaal, is pursuing research in the
border areas of neuroscience, computer science, statistics, and robotics (control theory).
The focus of the laboratory is to study the principles of biological behavior.
In recent years, researchers have realized that even simple biological
systems achieve far superior sensory-motor competence compared to any artificial
system today. On this basis, trying to understand and imitate these biological principles
for artificial systems seems to be a promising approach. In this thesis, the motor control
for a human-like robotic arm was learned based on a biologically plausible model.
During my time at USC, I had the opportunity to participate in the research
activities of the CLMC laboratory. I am very thankful for the discussions, the help, and
the many interesting insights I gained there. Dr. Stefan Schaal and Dr. Sethu Vijayakumar
in particular have been patient and good-humored, and their generosity with
equipment and other resources tremendously eased my
start in the new environment. I sincerely hope that the time at the CLMC laboratory has
established a link that will last much longer. The students belonging to the CLMC lab,
Gaurav Tevatia and Biren Metha, also deserve my gratitude for discussing the first results
and offering generous help on various topics.
I am also particularly grateful to the Fulbright Commission for providing funds to
stay for one year at an American university. I could never have studied in the U.S.
without their generous support.
Prof. Dr. Hommel from the TU Berlin was the person who first brought me into
contact with the field of robotics. I especially wish to thank him for the encouragement
and the careful supervision of this thesis. In addition, he kindly offered advice on all
questions of research, by no means limited to the scope of this thesis.
Furthermore, I owe thanks to all the people who have read and commented on
parts of earlier drafts and helped to improve the thesis.

1. Introduction
1.1 Introducing the Area of Humanoid Robotics
Humanoid robotic systems have gained increasing significance in the research world
within the last few years. Just five years ago, there were hardly any human-like robots in
the world, and those available did not represent human properties at all. They neither
looked nor behaved like human beings. Today, a variety of research groups around the
world are starting to work on topics related to humanoid robots, and it is very likely that
these robots will become important within the upcoming decades even beyond the realm
of science.
As a first draft of a definition, humanoid robots are robots that are, to some extent,
able to live in and interact with the everyday human world and that exhibit certain human
features, such as cognitive or acting abilities. The main strength of such humanoid robots lies
in their ability to operate in surroundings that have been designed for humans in the first
place. Humanoid robots can be imagined as useful assistants for everyday life in
areas as diverse as
· Rescue and clearing of dangerous situations
· Janitorial services, housekeeping
· Security services
· Care-taking in hospitals, recreational facilities
· Entertainment
In all these fields, close human interaction is a core issue and can be regarded as the
minimum common basis. The interaction happens on many different levels, from
physical touch to gesture recognition and the processing of spoken language. On
cognitive issues like the last two, much research has been done in the past few
years. One has to keep in mind, however, that physical appearance, e.g.
smoothness of motion, is also an important issue when designing humanoid robots.

1.2 Mechanical Design for Humanoid Robots
Given the close interaction with humans and the potential working spaces listed above,
some core requirements for the design of humanoid robots can be set out:
First of all, humanoid robots need the ability to act in environments tailored to
human needs and to operate devices originally designed for humans, e.g. when turning
knobs. Therefore, they need basic equipment such as manipulator arms and legs that
can perform independent tasks. Their movements have to be fast and accurate, and their
grip fine and powerful. Their links have to be lightweight to reduce the influence of
inertia. It is also desirable to have compliant joints, because only these allow the
robot to react flexibly to external stimuli, for instance when the robot is pushed
aside while performing a task. A natural result of building robots according to
these prerequisites is a high degree of redundancy in the whole system, an issue that
has to be handled appropriately.
Moreover, humanoid robots have to be equipped with sensory and processing
capabilities to interact with their environment, and they must be mobile within a wide
range. In general, their mechanical design has to ensure that they can operate in normal
living environments without extensive modifications of these surroundings.
With these requirements, the mechanical design of humanoid robots differs
substantially from that of today's robots, e.g. robots used for manufacturing. Such
industrial robots are obviously designed according to very different needs: one of their
main purposes is to repeat tasks with high accuracy. This requires stiff joints, solid links,
and strong actuators. To allow simple control, they are designed to behave as linearly as
possible.
The amount of compliance in the joints is a particularly important issue for
biologically plausible motion and is now set out in more detail. As noted
above, compliance in the joints is not only important for the smoothness of motion.
It is above all the central prerequisite for the robot's ability to react appropriately to
external stimuli. Such stimuli occur frequently in natural environments, and one of the
most remarkable abilities of living organisms seems to be that they react to such changes and
adapt their own behavior and actions. Such changes might be local impediments while a
task is performed, or pushes. Consider shaking hands with the robot: if the robot has
stiff joints, shaking hands cannot be performed well, as one has to follow the
robot's desired motion exactly. If the joints are compliant, much more variation in the movement
is possible: a human partner can always force the robot to slightly adjust its own
position.
On the mechanical level, motion in the joints can be achieved by different means.
The two major approaches are electric motors combined with gearboxes, and hydraulic
actuation. When using electric motors, gearboxes are a necessary complement, because
only they provide sufficient torques for the robot's movements. Gearboxes, however,
increase the stiffness in all joints by a considerable amount, since they do not offer
back-drivability. Therefore, motors with gearboxes are unsuitable as actuators for humanoid
robots.
Alternatively, robots can be equipped with hydraulics to generate high torques for
the joints. When using hydraulics together with load sensors in every joint, the joints'
behavior can be anywhere between very stiff and very compliant, depending only on the
controller. This means that the problem of compliance is moved to the level of software.
The only problem that remains is that any increase in compliance goes along with
a decrease in accuracy. It is therefore highly important to find a good tradeoff between
accuracy of motion and compliance in the joints. The quality of a control method can
be judged directly by its ability to find such a balance.
1.3 Controlling Humanoid Robots
Let us assume that all the mechanical requirements described in chapter 1.2 can be fulfilled,
and that the problems with stiff joints and heavy material used in the links can be solved.
There still remains a major problem: By what means could such a robot be controlled?
What kinds of algorithms allow the robot to use the whole variety of motion that is
usually associated with biological motion? For example, how can a humanoid robot be
made to give way to external motion, such as pressure exerted through
contact with humans? And given that the desired compliance is achieved using lightweight
material as described in chapter 1.2: How can we cope with the constraints that such
materials impose on the control algorithms? Traditional algorithms, e.g.
those based on rigid body dynamics assumptions, are not well suited to control such
mechanics. A fairly novel way to address the control problem is the
application of learning approaches. The major advantage of a learned control strategy is
that it adapts the control scheme based on how the system actually behaves. This means that a
well-suited learning algorithm can stay accurate even when the system's behavior changes.
1.4 Examples of Today's Humanoid Robots
In this chapter, I would like to provide a short overview of today's humanoid robots, and
present some examples.
Probably the best-known robot was built by Honda during the last 10 years (e.g.
Hirai, 1987, Hirai, Hirose, et al., 1998). Figure 1.1 shows a picture of the robot, which
looks very much like a human.
Figure 1.1 The Honda Robot P3: a) Frontal View b) Side View c) Technical Diagram
The main goal of Honda's development was to create an autonomous robot that can move
about in a human environment. The robot can walk, climb stairs, and manipulate simple
objects. Every single step, however, needs to be carefully programmed in advance. As
soon as more complex tasks arise, the robot has to be switched to tele-operation mode
and is controlled by a human supervisor. Correspondingly, the robot is highly
stiff in all joints and its sensing capabilities are very limited. The size of the stair steps it has
to climb, for example, has to match the preprogrammed values exactly. Very little visual
feedback or other sensor information is used to correct for unexpected changes in the
environment. This robot looks like a human, but does not at all behave like a
humanoid robot.
The robot's technical parameters are summarized in the following table:
Weight: 130 kg
Height: 160 cm
Width: 60 cm
Depth: 55.5 cm
Walking speed: 2.0 km/h, ca. 25 minutes of autonomous walking
Actuation: DC servo motors with gears
Lift capacity per hand: 2 kg
Degrees of freedom: 28 (Legs: 2x6, Arms: 2x7, Hands: 2x1)
Table 1.1 Technical Parameters of the Honda Humanoid Robot
Another example of a humanoid robot is COG, a robot torso developed at MIT. A picture
of COG is shown in figure 1.2:
Figure 1.2 COG, the Robot Torso developed at MIT
COG is used to study how far a humanoid robot can become 'cognitive';
therefore, the focus of research is on the robot's interaction with people. The robot
recognizes what people want and acts accordingly, sometimes also following its own 'desires'. It
is not designed for complicated motion. Though the robot's physical behavior is very
different from that of humans, experiments have shown that it can help to understand the
way people interact with each other. The near-term research objective of the
development team is to achieve robot perception comparable to that of a
six-month-old baby.
The third example of a humanoid robot is DB (Dynamic Brain), a robot at the
ATR human resources research laboratories in Kyoto, Japan. DB is a hydraulically
actuated anthropomorphic robot with legs, arms (with hand palms but without fingers), a
jointed torso, and a head. It was designed and built by Sarcos, a company that usually
builds tele-operated robots for entertainment purposes like movies and amusement parks.
The robot offers a variety of motions; however, it is mounted on a pelvis, so free standing
or walking experiments cannot be performed. The research performed at the ATR
laboratories focuses on upper-body movement (e.g. arm motion).
DB's technical parameters are summarized in the following table:
Weight: 80 kg
Height: 185 cm
Width: 60 cm
Depth: 35 cm
Actuation: hydraulic actuators with 650 psi pumps
Actuators: 25 linear, 5 rotary
Degrees of freedom: 30 (Legs: 2x3, Trunk: 3, Arms: 2x7, Neck: 3, Eyes: 2x2)
Sensors: position and load sensors in every degree of freedom (except eyes)
Vision: video cameras providing stereo foveal and panoramic view
Table 1.2 Technical Parameters of the DB robot
Comparing DB's parameters with those of the Honda robot, one can easily recognize that
DB represents human properties much better, especially in terms of the weight and size.
Moreover, being actuated by hydraulic pressure, the joints can be controlled in a
compliant way, as discussed in chapter 1.2. Figure 1.3 shows the robot.

Figure 1.3 a) DB balancing a pole, b) DB reading 'Science'
Other laboratories where humanoid robots are developed and much research is done include
the University of Tokyo, Waseda University, and Vanderbilt University.
Figure 1.4 a) Hadaly and b) Wabian from Waseda University

1.5 The Purpose of the Thesis
The following study focuses on the control problem of compliant joints used in humanoid
robots. As described in chapter 1.2, humanoid robots should be designed and operated
with compliant joints. These joints allow biologically plausible motion, i.e. motion that is
efficient, differentiated, energetically economic, and offers a wide variety of positions. In
addition, compliance enables the robot to interact in an environment shared with humans. For
example, it minimizes the risk of injuring people during interaction. The problem is
that controlling compliant joints is extremely difficult when the motion must be accurate
at the same time. Traditional linear high-gain control methods are not suitable for compliant
control, and model-based nonlinear control with rigid body dynamics models is often
too inaccurate (see the further discussion in chapter 2.5).
My thesis offers a new approach to control the motion of humanoid robots. This
method is based on neural net learning techniques, and leads to superior motion
performance compared to traditional approaches. My test case is learning the inverse
dynamics model of a seven degree-of-freedom robot arm. This anthropomorphic robot
arm was built from lightweight materials to resemble human arm dynamics. It is
hydraulically actuated and has load sensors in each joint to allow compliance.
The learned inverse dynamics model can be used in a computed torque feed-
forward controller. In contrast to traditional feed-forward control, the new controller
increases the accuracy of the arm motion to a significant degree without reducing its
compliance. The results show that with the help of this new approach, natural motion of a
human-like robot can be achieved with high accuracy.
My research aims to help close a gap in the robot control framework.
Today, broad research efforts are spent on algorithms that decide what a robot does in a
given situation. But even if a robot has decided what to do, it may not know how to execute the
desired motion accurately. Motion execution using compliant joints becomes more
relevant as robots start to interact with people.

2. A brief Recapitulation of basic Robot Control
2.1. Introduction to Robot Control
To move a manipulator (e.g. a robot arm) from one discrete state to another desired
discrete state, an appropriate control command (e.g. current for an electric motor) has to
be generated continuously during the robot's motion. This is achieved by updating the
previous command at discrete time steps $\Delta t$, equal to the reciprocal of the control
loop's frequency $f$: $\Delta t = 1/f$. The control loop usually runs at a high frequency to allow fast
command updates and accurate motion.
To generate commands that achieve the desired robot behavior, one has to
distinguish three different phases: first, planning a trajectory in end-effector space; second,
translating the desired trajectory into joint space (see footnote 1); and third, executing the planned
trajectory. Concerning the first phase, let us assume throughout this thesis that a desired
trajectory in end-effector space exists. For example, a desired behavior such as
point-to-point reaching is relatively simple to plan: one uses a straight line as the
desired line of motion and adds constraints, such as bell-shaped velocity and
acceleration profiles. When these constraints are known, it is possible to determine all
coefficients of an n-th order polynomial that describes the desired fingertip position
between start and end position. Typically, third or fifth order polynomials are used for
planning. They allow 4 or 6 constraints in total. These constraints usually are given by
the start and end position, and the start and end velocity. The start and end accelerations
can be added if 6 unknowns are used. A path in end-effector space given by a set of via-points can be
decomposed into several short paths from point $p_n$ to point $p_{n+1}$.
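The boundary constraints described above can be turned into polynomial coefficients by solving a small linear system. The following is a minimal sketch for the fifth order case, assuming a single coordinate, a movement duration T, and zero velocity and acceleration at both ends by default. The function names and the use of NumPy are illustrative choices and are not taken from the thesis.

```python
import numpy as np

def quintic_coefficients(p0, p1, T, v0=0.0, v1=0.0, a0=0.0, a1=0.0):
    """Coefficients c[0..5] of p(t) = sum_k c[k] * t**k on [0, T].

    The six unknowns are fixed by six constraints: position, velocity,
    and acceleration at the start (t = 0) and at the end (t = T).
    """
    A = np.array([
        [1, 0, 0,    0,      0,       0],        # p(0)   = p0
        [0, 1, 0,    0,      0,       0],        # p'(0)  = v0
        [0, 0, 2,    0,      0,       0],        # p''(0) = a0
        [1, T, T**2, T**3,   T**4,    T**5],     # p(T)   = p1
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],   # p'(T)  = v1
        [0, 0, 2,    6*T,    12*T**2, 20*T**3],  # p''(T) = a1
    ], dtype=float)
    b = np.array([p0, v0, a0, p1, v1, a1], dtype=float)
    return np.linalg.solve(A, b)

def evaluate(coefficients, t, derivative=0):
    """Evaluate the polynomial (or one of its time derivatives) at time t."""
    poly = np.polynomial.Polynomial(coefficients)
    if derivative > 0:
        poly = poly.deriv(derivative)
    return poly(t)

# Usage: move one coordinate from 0.0 to 1.0 within 2 seconds.
c = quintic_coefficients(0.0, 1.0, 2.0)
midpoint_velocity = evaluate(c, 1.0, derivative=1)  # peak of the bell-shaped profile
```

With zero boundary velocities and accelerations, the resulting velocity profile is bell-shaped and peaks at the middle of the movement. A path through several via-points could reuse the same function once per segment.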
The second part of robot control, transferring the desired trajectory from
end-effector space into joint space, is called the inverse kinematics problem. To solve this
problem, a coordinate transformation from Cartesian space into joint space is needed. If
we define the intrinsic coordinates of a manipulator as the n-dimensional vector $\theta \in \mathbb{R}^n$,
and the position and orientation of the manipulator's end-effector as the m-dimensional
vector $x \in \mathbb{R}^m$, the kinematics function can be written as $x = f(\theta)$. What we need here is
the inverse relationship: $\theta = f^{-1}(x)$.
Footnote 1: It is also possible to directly plan a trajectory in actuator space, without the need to translate it. However,
this is hardly done in practice, as the joint space of a reasonably sized robot is huge and not simple
for humans to understand.
There are two general approaches to solving inverse kinematics problems with
optimization criteria: (1) global methods, which find an optimal path of $\theta$ with
respect to the entire trajectory, and (2) local methods, which only compute an optimal
change in $\theta$, $\Delta\theta$, for a small change in $x$, $\Delta x$. In the latter case, one has to integrate
$\Delta\theta$ to generate the entire joint space trajectory. An example of a local method is
Resolved Motion Rate Control (e.g. Whitney, 1969). It uses the Jacobian $J$ of the forward
kinematics to describe a change of the end-effector's position by $\Delta x = J(\theta)\,\Delta\theta$. This
equation can be solved for $\Delta\theta$ by taking the inverse of $J$, if it is square, i.e. $m = n$, and
non-singular. For redundant manipulators (hence for almost all robots), solutions to the
inverse equation are usually non-unique (Craig, J. 1986), so that additional optimization
criteria have to be introduced. Alternatively, other methods can be used to invert $J$, like
the pseudo-inverse method (e.g. Liegeois, 1977), or the Extended Jacobian Method
(Baillieul, J., 1985). There exists a variety of literature on inverse kinematics (see footnote 2).
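A local method of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a hypothetical planar arm with three joints and made-up link lengths (so n = 3, m = 2, and the arm is redundant), and it inverts the Jacobian with the Moore-Penrose pseudo-inverse from NumPy, which selects the smallest joint change among all solutions. It is not the implementation used in the thesis.

```python
import numpy as np

LINK_LENGTHS = np.array([0.3, 0.3, 0.2])  # hypothetical link lengths in metres

def forward_kinematics(theta):
    """End-effector position x = f(theta) of a planar serial arm."""
    angles = np.cumsum(theta)  # absolute orientation of each link
    return np.array([np.sum(LINK_LENGTHS * np.cos(angles)),
                     np.sum(LINK_LENGTHS * np.sin(angles))])

def jacobian(theta):
    """Jacobian J(theta) of the forward kinematics, shape (2, n)."""
    angles = np.cumsum(theta)
    J = np.zeros((2, len(theta)))
    for j in range(len(theta)):
        # Joint j rotates link j and every link further out.
        J[0, j] = -np.sum(LINK_LENGTHS[j:] * np.sin(angles[j:]))
        J[1, j] = np.sum(LINK_LENGTHS[j:] * np.cos(angles[j:]))
    return J

def track_line(theta_start, x_goal, steps=200):
    """Follow a straight line to x_goal by integrating small joint changes."""
    theta = np.array(theta_start, dtype=float)
    x_start = forward_kinematics(theta)
    for k in range(1, steps + 1):
        x_target = x_start + (x_goal - x_start) * k / steps
        delta_x = x_target - forward_kinematics(theta)
        # Solve delta_x = J(theta) * delta_theta with the pseudo-inverse.
        delta_theta = np.linalg.pinv(jacobian(theta)) @ delta_x
        theta += delta_theta
    return theta

# Usage: start from a bent configuration and reach for a nearby point.
theta_final = track_line([0.3, 0.4, 0.5], np.array([0.5, 0.3]))
```

Because each step measures the remaining distance to the current target from the actual end-effector position, linearization errors do not accumulate along the path. Near singular configurations the pseudo-inverse can produce very large joint changes, which this sketch does not guard against.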
The robot used for research at the CLMC lab has seven joints, hence 7
degrees of freedom (DOFs). Our motion algorithms usually calculate a single desired
end-effector position in Cartesian space (with no particular constraints on the orientation
of the fingertip). This explains our need for an inverse kinematics mapping
$\mathbb{R}^3 \to \mathbb{R}^7$ (which must be ill-defined, as it maps from a lower- into a
higher-dimensional space). My
thesis, however, does not concentrate on the inverse kinematics problem. The conclusion in
chapter 8.2 only gives an idea of how the learning algorithm LWPR can be applied to solving inverse
kinematics.
The remaining problem, executing the desired trajectory in joint space, will be
described in more detail in the following section. One of its subcomponents, the Inverse
Dynamics Problem, with a learning approach to solve this problem, will be discussed
carefully in the thesis. Therefore, let us assume that a planned desired trajectory in joint
space is available at any time.
Footnote 2: E.g. Schwinn, W., 1992; Rieseler, H., 1992; Kovács, P., 1993.
2.2. Controlling the Execution of Desired Trajectories
This section presents a simple method for controlling joint positions during
motion. Additional components are added step by step to increase the accuracy:
Figure 2.1 Open Loop Control (block diagram: desired state $\tilde{\theta}_{des}$ -> Controller -> command $u$ -> Robot -> state $\tilde{\theta}$)
$u = f(\tilde{\theta}_{des}, \alpha, t)$ (eq. 2.1)
As can be seen in figure 2.1 and in the corresponding equation, this is a very simple model,
which controls the robot 'in a blind way'. The controller generates commands ($u$) using a
function $f(\cdot)$ based on a desired state ($\tilde{\theta}_{des}$) (see footnote 3), constant parameters
of the system ($\alpha$), and the time ($t$). The robot changes its state to a new state
($\tilde{\theta}$) based on the control command
it receives. The controller sends a command to the system without monitoring how the
system responds. There is no error correction because the controller does not know what
the system really does. To solve this problem, we can include a feedback mechanism:
Figure 2.2 Closed Loop Control (block diagram: desired state $\tilde{\theta}_{des}$ -> Controller -> command $u$ -> Robot -> state $\tilde{\theta}$, which is fed back to the controller)
$u = f(\tilde{\theta}_{des}, \tilde{\theta}, \alpha, t)$ (eq. 2.2)
Footnote 3: Throughout this chapter, $\tilde{\theta}$ denotes the state of a system, given by its position, velocity, and acceleration:
$\tilde{\theta} = (\theta, \dot{\theta}, \ddot{\theta})$.

Figure 2.2 illustrates that the controller is now able to correct errors, because it knows
the state of the system and can verify whether a previously sent command has
moved the robot into the expected state. If the result was not satisfactory, the controller
can change the subsequent command $u_{t+1}$ appropriately.
Figure 2.3 Negative Feedback Control (block diagram: difference $\tilde{\theta}_{des} - \tilde{\theta}$ -> Feedback Controller -> command $u_{fb}$ -> Robot -> state $\tilde{\theta}$, which is fed back with negative sign)
$u_{fb} = f_{fb}(\tilde{\theta}_{des} - \tilde{\theta}, \alpha, t)$ (eq. 2.3)
A special case of closed loop control is negative feedback control, illustrated in figure
2.3. The controller does not know the desired state of the robot, but only the difference
between the desired and the real state ($\tilde{\theta}_{des} - \tilde{\theta}$). This difference can be interpreted as an
error signal of the robot's state. Generating appropriate commands ($u_{fb}$) to minimize the
input (and thus the error signal) will lead to a control solution. The negative feedback
approach can become arbitrarily accurate when using proper command generation
functions $f_{fb}(\cdot)$. The major disadvantage of negative feedback control is the time
delay, which causes a significant loss of accuracy in the robot's movements. This control
strategy is further explained in chapter 2.3 in the context of PID-controllers. To further
minimize the error, we can add a feed-forward controller:

Figure 2.4 Negative Feedback and Feed-forward Control (block diagram: difference $\tilde{\theta}_{des} - \tilde{\theta}$ -> Feedback Controller -> $u_{fb}$; desired state $\tilde{\theta}_{des}$ -> Feedforward Controller -> $u_{ff}$; the sum $u$ of both commands -> Robot -> state $\tilde{\theta}$)
$u = f_{fb}(\tilde{\theta}_{des} - \tilde{\theta}, \alpha, t) + f_{ff}(\tilde{\theta}_{des}, \alpha, t)$ (eq. 2.4)
In figure 2.4, a computed-torque feed-forward controller is introduced to reduce the error
and thus improve the robot's performance. The feed-forward controller uses a dynamics
model of the robot, which enables it to predict the actuator commands that correspond to
a desired motion. Intuitively, the feed-forward controller moves the robot towards the
desired state in big steps and as accurately as possible. The dynamics model used in
$f_{ff}(\cdot)$ to compute the feed-forward commands ($u_{ff}$) will never be absolutely accurate: it
leaves a remaining error to be corrected by the feedback controller. The feed-forward
controller operates as an add-on module, which generates commands ($u_{ff}$) independently of the
negative feedback controller. The final command sent to the robot is thus the sum of
both commands, feed-forward and negative feedback: $u = u_{fb} + u_{ff}$.
We will have a closer look at both the negative feedback and the feed-forward
control functions in the following two chapters.
2.3. The Feedback Control Function
PID-controllers are widely used as control policy in negative feedback controllers. These
controllers correct errors in position and velocity. They are usually based on a linear
control function. The name PID-controller is derived from:

· Proportional Control ('Position Error')
· Integral Control ('Steady State Error')
· Derivative Control ('Damping')
The control function $f_{fb}$ of a PID-controller is given by
$$u_{fb} = k_P(\theta_{des} - \theta) + k_D(\dot{\theta}_{des} - \dot{\theta}) + k_I \int (\theta_{des} - \theta)\,dt$$ (eq. 2.5)
with
$u_{fb}$: the (n×1) vector of computed negative-feedback commands
$k_P$, $k_D$, $k_I$: the (n×1) gain vectors for P-, D-, and I-control
$\theta_{des} - \theta$: the (n×1) vector of errors in positions
$\dot{\theta}_{des} - \dot{\theta}$: the (n×1) vector of errors in velocities
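Equation 2.5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the controller used in the thesis: the class name `PIDController`, the rectangle-rule approximation of the integral, and the element-wise reading of the (n×1) gain vectors are assumptions made for the example.

```python
class PIDController:
    """Minimal PID feedback law in the spirit of eq. 2.5 (hypothetical helper)."""

    def __init__(self, k_p, k_d, k_i):
        # One gain per joint; the gains are assumed to act element-wise.
        self.k_p, self.k_d, self.k_i = list(k_p), list(k_d), list(k_i)
        self.integral = [0.0] * len(self.k_p)  # accumulated position error

    def command(self, pos_des, pos, vel_des, vel, dt):
        """Return the feedback command u_fb for one control step of length dt."""
        u_fb = []
        for j in range(len(self.k_p)):
            pos_err = pos_des[j] - pos[j]
            vel_err = vel_des[j] - vel[j]
            self.integral[j] += pos_err * dt  # rectangle-rule integration
            u_fb.append(self.k_p[j] * pos_err
                        + self.k_d[j] * vel_err
                        + self.k_i[j] * self.integral[j])
        return u_fb


# Two joints; the second joint is already at its target.
pid = PIDController(k_p=[10.0, 20.0], k_d=[1.0, 2.0], k_i=[0.5, 0.0])
print(pid.command([1.0, 0.0], [0.5, 0.0], [0.0, 0.0], [0.2, 0.0], dt=0.1))
```

Setting all entries of `k_i` to zero turns the sketch into the PD-controller discussed later in this chapter.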
The term $(\theta_{des} - \theta)$ in equation 2.5 becomes non-zero whenever the robot is in a
position $(\theta)$ that differs from the desired position $(\theta_{des})$. This position error
$(\theta_{des} - \theta)$, multiplied by a fixed gain ($k_P$), yields an appropriate command to compensate
for that error, provided that the gain is properly adjusted: gains that are too small cause the
system to correct errors only slowly; gains that are too high may lead to overcompensation
(and ultimately to unstable systems).
The second term, $(\dot{\theta}_{des} - \dot{\theta})$ in equation 2.5, has the same effect on the velocities:
whenever a velocity differs from the desired velocity, a correction command is
calculated as the difference multiplied by a gain $k_D$. This part of the feedback
controller introduces a damping term.
The integral part additionally corrects very small errors once a steady state is
reached. Those errors might be introduced by external forces, e.g. gravity. To understand
the term, assume the robot has reached its target, but gravity makes it drop slightly below
the desired position. The first term $(\theta_{des} - \theta)$ in equation 2.5 will provide compensation,
but only in proportion to the position error. If this error is small, or the arm is heavy, the
generated command may not succeed in moving the arm upwards. Ultimately, there will
be an equilibrium state below the desired target, at which the correction of the PD-controller
just balances gravity. The arm will not move upwards to the desired target anymore.

The integral controller tracks and accumulates the error. The integral only
stops adding to the accumulation when $(\theta_{des} - \theta)$ is zero, i.e. when the robot is exactly at
the target position. This means that it continues until the accumulated term, scaled by the
gain $k_I$, compensates for the position offset. In case the robot overshoots, the
accumulation decreases, as $(\theta_{des} - \theta)$ changes sign. With an appropriate gain $k_I$, the
correction introduced by the integral part of equation 2.5 keeps the robot exactly at the
desired position.
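The steady-state argument can be checked numerically. The following sketch assumes a single joint with unit inertia and models gravity as a constant load torque; all numerical values (gains, load, step size) are hypothetical choices for illustration. With `k_i = 0` the joint should settle below the target by `gravity_torque / k_p`; with a non-zero `k_i` the offset should vanish.

```python
def final_position_error(k_i, k_p=25.0, k_d=10.0, gravity_torque=5.0,
                         inertia=1.0, dt=0.001, duration=20.0):
    """Simulate one joint under PID control and return theta_des - theta at the end."""
    theta_des = 1.0                      # constant target position
    theta, theta_d, integral = 0.0, 0.0, 0.0
    for _ in range(int(round(duration / dt))):
        error = theta_des - theta
        integral += error * dt           # accumulated position error
        # eq. 2.5 with a desired velocity of zero
        u_fb = k_p * error + k_d * (0.0 - theta_d) + k_i * integral
        theta_dd = (u_fb - gravity_torque) / inertia  # gravity as constant load
        theta_d += theta_dd * dt         # semi-implicit Euler integration
        theta += theta_d * dt
    return theta_des - theta


print("PD  offset:", final_position_error(k_i=0.0))
print("PID offset:", final_position_error(k_i=20.0))
```

The PD case illustrates the equilibrium described above: the proportional command `k_p * error` exactly balances the load, so the remaining error is the load divided by `k_p`.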
Finding a gain $k_I$ is probably the most difficult task in this process. A poor choice
can easily lead to an unstable system with catastrophic results. Integral controllers
compensate for steady-state errors only, but this study concentrates on robot arms in motion.
Therefore, integral controllers have not been used as part of the negative feedback
control. All feedback controllers were modeled by a proportional and a derivative part
only.
Looking for appropriate gains $k_P$ and $k_D$ to design a PD-controller as shown
above, one faces a tradeoff between stiff, accurate systems and compliant ones:
· Assume high gains [4]: Whenever the desired state differs from the robot's state, the
command sent for compensation is relatively big. This ensures fast
compensation. However, if you need the robot to give way, e.g. due to a push or
when hitting an object, a high-gain controller will exert a high force to
compensate for this 'position error'.
· Assume low gains: The position error is multiplied by a smaller number. Thus,
the command sent to the robot to correct the error is significantly smaller. Hence,
the robot is much more compliant when it hits an object or is being pushed aside. On
the other hand, it also needs longer to correct 'real' position errors.
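A rough calculation illustrates the two cases. The sketch below assumes a single joint with unit inertia and a critically damped PD-controller, so that errors decay with the time constant sqrt(inertia / k_p); the gain values and the size of the push are hypothetical.

```python
import math


def pd_tradeoff(k_p, push=0.1, inertia=1.0):
    """Return (torque resisting a static push, error-decay time constant)."""
    resisting_torque = k_p * push             # proportional term of eq. 2.5
    time_constant = math.sqrt(inertia / k_p)  # 1 / natural frequency
    return resisting_torque, time_constant


for k_p in (400.0, 25.0):
    torque, time_constant = pd_tradeoff(k_p)
    print(f"k_p = {k_p:5.1f}: resisting torque {torque:5.1f}, "
          f"time constant {time_constant:.2f} s")
```

In this toy setting, raising the gain by a factor of 16 makes the joint push back 16 times harder against the same displacement, while errors are corrected only 4 times faster.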
Especially in the context of Humanoid Robotics, this tradeoff is very difficult. Humanoid
robots are supposed to work in close interaction with humans and must not hurt anyone
while doing so; but at the same time, they need to be capable of fast and accurate motion. The
approach usually taken today is to use low-gain feedback controllers and add a
feed-forward path to enhance performance. I will explain this mechanism in more detail in the
following chapters.
[4] High gains can also result in unstable systems, as the robot may increase the absolute distance to the
desired target with every discrete step. Let us assume that the gains do not exceed the maximal value that
guarantees safe operation.
2.4. The Feed-forward Control Function
As seen in chapter 2.2, the feed-forward control function uses the dynamics of a system
to generate feed-forward commands. These commands are then used to move the system
quickly along a desired trajectory:
$$u_{ff} = f_{ff}(\tilde{\theta}_{des}, \alpha, t)$$ (eq. 2.7)
The dynamics of the system is hidden in the parameter vector $\alpha$.
In general, the dynamics model describes the relationship between
the joint actuator torques and the motion of the structure. It relates the vectors of
positions ($\theta$), velocities ($\dot{\theta}$), and accelerations ($\ddot{\theta}$) to the torque vector ($\tau$). This last
vector is needed to produce the acceleration required to reach the desired state.
The direct dynamics problem assumes a given initial state of the system (position
and velocity) and an incoming stream of joint torques. It then predicts the acceleration
in all joints for times $t \ge t_0$. Updates of velocities and positions can be calculated by
integrating the accelerations. Direct dynamics is a useful model for
simulations, as it allows predictions of the physical system's motion when the exerted
joint torques are known.
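As an illustration, direct dynamics for the simplest possible case can be sketched as follows. The point-mass pendulum model (angle measured from the downward vertical), its parameters, and the semi-implicit Euler integration are assumptions for this example, not the model used in the thesis.

```python
import math


def forward_dynamics(theta, torque, mass=1.0, length=1.0, g=9.81):
    """Joint acceleration of a point-mass pendulum for a given torque."""
    return (torque - mass * g * length * math.sin(theta)) / (mass * length**2)


def simulate(theta0, theta_d0, torques, dt):
    """Integrate the accelerations for a stream of torques, starting at t0."""
    theta, theta_d = theta0, theta_d0
    trajectory = [(theta, theta_d)]
    for torque in torques:
        theta_dd = forward_dynamics(theta, torque)
        theta_d += theta_dd * dt   # update velocity from acceleration
        theta += theta_d * dt      # update position from velocity
        trajectory.append((theta, theta_d))
    return trajectory


# Released from the horizontal with no torque, the link starts to swing down.
free_swing = simulate(math.pi / 2, 0.0, [0.0] * 100, dt=0.001)
print(free_swing[-1])
```

Feeding in a torque that exactly cancels gravity (here `mass * g * length` at the horizontal position) should keep the link at rest, which is a quick sanity check for such a simulator.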
In contrast, the inverse dynamics model predicts the joint torques needed to
generate a specified motion. Solving the inverse dynamics problem allows the
execution of a desired trajectory, once the trajectory is specified in terms of positions,
velocities, and accelerations in joint space. Usually, such a trajectory is the result of an
inverse kinematics process (see chapter 2.1). Simulating the trajectory before executing the
motion can be used to verify the trajectory's feasibility: e.g., torques must not exceed a
maximal value, nor may they change abruptly.
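The feasibility check mentioned above can be sketched for a single link. The pendulum model, the torque limits, and the function names are hypothetical; the point is only to show how an inverse dynamics model turns a joint-space trajectory into torques that can be tested before execution.

```python
import math


def inverse_dynamics(theta, theta_dd, mass=1.0, length=1.0, g=9.81):
    """Torque a point-mass pendulum needs for acceleration theta_dd at angle theta."""
    return mass * length**2 * theta_dd + mass * g * length * math.sin(theta)


def trajectory_is_feasible(trajectory, dt, max_torque, max_torque_rate):
    """Check torque magnitude and torque change rate along a trajectory.

    trajectory: list of (position, velocity, acceleration) samples, dt apart.
    """
    torques = [inverse_dynamics(pos, acc) for pos, _vel, acc in trajectory]
    if any(abs(torque) > max_torque for torque in torques):
        return False
    rates = [abs(b - a) / dt for a, b in zip(torques, torques[1:])]
    return all(rate <= max_torque_rate for rate in rates)


# Desired motion: 0.5 * sin(2 t), sampled every 10 ms for 5 s.
dt = 0.01
trajectory = [(0.5 * math.sin(2.0 * k * dt),
               1.0 * math.cos(2.0 * k * dt),
               -2.0 * math.sin(2.0 * k * dt)) for k in range(500)]
print(trajectory_is_feasible(trajectory, dt, max_torque=5.0, max_torque_rate=10.0))
```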

Using the inverse dynamics model for feed-forward command predictions leads to
improved performance of the controller, compared to a 'simple' feedback controller. This
combined controller will always move the robot close to the correct position using
feed-forward commands. Then, the 'slowly adjusting' feedback part only has to compensate
for small position errors. Hence, the overall performance will increase. If we had a
perfect robot and knew its dynamics model exactly, no feedback control would be necessary
at all. But as our robot is not perfect, feedback control remains necessary, and we still
need to find the inverse dynamics model. Thus, the question of how to estimate the
dynamics model remains.
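The benefit of the combined controller can be illustrated with a small simulation. This is a sketch under stated assumptions, not the setup used in the thesis: a point-mass pendulum stands in for the robot, the inverse dynamics model deliberately underestimates the mass by 10 %, and the low PD gains, the desired sinusoidal motion, and all function names are hypothetical.

```python
import math

GRAVITY = 9.81  # gravitational acceleration [m/s^2]


def inverse_dynamics(theta, theta_dd, mass, length):
    """Model-based torque for acceleration theta_dd at angle theta."""
    return mass * length**2 * theta_dd + mass * GRAVITY * length * math.sin(theta)


def rms_tracking_error(use_feedforward, k_p=20.0, k_d=5.0, dt=0.001, duration=5.0):
    true_mass, model_mass, length = 1.0, 0.9, 1.0  # the model is 10 % off
    amp, omega = 0.5, 2.0                          # desired motion: amp * sin(omega t)
    theta, theta_d = 0.0, amp * omega              # start on the desired trajectory
    steps = int(round(duration / dt))
    squared_error = 0.0
    for k in range(steps):
        t = k * dt
        des = amp * math.sin(omega * t)
        des_d = amp * omega * math.cos(omega * t)
        des_dd = -amp * omega**2 * math.sin(omega * t)
        squared_error += (des - theta) ** 2
        u_fb = k_p * (des - theta) + k_d * (des_d - theta_d)  # low-gain PD
        u_ff = inverse_dynamics(des, des_dd, model_mass, length) if use_feedforward else 0.0
        u = u_fb + u_ff                                       # combined command
        # The "real" robot responds according to its true mass.
        theta_dd = (u - true_mass * GRAVITY * length * math.sin(theta)) / (true_mass * length**2)
        theta_d += theta_dd * dt
        theta += theta_d * dt
    return math.sqrt(squared_error / steps)


print("PD only:          ", rms_tracking_error(False))
print("PD + feed-forward:", rms_tracking_error(True))
```

Even with an imperfect model, the feed-forward path should remove most of the tracking error, leaving the feedback part to correct only the residual caused by the model mismatch.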
2.5. Estimating Dynamics using Rigid Body Assumptions
The Rigid Body Dynamics assumptions [5] are frequently used to estimate the dynamics of
a robot system [6]. The assumptions include that the system to be modeled consists of
single stiff bodies. These rigid bodies are assumed to behave like perfect 'single link
robots', being connected to form a robot with multiple degrees of freedom. In addition,
a link's motion is not supposed to have any influence on the other links and joints.
Only the link's weight and its position are modeled. The joints connecting two links are
assumed to be free of friction and position inaccuracies. The dynamics of the system is
therefore described by the shape and weight of the links, not by their joint behavior.
The general structure of rigid body dynamics, also called the 'joint space dynamic model',
can be derived in two different ways [7]:
The Lagrange formulation offers the system's dynamics in a closed form based on
the Lagrangian, which is formed from the system's kinetic and potential energy.
The Newton-Euler formulation allows describing the model in a recursive form by
enforcing a torque balance on every link. This is computationally more efficient. However,
parameter estimation for unknown systems is more difficult compared to a closed form
model. For this reason, we will have a closer look at the Lagrangian of a
mechanical system. It is defined as:
$$L = T - U$$ (eq. 2.8)
where T and U denote the total kinetic energy and the total potential energy
of the system, respectively.
[5] E.g. Sciavicco & Siciliano, 2000 or An, Atkeson & Hollerbach, 1988
[6] E.g. Sciavicco & Siciliano, 2000 or Pfeiffer & Reithmeier, 1987
Lagrange's equation is expressed by
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_i} - \frac{\partial L}{\partial \theta_i} = \xi_i, \quad i = 1, \ldots, n$$ (eq. 2.9)
where n denotes the number of joints in the robot and $\xi_i$ the generalized force associated
with the coordinate $\theta_i$. The forces include all non-conservative forces, such as the joint
actuator torques, the joint friction torques, and the joint torques induced by end-effector
forces at the contact with the environment. Equation 2.9 establishes the relation between
the generalized forces (i.e. the joint torques) and the joint positions, velocities, and
accelerations. Consequently, it is possible to derive the dynamic
model once the kinetic and potential energy of all links are known. For details of the
computation of these energies, please refer to e.g. Sciavicco & Siciliano, 2000. The
general form of the equation of motion derived by the Lagrangian method is
$$B(\theta)\ddot{\theta} + C(\theta, \dot{\theta})\dot{\theta} + G(\theta) = \tau$$ (eq. 2.10)
with:
$B$ (n × n): a positive definite inertia matrix, depending only on the robot's current position
$C$ (n × n): a matrix containing the centripetal and Coriolis forces, depending on the robot's current position and its current velocities
$G$ (n × 1): a vector of gravitational forces, depending only on the robot's current position
$\tau$ (n × 1): a vector of the active torques at the joints 1, ..., n
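As a worked instance of equations 2.8 to 2.10, consider a single link. The derivation below is a sketch under simplifying assumptions that are not taken from the thesis: the link is a point mass $m$ at distance $l$ from a frictionless joint, $\theta$ is measured from the downward vertical, and the only generalized force is the actuator torque $\tau$.

```latex
\begin{align*}
T &= \tfrac{1}{2}\, m l^2 \dot{\theta}^2, \qquad
U = -\, m g l \cos\theta, \qquad
L = T - U \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} &= m l^2 \ddot{\theta}, \qquad
\frac{\partial L}{\partial \theta} = -\, m g l \sin\theta \\
\underbrace{m l^2}_{B(\theta)}\, \ddot{\theta}
  + \underbrace{m g l \sin\theta}_{G(\theta)} &= \tau
\end{align*}
```

For a single joint there is no second velocity to couple with, so the term $C(\theta, \dot{\theta})\dot{\theta}$ is zero in this example.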
[7] The thesis provides only a brief overview of the Lagrange method; for further information, please
refer to e.g. Sciavicco & Siciliano, 2000 or Pfeiffer & Reithmeier, 1987.
A physical interpretation of the parameters in equation 2.10 gives an intuitive
understanding of the formula:
· The coefficients $b_{jj}$ (i.e. the elements on the diagonal of the matrix B) represent
the moment of inertia at joint axis j in the current manipulator configuration,
when all other joints are blocked. The coefficients $b_{ij}$ with $i \neq j$ account for the
effects of acceleration from joint i on joint j.
· The coefficients in the C matrix represent the centrifugal effect that the velocity of
one joint induces on another, as well as the Coriolis effect induced on a joint by the
velocities of two other joints.
· The terms $g_i$ (components of the vector G) represent the moment generated at
the joint i axis of the manipulator in the current configuration, due to gravity.
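The structure of equation 2.10 can be made concrete with the textbook planar two-link arm. The sketch below is an illustration only: it assumes point masses at the ends of both links, measures the first angle from the horizontal and the second relative to the first link, and uses hypothetical masses and lengths. The function names are also made up for this example.

```python
import math


def two_link_dynamics(theta, theta_d, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.81):
    """Return B (2x2), C (2x2) and G (2x1) of a planar two-link arm."""
    t1, t2 = theta
    td1, td2 = theta_d
    c2, s2 = math.cos(t2), math.sin(t2)
    # Inertia matrix: diagonal = joint inertias, off-diagonal = coupling.
    b11 = (m1 + m2) * l1**2 + m2 * l2**2 + 2.0 * m2 * l1 * l2 * c2
    b12 = m2 * l2**2 + m2 * l1 * l2 * c2
    b22 = m2 * l2**2
    B = [[b11, b12], [b12, b22]]
    # Centrifugal and Coriolis terms, depending on position and velocities.
    h = m2 * l1 * l2 * s2
    C = [[-h * td2, -h * (td1 + td2)], [h * td1, 0.0]]
    # Gravity moments at the two joint axes.
    G = [(m1 + m2) * g * l1 * math.cos(t1) + m2 * g * l2 * math.cos(t1 + t2),
         m2 * g * l2 * math.cos(t1 + t2)]
    return B, C, G


def joint_torques(theta, theta_d, theta_dd):
    """Evaluate the left-hand side of eq. 2.10 for the two-link arm."""
    B, C, G = two_link_dynamics(theta, theta_d)
    return [sum(B[i][j] * theta_dd[j] + C[i][j] * theta_d[j] for j in range(2)) + G[i]
            for i in range(2)]


# Torques needed just to hold the stretched-out arm horizontally at rest.
print(joint_torques([0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

With zero velocities and accelerations only the gravity vector G remains, which matches the interpretation of the terms $g_i$ given above.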
Some of the parameters in these matrices can be simplified during the design process of
the robot, e.g. by using massive and stiff materials for the links. Another simplification
would be the use of gearboxes with high transmission ratios in the joints: in this case, all
the coefficients in G and C can be neglected, and the B matrix becomes
almost diagonal. This means that every joint influences only itself. However, the
objective of this work is to find a model for a humanoid robot. Recalling the constraints
on humanoid robots from chapter 1.2 (e.g. compliant, lightweight), there are not many
possibilities for simplification. Gearboxes, for example, cannot be used, as they would make
the robot too stiff; massive materials would make it unbearably heavy.
No matter how simple the robot setup is designed, there still remains the problem
of estimating appropriate coefficients in B, G, and C. Modern CAD systems provide
parameter estimation based on the design data of real systems. But this estimation is not
simple, and the results are poor most of the time. Usually, the process relies on idealized
assumptions about the system's mechanics and is subject to uncertainties in the
manufacturing process. In addition, some properties of the links change during the robot's
lifetime because of material wear. CAD-based estimation therefore does not seem to be a
useful tool for designs as complex as those of humanoid robots.

Details

Pages: 114
Edition: Original edition
Year: 2001
ISBN (eBook): 9783832448585
ISBN (Paperback): 9783838648583
DOI: 10.3239/9783832448585
File size: 2.2 MB
Language: English
Institution / University: Technische Universität Berlin – Informatik
Publication date: December 2001
Grade: 1.0
Keywords: robot-motion-control, neuroinformatics, robotik