# Quasi-Monte Carlo Methods in Finance with Application to Optimal Asset Allocation


## Imprint

Mario Rometsch
Quasi-Monte Carlo Methods in Finance with Application to Optimal Asset Allocation
ISBN: 978-3-8366-1562-4
Printed by Diplomica® Verlag GmbH, Hamburg, 2008
Also: Diplomarbeit (diploma thesis), Universität Ulm, Ulm, Germany, 2007
This work is protected by copyright. All rights derived from it, in particular those of translation, reprinting, public presentation, extraction of figures and tables, broadcasting, microfilming or reproduction by other means, and storage in data-processing systems, remain reserved, even if only excerpts are used. Any reproduction of this work or of parts of it is permitted, even in individual cases, only within the limits of the applicable provisions of the copyright law of the Federal Republic of Germany in its current version, and is in principle subject to remuneration. Violations are subject to the penal provisions of copyright law.
The reproduction of common names, trade names, product designations, etc. in this work does not, even without special marking, justify the assumption that such names are to be regarded as free in the sense of trademark legislation and may therefore be used by anyone.
The information in this work has been compiled with care. Nevertheless, errors cannot be ruled out completely, and neither the Diplomarbeiten Agentur nor the authors or translators assume any legal responsibility or liability for any remaining incorrect statements and their consequences.
http://www.diplom.de, Hamburg 2008
Printed in Germany

Abstract
We introduce some quasi-Monte Carlo methods and show how to apply them to problems emerging from mathematical finance. Next, the Malliavin derivative is defined and the Clark-Ocone formula is derived. Then, following a work of Detemple, Garcia and Rindisbacher [DGR05] from 2005, the problem of optimal portfolio allocation and consumption is solved, where the optimal portfolio shares are given as expectations of certain random variables and their derivatives. We show how quasi-Monte Carlo methods can be applied to the computation of portfolio weights.

to Elisabeth Rometsch
1920 - 2005

Acknowledgment
I am deeply indebted to both my advisors, Prof. Dr. Rüdiger Kiesel and Prof. Dr. Karsten Urban, for their constant support. Without their help and inspiration, this work would not have been possible. I would also like to thank Prof. Dr. Alexander Keller for providing me with his valuable insights on quasi-Monte Carlo theory.
Special thanks go to my parents, Inge and Werner Rometsch, for putting me through university, and of course I also thank my sister, Jana Rometsch, for being such an important person for me.
I would like to express my gratitude to my scholarship sponsor, the "Deutsche Forschungsgemeinschaft", for allowing me to work as a research student for over one and a half years now in the "Graduiertenkolleg 1100" here at the University of Ulm.
I am grateful to Daniel Bauer and Shaohui Wang for helping me out with some questions that emerged during this work.
I would also like to thank Michael Lehn and Alexander Stippler for providing me with FLENS, the Flexible Library for Efficient Numerical Solutions. With such an exceptionally helpful linear algebra software package for C++, I saved much implementation time while using the full spectrum of computer vector arithmetic. Furthermore, I thank the "Institute for Numerical Mathematics" for letting me use their computer pool.
I am very thankful to my colleagues Wolfgang Högerle, Dennis Schätz and Jens Stephan for proofreading this thesis numerous times.
I would also like to thank the open source community for providing such excellent-quality software products as (in no particular order of preference) Ubuntu Linux, GCC, Eclipse, QuantLib, Gnuplot, Unison, LaTeX and Gnome.
Finally, I would like to express my deepest gratitude for the constant support, understanding and love that I received from my beloved mate Sabrina. You make the sun shine every day.

Contents

Introduction

1 Monte Carlo and quasi-Monte Carlo methods
  1.1 Numerical integration
  1.2 Evaluation of integrals with Monte Carlo methods
  1.3 Quasi-Monte Carlo methods
    1.3.1 Introduction
    1.3.2 Discrepancy
    1.3.3 The Koksma-Hlawka inequality
  1.4 Classical constructions
    1.4.1 One-dimensional sequences
    1.4.2 Multi-dimensional sequences
  1.5 (t,m,s)-nets and (t,s)-sequences
    1.5.1 Variance reduction
    1.5.2 Nets and sequences
    1.5.3 Two constructions for (t,s)-sequences
  1.6 Digital nets and sequences
  1.7 Lattice rules
  1.8 The curse of dimension revisited
    1.8.1 Padding techniques
    1.8.2 Latin Supercube sampling
  1.9 Time consumption of the various point generators
  1.10 Quasi-Monte Carlo methods in finance
    1.10.1 Example: Arithmetic option
    1.10.2 Path generation
    1.10.3 Sampling size
    1.10.4 Results

2 Malliavin Calculus
  2.1 Wiener-Itô chaos expansion
  2.2 Skorohod integral
  2.3 Differentiation of random variables
  2.4 Examples of Malliavin derivatives
  2.5 The Clark-Ocone formula
  2.6 The generalized Clark-Ocone formula
  2.7 Multivariate Malliavin Calculus

3 Asset Allocation
  3.1 Problem formulation
    3.1.1 Financial market model
    3.1.2 Wealth process
    3.1.3 Expected utility
    3.1.4 Portfolio problem
    3.1.5 Equivalent static problem
    3.1.6 Optimal portfolio
  3.2 Solution of the portfolio problem
    3.2.1 Optimal portfolio
    3.2.2 Optimal portfolio with constant relative risk aversion (CRRA)

4 Implementation
  4.1 A single state variable model with explicit solution
  4.2 Simulation-based approach
  4.3 SDE system as multidimensional SDE
  4.4 Error analysis
    4.4.1 Discretisation error
    4.4.2 Conditional expectation approximation error
  4.5 Numerical results
    4.5.1 One year time horizon
    4.5.2 Two year time horizon
    4.5.3 Five year time horizon
    4.5.4 Experiments with a small time horizon

Conclusion

Summary

List of Figures

1.1 The quasi-random sequence fills the unit square more equally than the pseudo-random sequence.
1.2 First 1000 points of the 30-dimensional Halton sequence.
1.3 An example of stratified sampling.
1.4 An example of Latin hypercube sampling.
1.5 Example of elementary intervals in base 2, with volume 1/16.
1.6 A (0, 4, 2)-net in base 2.
1.7 First 1024 points of the 100-dimensional Sobol' sequence.
1.8 Six steps of a Brownian Bridge construction.
1.9 3D view of a Brownian Bridge construction.
1.10 Monte Carlo convergence order is as expected.
1.11 The Sobol' sequence without integrand transformation does not achieve a higher convergence order.
1.12 The Sobol' sequence with a Brownian bridge construction gives better results.
1.13 The Sobol' sequence with Principal Component Analysis attains optimal convergence order.
1.14 The rank-1 lattice also attains optimal convergence order.
2.1 A sample path of the Wiener process, and the path perturbed at t = 0.3.
4.1 A sample path of the market price of risk and the corresponding sample path of the stock.
4.2 The dimensionality resulting from the number of simulated trajectories.
4.3 One year time horizon. Two Monte Carlo computations with various proportions of paths and time discretisation steps.
4.4 One year time horizon. Three quasi-Monte Carlo computations with various proportions of paths and time discretisation steps.
4.5 One year time horizon. Computation with the Sobol' sequence with various proportions of paths and time discretisation steps.
4.6 One year time horizon. Computation with the Sobol' sequence in combination with the Milshtein approximation.
4.7 One year time horizon. Direct comparison of two quasi-Monte Carlo and a Monte Carlo scheme.
4.8 Two year time horizon. The Monte Carlo scheme converges as expected.
4.9 Two year time horizon. The Monte Carlo method is again outclassed by the Sobol' sequence in combination with a Brownian Bridge.
4.10 Two year time horizon. The Sobol' sequence in combination with a Brownian Bridge construction converges faster than the Sobol' sequence with a PCA construction.
4.11 Two year time horizon. Computation with the Sobol' sequence in combination with the Milshtein scheme.
4.12 Two year time horizon. The Latin Supercube sampling from a Sobol' sequence shows no advantage compared to Monte Carlo sampling.
4.13 Five year time horizon. The Monte Carlo scheme converges as expected.
4.14 Five year time horizon. The Monte Carlo method is again outclassed by the Sobol' sequence in combination with a Brownian Bridge construction.
4.15 Five year time horizon. Again, the PCA construction does not result in a higher convergence order.
4.16 Five year time horizon. Computation with the Sobol' sequence and the Milshtein scheme.
4.17 Five year time horizon. The Latin Supercube sampling gives no advantage over Monte Carlo integration.
4.18 Small time horizon. The Monte Carlo scheme keeps its convergence order.
4.19 Small time horizon. The Sobol' sequence with the PCA construction achieves nearly optimal convergence.

Introduction

Portfolio optimization is a widely studied problem in finance. The common question is how a small investor should invest his wealth in the market to attain certain goals, like a desired payoff or some insurance against unwished-for events.

The starting point for the mathematical treatment of this is the work of Harry Markowitz in the 1950s. His idea was to set up a relation between the mean return of a portfolio and its variance. In his terminology, an efficient portfolio has minimal variance of return among all portfolios with the same mean rate of return. Furthermore, if linear combinations of efficient portfolios and a riskless asset are allowed, this leads to the market portfolio, so that a linear combination of the risk-free asset and the market portfolio dominates any other portfolio in the mean-variance sense.

Later, this theory was extended, resulting in the CAPM, or capital asset pricing model, which was independently introduced by Treynor, Sharpe, Lintner and Mossin in the 1960s. In this model, every risky asset has a mean rate of return that exceeds the risk-free rate by a specific risk premium, which depends on a certain attribute of the asset, namely its β. The so-called β in turn is the covariance of the asset return with the market return, normalized by the variance of the market portfolio. The problem of the CAPM is its static nature: investments are made once and then the state of the model changes. Due to this and other simplifications, this model was and is often not found to be realistic.

A major impulse for this research field came from the two papers of Robert Merton in 1969 [Mer69] and 1971 [Mer71]. He applied the theory of Itô calculus and stochastic optimal control and solved the corresponding Hamilton-Jacobi-Bellman equation. For his multiperiod model, he assumed constant coefficients and an investor with power utility. Extending the mean-variance analysis, he found that a long-term investor would prefer a portfolio that includes hedging components to protect against fluctuations in the market. Again, this approach was generalized by numerous researchers and results in the problem of solving a nonlinear partial differential equation.

The next milestone in this series is the work by Cox and Huang [CH89] from 1989, where they solve for "Optimal Consumption and Portfolio Policies when Asset Prices Follow a Diffusion Process". They apply the martingale technique to get rid of the nonlinear PDE and rather solve a linear PDE. This, with several refinements, is nowadays a standard method for asset allocation in a complete market.

At approximately the same time, in 1991, Ocone and Karatzas published two papers, [OK91] and [KOL91], in which they extend the Clark-Haussmann representation formula to Wiener functionals under an equivalent measure. They use the martingale method for selecting the optimal terminal wealth and the optimal consumption, respectively, and then apply the representation theorem to derive the optimal portfolio strategy. This expression involves expectations of random variables depending on the interest rate, the market prices of risk and unspecified derivatives of these state variables.
Because these derivatives are left unspecified, the hedging terms of the resulting portfolio do not have an explicit form and are rather difficult to evaluate numerically. Therefore, the recent literature focused more on models with state variable specifications for which closed-form solutions are available, for example [Wac02], or on specifications which, as mentioned above, rely on dynamic programming.

One main part of this thesis is to present the recent work of Detemple, Garcia and Rindisbacher, [DGR03a] and [DGR05], in which they take up the ideas of Ocone and Karatzas. They solve for the optimal portfolio allocation in a complete market model and derive explicit expressions for the hedging terms. The optimal portfolio shares are found to be expectations of random variables, which allows a simulation-based approach. In contrast to the dynamic programming approach, this method is capable of handling complex, nonlinear dynamics for a large number of state variables.
The calculation of the optimal portfolio shares then basically reduces to a high-dimensional integration over the unit hypercube. Perhaps because of its easy and straightforward implementation, this is most often done with Monte Carlo methods. Niederreiter [Nie92] notes that the official emergence of the Monte Carlo method was a paper of Nicholas Metropolis and Stanislaw Ulam in 1949, but at that time, pseudo-random algorithms had already been used for years in secret projects of the U.S. Department of Defense, like the Manhattan project. Since then, these methods have become the most widely used tools for computing high-dimensional integrals.

Although they have appealing features like weak regularity conditions, all these Monte Carlo or pseudo-random methods suffer from one severe shortcoming: their convergence order stems from the central limit theorem, hence they converge very slowly.

The quasi-Monte Carlo method now tries to overcome this drawback while keeping the applicability to high-dimensional integration. This is achieved by sampling from very carefully chosen deterministic points. The methods and algorithms are then called not pseudo-random but quasi-random. The term "quasi-Monte Carlo method" appeared for the first time in a technical report from 1951, and Niederreiter's outstanding monograph [Nie92] cites many works and applications to problems like the numerical solution of integral equations or integro-differential equations. Quasi-Monte Carlo applications to problems from the finance setting came up in the nineties with L'Ecuyer et al. This topic composes the second part of this work, in which we apply some concepts from quasi-Monte Carlo theory to problems that emerge in mathematical finance.
In this thesis, we derive the optimal portfolio formula on the basis of the proofs in [DGR03a] and [DGR05]. The main contribution is then the application of some quasi-Monte Carlo methods to its computation. Using a simple model with an exact solution, different schemes were tested; they showed that substituting the quasi-random number generator for the pseudo-random one requires special care.

This thesis is organized as follows. In Chapter 1, we present some concepts from quasi-Monte Carlo theory and show how these methods can be applied to price a simple financial derivative, an arithmetic call option. In Chapter 2, we follow the lecture notes [Øks97] and present an introduction to the "stochastic calculus of variations", or Malliavin Calculus, which culminates in the definition of the Malliavin derivative and the derivation of the Clark-Ocone formula.

Chapter 1: quasi-Monte Carlo methods
Chapter 2: Malliavin Calculus
Chapter 3: Asset Allocation
Chapter 4: Implementation

The reader wishing a quick start to the asset allocation problem may thus jump directly to Chapter 3. There, we will introduce the financial market model and formulate the optimization problem. We will follow the work of Detemple, Garcia and Rindisbacher [DGR03a] and [DGR05] and derive the optimal portfolio formula. However, for this we will need the calculation rules from Chapter 2, so if that chapter is skipped, the optimal portfolio formula has to be taken for granted. The implementation in Chapter 4 will then apply these formulas to a simple model with an explicit solution, and we show how the quasi-Monte Carlo methods from Chapter 1 can be used to improve the efficiency over plain Monte Carlo methods.

1 Monte Carlo and quasi-Monte Carlo
methods
To get a feeling for the integration techniques now commonly known as "quasi-Monte Carlo methods", we will first introduce some basics from numerical integration and classical Monte Carlo methods. After that, we will illustrate the idea behind quasi-random algorithms and present some integration rules. At the end of this chapter, we show how a simple financial derivative can be priced with these techniques. We won't go too much into theoretical detail here; the reader wishing a more thorough overview is referred to the great book of Harald Niederreiter [Nie92]. We will follow this book in our introduction.
1.1 Numerical integration

The evaluation of integrals in dimension s is a standard problem of numerical analysis. For s = 1 there are some classical numerical methods as, for example, the trapezoidal rule:

$$\int_{[0,1]} f(x)\,dx \;\approx\; \sum_{i=0}^{n} \alpha_i\, f\!\left(\frac{i}{n}\right) \qquad\text{with}\qquad \alpha_i = \begin{cases} \dfrac{1}{2n} & \text{if } i = 0 \text{ or } i = n,\\[4pt] \dfrac{1}{n} & \text{if } 0 < i < n. \end{cases} \tag{1.1}$$

Now if f is twice continuously differentiable on the interval (0, 1), the quadrature error is of magnitude $O(n^{-2})$. In higher dimensions, the integration can be carried out by means of an iteration of one-dimensional integration rules. Therefore, the node set is a product of one-dimensional node sets, and the weights are products of weights from one-dimensional integration. The s-dimensional trapezoidal rule is defined as

$$\int_{[0,1]^s} f(x)\,dx \;\approx\; \sum_{i_1=0}^{n} \cdots \sum_{i_s=0}^{n} \alpha_{i_1} \cdots \alpha_{i_s}\, f\!\left(\frac{i_1}{n}, \ldots, \frac{i_s}{n}\right), \tag{1.2}$$

with the $\alpha_i$ as in (1.1) and overall $N = (n+1)^s$ node points. If f is again twice continuously differentiable in every coordinate, the error behaves like $O(n^{-2})$, similar to the one-dimensional case. Expressed in the total number of nodes, $n \approx N^{1/s}$, so the error becomes $O(N^{-2/s})$. This expression fatefully depends on the dimension s (this is called the "curse of dimension"): e.g. to get an overall accuracy of two decimal digits, one has to use roughly a total of $10^s$ nodes!
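To make the curse of dimension concrete, the product rule (1.2) can be sketched in a few lines of NumPy; the helper names below are ours, not from the thesis. For the multilinear integrand $f(x) = \prod_i x_i$ the trapezoidal rule is even exact, so the example isolates the node-count explosion rather than the quadrature error:

```python
import numpy as np

def trapezoid_weights(n):
    # Weights alpha_i from (1.1): 1/(2n) at both endpoints, 1/n in between.
    w = np.full(n + 1, 1.0 / n)
    w[0] = w[-1] = 1.0 / (2.0 * n)
    return w

def trapezoid_product_rule(f, n, s):
    # s-dimensional product rule (1.2) with N = (n+1)**s nodes.
    x = np.arange(n + 1) / n
    w = trapezoid_weights(n)
    grids = np.meshgrid(*([x] * s), indexing="ij")
    weights = np.ones((n + 1,) * s)
    for axis in range(s):          # outer product of the 1-d weight vectors
        shape = [1] * s
        shape[axis] = n + 1
        weights *= w.reshape(shape)
    return float(np.sum(weights * f(*grids)))

# Integral of x*y*z over [0,1]^3 is 1/8; n = 8 already needs (8+1)**3 = 729 nodes.
f = lambda *xs: np.prod(xs, axis=0)
approx = trapezoid_product_rule(f, n=8, s=3)
```

Reaching the same per-axis resolution in s = 10 dimensions would require $9^{10} \approx 3.5 \cdot 10^9$ nodes, which is exactly the explosion the text describes.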

1.2 Evaluation of integrals with Monte Carlo methods

Monte Carlo methods are a technique that can be used to evaluate high-dimensional integrals without the "curse of dimension" mentioned above. To this end, the integrand is interpreted as a transformation of an s-dimensional random variable that is uniformly distributed over the unit cube:

$$\int_{[0,1]^s} f(x)\,dx = \int f(X)\,dP = E[f(X)], \qquad\text{with } X \sim U[0,1]^s.$$

The integral can then be calculated by random sampling,

$$E[f(X)] \;\approx\; \frac{1}{N} \sum_{i=1}^{N} f(x_i), \tag{1.3}$$

where the $x_i$ are realizations of $X_i \sim U[0,1]^s$. The theoretical foundation for this is given by the strong law of large numbers:

Theorem 1.2.1 (Strong Law of Large Numbers)
Let $X_1, X_2, \ldots$ be i.i.d. random variables and $f \in L^2([0,1]^s)$. Then we have with probability 1

$$\frac{1}{N} \sum_{i=1}^{N} f(X_i) \;\xrightarrow{\,N \to \infty\,}\; E[f(X_1)].$$

Proof. The proof can be found in most books about probability or statistics.

For the integration error in Theorem 1.2.1 one can show that for $f \in L^2([0,1]^s)$ it holds that

$$\int_{[0,1]^s} \cdots \int_{[0,1]^s} \left( \frac{1}{N} \sum_{i=1}^{N} f(x_i) - E[f(X_1)] \right)^{\!2} dx_1 \cdots dx_N \;=\; \frac{\sigma^2(f)}{N},$$

where

$$\sigma^2(f) = \int_{[0,1]^s} \big( f(x) - E[f(X_1)] \big)^2\, dx.$$

So the error is on average $\sigma(f)\, N^{-1/2}$, and summing up the above we get the Monte Carlo estimate

$$\int_{[0,1]^s} f(x)\,dx \;\approx\; \frac{1}{N} \sum_{i=1}^{N} f(X_i),$$

where $X_1, X_2, \ldots, X_N$ are i.i.d. random variables, $X_i \sim U[0,1]^s$, with the probabilistic error bound $O(N^{-1/2})$. The important fact is that this error does not depend on the dimension s, as was the case for the product rule. So the Monte Carlo method outclasses the trapezoidal rule approximately from dimension s = 5 and beyond. Having said this, we have to immediately point out the drawbacks of the Monte Carlo method:

· There are only probabilistic error bounds.
· Generating (really) random samples with a computer poses some problems.
· The regularity of the integrand is not reflected by the error bound.

The problem with random sampling on a computer has been studied widely, with two possible outcomes: either one takes a so-called pseudo-random number generator like the Mersenne Twister¹, which produces a deterministic sequence of numbers designed to pass some statistical tests for randomness, or one could use a true random number generator that produces true random numbers, for example based on the noise of a resistor. Both of these methods have their disadvantages. With this short summary of Monte Carlo methods, let us now pass to the quasi-Monte Carlo method.
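A minimal sketch of the estimate (1.3) together with its $\sigma(f)\,N^{-1/2}$ standard error, assuming NumPy; we seed the Mersenne Twister (MT19937), the generator the footnote says is used for the Monte Carlo calculations. The helper is our own illustration, not code from the thesis:

```python
import numpy as np

def mc_integrate(f, s, N, rng):
    # Plain Monte Carlo estimate (1.3) plus the empirical standard error
    # sigma(f) / sqrt(N), which does not depend on the dimension s.
    x = rng.random((N, s))                  # N realizations of X_i ~ U[0,1]^s
    y = f(x)
    estimate = y.mean()                     # (1/N) * sum_i f(x_i)
    std_error = y.std(ddof=1) / np.sqrt(N)
    return estimate, std_error

# NumPy exposes the Mersenne Twister as the MT19937 bit generator.
rng = np.random.Generator(np.random.MT19937(seed=42))

# E[prod_i x_i] over [0,1]^10 equals 2**-10; note s = 10 poses no problem.
f = lambda x: np.prod(x, axis=1)
estimate, std_error = mc_integrate(f, s=10, N=200_000, rng=rng)
```

Doubling the accuracy requires quadrupling N, which is the slow $O(N^{-1/2})$ convergence criticized above.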
1.3 Quasi-Monte Carlo methods

1.3.1 Introduction

The error in the Monte Carlo method is of order $O(N^{-1/2})$ on the average over all node sets. Therefore, there must exist some node sets which perform significantly better than the average. The quasi-Monte Carlo approach now takes the deterministic nature of the pseudo-random numbers in the Monte Carlo method for granted and tries to construct special node sets which perform much better than the average. So, for short, the quasi-Monte Carlo approximation of the integral is

$$\int_{[0,1]^s} f(x)\,dx \;\approx\; \frac{1}{N} \sum_{i=1}^{N} f(x_i), \tag{1.4}$$

with the only difference that the $x_i$ are now carefully chosen deterministic points from a so-called low-discrepancy sequence.
1.3.2 Discrepancy

Think of the points $x_i$ in (1.4) as part of an infinite sequence $x_1, x_2, \ldots$ of points in $[0,1]^s$. The first requirement on such a sequence is that it yields a convergent method:

$$\lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i) \;=\; \int_{[0,1]^s} f(x)\,dx \tag{1.5}$$

for f from some function space, say $f \in C([0,1]^s)$. So the first reasonable condition would be that the $x_i$ are uniformly distributed over $[0,1]^s$, which is equivalent to²

$$\lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_A(x_i) \;=\; \lambda(A)$$

for all subintervals A of $[0,1]^s$, where $\lambda$ denotes the Lebesgue measure. So the empirical distribution of the node point sequence should be as close as possible to the uniform distribution. This leads to the two following measures.

¹ We will use this pseudo-random number generator for our Monte Carlo calculations.
² According to [Nie92], this is shown in [KN74].
Definition 1.3.1 (star discrepancy)
We define the star discrepancy $D_N^*(P) = D_N^*(x_1, \ldots, x_N)$ of a point set P as

$$D_N^*(P) \;=\; \sup_{y \in [0,1]^s} \left| \prod_{i=1}^{s} y_i - \frac{\#\big\{ P \cap \times_{i=1}^{s} [0, y_i] \big\}}{N} \right| \;=\; \sup_{y \in [0,1]^s} \left| F_U(y) - \hat{F}_{x_1,\ldots,x_N}(y) \right|,$$

where $\#\{A\}$ is the counting function of the set A, $F_U(y)$ is the distribution function of the s-dimensional uniform distribution and $\hat{F}_{x_1,\ldots,x_N}(y)$ is the empirical distribution function of the point set $(x_1, \ldots, x_N)$.
Definition 1.3.2 (extreme discrepancy)
The extreme discrepancy $D_N(P) = D_N(x_1, \ldots, x_N)$ of a point set P is defined as

$$D_N(P) \;=\; \sup_{\substack{0 \le a_i \le b_i \le 1 \\ i=1,\ldots,s}} \left| \prod_{i=1}^{s} (b_i - a_i) - \frac{\#\big\{ P \cap \times_{i=1}^{s} [a_i, b_i] \big\}}{N} \right|.$$
Niederreiter [Nie92] shows that both discrepancies may be bounded by each other,

$$D_N^*(P) \;\le\; D_N(P) \;\le\; 2^s\, D_N^*(P),$$

so it suffices to concentrate our interest on the first one. If we look at the one-dimensional interval [0, 1], then in order to minimize the star discrepancy of the N-point set $P = (x_1, \ldots, x_N)$ we would choose $x_i = \frac{2i-1}{2N}$, yielding a star discrepancy of $D_N^* = \frac{1}{2N}$.

For a sequence $S = (x_1, x_2, \ldots)$ in [0, 1], this doesn't hold anymore. There, the star discrepancy is bounded from below with

$$D_N^*(S) = D_N^*(x_1, \ldots, x_N) \;\ge\; c\, N^{-1} \log N \qquad \text{for infinitely many } N.$$

In dimensions higher than 1, Niederreiter states that it is widely believed that the star discrepancy is also bounded from below by

$$D_N^*(S) = D_N^*(x_1, \ldots, x_N) \;\ge\; c\, N^{-1} (\log N)^s \qquad \text{for infinitely many } N,$$

so the best star discrepancy that can be achieved is $O(N^{-1} (\log N)^s)$ for a sequence and $O(N^{-1} (\log N)^{s-1})$ for a point set, quantifying the term low-discrepancy.
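The one-dimensional claim can be checked numerically. The sketch below relies on the classical closed form for s = 1 (cf. [Nie92]), $D_N^*(x_1,\ldots,x_N) = \frac{1}{2N} + \max_i \left| x_{(i)} - \frac{2i-1}{2N} \right|$ for the sorted points $x_{(i)}$; the function name is our own:

```python
import numpy as np

def star_discrepancy_1d(points):
    # Closed form for s = 1: sort the points, then
    # D*_N = 1/(2N) + max_i |x_(i) - (2i-1)/(2N)|.
    x = np.sort(np.asarray(points, dtype=float))
    N = len(x)
    centers = (2.0 * np.arange(1, N + 1) - 1.0) / (2.0 * N)
    return 1.0 / (2.0 * N) + float(np.max(np.abs(x - centers)))

N = 100
optimal = (2.0 * np.arange(1, N + 1) - 1.0) / (2.0 * N)  # x_i = (2i-1)/(2N)
d_opt = star_discrepancy_1d(optimal)                     # equals 1/(2N)

rng = np.random.default_rng(0)
d_rand = star_discrepancy_1d(rng.random(N))              # strictly larger
```

For uniform random points the closed form typically yields a value on the order of $N^{-1/2}$, far above the optimal $\frac{1}{2N}$, which is the whole motivation for the deterministic constructions below.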

Definition 1.3.3 (low-discrepancy point set)
If the star discrepancy of a point set is of order $O\big(N^{-1} (\log N)^{s-1}\big)$, then the point set is called a low-discrepancy point set.

Definition 1.3.4 (low-discrepancy sequence)
If the star discrepancy of a sequence is of order $O\big(N^{-1} (\log N)^{s}\big)$, then the sequence is called a low-discrepancy sequence.
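As a concrete first example of such a sequence, the following sketch generates the van der Corput sequence in base 2, the classical one-dimensional low-discrepancy construction taken up in Section 1.4.1 (the implementation is ours):

```python
def van_der_corput(i, base=2):
    # Radical inverse: mirror the base-b digits of i around the radix point,
    # e.g. i = 6 = (110)_2 becomes (0.011)_2 = 3/8.
    q, bk = 0.0, 1.0 / base
    while i > 0:
        i, d = divmod(i, base)
        q += d * bk
        bk /= base
    return q

# The first points in base 2 fill [0,1) in an evenly spread order.
seq = [van_der_corput(i) for i in range(6)]  # 0, 1/2, 1/4, 3/4, 1/8, 5/8
```

Each new point lands in the largest remaining gap, which is why the empirical distribution stays close to uniform for every prefix of the sequence.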
1.3.3 The Koksma-Hlawka inequality

We will now state an error bound which directly links the star discrepancy and the quadrature error. For this we need some definitions that characterize the smoothness of the integrand.

Definition 1.3.5 (Variation in the sense of Vitali)
Let f be a function on $[0,1]^s$ and $A \subseteq [0,1]^s$ a rectangle of the form $A = \times_{i=1}^{s} [a_i, b_i)$. Define $\Delta(f; A)$ as an alternating sum of the values of f at the vertices of A:

$$\Delta(f; A) \;=\; \sum_{j_1=0}^{1} \cdots \sum_{j_s=0}^{1} (-1)^{\sum_{k=1}^{s} j_k}\, f\big(j_1 a_1 + (1 - j_1) b_1, \ldots, j_s a_s + (1 - j_s) b_s\big).$$

For $\mathcal{P}$ a partition of $[0,1]^s$ into subintervals of the form A, the variation of f on $[0,1]^s$ in the sense of Vitali is defined as

$$V^{(s)}(f) \;=\; \sup_{\mathcal{P}} \sum_{A \in \mathcal{P}} |\Delta(f; A)|.$$

This is indeed a measure for the variability of the function f, because if f is, for example, s-times continuously differentiable on $(0,1)^s$, then another expression for the variation in the sense of Vitali is available³:

$$V^{(s)}(f) \;=\; \int_0^1 \cdots \int_0^1 \left| \frac{\partial^s f}{\partial x_1 \cdots \partial x_s} \right| dx_1 \cdots dx_s.$$
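The equivalence of the two expressions can be sanity-checked numerically. For $f(x, y) = xy$ the mixed derivative is identically 1, so $V^{(2)}(f) = 1$; since f is multilinear, the alternating sums over any uniform partition already attain this value. A sketch with our own helper names:

```python
from itertools import product

def alternating_sum(f, a, b):
    # Delta(f; A) from Definition 1.3.5: signed sum of f over the 2^s
    # vertices of the rectangle A = x_{i=1}^s [a_i, b_i).
    s = len(a)
    total = 0.0
    for j in product((0, 1), repeat=s):
        vertex = tuple(jk * ak + (1 - jk) * bk for jk, ak, bk in zip(j, a, b))
        total += (-1) ** sum(j) * f(vertex)
    return total

def vitali_variation_on_grid(f, s, m):
    # Sum |Delta(f; A)| over the uniform partition of [0,1]^s into m^s cells;
    # the supremum over all partitions is approached from below by such sums.
    h = 1.0 / m
    total = 0.0
    for cell in product(range(m), repeat=s):
        a = tuple(c * h for c in cell)
        b = tuple((c + 1) * h for c in cell)
        total += abs(alternating_sum(f, a, b))
    return total

# f(x, y) = x*y has mixed derivative 1, so both expressions give V^(2)(f) = 1.
v = vitali_variation_on_grid(lambda x: x[0] * x[1], s=2, m=8)
```

Here each cell contributes exactly its area $h^2$, and the 64 contributions sum to 1, matching the integral of $|\partial^2 f / \partial x \partial y|$.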
For our error bound, we need an extension of this measure.
Definition 1.3.6 (Variation in the sense of Hardy and Krause)
Let again $f$ be defined on $[0,1]^s$. For $1 \le k \le s$ and $1 \le i_1 < \dots < i_k \le s$, let $V^{(k)}(f; i_1, \dots, i_k)$ be the variation in the sense of Vitali of $f$ restricted to the $k$-dimensional face $\{(x_1, \dots, x_s) \in [0,1]^s : x_j = 1 \text{ for } j \neq i_1, \dots, i_k\}$. Then
\[ V(f) = \sum_{k=1}^{s} \; \sum_{1 \le i_1 < \dots < i_k \le s} V^{(k)}(f; i_1, \dots, i_k) \]
is called the variation of $f$ on $[0,1]^s$ in the sense of Hardy and Krause.

If $V(f)$ is finite, then $f$ is said to be of bounded variation in this sense. With this notion of variation, we can state the main theorem of this chapter.
Theorem 1.3.7 (Koksma-Hlawka inequality)
If $f$ has bounded variation on $[0,1]^s$ in the sense of Hardy and Krause, then for $x_1, x_2, \dots, x_N \in [0,1]^s$ it holds that
\[ \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_{[0,1]^s} f(x) \, dx \right| \le V(f) \, D_N^*(x_1, \dots, x_N). \]
This inequality is sharp, because for any $x_1, \dots, x_N \in [0,1]^s$ and any $\varepsilon > 0$, there exists a function $f \in C^\infty([0,1]^s)$ with $V(f) = 1$ and
\[ \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_{[0,1]^s} f(x) \, dx \right| > D_N^*(x_1, \dots, x_N) - \varepsilon. \]
So with this inequality, we can address two problems of the Monte Carlo method: we no longer need random samples, and the error is bounded deterministically, not only with a certain probability. However, in practice we cannot expect the Koksma-Hlawka inequality to act as an error estimator, because the terms on the right-hand side are much harder to evaluate than the integral itself.
If we take for granted that for a given integrand we cannot change its variation, this error bound leads to the conclusion that the star discrepancy of the point set from which we sample should be as small as possible. In the next section, we introduce some specific constructions for such point sets.
1.4 Classical constructions
Before we come to more recent constructions of low-discrepancy point sets, we give a short overview of some classical results.
1.4.1 One-dimensional sequences
Although one-dimensional quasi-Monte Carlo methods are not very important from the practical point of view, we will present some concepts, as they are the basis for some multi-dimensional constructions. A starting point is the following function, which maps integers to points in the unit interval.
Definition 1.4.1 (radical inverse function)
Let $b \ge 2$ denote an integer base; then every integer $n$ has a unique digit representation
\[ n = \sum_{i=0}^{\infty} a_i b^i \]
in base $b$, where the sum is actually a finite sum. The radical inverse function $\phi_b(n)$ is now just the reflection of this representation about the decimal point, more formally
\[ \phi_b(n) = \sum_{i=0}^{\infty} a_i b^{-i-1}, \]
which is actually again a finite sum.
It is easy to see that the image of the radical inverse function lies in $[0, 1)$, and so we arrive at the
Definition 1.4.2 (Van-der-Corput sequence in base b)
Let $b \ge 2$ denote an integer base. Then the sequence $x_1, x_2, \dots$ with $x_n = \phi_b(n)$ is called the Van-der-Corput sequence in base $b$.
The Van-der-Corput sequence fills the unit interval with points in a maximally balanced way. We demonstrate this with an example of the Van-der-Corput sequence in base 2:

    i    phi_2(i)
    1    0.5
    2    0.25
    3    0.75
    4    0.125
    5    0.625
    6    0.375
    7    0.875
    8    0.0625
    9    0.5625
    ...  ...
All Van-der-Corput sequences are low-discrepancy sequences, i.e. for all bases $b \ge 2$, the star discrepancy of the first $N$ terms is $O(N^{-1} \log N)$. Although they are one-dimensional sequences, this concept can be easily extended to multiple dimensions.
1.4.2 Multi-dimensional sequences
Definition 1.4.3 (Halton sequence)
With the dimension $s \ge 1$, let $b_1, b_2, \dots, b_s$ be integers, $b_i \ge 2$, and define the Halton sequence by
\[ x_n = \bigl( \phi_{b_1}(n), \dots, \phi_{b_s}(n) \bigr) \]
with points in $[0, 1)^s$.

Figure 1.1: (a) 1000 points from the Mersenne Twister; (b) 1000 points from the Halton sequence, dimensions 1 and 2. The quasi-random sequence fills the unit square more evenly than the pseudo-random sequence.
If the bases $b_i$ are pairwise relatively prime, Niederreiter (Theorem 3.6, [Nie92]) shows that the star discrepancy of the Halton sequence is bounded by
\[ D_N^*(x_1, x_2, \dots, x_N) < \frac{s}{N} + \frac{1}{N} \prod_{i=1}^{s} \left( \frac{b_i - 1}{2 \log b_i} \log N + \frac{b_i + 1}{2} \right), \quad N \ge 1, \]
and if the $b_i$ are the first prime numbers, he finds that
\[ D_N^*(x_1, x_2, \dots, x_N) \le A(b_1, \dots, b_s) N^{-1} (\log N)^s + O\bigl( N^{-1} (\log N)^{s-1} \bigr) \]  (1.6)
with
\[ A(b_1, \dots, b_s) = \prod_{i=1}^{s} \frac{b_i - 1}{2 \log b_i}. \]
A little modification of the Halton sequence results in the Hammersley point set, which has a smaller star-discrepancy bound than the Halton sequence but comes with a fixed number of points.
Definition 1.4.4 (Hammersley point set)
Again, $s \ge 2$ defines the dimension, and let $b_1, \dots, b_{s-1}$ be integers, $b_i \ge 2$. Then the $N$-element point set $(x_1, \dots, x_N)$ with $x_n$ defined by
\[ x_n = \left( \frac{n}{N}, \phi_{b_1}(n), \dots, \phi_{b_{s-1}}(n) \right), \quad n = 0, 1, \dots, N - 1, \]
is called the $N$-element Hammersley point set in the bases $b_1, \dots, b_{s-1}$.
The star discrepancy of the Hammersley point set is again bounded from above, and if the $b_i$ are pairwise relatively prime we get
\[ D_N^*(x_1, \dots, x_N) < \frac{s}{N} + \frac{1}{N} \prod_{i=1}^{s-1} \left( \frac{b_i - 1}{2 \log b_i} \log N + \frac{b_i + 1}{2} \right). \]

Similarly, with $b_1, b_2, \dots, b_{s-1}$ the first prime numbers, we have (Theorem 3.8, [Nie92])
\[ D_N^*(x_1, \dots, x_N) \le A(b_1, \dots, b_{s-1}) N^{-1} (\log N)^{s-1} + O\bigl( N^{-1} (\log N)^{s-2} \bigr). \]  (1.7)
These two constructions contain a severe catch that follows from the behaviour of the Van-der-Corput sequence with a large base $b$: if the base is large, the sequence produces long monotone segments. This is easily seen from the following figure, which shows the first 1000 points of the Halton sequence with the first 30 prime numbers as bases; the left panel is the projection onto dimensions 1 and 2, the right panel shows the projection onto dimensions 29 and 30.
Figure 1.2: First 1000 points of the 30-dimensional Halton sequence: (a) projection onto dimensions 1 and 2; (b) projection onto dimensions 29 and 30. The figure shows the problem in higher dimensions.
This problem is also reflected in the constant $A$ in (1.6) and (1.7). If we take the first $s$ prime numbers as bases, we get
\[ A_s = A(b_1, \dots, b_s) = \prod_{i=1}^{s} \frac{b_i - 1}{2 \log b_i}, \]
and Niederreiter [Nie92] shows that from the prime number theorem it follows that
\[ \lim_{s \to \infty} \frac{\log A_s}{s \log s} = 1. \]
So the superexponential growth of $A_s$ is a serious problem as the dimension rises; actually, the discrepancy bounds (1.6) and (1.7) become useless. To overcome this problem, Faure [Fau82] developed a different extension of the Van-der-Corput sequence to multiple dimensions, where all coordinates use the same base $b$. This base then has to be at least as large as the dimension itself, but it can be much smaller than the largest base of a Halton sequence of the same dimension. Additionally, Sobol' also extended the Van-der-Corput sequence to multiple dimensions, but there all dimensions use base 2. We will present both constructions in a somewhat more general concept called nets and sequences.
1.5 (t,m,s)-nets and (t,s)-sequences
1.5.1 Variance reduction
A basic variance reduction technique from Monte Carlo methods is called stratified sampling. It relies on subdividing the $s$-dimensional unit cube into a partition $A_1, \dots, A_k$ and then taking random samples from these strata. In one dimension, this means that we partition the unit interval into $n$ strata
\[ A_1 = \left[ 0, \frac{1}{n} \right), \quad A_2 = \left[ \frac{1}{n}, \frac{2}{n} \right), \quad \dots, \quad A_n = \left[ \frac{n-1}{n}, 1 \right) \]
and then draw $n$ samples according to
\[ V_i = \frac{i-1}{n} + \frac{U_i}{n}, \quad i = 1, \dots, n, \]
where the $U_1, \dots, U_n$ are independent, uniformly distributed random variables from $[0, 1)$.
For the 2-dimensional unit cube, this can easily be demonstrated by the following figure, where the domain is divided into 16 strata and a sample is placed in each section.
Figure 1.3: An example of stratified sampling. The domain is divided into 16 strata.
It is easy to see that stratified sampling (with proportional point allocation, i.e. placing the same number of points in each stratum) can only decrease the variance, as for example Glasserman shows in [Gla04], page 217.
Another type of variance reduction, called Latin hypercube sampling, is an extension of stratified sampling to multiple dimensions. The problem with stratified sampling is that in dimension $s$, partitioning each coordinate into $K$ strata results in $K^s$ strata for the hypercube, and this requires a sample size of at least $K^s$ to ensure that each stratum is sampled. This reduces the benefit of stratified sampling for even moderately large $s$.
So in Latin hypercube sampling, only the one-dimensional marginals of a multidi-
mensional distribution are stratified. If we again look at the unit cube [0, 1]
s
, this
means that we subdivide the unit cube into K intervals along each coordinate. Then
we choose the samples randomly such that each interval contains exactly one point.
Again, a little figure in dimension s = 2 with K = 16 will clarify this technique:
Figure 1.4: An example of Latin hypercube sampling.
We can see that the one-dimensional projections are perfectly stratified.
Kollig [KK02] notes that the variance can slightly increase in comparison to plain Monte Carlo,
\[ \sigma^2_{\mathrm{LHS}} \le \left( \frac{N}{N-1} \right)^{\min\{s-1,\,1\}} \sigma^2_{\mathrm{MC}}, \]
due to the presence of more restrictions in the placement of the samples, but we need far fewer points to stratify each dimension compared to stratified sampling. He notes that in practical applications, the variance is often reduced with Latin hypercube sampling.
1.5.2 Nets and sequences
Now we come to the definition of nets and sequences, which Niederreiter developed in [Nie92] and which generalizes the concept of stratification seen above. First, we need some notation.
Definition 1.5.1 (Elementary interval)
Let $s \ge 1$ denote the dimension and fix an integer base $b \ge 2$. A subinterval $E \subseteq [0, 1)^s$ of the form
\[ E = \prod_{i=1}^{s} \left[ \frac{a_i}{b^{d_i}}, \frac{a_i + 1}{b^{d_i}} \right) \]
with $a_i, d_i \in \mathbb{N}$, $0 \le a_i < b^{d_i}$ for $1 \le i \le s$, is called an elementary interval in base $b$.

The volume of such an elementary interval $E$ is
\[ \lambda(E) = \prod_{j=1}^{s} \frac{1}{b^{d_j}} = b^{-\sum_{j=1}^{s} d_j}. \]
If we take again $[0, 1]^2$, for example, all elementary intervals in base 2 with volume $\lambda(E) = \frac{1}{16}$ are shown in the following figure.
Figure 1.5: Example of elementary intervals in base 2, with volume 1/16.
With this in mind, we can define a (t,m,s)-net.
Definition 1.5.2 ((t, m, s)-net)
Let $0 \le t \le m$ be integers. A (t,m,s)-net in base $b$ is a set $P$ consisting of $b^m$ points, $P \subseteq [0, 1)^s$, such that $\#\{E \cap P\} = b^t$ for every elementary interval $E$ in base $b$ with $\lambda(E) = b^{t-m}$.
In Theorem 4.10, [Nie92], Niederreiter shows that for $P$ a $(t, m, s)$-net in base $b$, the star discrepancy fulfills the relation
\[ D_N^*(P) \le B(s, b)\, b^t N^{-1} (\log N)^{s-1} + O\bigl( b^t N^{-1} (\log N)^{s-2} \bigr), \]
where $N = b^m$,
\[ B(s, b) = \begin{cases} \left( \dfrac{b-1}{2 \log b} \right)^{s-1} & \text{if } s = 2 \text{ or } b = 2,\ s = 3, 4, \\[1ex] \dfrac{1}{(s-1)!} \left( \dfrac{\lfloor b/2 \rfloor}{\log b} \right)^{s-1} & \text{otherwise}, \end{cases} \]
and the implied constant in the $O$-symbol depends only on $b$ and $s$. So we can consider $t$ as a quality parameter of the net. One thing to mention is that there is a relationship between the parameters $t$, $m$ and $s$: for $m \ge 2$, a $(0, m, s)$-net in base $b$ can only exist if $s \le b + 1$, see Corollary 4.21, [Nie92].
An extension to an infinite number of points is the
Definition 1.5.3 ((t, s)-sequence)
Let $t \ge 0$ be an integer. A sequence $x_0, x_1, \dots$ of points in $[0, 1)^s$ is a (t,s)-sequence in base $b$ if, for all integers $k \ge 0$ and $m \ge t$, the point set
\[ \{ x_n : k b^m \le n < (k+1) b^m \} \]
is a $(t, m, s)$-net in base $b$.

A simplified bound (see Theorem 4.17, [Nie92]) for the star discrepancy of the first $N$ terms of a $(t, s)$-sequence is
\[ D_N^*(x_1, \dots, x_N) \le C(s, b)\, b^t N^{-1} (\log N)^{s} + O\bigl( b^t N^{-1} (\log N)^{s-1} \bigr), \quad N \ge 2, \]
with
\[ C(s, b) = \begin{cases} \dfrac{1}{s} \left( \dfrac{b-1}{2 \log b} \right)^{s} & \text{if } s = 2 \text{ or } b = 2,\ s = 3, 4, \\[1ex] \dfrac{1}{s!} \, \dfrac{b-1}{2 \lfloor b/2 \rfloor} \left( \dfrac{\lfloor b/2 \rfloor}{\log b} \right)^{s} & \text{otherwise}, \end{cases} \]
and again the implied constant in the Landau symbol depends only on $b$ and $s$. Again the base $b$, the dimension $s$ and the quality parameter $t$ are linked (see Theorem 1, [NX98], page 271): for every base $b \ge 2$ and every dimension $s \ge 1$, a necessary condition for the existence of a $(t, s)$-sequence in base $b$ is
\[ t \ge \frac{s}{b} - \log_b \frac{(b-1)s + b + 1}{2}. \]
From the figure below, which shows on the right a $(0, 4, 2)$-net in base 2, we can see that the notion of a net combines both stratified sampling (left panel) and Latin hypercube sampling (middle panel). Even more stratification is imposed, as highlighted in [KK02].
Figure 1.6: A $(0, 4, 2)$-net in base 2 is shown in the right panel. The left panel shows the stratification imposed by stratified sampling, while the middle panel shows the stratification imposed by Latin hypercube sampling.
Now we will present two important constructions of $(t, s)$-sequences.
1.5.3 Two constructions for (t,s)-sequences
Until now, we have made statements about nets and sequences without concerning ourselves with their construction. In this subsection, we present two specific types. We recall that the problem of the Halton sequence in higher dimensions was the large base for the corresponding Van-der-Corput sequence. The next construction, due to Faure, tries to overcome this problem by choosing one base for all dimensions. The next definition follows Section 5.2.2, [Gla04].
Definition 1.5.4 (Faure sequence)
As usual, let $s$ denote the dimension. For the base $b$, we choose the smallest prime number greater than or equal to $s$. Again, as in the definition of the radical inverse function (1.4.1), we denote by $a_i$ the coefficients in the base-$b$ expansion of $n$,
\[ n = \sum_{i=0}^{\infty} a_i b^i = \sum_{i=0}^{\infty} a_i(n) b^i. \]  (1.8)
Now the $i$-th coordinate, $i = 1, \dots, s$, of the $k$-th point of the Faure sequence is given by
\[ x_k^{(i)} = \sum_{j=1}^{\infty} \frac{y_j^{(i)}(k)}{b^j} \]  (1.9)
where
\[ y_j^{(i)}(k) = \sum_{l=0}^{\infty} \binom{l}{j-1} (i-1)^{l-j+1} a_l(k) \mod b \]  (1.10)
with
\[ \binom{m}{n} = \begin{cases} \dfrac{m!}{(m-n)!\, n!}, & m \ge n, \\ 0, & \text{otherwise.} \end{cases} \]
All the sums above have only a finite number of nonzero summands. If the base-$b$ expansion (1.8) of $k$ has exactly $r$ terms, then the sum (1.10) also has $r$ summands. But if $j \ge r + 1$, then the summands for $l = 0, \dots, r-1$ in (1.10) also vanish, and this in turn implies that (1.9) has at most $r$ nonzero terms. So we can view the construction as the result of the matrix-vector multiplication
\[ \begin{pmatrix} y_1^{(i)}(k) \\ y_2^{(i)}(k) \\ \vdots \\ y_r^{(i)}(k) \end{pmatrix} = C^{(i-1)} \begin{pmatrix} a_0(k) \\ a_1(k) \\ \vdots \\ a_{r-1}(k) \end{pmatrix} \mod b, \]
where the modulo is applied componentwise and the $r \times r$ matrix $C^{(i)}$ has the entries
\[ C^{(i)}(m, n) = \binom{n-1}{m-1} i^{\,n-m} \]
for $n \ge m$ and zero otherwise. Note that $C^{(0)}$ is the identity matrix. The matrices $C^{(i)}$ have the following recursive relation:
\[ C^{(i)} = C^{(1)} C^{(i-1)}, \quad i = 1, 2, \dots, \]
and in fact, the transformations (1.9) and (1.10) result in a permutation of the coefficients $a_i(n)$.
Faure [Fau82] showed that the star discrepancy of a sequence constructed this way satisfies
\[ D_N^*(x_1, \dots, x_N) \le F_s(b) \frac{(\log N)^s}{N} + O\!\left( \frac{(\log N)^{s-1}}{N} \right) \]
with
\[ F_s(b) = \frac{1}{s!} \left( \frac{b - 1}{2 \log b} \right)^{s}, \]
so the Faure sequence is a low-discrepancy sequence, and the coefficient $F_s(b) \to 0$ as $s \to \infty$. Moreover, the $s$-dimensional Faure sequence is a $(0, s)$-sequence.
There exist many computer implementations of the Faure sequence, and we are going to use the one of QuantLib, for which we provide a wrapper to flens vectors. (QuantLib is an open-source quantitative finance library written in C++, available from http://www.quantlib.org and also included in many GNU/Linux distributions; its BSD-like license makes it possible to include it in proprietary, commercial applications as well. Throughout this thesis, all computations will be done with C++. For both convenience and efficiency, we use the Flexible Library for Efficient Numerical Solutions, available from http://flens.sourceforge.net/, which allows us to have comfortable classes for matrices and vectors while still using the performance of a BLAS implementation.)
Listing 1.1: Faure sequence generator

    static QuantLib::FaureRsg faurersg = QuantLib::FaureRsg(1);

    void
    faureinit(int dim)
    {
        std::cout << "Faure qrng restarted in dimension " << dim
                  << " ... " << std::flush;
        faurersg = QuantLib::FaureRsg(dim);
        std::cout << "done!" << std::endl;
    }

    flens::DenseVector<flens::Array<double> >
    faurenext(int dim)
    {
        flens::DenseVector<flens::Array<double> > quasi(dim);
        QuantLib::Array point;
        point = faurersg.nextSequence().value;
        for (int i = 0; i < dim; i++) {
            quasi(i+1) = point[i];
        }
        return quasi;
    }
Another possibility to construct high-dimensional, low-discrepancy sequences was developed by Sobol' in [Sob67]. In contrast to the Faure sequence, where the base $b$ had to be a prime number greater than or equal to the dimension $s$, the base for all coordinates of the Sobol' sequence is always 2. The construction is again based on the one-dimensional Van-der-Corput sequence, and the coordinate sequences result from permutations of segments of the one-dimensional sequence. The following definition goes along Section 5.2.3, [Gla04].
Definition 1.5.5 (Sobol' sequence)
Because all coordinates of the Sobol' sequence are constructed the same way, only with a different generator matrix $V$, we will now exemplarily show the construction of a single coordinate. The elements of $V$ are either 0 or 1, and the columns of $V$ are the binary expansions of a set of direction numbers $v_1, \dots, v_r$. The number $r$ could be arbitrarily large, but in practice it is bounded by finite computer precision. Let $a(k) = (a_0(k), a_1(k), \dots, a_{r-1}(k))^T$ denote the vector of coefficients of the binary representation of $k$, i.e.
\[ k = \sum_{i=0}^{r-1} a_i(k)\, 2^i. \]
If we set
\[ \begin{pmatrix} y_1(k) \\ y_2(k) \\ \vdots \\ y_r(k) \end{pmatrix} = V \begin{pmatrix} a_0(k) \\ a_1(k) \\ \vdots \\ a_{r-1}(k) \end{pmatrix} \mod 2, \]  (1.11)
then $y_1(k), \dots, y_r(k)$ are the coefficients of the binary expansion of the $k$-th point of the sequence, i.e.
\[ x_k = \sum_{i=1}^{r} \frac{y_i(k)}{2^i}. \]
The multiplication (1.11) can be represented as
\[ a_0(k) v_1 \oplus a_1(k) v_2 \oplus \dots \oplus a_{r-1}(k) v_r, \]  (1.12)
where $v_i$ is the $i$-th column of the matrix $V$ (of the $r \times r$ submatrix) and $\oplus$ denotes binary addition. If we think of $v_i$ as the binary representation of a number and of $\oplus$ as a bitwise XOR operation, we see that the calculation (1.11) can be performed very fast on a (binary-digit based) computer.
The fundamental question is now how to specify the generator matrix $V$, or equivalently, the direction numbers $v_j$. Sobol' proposed to start by selecting a primitive polynomial over binary arithmetic, that is, a polynomial
\[ x^q + c_1 x^{q-1} + \dots + c_{q-1} x + 1 \]  (1.13)
with coefficients $c_l \in \{0, 1\}$ and the two properties:
· the polynomial is irreducible, and
· the smallest power $p$ for which the polynomial divides $x^p + 1$ is $p = 2^q - 1$.
For example, all primitive polynomials of degree one, two and three are
\[ x + 1, \quad x^2 + x + 1, \quad x^3 + x + 1 \quad \text{and} \quad x^3 + x^2 + 1. \]
Then from the coefficients of a polynomial like (1.13), a recurrence relation is defined through
\[ m_j = 2 c_1 m_{j-1} \oplus 2^2 c_2 m_{j-2} \oplus \dots \oplus 2^{q-1} c_{q-1} m_{j-q+1} \oplus 2^q m_{j-q} \oplus m_{j-q}, \quad j > q, \]  (1.14)
where the $m_j$ are integers (actually their binary representations). With the $m_j$, we can define the direction numbers by setting
\[ v_j = \frac{m_j}{2^j}. \]
To fully specify the $v_j$, we need initial values $m_1, \dots, m_q$. For those, a minimal requirement is that each initializing $m_j$ has to be an odd integer less than $2^j$. Then all subsequent $m_j / 2^j$ will lie strictly between 0 and 1.
We will clarify this with an example. Consider the polynomial
\[ x^3 + x^2 + 1 \]
with degree $q = 3$. Then we have $c_1 = 1$, $c_2 = 0$ and the recurrence (compare (1.14))
\[ m_j = 2 c_1 m_{j-1} \oplus 2^2 c_2 m_{j-2} \oplus 2^3 m_{j-3} \oplus m_{j-3} = 2 m_{j-1} \oplus 8 m_{j-3} \oplus m_{j-3}. \]
If we now use the initializing values $m_1 = 1$, $m_2 = 3$ and $m_3 = 3$, we get
\[ m_4 = 2 \cdot 3 \oplus 8 \cdot 1 \oplus 1 = 0110 \oplus 1000 \oplus 0001 = 1111 = 15 \]
and
\[ m_5 = 2 \cdot 15 \oplus 8 \cdot 3 \oplus 3 = 11110 \oplus 11000 \oplus 00011 = 00101 = 5. \]

The five corresponding direction numbers (in binary notation) are
\[ v_1 = 0.1, \quad v_2 = 0.11, \quad v_3 = 0.011, \quad v_4 = 0.1111 \quad \text{and} \quad v_5 = 0.00101, \]
and finally the corresponding generator matrix $V$ is
\[ V = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \]
The sequence point $x_k$ is now calculated by multiplying $V$ with its binary coefficients $a(k)$ modulo 2 and interpreting the result as the binary coefficients of $x_k$. For example, for $x_3$:
\[ V a(3) \bmod 2 = V \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \bmod 2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \]
so $x_3 = \frac{1}{4}$. Like in the case of the Faure sequence, this ends up in a permutation of the Van-der-Corput sequence.
The actual computation of the Sobol' numbers differs a bit in modern implementations, because there, the binary representation $a(k)$ of $k$ is replaced by a Gray code representation, in which the binary representations of two subsequent numbers $k$ and $k+1$ share all bits except one. This makes it possible to compute the numbers $x_k$ recursively.
Now we have the construction of a single coordinate sequence. For a multidimensional Sobol' sequence, we associate the $k$-th dimension with a primitive polynomial $p_k$ of degree $q_k$. Then we have to choose the $q_k$ initial values $m_1, \dots, m_{q_k}$. These initialisation numbers have a great influence on the resulting sequence, and a number of strategies have been proposed to select them. Without stating them here, we refer the reader to Section 8.3, [Jäc02], where details are presented and a new initialisation strategy is derived.
Last to mention is that the QuantLib implementation, which we are going to use, does the selection in the way that Jäckel proposes: his specific choice of the free direction numbers guarantees a good equidistribution for the lower dimensions as well as a regularity-breaking initialization for the higher dimensions, so the QuantLib Sobol' generator will not suffer too much from a high dimension. We think that this way of initialisation substantially improves the performance of the Sobol' sequence in high dimensions.

## Details

Pages: 139
Edition: Original edition (Originalausgabe)
Year: 2007
ISBN (eBook): 9783836615624
File size: 1.1 MB
Language: English
Institution: Universität Ulm – Fakultät für Mathematik und Wirtschaftswissenschaften
Publication date: April 2014
Grade: 1,0
Keywords: quasi-Monte Carlo, Malliavin calculus, asset allocation, computational finance, portfolio