Quasi-Monte Carlo Methods in Finance with Application to Optimal Asset Allocation
© 2007
Diploma thesis (Diplomarbeit)
139 pages
Mario Rometsch
Quasi-Monte Carlo Methods in Finance with Application to Optimal Asset Allocation
ISBN: 9783836615624
Printed by Diplomica® Verlag GmbH, Hamburg, 2008
Also: Universität Ulm, Ulm, Germany, diploma thesis, 2007
This work is protected by copyright. All rights arising therefrom, in particular those of translation, reprinting, public presentation, extraction of figures and tables, broadcasting, microfilming or reproduction by other means, and storage in data processing systems, remain reserved, even where only excerpts are used. Any reproduction of this work or of parts of it is, even in individual cases, permitted only within the limits of the statutory provisions of the Copyright Act of the Federal Republic of Germany in its currently valid version, and is in principle subject to remuneration. Violations are subject to the penal provisions of copyright law.
The reproduction of common names, trade names, product designations etc. in this work, even without special marking, does not justify the assumption that such names are to be regarded as free in the sense of trademark legislation and may therefore be used by anyone.
The information in this work has been compiled with care. Nevertheless, errors cannot be ruled out completely, and neither the Diplomarbeiten Agentur nor the authors or translators assume any legal responsibility or liability for any remaining incorrect statements and their consequences.
© Diplomica Verlag GmbH
http://www.diplom.de, Hamburg 2008
Printed in Germany
Abstract
We introduce some quasi-Monte Carlo methods and show how to apply them to problems emerging from mathematical finance. Next, the Malliavin derivative is defined and the Clark-Ocone formula is derived. Then, following a work of Detemple, Garcia and Rindisbacher [DGR05] from 2005, the problem of optimal portfolio allocation and consumption is solved, where the optimal portfolio shares are given as expectations of certain random variables and their derivatives. We show how quasi-Monte Carlo methods can be applied to the computation of portfolio weights.
to Elisabeth Rometsch
1920–2005
Acknowledgment
I am deeply indebted to both my advisors, Prof. Dr. Rüdiger Kiesel and Prof. Dr. Karsten Urban, for their constant support. Without their help and inspiration, this work would not have been possible. I would also like to thank Prof. Dr. Alexander Keller for providing me with his valuable insights into quasi-Monte Carlo theory.
Special thanks go to my parents, Inge and Werner Rometsch, for putting me through university, and of course I also thank my sister, Jana Rometsch, for being such an important person for me.
I would like to express my gratitude to my scholarship sponsor, the "Deutsche Forschungsgemeinschaft", for allowing me to work as a research student for over one and a half years now in the "Graduiertenkolleg 1100" here at the University of Ulm.
I am grateful to Daniel Bauer and Shaohui Wang for helping me out with some questions that emerged during this work.
I would also like to thank Michael Lehn and Alexander Stippler for providing me with FLENS, the Flexible Library for Efficient Numerical Solutions. With such an exceptionally helpful linear algebra software package for C++, I saved much implementation time while using the full spectrum of computer vector arithmetic. Furthermore, I thank the "Institute for Numerical Mathematics" for letting me use their computer pool.
I am very thankful to my colleagues Wolfgang Högerle, Dennis Schätz and Jens Stephan for proofreading this thesis numerous times.
I would also like to thank the open source community for providing such excellent software products as (in no particular order of preference) Ubuntu Linux, GCC, Eclipse, QuantLib, Gnuplot, Unison, LaTeX and GNOME.
Finally, I would like to express my deepest gratitude for the constant support, understanding and love that I received from my beloved mate Sabrina. You make the sun shine every day.
Contents

Introduction  v

1 Monte Carlo and quasi-Monte Carlo methods  1
1.1 Numerical integration  1
1.2 Evaluation of integrals with Monte Carlo methods  2
1.3 Quasi-Monte Carlo methods  3
1.3.1 Introduction  3
1.3.2 Discrepancy  3
1.3.3 The Koksma-Hlawka inequality  5
1.4 Classical constructions  6
1.4.1 One-dimensional sequences  6
1.4.2 Multi-dimensional sequences  7
1.5 (t,m,s)-nets and (t,s)-sequences  10
1.5.1 Variance reduction  10
1.5.2 Nets and sequences  11
1.5.3 Two constructions for (t,s)-sequences  13
1.6 Digital nets and sequences  20
1.7 Lattice rules  21
1.8 The curse of dimension revisited  22
1.8.1 Padding techniques  23
1.8.2 Latin Supercube sampling  23
1.9 Time consumption of the various point generators  25
1.10 Quasi-Monte Carlo methods in Finance  26
1.10.1 Example: Arithmetic option  26
1.10.2 Path generation  27
1.10.3 Sampling size  34
1.10.4 Results  36

2 Malliavin Calculus  40
2.1 Wiener-Itô chaos expansion  40
2.2 Skorohod integral  46
2.3 Differentiation of random variables  50
2.4 Examples of Malliavin derivatives  64
2.5 The Clark-Ocone formula  65
2.6 The generalized Clark-Ocone formula  70
2.7 Multivariate Malliavin Calculus  78

3 Asset Allocation  82
3.1 Problem formulation  82
3.1.1 Financial market model  82
3.1.2 Wealth process  84
3.1.3 Expected utility  84
3.1.4 Portfolio problem  85
3.1.5 Equivalent static problem  86
3.1.6 Optimal portfolio  88
3.2 Solution of the portfolio problem  94
3.2.1 Optimal portfolio  94
3.2.2 Optimal portfolio with constant relative risk aversion (CRRA)  94

4 Implementation  97
4.1 A single state variable model with explicit solution  97
4.2 Simulation-based approach  100
4.3 SDE system as multidimensional SDE  101
4.4 Error analysis  102
4.4.1 Discretisation error  103
4.4.2 Conditional expectation approximation error  104
4.5 Numerical results  106
4.5.1 One year time horizon  108
4.5.2 Two year time horizon  111
4.5.3 Five year time horizon  114
4.5.4 Experiments with a small time horizon  117

Conclusion  119
Summary  120
List of Figures

1.1 The quasi-random sequence fills the unit square more equally than the pseudo-random sequence.  8
1.2 First 1000 points of the 30-dimensional Halton sequence.  9
1.3 An example of stratified sampling.  10
1.4 An example of Latin hypercube sampling.  11
1.5 Example of elementary intervals in base 2, with volume 1/16.  12
1.6 A (0, 4, 2)-net in base 2.  13
1.7 First 1024 points of the 100-dimensional Sobol' sequence.  30
1.8 Six steps of a Brownian bridge construction.  30
1.9 3D view of a Brownian bridge construction.  31
1.10 Monte Carlo convergence order is as expected.  37
1.11 The Sobol' sequence without integrand transformation does not achieve a higher convergence order.  37
1.12 The Sobol' sequence with a Brownian bridge construction gives better results.  38
1.13 The Sobol' sequence with Principal Component Analysis attains optimal convergence order.  38
1.14 The rank-1 lattice also attains optimal convergence order.  39
2.1 A sample path of the Wiener process, and the path perturbed at t = 0.3.  58
4.1 A sample path of the market price of risk and the corresponding sample path of the stock.  97
4.2 The dimensionality resulting from the number of simulated trajectories.  105
4.3 One year time horizon. Two Monte Carlo computations with various proportions of paths and time discretisation steps.  108
4.4 One year time horizon. Three quasi-Monte Carlo computations with various proportions of paths and time discretisation steps.  108
4.5 One year time horizon. Computation with the Sobol' sequence with various proportions of paths and time discretisation steps.  109
4.6 One year time horizon. Computation with the Sobol' sequence in combination with the Milshtein approximation.  109
4.7 One year time horizon. Direct comparison of two quasi-Monte Carlo and a Monte Carlo scheme.  110
4.8 Two year time horizon. The Monte Carlo scheme converges as expected.  111
4.9 Two year time horizon. The Monte Carlo method is again outclassed by the Sobol' sequence in combination with a Brownian bridge.  111
4.10 Two year time horizon. The Sobol' sequence in combination with a Brownian bridge construction converges faster than the Sobol' sequence with a PCA construction.  112
4.11 Two year time horizon. Computation with the Sobol' sequence in combination with the Milshtein scheme.  112
4.12 Two year time horizon. The Latin Supercube sampling from a Sobol' sequence shows no advantage compared to Monte Carlo sampling.  113
4.13 Five year time horizon. The Monte Carlo scheme converges as expected.  114
4.14 Five year time horizon. The Monte Carlo method is again outclassed by the Sobol' sequence in combination with a Brownian bridge construction.  114
4.15 Five year time horizon. Again, the PCA construction does not result in a higher convergence order.  115
4.16 Five year time horizon. Computation with the Sobol' sequence and the Milshtein scheme.  115
4.17 Five year time horizon. The Latin Supercube sampling gives no advantage over Monte Carlo integration.  116
4.18 Small time horizon. The Monte Carlo scheme keeps its convergence order.  117
4.19 Small time horizon. The Sobol' sequence with the PCA construction achieves nearly optimal convergence.  118
Introduction
Portfolio optimization is a widely studied problem in finance. The common question is how a small investor should invest his wealth in the market to attain certain goals, such as a desired payoff or some insurance against undesired events.
The starting point for the mathematical treatment of this problem is the work of Harry Markowitz in the 1950s. His idea was to set up a relation between the mean return of a portfolio and its variance. In his terminology, an efficient portfolio has minimal variance of return among all portfolios with the same mean rate of return. Furthermore, if linear combinations of efficient portfolios and a riskless asset are allowed, this leads to the market portfolio, so that a linear combination of the risk-free asset and the market portfolio dominates any other portfolio in the mean-variance sense.
Later, this theory was extended, resulting in the CAPM, or capital asset pricing model, which was independently introduced by Treynor, Sharpe, Lintner and Mossin in the 1960s. In this model, every risky asset has a mean rate of return that exceeds the risk-free rate by a specific risk premium, which depends on a certain attribute of the asset, namely its β. The so-called β in turn is the covariance of the asset return with the market return, normalized by the variance of the market portfolio. The problem of the CAPM is its static nature: investments are made once, and then the state of the model changes. Due to this and other simplifications, this model was, and often still is, not found to be realistic.
A major impact on this research field came from the two papers of Robert Merton in 1969 [Mer69] and 1971 [Mer71]. He applied the theory of Itô calculus and stochastic optimal control and solved the corresponding Hamilton-Jacobi-Bellman equation. For his multi-period model, he assumed constant coefficients and an investor with power utility. Extending the mean-variance analysis, he found that a long-term investor would prefer a portfolio that includes hedging components to protect against fluctuations in the market. This approach, too, was generalized by numerous researchers and results in the problem of solving a nonlinear partial differential equation.
The next milestone in this series is the work by Cox and Huang [CH89] from 1989, where they solve for "Optimal Consumption and Portfolio Policies when Asset Prices Follow a Diffusion Process". They apply the martingale technique to get rid of the nonlinear PDE and instead solve a linear PDE. This approach, with several refinements, is nowadays a standard method for asset allocation in a complete market.
At approximately the same time, in 1991, Ocone and Karatzas published two papers, [OK91] and [KOL91], in which they extend the Clark-Haussmann representation formula to Wiener functionals under an equivalent measure. They use the martingale method for selecting the optimal terminal wealth and optimal consumption, respectively, and then apply the representation theorem to derive the optimal portfolio strategy. This expression involves expectations of random variables depending on the interest rate, the market prices of risk, and unspecified derivatives of these state variables.
Because these derivatives are left unspecified, the hedging terms of the resulting portfolio do not have an explicit form and are quite difficult to evaluate numerically. Therefore, the recent literature has focused more on models with state variable specifications for which closed-form solutions are available, for example [Wac02], or on specifications which, as mentioned above, rely on dynamic programming.
One main part of this thesis is to present the recent work of Detemple, Garcia and Rindisbacher, [DGR03a] and [DGR05], in which they take up the ideas of Ocone and Karatzas. They solve for the optimal portfolio allocation in a complete market model and derive explicit expressions for the hedging terms. The optimal portfolio shares are found to be expectations of random variables, which allows for a simulation-based approach. In contrast to the dynamic programming approach, this method is capable of handling complex, nonlinear dynamics for a large number of state variables.
When the optimal portfolio shares then need to be calculated, this basically reduces to a high-dimensional integration over the unit hypercube. Perhaps because of its easy and straightforward implementation, this is most often done with Monte Carlo methods. Niederreiter [Nie92] notes that the official emergence of the Monte Carlo method was a paper of Nicholas Metropolis and Stanislaw Ulam in 1949, but at that time, pseudo-random algorithms had already been used for years in secret projects of the U.S. Department of Defense, like the Manhattan Project. Since then, these methods have become the most widely used tools for computing high-dimensional integrals.
Although they have appealing features like weak regularity conditions, all these Monte Carlo or pseudo-random methods suffer from one severe shortcoming: their convergence order stems from the central limit theorem, hence they converge very slowly.
The quasi-Monte Carlo method tries to overcome this drawback while at the same time retaining the applicability to high-dimensional integration. This is achieved by sampling from very carefully chosen deterministic points. The methods and algorithms are then called not pseudo-random but quasi-random. The term "quasi-Monte Carlo" method appeared for the first time in a technical report from 1951, and Niederreiter's outstanding monograph [Nie92] cites many works and applications to problems like the numerical solution of integral equations or integro-differential equations. Quasi-Monte Carlo applications to problems from the finance setting came up in the nineties with work from Paskov, L'Ecuyer et al. This topic composes the second part of this work, in which we will apply some concepts from quasi-Monte Carlo theory to problems that emerge in mathematical finance.
In this thesis, we derive the optimal portfolio formula on the basis of the proofs in [DGR03a] and [DGR05]. The main contribution is then the application of some quasi-Monte Carlo methods for its computation. Using a simple model with an exact solution, different schemes were tested, showing that the substitution of a quasi-random number generator for the pseudo-random number generator requires special care.
This thesis is organized as follows. In Chapter 1, we present some concepts from quasi-Monte Carlo theory and show how these methods can be applied to price a simple financial derivative, an arithmetic call option. In Chapter 2, we follow the lecture notes [Øks97] and present an introduction to the "stochastic calculus of variations", or Malliavin calculus, which will culminate in the definition of the Malliavin derivative and the derivation of the Clark-Ocone formula.

Chapter 1: quasi-Monte Carlo methods
Chapter 2: Malliavin Calculus
Chapter 3: Asset Allocation
Chapter 4: Implementation
The reader wishing a quick start to the asset allocation problem may thus jump directly to Chapter 3. There, we will introduce the financial market model and formulate the optimization problem. We will follow the work of Detemple, Garcia and Rindisbacher [DGR03a] and [DGR05] and derive the optimal portfolio formula. However, for this we will need the calculation rules from Chapter 2, so if that chapter is skipped, the optimal portfolio formula has to be taken for granted. The implementation in Chapter 4 will then apply these formulas to a simple model with an explicit solution, and we show how the quasi-Monte Carlo methods from Chapter 1 can be used to improve the efficiency over plain Monte Carlo methods.
1 Monte Carlo and quasi-Monte Carlo methods

To get a feeling for the integration techniques now commonly known as "quasi-Monte Carlo methods", we will first introduce some basics of numerical integration and classical Monte Carlo methods. After that, we will illustrate the idea behind quasi-random algorithms and present some integration rules. At the end of this chapter, we show how a simple financial derivative can be priced with these techniques. We will not go too deeply into theoretical details here; the reader wishing a more thorough overview is referred to the excellent book of Harald Niederreiter [Nie92], which we will follow in our introduction.
1.1 Numerical integration

The evaluation of integrals in dimension s is a standard problem of numerical analysis. For s = 1 there are classical numerical methods such as, for example, the trapezoidal rule:

\int_{[0,1]} f(x)\,dx \approx \sum_{i=0}^{n} \omega_i\, f\!\left(\frac{i}{n}\right), \qquad \omega_i = \begin{cases} \frac{1}{2n}, & i = 0 \text{ or } i = n, \\ \frac{1}{n}, & 0 < i < n. \end{cases}   (1.1)

Now if f is twice continuously differentiable on the interval (0, 1), the quadrature error is of magnitude O(n^{-2}). In higher dimensions, the integration can be carried out by iterating one-dimensional integration rules. The node set is then a product of one-dimensional node sets, and the weights are products of weights from one-dimensional integration. The s-dimensional trapezoidal rule is defined as

\int_{[0,1]^s} f(x)\,dx \approx \sum_{i_1=0}^{n} \cdots \sum_{i_s=0}^{n} \omega_{i_1} \cdots \omega_{i_s}\, f\!\left(\frac{i_1}{n}, \ldots, \frac{i_s}{n}\right),   (1.2)

with the \omega_i as in (1.1) and overall N = (n+1)^s node points. If f is again twice continuously differentiable in every coordinate, the error behaves as in the one-dimensional case, like O(n^{-2}). If one replaces n by N^{1/s}, the error becomes O(N^{-2/s}). This expression depends fatally on the dimension s (this is called the "curse of dimension"): to get an overall accuracy of two decimal digits, for example, one has to use roughly 10^s nodes in total!
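To make the curse of dimension concrete, the product rule (1.2) can be sketched in a few lines. This is a minimal illustration of ours, not code from the thesis; the test function and the values of n and s are arbitrary choices.

```python
import itertools

def trapezoid_1d(n):
    """Nodes i/n and trapezoidal weights from (1.1): 1/(2n) at the ends, 1/n inside."""
    nodes = [i / n for i in range(n + 1)]
    weights = [1 / (2 * n) if i in (0, n) else 1 / n for i in range(n + 1)]
    return nodes, weights

def trapezoid_product(f, n, s):
    """s-dimensional product trapezoidal rule (1.2) using all (n + 1)**s nodes."""
    nodes, weights = trapezoid_1d(n)
    total = 0.0
    for idx in itertools.product(range(n + 1), repeat=s):
        w = 1.0
        for i in idx:
            w *= weights[i]
        total += w * f([nodes[i] for i in idx])
    return total

# f(x) = x_1^2 + x_2^2 has exact integral 2/3 over [0,1]^2; the error decays like
# n**-2, but the node count (n + 1)**s explodes with the dimension s.
f = lambda x: sum(t * t for t in x)
err_n4 = abs(trapezoid_product(f, 4, 2) - 2 / 3)
err_n8 = abs(trapezoid_product(f, 8, 2) - 2 / 3)
```

Doubling n roughly quarters the error here, but in dimension s the same refinement multiplies the work by 2^s, which is exactly the effect described above.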
1.2 Evaluation of integrals with Monte Carlo methods

Monte Carlo methods are a technique that can be used to evaluate high-dimensional integrals without the "curse of dimension" mentioned above. To this end, the integrand is interpreted as a transformation of an s-dimensional random variable that is uniformly distributed over the unit cube:

\int_{[0,1]^s} f(x)\,dx = \int f(X)\,dP = E[f(X)], \qquad \text{with } X \sim \mathcal{U}[0,1]^s.

The integral can then be calculated by random sampling,

E[f(X)] \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i),   (1.3)

where the x_i are realizations of X_i \sim \mathcal{U}[0,1]^s. The theoretical foundation for this is given by the strong law of large numbers:
Theorem 1.2.1 (Strong Law of Large Numbers)
Let X_1, X_2, \ldots be i.i.d. random variables and f \in L^2([0,1]^s). Then we have with probability 1

\frac{1}{N} \sum_{i=1}^{N} f(X_i) \xrightarrow{\;N \to \infty\;} E[f(X_1)].

Proof. The proof can be found in most books on probability or statistics.
For the integration error in Theorem 1.2.1 one can show that for f \in L^2([0,1]^s) it holds that

\int_{[0,1]^s} \cdots \int_{[0,1]^s} \left( \frac{1}{N} \sum_{i=1}^{N} f(x_i) - E[f(X_1)] \right)^{\!2} dx_1 \cdots dx_N = \frac{\sigma^2(f)}{N},

where

\sigma^2(f) = \int_{[0,1]^s} \big( f(x) - E[f(X_1)] \big)^2\, dx.

So the error is on average \sigma(f)\,N^{-1/2}, and summing up the above we get the Monte Carlo estimate

\int_{[0,1]^s} f(x)\,dx \approx \frac{1}{N} \sum_{i=1}^{N} f(X_i),

where X_1, X_2, \ldots, X_N are i.i.d. random variables, X_i \sim \mathcal{U}[0,1]^s, with the probabilistic error bound O(N^{-1/2}). The important fact is that this error does not depend on the dimension s, as was the case for the product rule. So the Monte Carlo method outclasses the trapezoidal rule approximately from dimension s = 5 onwards. Having said this, we have to immediately point out the drawbacks of the Monte Carlo method:
· There are only probabilistic error bounds.
· Generating (really) random samples with a computer causes some problems.
· The regularity of the integrand is not reflected by the error bound.

The problem of random sampling on a computer has been widely studied, with two possible outcomes: either one takes a so-called pseudo-random number generator like the Mersenne Twister^1, which produces a deterministic sequence of numbers designed to pass certain statistical tests for randomness, or one uses a true random number generator, which produces truly random numbers, for example based on the noise of a resistor. Both of these methods have their disadvantages. With this short summary of Monte Carlo methods, let us now pass to the quasi-Monte Carlo method.
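The N^{-1/2} behaviour of the root-mean-square error can be checked empirically with a small sketch (our own illustration, not part of the thesis; the test function f(x) = x_1 ⋯ x_s and all sample sizes are arbitrary choices):

```python
import math
import random

def mc_estimate(f, s, N, rng):
    """Monte Carlo estimate (1.3): average of f over N pseudo-random points in [0,1]^s."""
    return sum(f([rng.random() for _ in range(s)]) for _ in range(N)) / N

def rms_error(f, s, N, exact, runs, seed=0):
    """Root-mean-square error over independent repetitions, expected ~ sigma(f) * N**-0.5."""
    rng = random.Random(seed)
    sq_errors = [(mc_estimate(f, s, N, rng) - exact) ** 2 for _ in range(runs)]
    return math.sqrt(sum(sq_errors) / runs)

f = lambda x: math.prod(x)     # the exact integral over [0,1]^s is 2**-s
s, exact = 5, 2.0 ** -5
e_small = rms_error(f, s, N=100, exact=exact, runs=200)
e_large = rms_error(f, s, N=400, exact=exact, runs=200)
# quadrupling N should roughly halve the RMS error, independently of s
```

Note that only the averaged error shrinks like N^{-1/2}; any single run may deviate, which is precisely the "probabilistic error bound" drawback listed above.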
1.3 Quasi-Monte Carlo methods

1.3.1 Introduction

The error in the Monte Carlo method is of order O(N^{-1/2}) on average over all node sets. Therefore, there must exist node sets which perform significantly better than the average. The quasi-Monte Carlo approach takes the deterministic nature of the pseudo-random numbers in the Monte Carlo method for granted and tries to construct special node sets which perform much better than the average. In short, the quasi-Monte Carlo approximation of the integral is

\int_{[0,1]^s} f(x)\,dx \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i),   (1.4)

with the only difference that the x_i are now carefully chosen deterministic points from a so-called low-discrepancy sequence.
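As an illustrative sketch of (1.4), one classical low-discrepancy construction, the Halton sequence (treated in Section 1.4), can be generated from radical inverses and plugged into the quadrature rule. The code and the test function are our own minimal example, not an implementation from the thesis.

```python
import math

def van_der_corput(i, base):
    """Radical inverse of the integer i >= 1 in the given base:
    reflect the base-b digits of i about the radix point."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def halton(i, primes):
    """i-th point of the Halton sequence: one van der Corput coordinate per prime base."""
    return [van_der_corput(i, b) for b in primes]

def qmc_estimate(f, primes, N):
    """Quasi-Monte Carlo approximation (1.4) using the first N Halton points."""
    return sum(f(halton(i, primes)) for i in range(1, N + 1)) / N

f = lambda x: math.prod(x)          # integral over [0,1]^3 equals 1/8
est = qmc_estimate(f, [2, 3, 5], 1024)
```

The same N deterministic points are reused for every integrand, and for smooth functions the error decays markedly faster than the Monte Carlo rate N^{-1/2}.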
1.3.2 Discrepancy

Think of the points x_i in (1.4) as part of an infinite sequence x_1, x_2, \ldots of points in [0,1]^s. The first requirement on such a sequence is that it yields a convergent method:

\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i) = \int_{[0,1]^s} f(x)\,dx   (1.5)

for f from some function space, say f \in C([0,1]^s). So the first reasonable condition would be that the x_i are uniformly distributed over [0,1]^s, which is equivalent to^2

\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}_A(x_i) = \lambda(A)

for all subintervals A of [0,1]^s, where \lambda denotes the Lebesgue measure. So the empirical distribution of the node point sequence should be as close as possible to the uniform distribution. This leads to the two following measures.

^1 We will use this pseudo-random number generator for our Monte Carlo calculations.
^2 According to [Nie92], this is shown in [KN74].
Definition 1.3.1 (star discrepancy)
We define the star discrepancy D^*_N(P) = D^*_N(x_1, \ldots, x_N) of a point set P as

D^*_N(P) = \sup_{y \in [0,1]^s} \left| \prod_{i=1}^{s} y_i - \frac{\#\big\{ P \cap \times_{i=1}^{s} [0, y_i] \big\}}{N} \right| = \sup_{y \in [0,1]^s} \left| F_U(y) - \hat{F}_{x_1,\ldots,x_N}(y) \right|,

where \#\{A\} is the counting function of the set A, F_U(y) is the distribution function of the s-dimensional uniform distribution and \hat{F}_{x_1,\ldots,x_N}(y) is the empirical distribution function of the point set (x_1, \ldots, x_N).
Definition 1.3.2 (extreme discrepancy)
The extreme discrepancy D_N(P) = D_N(x_1, \ldots, x_N) of a point set P is defined as

D_N(P) = \sup_{\substack{0 \le a_i \le b_i \le 1 \\ i = 1, \ldots, s}} \left| \prod_{i=1}^{s} (b_i - a_i) - \frac{\#\big\{ P \cap \times_{i=1}^{s} [a_i, b_i] \big\}}{N} \right|.
Niederreiter [Nie92] shows that both discrepancies may be bounded by each other,
\[
  D_N^*(P) \le D_N(P) \le 2^s D_N^*(P),
\]
so it suffices to concentrate our interest on the first one.
If we look at the one-dimensional interval $[0,1]$, then in order to minimize the star
discrepancy of the $N$-point set $P = (x_1, \dots, x_N)$ we would choose $x_i = \frac{2i-1}{2N}$,
yielding a star discrepancy of
\[
  D_N^* = \frac{1}{2N}.
\]
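In one dimension, the star discrepancy of a finite point set can be computed exactly from the sorted points via the classical closed form $D_N^* = \frac{1}{2N} + \max_i |x_{(i)} - \frac{2i-1}{2N}|$. The following sketch illustrates this (the function name is ours, not part of the thesis code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Exact star discrepancy of a 1-D point set, using the closed form for
// the sorted points x_(1) <= ... <= x_(N):
//   D*_N = 1/(2N) + max_i | x_(i) - (2i-1)/(2N) |.
double star_discrepancy_1d(std::vector<double> x)
{
    std::sort(x.begin(), x.end());
    const double n = static_cast<double>(x.size());
    double d = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        d = std::max(d, std::fabs(x[i] - (2.0 * (i + 1) - 1.0) / (2.0 * n)));
    return 1.0 / (2.0 * n) + d;
}
```

For the midpoint set $x_i = \frac{2i-1}{2N}$ the maximum term vanishes and the function returns exactly $\frac{1}{2N}$.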
For a sequence $S = (x_1, x_2, \dots)$ in $[0,1]$, this doesn't hold anymore. There, the star
discrepancy is bounded from below with
\[
  D_N^*(S) = D_N^*(x_1, \dots, x_N) \ge c\, N^{-1} \log N \quad \text{for infinitely many } N.
\]
In dimensions higher than 1, Niederreiter states that it is widely believed that the star
discrepancy is also bounded from below by
\[
  D_N^*(S) = D_N^*(x_1, \dots, x_N) \ge c\, N^{-1} (\log N)^s \quad \text{for infinitely many } N,
\]
so the best star discrepancy that can be achieved is $O(N^{-1} (\log N)^s)$ for a sequence
and $O(N^{-1} (\log N)^{s-1})$ for a point set, quantifying the term low-discrepancy.
Definition 1.3.3 (low-discrepancy point set)
If the star discrepancy of a point set is of order $O\bigl(N^{-1} (\log N)^{s-1}\bigr)$, then the point set
is called a low-discrepancy point set.
Definition 1.3.4 (low-discrepancy sequence)
If the star discrepancy of a sequence is of order $O\bigl(N^{-1} (\log N)^{s}\bigr)$, then the sequence is
called a low-discrepancy sequence.
1.3.3 The Koksma-Hlawka inequality
We will now state an error bound which directly links the star discrepancy to the
quadrature error. For this we need some definitions that characterize the smoothness
of the integrand.
Definition 1.3.5 (Variation in the sense of Vitali)
Let $f$ be a function on $[0,1]^s$ and $A \subseteq [0,1]^s$ a rectangle of the form $A = \times_{i=1}^{s} [a_i, b_i)$.
Define $\Delta(f; A)$ as an alternating sum of the values of $f$ at the vertices of $A$:
\[
  \Delta(f; A) = \sum_{j_1=0}^{1} \cdots \sum_{j_s=0}^{1} (-1)^{\sum_{k=1}^{s} j_k}
  f\bigl(j_1 a_1 + (1 - j_1) b_1, \dots, j_s a_s + (1 - j_s) b_s\bigr).
\]
For $\mathcal{P}$ a partition of $[0,1]^s$ into subintervals of the form $A$, the variation of $f$ on $[0,1]^s$
in the sense of Vitali is defined as
\[
  V^{(s)}(f) = \sup_{\mathcal{P}} \sum_{A \in \mathcal{P}} \left| \Delta(f; A) \right|.
\]
This is indeed a measure for the variability of the function $f$, because if $f$ is, for
example, $s$ times continuously differentiable on $(0,1)^s$, then another expression for the
variation in the sense of Vitali is available³:
\[
  V^{(s)}(f) = \int_0^1 \cdots \int_0^1 \left| \frac{\partial^s f}{\partial x_1 \cdots \partial x_s} \right| \, dx_1 \cdots dx_s.
\]
For our error bound, we need an extension of this measure.
Definition 1.3.6 (Variation in the sense of Hardy and Krause)
Let again $f$ be defined on $[0,1]^s$. For $1 \le k \le s$ and $1 \le i_1 < \dots < i_k \le s$,
let $V^{(k)}(f; i_1, \dots, i_k)$ be the variation in the sense of Vitali of $f$ restricted to the
$k$-dimensional face $\{(x_1, \dots, x_s) \in [0,1]^s : x_j = 1 \text{ for } j \ne i_1, \dots, i_k\}$. Then
\[
  V(f) = \sum_{k=1}^{s} \sum_{1 \le i_1 < \dots < i_k \le s} V^{(k)}(f; i_1, \dots, i_k)
\]
is called the variation of $f$ on $[0,1]^s$ in the sense of Hardy and Krause.
³ Compare page 19, [Nie92].
If V (f ) is finite, then f is said to be of bounded variation in this sense. With this
notion of variation, we can state the main theorem in this chapter.
Theorem 1.3.7 (Koksma-Hlawka inequality)
If $f$ has bounded variation on $[0,1]^s$ in the sense of Hardy and Krause, then for
$x_1, x_2, \dots, x_N \in [0,1]^s$ it holds that
\[
  \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_{[0,1]^s} f(x)\,dx \right|
  \le V(f)\, D_N^*(x_1, \dots, x_N).
\]
This inequality is sharp, because for any $x_1, \dots, x_N \in [0,1]^s$ and any $\varepsilon > 0$ there
exists a function $f \in C^\infty([0,1]^s)$ with $V(f) = 1$ and
\[
  \left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_{[0,1]^s} f(x)\,dx \right|
  > D_N^*(x_1, \dots, x_N) - \varepsilon.
\]
So with this inequality, we can tackle two problems of the Monte Carlo method: we no
longer need random samples, and the error is bounded deterministically rather than only
with a certain probability. However, in practice we cannot expect the Koksma-Hlawka
inequality to act as an error estimator, because the terms on the right-hand side are
much harder to evaluate than the integral itself.
If we take for granted that for a given integrand we cannot change its variation, this
error bound leads to the conclusion that the star discrepancy of the point set from
which we sample should be as small as possible. In the next section, we introduce some
specific constructions for such point sets.
1.4 Classical constructions
Before we come to more recent constructions of low-discrepancy point sets, we give a
short overview of some classical results.
1.4.1 One-dimensional sequences
Although one-dimensional quasi-Monte Carlo methods are not so important from the
practical point of view, we will present some concepts, as they are the basis for some
multidimensional constructions. A starting point is the following function, which maps
integers to points in the unit interval.
Definition 1.4.1 (Radical inverse function)
Let $b \ge 2$ denote an integer base; then every integer $n$ has a unique digit representation
\[
  n = \sum_{i=0}^{\infty} a_i b^i
\]
in base $b$, where the sum is actually a finite sum. The radical inverse function $\phi_b(n)$ is
now just the reflection of this representation about the decimal point, more formally
\[
  \phi_b(n) = \sum_{i=0}^{\infty} a_i b^{-i-1},
\]
which is actually again a finite sum.
It is trivial to see that the image of the radical inverse function lies in $[0,1)$, and so we
arrive at the
Definition 1.4.2 (Van der Corput sequence in base b)
Let $b \ge 2$ denote an integer base. Then the sequence $x_1, x_2, \dots$ with $x_n = \phi_b(n)$ is
called the Van der Corput sequence in base $b$.
The Van der Corput sequence fills the unit interval with points in a maximally
balanced way. We demonstrate this with an example of the Van der Corput sequence in
base 2:

      i      |   1     2      3      4       5       6       7       8        9      ...
  $\phi_2(i)$ |  0.5   0.25   0.75   0.125   0.625   0.375   0.875   0.0625   0.5625  ...
All Van der Corput sequences are low-discrepancy sequences, i.e. for all bases $b \ge 2$
the star discrepancy of the first $N$ terms is $O(N^{-1} \log N)$. Although they are
one-dimensional sequences, this concept can be easily extended to multiple dimensions.
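The radical inverse is straightforward to implement by peeling off base-$b$ digits. The following sketch (our own helper, not the thesis code) mirrors Definition 1.4.1:

```cpp
// Radical inverse phi_b(n): reflects the base-b digit expansion
// n = a_0 + a_1 b + a_2 b^2 + ... about the radix point.
double radical_inverse(int n, int b)
{
    double phi = 0.0;
    double f = 1.0 / b;          // weight b^{-i-1} of digit a_i
    while (n > 0) {
        phi += f * (n % b);      // contribution a_i * b^{-i-1}
        n /= b;
        f /= b;
    }
    return phi;
}
```

For base 2, the calls `radical_inverse(1, 2)`, `radical_inverse(2, 2)`, ... reproduce the table above.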
1.4.2 Multidimensional sequences
Definition 1.4.3 (Halton sequence)
With the dimension $s \ge 1$, let $b_1, b_2, \dots, b_s$ be integers, $b_i \ge 2$, and define the Halton
sequence by
\[
  x_n = \bigl( \phi_{b_1}(n), \dots, \phi_{b_s}(n) \bigr)
\]
with points in $[0,1)^s$.
Figure 1.1: The quasi-random sequence fills the unit square more equally than the
pseudorandom sequence. (a) 1000 points from the Mersenne Twister; (b) 1000 points
from the Halton sequence, dimensions 1 and 2.
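A Halton point is assembled coordinate-wise from radical inverses. A self-contained sketch (function names are ours; the actual computations in this thesis use library generators) might look as follows:

```cpp
#include <cstddef>
#include <vector>

// Radical inverse phi_b(n) as in Definition 1.4.1.
double radical_inverse(int n, int b)
{
    double phi = 0.0, f = 1.0 / b;
    for (; n > 0; n /= b, f /= b)
        phi += f * (n % b);
    return phi;
}

// n-th point of the Halton sequence for the given bases b_1, ..., b_s
// (for the discrepancy bounds, the bases should be pairwise relatively prime).
std::vector<double> halton_point(int n, const std::vector<int>& bases)
{
    std::vector<double> x(bases.size());
    for (std::size_t i = 0; i < bases.size(); ++i)
        x[i] = radical_inverse(n, bases[i]);
    return x;
}
```

With bases 2 and 3, for example, the fifth point is $(\phi_2(5), \phi_3(5)) = (0.625, 7/9)$.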
If the bases $b_i$ are pairwise relatively prime, Niederreiter (Theorem 3.6, [Nie92]) shows
that the star discrepancy of the Halton sequence is bounded by
\[
  D_N^*(x_1, x_2, \dots, x_N) < \frac{s}{N} + \frac{1}{N} \prod_{i=1}^{s}
  \left( \frac{b_i - 1}{2 \log b_i} \log N + \frac{b_i + 1}{2} \right), \quad N \ge 1,
\]
and if the $b_i$ are the first prime numbers, he finds that
\[
  D_N^*(x_1, x_2, \dots, x_N) \le A(b_1, \dots, b_s)\, N^{-1} (\log N)^s
  + O\bigl( N^{-1} (\log N)^{s-1} \bigr) \tag{1.6}
\]
with
\[
  A(b_1, \dots, b_s) = \prod_{i=1}^{s} \frac{b_i - 1}{2 \log b_i}.
\]
A little modification of the Halton sequence results in the Hammersley point set,
which has a smaller star discrepancy bound than the Halton sequence but comes with a
fixed number of points.
Definition 1.4.4 (Hammersley point set)
Again $s \ge 2$ defines the dimension and let $b_1, \dots, b_{s-1}$ be integers. Then the $N$-element
point set $(x_1, \dots, x_N)$ with $x_n$ defined by
\[
  x_n = \left( \frac{n}{N},\ \phi_{b_1}(n), \dots, \phi_{b_{s-1}}(n) \right),
  \quad n = 0, 1, \dots, N - 1,
\]
is called the $N$-element Hammersley point set in the bases $b_1, \dots, b_{s-1}$.
The star discrepancy of the Hammersley point set is again bounded from above, and
if the $b_i$ are pairwise relatively prime we get
\[
  D_N^*(x_1, \dots, x_N) < \frac{s}{N} + \frac{1}{N} \prod_{i=1}^{s-1}
  \left( \frac{b_i - 1}{2 \log b_i} \log N + \frac{b_i + 1}{2} \right).
\]
Similarly, with $b_1, b_2, \dots, b_{s-1}$ the first prime numbers we have⁴
\[
  D_N^*(x_1, \dots, x_N) \le A(b_1, \dots, b_{s-1})\, N^{-1} (\log N)^{s-1}
  + O\bigl( N^{-1} (\log N)^{s-2} \bigr). \tag{1.7}
\]
These two constructions contain a severe catch that follows from the behaviour of
the Van der Corput sequence with a large base $b$: if the base is large, then the sequence
produces long monotone segments. This fact is easily seen from Figure 1.2, which shows
the first 1000 points of the Halton sequence⁵; the left panel shows the projection onto
dimensions 1 and 2, the right panel the projection onto dimensions 29 and 30.
Figure 1.2: First 1000 points of the 30-dimensional Halton sequence. (a) Projection onto
dimensions 1 and 2; (b) projection onto dimensions 29 and 30. The figure shows the
problem in higher dimensions.
This problem is also reflected by the constant $A$ in (1.6) and (1.7). If we take the
first $s$ prime numbers as bases, we get
\[
  A_s = A(b_1, \dots, b_s) = \prod_{i=1}^{s} \frac{b_i - 1}{2 \log b_i},
\]
and Niederreiter [Nie92] shows that from the prime number theorem it follows that
\[
  \lim_{s \to \infty} \frac{\log A_s}{s \log s} = 1.
\]
So the super-exponential growth of $A_s$ is a serious problem as the dimension rises; in
fact, the discrepancy bounds (1.6) and (1.7) become useless. To overcome this problem,
Faure [Fau82] developed a different extension of the Van der Corput sequence to multiple
dimensions, where all coordinates use the same base $b$. This base then has to be at
least as large as the dimension itself, but it can be much smaller than the largest base
of a Halton sequence of the same dimension. Additionally, Sobol' also extended the
Van der Corput sequence to multiple dimensions, but there, all dimensions use the base
2. We will present both constructions in a somewhat more general concept called nets
and sequences.
⁴ This is shown in Theorem 3.8, [Nie92].
⁵ For the bases, we used the first 30 prime numbers.
1.5 (t,m,s)-nets and (t,s)-sequences
1.5.1 Variance reduction
A basic variance reduction technique from Monte Carlo methods is called stratified
sampling. It relies on the subdivision of the $s$-dimensional unit cube into a partition
$A_1, \dots, A_k$ and then taking random samples from these strata. In one dimension, this
means that we partition the unit interval into $n$ strata
\[
  A_1 = \left[ 0, \tfrac{1}{n} \right), \quad
  A_2 = \left[ \tfrac{1}{n}, \tfrac{2}{n} \right), \quad \dots, \quad
  A_n = \left[ \tfrac{n-1}{n}, 1 \right]
\]
and then draw $n$ samples according to
\[
  V_i = \frac{i - 1}{n} + \frac{U_i}{n}, \quad i = 1, \dots, n,
\]
where the $U_1, \dots, U_n$ are independent, uniformly distributed random variables from
$[0,1)$.
For the 2-dimensional unit cube, this can easily be demonstrated by the following
figure, where the domain is divided into 16 strata and in each stratum a sample is
placed.
Figure 1.3: An example of stratified sampling. The domain is divided into 16 strata.
It is easy to see that stratified sampling (with proportional point allocation, i.e.
placing the same number of points in each stratum) can only decrease the variance, as
Glasserman shows for example in [Gla04], page 217.
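In code, the one-dimensional stratification above can be sketched as follows (our own illustration using the C++ `<random>` facilities, not the thesis implementation):

```cpp
#include <random>
#include <vector>

// One stratified sample per stratum A_i = [(i-1)/n, i/n):
//   V_i = (i-1)/n + U_i/n  with U_i uniform on [0,1).
std::vector<double> stratified_sample_1d(int n, std::mt19937& gen)
{
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::vector<double> v(n);
    for (int i = 1; i <= n; ++i)
        v[i - 1] = (i - 1.0) / n + U(gen) / n;
    return v;
}
```

By construction, the $i$-th sample always lands in the $i$-th stratum, so every stratum is hit exactly once.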
Another type of variance reduction, called Latin hypercube sampling, is an extension
of stratified sampling to multiple dimensions. The problem with stratified sampling
is that in dimension $s$, the partitioning of each coordinate into $K$ strata results in
$K^s$ strata for the hypercube, and this requires a sample size of at least $K^s$ to ensure
that each stratum is sampled. This reduces the benefit of stratified sampling for even
moderately large $s$.
So in Latin hypercube sampling, only the one-dimensional marginals of a multidimensional
distribution are stratified. If we again look at the unit cube $[0,1]^s$, this
means that we subdivide the unit cube into $K$ intervals along each coordinate. Then
we choose the samples randomly such that each interval contains exactly one point.
Again, a little figure in dimension $s = 2$ with $K = 16$ will clarify this technique:
Figure 1.4: An example of Latin hypercube sampling.
We can see that the one-dimensional projections are perfectly stratified. Kollig
[KK02] notes that the variance can slightly increase in comparison to plain Monte
Carlo,
\[
  \sigma^2_{\mathrm{LHS}} \le \left( \frac{N}{N-1} \right)^{\min\{s-1,\,1\}} \cdot \sigma^2_{\mathrm{MC}},
\]
due to the presence of more restrictions in the placement of the samples, but we need
far fewer points to stratify each dimension compared to stratified sampling. He notes that
in practical applications, the variance is often reduced with Latin hypercube sampling.
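The construction just described can be sketched directly: each coordinate gets its own random permutation of the $N$ strata, which is exactly what makes every one-dimensional projection perfectly stratified. (The function and variable names below are ours, not part of the thesis code.)

```cpp
#include <algorithm>
#include <random>
#include <vector>

// N Latin hypercube samples in [0,1)^s: along every coordinate, each of
// the N intervals [m/N, (m+1)/N) receives exactly one point.
std::vector<std::vector<double> > latin_hypercube(int N, int s, std::mt19937& gen)
{
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::vector<std::vector<double> > x(N, std::vector<double>(s));
    for (int j = 0; j < s; ++j) {
        std::vector<int> stratum(N);
        for (int i = 0; i < N; ++i)
            stratum[i] = i;
        // random pairing of points and strata along coordinate j
        std::shuffle(stratum.begin(), stratum.end(), gen);
        for (int i = 0; i < N; ++i)
            x[i][j] = (stratum[i] + U(gen)) / N;
    }
    return x;
}
```

Note that only $N$ points are needed regardless of the dimension $s$, in contrast to the $K^s$ points required by full stratification.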
1.5.2 Nets and sequences
Now we come to the definition of nets and sequences, which Niederreiter developed in
[Nie92] and which generalizes the concept of stratification seen above. First, we need
some notation.
Definition 1.5.1 (Elementary interval)
Let $s \ge 1$ denote the dimension and fix an integer base $b \ge 2$. A subinterval $E \subseteq [0,1)^s$
of the form
\[
  E = \times_{i=1}^{s} \left[ \frac{a_i}{b^{d_i}}, \frac{a_i + 1}{b^{d_i}} \right)
\]
with $a_i, d_i \in \mathbb{N}_0$, $0 \le a_i < b^{d_i}$, $1 \le i \le s$, is called an elementary interval in base $b$.
The volume of such an elementary interval $E$ is
\[
  \lambda(E) = \prod_{j=1}^{s} \frac{1}{b^{d_j}} = b^{-\sum_{j=1}^{s} d_j}.
\]
If we take again $[0,1]^2$, for example, all elementary intervals in base 2 with volume
$\lambda(E) = \frac{1}{16}$ are shown in Figure 1.5.
Figure 1.5: Example of elementary intervals in base 2, with volume 1/16.
With this in mind, we can define a (t,m,s)-net.
Definition 1.5.2 ((t,m,s)-net)
Let $0 \le t \le m$ be integers. A (t,m,s)-net in base $b$ is a set $P$ consisting of $b^m$ points,
$P \subseteq [0,1)^s$, such that $\#\{E \cap P\} = b^t$ for every elementary interval $E$ in base $b$ with
$\lambda(E) = b^{t-m}$.
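For $s = 1$, the net property is easy to verify mechanically. The sketch below (our own helper) checks whether $b^m$ given points form a (0,m,1)-net, i.e. whether every elementary interval of volume $b^{-m}$ contains exactly one point; the first $2^4$ Van der Corput points in base 2 (indices $n = 0, \dots, 15$) pass this test.

```cpp
#include <cstddef>
#include <vector>

// Does the point set form a (0,m,1)-net in base b?  It must consist of
// b^m points with exactly one point in every interval [a/b^m, (a+1)/b^m).
bool is_0m1_net(const std::vector<double>& x, int b, int m)
{
    int bm = 1;
    for (int i = 0; i < m; ++i)
        bm *= b;
    if (static_cast<int>(x.size()) != bm)
        return false;
    std::vector<int> count(bm, 0);
    for (std::size_t i = 0; i < x.size(); ++i)
        ++count[static_cast<int>(x[i] * bm)]; // index a of the containing interval
    for (int a = 0; a < bm; ++a)
        if (count[a] != 1)
            return false;
    return true;
}

// Radical inverse phi_b(n), used here to generate the test points.
double radical_inverse(int n, int b)
{
    double phi = 0.0, f = 1.0 / b;
    for (; n > 0; n /= b, f /= b)
        phi += f * (n % b);
    return phi;
}
```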
In Theorem 4.10, [Nie92], Niederreiter shows that for $P$ a (t,m,s)-net in base $b$ the
star discrepancy fulfills the relation
\[
  D_N^*(P) \le B(s,b)\, b^t N^{-1} (\log N)^{s-1} + O\bigl( b^t N^{-1} (\log N)^{s-2} \bigr),
\]
where $N = b^m$ and
\[
  B(s,b) = \begin{cases}
    \left( \dfrac{b-1}{2 \log b} \right)^{s-1} & \text{if } s = 2 \text{ or } b = 2,\ s = 3, 4, \\[8pt]
    \dfrac{1}{(s-1)!} \left( \dfrac{\lfloor b/2 \rfloor}{\log b} \right)^{s-1} & \text{otherwise},
  \end{cases}
\]
and the implied constant in the O-symbol depends only on $b$ and $s$. So we can consider
$t$ as a quality parameter of the net. One thing to mention is that there is a relationship
between the parameters $t$, $m$ and $s$:
for $m \ge 2$, a (0,m,s)-net in base $b$ can only exist if $s \le b + 1$; see Corollary 4.21, [Nie92].
An extension to an infinite number of points is the
Definition 1.5.3 ((t,s)-sequence)
Let $t \ge 0$ be an integer. A sequence $x_0, x_1, \dots$ of points in $[0,1)^s$ is a (t,s)-sequence in
base $b$ if, for all integers $k \ge 0$ and $m \ge t$, the point set
\[
  \bigl\{ x_n : k b^m \le n < (k+1) b^m \bigr\}
\]
is a (t,m,s)-net in base $b$.
A simplified bound (see Theorem 4.17, [Nie92]) for the star discrepancy of the first
$N$ terms of a (t,s)-sequence is
\[
  D_N^*(x_1, \dots, x_N) \le C(s,b)\, b^t N^{-1} (\log N)^{s}
  + O\bigl( b^t N^{-1} (\log N)^{s-1} \bigr), \quad N \ge 2,
\]
with
\[
  C(s,b) = \begin{cases}
    \dfrac{1}{s} \left( \dfrac{b-1}{2 \log b} \right)^{s} & \text{if } s = 2 \text{ or } b = 2,\ s = 3, 4, \\[8pt]
    \dfrac{1}{s!} \cdot \dfrac{b-1}{2 \lfloor b/2 \rfloor} \left( \dfrac{\lfloor b/2 \rfloor}{\log b} \right)^{s} & \text{otherwise},
  \end{cases}
\]
and again the implied constant in the Landau symbol depends only on $b$ and $s$. Again
the base $b$, the dimension $s$ and the quality parameter $t$ are linked⁶:
for every base $b \ge 2$ and every dimension $s \ge 1$, a necessary condition for the
existence of a (t,s)-sequence in base $b$ is
\[
  t \ge \frac{s}{b} - \log_b \frac{(b-1)s + b + 1}{2}.
\]
From Figure 1.6, which shows on the right a (0,4,2)-net in base 2, we can
see that the notion of a net combines both stratified sampling (left panel) and Latin
hypercube sampling (middle panel). Even more stratification is imposed⁷.
Figure 1.6: A (0,4,2)-net in base 2 is shown in the right figure. The left figure shows
the stratification imposed by stratified sampling while the middle figure shows the
stratification imposed by Latin hypercube sampling.
Now we will present two important constructions of (t,s)-sequences.
1.5.3 Two constructions for (t,s)-sequences
Until now, we have made statements about nets and sequences without concerning their
construction. In this subsection, we will present two specific types. We recall that
the problem of the Halton sequence in higher dimensions was the large base of the
corresponding Van der Corput sequence. The next construction, due to Faure, tries to
overcome this problem by choosing one base for all dimensions. The next definition
follows Section 5.2.2, [Gla04].
⁶ See Theorem 1, [NX98], page 271.
⁷ This was highlighted in [KK02].
Definition 1.5.4 (Faure sequence)
As usual, let $s$ denote the dimension. For the base $b$, we choose the smallest prime
number greater than or equal to $s$. Again, as in the definition of the radical inverse
function (1.4.1), we denote by $a_i$ the coefficients in the base-$b$ expansion of $n$:
\[
  n = \sum_{i=0}^{\infty} a_i b^i = \sum_{i=0}^{\infty} a_i(n) b^i. \tag{1.8}
\]
Now the $i$-th coordinate, $i = 1, \dots, s$, of the $k$-th point of the Faure sequence is given
by
\[
  x_k^{(i)} = \sum_{j=1}^{\infty} \frac{y_j^{(i)}(k)}{b^j}, \tag{1.9}
\]
where
\[
  y_j^{(i)}(k) = \sum_{l=0}^{\infty} \binom{l}{j-1} (i-1)^{l-j+1} a_l(k) \mod b \tag{1.10}
\]
with
\[
  \binom{m}{n} = \begin{cases} \dfrac{m!}{(m-n)!\, n!}, & m \ge n, \\[4pt] 0, & \text{otherwise}. \end{cases}
\]
All the sums above have only a finite number of nonzero summands. If the base-$b$
expansion (1.8) of $k$ has exactly $r$ terms, then the sum (1.10) also has $r$ summands.
But if $j \ge r + 1$, then the summands for $l = 0, \dots, r-1$ in (1.10) also vanish, and this
in turn implies that (1.9) has at most $r$ nonzero terms. So we can view the construction
as the result of the matrix-vector multiplication
\[
  \begin{pmatrix} y_1^{(i)}(k) \\ y_2^{(i)}(k) \\ \vdots \\ y_r^{(i)}(k) \end{pmatrix}
  = C^{(i-1)} \begin{pmatrix} a_0(k) \\ a_1(k) \\ \vdots \\ a_{r-1}(k) \end{pmatrix} \mod b,
\]
where the modulo is applied componentwise and the $r \times r$ matrix $C^{(i)}$ has the entries
\[
  C^{(i)}(m, n) = \binom{n-1}{m-1} i^{n-m}
\]
for $n \ge m$ and zero otherwise. Note that $C^{(0)}$ is the identity matrix. The matrices $C$
have the following recursive relation
\[
  C^{(i)} = C^{(1)} C^{(i-1)}, \quad i = 1, 2, \dots,
\]
and in fact, the transformations (1.9) and (1.10) result in a permutation of the
coefficients $a_i(n)$.
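The digit transformation (1.9)/(1.10) is compact enough to sketch directly. The fragment below is our own illustration (the thesis itself uses the QuantLib generator shown later); it computes the $i$-th coordinate of the $k$-th Faure point for a prime base $b \ge s$. For $i = 1$, the matrix $C^{(0)}$ is the identity, so the first coordinate reduces to the radical inverse $\phi_b(k)$.

```cpp
#include <vector>

// Binomial coefficient C(m, n) for small arguments, 0 if m < n.
long long binom(int m, int n)
{
    if (n < 0 || m < n) return 0;
    long long c = 1;
    for (int i = 1; i <= n; ++i)
        c = c * (m - n + i) / i;   // exact at every step
    return c;
}

// i-th coordinate (i = 1, ..., s) of the k-th Faure point in prime base b,
// following (1.9) and (1.10).
double faure_coordinate(int k, int i, int b)
{
    std::vector<int> a;                      // digits a_0(k), a_1(k), ...
    for (int n = k; n > 0; n /= b)
        a.push_back(n % b);
    int r = static_cast<int>(a.size());
    double x = 0.0, f = 1.0 / b;
    for (int j = 1; j <= r; ++j, f /= b) {   // digit y_j^{(i)}(k) of the result
        long long y = 0;
        for (int l = j - 1; l < r; ++l) {
            long long p = 1;                 // (i-1)^(l-j+1) mod b
            for (int e = 0; e < l - j + 1; ++e)
                p = (p * (i - 1)) % b;
            y += (binom(l, j - 1) % b) * p * a[l];
        }
        x += (y % b) * f;
    }
    return x;
}
```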
Faure [Fau82] showed that the star discrepancy of a sequence constructed this way
satisfies
\[
  D_N^*(x_1, \dots, x_N) \le F_s(b) \frac{(\log N)^s}{N}
  + O\!\left( \frac{(\log N)^{s-1}}{N} \right)
\]
with
\[
  F_s(b) = \frac{1}{s!} \left( \frac{b-1}{2 \log b} \right)^{s},
\]
so the Faure sequence is a low-discrepancy sequence and the coefficient $F_s(b) \to 0$ as
$s \to \infty$. Moreover, the $s$-dimensional Faure sequence is a (0,s)-sequence.
There exist many computer implementations of the Faure sequence; we are going
to use the one of QuantLib⁸, for which we provide a wrapper to flens⁹ vectors.
Listing 1.1: Faure sequence generator

static QuantLib::FaureRsg faure_rsg = QuantLib::FaureRsg(1);

void faure_init(int dim)
{
    std::cout << "Faure qrng restarted in dimension " << dim
              << " ... " << std::flush;
    faure_rsg = QuantLib::FaureRsg(dim);
    std::cout << "done!" << std::endl;
}

flens::DenseVector<flens::Array<double> > faure_next(int dim)
{
    flens::DenseVector<flens::Array<double> > quasi(dim);
    QuantLib::Array point;
    point = faure_rsg.nextSequence().value;
    for (int i = 0; i < dim; i++)
    {
        quasi(i + 1) = point[i];
    }
    return quasi;
}
Another possibility to construct high-dimensional, low-discrepancy sequences was
developed by Sobol' in [Sob67]. In contrast to the Faure sequence, where the base
$b$ had to be a prime number greater than or equal to the dimension $s$, the base for all
coordinates of the Sobol' sequence is always 2. The construction is again based on the
one-dimensional Van der Corput sequence, and the coordinate sequences result from
permutations of segments of the one-dimensional sequence. The following definition
goes along Section 5.2.3, [Gla04].
⁸ QuantLib is an open-source quantitative finance library written in C++, which is available from
http://www.quantlib.org, but also included in many GNU/Linux distributions. QuantLib is available
under a BSD-like license, making it possible to include it in proprietary, commercial applications
as well.
⁹ Throughout this thesis, all computations will be done with C++. For both convenience and
efficiency, we use the Flexible Library for Efficient Numerical Solutions, which is available from
http://flens.sourceforge.net/. This library allows us to have comfortable classes for matrices and
vectors while still using the performance of a BLAS implementation.
Definition 1.5.5 (Sobol' sequence)
Because all coordinates of the Sobol' sequence are constructed the same way, only with a
different generator matrix $V$, we will now show exemplarily the construction of a
single coordinate. The elements of $V$ are either 0 or 1. The columns of $V$ are
the binary expansions of a set of direction numbers $v_1, \dots, v_r$. The number $r$ could
be arbitrarily large, but in practice it is bounded by finite computer precision. Let
$a(k) = (a_0(k), a_1(k), \dots, a_{r-1}(k))^T$ denote the vector of coefficients of the binary
representation of $k$, i.e.
\[
  k = \sum_{i=0}^{r-1} a_i(k)\, 2^i.
\]
If we set
\[
  \begin{pmatrix} y_1(k) \\ y_2(k) \\ \vdots \\ y_r(k) \end{pmatrix}
  = V \begin{pmatrix} a_0(k) \\ a_1(k) \\ \vdots \\ a_{r-1}(k) \end{pmatrix} \mod 2, \tag{1.11}
\]
then $y_1(k), \dots, y_r(k)$ are the coefficients of the binary expansion of the $k$-th point of
the sequence, i.e.
\[
  x_k = \sum_{i=1}^{r} \frac{y_i(k)}{2^i}.
\]
The multiplication (1.11) can be represented as
\[
  a_0(k) v_1 \oplus a_1(k) v_2 \oplus \dots \oplus a_{r-1}(k) v_r, \tag{1.12}
\]
where $v_i$ is the $i$-th column of the matrix $V$ (of the $r \times r$ submatrix) and $\oplus$ denotes
binary addition. If we think of $v_i$ as the binary representation of a number and $\oplus$
as a bitwise XOR operation, we see that the calculation (1.11) can be performed very
fast on a (binary-digit based) computer.
The fundamental question is now how to specify the generator matrix $V$, or
equivalently, the direction numbers $v_j$. Sobol' proposed to start by selecting a primitive
polynomial over binary arithmetic, that is a polynomial
\[
  x^q + c_1 x^{q-1} + \dots + c_{q-1} x + 1 \tag{1.13}
\]
with coefficients $c_l \in \{0, 1\}$ with the two properties:
· the polynomial is irreducible, and
· the smallest power $p$ for which the polynomial divides $x^p + 1$ is $p = 2^q - 1$.
For example, all primitive polynomials of degree one, two and three are
\[
  x + 1, \quad x^2 + x + 1, \quad x^3 + x + 1 \quad \text{and} \quad x^3 + x^2 + 1.
\]
Then, from the coefficients of a polynomial like (1.13), a recurrence relation is defined
through
\[
  m_j = 2 c_1 m_{j-1} \oplus 2^2 c_2 m_{j-2} \oplus \dots \oplus 2^{q-1} c_{q-1} m_{j-q+1}
        \oplus 2^q m_{j-q} \oplus m_{j-q}, \quad j > q, \tag{1.14}
\]
where the $m_j$ are integers (actually their binary representations). With the $m_j$, we can
define the direction numbers by setting
\[
  v_j = \frac{m_j}{2^j}.
\]
To fully specify the $v_j$, we need initial values $m_1, \dots, m_q$. For those, a minimal
requirement is that each initializing $m_j$ has to be an odd integer less than $2^j$. Then all
subsequent $m_j$ from (1.14) will share this property, and the $v_j$ will lie strictly between
0 and 1.
We will clarify this with an example. Consider the polynomial
\[
  x^3 + x^2 + 1
\]
with degree $q = 3$. Then we have $c_1 = 1$, $c_2 = 0$ and the recurrence, compare (1.14),
\[
  m_j = 2 c_1 m_{j-1} \oplus 2^2 c_2 m_{j-2} \oplus 2^3 m_{j-3} \oplus m_{j-3}
      = 2 m_{j-1} \oplus 8 m_{j-3} \oplus m_{j-3}.
\]
If we now use the initializing values $m_1 = 1$, $m_2 = 3$ and $m_3 = 3$, we get
\[
  m_4 = 2 \cdot 3 \oplus 8 \cdot 1 \oplus 1 = 0110 \oplus 1000 \oplus 0001 = 1111 = 15
\]
and
\[
  m_5 = 2 \cdot 15 \oplus 8 \cdot 3 \oplus 3 = 11110 \oplus 11000 \oplus 00011 = 00101 = 5.
\]
The five corresponding direction numbers (written in binary) are
\[
  v_1 = 0.1, \quad v_2 = 0.11, \quad v_3 = 0.011, \quad v_4 = 0.1111,
  \quad \text{and} \quad v_5 = 0.00101,
\]
and finally the corresponding generator matrix $V$ is
\[
  V = \begin{pmatrix}
    1 & 1 & 0 & 1 & 0 \\
    0 & 1 & 1 & 1 & 0 \\
    0 & 0 & 1 & 1 & 1 \\
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 1
  \end{pmatrix}.
\]
The sequence point $x_k$ is now calculated by multiplying its binary coefficient vector
$a(k)$ by $V$ modulo 2 and interpreting the result as the binary coefficients of $x_k$. For
example, for $x_3$,
\[
  V a(3) \bmod 2 = V \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \bmod 2
  = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix},
\]
so $x_3 = \frac{1}{4}$. As in the case of the Faure sequence, this ends up in a permutation of
the Van der Corput sequence.
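To make the example concrete, here is a small self-contained sketch (our own code, not the QuantLib implementation) that reproduces this coordinate: it generates the direction integers $m_j$ for $x^3 + x^2 + 1$ from the initial values $m_1 = 1$, $m_2 = 3$, $m_3 = 3$ and evaluates (1.12) with bitwise XOR.

```cpp
#include <vector>

// Direction integers m_j for the primitive polynomial x^3 + x^2 + 1
// (q = 3, c_1 = 1, c_2 = 0) via (1.14): m_j = 2 m_{j-1} ^ 8 m_{j-3} ^ m_{j-3}.
std::vector<unsigned> sobol_m(int r)
{
    std::vector<unsigned> m(r + 1);
    m[1] = 1; m[2] = 3; m[3] = 3;            // initial odd values m_j < 2^j
    for (int j = 4; j <= r; ++j)
        m[j] = (2u * m[j - 1]) ^ (8u * m[j - 3]) ^ m[j - 3];
    return m;
}

// k-th element of this Sobol' coordinate: XOR of the direction numbers
// v_j = m_j / 2^j selected by the binary digits of k, cf. (1.12).
double sobol_coordinate(unsigned k, int r)
{
    std::vector<unsigned> m = sobol_m(r);
    unsigned num = 0;                        // numerator of x_k over 2^r
    for (int j = 1; j <= r && k != 0; ++j, k >>= 1)
        if (k & 1u)
            num ^= m[j] << (r - j);          // v_j scaled by 2^r
    return num / static_cast<double>(1u << r);
}
```

With $r = 5$, the function reproduces $m_4 = 15$, $m_5 = 5$ and the worked example $x_3 = \frac{1}{4}$.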
The actual computation of the Sobol' numbers differs a bit in modern implementations,
because there, the binary representation $a(k)$ of $k$ is replaced by a Gray code¹⁰
representation. This makes it possible to compute the numbers $x_k$ recursively.
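A sketch of this Gray-code update for the example coordinate above (again our own code, with the direction integers $m_1, \dots, m_5$ from the example assumed as input): successive points differ by a single direction number, indexed by the position of the rightmost zero bit of the counter, so each new point costs one XOR. The points are the same as before, only enumerated in a different order.

```cpp
#include <cstddef>
#include <vector>

// Gray-code update for one Sobol' coordinate: starting from 0, the point
// for counter k+1 is obtained from the previous one by XOR-ing a single
// direction number, chosen by the rightmost zero bit of k.
struct SobolGray
{
    std::vector<unsigned> v;   // direction numbers v_j scaled by 2^r
    unsigned state;            // current point, scaled by 2^r
    int r;

    SobolGray(const std::vector<unsigned>& m, int r_)
        : v(m.size()), state(0), r(r_)
    {
        for (std::size_t j = 1; j < m.size(); ++j)
            v[j] = m[j] << (r - static_cast<int>(j));
    }

    // Call with k = 0, 1, 2, ... to obtain the points one after another.
    double next(unsigned k)
    {
        int j = 1;
        while (k & 1u) { k >>= 1; ++j; }   // rightmost zero bit of k
        state ^= v[j];
        return state / static_cast<double>(1u << r);
    }
};
```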
Now we have the construction of a single coordinate sequence. For a multidimensional
Sobol' sequence, we associate the $k$-th dimension with a primitive polynomial $p_k$ of
degree $q_k$. Then we have to choose the $q_k$ initial values $m_1, \dots, m_{q_k}$. These
initialisation numbers have a great influence on the resulting sequence, and a number of
strategies have been proposed to select them. Without stating them here, we refer the
reader to Section 8.3, [Jäc02], where details are presented and a new initialisation
strategy is derived.
Last to mention is that the QuantLib implementation, which we are going to use,
does the selection in the way that Jäckel proposes¹¹. We think that this new way of
initialisation substantially improves the performance of the Sobol' sequence in high
dimensions.
¹⁰ In Gray code, the binary representations of two subsequent numbers $k$ and $k+1$ share all bits
except one.
¹¹ The specific choice for the free direction numbers that Jäckel proposes guarantees a good
equidistribution for the lower dimensions as well as a regularity-breaking initialization for the higher
dimensions. Therefore, the QuantLib Sobol' generator will not suffer too much from a high dimension.
Details
- Pages
- Edition: Original edition
- Year: 2007
- ISBN (eBook): 9783836615624
- File size: 1.1 MB
- Language: English
- Institution: Universität Ulm – Fakultät für Mathematik und Wirtschaftswissenschaften
- Publication date: 2014 (April)
- Grade: 1.0
- Keywords: quasi-Monte Carlo, Malliavin calculus, asset allocation, computational finance, portfolio