
Centre for Computational Geostatistics (CCG)

Guidebook Series Vol. 13

Tools for Multivariate Geostatistical Modeling

Ryan M. Barnett

August 2011
Centre for Computational Geostatistics (CCG)

Guidebook Series
Volume 1. Guide to Geostatistical Grade Control and Dig Limit Determination

Volume 2. Guide to Sampling

Volume 3. Guide to SAGD (Steam Assisted Gravity Drainage) Reservoir Characterization Using
Geostatistics

Volume 4. Guide to Recoverable Reserves with Uniform Conditioning

Volume 5. Users Guide to Alluvsim Program

Volume 6. New Programs for Data Display

Volume 7. Geostatistics with Compositional Data

Volume 8. Application of Spectral Techniques to Geostatistical Modeling

Volume 9. Introduction to Disjunctive Kriging

Volume 10. Ensemble Kalman Filter for Geostatistical Applications

Volume 11. Review of Geostatistical Simulation Methods for Large Problems

Volume 12. Permeability from Core Photos and Image Logs

Volume 13. Tools for Multivariate Geostatistical Modeling

Volume 14. Programs to Aid the Decision of Stationarity

Copyright 2011, Centre for Computational Geostatistics

Published by Centre for Computational Geostatistics

3-133 Markin/CNRL Natural Resources Engineering Facility,

Edmonton, AB, Canada T6G 2W2

http://www.uofaweb.ualberta.ca/ccg/

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior permission from
the Centre for Computational Geostatistics.
Preface
Spatial modeling of multiple related variables is a common task within the field of geostatistics, where
overarching goals often revolve around the accurate reproduction of histograms and multivariate
relationships. These multivariate relationships may be complex in nature, involving non-linear,
heteroscedastic, and constraint features. Covariance based modeling frameworks, by comparison,
require data that are well behaved and free of these complex features. Even further, a multivariate
normal distribution will be required if Gaussian simulation or multiGaussian kriging is to proceed. A
variety of multivariate transformation techniques exist to facilitate traditional geostatistical modeling of
complex geologic data. Depending on the assumptions of the subsequent modeling scheme, the
transformations may be used for the removal of complex features, decorrelation of the variables,
formation of a multivariate Gaussian distribution and/or dimension reduction. Following estimation or
simulation of the transformed distribution, the back-transformation reintroduces the original
dimensionality, complex features and correlations. The number of transformations available to a
practitioner may cause confusion in selecting the correct technique. As they are often used in sequential
combination, additional problems may arise with regard to the ordering of their application. The following
guidebook will introduce and demonstrate the most commonly used multivariate geostatistical
transformations. The purpose, strengths, weaknesses and potential location within multiple
transformation workflows will be outlined for each technique. A standardized CCG multivariate toolkit
software package will also be introduced, presenting the program and associated parameters for every
discussed transformation.
Table of Contents
Introduction ...................................................................................................................................................................1
1.1 Non-linear Transforms and Biased Linear Estimates ....................................................................................2
1.2 Multivariate Gaussian Distribution ..............................................................................................................2
1.3 Independent Simulation and Coregionalization ...........................................................................................3
1.4 Transformations ...........................................................................................................................................4
1.5 Nickel Laterite Case Study ............................................................................................................................6
1.6 CCG Multivariate Toolkit Software Package.................................................................................................7
Logratios ........................................................................................................................................................................8
2.1 Transformation Concept .............................................................................................................................. 8
2.2 Transformation Theory ................................................................................................................................9
2.3 Example ......................................................................................................................................................10
2.4 Software .....................................................................................................................................................12
Normal Score Transformation .....................................................................................................................................14
3.1 Transformation Concept ............................................................................................................................14
3.2 Transformation Theory .............................................................................................................................. 14
3.3 Example ......................................................................................................................................................16
3.4 Software .....................................................................................................................................................19
Stepwise Conditional Transformation .........................................................................................................................22
4.1 Transformation Concept ............................................................................................................................22
4.2 Transformation Theory .............................................................................................................................. 22
4.3 Example ......................................................................................................................................................26
4.4 Software .....................................................................................................................................................32
Principal Component Analysis (PCA) ...........................................................................................................................35
5.1 Transformation Concept ............................................................................................................................35
5.2 Transformation Theory .............................................................................................................................. 37
5.3 Example ......................................................................................................................................................38
5.4 Software .....................................................................................................................................................44
Conditional Standardization ........................................................................................................................................46
6.1 Transformation Concept ............................................................................................................................46
6.2 Transformation Theory .............................................................................................................................. 47
6.3 Example ......................................................................................................................................................49
6.4 Software .....................................................................................................................................................54
Minimum/Maximum Autocorrelation Factors (MAF) .................................................................................................56
7.1 Transformation Concept ............................................................................................................................56
7.2 Transformation Theory .............................................................................................................................. 58
7.3 Example ......................................................................................................................................................60
7.4 Software .....................................................................................................................................................68
Alternating Conditional Expectations (ACE) ................................................................................................................71
8.1 Transformation Concept ............................................................................................................................71
8.2 Transformation Theory .............................................................................................................................. 75
8.3 Example ......................................................................................................................................................77
8.4 Software .....................................................................................................................................................85
Histogram Reproduction (TRANS) ...............................................................................................................................87
Chained Transformation Workflows............................................................................................................................92
Bibliography .................................................................................................................................................................96

List of Figures
Figure 1.1: Schematic bivariate cross-plots displaying complex relationships of heteroscedasticity (left), non-
linearity (middle) and constraints (right). ......................................................................................................................1
Figure 1.2: Bivariate cross-plots of Ni, Fe, SiO2 and MgO, presented in a format which matches the upper
covariance matrix triangle (red boxes in the schematic variance-covariance matrix at the bottom left). ..................7
Figure 2.1: Distributions of the sum of the four major Nickel laterite components for the original assays (left) and
simulated realizations (right). The components of the realizations are shown to exceed the natural constraint of
100%. .............................................................................................................................................................................8
Figure 2.2: Cross-plots of the additive logratio transformed variables. ......................................................................10
Figure 2.3: Original and transformed Ni and Fe cross-plot from .................................................................................11
Figure 2.4: Sum of the major components for the simulated realizations following back-transformation. ...............11
Figure 2.5: Parameters for logratio .............................................................................................................................12
Figure 2.6: Parameters for logratio_b .........................................................................................................................13
Figure 3.1: Schematic univariate transformation of the original data (Z) to the standard normal Gaussian
distribution (Y). ............................................................................................................................................................15
Figure 3.2: Graphical representation of the back-transformation of a zi simulation or multiGaussian kriging
probability. ..................................................................................................................................................................16
Figure 3.3: Original logratio Ni distribution (left) and normal score transformed distribution (right). The cumulative
(above) and probability (below) functions are displayed for both the original and transformed distributions. ........17
Figure 3.4: Cross-plots between logratio transformed Ni, Fe and SiO2. ......................................................................18
Figure 3.5: Cross-plots between normal score transformed Ni, Fe and SiO2. Note the change in the marginal
histograms between this and the logratio transformed distributions in Figure 3.4. ..................................19
Figure 3.6: Par file for nscoremv..................................................................................................................................20
Figure 3.7: Par file for nscoremv_b ............................................................................................................21
Figure 4.1: Visual representation of the stepwise transform for a bivariate case (Leuangthong 2003) .....................24
Figure 4.2: Final step of a trivariate SC transformation, where the third variable Z1 is partitioned into 9 bins based
on 3 classes of the two transformed conditioning variables (left), and transformed to a univariate marginal
distribution (right) that forms a multivariate normal distribution with the conditioning variables (left). ..................25
Figure 4.3: Stepwise Conditional transformation workflow for a trivariate system. ..................................................27
Figure 4.4: Cross-plots between logratio transformed Ni, Fe and SiO2. These will be the pre-transform
distributions for applying the trivariate Stepwise Conditional transformation. .........................................................28
Figure 4.5: Cross-plots between Ni, Fe and SiO2 following a trivariate SC transformation. ........................................28
Figure 4.6: Horizontal (left) and vertical (right) experimental semivariograms and fitted models for normal score
transformed Ni, Fe and SiO2. .......................................................................................................................................29
Figure 4.7: Horizontal (left) and vertical (right) experimental semivariograms and fitted models for SC transformed
Ni, Fe and SiO2. ............................................................................................................................................................30
Figure 4.8: Cross-plots between back transformed simulated values of Ni, MgO and SiO2. Comparison with Figure
1.2 displays excellent cross-correlation between variables. .......................................................................................31
Figure 4.9: Forward transformation for a nested SC workflow. ..................................................................................32
Figure 4.10: Par file for stepcon ..................................................................................................................................33
Figure 4.11: Par file for stepcon_b .............................................................................................................34
Figure 5.1: Three cameras oriented at irregular angles to the motion of an ideal spring along the x-axis (above). The
cameras record the location of the ball at a regular frequency, producing projections of the ball within their unique
two dimensional image axes (Schlens 2009). .............................................................................................................. 35
Figure 5.2: Expanded view of camera A's image capture from Figure 5.1, with interpreted signal and noise (Schlens
2009). ...........................................................................................................................................................................36
Figure 5.3: Cross-plots of Ni, Fe, MgO and SiO2, which have been logratio and normal score transformed. Marginal
normal distributions are displayed on the axes. ......................................................................................................... 39
Figure 5.4: Cross-plots of Ni, Fe, MgO and SiO2 which have been logratio, normal score and PCA transformed.......40
Figure 5.5: Cross-plots for back-transformed simulated values. Original complexity features (black lines) reflect the
original distributions in Figure 1.2. ..............................................................................................................................41
Figure 5.6: Magnitude of the principal components, rescaled to percentages of the system variability. ..................42
Figure 5.7: Cross-plots of the back transformed values using only two principal components. This compares with
the pre-PCA distributions seen in Figure 5.3. ............................................................................................................43
Figure 5.8: Cross-plots between the original (Figure 5.3) and back transformed (Figure 5.7) variables with
associated variance reduction factors. ........................................................................................................................43
Figure 5.9: Par file for pca ............................................................................................................................................44
Figure 5.10: Principal Component Analysis Info file which is output from the PCA program. ....................................45
Figure 5.11: Par file for pca_b......................................................................................................................................45
Figure 6.1: Decomposition of the ideal spring ball projections into two orthogonal vectors from section 5.1 (Schlens
2009) ............................................................................................................................................................................46
Figure 6.2: Bivariate cross-plots of Ni, Fe and SiO2, which have undergone logratio and normal score
transformation. Arrows are an approximated decomposition of the distributions by two orthogonal vectors.........47
Figure 6.3: Schematic of a non-linear and heteroscedastic bivariate distribution that has been partitioned according
to conditional probability classes of X (left). Subtraction of the conditional mean and division of the conditional
standard deviation yields a linear and homoscedastic distribution (right). ................................................................48
Figure 6.4: Schematic of a non-linear and heteroscedastic bivariate distribution that has been partitioned according
to conditional probability classes of X and Y (left). Subtraction of the conditional mean and division of the
conditional standard deviation yields a linear and homoscedastic distribution (right). .............................................49
Figure 6.5: Sequential conditional standardization workflow for four variables of the Nickel laterite dataset. ........50
Figure 6.6: Cross-plot between Fe and SiO2 before (left) and following (right) the bivariate conditional
standardization. ...........................................................................................................................................................51
Figure 6.7: Cross-plots between SiO2 and its two conditioning variables before (above) and following (below) the
trivariate conditional standardization. ........................................................................................................................51
Figure 6.8: Three dimensional view of the trivariate distribution before conditional standardization. ....................52
Figure 6.9: Three dimensional view of the trivariate distribution after conditional standardization. ........................52
Figure 6.10: Cross-plots between simulated Ni, Fe and SiO2 (1 out of every 10,000 values plotted) following
backtransformation of the displayed workflow. .........................................................................................................53
Figure 6.11: Par file for constd ....................................................................................................................................55
Figure 6.12: Par file for constd_b ..............................................................................................................55
Figure 7.1: Cross-plot between two principal components. .......................................................................................57
Figure 7.2: Experimental cross-semivariogram between the two principal components in Figure 7.1. .....................57
Figure 7.3: Cross-plots of the pre-MAF distributions, which have undergone prior transformations as displayed. .........61
Figure 7.4: Cross-plots of the PCA transformed distributions. ....................................................................................62
Figure 7.5: Cross-plots of the MAF transformed distributions. ...................................................................................62
Figure 7.6: Experimental cross-variograms of PCA and MAF transformed distributions. The grey histogram indicates
the relative number of pairs used in the calculation of each lag. ...............................................................................63
Figure 7.7: Spatial correlation of the MAF factors, as obtained from Equation 7.11. .................................................64
Figure 7.8: Experimental semi-variograms and fitted model for the MAF factors. Spatial correlation and structure is
seen to increase with ranking. .....................................................................................................................................65
Figure 7.9: Cross-plots between simulated values (1 out of every 10,000) of Ni, Fe and SiO2 following
backtransformation. ....................................................................................................................................................66
Figure 7.10: Experimental cross-semivariogram for four PCA (left) and MAF (right) simulated realizations.
Experimental cross-semivariograms of the original data are also plotted to display the targeted correlation. ......67
Figure 7.11: Par file for maf .........................................................................................................................................68
Figure 7.12: Principal Component Analysis Info file which is output from the PCA program .....................................69
Figure 7.13: Par file for maf_b .....................................................................................................................................70
Figure 8.1: Single LSR of Ni given Fe produces a predictive line function (left), while multiple LSR of Ni given Fe and
SiO2 produces a predictive plane (right). .....................................................................................................................71
Figure 8.2: Response variable (y-axis) vs the five predictor variables (x-axis) ..........................................
Figure 8.3: Optimal transforms (y-axis) vs the associated original variables (x-axis) ..................................................74
Figure 8.4: Optimal response transform (y-axis) vs the sum of the optimal predictor transforms (x-axis). ..............
Figure 8.5: Original bivariate relationship of Ni (response) and Fe (predictor) (top left), optimal response vs optimal
predictor (Fe) (top right), optimal response vs. response (bottom left) and optimal predictor vs. predictor (bottom
right). ...........................................................................................................................................................................77
Figure 8.6: Original bivariate cross-plots between response (Ni) and the three predictor variables (Fe, SiO2 and
MgO). ...........................................................................................................................................................................78
Figure 8.7: Original variables vs their optimal transforms (left) and the sum of the optimal predictor transforms vs
the optimal response transformation (right). .............................................................................................................79
Figure 8.8: Regression predicted Ni using LSR (above) and ACE (below), cross-validated against the
removed True values of Ni. Note that LSQ Regression and LSR are used interchangeably here for describing linear
least squares regression. .............................................................................................................................
Figure 8.9: Response variable plotted against the two predictor variables for a multivariate normal distribution of
6680 observations. ......................................................................................................................................................81
Figure 8.10: Cross-plot between the response transformation and the sum of the predictor transformations (top
right), as well as cross-plots between the variables and their optimal transformations. ...........................................82
Figure 8.11: Regression predicted Y response variable using ACE_POST (left) and LSR (right), cross-validated against
the removed True values of Y. Note that LSQ Regression and LSR are used interchangeably here for describing
linear least squares regression. ...................................................................................................................
Figure 8.12: Response variable plotted against the two predictor variables for a multivariate normal distribution of
673 observations. ........................................................................................................................................................83
Figure 8.13: Cross-plot between the response transformation and the sum of the predictor transformations (top
right), as well as cross-plots between the variables and their optimal transformations. ...........................................84
Figure 8.14: Regression predicted Y response variable using ACE_POST (left) and LSR (right), cross-validated against
the removed True values of Y. Note that LSQ Regression and LSR are used interchangeably here for describing
linear least squares regression. ...................................................................................................................
Figure 8.15: Par file for ace ..........................................................................................................................................85
Figure 8.16: Par file for ace_post .................................................................................................................................86
Figure 9.1: Schematic illustration of the quantile-to-quantile matching of a simulated source distribution with a
declustered target distribution....................................................................................................................................87
Figure 9.2: CDFs of four uncorrected Ni realizations and the target declustered data distribution...........................89
Figure 9.3: CDFs of four corrected Ni realizations and the target declustered data distribution. ..............................89
Figure 9.4: Quantile-to-quantile plots of i) simulated Nickel realizations vs the declustered data distribution (left)
and ii) TRANS corrected Nickel realizations vs. the declustered data distribution (right)...........................................90
Figure 9.5: Par file for the trans program ....................................................................................................................91
Figure 10.1: Generalized forward transformation workflow where a complex multivariate distribution is
transformed to an approximate multivariate Gaussian distribution for geostatistical modeling. ..............................93
Figure 10.2: Generalized forward and backward modeling workflow revolving around the logratio and stepwise
conditional transformations. ........................................................................................................................94
Figure 10.3: Generalized forward and backward modeling workflow revolving around the logratio, conditional
standardization and PCA/MAF transformations. ........................................................................................................95
Chapter 1

Introduction
Spatial modeling of multiple related variables represents a common and critical task for geostatistical practitioners.
Following the hierarchical framework of surface and facies modeling to establish stationary zones, continuous
variables such as grades or other compositional properties must be predicted. The success of these models in
industrial application increasingly hinges on the accurate reproduction of univariate distributions and multivariate
relationships. Within mining, this is required for the optimization of plant design, blending and stockpile planning.
As for petroleum, accurate variability and correlation of petrophysical properties is a necessity for tasks such as
flow simulation and production forecasting.

Covariance based modeling methods require multivariate distributions that are well behaved and free of complex
features, such as non-linearity, heteroscedasticity, and constraints (Figure 1.1 i). Where spatial modeling of
uncertainty is required, Gaussian based methods are typically applied and require data that obey the multivariate
Gaussian distribution (Figure 1.1 ii). Even further, where high dimensional data sets render cross-correlation
frameworks impractical, zero correlation may be required between the variables to allow for independent
simulation (Figure 1.1 iii).
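As a concrete illustration of requirement (ii), the quantile matching that underlies the normal score transform of Chapter 3 can be sketched in a few lines. This is a hypothetical Python sketch for illustration only, not one of the CCG programs introduced later; the `normal_score` helper and its plotting-position convention are assumptions of the sketch.

```python
# Illustrative sketch only (not a CCG program): the univariate normal score
# transform maps data of any shape to a standard normal distribution by
# matching quantiles through the data ranks.
import numpy as np
from statistics import NormalDist

def normal_score(z):
    """Rank-based transform of z to standard normal quantiles."""
    n = len(z)
    ranks = np.argsort(np.argsort(z))       # 0, 1, ..., n-1 (assumes no ties)
    p = (ranks + 0.5) / n                   # plotting-position probabilities
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(pi) for pi in p])

rng = np.random.default_rng(0)
z = rng.lognormal(mean=1.0, sigma=0.8, size=5000)   # skewed, grade-like data
y = normal_score(z)
print(round(y.mean(), 2), round(y.std(), 2))        # close to 0.0 and 1.0
```

The transform is monotone, so the rank order of the data is preserved while the marginal distribution becomes standard normal, which is the entry requirement for Gaussian simulation.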

Figure 1.1: Schematic bivariate cross-plots displaying (i) complex relationships of heteroscedasticity (left), non-
linearity (middle) and constraints (right), (ii) multivariate Gaussian distribution, where the relationship between
two variables is fully defined by the covariance, and (iii) uncorrelated multivariate Gaussian distribution.
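Requirement (iii) can likewise be illustrated with a minimal sketch: rotating correlated Gaussian variables onto the eigenvectors of their covariance matrix, which is the idea behind the PCA transform of Chapter 5, yields uncorrelated factors that may be simulated independently. Again, this is an illustrative numpy sketch under assumed synthetic data, not the CCG pca program.

```python
# Illustrative sketch only: projecting correlated variables onto the
# eigenvectors of their covariance matrix produces uncorrelated factors.
import numpy as np

rng = np.random.default_rng(2)
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])                 # two variables, rho = 0.8
z = rng.multivariate_normal([0.0, 0.0], cov, size=10_000)

evals, evecs = np.linalg.eigh(np.cov(z, rowvar=False))
factors = z @ evecs                          # rotate onto principal components

print(np.round(np.corrcoef(factors, rowvar=False), 2))  # ~identity matrix
```

Because the rotation diagonalizes the sample covariance exactly, the off-diagonal correlation of the factors is zero to machine precision; the back-rotation `factors @ evecs.T` restores the original correlated variables.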

The inconvenient reality is that all geologic deposits will possess varying, but ever-present, levels of complexity.
Variables that are the product of a shared depositional or chemical process are expected to be correlated, with
non-linear, heteroscedastic and constraint features likely to define the multivariate relationships. When these
complexities do not conform to the modeling technique's assumptions of the data distribution, erroneous results
which fail to reproduce the univariate and multivariate characteristics of the deposit will inevitably follow.

Fortunately, a large variety of transformation techniques exist to bridge this gap and facilitate the execution of
traditional geostatistical modeling frameworks. The number of options available, and the fact that many of these
transformations overlap in purpose, often causes confusion regarding the correct choice and application of a
technique or workflow. This guidebook aims to simplify the problem by introducing the most
commonly used transformations and demonstrating their application.

The remainder of the introduction will provide a brief overview of background concepts and a literature review for
the major transformations. Subsequent chapters will be dedicated to the individual transformations, with an
overview of the concept, theory, and examples provided for each technique. A new standard set of CCG
transformation programs will be used in these examples, the parameters of which are given at the conclusion of
each chapter. As will be seen in the final section of this guidebook, many of these transformations are very
powerful when used in combination, and it is hoped that the release of this multivariate toolkit software package
will ease and encourage such combined use where appropriate.

1.1 Non-linear Transforms and Biased Linear Estimates

As mentioned, the primary goal of non-linear transformations is to produce a multivariate distribution that is
absent of features which covariance based modeling techniques would fail to capture. The well behaved
multivariate distribution that is produced by these transformations may appear ideal for the application of
common linear estimation methods such as kriging (Journel and Huijbregts 1978). Caution must be exercised,
however, as it is well documented that such estimates in non-linear transform space are almost guaranteed to
introduce a bias upon back-transformation (Deutsch 2006).

Specialized tools such as PostMG (Lyster and Deutsch 2004) and PostSCT (Deutsch 2006) have been developed by
the CCG to circumvent this issue in the case of multiGaussian kriging estimates. Likewise, the PCA and MAF
transformations are linear in nature, and will correctly transform an estimate without bias, assuming they are not
used in conjunction with other non-linear transforms.

Having established that specific transformations may be correctly used within estimation frameworks, issues
remain regarding their use within chained non-linear transformation workflows. Furthermore, the common
goals of attempting to reproduce the univariate histograms and multivariate relationships are not congruent with
the smoothed estimates that kriging will produce. It is for this reason that transformations are demonstrated in
this guidebook within simulation frameworks. This is not intended to discourage those interested in using
transformations for other purposes, and specific mention will be made when techniques are discussed that
may facilitate the unbiased transformation of estimates.

1.2 Multivariate Gaussian Distribution

The large majority of geostatistical simulation methods, including the popular sequential Gaussian simulation
(Isaaks 1990) adopt the multivariate Gaussian distribution for describing the form of the data to be modeled. The
multivariate Gaussian model is given by Equation 1.1, where X is a $d \times n$ dimension matrix composed of n
observations on d random variables, $\Sigma$ is the $d \times d$ dimension variance-covariance matrix, and $\mu$ is
the $1 \times d$ matrix of means for each random variable.

$f(X) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left( -\frac{(X-\mu)'\,\Sigma^{-1}\,(X-\mu)}{2} \right)$   (1.1)

It is observed from the above equation that a multivariate Gaussian distribution of any dimension is fully defined by
the mean, variance and covariance of the data. There are additional properties that make this distribution so attractive
for geostatistical simulation:

All marginal and conditional distributions are homoscedastic and normally distributed
All conditional distributions are fully defined by the conditional mean and variance
The conditional mean and variance in turn, are easily determined by the simple kriging equations

As a result of the above observations, only the variograms and cross-variograms are required to model the local
conditional distributions of a multivariate Gaussian model. Residual values may then be randomly drawn from the
normally distributed conditional distributions, effectively simulating a deposit. It is this mathematical tractability
and associated computational efficiency that make the multivariate Gaussian distribution the only practical
model for most simulation frameworks.
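As a small numerical illustration of Equation 1.1, the density of a bivariate Gaussian distribution can be evaluated directly from the mean vector and covariance matrix. The sketch below is illustrative only; the correlation value of 0.6 and the use of Python/NumPy are assumptions, not part of the CCG software.

```python
# Sketch: evaluating the multivariate Gaussian density of Equation 1.1.
# The bivariate example numbers are hypothetical, chosen for illustration.
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of a d-variate Gaussian at observation x (Equation 1.1)."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(sigma)
    det = np.linalg.det(sigma)
    norm_const = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * det ** 0.5)
    return norm_const * np.exp(-0.5 * diff @ inv @ diff)

# Bivariate case: standard margins with a correlation of 0.6
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])
density_at_mean = mvn_pdf(mu, mu, sigma)

# At the mean the exponential term is 1, so the density reduces to the
# normalizing constant 1 / (2*pi*sqrt(det(sigma)))
expected = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(sigma)))
```

Note that the entire bivariate relationship is carried by the single covariance entry, which is precisely why complex features cannot be represented by this model.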

Unfortunately, a distribution defined by so few parameters is not capable of describing complexities such as the
heteroscedastic, non-linear and constraint features that are observed in geologic variables. It is for this reason that
transformations are required to reduce complex geologic systems to multivariate Gaussian distributions. Gaussian
simulation may then correctly proceed on the simplified dataset, with back-transformation reintroducing and
enforcing the original complexity of the system.

1.3 Independent Simulation and Coregionalization

A further simplification of the multivariate Gaussian distribution is zero correlation between variables. This feature
may be convenient or necessary depending on the number of variables to be modeled, since cross-variograms are
otherwise required for co-simulation frameworks. It is often tedious, and in the case of greater than three or four
variables impractical, to calculate and model all of the required cross-variograms.

A multivariate Gaussian distribution with zero correlation between variables removes the need for cross-
variograms and allows for independent simulation. As the correlation of geologic variables is generally not
independent of the spatial scale, one must consider at this stage whether zero correlation at a zero lag distance
(as a diagonal $\Sigma$ matrix in Equation 1.1 would imply) means zero correlation at all lag distances. Only in the
cases of an intrinsic model of coregionalization (IMC) (Journel and Huijbregts 1978), where the correlation of
variables is independent of the spatial scale, or a Markov model (Almeida and Journel 1994), where spatial
correlation is scaled according to the collocated correlation coefficient, would zero correlation between variables
at the zero lag distance guarantee decorrelation at all lags. The more robust linear model of coregionalization
(LMC) (Journel and Huijbregts 1978) is often adopted for modeling frameworks, where variables are instead
treated as linear combinations of common nested structures at different spatial scales. Spatial cross-correlation is
therefore permitted in an LMC model, regardless of the zero lag correlation.

Transformation methods exist for the decorrelation of variables that are defined by either the IMC/Markov model
or an LMC model that is composed of up to two nested structures. Many multivariate systems will require greater
than two nested structures to accurately model the spatial correlation of variables. Even further, geologic deposits
that are the product of natural phenomena will rarely be fully described by a parametric model. Whether
decorrelating at the zero lag distance or at two lag distances, inspection of the remaining spatial cross-correlation must
follow. The decision is then made as to whether the remaining correlation is negligible enough to permit
independent simulation.

Should satisfactory decorrelation not be achieved, alternative transformations exist to potentially reduce the
dimension of the multivariate system. Co-simulation of a more manageable number of variables may then
proceed, with back-transformation restoring the original dimensionality of the dataset.

1.4 Transformations

A variety of univariate and multivariate techniques exist to accomplish the transformation goals which have been
alluded to in the previous sections:

Removal of complex multivariate relationships, including non-linear, heteroscedastic and constraint features
Multivariate Gaussian distribution
Decorrelation of variables
Dimension reduction

The available transformations include:

Additive Logratio Transformation

Generally applied in the first step of a chained transformation workflow, logratios are a common solution to the
issue of compositional and ratio constraints. The additive logratio transform (Pawlowsky-Glahn and Egozcue 2006)
is the most commonly applied of a variety of logratio transforms in the earth modeling realm. Applying it removes
constraints and reduces the dimension of the system by one. Back-transformation reintroduces the constraints and
the original dimensionality of the system. Only the additive logratio method will be discussed in this guidebook and
readers are referred to the Vol. 7 CCG Guidebook on Compositional Data (Manchuk, 2008) for more information on
this family of transformations.

Normal Score Transformation

First introduced by de Moivre in 1733 and later extended by Laplace in 1812, a normal or Gaussian distribution is
the limit function that occurs when a great number of independent, identically distributed random variables are
summed. The multivariate Gaussian distribution in Equation 1.1 is the multivariate generalization of this function.
While geologic variables may approach a normal distribution, they rarely conform to its ideal form. The normal
score transformation allows for any univariate distribution to be converted to a univariate standard normal score
distribution. The well-known transformation matches quantiles of the original variable's cumulative distribution function
with quantiles of the standard normal CDF (Deutsch and Journel 1998). While the normal score transformation
guarantees a univariate normal distribution, it is very important to note that it is highly unlikely to produce a
multivariate normal distribution. While facilitating the application of Gaussian simulation, the rank order and

4
standardizing nature of the normal score transformation also makes it a very useful utility tool when used in
conjunction with other non-linear transforms.

Stepwise Conditional Transformation

The stepwise conditional transformation is a multivariate extension of the normal score transformation, as it
attempts to remove non-linear, heteroscedastic and constraint features, while simultaneously transforming the
variables to form a multivariate Gaussian distribution. Furthermore, the technique attempts to decorrelate all of
the variables at a zero lag distance, potentially allowing for independent simulation to take place. The method is a
stepwise application of the normal score transformation, with the first variable simply being normal score
transformed. The second variable is then partitioned according to the conditional probability class of the first,
before having the normal score transformation independently executed on each discretized bin. The third variable
is transformed conditional to the probability class of the first and second, and so on. Introduced by Rosenblatt
(1952), the technique was further discussed by Luster (1985) before being implemented and popularized in the
field of geostatistics by Leuangthong and Deutsch (2003).

Principal Component Analysis

Principal Component Analysis (PCA) is a linear spectral decomposition technique which seeks to decompose the
multivariate distribution into orthonormal principal components. These principal components are guaranteed to
be uncorrelated at a zero lag distance, while also revealing the relative information which each transformed
variable contributes to the multivariate system. This allows components that lend only noise to the
multivariate system to be discarded prior to simulation, with the back-transformation restoring the original
dimensionality. PCA may therefore be particularly attractive in situations where a very large number of variables
are being considered. It has long been a very popular and widely applied technique in many statistical fields, with
excellent theoretical sources available from Johnson and Wichern (1988) and Shlens (2006), as well as
geostatistical application from Boisvert et al. (2009).
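The core PCA mechanics amount to an eigendecomposition of the covariance matrix of the centered data. A minimal sketch with hypothetical data (not the CCG pca program) might look as follows:

```python
# Minimal PCA sketch: spectral decomposition of the covariance matrix of
# centered data. The three correlated variables are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 2000
shared = rng.normal(size=n)
X = np.column_stack([shared + 0.1 * rng.normal(size=n),
                     2.0 * shared + 0.1 * rng.normal(size=n),
                     rng.normal(size=n)])

Xc = X - X.mean(axis=0)                    # residual (zero mean) data
cov = np.cov(Xc, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(cov)     # spectral decomposition
order = np.argsort(eigvals)[::-1]          # sort by descending variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                      # principal components
score_cov = np.cov(scores, rowvar=False)   # diagonal: uncorrelated at lag 0

# Relative information contribution of each component; trailing
# low-variance components could be dropped for dimension reduction
explained = eigvals / eigvals.sum()
```

The diagonal covariance of the scores demonstrates the zero lag decorrelation, while the `explained` vector quantifies the contribution of each component to the multivariate system.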

Minimum/Maximum Autocorrelation Factors

An immediate extension of PCA is the lesser known Minimum/Maximum Autocorrelation Factors (MAF)
transformation. It possesses all of the above mentioned features of PCA, while also providing a more rigorous
spatial decorrelation of the variables. MAF involves two PCA transformations, through spectral decomposition of
the covariance matrix at a zero and a non-zero lag distance. If the variables are fully described by a two structure
LMC model, then MAF will decorrelate them at all lags, producing excellent reproduction of the cross-
correlations. The technique was first introduced by Switzer and Green (1984) in the field of spatial remote
sensing, and was successfully applied in geostatistics by Desbarats and Dimitrakopoulos (2000) for estimating
particle size distributions.

Conditional Standardization

Standardization, or the standard score, is perhaps the most simple and familiar of all univariate transformations,
whereby the mean $\mu$ of the X distribution is subtracted from each individual value $x_i$, with the result divided
by the standard deviation $\sigma$ (Equation 1.2). The transformed distribution Y will possess a mean of zero and a
variance of one.

$y_i = \frac{x_i - \mu}{\sigma}$   (1.2)

Conditional Standardization is a multivariate adaptation of the standardization technique, whereby a variable is
standardized conditional to the value of one or multiple related variables. Analogous to the stepwise conditional
extension of the univariate normal score transformation, conditional standardization partitions the variable to be
transformed according to the probability class of the conditioning variables, and then independently applies the
transform on the discretized bins. It is intended to capture and remove non-linear and heteroscedastic features
from a multivariate distribution. In this way, it may be very effective when applied in conjunction with linear
transformations such as PCA and MAF.
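A rough sketch of the conditional standardization idea, assuming an arbitrary twenty-bin discretization and hypothetical heteroscedastic data (this is not the CCG program):

```python
# Sketch of conditional standardization: partition by the conditioning
# variable, then apply Equation 1.2 within each class. Data are hypothetical.
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x1 = rng.normal(size=n)                                # conditioning variable
x2 = x1 ** 2 + (1.0 + 0.5 * np.abs(x1)) * rng.normal(size=n)  # non-linear and
                                                              # heteroscedastic

nbins = 20
edges = np.quantile(x1, np.linspace(0.0, 1.0, nbins + 1))
classes = np.clip(np.searchsorted(edges, x1, side='right') - 1, 0, nbins - 1)

y2 = np.empty_like(x2)
for c in range(nbins):
    m = classes == c
    y2[m] = (x2[m] - x2[m].mean()) / x2[m].std()       # Equation 1.2 per class

# Conditional means/standard deviations are now 0/1 in every class,
# removing the non-linear trend and the heteroscedasticity
bin_means = np.array([y2[classes == c].mean() for c in range(nbins)])
bin_stds = np.array([y2[classes == c].std() for c in range(nbins)])
```

The per-class means of zero and standard deviations of one are exactly what removes the conditional trend and variance changes, which is why the technique pairs well with linear transforms such as PCA and MAF.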

Alternating Conditional Expectations

Alternating Conditional Expectations (ACE) is a non-parametric additive regression technique which may be very
effectively applied where non-linear relationships exist amongst the predictor and response variables. Introduced
by Breiman and Friedman (1985), and utilized in petroleum reservoir characterization studies by Xue and
Datta-Gupta (1996) as well as Zwahlen and Patzek (1997), it does not lend itself to the forward and
back-transformation framework. Rather, it may
be very effectively applied as a prediction tool, offering potentially large improvements over the more familiar
linear regression methods when significant non-linear features define a multivariate system. As the name suggests,
variables are transformed by ACE based on the conditional expectation of the considered variable(s) to maximize
linearity and correlation between the response and summed predictor variables.

1.5 Nickel Laterite Case Study

With so many transformations being demonstrated, a Nickel Laterite multivariate dataset will be held constant
throughout this guidebook's example applications. It is hoped that this will establish continuity
and familiarity for the reader with the data, allowing the various benefits and comparative results of the
transformations to be more clearly observed.

Nickel laterite deposits occur in tropical regions where mechanical and chemical weathering (laterization) of
ultramafic igneous rocks leads to highly variable mineralogy profiles that strongly correlate with depth below
surface. In addition to the Nickel (Ni) resource, there are several other minerals of interest in the deposit which a
typical mining model will require, as Iron (Fe), Silicon Dioxide (SiO2) and Magnesium Oxide (MgO) all exert a critical
influence on smelting extraction.

The dataset used throughout this guidebook is composed of 7737 samples that have full assay sets of Ni, Fe, SiO2
and MgO. Figure 1.2 displays bivariate cross-plots between all of the variables in an upper covariance triangle
form. Heteroscedastic, non-linear and constraint features are all highlighted in this figure, and clearly play a major
role in this multivariate system. Furthermore, the relatively low correlation coefficients for the majority of the
cross-plots illustrate that these clear relationships are not recognized by a simple linear measurement. Not all
variables will be considered for some examples in subsequent chapters due to practical presentation constraints.
Multivariate cross-plots will frequently be displayed in this variance-covariance matrix form, with the lower left
corner indicating prior transformations that have been executed on the original data (none in the case of Figure
1.2).

Figure 1.2: Bivariate cross-plots of Ni, Fe, SiO2 and MgO, presented in a format which matches the upper
covariance matrix triangle (red boxes in the schematic variance-covariance matrix at the bottom left).

1.6 CCG Multivariate Toolkit Software Package

Accompanying the release of this guidebook is the associated CCG software package, which applies the forward
and back-transformation for all techniques within a familiar CCG format. For the majority of the techniques, this
software represents their initial CCG implementation, while previously existing programs such as the Normal Score
(nscore) and Stepwise Conditional (stepcon/sctrans) programs have simply been modified with additional features
and formatting. Every transformation program shares the following features:

Forward transformation programs named according to the technique (e.g. pca), with the associated back-
transformation program denoted by a _b appendage (e.g. pca_b)
Automatically output a correctly formatted parameter file
Read Geo-EAS formatted datafiles
Output files append the transformed variable(s) to the full dataset contained in the input data file

Chapter 2

Logratios
Logratios are a commonly applied family of transformations for the removal of compositional and ratio constraints.
Non-linear in nature, they are only applicable within simulation frameworks and generally appear as the initial step
of a chained transformation workflow. The additive logratio is the most commonly used logratio transformation in
practice, and will be the only one addressed in this guidebook and associated software.

2.1 Transformation Concept

The majority of variables which geostatistical methods aim to model are measured as fractional components of a
greater whole. Porosity, metallurgical grades and particle size distributions are just a few examples, where the
variable in question is a fraction of a greater composition.

Constraints exist when specific components, or the sum of multiple components are not permitted to exceed a
given fraction of the greater composition. A universal compositional constraint is that fractional components of a
related set must not sum to a value greater than one (that is to say, a composition cannot be composed of more
than 100% of its components). Similarly, ratio constraints exist, such as the most basic example of a Net to
Gross measurement, where natural laws prevent a ratio of one from being exceeded (Net value cannot exceed
Gross value). While this may sound very obvious, these natural laws are neither understood nor obeyed by
multivariate modeling frameworks unless accounted for.

Using the Nickel Laterite dataset that was introduced in Figure 1.2, a robust modeling framework following the
workflow displayed in Figure 10.2 was executed, but without the application of logratios to address the
compositional constraint. The compositional constraint in this case is that the sum of the four major components
Ni, Fe, SiO2 and MgO must not exceed 100%, which the original sample assays honour according to Figure 2.1.
Univariate histograms and non-linear features were reproduced by the simulated realizations, but the sum of the
simulated components clearly does not honour the compositional constraint according to Figure 2.1.

Figure 2.1: Distributions of the sum of the four major Nickel laterite components for the original assays (left) and
simulated realizations (right). The components of the realizations are shown to exceed the natural constraint of
100%.

Such a basic oversight will immediately undermine the integrity of the overall modeling workflow, and a
transformation is therefore required to enforce this compositional constraint. Logratios are a potential solution to
this issue, as they will convert the compositional components into logratio fractions which are free of constraints.
Conventional geostatistical methods may then proceed in the unconstrained space, with back-transformation
reinforcing the constraint.

2.2 Transformation Theory

Following Manchuk (2008), given a multivariate data matrix X composed of d variables and n observations
(Equation 2.1), the additive logratio transformed distribution Y is determined by taking the logarithm of the
ratio between each variable observation $x_{ij}$ and a constant divisor component observation $x_{Dj}$
(Equation 2.2).

$X = \begin{bmatrix} x_{11} & \cdots & x_{1j} & \cdots & x_{1n} \\ \vdots & \ddots & & & \vdots \\ x_{i1} & & x_{ij} & & x_{in} \\ \vdots & & & \ddots & \vdots \\ x_{d1} & \cdots & x_{dj} & \cdots & x_{dn} \end{bmatrix}$   (2.1)

$y_{ij} = \log\left( \frac{x_{ij}}{x_{Dj}} \right)$   (2.2)

The component set $x_i, i = 1, \ldots, d$ and the divisor variable $x_D$ must sum to the known constraint for each
jth observation, as failure to do so will cause erroneous and obvious errors upon back-transformation. The
transformation reduces the dimension of the multivariate problem by the $x_D$ component and will therefore
require all variables to be simulated with the exception of $x_D$. It is for this reason that $x_D$ is commonly
referred to as a filler variable, with it simply being reintroduced by the back-transformation. Due to the above
observations, it is highly recommended that the filler variable simply be calculated for each set of observations as
the difference between the constraint and the sum of the variables to be modeled (the CCG program is
implemented in this fashion).

The back-transformation is given by Equation 2.3, which restores both the constraint and the original
dimensionality.

$x_{ij} = \frac{\exp(y_{ij})}{\sum_{i=1}^{d} \exp(y_{ij}) + 1}$   (2.3)

Challenges which may be encountered in practice will often revolve around the potential of zero values for the
divisor component. Furthermore, the coordinate axes of the transformed space are not orthogonal, but rather are
separated by 60 degrees (Pawlowsky-Glahn and Egozcue 2006). This may have some undesirable effects on
subsequent covariance calculations. An extensive discussion, along with possible solutions for these two important
issues, can be found in the CCG Guide to Geostatistics with Compositional Data (Manchuk 2008).
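Equations 2.2 and 2.3 can be verified with a short round-trip sketch on hypothetical compositional data; the Dirichlet-generated percentages below are purely illustrative and are not the guidebook's Nickel laterite assays.

```python
# Round-trip sketch of the additive logratio (Equations 2.2 and 2.3).
import numpy as np

rng = np.random.default_rng(4)

# 500 hypothetical samples of five positive parts summing to 100 percent;
# the first four play the role of the modeled components
parts = rng.dirichlet(alpha=[2.0, 5.0, 4.0, 6.0, 8.0], size=500) * 100.0
X = parts[:, :4]
constant = 100.0

# Filler variable x_D: the difference between the constraint and the sum
filler = constant - X.sum(axis=1)

# Forward transform (Equation 2.2): unconstrained logratios
Y = np.log(X / filler[:, None])

# Back-transform (Equation 2.3) gives fractions of the whole; scaling by
# the constant restores the original percentages and the constraint
expY = np.exp(Y)
X_back = constant * expY / (expY.sum(axis=1, keepdims=True) + 1.0)
```

Any simulated values in logratio space, however extreme, map back through Equation 2.3 to components that honour the compositional constraint, which is the practical appeal of the transform.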

2.3 Example

Using the logratio program, the additive logratio transform was applied to the Ni, Fe, SiO2 and MgO components of
the Nickel laterite deposit. The program therefore calculated the filler variable in Equation 2.2 as the difference
between 100 and the sum of these components. Cross-plots of the four variables (Figure 2.2) show that the
bivariate relationships of the logratios have taken on new and interesting relationships (compare to original
distributions seen in Figure 1.2). Focusing on the Ni-Fe relationship in Figure 2.3, it is further observed that the x >
0 constraint has been removed, as negative values now appear along the coordinate axes.

Figure 2.2: Cross-plots of the additive logratio transformed variables.

Figure 2.3: Original and transformed Ni and Fe cross-plots.

These logratio transformed variables were further transformed and realizations simulated following an identical
workflow to that presented in Figure 10.2. Following back-transformation using the logratio_b program, the sums
of the simulated major components (Figure 2.4) are now seen to obey the compositional constraint and, moreover,
reproduce the distribution of summed variables with respect to the original data (Figure 2.1).

Figure 2.4: Sum of the major components for the simulated realizations following back-transformation.

2.4 Software (logratio & logratio_b)

The first logratio transformation program, logratio, is used to perform the forward transformation. The
corresponding required parameters are shown in Figure 2.5 and are explained below:

datafl: file with the input data to be transformed.
numvars: number of variables to transform.
icol(i), i=1,...,numvars: column numbers for variables to transform.
tmin, tmax: trimming limits to filter out data.
constant: value of the compositional constraint which the sum (sumvars) of the variables icol(i),
i=1,...,numvars must not exceed. The filler variable xD (div) will be calculated for each sample as the
difference between the constant and sumvars.
outfl: file for output. This file contains numvars columns appended to the original data file with the
transformed data values.

Figure 2.5: Parameters for logratio

The second logratio transformation program, logratio_b, is used to perform the back-transformation. The
corresponding required parameters are shown in Figure 2.6 and are explained below:

datafl: file with the input data to be transformed.
numvars: number of variables to transform.
icol(i), i=1,...,numvars: columns for variables.
tmin, tmax: trimming limits to filter out data.
constant: back-transformed components will be represented as fractions by default. It may be useful
to multiply these values by a constant (such as 100 to convert the output to percentages).
outfl: file for output. This file contains numvars columns appended to the original data file with the
transformed data values.

Figure 2.6: Parameters for logratio_b

Chapter 3

Normal Score Transformation


Although independently applied to univariate distributions, the normal score transformation is a fundamental tool
in the application of multivariate geostatistics. It refers to the quantile-to-quantile, rank ordered transformation of
any univariate global distribution to take on the form of a normal, or Gaussian distribution.

3.1 Transformation Concept

As discussed in section 1.2, the Gaussian distribution is often adopted for geostatistical modeling frameworks for
reasons that revolve largely around mathematical tractability. Since geologic data are rarely Gaussian in nature, a
technique is required to transform variables to normal score space before such modeling methods may be applied.
When modeling only a single variable, the normal score transformation achieves this goal, as it is guaranteed to
produce a univariate normal distribution.

While the normal score transformation does not guarantee multivariate Gaussianity, it is commonly applied as a
final transformation in multivariate workflows to ensure that, at a minimum, the marginal distributions are Gaussian
prior to modeling. All reasonable efforts to achieve multivariate Gaussianity should be made prior to this step,
however, so that the assumption will be a reasonable one. Though non-linear in nature, the technique may
facilitate the transformation of multiGaussian kriging estimates through specialized programs such as PostMG
(Lyster and Deutsch 2003), assuming no prior or subsequent non-linear transformations are applied in the
modeling workflow.

Beyond its application within Gaussian based modeling frameworks, the normal score transformation may be very
useful as a utility tool within chained transformation workflows. Techniques such as PCA and MAF revolve around
covariance calculations, which i) are highly sensitive to outlying values and ii) require residual (mean zero) datasets
prior to application. The rank order nature of the normal score transformation, and the zero mean distributions
which it produces address these two PCA/MAF issues respectively and therefore make it ideal as a preceding
transformation in related workflows.

3.2 Transformation Theory

The normal or Gaussian distribution is the limit distribution of the Central Limit Theorem (Johnson and Wichern
1988), which states that the sum of a great number of independent, identically distributed random variables will tend
to be normally distributed. The univariate normal and standard normal probability density functions are given by
Equations 3.1 and 3.2 respectively. Note that the multivariate Gaussian function (Equation 1.1) is simply the
multivariate generalization of Equation 3.1 below. The Gaussian cumulative distribution function is not analytically
defined, though it may be numerically approximated with a great deal of accuracy.

$g(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2 \right)$   (3.1)

$g(y) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{y^2}{2} \right)$   (3.2)

The forward normal score transformation is practically accomplished using a quantile-to-quantile, rank ordered
approach, which is shown graphically in Figure 3.1. The steps are as follows:

1. Define the cumulative distribution function F(Z) of the original variable Z.
2. Define the cumulative distribution function G(Y) of a standard normal Gaussian distribution.
3. Determine the probability of each data value according to $F(z_i) = p_i$ for $i = 1, \ldots, n$, and match it to
the Gaussian value of the same probability according to $G^{-1}(p_i)$. The transformation therefore reduces to:

$y_i = G^{-1}[F(z_i)]$   (3.3)


Figure 3.1: Schematic univariate transformation of the original data (Z) to the standard normal Gaussian
distribution (Y).

The back-transformation is a graphic reversal of Figure 3.1 and an inversion of Equation 3.3, producing Equation
3.4. Practically speaking, the corresponding original and Gaussian transformed values of each matched quantile
from the forward transformation are stored, allowing a look-up table with tail extrapolation to facilitate the back-
transformation.

$z_i = F^{-1}\left( G(y_i) \right)$   (3.4)
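The quantile matching of Equations 3.3 and 3.4 can be sketched as follows, using hypothetical lognormal data and the standard normal quantile function from the Python standard library; this is a sketch of the idea, not the nscoremv program.

```python
# Sketch of the quantile-matching normal score transform (Equation 3.3) and
# its table-based back-transform (Equation 3.4). Data are hypothetical.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(5)
nd = NormalDist()

z = rng.lognormal(size=1000)                  # skewed original variable

# Forward (Equation 3.3): empirical CDF probabilities mapped through G^-1
ranks = np.argsort(np.argsort(z))
p = (ranks + 0.5) / len(z)                    # F(z_i), midpoint convention
y = np.array([nd.inv_cdf(pi) for pi in p])    # y_i = G^-1[F(z_i)]

# Matched quantile pairs stored as a look-up table for the back-transform
table_z = np.sort(z)
table_y = np.sort(y)

# Back-transform (Equation 3.4) by interpolating in the look-up table
y_new = np.array([-1.0, 0.0, 1.0])
z_back = np.interp(y_new, table_y, table_z)
```

Because both transforms are rank preserving and monotone, a value passed through the forward transform and then the table-based back-transform recovers its original value exactly.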

The reason a mathematical model as simple as the normal distribution is so attractive for geostatistical simulation
and multiGaussian kriging is illustrated in Figure 3.2 and Equation 3.5. Given that all local conditional distributions
$G^{l}(Y)$ have been modeled and fully defined by their kriged mean $y_k^*$ and variance $\sigma_k^2$, a
probability $p^{l}$ is randomly drawn from a normal CDF to form the realizations. If performing multiGaussian
kriging, $p^{l}$ simply becomes exhaustively sampled, evenly spaced probabilities (Ortiz and Deutsch 2003).
Back-transformation of the realizations or discretized distributions is then achieved by de-standardizing against
the global Gaussian CDF and matching quantiles according to Equation 3.5.

$z^{l} = F^{-1}\left( G\left( \sigma_k \, G^{-1}(p^{l}) + y_k^* \right) \right)$   (3.5)
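Equation 3.5 can likewise be sketched numerically; the kriged mean and variance below are assumed values for a single hypothetical location, and the global distribution is illustrative only.

```python
# Sketch of Equation 3.5: simulated values drawn in Gaussian units from a
# local conditional distribution, then mapped back through the global CDFs.
# The kriged mean/variance and all data here are assumed, for illustration.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(6)
nd = NormalDist()

# Global reference distribution and its matched Gaussian quantiles
z_global = np.sort(rng.lognormal(size=2000))
y_global = np.array([nd.inv_cdf((i + 0.5) / len(z_global))
                     for i in range(len(z_global))])

# Hypothetical local conditional distribution from simple kriging
y_k, sigma_k = 0.8, 0.5

# Draw probabilities, de-standardize, and back-transform (Equation 3.5)
p = rng.uniform(low=1e-6, high=1.0 - 1e-6, size=500)
y_sim = sigma_k * np.array([nd.inv_cdf(pi) for pi in p]) + y_k
z_sim = np.interp(y_sim, y_global, z_global)
```

The back-transformed values inherit the skewness of the global distribution while honouring the local conditional mean and variance in Gaussian units.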


Figure 3.2: Graphical representation of the back-transformation of a simulated value or multiGaussian kriging
probability.

3.3 Example

Returning to the Nickel Laterite dataset, the forward normal score transformation program nscoremv will be
applied to a trivariate set of Ni, Fe, and SiO2 variables which have already been logratio transformed. As will be
discussed in Chapter 10 regarding chained transformation workflows, the logratio transform will commonly
precede all others in the case of compositional data simulation, which is why the notion is being introduced in the
demonstration here.

Focusing first on only the Ni transformation, Figure 3.3 shows that the non-parametric logratio Ni distribution has
been transformed to a normal distribution, possessing a mean of zero and a standard deviation of one. The
probability density functions in this figure clearly illustrate that the distribution has been transformed to an ideal
bell-shaped curve.

Figure 3.3: Original logratio Ni distribution (left) and normal score transformed distribution (right). The cumulative
(above) and probability (below) functions are displayed for both the original and transformed distributions.

As for the multivariate picture, the cross-plots of all three variables are displayed prior to and following the normal
score transformation in Figures 3.4 and 3.5 respectively. Histograms of the marginal distributions reveal the
univariate normal score transforms, though multivariate correlation clearly remains in all cases. This reinforces the
statement that univariate normal transformations do not guarantee a multivariate normal distribution. If
simulation were to proceed immediately following this step, a cosimulation framework would have to be adopted
due to the lack of independence. As the multivariate relationships in Figure 3.5 are not fully described by the
correlation coefficients, even co-simulation would be unlikely to reproduce these distributions.

A final observation that may be made between Figures 3.4 and 3.5 is the attractive feature of transforming highly
skewed distributions with significant outlying values to rank-ordered symmetric distributions. As mentioned,
covariance-based transformation techniques such as PCA and MAF will benefit from this feature. Techniques such
as Conditional Standardization utilize binning to discretize a distribution, and will also benefit when calculating
stable conditional statistics along the margins.
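The univariate transform that nscoremv applies to each variable can be sketched as a simple rank transform. This is a minimal sketch assuming no despiking of tied values and no declustering weights, with hypothetical skewed data:

```python
import numpy as np
from scipy.stats import norm

def nscore(z):
    """Minimal normal score transform: rank the data, convert ranks to
    probabilities p = (rank + 0.5)/n, and map them through G^-1.
    Despiking and declustering weights are not handled here."""
    n = len(z)
    p = np.empty(n)
    p[np.argsort(z)] = (np.arange(n) + 0.5) / n
    return norm.ppf(p)

# Hypothetical highly skewed variable with outlier-prone tails.
rng = np.random.default_rng(1)
z = rng.lognormal(0.0, 1.0, size=5000)
y = nscore(z)
# y is symmetric with mean ~0 and std ~1; outliers are pulled in by ranking.
```

Because the transform is rank-preserving, the ordering of the data (and therefore any rank statistic) is unchanged; only the marginal shape is altered.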

Figure 3.4: Cross-plots between logratio transformed Ni, Fe and SiO2.

Figure 3.5: Cross-plots between normal score transformed Ni, Fe and SiO2. Note the change in the marginal
histograms between this and the logratio transformed distributions in Figure 3.4.

3.4 Software (nscoremv & nscoremv_b)

The first Normal Score transformation program, nscoremv, is used to perform the forward transformation. This
code is a multivariate adaptation of the original GSLIB nscore program (Deutsch and Journel 2008). The
corresponding required parameters are shown in Figure 3.6 and are explained below:

datafl: file with the input data to be transformed.


numvars: number of variables to transform.
varcols(i), i=1,...,numvars: columns for variables to transform.
weightcols(i), i=1,...,numvars: columns for weights of the associated variables to transform.
outfl: file for output of transformed values
Following line is repeated for i=1,...,numvars variables:
transfl(i): file for output of the transformation table for the ith variable. May be used for back-transformation
using the GSLIB backtr or nscoremv_b programs.
Following lines may be repeated for i=1,...,numvars variables. Refer to the notes in the parameter file
figure for additional information:

ismooth(i), ismoothcols(i,j), j=1,2: columns for the variable to transform (should match the
specification from the varcols line), the variable of the reference distribution, and the weight if any.
refdistfiles(i): file containing the reference distribution variable and weight.

Figure 3.6: Par file for nscoremv

The second Normal Score transformation program, nscoremv_b, is used to perform the back-transformation. It is a
multivariate adaptation of the GSLIB backtr program (Deutsch and Journel 2008). The corresponding required
parameters are shown in Figure 3.7 and are explained below:

datafl: file with the input data to be back transformed.


numvar: number of variables to back transform.
ivr(i), i=1,...,numvar: column numbers of variables to back transform.
tmin, tmax: trimming limits to filter out data.
outfl: file for output of transformed values.
Following line is repeated for i=1,...,numvar times:
transflmv(i): file containing the transformation table for the ith variable to transform. Required for
back-transformation using either the GSLIB backtr or nscoremv_b programs.
Following lines are repeated for i=1,...,numvar times:
zmin(i), zmax(i): are the minimum and maximum values to be used for extrapolation in the tails. Note
that if zmin(i) > zmax(i) , then the minimum and maximum values in the transformation table will be
automatically assigned as zmin(i) and zmax(i) respectively.
ltail(i), ltpar(i): specify the back-transformation implementation in the lower tail of the distribution:
ltail(i) = 1 implements linear interpolation to the lower limit zmin(i), and ltail(i) = 2 implements
exponential interpolation to the lower limit zmin(i), with the power assigned from ltpar(i).
utail(i), utpar(i): specify the back-transformation implementation in the upper tail of the distribution:
utail(i) = 1 implements linear interpolation to the upper limit zmax(i), and utail(i) = 2 implements
exponential interpolation to the upper limit zmax(i), with the power assigned from utpar(i).

Figure 3.7: Par file for nscoremv _b

Chapter 4

Stepwise Conditional Transformation


The Stepwise Conditional (SC) transformation is an extension of the normal score transformation, which attempts
to transform a multivariate distribution of any functional form into a standard normal multivariate distribution
that is uncorrelated at the zero lag distance. The SC transformation is very effective in capturing non-linear
features of a multivariate distribution, but may pose practical implementation challenges in the presence of sparse
data and large dimensionality.

4.1 Transformation Concept

The multivariate Gaussian distribution (Equation 1.1) has been stated as an ideal goal for any multivariate
transformation workflow. A final distribution that is fully defined by the mean and variance of each variable, with
the multivariate relationships of any order fully explained by the bivariate correlations, is theoretically required for
the correct application of any Gaussian co-simulation or co-kriging modeling framework.

In the case where this multivariate Gaussian distribution is also characterized by a zero correlation between
variables, modeling is further simplified in that independent simulation may proceed assuming an intrinsic model
of coregionalization. Following simulation, the back-transformation will then reinforce the correct correlations and
non-linear features of the original distribution.

Unfortunately, as was clearly demonstrated in the case study in Chapter 3, independent univariate normal score
transformations do not guarantee, and are unlikely to form a multivariate Gaussian distribution. Non-linear
features still exist in the cross-plots of Figure 3.5, which clearly are not described simply by a correlation between
the variables.

In 1952, Rosenblatt put forward the theory that conditional univariate normal score transformations could be the
solution to this problem. Applied in a stepwise manner, if a variable were normal score transformed based
on the conditional probability class of another normal score variable, then multivariate Gaussianity would be
achieved. A specialized program named PostSCT (Deutsch 2006) exists for the back-transformation of
multiGaussian kriging in SC transform space, assuming that no other non-linear transformations are used in
conjunction.

4.2 Transformation Theory

Following Leuangthong (2003), the SC transformation is identical to the normal score transformation
when applied to a single variable. When applied to a bivariate distribution, the first variable is again normal score
transformed, but the second variable is normal score transformed conditional to the probability class of
the first. This extends to an n-variate distribution, with the nth variable being normal score transformed
conditional to the probability class of the previous n-1 variables. This stepwise process is given by Equation 4.1,
where Z_i, i = 1,...,n are the original variables and Y_i', i = 1,...,n are the stepwise conditionally transformed
variables. The transformed variables are given the prime specification to differentiate them from the normal score
transformed variable notation Y seen in Chapter 3.

Y_1' = G^{-1}[F_1(z_1)]
Y_2' = G^{-1}[F_{2|1}(z_2 | z_1)]
Y_3' = G^{-1}[F_{3|1,2}(z_3 | z_1, z_2)]
  ⋮                                             (4.1)
Y_n' = G^{-1}[F_{n|1,...,n-1}(z_n | z_1, ..., z_{n-1})]

Similar in nature to the normal score transformation, the corresponding values for common probability classes
between the untransformed and transformed variables are stored for the back-transformation. The same order
that was applied in the forward transformation must be used for the back-transformation, since higher order
variables require the back-transformed values of the conditioning variables that preceded them to determine the
appropriate bin. That is,

z_1 = F_1^{-1}[G(Y_1')]
z_2 = F_{2|1}^{-1}[G(Y_2') | z_1]
  ⋮                                             (4.2)
z_n = F_{n|1,...,n-1}^{-1}[G(Y_n') | z_1, ..., z_{n-1}]

Following the graphical representation of the forward transformation for a bivariate case in Figure 4.1, the process
begins with (a) the SC transform of the Z_1 variable to the Y_1' transformed variable. Note that for this first
transformation, the SC transform is identical to the normal score transform, or in other words, Y_1' = Y_1. (b) The
second variable Z_2 is then partitioned into bins based on the probability class of the Z_1 variable. (c) A normal
score transformation is independently applied to each conditional bin of the Z_2 variable, forming the SC
transformed Y_2' variable. It must be reinforced that although normally distributed, Y_2' will not be the same as its
normal score equivalent Y_2. (d) A cross-plot of the two SC transformed variables confirms that both the marginal
and multivariate distributions are approximately Gaussian in nature.

This process is easily extended to a trivariate system, where Z_3 is partitioned according to the probability classes
of the stepwise conditionally transformed Y_1' and Y_2' variables. Independent normal score transformation of the
partitioned Z_3 variable then yields the SC transformed Y_3'. Figure 4.2 provides a schematic demonstration of the
original Z_3 and transformed variable Y_3', assuming a prior stepwise transformation of Y_1' and Y_2' as shown in
Figure 4.1.
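The bivariate forward transform described above can be sketched as follows. This is a minimal illustration assuming equal-frequency partitioning and synthetic data (not the Ni laterite set), and it omits the minimum-data search that the stepcon program implements:

```python
import numpy as np
from scipy.stats import norm

def nscore(z):
    # Minimal rank-based normal score transform.
    p = np.empty(len(z))
    p[np.argsort(z)] = (np.arange(len(z)) + 0.5) / len(z)
    return norm.ppf(p)

def stepwise_bivariate(z1, z2, n_classes=10):
    """Equation 4.1 for two variables: normal score z1, then normal score
    z2 independently within each probability class of z1."""
    y1 = nscore(z1)
    # Equal-frequency partitioning on the probability class of z1.
    classes = np.minimum((norm.cdf(y1) * n_classes).astype(int), n_classes - 1)
    y2 = np.empty_like(y1)
    for c in range(n_classes):
        idx = classes == c
        y2[idx] = nscore(z2[idx])  # conditional normal score per bin
    return y1, y2

# Hypothetical correlated, non-linear data (not the Ni laterite set).
rng = np.random.default_rng(3)
z1 = rng.lognormal(0.0, 0.5, size=4000)
z2 = z1 ** 1.5 + rng.lognormal(0.0, 0.3, size=4000)
y1, y2 = stepwise_bivariate(z1, z2)
# The correlation between y1 and y2 drops to near zero.
```

The bin edges and the within-bin rank/value pairs would be stored to support the back-transformation of Equation 4.2.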

Figure 4.1: Visual representation of the stepwise transform for a bivariate case (Leuangthong 2003)

Y_3' = G^{-1}[F_{3|1,2}(z_3 | z_1, z_2)]

Figure 4.2: Final step of a trivariate SC transformation, where the third variable Z_3 is partitioned into 9 bins based
on 3 classes of the two transformed conditioning variables (left), and transformed to a univariate marginal
distribution (right) that forms a multivariate normal distribution with the conditioning variables.

Data Requirements Leading to the Nested Application

It can be seen in Figures 4.1 and 4.2 that the number of bins B increases according to Equation 4.3, where C is the
number of partitioning classes and n is the number of considered variables.

B = C^(n-1)    (4.3)

There is no strict rule regarding the number of classes that are required for partitioning, or the number of data that
are required in each bin for the subsequent normal score transformation. The correlation that remains between
the transformed variables is tied to the number of partitioning classes, however, with between 10 and 20 classes
generally found to remove correlation (Leuangthong 2003). So if using between 10 and 20 classes for discretizing
each conditional variable, with 10-20 data reasonably required for the normal score transformation of each
conditional bin, it follows that between 10^n and 20^n data will be needed (Rosenblatt 1952).
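This growth can be tabulated directly from Equation 4.3; a quick illustration using C = 10 classes (the lower end of the suggested range):

```python
# Tabulate Equation 4.3 with C = 10 classes:
# B = C^(n-1) conditional bins, and roughly C^n data if each bin is to
# hold ~C data for its conditional normal score transformation.
C = 10
for n in range(2, 6):
    bins = C ** (n - 1)
    data_needed = C ** n
    print(f"n = {n}: {bins} bins, ~{data_needed} data required")
```

At n = 4 the requirement is already on the order of 10,000 data, matching the practical limit discussed below.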

Consequently, only in the case of very large datasets (>10,000 as a minimum) will the SC transformation be
applied beyond a trivariate system. Unfortunately, many geologic datasets involve fewer than 10,000
observations and more than three variables. A nested approach is commonly adopted under these
circumstances, whereby variables are SC transformed conditional to only one or two variables, depending on the
dimensionality that the dataset permits.

Removal of this lower order correlation between variables will oftentimes resolve the majority of higher order
correlation between variables not SC transformed conditional to one another. It is not guaranteed, however, and
careful decision making must take place regarding the conditioning variables for this nested application.
Considerations may include:

Reproduction of the multivariate relationships which the primary resource variable holds with all
secondary variables (the resource variable becomes the first conditioning variable for all transformations)

Reproduction of the multivariate relationship in the case of secondary variables where the ratio formed
between them is of critical interest (one secondary variable must condition the other)
Removal of correlation between strongly correlated variables. In the case of high correlation between
variables, it is unlikely that correlation will be entirely removed unless one variable conditions another.

The above considerations will often lead to difficult decision making, as not all of the conditions may be satisfied
by a nested SC application. Also recall from section 1.3 that even in the case where zero correlation is achieved for
all variables, this does not ensure that correlation has been removed beyond the zero lag distance. Prior to
independent simulation, experimental cross-variograms should be checked on the transformed data to verify
whether sufficient decorrelation exists for all relevant spatial scales.
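Such a check can be sketched with a simple experimental cross-variogram routine; a minimal 1D illustration on hypothetical data (the function name and lag tolerance are illustrative, not from any GSLIB program):

```python
import numpy as np

def cross_variogram(ya, yb, coords, lag, tol):
    """Experimental cross-variogram at one lag for 1D coordinates:
    gamma_ab(h) = (1/2N) * sum (ya(u) - ya(u+h)) * (yb(u) - yb(u+h))."""
    d = np.abs(coords[:, None] - coords[None, :])        # pairwise separations
    i, j = np.where((d > lag - tol) & (d <= lag + tol))  # pairs within the lag
    return 0.5 * np.mean((ya[i] - ya[j]) * (yb[i] - yb[j]))

# Hypothetical 1D example: two transformed variables at irregular locations.
rng = np.random.default_rng(9)
coords = np.sort(rng.uniform(0.0, 100.0, size=300))
y1 = rng.standard_normal(300)
y2 = rng.standard_normal(300)

# Decorrelation at lag zero does not imply decorrelation at all lags,
# so the cross-variogram is checked over a range of lag distances.
gammas = {h: cross_variogram(y1, y2, coords, lag=h, tol=2.5)
          for h in (5.0, 10.0, 20.0)}
```

Values near zero at all checked lags would support proceeding with independent simulation; persistent structure would argue for a co-simulation framework instead.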

Ordering

Another feature that should be noted is that after applying the SC transformation, semi-variograms no
longer reflect only the spatial correlation of the variable in question, but rather a combination of the
variograms and cross-variograms of the variable and the conditioning variables preceding it (Leuangthong 2003).
Practically, this means that the higher order variables of an SC transformation are
expected to have reduced spatial structure relative to their original form. This brings into question the ordering of
the variables, and it has been demonstrated that better overall reproduction of the original spatial structure is
seen when the more continuous variables are chosen as the lower order conditioning variables (Leuangthong
2003).

There are many more considerations for this technique and readers are referred to Oy Leuangthong's PhD thesis
(Leuangthong 2003) for additional details on the theory, implementation and application of the SC transformation.

4.3 Example

A trivariate Stepwise Conditional transformation will be applied to the Nickel laterite dataset (Ni, Fe and SiO2),
using the stepcon program and following the chained transformation workflow presented in Figure 10.2. The
approach for a quadrivariate system will also be presented at the conclusion of this chapter in order to
demonstrate the more complicated application of a nested SC transform.

The forward SC transformation is graphically represented in Figure 4.3, where Ni is treated as Z_1, Fe as Z_2 and
SiO2 as Z_3 in the context of Equation 4.1. The untransformed and transformed bivariate cross-plots are shown in
Figures 4.4 and 4.5 respectively. As can be seen, all marginal distributions are Gaussian, with zero correlation
achieved to the third decimal place. The small feature of correlation remaining in the bottom left corner of the SC
Fe and SC Ni cross-plot is attributed to remaining correlation within the conditional bin. Increasing the number of
classes would be expected to remove this feature.

The banding along the margins of the transformed SiO2 plots is the result of a low number of data populating the
conditional bins. The software is implemented so that in the case of insufficient data within a conditional bin, a
search beyond the bin is executed until the required number is satisfied. As the marginal regions of the third
variable in this trivariate SC transformation all have an insufficient number, this search function results in the
minimum number of data being found in all cases. The subsequent normal score transformation then places the
same number of data at the same Gaussian quantiles, creating the banding effect (Leuangthong 2003). While not
aesthetically pleasing, it is not a major cause for concern.

Figure 4.3: Stepwise Conditional transformation workflow for a trivariate system.

The final discussion of Section 4.2 stated that the more continuous variables are best chosen as the lower order
conditioning variables for a transformation, due to the effect on the resultant variogram structures. To observe the
unfiltered spatial structure prior to the SC transformation, Figure 4.6 displays the normal score experimental semi-
variogram and fitted model for each of the three variables. It may be seen that Fe is in fact the most continuous
variable, which may appear to contradict the decision to make Ni the first conditioning variable in this example. This is a
case of competing priorities, as in practice it may become a cause for concern when the spatial structure of the
primary resource variable is being altered by a secondary (in the case of Fe, contaminant) variable.

Figure 4.7 shows the experimental semi-variograms and fitted models for the three variables following the SC
transformation. The Ni variograms are identical between Figures 4.6 and 4.7, as is to be expected since Ni is the first
variable in the SC transformation and is therefore the same as its normal score equivalent. Fe and SiO2, however,
are seen to have their spatial structure decreased by the SC transformation relative to their normal score equivalents.

Figure 4.4: Cross-plots between logratio transformed Ni, Fe and SiO2. These will be the pre-transform
distributions for applying the trivariate Stepwise Conditional transformation.

Figure 4.5: Cross-plots between Ni, Fe and SiO2 following a trivariate SC transformation.

Figure 4.6: Horizontal (left) and vertical (right) experimental semivariograms and fitted models for normal score
transformed Ni, Fe and SiO2.

Figure 4.7: Horizontal (left) and vertical (right) experimental semivariograms and fitted models for SC transformed
Ni, Fe and SiO2.

Simulation and Back-transformation

Independent Gaussian simulation proceeded using the variogram models in Figure 4.7, followed by SC and
logratio back-transformation (recall that the workflow in Figure 10.2 is being applied and therefore involves a
logratio transformation). Figure 4.8 displays the back-transformed simulated values; comparing it with the
original distributions in Figure 1.2, a near identical reproduction of the complex features is observed.

Figure 4.8: Cross-plots between back-transformed simulated values of Ni, Fe and SiO2. Comparison with Figure
1.2 displays excellent cross-correlation between variables.

Finally, Figure 4.9 displays the forward transformation workflow for a quadrivariate system,
demonstrating the more complex nature of a nested SC application. Ni was used as the first conditioning variable
in a bivariate SC transformation with Fe. Observing that Ni is fairly weakly correlated with MgO and SiO2 in the
original distributions, relative to their relationship with Fe, the decision was made to use Fe as the first conditioning
variable for the trivariate transformation of MgO and SiO2. This ensures that the strong correlation with Fe will be
removed, while ensuring that the critical multivariate relationship of MgO and SiO2 is also reproduced (the ratio
between them is a critical factor for blending considerations).

Following Figure 4.9, the forward and backward transformation is as follows:

1. Stepwise transform of Ni (Z_1) and the normal score of Fe (Z_2)
2. Stepwise transform of Fe (Z_1, left untransformed), SiO2 (Z_2) and MgO (Z_3)
3. Gaussian simulation of Ni and Fe (from step 1), SiO2 and MgO (from step 2)
4. Back transform Ni (Y_1') and Fe (Y_2')
5. Given the normal score values of Fe (Y_2') from step 4, back transform SiO2 (Y_2') and MgO (Y_3')

Figure 4.9: Forward transformation for a nested SC workflow.

4.4 Software (stepcon & stepcon_b)

The first Stepwise Conditional Transformation program, stepcon, is used to perform the forward transformation.
This program is an adaptation of several previous versions from Deutsch and Leuangthong. The corresponding
required parameters are shown in Figure 4.10 and are explained below:

datafl: file with the input data to be transformed.


nvar: number of variables to transform.
icold(i), i=1,...,nvar, iwtd, irtd: columns for variables to transform, weights and rocktype.
nrt: number of valid rocktypes to consider for the transformation.
rt(i), i=1,...,nrt: valid rocktype codes to consider for the transformation.
tmin, tmax: trimming limits to filter out data.
irefdist: Weight input data based on multivariate reference distribution (1=yes,0=no).
calfl: file that contains multivariate reference distribution.
icolc(i), i=1,...,nvar, icolwt, icolrt: columns for reference variables, weights and rocktype.
ncls: number of classes to partition distributions.
nq: number of quantiles within each partition.

minnd: minimum number of data in each class.
leavefirst: option to not transform the first variable, in the case where it has already been univariate
normal score transformed (1=yes,0=no).
iorder: option to order the bivariate transformation in a monotonic nature (1=yes,0=no).
z1min, z1max: minimum and maximum value of the first variable. z1min > z1max removes the
constraint and considers all data within the trimming limits.
n2t: number of minimum/maximum constraint entries for the second variable.
Following line is populated for i=1,...,n2t times:
z1val(i), z2min(i), z2max(i): minimum z2min(i) and maximum z2max(i) z2 values corresponding to
the conditional z1val(i) of the bivariate distribution. Used to constrain the data in order to prevent
outliers from exerting influence on the tails of the distribution.
n3t: number of minimum/maximum constraint entries for the third variable.
Following line is populated for i=1,...,n3t times:
z1val(i), z2val(i), z3min(i), z3max(i): minimum z3min(i) and maximum z3max(i) z3 values
corresponding to the conditional z1val(i) and z2val(i) of the trivariate distribution. Used to constrain
the data in order to prevent outliers from exerting influence on the tails of the distribution.
outfl: file for output of transformed values.
trnfl: file for output of transformation table, which is required for stepcon_b.

Figure 4.10: Par file for stepcon

The second Stepwise Conditional transformation program, stepcon_b, is used to perform the back-transformation.
The corresponding required parameters are shown in Figure 4.11 and are explained below:

nvar: number of variables to back transform.


tmin, tmax: trimming limits to filter out data.
The following two lines are repeated for i=1,...,nvar variables:
simfl(i): file with the input data to be back transformed.
icolv(i): column with input value to be back transformed.
trnfl: file for the transformation table, which is output from stepcon.
outfl: file for output of back-transformed values.

Figure 4.11: Par file for stepcon _b

Chapter 5

Principal Component Analysis (PCA)


The basic goal of Principal Component Analysis (PCA) is to decorrelate variables, and to determine whether a complicated
system of many dimensions may be reduced to fewer underlying, often obscured, critical components. The
transformation is linear in nature, and may therefore facilitate the transformation of both simulated and estimated
values.

5.1 Transformation Concept

A demonstrative example will be borrowed from A Tutorial on Principal Component Analysis (Schlens 2009). This
paper is very highly recommended for those who seek a more elaborate and thorough introduction to the general
concept and theory of PCA.

Suppose that a ball is attached to an ideal spring, which is then stretched and set in motion along an x-axis (Figure
5.1). Because it is an ideal spring, no energy will be lost to friction or other forms of resistance, allowing the ball to
remain in a permanent state of motion. Now imagine that a scientist is studying the motion of the ball, but is
completely ignorant of the spring mechanism and its orientation, knowing only that the ball is non-stationary in
space. In order to study the dynamics of the system, the scientist sets up three video cameras, all at irregular
angles with respect to the motion of the ball along the x-axis. The cameras then record the projection of the ball
at an aligned and regular frequency, producing six measurements (x and y axes of the three images) as displayed
in Figure 5.1.

Figure 5.1: Three cameras oriented at irregular angles to the motion of an ideal spring along the x-axis (above). The
cameras record the location of the ball at a regular frequency, producing projections of the ball within their unique
two dimensional image axes (Schlens 2009).

Figure 5.2: Expanded view of camera A's image capture from Figure 5.1, with interpreted signal and noise (Schlens
2009).

The dynamics of the system are of course far simpler than may be evident to the scientist at this stage of
observation, but how should he or she go about extracting the critical information? As the title of this chapter may
have given away, PCA is a solution to this problem. Looking closer at the results from camera A (Figure 5.2), a
fundamental assumption of PCA is that the directions of greatest variance (σ²_signal) hold important structure.
Orthogonal directions of low variance are then associated with proportionally less important structure, or may
even be unresolvable noise (σ²_noise). PCA in this example would involve the coordinate rotation of each
camera's observations to align with their respective directions of greatest variance. Through this rotation, the
common signal orientation would be identified, allowing the system to be reduced to the single linear dynamic
feature along the x-axis.

This conceptual example is analogous to the study of rock properties. It is not uncommon for dozens of
geometallurgical variables to be measured in an attempt to better understand and model a deposit. Perhaps
they are complicating what is a relatively simple system, as the measurements may in fact be at irregular
orientations to the driving mechanisms of the compositional makeup. If that is the case, these numerous variables
may be reduced to a few principal components which explain the majority of the system variance.

Additionally, PCA transforms the variables to be orthogonal to one another, which is an attractive feature in
geostatistics as it allows for independent simulation. Caution must be exercised, however, as the largest drawback
of PCA is its assumption of linearity within the data. The conceptual example above is a perfectly linear system,
but non-linear, higher order features are frequently observed in rock properties. This means that while PCA will
decorrelate the variables, in doing so the method may not accurately capture the true dynamics or variability of
the system. If that is the case, subsequent simulation and back-transformation is likely to reveal a poor
reproduction of the original multivariate relations.

5.2 Transformation Theory

Consider a multivariate observation vector X, where n observations have been made for each of m
standardized variables:

X_{m×n} = [ x_{11} ⋯ x_{1n} ; ⋮ ⋱ ⋮ ; x_{m1} ⋯ x_{mn} ]

The end goal of PCA, as previously alluded to, is to find the linear combinations (or transformed variables) of a matrix
Y corresponding with this data matrix X, such that the transformed covariance matrix C_Y is diagonal. A diagonal
covariance matrix implies that the transformed variables are perfectly uncorrelated, due to the zero covariance
values off of the diagonal. Jumping ahead, the orthonormal matrix P in Equation 5.1 is the link between these
linear combinations. The question is how to derive P?

Y = PX    (5.1)

Recall the calculation of a covariance matrix for standardized variables (Equation 5.2). Following (Schlens 2009) and
(Johnson and Wichern 1988), rewrite the covariance matrix C_Y in terms of the unknown variable P (Equation 5.3).

C_Y = (1/n) Y Y^T    (5.2)

C_Y = (1/n) (PX)(PX)^T
    = P ( (1/n) X X^T ) P^T    (5.3)
C_Y = P C_X P^T

According to the eigenvector definition (Johnson and Wichern 1988), any symmetric matrix A is diagonalized by an
orthogonal matrix of its eigenvectors. This gives A = E D E^T, where E is a matrix of eigenvectors of A and D is a
diagonal matrix. Recognizing that C_X is a symmetric matrix, assign P to be a matrix where each row is an
eigenvector of C_X, such that P ≡ E^T. Noting that the transpose of an orthogonal matrix is its inverse (P^T = P^{-1}),
it is proven that C_Y is diagonal according to the substitutions in Equation 5.4.

C_Y = P (P^T D P) P^T
C_Y = (P P^{-1}) D (P P^{-1})    (5.4)
C_Y = D

The principal components of a data matrix are the eigenvectors of its covariance matrix C_X (the rows of P). The
diagonal values of C_Y are the eigenvalues corresponding with these eigenvectors, meaning they communicate the
magnitude of each principal component's variance. As shown in the demonstrative example in section 5.1, these
values may therefore aid in determining the components that are worth modeling, potentially reducing the
dimensionality of the system.

Practical Implementation Steps

1) Calculate the covariance matrix for a standardized multivariate observation vector X_{m×n}:

   C_X = (1/n) X X^T

2) Perform spectral decomposition of the C_X matrix to determine the diagonal matrix of eigenvalues D and
   the rows of eigenvectors P:

   C_X = P^T D P

3) Compute the PCA transformed values Y through multiplication with the row eigenvector matrix P:

   Y = PX

4) If required, back transform PCA values through inversion of the above equation (recall that the inverse
   of an orthogonal matrix is its transpose). While dimension reduction may have taken place on the
   transformed Y values, multiplication with the m × m column eigenvector matrix P^T will
   restore the original dimensionality of the X data matrix:

   X = P^T Y
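The four steps above can be sketched with a standard eigendecomposition routine. This is a minimal illustration on synthetic standardized data; the two-factor construction and seed are hypothetical, not from the guidebook dataset:

```python
import numpy as np

# Hypothetical standardized data matrix X (m = 4 variables by n = 1000
# observations), built from two underlying factors.
rng = np.random.default_rng(5)
n = 1000
f = rng.standard_normal((2, n))
X = np.vstack([f[0], 0.8 * f[0] + 0.6 * f[1], f[1], 0.5 * f[0] - 0.5 * f[1]])
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

C_X = X @ X.T / n                    # step 1: covariance matrix
eigvals, E = np.linalg.eigh(C_X)     # step 2: spectral decomposition
order = np.argsort(eigvals)[::-1]    # sort components by variance
D = eigvals[order]
P = E[:, order].T                    # rows of P are eigenvectors of C_X

Y = P @ X                            # step 3: PCA-transformed values
X_back = P.T @ Y                     # step 4: back-transformation

C_Y = Y @ Y.T / n                    # diagonal; entries equal D
```

Note that np.linalg.eigh returns eigenvalues in ascending order, so the components are reordered here by decreasing variance before forming P.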

5.3 Example

Returning to the Nickel Laterite dataset, the PCA program will be applied to a quadrivariate system of Ni, Fe, SiO2
and MgO (Figure 5.3) that have already been logratio and normal score transformed. Refer to Chapter 10 for
additional rationale on these preceding transformations. The logratio transformation was executed to remove
the compositional constraints from the data, while the normal score transformation was used to facilitate the
subsequent PCA transformation by i) centering the individual variables at a zero mean and ii) rank ordering the
data to remove outliers, to which the covariance-based PCA will be sensitive. Note that conditional standardization,
as it appears in the workflow of Figure 10.3, has not been applied.

The PCA transformed variables (Figure 5.4) have been decorrelated, but the nature of the transform guarantees
this effect. The question is whether the PCA transform has accurately captured the nature of the multivariate
relationship, so that it is accurately restored following simulation and back-transformation. Furthermore, while
decorrelated according to the linear measure of correlation, Figure 5.4 displays that the transformed distributions
are far from Gaussian in nature.

Furthermore, as discussed in section 1.3, only in cases where an intrinsic model of coregionalization describes the
spatial correlation between variables will decorrelation at the zero lag distance decorrelate the variables at all
lag distances. Section 7.3 of the MAF chapter will demonstrate the spatial correlation that does remain following
the PCA transformation (which MAF will attempt to improve upon).

Figure 5.3: Cross-plots of Ni, Fe, MgO and SiO2, which have been logratio and normal score transformed. Marginal
normal distributions are displayed on the axes.

Figure 5.4: Cross-plots of Ni, Fe, MgO and SiO2 which have been logratio, normal score and PCA transformed.

Simulation and Back-transformation

All four PCA components were normal score transformed to facilitate simulation, simulated, and back-transformed
to original space. The bivariate cross-plots of the back-transformed simulated values are displayed in Figure 5.5,
and it may be observed that while the general form of the original distributions is honored, both the variability
and the non-linear relationships have not been reproduced. This is perhaps not surprising, since there was no
accounting for the non-linear and heteroscedastic multivariate features in the workflow.

Figure 5.5: Cross-plots for back-transformed simulated values. Original complexity features (black lines) reflect the
original distributions in Figure 1.2.

Dimension Reduction

The magnitude of the eigenvalues communicates the relative variability which each principal component
contributes to the multivariate system. Figure 5.6 displays the eigenvalues, rescaled to percentages of the total
system variability. Components 3 and 4 contribute a total of 21.7% of the variability to the system according to
their eigenvalues; while this is significant within a modeling framework, they will be removed for the back-
transformation to demonstrate the impact.

Applying the back-transformation with only principal components 1 and 2 (transformed data rather than simulated
values), the original quadrivariate dimensionality is restored, with the bivariate cross-plots displayed in Figure 5.7.
Comparing the original multivariate relationships in Figure 5.3 with these back-transformed distributions, it can be
seen that the general shapes are honored but that, predictably, the full variability of the distributions is not
reproduced.

Figure 5.8 directly compares the univariate distributions in Figures 5.3 and 5.7 by cross-plotting the back-transformed
variables with their original form. A very high correlation is seen between each pair of distributions, as would be
hoped, but variability has been reduced in all cases. The variance reduction factor for each variable (σ²_reduction) was
simply calculated as the difference in variance between the original and back-transformed distributions, divided by
the original variance. An arithmetic average of the four variance reduction factors seen in Figure 5.8 yields 21.6%,
which compares favourably with the 21.7% reduction that was predicted by the eigenvalues (Figure 5.6). This
indicates that the eigenvalues are indeed an accurate prediction of the system variability, as defined by linear
measures.
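This variance accounting can be sketched numerically. The data below are hypothetical (not the Nickel Laterite values); the sketch discards components 3 and 4, back-rotates, and confirms that the fraction of total variance lost equals the fraction of the eigenvalue sum carried by the discarded components.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 4-variable correlated dataset (variables in rows)
X = rng.standard_normal((4, 4)) @ rng.standard_normal((4, 5000))
X -= X.mean(axis=1, keepdims=True)

C = X @ X.T / X.shape[1]
d, Q = np.linalg.eigh(C)
order = np.argsort(d)[::-1]
d, Q = d[order], Q[:, order]

Y = Q.T @ X              # principal component scores
Y[2:, :] = 0.0           # discard components 3 and 4
X_back = Q @ Y           # restore the original dimensionality

# Eigenvalue prediction of the fraction of variance lost
predicted = d[2:].sum() / d.sum()

# Actual total variance lost across the back-transformed variables
C_back = X_back @ X_back.T / X.shape[1]
actual = 1.0 - np.trace(C_back) / np.trace(C)
print(abs(predicted - actual) < 1e-12)  # True
```

The trace-based total shown here matches the eigenvalue prediction exactly; the arithmetic average of per-variable reduction factors used in the text is an unweighted version of the same accounting and agrees only approximately.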


Figure 5.6: Magnitude of the principal components, rescaled to percentages of the system variability.

Figure 5.7: Cross-plots of the back transformed values using only two principal components. This compares with
the pre-PCA distributions seen in Figure 5.3.

Figure 5.8: Cross-plots between the original (Figure 5.3) and back transformed (Figure 5.7) variables with
associated variance reduction factors.

5.4 Software (pca & pca_b)

The first Principal Component Analysis program, pca, is used to perform the forward transformation. This program
was adapted from CCG MAF code (Elogne and Leuangthong 2008). The corresponding required parameters are
shown in Figure 5.9 and are explained below:

datafl: file with the input data to be transformed.
numbvar: number of variables to transform.
varcol(i), i=1,...,numbvar: columns for variables.
tmin, tmax: trimming limits to filter out data.
dataoutd: transformation information output file. Contains eigenvalues, row ordered eigenvectors,
and the original variable names. Formatted and required by the pca_b program for back-
transformation.
dataoute: file for output. This file contains numbvar columns appended to the original data file with
the transformed data values.

Figure 5.9: Par file for pca

The pca program outputs an information file (Figure 5.10) which contains the covariance matrix, the eigenvalue
matrix in descending order of magnitude, the row ordered eigenvector matrix corresponding to the eigenvalues, and
the original variable names. This file is useful for understanding the potential for dimension reduction (eigenvalues)
and is formatted to be read by the back-transformation program pca_b.

Figure 5.10: Principal Component Analysis Info file which is output from the PCA program.

The second Principal Component Analysis program, pca_b, is used to perform the back-transformation. The
corresponding required parameters are shown in Figure 5.11 and are explained below:

datafl: file with the input data to be transformed.
npc: number of principal component variables to back-transform.
pccol(i), i=1,...,npc: columns for variables.
tmin, tmax: trimming limits to filter out data.
vectorfl: principal component information file, which was output from the pca program.
nvector: number of vectors (rows) within the vectorfl row ordered vector matrix (the original
dimensionality of the untransformed distribution).
outfl: file for output. This file contains nvector columns appended to the original data file with the
transformed data values.

Figure 5.11: Par file for pca_b

Chapter 6

Conditional Standardization
It has been demonstrated that linear covariance based techniques, such as the PCA transformation in the
preceding chapter, may not adequately capture complex multivariate relationships. Consequently, geostatistical
models that rely solely on these linear transformation methods may exhibit poor reproduction of heteroscedastic
and non-linear geologic features.

Conditional Standardization is introduced as a simple and intuitive transformation for the removal of non-linear
and heteroscedastic features. Distributions that approach linearity and homoscedasticity are produced, allowing
for the more effective application of subsequent linear modeling or transformations. Back-transformation will then
enforce the original complex relationships.

6.1 Transformation Concept

The covariance between variables underlies traditional multivariate geostatistical modeling techniques such as co-
kriging and co-simulation, as well as multivariate transformations such as PCA and MAF. As covariance is a linear
statistic, it does not capture non-linear and heteroscedastic features.

This issue is perhaps most easily visualized by observing the nature of the PCA transformation. Recall from Chapter
5 that PCA is a rotation of orthogonal vectors to decompose and efficiently explain the variability of a multivariate
system. In the case of a truly linear relationship, such as the ideal spring example that was presented in section
5.1 (repeated in Figure 6.1 below), these orthogonal vectors may effectively describe the variability of the system.
Inspection of the bivariate cross-plots for the Nickel Laterite transformed distributions in Figure 6.2, however,
illustrates that linear orthogonal vectors cannot fully describe the nature of a non-linear system.

Figure 6.1: Decomposition of the ideal spring ball projections into two orthogonal vectors from section 5.1 (Shlens
2009)

Figure 6.2: Bivariate cross-plots of Ni, Fe and SiO2, which have undergone logratio and normal score
transformation. Arrows are an approximated decomposition of the distributions by two orthogonal vectors.

A method is required which removes these complex relationships, producing distributions which are linear and
homoscedastic. In doing so, subsequent decorrelation methods such as PCA would more accurately decompose
the remaining variability of the system. Back-transformation would then reintroduce the complex features.

6.2 Transformation Theory

Consider a bivariate distribution, consisting of two variables X and Z for n observations:

$$X_{1\times n} = [x_1 \; \cdots \; x_n] \qquad Z_{1\times n} = [z_1 \; \cdots \; z_n]$$

In a manner similar to the stepwise conditional transformation (Rosenblatt, 1952), partition the bivariate
distribution based on the conditional value of the X variable. The vertical lines in Figure 6.3 represent the
boundaries of these conditional partitions, which serve to subdivide the Z values into a series of bins.

Figure 6.3: Schematic of a non-linear and heteroscedastic bivariate distribution that has been partitioned according
to conditional probability classes of X (left). Subtraction of the conditional mean and division by the conditional
standard deviation yields a linear and homoscedastic distribution (right).

The transformed value $Z'$ is then determined by applying Equation 6.1 to each observation $z_j$, $j = 1,\ldots,n$:

$$Z' = \frac{Z - E\{Z|X\}}{\sqrt{\mathrm{Var}\{Z|X\}}}, \qquad E\{Z|X\} = \frac{1}{k}\sum_{i=1}^{k} z_i, \quad \mathrm{Var}\{Z|X\} = \frac{1}{k}\sum_{i=1}^{k}\left(z_i - E\{Z|X\}\right)^2 \qquad (6.1)$$

where k is the number of data that fall within the jth observation's conditional bin. Subtraction of the
conditional mean and division by the conditional standard deviation will effectively remove any non-linear and
heteroscedastic features that are formed between the two variables, as shown in Figure 6.3. This concept may be
extended to higher dimensions, where a variable is transformed conditional to two or more variables. The
trivariate case is illustrated in Figure 6.4 and represented by Equation 6.2, where the transformed $Z'$ variable is
now conditional to the value of an additional Y variable. The conditional mean in this trivariate figure is now
represented by a plane, as opposed to a line in Figure 6.3. Non-linearity is seen to remain in the transformed
distribution because the bivariate relationship of the conditioning variables was not first addressed.

$$Z' = \frac{Z - E\{Z|X,Y\}}{\sqrt{\mathrm{Var}\{Z|X,Y\}}} \qquad (6.2)$$

The success of this transform depends on the calculation of a stable conditional mean and standard deviation.
Identical to the stepwise transformation, the data required for this transform increases as a power of the number of
variables being considered. In order to discretize a distribution into 10-20 bins that contain 10-20 observations
each, 10^n-20^n data would be required, where n is the number of variables being considered. As a result, although
theoretically possible, the transform will generally become unstable beyond the trivariate case for most geologic
datasets.

Figure 6.4: Schematic of a non-linear and heteroscedastic trivariate distribution that has been partitioned
according to conditional probability classes of X and Y (left). Subtraction of the conditional mean and division by
the conditional standard deviation yields a linear and homoscedastic distribution (right).

The back-transformation of the data or simulated values is achieved by rearranging the forward
transformation. The bivariate back-transformation then takes the form

$$Z = Z'\sqrt{\mathrm{Var}\{Z|X\}} + E\{Z|X\} \qquad (6.3)$$
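The binned forward transform and its back-transformation can be sketched in a few lines. The sample below is hypothetical (a simple heteroscedastic, non-linear model; the bin count is an illustrative assumption), not data from this guidebook.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical non-linear, heteroscedastic bivariate sample (X, Z)
x = rng.standard_normal(n)
z = x**2 + (0.5 + 0.4 * np.abs(x)) * rng.standard_normal(n)

# Partition on the conditioning variable X (equal-frequency bins)
nbins = 15
edges = np.quantile(x, np.linspace(0.0, 1.0, nbins + 1))
bin_id = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, nbins - 1)

# Forward transform: subtract the conditional mean and divide by
# the conditional standard deviation within each bin
cond_mean = np.array([z[bin_id == b].mean() for b in range(nbins)])
cond_std = np.array([z[bin_id == b].std() for b in range(nbins)])
z_prime = (z - cond_mean[bin_id]) / cond_std[bin_id]

# Z' is standardized within every bin: mean 0, standard deviation 1
print(round(abs(z_prime.mean()), 6), round(z_prime.std(), 6))  # 0.0 1.0

# Back-transformation restores the original values exactly
z_back = z_prime * cond_std[bin_id] + cond_mean[bin_id]
print(np.allclose(z_back, z))  # True
```

Equal-frequency bins are used here so that each bin holds roughly the same number of data, which stabilizes the conditional statistics in the sparse tails.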

Similar to the stepwise conditional transformation, following transformation the variograms of the higher order
(conditioned) variables will no longer reflect the true spatial structure of the original variables. Rather, they will be
a combination of the preceding conditioning variograms and the cross-variograms between them. Variables of
greater spatial structure are therefore recommended as low order conditioning variables, though practical
considerations such as leaving the primary resource variable unaltered will weigh on the decision making.

While the mean and standard deviation are calculated here according to a conditional partitioning of the data, the
same methodology is easily adapted to accommodate a continuous parametric function. While the non-linearity
and heteroscedasticity of a distribution are unlikely to be fully fit by parametric forms, this option may be attractive
in the case of few data or very high dimensionality, as the need for partitioning the conditioning variables is
removed. Both options are implemented within the CCG software.

6.3 Example

Returning to the Nickel Laterite dataset, the constd program will be applied to a trivariate system of Ni, Fe and
SiO2, all of which have been logratio and normal score transformed (Figure 6.2) according to the workflow
presented in Figure 10.3. A normal score transform is not required prior to applying the transformation, but it is
recommended if the margins of the distribution are sparsely populated. Potential instability of the conditional
mean and standard deviation calculation could result at the margins otherwise, which the rank ordered normal
score transformation will mitigate.

Only a bivariate and trivariate application will be demonstrated, though the sequential conditional standardization
workflow in Figure 6.5 includes MgO, so that readers may understand how the transformation is applied beyond
three dimensions in a nested fashion. This workflow figure displays that an initial bivariate transformation of Fe
conditional to Ni is executed to remove non-linearity and heteroscedasticity between the two. The bivariate
cross-plot, before and after this transformation, is displayed in Figure 6.6. Next, a trivariate application transforms
SiO2 to remove non-linearity and heteroscedasticity from its relationship with Ni and the previously
conditionally standardized Fe. The bivariate cross-plots of SiO2 with the conditioning variables, before and after the
transformation, are displayed in Figure 6.7.

The transformation from a non-linear multivariate distribution to a linear (and nearly Gaussian in appearance)
distribution may be more easily visualized in the three dimensional scatter plots of Figures 6.8 and 6.9, respectively.
Relative to its untransformed non-linear form, the conditionally standardized distribution will be better suited for
both covariance based PCA/MAF transformations and Gaussian based modeling techniques.

Figure 6.5: Sequential conditional standardization workflow for four variables of the Nickel laterite dataset.

Figure 6.6: Cross-plot between the normal scores of Ni and Fe (left) and the normal scores of Ni and conditionally
standardized Fe (right).

Figure 6.7: Cross-plots between the normal scores of SiO2 and its two conditioning variables before (above) and
following (below) the trivariate conditional standardization.

Figure 6.8: Three dimensional cross-plot between the normal scores of Ni, Fe and SiO2.

Figure 6.9: Three dimensional cross-plot between the normal scores of Ni and the conditionally standardized Fe and
SiO2.

Simulation and Back-transformation

Ni, Fe and SiO2 were simulated and back-transformed following an identical workflow to the one that is presented
in Figure 10.2. The bivariate cross-plots for the back-transformed realizations are presented in Figure 6.10.
Compare these multivariate relationships with the equivalent plot in Figure 5.5 for back-transformed realizations
of a PCA workflow that did not include conditional standardization, as well as with the multivariate relationships of
the original distribution in Figure 1.2. With all other variables held constant between these two workflows, a
significant improvement is seen in the reproduction of the non-linear and heteroscedastic features for the PCA
workflow that involves conditional standardization.

Figure 6.10: Cross-plots between simulated Ni, Fe and SiO2 (1 out of every 10,000 values plotted) following
back-transformation of the displayed workflow.

6.4 Software (constd & constd_b)

The first Conditional Standardization program, constd, is used to perform the forward transformation. The
corresponding required parameters are shown in Figure 6.11 and are explained below:

datafl: file with the input data to be transformed.
ixp,iyp,izp,iwp: columns for the x, y, z and weight variables. Refer to Note 1 in the parameter figure
for additional considerations.
tmin, tmax: trimming limits to filter out data.
xmin,xmax: minimum and maximum value of the conditional x variable for allocating bins. Refer to
Note 2 in the parameter figure for additional considerations.
ymin,ymax: minimum and maximum value of the conditional y variable for allocating bins.
ipar: parametric (1) or discretized (0) calculation of the mean and standard deviation functions.
If ipar=1, the following two lines apply:
xm_regress, ym_regress: order of regression for the parametric calculation of the conditional mean.
Refer to Note 3 in the parameter file for additional considerations. Note that if optimizing, all
methods will be tested, with the functional form that produces the lowest mean squared error being
selected. When possible, it is advised to choose the functional form based on visual inspection of
both the original distribution and resultant residual values, as mean squared error does not reveal
issues regarding the bias of a function.
xstd_regress, ystd_regress: order of regression for the parametric calculation of the conditional
standard deviation. Refer to Note 3 in the parameter file for additional considerations.
If ipar=0, the following three lines apply:
nxdis, nydis: number of discretizations for partitioning the x and y conditioning variables.
bxsize, bysize, nbmax: multiplying factor of sample consideration limits in the x and y directions
(refer to Note 4 in the parameter file for additional details). Samples within each conditioning bin will
be sorted by distance from the center, with samples above the nbmax threshold discarded. A large
bxsize/bysize and a large number of nxdis/nydis, with a relatively low nbmax, may therefore be used
to improve stability of the function calculation in sparse regions of a distribution, while maintaining
appropriate resolution in the dense regions.
iorder: order relations for the x mean and x standard deviation functions (refer to Note 4 in the
parameter file for specifications).
outfl: file for output from the constd transform. This file contains the transformed z variable
appended to the original data file.
outfltrn: output file for the transformation table. Contains conditional bin limits and associated
mean/standard deviation of the transformed variable.

Figure 6.11: Par file for constd

The second Conditional Standardization program, constd_b, is used to perform the back-transformation. The
corresponding required parameters are shown in Figure 6.12 and are explained below:

datafl: file with the input data to be transformed.
ixp,iyp,izp: columns for the x, y, and z variables. Refer to the Note in the parameter figure for
additional considerations.
tmin, tmax: trimming limits to filter out data.
trnfl: file containing the transformation table from the forward constd program.
outfl: file for output. This file contains the back-transformed variable appended to the original data
file.

Figure 6.12: Par file for constd_b

Chapter 7

Minimum/Maximum Autocorrelation Factors (MAF)


The Minimum/Maximum Autocorrelation Factors (MAF) transformation is a direct extension of the Principal
Component Analysis (PCA) transformation that was presented in Chapter 5. Like PCA, MAF produces linear
combinations of the original variables that are uncorrelated. Whereas PCA only considers the lag zero covariance
matrix for the removal of correlation, MAF will apply PCA twice on both the zero and non-zero lag covariance
matrices, allowing for a more robust spatial decorrelation of the variables.

7.1 Transformation Concept

Readers are encouraged to review Chapter 5 on PCA, as no further review will be dedicated to the foundational
theory of spectral decomposition and dimension reduction. As demonstrated in that chapter, the PCA
transformation produces linear combinations of the original multivariate variables that are guaranteed to be
uncorrelated at zero lag distance.

Examining the relationship between two normalized PCA factors, it is observed that while uncorrelated at zero
lag (Figure 7.1), spatial correlation remains according to their experimental cross-semivariogram (Figure 7.2). It
may not be immediately intuitive to relate correlation with the variogram in this figure, but recall that the
cross-variogram for normalized variables is given by Equation 7.1. Furthermore, as $\rho_{zy}(0)$ is equal to zero when
working with PCA factors, the cross-variogram between two PCA factors is simply equal to the negative of their
spatial correlation (Equation 7.2).

$$\gamma_{zy}(h) = \rho_{zy}(0) - \rho_{zy}(h) \qquad (7.1)$$

$$\gamma_{zy}(h) = -\rho_{zy}(h) \qquad (7.2)$$
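As a quick numeric illustration of these two relations (the residual cross-correlation model below is purely hypothetical, chosen only so that it is zero at lag 0 as PCA guarantees):

```python
import numpy as np

# Hypothetical residual cross-correlation between two PCA factors:
# zero at lag 0 (guaranteed by PCA), small but non-zero at longer lags
def rho(h):
    return -0.05 * (1.0 - np.exp(-h / 50.0))

def gamma(h):
    # Equation 7.1: cross-variogram of normalized variables
    return rho(0.0) - rho(h)

# With rho(0) = 0, the cross-variogram is just -rho(h) (Equation 7.2)
print(np.isclose(gamma(25.0), -rho(25.0)))  # True
```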

The PCA transformation may be thought of as a naive decorrelation of variables, since it only performs spectral
decomposition on the lag zero covariance matrix. Spatial correlation of the variables is not addressed, with the
method therefore assuming an intrinsic model of coregionalization (IMC), where correlation is independent of
spatial scale (Journel and Huijbregts 1978).

While practically acceptable in many cases, such an assumption may not be sufficient for describing the spatial
correlation between the variables to be modeled. The vertical axis in Figure 7.2 reveals that the remaining
correlation in this example is indeed very small, indicating that the naive application of PCA has largely addressed
the spatial correlation. Nevertheless, correlation remains, meaning that at least in theory, these variables would
require cross-variograms and non-independent simulation.

As described in section 1.3, the linear model of coregionalization (LMC) is often adopted where spatial correlation
exists that is dependent on spatial scale. Variables are treated in this LMC model as linear combinations of
common nested structures at different spatial scales (Journel and Huijbregts 1978). MAF seeks to decorrelate the
variables at all lags within a two nested structure LMC model, through the application of PCA at both zero and
non-zero lag distances. In doing so, a more robust spatial decorrelation of the variables is obtained, allowing
for the more correct application of independent simulation. As a result, it is expected that MAF simulation will
provide improved reproduction of the original spatial correlation between variables, relative to that of
equivalent PCA frameworks.

Figure 7.1: Cross-plot between two principal components.

Figure 7.2: Experimental cross-semivariogram between the two principal components in Figure 7.1.

7.2 Transformation Theory

Following Desbarats and Dimitrakopoulos (2000) and Elogne and Leuangthong (2008), consider a multivariate
observation matrix $X(u)$, where n observations have been made for p standardized variables at locations $u$.
Note that the location of the data is not considered for the naive PCA application, which is why $u$ has been added
to the nomenclature relative to Chapter 5.

$$X(u) = \begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{p1} & \cdots & x_{pn} \end{bmatrix}$$

The lag zero covariance matrix for this observation matrix is then given by Equation 7.3:

$$C_X = \left[\, E\{X_i(u)X_j(u)\} \,\right]_{i,j=1,\ldots,p} = \frac{1}{n} X(u)X(u)^T \qquad (7.3)$$

Applying spectral decomposition to the covariance matrix, the rank ordered eigenvalue matrix $D_1$ and row
eigenvector matrix $P_1$ are determined (Equation 7.4). Recall that as $C_X$ is symmetric, the inverse of the
eigenvector matrix is simply equal to its transpose.

$$C_X = P_1 D_1 P_1^T = P_1 D_1 P_1^{-1} \qquad (7.4)$$

The rows of the $P_1$ matrix are the eigenvectors that correspond with the rank ordered eigenvalues. The
standardized PCA transformed variables $V(u)$ are then attained according to Equation 7.5:

$$V(u) = D_1^{-1/2} P_1 X(u) \qquad (7.5)$$

A different nomenclature is used here for the transformed $V(u)$ data matrix than for the transformed $Y$ data
matrix that appeared in Chapter 5, as no standardization was required for the single step PCA transformation. The
$D_1^{-1/2}$ matrix standardizes the values, since the eigenvalues are equal to the variances of the transformed data.
Standardizing the transformed values translates what is already a diagonal covariance matrix $C_Y$ into an identity
matrix $C_V$, which cannot be altered by any subsequent linear rotation.
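This invariance is straightforward to verify; the sketch below (hypothetical correlation matrix, NumPy only) standardizes the principal components per Equation 7.5 and shows that a further orthogonal rotation leaves the identity covariance untouched.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical correlated standardized data (variables in rows)
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
X = np.linalg.cholesky(R) @ rng.standard_normal((3, 10_000))
X -= X.mean(axis=1, keepdims=True)

C = X @ X.T / X.shape[1]
d, Q = np.linalg.eigh(C)
P1 = Q.T                               # row-ordered eigenvectors

# Equation 7.5: divide each component by the root of its eigenvalue
V = np.diag(d**-0.5) @ P1 @ X
C_V = V @ V.T / V.shape[1]
print(np.allclose(C_V, np.eye(3)))     # True: identity covariance

# A subsequent orthogonal rotation cannot alter the identity matrix
Rot, _ = np.linalg.qr(rng.standard_normal((3, 3)))
W = Rot @ V
print(np.allclose(W @ W.T / W.shape[1], np.eye(3)))  # True
```

This is precisely why the second rotation that MAF performs below does not disturb the lag zero decorrelation already achieved.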

A variogram matrix of the transformed $V(u)$ distribution is given by Equation 7.6, which is also termed in the
literature as the covariance matrix of lag h difference vectors, $C_V(h)$:

$$2\Gamma_V(h) = E\left\{ [V(u) - V(u+h)][V(u) - V(u+h)]^T \right\} = C_V(h) \qquad (7.6)$$

Observe that spectral decomposition of the lag zero covariance matrix (Equation 7.3) produces the $P_1$ coefficients
needed for decorrelation of the variables at zero lag. It follows that spectral decomposition of Equation 7.6 will
produce the $P_2$ coefficients necessary for decorrelation of the variables at lag h (Equation 7.7):

$$C_V(h) = P_2 D_2 P_2^T \qquad (7.7)$$

The transformed MAF variables, decorrelated at both zero and h lag distances, are then obtained by Equation 7.8:

$$T(u) = P_2 V(u) \qquad (7.8)$$

Should the spatial data be fully described by a two structure LMC model, the variables will be decorrelated at all
lags. In regards to the back-transformation, combining Equations 7.5 and 7.8, it can be seen that the MAF
transformed factors $T(u)$ are linear combinations of the original data through a single matrix $A$:

$$A = P_2 D_1^{-1/2} P_1 \qquad (7.9)$$

The back-transformation of the MAF factors is then given by Equation 7.10:

$$X(u) = A^{-1} T(u) \qquad (7.10)$$

As $A$ is non-symmetric, its inverse is not equal to its transpose, and a general method such as LU
decomposition will be required for its calculation.
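In practice the inverse need not be formed explicitly; solving the linear system (an LU-based solve) is the standard route. A minimal sketch with a hypothetical, non-symmetric A:

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical non-symmetric combined matrix A and original data X
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 500))
T = A @ X                          # forward: T(u) = A X(u)

# Equation 7.10: X(u) = A^{-1} T(u). Since A is not orthogonal, the
# transpose shortcut fails; solve the linear system (LU-based) instead.
X_back = np.linalg.solve(A, T)
print(np.allclose(X_back, X))      # True

print(np.allclose(A.T @ T, X))     # False: transpose is not the inverse
```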

Practical Implementation Steps

Part 1: PCA Transformation at lag h=0

1) Calculate the covariance matrix for a standardized multivariate observation matrix $X(u)$:

$$C_X = \frac{1}{n} X(u)X(u)^T$$

2) Perform spectral decomposition of the $C_X$ matrix to determine the eigenvalues $D_1$ and orthonormal
eigenvectors $P_1$:

$$C_X = P_1 D_1 P_1^T$$

3) Compute the standardized PCA transformation:

$$V(u) = D_1^{-1/2} P_1 X(u)$$

Part 2: PCA Transformation at lag h > 0

4) Compute the experimental omni-directional variogram matrix $2\Gamma_V(h)$ for the transformed data matrix
$V(u)$.

5) Terming the variogram matrix $2\Gamma_V(h)$ as the covariance matrix of lag h differences $C_V(h)$, perform
spectral decomposition to determine the eigenvalues $D_2$ and row eigenvectors $P_2$:

$$C_V(h) = P_2 D_2 P_2^T$$

6) Compute the MAF transformed factors:

$$T(u) = P_2 V(u)$$
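The six steps can be sketched end-to-end in a few lines. The data below are entirely synthetic (a hypothetical 1-D string of locations mixing two nested structures plus noise, not the Nickel Laterite values); the final checks confirm that the MAF factors are decorrelated both at lag zero and at the fitting lag h.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

# Hypothetical 1-D data: mix two nested structures into 3 variables
def smooth_field(scale):
    w = rng.standard_normal(n + 200)
    k = np.exp(-0.5 * (np.arange(-100, 101) / scale) ** 2)
    f = np.convolve(w, k, mode="valid")
    return (f - f.mean()) / f.std()

S = np.vstack([smooth_field(3.0), smooth_field(25.0)])
X = rng.standard_normal((3, 2)) @ S + 0.3 * rng.standard_normal((3, n))
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# Part 1 -- PCA at lag h = 0 (steps 1-3)
C_X = X @ X.T / n
d1, Q1 = np.linalg.eigh(C_X)
P1 = Q1.T
V = np.diag(d1**-0.5) @ P1 @ X          # standardized PC scores

# Part 2 -- PCA on the lag h difference covariance (steps 4-6)
h = 10
dV = V[:, :-h] - V[:, h:]               # lag-h difference vectors
C_Vh = dV @ dV.T / dV.shape[1]          # 2*Gamma_V(h)
d2, Q2 = np.linalg.eigh(C_Vh)
P2 = Q2.T
T = P2 @ V                              # MAF factors (Equation 7.8)

# MAF factors are uncorrelated at lag zero ...
C_T = T @ T.T / n
print(np.allclose(C_T, np.eye(3), atol=1e-8))          # True
# ... and at the fitting lag h
dT = T[:, :-h] - T[:, h:]
G = dT @ dT.T / dT.shape[1]
print(np.max(np.abs(G - np.diag(np.diag(G)))) < 1e-8)  # True
```

The omni-directional lag pairing is simplified here to a fixed offset along the 1-D string; the CCG maf program forms the experimental variogram matrix with lag and tolerance parameters as described in section 7.4.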

Dimension Reduction

Recall that the $D_1$ matrix is rank ordered according to the magnitude of its entries, which relate to the variance
contribution of the principal components. Vectors associated with eigenvalues of larger magnitude are therefore
thought to contain the more relevant structure for modeling purposes, and potential dimension reduction would
involve removing the higher ranked, lower magnitude vectors.

Although similarly rank ordered according to the magnitude of the entries, the $D_2$ eigenvalues and associated
vectors convey a significantly different meaning. The eigenvalue entries now relate to the spatial continuity of the
associated MAF factor according to Equation 7.11; a formal proof of this relation is presented by Desbarats and
Dimitrakopoulos (2000). It can be seen from Equation 7.11 that high magnitude, low ranked entries will be
associated with high spatial variability, while small magnitude, high ranked entries will contain greater spatial
structure.

$$\mathrm{Corr}[T(u), T(u+h)] = I - \frac{D_2}{2} \qquad (7.11)$$

Whereas dimension reduction in PCA will therefore consider the removal of higher ranked, low variance factors,
MAF will look to remove lower ranked, higher spatial variability factors.
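A small sketch of this selection logic, using purely hypothetical $D_2$ eigenvalues (rank ordered, largest first) and an arbitrary illustrative cut-off:

```python
import numpy as np

# Hypothetical rank-ordered MAF eigenvalues from the lag-h decomposition
d2 = np.array([1.84, 1.12, 0.32])

# Equation 7.11: implied lag-h correlation of each MAF factor
spatial_corr = 1.0 - d2 / 2.0
print(spatial_corr)        # [0.08 0.44 0.84]

# PCA discards low-variance factors; MAF instead discards the factors
# with the least spatial structure (largest d2, lowest correlation)
keep = spatial_corr > 0.1  # arbitrary illustrative cut-off
print(keep)                # [False  True  True]
```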

7.3 Example

Returning to the Nickel Laterite dataset, the MAF transformation will be applied to the Ni, Fe and SiO2 variables,
which have already been logratio, conditionally standardized and normal score transformed (Figure 7.3) according
to the workflow given in Figure 10.3. Conditional standardization was executed to remove non-linear and
heteroscedastic features that would not be captured by the covariance based MAF transform. The normal score
transform immediately precedes MAF due to the need for mean zero variables, and to decrease the influence of
outliers on the covariance and variogram calculations that follow.

Figure 7.3: Cross-plots of the pre-MAF distributions, which have undergone the prior transformations as displayed.

Figures 7.4 and 7.5 display cross-plots of the transformed factors following the initial standardized lag zero PCA
transformation and the second, non-zero lag PCA transformation (MAF), respectively. It is observed that though
differing in their precise bivariate relationships, the transformed PCA and MAF factors are both perfectly
uncorrelated at zero lag distance. Viewing the experimental cross-semivariograms for these two
transformations in Figure 7.6, however, it is clear that the spatial correlation of the MAF transformed variables is
significantly less than that of the lag zero PCA transform. As touched upon in the dimension reduction theory
section, the reordered ranking between the PCA and MAF factors means that the first PCA factor is equivalent to
the third MAF factor, explaining the comparisons that are made in Figure 7.6.

While spatial correlation is largely removed by the MAF transform, a marginal amount does remain according to
Figure 7.6. MAF will only remove all spatial correlation for geologic deposits that are fully described by two nested
structures. While two structures are often adequate for modeling geologic deposits, the natural phenomena which
created them will rarely be entirely described by a linear model. Such is the case in this example, but MAF has
sufficiently reduced the spatial correlation to a negligible amount, allowing independent simulation to proceed.

Figure 7.4: Cross-plots of the PCA transformed distributions.

Figure 7.5: Cross-plots of the MAF transformed distributions.

Figure 7.6: Experimental cross-variograms of PCA and MAF transformed distributions. The grey histogram indicates
the relative number of pairs used in the calculation of each lag.

Increasing Correlation of MAF transforms

Dimension reduction would not likely be considered for a system as small as three variables, but should this take
place, the following considerations should be made:

1) Applying Equation 7.11 to the $D_2$ eigenvalues, the spatial correlation of each MAF factor is
determined (Figure 7.7).
2) Inspection of the vertical and horizontal experimental semi-variograms for each factor confirms that
increased spatial correlation is seen with increased MAF ranking (Figure 7.8). The normal scores of the
MAF factors are shown in this figure because a final normal score transformation is required prior to
simulation to ensure marginal Gaussian distributions (Figure 10.3).
3) Should dimension reduction be considered, the lower ranked factors of lower spatial correlation
would be discarded. All factors in this example exhibit spatial structure and are worthwhile to model.
This option of dimension reduction becomes attractive when the spatial structure of a factor is
largely explained by the nugget effect.

Figure 7.7: Spatial correlation of the MAF factors, as obtained from Equation 7.11.

Figure 7.8: Experimental semi-variograms and fitted models for the MAF factors. Spatial correlation and structure
are seen to increase with ranking.

Simulation and Back-transformation

Simulation and back-transformation took place following the workflow in Figure 10.3. Cross-plots of the back-
transformed values (Figure 7.9) display excellent reproduction with respect to the original complex features shown
in Figure 1.2, and are nearly identical to the results of the PCA/conditional standardization workflow that were
displayed in Figure 6.10 (which is comparable to the MAF/conditional standardization workflow being applied
here).

It has been demonstrated that MAF provides a more robust spatial decorrelation of variables relative to the single
step PCA transform. It is then expected that, with all else being equal, independent simulation of the variables
using the MAF transform will produce superior reproduction of the spatial cross-correlation when compared to
independent PCA simulation. Figure 7.10 displays the reproduced experimental cross-semivariograms of four
arbitrarily selected realizations for both the MAF simulated results and the PCA results from Chapter 6. Also
plotted are the experimental cross-semivariograms of the original data for comparison. It is observed that while
both PCA and MAF reproduce the cross-correlations, the MAF simulations are superior in this regard.

Figure 7.9: Cross-plots between simulated values (1 out of every 10,000) of Ni, Fe and SiO2 following
back-transformation.

Figure 7.10: Experimental cross-semivariograms for four PCA (left) and MAF (right) simulated realizations. The
experimental cross-semivariogram of the original data is also plotted to display the target correlation.

7.4 Software (maf & maf_b)

The first Minimum/Maximum Autocorrelation program, maf, is used to perform the forward transformation. This
program was adapted from MAF code by Elogne and Leuangthong (Elogne and Leuangthong 2008). The
corresponding required parameters are shown in Figure 7.11 and are explained below:

datafl: file with the input data to be transformed.


numbvar: number of variables to transform.
varcol(i), i=1,...,numbvar: columns for variables.
tmin, tmax: trimming limits to filter out data (see note in parameter file).
ix,iy,iz: columns for the x,y,z coordinates.
xlag,xtol1,xtol2: omnidirectional lagsize for the cross-semivariogram calculation, the associated lower
bound lag tolerance (xtol1) and upper bound lag tolerance (xtol2).
dataoute: file for output. This file contains numbvar columns appended to the original data file with
the transformed data values.
dataoutd: file for output from the first PCA transform at lag h=0. This file contains numbvar columns
appended to the original data file with the transformed data values.
dataout: file for output from the MAF transform at lag h. This file contains numbvar columns
appended to the original data file with the transformed data values.
dataoute: transformation information output file. Contains covariance matrices, eigenvalues, row
ordered eigenvectors from the PCA transforms, as well as the original variable names. Formatted and
required by the maf_b program for back-transformation.

Figure 7.11: Par file for maf

The maf program outputs an information file (Figure 7.12) which contains the critical calculated matrices used
for the MAF transformation. This file is useful for assessing the potential for dimension reduction
(MAF eigenvalues) and is formatted to be read by the maf_b program, as the A matrix (corresponding to the A
matrix in Equation 7.10) and the original variable names are required for the back-transformation.

Figure 7.12: Transformation information file which is output from the maf program

The second Minimum/Maximum Autocorrelation program, maf_b, is used to perform the back-transformation.
The corresponding required parameters are shown in Figure 7.13 and are explained below:

datafl: file with the input data to be transformed.


nmaf: number of variables to transform.
mafcol(i), i=1,...,nmaf: columns for variables.
tmin, tmax: trimming limits to filter out data.
vectorfl: MAF factor information file (dataoutd) output from maf program.
nvector: number of vectors (rows) within the row ordered A matrix of vectorfl.
outfl: file for output. This file contains nvector columns appended to the original data file with the
transformed data values.

Figure 7.13: Par file for maf_b

Chapter 8

Alternating Conditional Expectations (ACE)


Alternating Conditional Expectations (ACE) represents a departure from the forward and back transformation
methods that comprise the rest of this guidebook. ACE estimates the optimal transformations of a response and
multiple predictor variables to maximize the correlation and linearity of an additive regression. There are many
valuable applications for such a technique, but the foremost for geostatistical analyses include:

Identification of non-linear relationships that exist between modeled variables. These relationships are
often not apparent prior to the application of ACE, and increased understanding of such complexities
may improve subsequent modeling workflows.
Prediction of an unsampled response variable through the ACE regression of sampled predictor variables.
In the presence of non-linear multivariate relationships, ACE has been found to provide dramatically
improved prediction results through regression compared with more common techniques such as linear
least squares regression (LSR).

8.1 Transformation Concept

Additive multiple regression is a generalized term that involves the analysis of multiple independent predictor
variables to explain a single dependent response variable. Perhaps the most familiar and widely applied of this
family of techniques is linear least squares regression (LSR). The additive coefficient equation for LSR is given
by Equation 8.1; it may be observed that for a bivariate case, the LSR simply takes on the form of a line, while a
trivariate case will create a plane. This bivariate and trivariate LSR is applied to the Nickel Laterite data to
determine the coefficients, with the resultant functions displayed in Figure 8.1. With these line and plane
functions determined, the response variable y (Ni in this case) may be predicted where it is absent and the
predictor variables x_i are present.

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p    (8.1)

Figure 8.1: Single LSR of Ni given Fe produces a predictive line function (left), while multiple LSR of Ni given Fe and
SiO2 produces a predictive plane (right).
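The coefficient fit of Equation 8.1 reduces to an ordinary least squares solve. A minimal sketch (the function name is hypothetical, not CCG code):

```python
import numpy as np

def lsr_coefficients(X, y):
    """Solve Eq. 8.1 by least squares: y = b0 + b1*x1 + ... + bp*xp.
    X is an (n, p) array of predictors, y the (n,) response."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [b0, b1, ..., bp]
```

With p = 1 this yields the line of Figure 8.1 (left); with p = 2, the plane (right).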

A drawback of applying this regression is immediately apparent in Figure 8.1, as the assumption of linearity that is
intrinsic to the technique fails to describe a great deal of variation within the non-linear multivariate system. As a
result, regression for prediction with the LSR method will not be ideal with many geologic datasets that possess
complex relations. There are other non-linear parametric forms available for regression (exponential, quadratic,
etc.), but the enforcement of a mathematical functional form is unlikely to capture the true character of naturally
occurring non-linear features.

ACE provides a potential improvement over these parametric techniques in the presence of non-linear
relationships. The method is non-parametric and makes no assumptions of a distribution's functional form.
Through the estimation of the optimal transformation functions of the response and predictor variables, it
maximizes linearity and correlation of the multiple regression. Resultant transformation functions have improved
linearity for multiple regression, which will allow for LSR to be more effectively applied in prediction schemes.

This last statement may sound confusing, but it will be better demonstrated by the following synthetic example.

ACE Applied to a Parametric Multivariate System

The concept for the following synthetic example was drawn from Estimating Optimal Transformations for Multiple
Regression Using the ACE Algorithm (Wang and Murphy 2003), which is recommended as an excellent tutorial on
the technique. Consider a multivariate system, where a response variable Y is related to five independent
predictor variables X_1, ..., X_5 according to the complex parametric relationship seen in Equation 8.2, where the
predictor values are independently and randomly drawn from a uniform distribution U(-1,1).

Y = \log\left( \sin(X_1) + X_2 + (X_3)^2 + (X_4)^3 + X_5 \right)    (8.2)
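Drawing this synthetic system can be sketched as follows. Note an assumption of this sketch: the argument of the logarithm can be negative for some draws, and the guidebook does not state how that is handled, so only samples with a positive argument are kept here; the function name is hypothetical.

```python
import numpy as np

def draw_synthetic(n, seed=0):
    """Draw the Eq. 8.2 system: Y = log(sin(X1)+X2+X3^2+X4^3+X5),
    X1..X5 ~ U(-1,1). Samples with a non-positive log argument are
    discarded (an assumption; the guidebook does not specify)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 5))
    arg = np.sin(X[:, 0]) + X[:, 1] + X[:, 2] ** 2 + X[:, 3] ** 3 + X[:, 4]
    keep = arg > 0
    return X[keep], np.log(arg[keep])
```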

Now imagine that this complex relationship is unknown to an observer. Cross-plots in Figure 8.2 of the response
against each predictor variable demonstrate that this complex system is not remotely apparent through inspection
of the bivariate relationships. Conventional linear methods would struggle to decipher any sort of correlation
between the individual predictors and response in this case.

Remaining blind to the parametric relationship, ACE is applied to this multivariate system. A major advantage of
ACE lies in its 'plug-n-play' nature, since no assumptions of the functional form are required. Simply apply the
transformation and judge whether meaningful relationships have been revealed. When viewing the cross-plot of a
variable with its associated optimal transform function, the ACE determined relationship that the variable holds
within the multivariate system is displayed. The original variables are plotted against their ACE optimal
transformations in Figure 8.3.

Figure 8.2: Response variable (y-axis) vs the five predictor variables (x-axis)

Figure 8.3: Optimal transforms (y-axis) vs the associated original variables (x-axis)

Since this is a synthetic example, the plots in Figure 8.3 may now be compared with the known optimal transforms
that are drawn from Equation 8.2 (restated in Equation 8.3 below).

\theta^*(Y) = \exp(Y)
\phi^*(X_1) = \sin(X_1)
\phi^*(X_2) = X_2
\phi^*(X_3) = (X_3)^2    (8.3)
\phi^*(X_4) = (X_4)^3
\phi^*(X_5) = X_5

ACE has very accurately identified the complex parametric relationship of this system. One question that arises
is how to judge the success of ACE in uncovering underlying relationships when the answer is not known and
verifiable. The original authors of the technique (Breiman and Friedman 1985) prove that these transforms do
converge to optimal forms, but the validity of the functions may be partially judged by the linearity of the
associated regression. This simply involves summing the optimal predictor transformations \phi^*_i(X_i) for each
observation, and plotting the summed value \sum_{i=1}^{p} \phi^*_i(X_i) against the optimal response
transformation \theta^*(Y) (Figure 8.4).

Figure 8.4: Optimal response transform (y-axis) vs the sum of the optimal predictor transforms (x-axis).

Perfect correlation and prediction is seen, though this was to be expected given the accuracy of the individual
transforms observed in Figure 8.3. In cases where the ACE regression fails to resolve a great deal of the system
variability (less correlation and linearity in the regression plot), it may be interpreted that the response variable is
not fully explained by the transformed predictor variables.

8.2 Transformation Theory

Following the original authors (Breiman and Friedman 1985), let X_1, ..., X_p be the independent predictor variables
and Y be the dependent response variable, while \phi_1(X_1), ..., \phi_p(X_p) and \theta(Y) are the associated,
arbitrary zero-mean functions. Optimal transformation functions \phi^*_1(X_1), ..., \phi^*_p(X_p) and \theta^*(Y)
exist such that the fraction of variance (e^2) not explained by a regression of \theta(Y) on
\sum_{i=1}^{p} \phi_i(X_i) is minimized.

It has been proven that optimal transformations exist for the variable functions to minimize the e^2 value, as
calculated in Equation 8.4.

e^2 = \frac{E\left\{\left[\theta(Y) - \sum_{i=1}^{p} \phi_i(X_i)\right]^2\right\}}{E\{\theta^2(Y)\}}    (8.4)

Let E\{\theta^2(Y)\} = 1 (variance of 1), which simplifies Equation 8.4 to 8.5.

e^2 = E\left\{\left[\theta(Y) - \sum_{i=1}^{p} \phi_i(X_i)\right]^2\right\}    (8.5)
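The fraction of unexplained variance in Equation 8.5 translates directly to code once theta is standardized so that E{theta^2(Y)} = 1. A minimal sketch (hypothetical helper, for illustration only):

```python
import numpy as np

def unexplained_variance(theta_y, phi_sum):
    """Eq. 8.5: e2 = E{[theta(Y) - sum_i phi_i(X_i)]^2}, with theta
    standardized to zero mean and unit variance."""
    theta = (theta_y - theta_y.mean()) / theta_y.std()
    return np.mean((theta - phi_sum) ** 2)
```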

An iterative algorithm is used to find the optimal transformations which minimize Equation 8.5. This algorithm
will, in an alternating fashion, attempt to minimize Equation 8.5 with respect to one function while holding the
other functions constant. These equations take on the form:

\phi_i(X_i) = E\left[\theta(Y) - \sum_{j \neq i} \phi_j(X_j) \,\middle|\, X_i\right]

\theta(Y) = \frac{E\left[\sum_{i=1}^{p} \phi_i(X_i) \,\middle|\, Y\right]}{\left\| E\left[\sum_{i=1}^{p} \phi_i(X_i) \,\middle|\, Y\right] \right\|}    (8.6)

The iterative minimization of the conditional expectation functions shown above is where Alternating Conditional
Expectations derives its name. Readers interested in a more thorough explanation of the theory are referred to the
original paper by Breiman and Friedman (1985), as they demonstrate the application of the two above equations
within a simplified FORTRAN code sequence that makes intuitive sense. Additional considerations in the practical
application of ACE, including its sensitivity to ordering, are also discussed at length.
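For a single predictor, the alternating scheme of Equation 8.6 can be sketched as below. This is a hypothetical illustration: the data smoother of the original FORTRAN (a supersmoother) is replaced here with a simple binned conditional mean, and the function names are inventions of this sketch.

```python
import numpy as np

def bin_smooth(x, target, nbins=50):
    """Estimate E[target | x] with equal-width binned means (a crude
    stand-in for the smoother used in the original ACE code)."""
    edges = np.linspace(x.min(), x.max(), nbins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, nbins - 1)
    out = np.empty_like(target)
    for b in range(nbins):
        mask = idx == b
        if mask.any():
            out[mask] = target[mask].mean()
    return out

def ace_bivariate(x, y, niter=50):
    """Minimal single-predictor ACE: alternate the two conditional
    expectations of Eq. 8.6 until phi(x) and theta(y) stabilize."""
    theta = (y - y.mean()) / y.std()
    phi = np.zeros_like(x)
    for _ in range(niter):
        phi = bin_smooth(x, theta)        # phi(x) <- E[theta(Y) | X]
        phi -= phi.mean()
        theta = bin_smooth(y, phi)        # theta(y) <- E[phi(X) | Y]
        theta = (theta - theta.mean()) / theta.std()  # unit variance
    return phi, theta
```

At convergence, a cross-plot of phi against theta is the regression plot of Figure 8.4: high linear correlation indicates the transforms have resolved the relationship.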

ACE for Regression Prediction

Once the ACE optimal transformation functions have been determined at collocated locations of response and
predictor variables, it may be of great interest within geostatistical frameworks to predict the response variable
where only predictor variables exist. The ACE prediction function is determined by applying an additive linear
regression of the optimal ACE transformed predictor functions on the untransformed response variable (Breiman
and Friedman 1985). The resultant function may then be applied where predictor variables are present in order to
estimate the response. This methodology is implemented in the ACE prediction program, ace_post.

Breiman, L. and Friedman, J. 1985. Estimating optimal transformations for multiple regression and
correlation. Journal of the American Statistical Association, V. 80, No. 391, pp. 580-598.

Intuition may call into question why the untransformed response variable is used by ace_post, since greater
linearity is observed in an additive regression of the optimal predictor functions on the optimal response
transformation. Estimated values in that case, however, would be of the optimal response transformation.
Back-transformation of the optimal response transformation estimate would then be required to obtain the
response. What has just been described is the non-linear transformation of a linear estimate; as discussed in
Section 1.1, this is very likely to introduce a bias.
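The ace_post idea described above reduces to a linear regression of the untransformed response on the summed optimal predictor transforms. A hypothetical sketch (not the ace_post code itself):

```python
import numpy as np

def ace_post_fit(phi_sum, y):
    """Fit the prediction function: regress the *untransformed* response y
    on the summed optimal predictor transforms, then return a callable
    that predicts y from new summed transforms."""
    A = np.column_stack([np.ones_like(phi_sum), phi_sum])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda s: a + b * s
```

Because the fit is against the raw response, no back-transformation of the estimate is needed, avoiding the bias discussed above.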

8.3 Example

Applying ACE to the Nickel Laterite dataset, the bivariate relationship of Ni and Fe will first be examined. As seen in
Figure 8.5, the correlation of this system is greatly improved by the ACE transform, though Ni is not fully explained
by the Fe regression. This is evident because a cross-plot of the transform functions leaves a great deal of scatter,
or unresolved variance.

Figure 8.5: Original bivariate relationship of Ni (response) and Fe (predictor) (top left), optimal response vs optimal
predictor (Fe) (top right), optimal response vs. response (bottom left) and optimal predictor vs. predictor (bottom
right).

Adding SiO2 and MgO as predictor variables, one would expect the ACE regression of Ni to improve, should these
additional variables be correlated with respect to Ni and not entirely redundant with respect to Fe. Figure 8.6
displays the original bivariate cross-plots between Ni and the three predictor variables, while Figure 8.7 displays
cross-plots between the optimal transformations and their original forms. The cross-plot between the transformed
response and the sum of the transformed predictors (multiple regression plot) is also displayed in Figure 8.7, and
the improved correlation (relative to the bivariate regression in Figure 8.5) confirms that predictive value was
extracted from these additional variables.

Figure 8.6: Original bivariate cross-plots between response (Ni) and the three predictor variables (Fe, SiO2 and
MgO).

Figure 8.7: Original variables vs their optimal transforms (left) and the sum of the optimal predictor transforms vs
the optimal response transformation (right).

ACE Prediction of Nickel

ACE will now be used for regression prediction of Ni, given the three predictor variables that were used in the most
recent example (Fe, SiO2 & MgO). To determine the degree of value which the non-linear nature of ACE lends to the
prediction, multiple LSR will also be run for comparison. So that the predicted values of each method may be
validated, Ni values will first be randomly removed from the dataset. This will allow for cross-validation between
true and predicted Ni following the regression prediction. The workflow is as follows:

1. 3000 sample numbers are randomly generated for the removal of Ni. Due to the generation of duplicate
sample numbers, 2487 Ni samples were actually removed.
2. The ace program is executed on the remaining 5311 samples that contain the Ni (response) and predictor
variables (Fe, SiO2 & MgO). Likewise, LSR is applied to the same sample pool to determine the additive
regression coefficients. Both techniques were performed blind of the removed Ni values.
3. The ace_post program is run using the ACE transform functions and the predictor variables for the 2487 sample
intervals of removed Ni. Likewise, the LSR coefficients were applied for prediction on the same sample
pool.
4. Predicted values using ACE and LSR are cross-validated against the true values of Ni that were removed in
Step 1.
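The random-removal step can be sketched as below. Note a deliberate difference from Step 1: sampling without replacement removes exactly the requested count, avoiding the duplicate sample numbers described above. The function name is hypothetical.

```python
import numpy as np

def holdout_split(n, n_remove, seed=0):
    """Randomly withhold n_remove sample indices for cross-validation.
    Returns a boolean keep-mask and the removed indices."""
    rng = np.random.default_rng(seed)
    removed = rng.choice(n, size=n_remove, replace=False)
    keep = np.ones(n, dtype=bool)
    keep[removed] = False
    return keep, removed
```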

The cross-validation results are shown in Figure 8.8, and a dramatic improvement is seen in the ACE predicted
values, relative to the LSR predicted values. Non-linear features clearly lend a great deal of value to the prediction
of Ni, as the ACE prediction represents a dramatic improvement in accuracy over the linear LSR, as measured by
the mean squared error, covariance and correlation coefficient.

Figure 8.8: Regression predicted Ni using LSR (above) and ACE (below), cross-validated against the
removed true values of Ni. Note that LSQ Regression and LSR are used interchangeably here for describing linear
least squares regression.

ACE Prediction in the Absence of Non-linear Features

While ACE has proven its value in the prediction of a non-linear system, a concern that may be raised is whether
the technique will attempt to overfit multivariate systems where non-linear features do not exist. The original
authors (Breiman and Friedman 1985) discuss this issue, noting that ACE should be handled carefully in the
presence of outliers. To investigate this issue, the prediction through regression workflow that was used above on
the Ni Laterite dataset will be applied to a multivariate normal distribution.

A trivariate normal distribution is composed of a response variable Y and two predictor variables X1
and X2 for 10,000 samples. 3320 Y samples are randomly removed from this distribution, leaving 6680 observations
of the random variables on which to apply ACE (Figure 8.9). As seen in Figure 8.10, the ACE regression produces
virtually no correlation (0.062), confirming that it has not significantly overfit any outliers in this normal
distribution. Comparing the cross-validation of the ACE and linear LSR predicted Y values in Figure 8.11, the two
techniques are seen to produce very similar results.

Figure 8.9: Response variable plotted against the two predictor variables for a multivariate normal distribution of
6680 observations.

Figure 8.10: Cross-plot between the response transformation and the sum of the predictor transformations (top
right), as well as cross-plots between the variables and their optimal transformations.

Figure 8.11: Regression predicted Y response variable using ACE_POST (left) and LSR (right), cross-validated against
the removed true values of Y. Note that LSQ Regression and LSR are used interchangeably here for describing
linear least squares regression.

As a smaller sampling of the Gaussian distribution will have a larger influence from outlying observations, a
trivariate system of 1000 samples was created using identical properties to the previous example. 327 samples of
the Y response variable were randomly removed for prediction and cross-validation, leaving 673 samples available
for ACE and linear LSR (Figure 8.12). Figure 8.13 shows that the outliers do have a larger influence on ACE
with this smaller dataset, as a higher correlation is seen (0.247) following transformation and regression. This
correlation remains extremely low, however, relative to the other ACE examples that have been shown in this
chapter where true correlation exists between the predictors and response. Cross-validation of the predicted Y
response in Figure 8.14 shows that while linear LSR does have a marginally better mean squared error compared to
ACE, the two methods remain very comparable in the absence of non-linear features.

It is encouraging to see that while ACE represents a vast improvement over linear LSR for non-linear multivariate
systems, it also remains highly competitive with linear LSR in the absence of such features. Caution and inspection
of results are still recommended when working with highly skewed data or any other data that has significant
outliers.

Figure 8.12: Response variable plotted against the two predictor variables for a multivariate normal distribution of
673 observations.

Figure 8.13: Cross-plot between the response transformation and the sum of the predictor transformations (top
right), as well as cross-plots between the variables and their optimal transformations.

Figure 8.14: Regression predicted Y response variable using ACE_POST (left) and LSR (right), cross-validated against
the removed true values of Y. Note that LSQ Regression and LSR are used interchangeably here for describing
linear least squares regression.

8.4 Software (ace & ace_post)

The first Alternating Conditional Expectations program, ace, is used to construct the optimal transformation
functions, using an adapted subroutine from the original ACE authors (Breiman and Friedman, 1985). The
corresponding required parameters are shown in Figure 8.15 and are explained below:

datafl: file with the input data to be transformed.


rcol, wcol: columns for the response variable and weight variable. If no weighting is required, insert a
zero value.
npred: number of predictor variables.
pcol(i), i=1,...,npred: columns for predictor variables.
tmin, tmax: trimming limits to filter out data
l(i), i=1,...,npred+1: functional form of the predictor variables (ordered as they are in the pcol line,
followed by the response variable).
outfl: file for output. This file contains npred + 2 columns appended to the original data file with the
optimal transformations and sum of the optimal predictor transformations.

Figure 8.15: Par file for ace

The second Alternating Conditional Expectations program, ace_post, is used to predict a response variable where
the predictor variables are present, based on ace output from collocated locations of response and predictor
variables. The corresponding required parameters are shown in Figure 8.16 and are explained below:

datafl: file with collocated response and predictor optimal transformations, output from the ace
program.
tmin, tmax: trimming limits for transformed and original variables.
rcol, sumcol: column numbers for the response variable and sum of the optimal predictor
transformations.
npred: number of predictor variables.

pcol(i), i=1,...,npred: columns for the predictor variables.
tpcol(i), i=1,...,npred: columns for the optimal predictor transformations.
outmeanfl: this file contains regularly spaced occurrences of the response and corresponding sum of
the optimal predictor transformations which fall along the linear regression line that is used as the
prediction function. It is made available for plotting and validation purposes.

Figure 8.16: Par file for ace_post

Chapter 9

Histogram Reproduction (TRANS)


Under ideal circumstances, simulated values will match the declustered histograms of the original variables
following back-transformation. Unfortunately, every step of a modeling workflow has the potential to introduce
issues with precise reproduction, including stationarity decisions, transformations, variogram modeling, etc. While
practitioners should pursue as close a reproduction as may reasonably be obtained through exploration and
sensitivity analysis of the modeling workflow, a final transformation may ultimately be required to ensure
reproduction of the original distribution. This is likely to be of particular importance in the case of the primary
resource variable.

Given the cumulative distribution function of a simulated realization, or source function F_s(z), transform the
simulated z values to match the declustered distribution, or target distribution F_t(z), through a matching of
probabilities as seen in Equation 9.1 (Deutsch and Journel 1998). This transformation is graphically represented in
Figure 9.1.

z_t = F_t^{-1}(F_s(z))    (9.1)

Figure 9.1: Schematic illustration of the quantile-to-quantile matching of a simulated source distribution with a
declustered target distribution.
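The quantile matching of Equation 9.1 can be sketched with empirical CDFs and interpolation; this is a hypothetical illustration (function name and tie/tail handling are assumptions of the sketch), not the trans program itself:

```python
import numpy as np

def quantile_match(z, target, target_wts=None):
    """Eq. 9.1 sketch: z_t = F_t^{-1}(F_s(z)) via empirical CDFs.
    target_wts are optional declustering weights on the target."""
    # source CDF F_s evaluated at each z (rank transform, avoiding 0 and 1)
    p = (np.argsort(np.argsort(z)) + 0.5) / len(z)
    # target quantile function F_t^{-1} from the (weighted) empirical CDF
    order = np.argsort(target)
    t = target[order]
    w = np.ones(len(t)) if target_wts is None else np.asarray(target_wts)[order]
    cdf = (np.cumsum(w) - 0.5 * w) / w.sum()
    return np.interp(p, cdf, t)
```

Because the mapping is a monotone quantile-to-quantile match, ranks of the simulated values are preserved while the histogram is replaced by the target.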

The naive application of Equation 9.1 would allow for a perfect reproduction of the original declustered
distribution, but would give no consideration to the spatial location of the simulated values. Values simulated very
close to sample locations should honour conditioning data, with a relatively low kriging variance \sigma_k^2
observed in the conditional distributions from which the realizations are drawn. Values simulated at great
distances from the data will have a comparatively high uncertainty, with the associated kriging variance of the
conditional distributions approaching the global variance \sigma_G^2. The trans program (Deutsch and Journel
1998) is implemented to take advantage of this observation through a weighting of the final corrected z_c
transformed values, according to the magnitude of the kriging variance for the simulated value (Equation 9.2).

z_c = w_c z_t + w_d z

where w_c = \left( \frac{\sigma_k^2}{\sigma_G^2} \right)^{w}    (9.2)

w_d = 1 - w_c

As can be seen, the final corrected value will be a weighted combination of the source z value and the target z_t
value, with the latter receiving increasing influence as the kriging variance increases. In doing so, conditioning
data will be honoured at the sample locations, with the histogram correction being more heavily enforced in a
smoothly increasing manner away from the data. Consequently, however, the target distribution will never be
perfectly reproduced.
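The blending of Equation 9.2 is a one-liner; a hypothetical sketch (not the trans FORTRAN):

```python
import numpy as np

def trans_correct(z, z_t, krig_var, global_var, w=1.0):
    """Eq. 9.2: blend the simulated value z with its quantile-matched
    target z_t according to the local kriging variance."""
    w_c = (np.asarray(krig_var) / global_var) ** w
    return w_c * z_t + (1.0 - w_c) * z
```

At data locations (kriging variance of zero) the simulated value is honoured; far from data (kriging variance approaching the global variance) the correction is fully enforced.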

As priorities will vary among individual projects and practitioners, users specify the w power in Equation 9.2 to
decrease (large w value) or increase (small w value) the magnitude of w_c accordingly. Refer to the trans
parameter file description at the end of the chapter for further details.

The cumulative distribution functions for four arbitrarily selected Nickel realizations (source F_s(z)) and the
original declustered data distribution (target F_t(z)) are displayed in Figure 9.2. The trans program was applied to
the simulated values, with the corrected realizations (z_c) along with the target F_t(z) shown in Figure 9.3. Nearly
perfect histogram reproduction is achieved by the transformation, though the improvement may be more easily
observed in the quantile-to-quantile plots of Figure 9.4, where the four realizations are considered as a whole
against the declustered distribution.

Figure 9.2: CDFs of four uncorrected Ni realizations and the target declustered data distribution.

Figure 9.3: CDFs of four corrected Ni realizations and the target declustered data distribution.

Figure 9.4: Quantile-to-quantile plots of i) simulated Nickel realizations vs the declustered data distribution (left)
and ii) TRANS corrected Nickel realizations vs. the declustered data distribution (right).

The parameters for the trans program are displayed in Figure 9.5 and explained below. The program has not been
modified from its original GSLIB form, and the following information simply repeats the second edition of GSLIB
(Deutsch and Journel 1998). Please refer to section VI.2.7 of that book for additional information.

vartype: the variable type (1=continuous, 0=categorical).


refdist: the input data file with the target distribution and weights.
ivr, iwt: the input column for the values and the column for the (declustering) weight. If there are no
declustering weights, then set iwt=0.
datafl: the input file with the distribution(s) to be transformed.
ivrd, iwtd: the column for the values and the declustering weights (0 if none).
tmin, tmax: all values strictly less than tmin and strictly greater than tmax are ignored.
outfl: output file for the transformed values
nx, ny, nz: size of the 3-D model (for categorical variables). When transforming categorical variables it is
essential to consider some type of tie-breaking scheme. A moving window (of the following size) is
considered for tie-breaking when considering a categorical variable.
wx, wy, wz: size of 3D window for categorical variable tie-breaking.
nxyz: the number to transform at a time (when dealing with a continuous variable). Recall that nxyz will
be considered nsets times.
zmin, zmax: are the minimum and maximum values that will be used for extrapolation in the tails.
ltail, ltpar: specify the back-transformation implementation in the lower tail of the distribution: ltail=1
implements linear interpolation to the lower limit zmin and ltail=2 implements power model
interpolation, with w=ltpar, to the lower limit zmin.
utail, utpar: specify the back-transformation implementation in the upper tail of the distribution: utail=1
implements linear interpolation to the upper limit zmax, utail=2 implements power model
interpolation, with w=utpar, to the upper limit zmax, and utail=4 implements hyperbolic model
extrapolation with w=utpar.
transcon: constrain transformation to honor local data? (1=yes, 0=no).
estvfl: an input file with the estimation variance (must be of size nxyz).

icolev: column number in estvfl for the estimation variance.
omega: the control parameter for how much weight is given to the original data (0.33 < w < 3.0).
seed: random number seed used when constraining a categorical variable transformation to local data.

Figure 9.5: Par file for the trans program

Chapter 10

Chained Transformation Workflows


This summary chapter will provide a general set of ordering guidelines with which to apply the transformations
that have been presented in the preceding chapters. This chapter is not intended as a rigid set of protocols that
must be followed, as there are many valid permutations of chained transformation workflows that may be applied.
Given a complex multivariate observation vector Z, a practitioner may attempt to transform it in order to obtain a
well behaved distribution Y. The transformation tools which are required to achieve this end will depend on the
dimensionality of the data being considered, the complexities within the multivariate distribution, and the
simulation or estimation method to be applied. It may be as simple as applying a normal score transformation to
ensure marginal Gaussianity, or as complex as the complete workflow that is presented in Figure 10.1 to achieve
multivariate Gaussianity and decorrelated variables. Regardless of whether one or all of the available
transformations are being applied, Figure 10.1 provides the general order in which the multivariate features should
be addressed. Major points of emphasis include:

Should logratios be required to remove compositional constraints, they must be applied in the first step of
the workflow. If any other transformation is applied before logratios in the forward transformation, it
follows that it will have to be applied after logratios in the back-transformation. It is very likely that the
compositional constraint would not be reproduced under those circumstances. Note that an exception to
this rule exists if the trans program is required for histogram reproduction.
Removal of non-linear and heteroscedastic features precedes decorrelation for two major reasons: i) any
transformations following decorrelation have the potential to reintroduce a measure of correlation and ii)
decorrelation methods such as PCA and MAF more accurately decompose linear and homoscedastic
distributions, and should therefore be preceded by transformations such as conditional standardization.
A final normal score transformation should be applied to all variables which require simulation to ensure
that, at a minimum, marginal Gaussianity is enforced. While this final transformation has the potential to
reintroduce a measure of correlation as mentioned in the point above, this cannot be avoided if Gaussian
simulation is to be applied.
Back-transformations are undone in the reverse order of which they were applied going forward.
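The reverse-order rule in the last point can be captured in a small chained-workflow sketch; the class and the stand-in transforms are hypothetical illustrations, not CCG code:

```python
class TransformChain:
    """Apply forward transforms in order; undo them in reverse order,
    mirroring the workflow of Figure 10.1."""

    def __init__(self, steps):
        self.steps = steps  # list of (forward, backward) callable pairs

    def forward(self, z):
        for fwd, _ in self.steps:
            z = fwd(z)
        return z

    def backward(self, y):
        # back-transformations are undone in reverse order
        for _, bwd in reversed(self.steps):
            y = bwd(y)
        return y
```

Any chain built this way round-trips a value exactly, provided each step's backward function inverts its forward function.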

Figure 10.1: Generalized forward transformation workflow where a complex multivariate distribution is
transformed to an approximate multivariate Gaussian distribution for geostatistical modeling.

Selecting specific transformations from Figure 10.1, two demonstrative and more specific workflows are displayed
in Figures 10.2 and 10.3, which revolve around the stepwise conditional transformation and PCA/MAF respectively.
Every case study within this Guidebook (apart from ACE) is derived from one of these two workflows, although
specific steps and the associated intermediate X_i observation vectors are generally all that is displayed. Points of
emphasis regarding Figure 10.2 include:

The normal score transformation preceding the SC transformation is optional, but may be recommended
in the case of nested SC applications where multiple first conditioning variables are used. Refer to
Figure 4.9 and the associated description for more information on this item.
Comparing the SC workflow with a potential PCA/MAF workflow in Figure 10.3, it is apparent that the latter
has the potential to involve more intermediate steps. An attractive feature of SC is indeed that complex
features are removed, variables are decorrelated, and marginal distributions are made Gaussian in a single
step. In the case of many variables or very few data, however, the SC transformation may present
implementation challenges in removing correlation between all variables. Furthermore, it does not
provide decorrelation beyond a lag distance of zero. PCA/MAF may be appropriate if such issues are a
cause for concern.
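A minimal two-variable sketch of the SC transformation is given below, assuming a class-based implementation where the second variable is normal scored within probability classes of the first. With a modest number of classes some residual correlation remains, shrinking as more classes (and data) are available, which hints at the implementation challenges noted above:

```python
import numpy as np
from statistics import NormalDist

def nscore(v):
    """Rank-based transform of a 1-D array to standard Gaussian scores."""
    r = np.argsort(np.argsort(v))
    return np.array([NormalDist().inv_cdf((ri + 0.5) / len(v)) for ri in r])

def stepwise_conditional(y1, y2, n_bins=10):
    """Two-variable stepwise conditional sketch: y1 is normal scored
    globally; y2 is normal scored within probability classes of y1,
    removing correlation between the pair at lag 0."""
    z1 = nscore(y1)
    edges = np.quantile(z1, np.linspace(0.0, 1.0, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, z1, side="right") - 1, 0, n_bins - 1)
    z2 = np.empty_like(z1)
    for b in range(n_bins):
        sel = bins == b
        z2[sel] = nscore(y2[sel])
    return z1, z2
```

With many variables, each additional variable must be conditioned on classes of all previous ones, so the data count per class falls rapidly, which is the root of the few-data limitation discussed above.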

Figure 10.2: Generalized forward and backward modeling workflow revolving around the logratio and stepwise
conditional transformations.

Points of emphasis regarding Figure 10.3 include:

The normal score transformation preceding conditional standardization is optional, though highly
skewed distributions may present issues in the calculation of stable conditional statistics along the
margins of the distribution. While the associated CCG program expands the search for
conditioning data beyond bin limits, a normal score transformation is another option for condensing a
distribution and avoiding these potential issues.
The normal score transformation preceding the PCA/MAF application serves very different purposes. First, a
zero mean is required of the variables in order to apply PCA/MAF. While this may be achieved simply by
subtracting the mean to produce residuals, a normal score transformation is recommended at this step if
any outlying values are present. The subsequent PCA/MAF transformation is based on a
decomposition of the covariance matrix, and this statistical measure is very sensitive to outliers.
PCA and MAF are linear in nature and may therefore facilitate the unbiased transformation of estimates.
To allow for this, however, non-linear transformations must not be used in conjunction with them.
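A minimal PCA sketch based on an eigendecomposition of the covariance matrix is given below; it illustrates both the zero-lag decorrelation of the rotated scores and the linearity that allows the back-rotation to commute with averaging of estimates. The function names are hypothetical:

```python
import numpy as np

def pca_fit(x):
    """Eigendecomposition of the covariance matrix of column-centered data.
    Returns the column means and the orthonormal eigenvector matrix."""
    mu = x.mean(axis=0)
    cov = np.cov(x - mu, rowvar=False)
    _, vecs = np.linalg.eigh(cov)
    return mu, vecs

def pca_forward(x, mu, vecs):
    return (x - mu) @ vecs          # rotated scores are uncorrelated at lag 0

def pca_backward(y, mu, vecs):
    return y @ vecs.T + mu          # orthogonal rotation inverts exactly
```

Because `pca_backward` is linear, back-transforming the average of several score vectors gives the same result as averaging their back-transformed values, which is why kriged estimates of the factors can be back-transformed without bias; this no longer holds once non-linear steps intervene.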

Figure 10.3: Generalized forward and backward modeling workflow revolving around the logratio, conditional
standardization and PCA/MAF transformations.

Bibliography

Almeida, AS, Journel, AG. 1994. Joint simulation of multiple variables with a Markov-type coregionalization model.
Mathematical Geology. Vol.26: 369-386.

Boisvert, JB, Rossi, ME, Deutsch, CV. 2009. Multivariate Geostatistical Simulation of Proportions and Nonadditive
Geometallurgical Variables. Centre for Computational Geostatistics Annual Report. Vol. 9: 303-1 to 303-8.

Breiman, L, Friedman, JH. 1985. Estimating Optimal Transformations for Multiple Regression and Correlation.
Journal of the American Statistical Association. Vol.80, No.391: 580-598.

Desbarats, AJ, Dimitrakopoulos, R. 2000. Geostatistical simulation of regionalized pore-size distributions using
min/max autocorrelation factors. Mathematical Geology. Vol.32, No.8: 919-942.

Deutsch, CV, Journel, AG. 1998. GSLIB: Geostatistical Software Library and Users Guide: 2nd edition. New York:
Oxford University Press.

Deutsch, CV. 2006. Stepwise Conditional Transformation in Estimation Mode. Centre for Computational
Geostatistics Annual Report. Vol. 8: 122-1 to 122-9.

Elogne, S, Leuangthong, O. 2008. Implementation of Min/Max Autocorrelation Factors and Application to a Real
Data Example. Centre for Computational Geostatistics Annual Report. Vol. 10: 406-1 to 406-6.

Isaaks, E. 1990. The Application of Monte Carlo Methods to the Analysis of Spatially Correlated Data. PhD thesis,
Stanford University.

Johnson, RA, Wichern, DW. 1988. Applied Multivariate Statistical Analysis. New Jersey: Prentice Hall. p. 340-370.

Journel, AG, Huijbregts, CJ. 1978. Mining Geostatistics. New York: Academic Press.

Klovan, JE. 1966. The use of factor analysis in determining depositional environments from grain-size distributions.
Journal of Sedimentary Petrology. Vol.36: 115-125.

Leuangthong, O, Deutsch, CV. 2003. Stepwise Conditional Transformation for Simulation of Multiple Variables.
Mathematical Geology. Vol.35, No.2: 155-172.

Leuangthong, O. 2003. Stepwise Conditional Transformation for Multivariate Geostatistical Simulation, PhD Thesis,
University of Alberta.

Luster, GR. 1985. Raw Materials for Portland Cement: Applications of Conditional Simulation of Coregionalization.
PhD thesis, Stanford University.

96
Lyster, S, Deutsch, CV. 2004. PostMG: A Postprocessing Program for Multigaussian Kriging Output. Centre for
Computational Geostatistics Annual Report. Vol. 6: 405-1 to 405-5.

Manchuk, J. 2008. CCG Guidebook Series, Vol.7, Guide to Geostatistics with Compositional Data. University of
Alberta: Centre for Computational Geostatistics.

Pawlowsky-Glahn, V, Egozcue, JJ. 2006. Compositional data and their analysis: an introduction. In: Buccianti, A, Mateu-
Figueras, G, and Pawlowsky-Glahn, V (eds.), Compositional data analysis in the geosciences: from theory to
practice. London: Geological Society Special Publication. Vol.264: 1-10.

Ortiz, JM, Deutsch, CV. 2003. Uncertainty Upscaling. Centre for Computational Geostatistics Annual Report. Vol. 5,
Section 22: 1-14.

Rosenblatt, M. 1952. Remarks on a multivariate transformation. Annals of Mathematical Statistics. Vol.23, No.3:
470-472.

Shlens, J. 2003. A Tutorial on Principal Component Analysis: Derivation, Discussion and Singular Value
Decomposition. [Internet]. [cited 2011 April 22]. Available from:
http://www.cfm.brown.edu/people/gk/APMA2821F/PCA-Tutorial-Intuition_jp.pdf

Switzer, P, Green, A. 1984. Min/Max autocorrelation factors for multivariate spatial imaging. Stanford University:
Department of Statistics, Technical Report No.6. 14p.

Wang, D, Murphy, M. 2004. Estimating Optimal Transformations for Multiple Regression Using the ACE Algorithm.
Journal of Data Science. Vol.2: 329-346.

Xu, G, Datta-Gupta, A. 1996. A new approach to seismic data integration during reservoir characterization using
optimal non-parametric transformations. Society of Petroleum Engineers: 1996 SPE Annual Technical Conference
and Exhibition, Denver, Colorado.

Zwahlen, E, Patzek, T. 1997. A comparison of mapping schemes for reservoir characterization. Society of Petroleum
Engineers: 1997 SPE Western Regional Meeting. Long Beach, CA.

