Details

Compositional Data Analysis


Theory and Applications
1st ed.

by: Vera Pawlowsky-Glahn, Antonella Buccianti

80,99 €

Publisher: Wiley
Format: EPUB
Published: 24 August 2011
ISBN/EAN: 9781119977612
Language: English
Number of pages: 400

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

It is difficult to imagine that the statistical analysis of compositional data has been a major issue of concern for more than 100 years. It is even more difficult to realize that so many statisticians and users of statistics are unaware of the particular problems affecting compositional data, as well as their solutions. The issue of "spurious correlation", as the situation was phrased by Karl Pearson back in 1897, affects all data that measure parts of some whole, such as percentages, proportions, ppm and ppb. Such measurements are present in all fields of science, including geology, biology, environmental sciences, forensic sciences, medicine and hydrology.

This book presents the history and development of compositional data analysis along with Aitchison's log-ratio approach. Compositional Data Analysis describes the state of the art in both theory and applications across the different fields of science.

Key Features:

• Reflects the state of the art in compositional data analysis.
• Gives an overview of the historical development of compositional data analysis, as well as basic concepts and procedures.
• Looks at advances in algebra and calculus on the simplex.
• Presents applications in different fields of science, including genomics, ecology, biology, geochemistry, planetology, chemistry and economics.
• Explores connections to correspondence analysis and the Dirichlet distribution.
• Presents a summary of three available software packages for compositional data analysis.
• Supported by an accompanying website featuring R code.

Applied scientists working on compositional data analysis in any field of science, whether in academia or in professional practice, will benefit from this book, along with graduate students in any field of science working with compositional data.
Preface
List of Contributors

Part I Introduction

1 A Short History of Compositional Data Analysis
John Bacon-Shone
1.1 Introduction
1.2 Spurious Correlation
1.3 Log and Log-Ratio Transforms
1.4 Subcompositional Dependence
1.5 alr, clr, ilr: Which Transformation to Choose?
1.6 Principles, Perturbations and Back to the Simplex
1.7 Biplots and Singular Value Decompositions
1.8 Mixtures
1.9 Discrete Compositions
1.10 Compositional Processes
1.11 Structural, Counting and Rounded Zeros
1.12 Conclusion
Acknowledgement
References

2 Basic Concepts and Procedures
Juan José Egozcue and Vera Pawlowsky-Glahn
2.1 Introduction
2.2 Election Data and Raw Analysis
2.3 The Compositional Alternative
2.3.1 Scale Invariance: Vectors with Proportional Positive Components Represent the Same Composition
2.3.2 Subcompositional Coherence: Analyses Concerning a Subset of Parts Must Not Depend on Other Non-Involved Parts
2.3.3 Permutation Invariance: The Conclusions of a Compositional Analysis Should Not Depend on the Order of the Parts
2.4 Geometric Settings
2.5 Centre and Variability
2.6 Conclusion
Acknowledgements
References

Part II Theory – Statistical Modelling

3 The Principle of Working on Coordinates
Glòria Mateu-Figueras, Vera Pawlowsky-Glahn and Juan José Egozcue
3.1 Introduction
3.2 The Role of Coordinates in Statistics
3.3 The Simplex
3.3.1 Basis of the Simplex
3.3.2 Working on Orthonormal Coordinates
3.4 Move or Stay in the Simplex
3.5 Conclusions
Acknowledgements
References

4 Dealing with Zeros
Josep Antoni Martín-Fernández, Javier Palarea-Albaladejo and Ricardo Antonio Olea
4.1 Introduction
4.2 Rounded Zeros
4.2.1 Non-Parametric Replacement of Rounded Zeros
4.2.2 Parametric Modified EM Algorithm for Rounded Zeros
4.3 Count Zeros
4.4 Essential Zeros
4.5 Difficulties, Troubles and Challenges
Acknowledgements
References

5 Robust Statistical Analysis
Peter Filzmoser and Karel Hron
5.1 Introduction
5.2 Elements of Robust Statistics from a Compositional Point of View
5.3 Robust Methods for Compositional Data
5.3.1 Multivariate Outlier Detection
5.3.2 Principal Component Analysis
5.3.3 Discriminant Analysis
5.4 Case Studies
5.4.1 Multivariate Outlier Detection
5.4.2 Principal Component Analysis
5.4.3 Discriminant Analysis
5.5 Summary
Acknowledgement
References

6 Geostatistics for Compositions
Raimon Tolosana-Delgado, Karl Gerald van den Boogaart and Vera Pawlowsky-Glahn
6.1 Introduction
6.2 A Brief Summary of Geostatistics
6.3 Cokriging of Regionalised Compositions
6.4 Structural Analysis of Regionalised Composition
6.5 Dealing with Zeros: Replacement Strategies and Simplicial Indicator Cokriging
6.6 Application
6.6.1 Delimiting the Body: Simplicial Indicator Kriging
6.6.2 Interpolating the Oil–Brine–Solid Content
6.7 Conclusions
Acknowledgements
References

7 Compositional VARIMA Time Series
Carles Barceló-Vidal, Lucía Aguilar and Josep Antoni Martín-Fernández
7.1 Introduction
7.2 The Simplex S^D as a Compositional Space
7.2.1 Basic Concepts and Notation
7.2.2 The Covariance Structure on the Simplex
7.3 Compositional Time Series Models
7.3.1 C-Stationary Processes
7.3.2 C-VARIMA Processes
7.4 CTS Modelling: An Example
7.4.1 Expenditure Shares in the UK
7.4.2 Model Selection
7.4.3 Estimation of Parameters
7.4.4 Interpretation and Comparison
7.5 Discussion
Acknowledgements
References
Appendix

8 Compositional Data and Correspondence Analysis
Michael Greenacre
8.1 Introduction
8.2 Comparative Technical Definitions
8.3 Properties and Interpretation of LRA and CA
8.4 Application to Fatty Acid Compositional Data
8.5 Discussion and Conclusions
Acknowledgements
References

9 Use of Survey Weights for the Analysis of Compositional Data
Monique Graf
9.1 Introduction
9.2 Elements of Survey Design
9.2.1 Randomization
9.2.2 Design-Based Estimation
9.3 Application to Compositional Data
9.3.1 Weighted Arithmetic and Geometric Means
9.3.2 Closed Arithmetic Mean of Amounts
9.3.3 Centred Log-Ratio of the Geometric Mean Composition
9.3.4 Closed Geometric Mean Composition
9.3.5 Example: Swiss Earnings Structure Survey (SESS)
9.4 Discussion
References

10 Notes on the Scaled Dirichlet Distribution
Gianna Serafina Monti, Glòria Mateu-Figueras and Vera Pawlowsky-Glahn
10.1 Introduction
10.2 Genesis of the Scaled Dirichlet Distribution
10.3 Properties of the Scaled Dirichlet Distribution
10.3.1 Graphical Comparison
10.3.2 Membership in the Exponential Family
10.3.3 Measures of Location and Variability
10.4 Conclusions
Acknowledgements
References

Part III Theory – Algebra and Calculus

11 Elements of Simplicial Linear Algebra and Geometry
Juan José Egozcue, Carles Barceló-Vidal, Josep Antoni Martín-Fernández, Eusebi Jarauta-Bragulat, José Luis Díaz-Barrero and Glòria Mateu-Figueras
11.1 Introduction
11.2 Elements of Simplicial Geometry
11.2.1 n-Part Simplex
11.2.2 Vector Space
11.2.3 Centred Log-Ratio Representation
11.2.4 Metrics
11.2.5 Orthonormal Basis and Coordinates
11.3 Linear Functions
11.3.1 Linear Functions Defined on the Simplex
11.3.2 Simplicial Linear Function Defined on a Real Space
11.3.3 Simplicial Linear Function Defined on the Simplex
11.4 Conclusions
Acknowledgements
References

12 Calculus of Simplex-Valued Functions
Juan José Egozcue, Eusebi Jarauta-Bragulat and José Luis Díaz-Barrero
12.1 Introduction
12.2 Limits, Continuity and Differentiability
12.2.1 Limits and Continuity
12.2.2 Differentiability
12.2.3 Higher Order Derivatives
12.3 Integration
12.3.1 Antiderivatives. Indefinite Integral
12.3.2 Integration of Continuous SV Functions
12.4 Conclusions
Acknowledgements
References

13 Compositional Differential Calculus on the Simplex
Carles Barceló-Vidal, Josep Antoni Martín-Fernández and Glòria Mateu-Figueras
13.1 Introduction
13.2 Vector-Valued Functions on the Simplex
13.2.1 Scale-Invariant Vector-Valued Functions on R^n_+
13.2.2 Vector-Valued Functions on S^n
13.3 C-Derivatives on the Simplex
13.3.1 Derivative of a Scale-Invariant Vector-Valued Function on R^n_+
13.3.2 Directional C-Derivatives
13.3.3 C-Derivative
13.3.4 C-Gradient
13.3.5 Critical Points of a C-Differentiable Real-Valued Function on S^n
13.4 Example: Experiments with Mixtures
13.4.1 Polynomial of Degree One
13.4.2 Polynomial of Degree Two
13.4.3 Polynomial of Degree One in Logarithms
13.4.4 A Numerical Example
13.5 Discussion
Acknowledgements
References

Part IV Applications

14 Proportions, Percentages, PPM: Do the Molecular Biosciences Treat Compositional Data Right?
David Lovell, Warren Müller, Jen Taylor, Alec Zwart and Chris Helliwell
14.1 Introduction
14.2 The Omics Imp and Two Bioscience Experiment Paradigms
14.3 The Impact of Compositional Constraints in the Omics
14.3.1 Univariate Impact of Compositional Constraints
14.3.2 Impact of Compositional Constraints on Multivariate Distance Metrics
14.4 Impact of Compositional Constraints on Correlation and Covariance
14.4.1 Compositional Constraints, Covariance, Correlation and Log-Transformed Data
14.4.2 A Simulation Approach to Understanding the Impact of Closure
14.5 Implications
14.5.1 Gathering Information to Infer Absolute Abundance
14.5.2 Analysing Compositional Omics Data Appropriately
Acknowledgements
References

15 Hardy–Weinberg Equilibrium: A Nonparametric Compositional Approach
Jan Graffelman and Juan José Egozcue
15.1 Introduction
15.2 Genetic Data Sets
15.3 Classical Tests for HWE
15.4 A Compositional Approach
15.5 Example
15.6 Conclusion and Discussion
Acknowledgements
References

16 Compositional Analysis in Behavioural and Evolutionary Ecology
Michele Edoardo Raffaele Pierotti and Josep Antoni Martín-Fernández
16.1 Introduction
16.2 CODA in Population Genetics
16.3 CODA in Habitat Choice
16.4 Multiple Choice and Individual Variation in Preferences
16.5 Ecological Specialization
16.6 Time Budgets: More on Specialization
16.7 Conclusions
Acknowledgements
References

17 Flying in Compositional Morphospaces: Evolution of Limb Proportions in Flying Vertebrates
Luis Azevedo Rodrigues, Josep Daunis-i-Estadella, Glòria Mateu-Figueras and Santiago Thió-Henestrosa
17.1 Introduction
17.2 Flying Vertebrates – General Anatomical and Functional Characteristics
17.3 Materials
17.4 Methods
17.5 Aitchison Distance Disparity Metrics
17.5.1 Intragroup Aitchison Distance
17.5.2 Intergroup Aitchison Distance
17.6 Statistical Tests
17.7 Biplots
17.7.1 Chiroptera
17.7.2 Pterosauria
17.8 Balances
17.9 Size Effect
17.10 Final Remarks
17.10.1 All Groups
17.10.2 Aves
17.10.3 Pterosauria
17.10.4 Chiroptera
Acknowledgements
References

18 Natural Laws Governing the Distribution of the Elements in Geochemistry: The Role of the Log-Ratio Approach
Antonella Buccianti
18.1 Introduction
18.2 Geochemical Processes and Log-Ratio Approach
18.3 Log-Ratio Approach and Water Chemistry
18.4 Log-Ratio Approach and Volcanic Gas Chemistry
18.5 Log-Ratio Approach and Subducting Sediment Composition
18.6 Conclusions
Acknowledgements
References

19 Compositional Data Analysis in Planetology: The Surfaces of Mars and Mercury
Helmut Lammer, Peter Wurz, Josep Antoni Martín-Fernández and Herbert Iwo Maria Lichtenegger
19.1 Introduction
19.1.1 Mars
19.1.2 Mercury
19.1.3 Analysis of Surface Composition
19.2 Compositional Analysis of Mars’ Surface
19.3 Compositional Analysis of Mercury’s Surface
19.4 Conclusion
Acknowledgement
References

20 Spectral Analysis of Compositional Data in Cyclostratigraphy
Eulogio Pardo-Igúzquiza and Javier Heredia
20.1 Introduction
20.2 The Method
20.3 Case Study
20.4 Discussion
20.5 Conclusions
Acknowledgement
References

21 Multivariate Geochemical Data Analysis in Physical Geography
Jennifer McKinley and Christopher David Lloyd
21.1 Introduction
21.2 Context
21.3 Data
21.4 Analysis
21.5 Discussion
21.6 Conclusion
Acknowledgement
References

22 Combining Isotopic and Compositional Data: A Discrimination of Regions Prone to Nitrate Pollution
Roger Puig, Raimon Tolosana-Delgado, Neus Otero and Albert Folch
22.1 Introduction
22.2 Study Area
22.2.1 Maresme
22.2.2 Osona
22.2.3 Lluçanès
22.2.4 Empordà
22.2.5 Selva
22.3 Analytical Methods
22.4 Statistical Treatment
22.4.1 Data Scaling
22.4.2 Linear Discriminant Analysis
22.4.3 Discriminant Biplots
22.5 Results and Discussion
22.6 Conclusions
Acknowledgements
References

23 Applications in Economics
Tim Fry
23.1 Introduction
23.2 Consumer Demand Systems
23.3 Miscellaneous Applications
23.4 Compositional Time Series
23.5 New Directions
23.6 Conclusion
References

Part V Software

24 Exploratory Analysis Using CoDaPack 3D
Santiago Thió-Henestrosa and Josep Daunis-i-Estadella
24.1 CoDaPack 3D Description
24.2 Data Set Description
24.3 Exploratory Analysis
24.3.1 Numerical Analysis
24.3.2 Biplot
24.3.3 The Ternary Diagram
24.3.4 Principal Component Analysis
24.3.5 Balance-Dendrogram
24.3.6 By Groups Description
24.4 Summary and Conclusions
Acknowledgements
References

25 robCompositions: An R-package for Robust Statistical Analysis of Compositional Data
Matthias Templ, Karel Hron and Peter Filzmoser
25.1 General Information on the R-package robCompositions
25.1.1 Data Sets Included in the Package
25.1.2 Design Principles
25.2 Expressing Compositional Data in Coordinates
25.3 Multivariate Statistical Methods for Compositional Data Containing Outliers
25.3.1 Multivariate Outlier Detection
25.3.2 Principal Component Analysis and the Robust Compositional Biplot
25.3.3 Discriminant Analysis
25.4 Robust Imputation of Missing Values
25.5 Summary
References

26 Linear Models with Compositions in R
Raimon Tolosana-Delgado and Karl Gerald van den Boogaart
26.1 Introduction
26.2 The Illustration Data Set
26.2.1 The Data
26.2.2 Descriptive Analysis of Compositional Characteristics
26.3 Explanatory Binary Variable
26.4 Explanatory Categorical Variable
26.5 Explanatory Continuous Variable
26.6 Explanatory Composition
26.7 Conclusions
Acknowledgement
References

Index
Edited by

Vera Pawlowsky-Glahn, Department of Computer Science and Applied Mathematics, University of Girona, Spain.

Antonella Buccianti, Department of Earth Sciences, University of Florence, Italy.

You might also be interested in these products:

Modeling Uncertainty
by: Moshe Dror, Pierre L'Ecuyer, Ferenc Szidarovszky
PDF ebook
236,81 €

Continuous Bivariate Distributions
by: N. Balakrishnan, Chin Diew Lai
PDF ebook
106,99 €

Nonlinear Regression with R
by: Christian Ritz, Jens Carl Streibig
PDF ebook
74,89 €