The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis. You’ll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. You’ll even learn how to create impressive data visualizations with R’s basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package. Dozens of hands-on exercises (with downloadable solutions) take you from theory to practice, as you learn: *The fundamentals of programming in R, including how to write data frames, create functions, and use variables, statements, and loops *Statistical concepts like exploratory data analysis, probabilities, hypothesis tests, and regression modeling, and how to execute them in R *How to access R’s thousands of functions, libraries, and data sets *How to draw valid and useful conclusions from your data *How to create publication-quality graphics of your results Combining detailed explanations with real-world examples and exercises, this book will provide you with a solid understanding of both statistics and the depth of R’s functionality. Make The Book of R your doorway into the growing world of data analysis.
Statistics and computing share many close relationships. Computing now permeates every aspect of statistics, from pure description to the development of statistical theory. At the same time, the computational methods used in statistical work span much of computer science. Elements of Statistical Computing covers the broad usage of computing in statistics. It provides a comprehensive account of the most important computational statistics. Included are discussions of numerical analysis, numerical integration, and smoothing. The author give special attention to floating point standards and numerical analysis; iterative methods for both linear and nonlinear equation, such as Gauss-Seidel method and successive over-relaxation; and computational methods for missing data, such as the EM algorithm. Also covered are new areas of interest, such as the Kalman filter, projection-pursuit methods, density estimation, and other computer-intensive techniques.
Release on 2014-05-13 | by Pierre Lafaye de Micheaux,Rémy Drouilhet,Benoit Liquet
Fundamentals of Programming and Statistical Analysis
Author: Pierre Lafaye de Micheaux,Rémy Drouilhet,Benoit Liquet
Pubpsher: Springer Science & Business
The contents of The R Software are presented so as to be both comprehensive and easy for the reader to use. Besides its application as a self-learning text, this book can support lectures on R at any level from beginner to advanced. This book can serve as a textbook on R for beginners as well as more advanced users, working on Windows, MacOs or Linux OSes. The first part of the book deals with the heart of the R language and its fundamental concepts, including data organization, import and export, various manipulations, documentation, plots, programming and maintenance. The last chapter in this part deals with oriented object programming as well as interfacing R with C/C++ or Fortran, and contains a section on debugging techniques. This is followed by the second part of the book, which provides detailed explanations on how to perform many standard statistical analyses, mainly in the Biostatistics field. Topics from mathematical and statistical settings that are included are matrix operations, integration, optimization, descriptive statistics, simulations, confidence intervals and hypothesis testing, simple and multiple linear regression, and analysis of variance. Each statistical chapter in the second part relies on one or more real biomedical data sets, kindly made available by the Bordeaux School of Public Health (Institut de Santé Publique, d'Épidémiologie et de Développement - ISPED) and described at the beginning of the book. Each chapter ends with an assessment section: memorandum of most important terms, followed by a section of theoretical exercises (to be done on paper), which can be used as questions for a test. Moreover, worksheets enable the reader to check his new abilities in R. Solutions to all exercises and worksheets are included in this book.
With interest growing in areas of forestry, conservation and other natural sciences, the need to organize and tabulate large amounts of forestry and natural science information has become a necessary skill. Previous attempts of applying statistical methods to these areas tend to be over-specialized and of limited use; an elementary text using methods, examples and exercises that are relevant to forestry and the natural sciences is long overdue. This book utilizes basic descriptive statistics and probability, as well as commonly used statistical inferential tools to introduce topics that are commonplace in a forestry context such as hypothesis texting, design of experiments, sampling methods, nonparametric tests and statistical quality control. It also contains examples and exercises drawn from the fields of forestry, wood science, and conservation.
As in its first edition, the new edition of Quantitative Corpus Linguistics with R demonstrates how to process corpus-linguistic data with the open-source programming language and environment R. Geared in general towards linguists working with observational data, and particularly corpus linguists, it introduces R programming with emphasis on: data processing and manipulation in general; text processing with and without regular expressions of large bodies of textual and/or literary data, and; basic aspects of statistical analysis and visualization. This book is extremely hands-on and leads the reader through dozens of small applications as well as larger case studies. Along with an array of exercise boxes and separate answer keys, the text features a didactic sequential approach in case studies by way of subsections that zoom in to every programming problem. The companion website to the book contains all relevant R code (amounting to approximately 7,000 lines of heavily commented code), most of the data sets as well as pointers to others, and a dedicated Google newsgroup. This new edition is ideal for both researchers in corpus linguistics and instructors who want to promote hands-on approaches to data in corpus linguistics courses.