Statistical Application Development with R and Python - Second Edition


Document information

Table of Contents

Statistical Application Development with R and Python - Second Edition Credits About the Author Acknowledgment About the Reviewers www.PacktPub.com eBooks, discount offers, and more Why subscribe? Customer Feedback Preface What this book covers What you need for this book Who this book is for Conventions Reader feedback Customer support Downloading the example code Errata Piracy Questions

Data Characteristics Questionnaire and its components Understanding the data characteristics in an R environment Experiments with uncertainty in computer science Installing and setting up R Using R packages RSADBE – the books R package Python installation and setup Using pip for packages IDEs for R and Python The companion code bundle Discrete distributions Discrete uniform distribution Binomial distribution Hypergeometric distribution Negative binomial distribution Poisson distribution Continuous distributions Uniform distribution Exponential distribution Normal distribution Summary

Import/Export Data Packages and settings – R and Python Understanding data.frame and other formats Constants, vectors, and matrices Time for action – understanding constants, vectors, and basic arithmetic What just happened? Doing it in Python Time for action – matrix computations What just happened? Doing it in Python The list object Time for action – creating a list object What just happened? The data.frame object Time for action – creating a data.frame object What just happened? Have a go hero The table object Time for action – creating the Titanic dataset as a table object What just happened? Have a go hero Using utils and the foreign packages Time for action – importing data from external files What just happened? Doing it in Python Importing data from MySQL Doing it in Python Exporting data/graphs Exporting R objects Exporting graphs Time for action – exporting a graph What just happened?
Managing R sessions Time for action – session management What just happened? Doing it in Python Pop quiz Summary

Data Visualization Packages and settings – R and Python Visualization techniques for categorical data Bar chart Going through the built-in examples of R Time for action – bar charts in R What just happened? Doing it in Python Have a go hero Dot chart Time for action – dot charts in R What just happened? Doing it in Python Spine and mosaic plots Time for action – spine plot for the shift and operator data What just happened? Time for action – mosaic plot for the Titanic dataset What just happened? Pie chart and the fourfold plot Visualization techniques for continuous variable data Boxplot Time for action – using the boxplot What just happened? Doing it in Python Histogram Time for action – understanding the effectiveness of histograms What just happened? Doing it in Python Have a go hero Scatter plot Time for action – plot and pairs R functions What just happened? Doing it in Python Have a go hero Pareto chart A brief peek at ggplot2 Time for action – qplot What just happened? Time for action – ggplot What just happened? Pop quiz Summary

Exploratory Analysis Packages and settings – R and Python Essential summary statistics Percentiles, quantiles, and median Hinges Interquartile range Time for action – the essential summary statistics for The Wall dataset What just happened? Techniques for exploratory analysis The stem-and-leaf plot Time for action – the stem function in play What just happened? Letter values Data re-expression Have a go hero Bagplot – a bivariate boxplot Time for action – the bagplot display for multivariate datasets What just happened? Resistant line Time for action – resistant line as a first regression model What just happened? Smoothing data Time for action – smoothening the cow temperature data What just happened? Median polish Time for action – the median polish algorithm What just happened?
Have a go hero Summary

Statistical Inference Packages and settings – R and Python Maximum likelihood estimator Visualizing the likelihood function Time for action – visualizing the likelihood function What just happened? Doing it in Python Finding the maximum likelihood estimator Using the fitdistr function Time for action – finding the MLE using mle and fitdistr functions What just happened? Confidence intervals Time for action – confidence intervals What just happened? Doing it in Python Hypothesis testing Binomial test Time for action – testing probability of success What just happened? Tests of proportions and the chi-square test Time for action – testing proportions What just happened? Tests based on normal distribution – one sample Time for action – testing one-sample hypotheses What just happened? Have a go hero Tests based on normal distribution – two sample Time for action – testing two-sample hypotheses What just happened? Have a go hero Doing it in Python Summary

Linear Regression Analysis Packages and settings - R and Python The essence of regression The simple linear regression model What happens to the arbitrary choice of parameters? Time for action - the arbitrary choice of parameters What just happened? Building a simple linear regression model Time for action - building a simple linear regression model What just happened? Have a go hero ANOVA and the confidence intervals Time for action - ANOVA and the confidence intervals What just happened? Model validation Time for action - residual plots for model validation What just happened? Doing it in Python Have a go hero Multiple linear regression model Averaging k simple linear regression models or a multiple linear regression model Time for action - averaging k simple linear regression models What just happened? Building a multiple linear regression model Time for action - building a multiple linear regression model What just happened?
The ANOVA and confidence intervals for the multiple linear regression model Time for action - the ANOVA and confidence intervals for the multiple linear regression model What just happened? Have a go hero Useful residual plots Time for action - residual plots for the multiple linear regression model What just happened? Regression diagnostics Leverage points Influential points DFFITS and DFBETAS The multicollinearity problem Time for action - addressing the multicollinearity problem for the gasoline data What just happened? Doing it in Python Model selection Stepwise procedures The backward elimination The forward selection The stepwise regression Criterion-based procedures Time for action - model selection using the backward, forward, and AIC criteria What just happened? Have a go hero Summary

Logistic Regression Model Packages and settings – R and Python The binary regression problem Time for action – limitation of linear regression model What just happened? Probit regression model Time for action – understanding the constants What just happened? Doing it in Python Logistic regression model Time for action – fitting the logistic regression model What just happened? Doing it in Python Hosmer-Lemeshow goodness-of-fit test statistic Time for action – Hosmer-Lemeshow goodness-of-fit statistic What just happened? Model validation and diagnostics Residual plots for the GLM Time for action – residual plots for logistic regression model What just happened? Doing it in Python Have a go hero Influence and leverage for the GLM Time for action – diagnostics for the logistic regression What just happened? Have a go hero Receiving operator curves Time for action – ROC construction What just happened? Doing it in Python Logistic regression for the German credit screening dataset Time for action – logistic regression for the German credit dataset What just happened?
Doing it in Python Have a go hero Summary

Regression Models with Regularization Packages and settings – R and Python The overfitting problem Time for action – understanding overfitting What just happened? Doing it in Python

P

packages, Python session loading / Packages and settings – R and Python pairs function obtaining / What just happened? Pareto chart about / Pareto chart for incomplete applications / Pareto chart partial residual / Residual plots for the GLM partitioning about / Time for action – partitioning the display plot, What just happened? Pearson residual / Residual plots for the GLM percentiles about / Percentiles, quantiles, and median piecewise linear regression model about / Piecewise linear regression model fitting / Time for action – fitting piecewise linear regression models, What just happened? pie charts about / Pie chart and the fourfold plot for Bugs Severity problem / Pie chart and the fourfold plot drawback / Pie chart and the fourfold plot pip used, for packages / Using pip for packages plot displaying / Time for action – partitioning the display plot plot function modifying, to obtain dot chart / Doing it in Python poisson distribution / Poisson distribution pooled variance estimator about / Tests based on normal distribution – two sample PRESS residuals about / Useful residual plots probability density function (pdf) / Continuous distributions probability distribution function about / Maximum likelihood estimator probability mass function (pmf) / Discrete distributions about / Maximum likelihood estimator probability of success testing / Time for action – testing probability of success, What just happened? probit regression model about / Probit regression model constants / Time for action – understanding the constants in Python / Doing it in Python proportions testing / Time for action – testing proportions, What just happened?
pruning / Pruning and other finer aspects of a tree pscl package / Packages and settings – R and Python Python installing / Python installation and setup setting up / Python installation and setup URL / Python installation and setup pip, used for packages / Using pip for packages essential operations / Doing it in Python matrix computations / Doing it in Python pandas package, used for importing csv files / Doing it in Python bar chart, obtaining for severity counts / Doing it in Python, Have a go hero boxplots / Doing it in Python notched boxplots / Doing it in Python multiple boxplots / Doing it in Python scatter plot, visualizing / Doing it in Python likelihood function / Doing it in Python confidence intervals / Doing it in Python statistical tests, performing / Doing it in Python model validation / Doing it in Python multiple linear regression analysis / Doing it in Python probit regression model / Doing it in Python logistic regression model / Doing it in Python residual analysis / Doing it in Python ROC curve / Doing it in Python logistic regression, for German credit screening dataset / Doing it in Python, Have a go hero data, overfitting / Doing it in Python ridge regression, for linear models / Doing it in Python classification tree, replicating for CART_Dummy dataset / Doing it in Python classification tree, for German credit data / Doing it in Python, Have a go hero German credit data, importing / Doing it in Python Python implementation of histograms / Doing it in Python Python packages importing / Packages and settings – R and Python Python session managing / Doing it in Python

Q

qplot function / Time for action – qplot quantile function / What just happened? quantiles about / Percentiles, quantiles, and median

R

80-20 rule about / Pareto chart R installing / Installing and setting up R setting up / Installing and setting up R matrices, creating in / What just happened?
data files, importing into / Using utils and the foreign packages bar chart, visualizing / Time for action – bar charts in R dot charts, obtaining / Time for action – dot charts in R R-student residuals about / Useful residual plots random forests about / Random forests for German credit data / Time for action – random forests for the German credit data, What just happened? for low birth weight data / Time for action – random forests for the low birth weight data , What just happened? random forests analyses with Python / Doing it in Python random variable (RV) / Understanding the data characteristics in an R environment receiver operator characteristic (ROC) curve about / Receiving operator curves receiving operator curves about / Receiving operator curves recursive partitions about / Understanding recursive partitions regression about / The essence of regression regression diagnostics about / Regression diagnostics leverage point / Leverage points influential point / Influential points DFFITS metric / DFFITS and DFBETAS DFBETAS metric / DFFITS and DFBETAS multicollinearity problem / The multicollinearity problem regression spline about / Regression spline basis functions / Basis functions piecewise linear regression model / Piecewise linear regression model regression tree constructing / Constructing a regression tree, Time for action – the construction of a regression tree, What just happened? representative probabilities / Understanding the data characteristics in an R environment reroughing / Smoothing data residual analysis in Python / Doing it in Python residual plots for model validations / Time for action - residual plots for model validation about / Useful residual plots for multiple linear regression model / Time for action - residual plots for the multiple linear regression model, What just happened? 
for logistic regression model / Time for action – residual plots for logistic regression model residual plots, for GLM about / Residual plots for the GLM response residual / Residual plots for the GLM deviance residual / Residual plots for the GLM Pearson residual / Residual plots for the GLM partial residual / Residual plots for the GLM working residual / Residual plots for the GLM residuals standardized residuals / Useful residual plots semi-studentized residuals / Useful residual plots PRESS residuals / Useful residual plots R-student residuals / Useful residual plots resistant line about / Resistant line for IO-CPU time / Resistant line as first regression model / Time for action – resistant line as a first regression model response residual / Residual plots for the GLM ridge regression for linear models / Ridge regression for linear models, Protecting against overfitting, Time for action – ridge regression for the linear regression model, What just happened? for logistic regression models / Ridge regression for logistic regression models, Time for action – ridge regression for the logistic regression model rline function / Packages and settings – R and Python, What just happened? 
R objects exporting / Exporting R objects ROC construction about / Time for action – ROC construction ROC curves reference / Time for action – ROC construction ROCR package / Packages and settings – R and Python rootstock dataset reference / Using utils and the foreign packages rough / Smoothing data R packages using / Using R packages RSADBE / RSADBE – the books R package loading / Packages and settings – R and Python, Packages and settings – R and Python rpart package about / Understanding recursive partitions reference / Improving the CART RSADBE / RSADBE – the books R package R sessions managing / Managing R sessions RStudio URL / IDEs for R and Python

S

scatter plot about / Scatter plot drain current, versus ground-to-source voltage / Scatter plot Gasoline mileage performance data / Scatter plot obtaining / Time for action – plot and pairs R functions in Python / Doing it in Python scatter_matrix function / Doing it in Python semi-studentized residuals about / Useful residual plots sequential testing / Negative binomial distribution session management about / Time for action – session management, What just happened? short message service (SMS) / Questionnaire and its components simple linear regression model about / The simple linear regression model building / Building a simple linear regression model, Time for action - building a simple linear regression model spine plot about / Spine and mosaic plots drawing, spineplot function used / Time for action – spine plot for the shift and operator data spineplot function used, for drawing spines / Time for action – spine plot for the shift and operator data spline plot about / What just happened? spline regression models fitting / Time for action – fitting the spline regression models, What just happened?
standardized residuals / Useful residual plots Statistical Process Control (SPC) about / The stem-and-leaf plot statistical tests performing, in Python / Doing it in Python stem-and-leaf plot about / The stem-and-leaf plot simple illustration / The stem-and-leaf plot Octane Rating of Gasoline Blends / The stem-and-leaf plot stem function working / Time for action – the stem function in play stems about / The stem-and-leaf plot stepwise procedures, model selection about / Stepwise procedures backward elimination / The backward elimination forward selection / The forward selection stepwise regression / The stepwise regression sturges method / Doing it in Python summary statistics about / Essential summary statistics for The Wall dataset / Time for action – the essential summary statistics for The Wall dataset

T

table object about / The table object Titanic dataset, creating as / Time for action – creating the Titanic dataset as a table object, What just happened? techniques, exploratory analysis stem-and-leaf plot / The stem-and-leaf plot letter values / Letter values data re-expression / Data re-expression bagplot / Bagplot – a bivariate boxplot resistant line / Resistant line data smoothing / Smoothing data median polish / Median polish tests of proportions / Tests of proportions and the chi-square test test statistic about / Hypothesis testing The Wall dataset / Percentiles, quantiles, and median Titanic dataset creating, as data object / Time for action – creating the Titanic dataset as a table object, What just happened? about / What just happened? trailing digits about / The stem-and-leaf plot tree about / The first tree building / Time for action – building our first tree, What just happened?
true positive rate (tpr) about / Receiving operator curves two-sample hypotheses testing / Time for action – testing two-sample hypotheses, Have a go hero two-sample problem about / Tests based on normal distribution – two sample

U

UCBAdmissions dataset / Have a go hero reference / Time for action – testing proportions uniform distribution / Uniform distribution upper hinge / Hinges utils using / Using utils and the foreign packages

V

variance / Discrete distributions variance inflation factor (VIF) / Time for action - addressing the multicollinearity problem for the gasoline data vector objects creating / Time for action – understanding constants, vectors, and basic arithmetic vectors / Constants, vectors, and matrices structure / Time for action – understanding constants, vectors, and basic arithmetic two unequal length vectors, adding / Time for action – understanding constants, vectors, and basic arithmetic visualization techniques for categorical data / Visualization techniques for categorical data for continuous variable data / Visualization techniques for continuous variable data

W

Windows version, Python reference link / Python installation and setup working residual / Residual plots for the GLM

X

XLS / Using utils and the foreign packages XLSX / Using utils and the foreign packages

... award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Preface
R and Python are interchangeably required...

Statistical Application Development with R and Python - Second Edition. Copyright © 2017 Packt Publishing. All rights reserved.

Date posted: 02/03/2019, 10:56


Table of contents

  • Statistical Application Development with R and Python - Second Edition

  • Credits

  • About the Author

  • Acknowledgment

  • About the Reviewers

  • www.PacktPub.com

  • eBooks, discount offers, and more

  • Why subscribe?

  • Customer Feedback

  • Preface

  • What this book covers

  • What you need for this book

  • Who this book is for

  • Conventions

  • Reader feedback

  • Customer support

  • Downloading the example code

  • Errata

  • Piracy

  • Questions
