Categories: Internet, Computer Fundamentals
Overview
The Elements of Statistical Learning (English Edition), Douban rating: 9.5
Resource last updated: 2020-09-05 22:02:56
Author: Hastie, T.
Publisher: World Publishing Corporation
Publication date: 2009-01
ISBN: 9787506292313
File format: PDF
Tags: Machine Learning, Statistical Learning, Statistics, Data Mining, Mathematics, Probability and Statistics, Statistical
Description
The rapid development of computing and information technology has generated vast amounts of data in fields such as medicine, biology, finance, and marketing. Making sense of these data is a challenge, one that has driven the development of new tools within statistics and given rise to new fields such as data mining, machine learning, and bioinformatics. Many of these tools share a common foundation but are often expressed in different terminology. The Elements of Statistical Learning (2nd Edition, English) introduces the important concepts in these fields. Although the methods are statistical, the emphasis is on concepts rather than mathematics, and many of the examples are accompanied by color figures. The book's coverage is broad, ranging from supervised learning (prediction) to unsupervised learning, and its treatment of topics such as neural networks, support vector machines, classification trees, and boosting is among the most comprehensive available.
The book can serve as a textbook for undergraduate and graduate students in related fields, and it is well worth reading for statisticians and for anyone in academia or industry with an interest in data mining.
Table of Contents
Preface
1 Introduction
2 Overview of Supervised Learning
2.1 Introduction
2.2 Variable Types and Terminology
2.3 Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors
2.3.1 Linear Models and Least Squares
2.3.2 Nearest-Neighbor Methods
2.3.3 From Least Squares to Nearest Neighbors
2.4 Statistical Decision Theory
2.5 Local Methods in High Dimensions
2.6 Statistical Models, Supervised Learning and Function Approximation
2.6.1 A Statistical Model for the Joint Distribution Pr(X,Y)
2.6.2 Supervised Learning
2.6.3 Function Approximation
2.7 Structured Regression Models
2.7.1 Difficulty of the Problem
2.8 Classes of Restricted Estimators
2.8.1 Roughness Penalty and Bayesian Methods
2.8.2 Kernel Methods and Local Regression
2.8.3 Basis Functions and Dictionary Methods
2.9 Model Selection and the Bias-Variance Tradeoff
Bibliographic Notes
Exercises
3 Linear Methods for Regression
3.1 Introduction
3.2 Linear Regression Models and Least Squares
3.2.1 Example: Prostate Cancer
3.2.2 The Gauss-Markov Theorem
3.3 Multiple Regression from Simple Univariate Regression
3.3.1 Multiple Outputs
3.4 Subset Selection and Coefficient Shrinkage
3.4.1 Subset Selection
3.4.2 Prostate Cancer Data Example (Continued)
3.4.3 Shrinkage Methods
3.4.4 Methods Using Derived Input Directions
3.4.5 Discussion: A Comparison of the Selection and Shrinkage Methods
3.4.6 Multiple Outcome Shrinkage and Selection
3.5 Computational Considerations
Bibliographic Notes
Exercises
4 Linear Methods for Classification
4.1 Introduction
4.2 Linear Regression of an Indicator Matrix
4.3 Linear Discriminant Analysis
4.3.1 Regularized Discriminant Analysis
4.3.2 Computations for LDA
4.3.3 Reduced-Rank Linear Discriminant Analysis
4.4 Logistic Regression
4.4.1 Fitting Logistic Regression Models
4.4.2 Example: South African Heart Disease
4.4.3 Quadratic Approximations and Inference
4.4.4 Logistic Regression or LDA?
4.5 Separating Hyperplanes
4.5.1 Rosenblatt's Perceptron Learning Algorithm
4.5.2 Optimal Separating Hyperplanes
Bibliographic Notes
Exercises
5 Basis Expansions and Regularization
5.1 Introduction
5.2 Piecewise Polynomials and Splines
5.2.1 Natural Cubic Splines
5.2.2 Example: South African Heart Disease (Continued)
5.2.3 Example: Phoneme Recognition
5.3 Filtering and Feature Extraction
5.4 Smoothing Splines
5.4.1 Degrees of Freedom and Smoother Matrices
5.5 Automatic Selection of the Smoothing Parameters
5.5.1 Fixing the Degrees of Freedom
5.5.2 The Bias-Variance Tradeoff
5.6 Nonparametric Logistic Regression
5.7 Multidimensional Splines
5.8 Regularization and Reproducing Kernel Hilbert Spaces
5.8.1 Spaces of Functions Generated by Kernels
5.8.2 Examples of RKHS
5.9 Wavelet Smoothing
5.9.1 Wavelet Bases and the Wavelet Transform
5.9.2 Adaptive Wavelet Filtering
Bibliographic Notes
Exercises
Appendix: Computational Considerations for Splines
Appendix: B-splines
Appendix: Computations for Smoothing Splines
6 Kernel Methods
6.1 One-Dimensional Kernel Smoothers
6.1.1 Local Linear Regression
6.1.2 Local Polynomial Regression
6.2 Selecting the Width of the Kernel
6.3 Local Regression in ℝ^p
6.4 Structured Local Regression Models in ℝ^p
6.4.1 Structured Kernels
6.4.2 Structured Regression Functions
6.5 Local Likelihood and Other Models
6.6 Kernel Density Estimation and Classification
6.6.1 Kernel Density Estimation
6.6.2 Kernel Density Classification
6.6.3 The Naive Bayes Classifier
6.7 Radial Basis Functions and Kernels
6.8 Mixture Models for Density Estimation and Classification
6.9 Computational Considerations
Bibliographic Notes
Exercises
7 Model Assessment and Selection
7.1 Introduction
7.2 Bias, Variance and Model Complexity
7.3 The Bias-Variance Decomposition
7.3.1 Example: Bias-Variance Tradeoff
7.4 Optimism of the Training Error Rate
7.5 Estimates of In-Sample Prediction Error
7.6 The Effective Number of Parameters
7.7 The Bayesian Approach and BIC
7.8 Minimum Description Length
7.9 Vapnik-Chervonenkis Dimension
7.9.1 Example (Continued)
7.10 Cross-Validation
7.11 Bootstrap Methods
7.11.1 Example (Continued)
Bibliographic Notes
Exercises
8 Model Inference and Averaging
8.1 Introduction
8.2 The Bootstrap and Maximum Likelihood Methods
8.2.1 A Smoothing Example
8.2.2 Maximum Likelihood Inference
8.2.3 Bootstrap versus Maximum Likelihood
8.3 Bayesian Methods
8.4 Relationship Between the Bootstrap and Bayesian Inference
8.5 The EM Algorithm
8.5.1 Two-Component Mixture Model
8.5.2 The EM Algorithm in General
8.5.3 EM as a Maximization-Maximization Procedure
8.6 MCMC for Sampling from the Posterior
8.7 Bagging
8.7.1 Example: Trees with Simulated Data
8.8 Model Averaging and Stacking
8.9 Stochastic Search: Bumping
Bibliographic Notes
Exercises
9 Additive Models, Trees, and Related Methods
9.1 Generalized Additive Models
9.1.1 Fitting Additive Models
9.1.2 Example: Additive Logistic Regression
9.1.3 Summary
9.2 Tree-Based Methods
10 Boosting and Additive Trees
11 Neural Networks
12 Support Vector Machines and Flexible Discriminants
13 Prototype Methods and Nearest-Neighbors
14 Unsupervised Learning
References
Author Index
Index