
Free PDF e-book download: Introduction To Machine Learning (机器学习导论, English edition) by Ethem Alpaydin


Download option 1:

Baidu Netdisk link: https://pan.baidu.com/s/1f074N_maKKtpCf18qVHc_w
Baidu Netdisk password: 1111

Download option 2:

http://ziliaoshare.cn/Download/ab_123507_pd_IntroductionToMachineLearningJQXXDLYWB_EthemAlpaydin.zip


Introduction To Machine Learning (机器学习导论, English edition)

Author: Ethem Alpaydin

Series editors: Thomas Dietterich (Editor); Christopher Bishop, David Heckerman, Michael Jordan, and Michael Kearns (Associate Editors)

Pages: 579

Publisher: The MIT Press

About Introduction To Machine Learning (English edition)

The goal of building systems that can adapt to their environments and learn from experience has attracted researchers from many fields, including computer science, engineering, mathematics, physics, neuroscience, and cognitive science. Out of this research has come a wide variety of learning techniques that are transforming many industrial and scientific fields. Recently, several research communities have converged on a common set of problems: supervised, semi-supervised, unsupervised, and reinforcement learning. The MIT Press series on Adaptive Computation and Machine Learning seeks to unify the many diverse strands of machine learning research and to foster high-quality research and innovative applications. The MIT Press is pleased to publish this second edition of Ethem Alpaydin's introductory textbook. The book presents a readable and concise introduction to machine learning that reflects these diverse research strands while providing a unified treatment of the field. It covers all of the main problem formulations and introduces the most important algorithms and techniques, drawing on methods from computer science, neural computation, information theory, and statistics. The second edition expands and updates coverage in several areas, notably kernel machines and graphical models, which have developed rapidly over the past five years. This updated work remains a compelling textbook for introductory machine learning courses at the undergraduate and beginning graduate levels.


Table of Contents of Introduction To Machine Learning (English edition)


Contents

Series Foreword xvii

Figures xix

Tables xxix

Preface xxxi

Acknowledgments xxxiii

Notes for the Second Edition xxxv

Notations xxxix

1 Introduction 1

1.1 What Is Machine Learning? 1

1.2 Examples of Machine Learning Applications 4

1.2.1 Learning Associations 4

1.2.2 Classification 5

1.2.3 Regression 9

1.2.4 Unsupervised Learning 11

1.2.5 Reinforcement Learning 13

1.3 Notes 14

1.4 Relevant Resources 16

1.5 Exercises 18

1.6 References 19

2 Supervised Learning 21

2.1 Learning a Class from Examples 21

2.2 Vapnik-Chervonenkis (VC) Dimension 27

2.3 Probably Approximately Correct (PAC) Learning 29

2.4 Noise 30

2.5 Learning Multiple Classes 32

2.6 Regression 34

2.7 Model Selection and Generalization 37

2.8 Dimensions of a Supervised Machine Learning Algorithm 41

2.9 Notes 42

2.10 Exercises 43

2.11 References 44

3 Bayesian Decision Theory 47

3.1 Introduction 47

3.2 Classification 49

3.3 Losses and Risks 51

3.4 Discriminant Functions 53

3.5 Utility Theory 54

3.6 Association Rules 55

3.7 Notes 58

3.8 Exercises 58

3.9 References 59

4 Parametric Methods 61

4.1 Introduction 61

4.2 Maximum Likelihood Estimation 62

4.2.1 Bernoulli Density 63

4.2.2 Multinomial Density 64

4.2.3 Gaussian (Normal) Density 64

4.3 Evaluating an Estimator: Bias and Variance 65

4.4 The Bayes’ Estimator 66

4.5 Parametric Classification 69

4.6 Regression 73

4.7 Tuning Model Complexity: Bias/Variance Dilemma 76

4.8 Model Selection Procedures 80

4.9 Notes 84

4.10 Exercises 84

4.11 References 85

5 Multivariate Methods 87

5.1 Multivariate Data 87

5.2 Parameter Estimation 88

5.3 Estimation of Missing Values 89

5.4 Multivariate Normal Distribution 90

5.5 Multivariate Classification 94

5.6 Tuning Complexity 99

5.7 Discrete Features 102

5.8 Multivariate Regression 103

5.9 Notes 105

5.10 Exercises 106

5.11 References 107

6 Dimensionality Reduction 109

6.1 Introduction 109

6.2 Subset Selection 110

6.3 Principal Components Analysis 113

6.4 Factor Analysis 120

6.5 Multidimensional Scaling 125

6.6 Linear Discriminant Analysis 128

6.7 Isomap 133

6.8 Locally Linear Embedding 135

6.9 Notes 138

6.10 Exercises 139

6.11 References 140

7 Clustering 143

7.1 Introduction 143

7.2 Mixture Densities 144

7.3 k-Means Clustering 145

7.4 Expectation-Maximization Algorithm 149

7.5 Mixtures of Latent Variable Models 154

7.6 Supervised Learning after Clustering 155

7.7 Hierarchical Clustering 157

7.8 Choosing the Number of Clusters 158

7.9 Notes 160

7.10 Exercises 160

7.11 References 161

8 Nonparametric Methods 163

8.1 Introduction 163

8.2 Nonparametric Density Estimation 165

8.2.1 Histogram Estimator 165

8.2.2 Kernel Estimator 167

8.2.3 k-Nearest Neighbor Estimator 168

8.3 Generalization to Multivariate Data 170

8.4 Nonparametric Classification 171

8.5 Condensed Nearest Neighbor 172

8.6 Nonparametric Regression: Smoothing Models 174

8.6.1 Running Mean Smoother 175

8.6.2 Kernel Smoother 176

8.6.3 Running Line Smoother 177

8.7 How to Choose the Smoothing Parameter 178

8.8 Notes 180

8.9 Exercises 181

8.10 References 182

9 Decision Trees 185

9.1 Introduction 185

9.2 Univariate Trees 187

9.2.1 Classification Trees 188

9.2.2 Regression Trees 192

9.3 Pruning 194

9.4 Rule Extraction from Trees 197

9.5 Learning Rules from Data 198

9.6 Multivariate Trees 202

9.7 Notes 204

9.8 Exercises 207

9.9 References 207

10 Linear Discrimination 209

10.1 Introduction 209

10.2 Generalizing the Linear Model 211

10.3 Geometry of the Linear Discriminant 212

10.3.1 Two Classes 212

10.3.2 Multiple Classes 214

10.4 Pairwise Separation 216

10.5 Parametric Discrimination Revisited 217

10.6 Gradient Descent 218

10.7 Logistic Discrimination 220

10.7.1 Two Classes 220

10.7.2 Multiple Classes 224

10.8 Discrimination by Regression 228

10.9 Notes 230

10.10 Exercises 230

10.11 References 231

11 Multilayer Perceptrons 233

11.1 Introduction 233

11.1.1 Understanding the Brain 234

11.1.2 Neural Networks as a Paradigm for Parallel Processing 235

11.2 The Perceptron 237

11.3 Training a Perceptron 240

11.4 Learning Boolean Functions 243

11.5 Multilayer Perceptrons 245

11.6 MLP as a Universal Approximator 248

11.7 Backpropagation Algorithm 249

11.7.1 Nonlinear Regression 250

11.7.2 Two-Class Discrimination 252

11.7.3 Multiclass Discrimination 254

11.7.4 Multiple Hidden Layers 256

11.8 Training Procedures 256

11.8.1 Improving Convergence 256

11.8.2 Overtraining 257

11.8.3 Structuring the Network 258

11.8.4 Hints 261

11.9 Tuning the Network Size 263

11.10 Bayesian View of Learning 266

11.11 Dimensionality Reduction 267

11.12 Learning Time 270

11.12.1 Time Delay Neural Networks 270

11.12.2 Recurrent Networks 271

11.13 Notes 272

11.14 Exercises 274

11.15 References 275

12 Local Models 279

12.1 Introduction 279

12.2 Competitive Learning 280

12.2.1 Online k-Means 280

12.2.2 Adaptive Resonance Theory 285

12.2.3 Self-Organizing Maps 286

12.3 Radial Basis Functions 288

12.4 Incorporating Rule-Based Knowledge 294

12.5 Normalized Basis Functions 295

12.6 Competitive Basis Functions 297

12.7 Learning Vector Quantization 300

12.8 Mixture of Experts 300

12.8.1 Cooperative Experts 303

12.8.2 Competitive Experts 304

12.9 Hierarchical Mixture of Experts 304

12.10 Notes 305

12.11 Exercises 306

12.12 References 307

13 Kernel Machines 309

13.1 Introduction 309

13.2 Optimal Separating Hyperplane 311

13.3 The Nonseparable Case: Soft Margin Hyperplane 315

13.4 ν-SVM 318

13.5 Kernel Trick 319

13.6 Vectorial Kernels 321

13.7 Defining Kernels 324

13.8 Multiple Kernel Learning 325

13.9 Multiclass Kernel Machines 327

13.10 Kernel Machines for Regression 328

13.11 One-Class Kernel Machines 333

13.12 Kernel Dimensionality Reduction 335

13.13 Notes 337

13.14 Exercises 338

13.15 References 339

14 Bayesian Estimation 341

14.1 Introduction 341

14.2 Estimating the Parameter of a Distribution 343

14.2.1 Discrete Variables 343

14.2.2 Continuous Variables 345

14.3 Bayesian Estimation of the Parameters of a Function 348

14.3.1 Regression 348

14.3.2 The Use of Basis/Kernel Functions 352

14.3.3 Bayesian Classification 353

14.4 Gaussian Processes 356

14.5 Notes 359

14.6 Exercises 360

14.7 References 361

15 Hidden Markov Models 363

15.1 Introduction 363

15.2 Discrete Markov Processes 364

15.3 Hidden Markov Models 367

15.4 Three Basic Problems of HMMs 369

15.5 Evaluation Problem 369

15.6 Finding the State Sequence 373

15.7 Learning Model Parameters 375

15.8 Continuous Observations 378

15.9 The HMM with Input 379

15.10 Model Selection in HMM 380

15.11 Notes 382

15.12 Exercises 383

15.13 References 384

16 Graphical Models 387

16.1 Introduction 387

16.2 Canonical Cases for Conditional Independence 389

16.3 Example Graphical Models 396

16.3.1 Naive Bayes’ Classifier 396

16.3.2 Hidden Markov Model 398

16.3.3 Linear Regression 401

16.4 d-Separation 402

16.5 Belief Propagation 402

16.5.1 Chains 403

16.5.2 Trees 405

16.5.3 Polytrees 407

16.5.4 Junction Trees 409

16.6 Undirected Graphs: Markov Random Fields 410

16.7 Learning the Structure of a Graphical Model 413

16.8 Influence Diagrams 414

16.9 Notes 414

16.10 Exercises 417

16.11 References 417

17 Combining Multiple Learners 419

17.1 Rationale 419

17.2 Generating Diverse Learners 420

17.3 Model Combination Schemes 423

17.4 Voting 424

17.5 Error-Correcting Output Codes 427

17.6 Bagging 430

17.7 Boosting 431

17.8 Mixture of Experts Revisited 434

17.9 Stacked Generalization 435

17.10 Fine-Tuning an Ensemble 437

17.11 Cascading 438

17.12 Notes 440

17.13 Exercises 442

17.14 References 443

18 Reinforcement Learning 447

18.1 Introduction 447

18.2 Single State Case: K-Armed Bandit 449

18.3 Elements of Reinforcement Learning 450

18.4 Model-Based Learning 453

18.4.1 Value Iteration 453

18.4.2 Policy Iteration 454

18.5 Temporal Difference Learning 454

18.5.1 Exploration Strategies 455

18.5.2 Deterministic Rewards and Actions 456

18.5.3 Nondeterministic Rewards and Actions 457

18.5.4 Eligibility Traces 459

18.6 Generalization 461

18.7 Partially Observable States 464

18.7.1 The Setting 464

18.7.2 Example: The Tiger Problem 465

18.8 Notes 470

18.9 Exercises 472

18.10 References 473

19 Design and Analysis of Machine Learning Experiments 475

19.1 Introduction 475

19.2 Factors, Response, and Strategy of Experimentation 478

19.3 Response Surface Design 481

19.4 Randomization, Replication, and Blocking 482

19.5 Guidelines for Machine Learning Experiments 483

19.6 Cross-Validation and Resampling Methods 486

19.6.1 K-Fold Cross-Validation 487

19.6.2 5×2 Cross-Validation 488

19.6.3 Bootstrapping 489

19.7 Measuring Classifier Performance 489

19.8 Interval Estimation 493

19.9 Hypothesis Testing 496

19.10 Assessing a Classification Algorithm’s Performance 498

19.10.1 Binomial Test 499

19.10.2 Approximate Normal Test 500

19.10.3 t Test 500

19.11 Comparing Two Classification Algorithms 501

19.11.1 McNemar’s Test 501

19.11.2 K-Fold Cross-Validated Paired t Test 501

19.11.3 5 × 2 cv Paired t Test 502

19.11.4 5 × 2 cv Paired F Test 503

19.12 Comparing Multiple Algorithms: Analysis of Variance 504

19.13 Comparison over Multiple Datasets 508

19.13.1 Comparing Two Algorithms 509

19.13.2 Multiple Algorithms 511

19.14 Notes 512

19.15 Exercises 513

19.16 References 514

A Probability 517

A.1 Elements of Probability 517

A.1.1 Axioms of Probability 518

A.1.2 Conditional Probability 518

A.2 Random Variables 519

A.2.1 Probability Distribution and Density Functions 519

A.2.2 Joint Distribution and Density Functions 520

A.2.3 Conditional Distributions 520

A.2.4 Bayes’ Rule 521

A.2.5 Expectation 521

A.2.6 Variance 522

A.2.7 Weak Law of Large Numbers 523

A.3 Special Random Variables 523

A.3.1 Bernoulli Distribution 523

A.3.2 Binomial Distribution 524

A.3.3 Multinomial Distribution 524

A.3.4 Uniform Distribution 524

A.3.5 Normal (Gaussian) Distribution 525

A.3.6 Chi-Square Distribution 526

A.3.7 t Distribution 527

A.3.8 F Distribution 527

A.4 References 527

Index 529
