Speech and Language Processing
An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition
Third Edition draft
Daniel Jurafsky
Stanford University
James H. Martin
University of Colorado at Boulder
Copyright ©2020. All rights reserved.
Draft of December 30, 2020. Comments and typos welcome!
Summary of Contents
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Regular Expressions, Text Normalization, Edit Distance . . . . . . . . . 2
3 N-gram Language Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4 Naive Bayes and Sentiment Classification . . . . . . . . . . . . . . . . . . . . . . . 55
5 Logistic Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6 Vector Semantics and Embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
7 Neural Networks and Neural Language Models . . . . . . . . . . . . . . . . . 127
8 Sequence Labeling for Parts of Speech and Named Entities . . . . . . 148
9 Deep Learning Architectures for Sequence Processing . . . . . . . . . . . 173
10 Contextual Embeddings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
11 Machine Translation and Encoder-Decoder Models . . . . . . . . . . . . . 203
12 Constituency Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
13 Constituency Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
14 Dependency Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
15 Logical Representations of Sentence Meaning . . . . . . . . . . . . . . . . . . . 305
16 Computational Semantics and Semantic Parsing . . . . . . . . . . . . . . . . 331
17 Information Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
18 Word Senses and WordNet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
19 Semantic Role Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
20 Lexicons for Sentiment, Affect, and Connotation . . . . . . . . . . . . . . . . 393
21 Coreference Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
22 Discourse Coherence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
23 Question Answering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
24 Chatbots & Dialogue Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
25 Phonetics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
26 Automatic Speech Recognition and Text-to-Speech . . . . . . . . . . . . . . 548
Bibliography. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
Subject Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
Mecha Learn
Tuesday, 29 December 2020
Monday, 22 April 2019
Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow
Part I. The Fundamentals of Machine Learning
1. The Machine Learning Landscape. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
What Is Machine Learning? 4
Why Use Machine Learning? 4
Types of Machine Learning Systems 8
Supervised/Unsupervised Learning 8
Batch and Online Learning 15
Instance-Based Versus Model-Based Learning 18
Main Challenges of Machine Learning 24
Insufficient Quantity of Training Data 24
Nonrepresentative Training Data 26
Poor-Quality Data 27
Irrelevant Features 27
Overfitting the Training Data 28
Underfitting the Training Data 30
Stepping Back 30
Testing and Validating 31
Hyperparameter Tuning and Model Selection 32
Data Mismatch 33
Exercises 34
2. End-to-End Machine Learning Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Working with Real Data 38
Look at the Big Picture 39
Frame the Problem 39
Select a Performance Measure 42
Check the Assumptions 45
Get the Data 45
Create the Workspace 45
Download the Data 49
Take a Quick Look at the Data Structure 50
Create a Test Set 54
Discover and Visualize the Data to Gain Insights 58
Visualizing Geographical Data 59
Looking for Correlations 62
Experimenting with Attribute Combinations 65
Prepare the Data for Machine Learning Algorithms 66
Data Cleaning 67
Handling Text and Categorical Attributes 69
Custom Transformers 71
Feature Scaling 72
Transformation Pipelines 73
Select and Train a Model 75
Training and Evaluating on the Training Set 75
Better Evaluation Using Cross-Validation 76
Fine-Tune Your Model 79
Grid Search 79
Randomized Search 81
Ensemble Methods 82
Analyze the Best Models and Their Errors 82
Evaluate Your System on the Test Set 83
Launch, Monitor, and Maintain Your System 84
Try It Out! 85
Exercises 85
3. Classification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
MNIST 87
Training a Binary Classifier 90
Performance Measures 90
Measuring Accuracy Using Cross-Validation 91
Confusion Matrix 92
Precision and Recall 94
Precision/Recall Tradeoff 95
The ROC Curve 99
Multiclass Classification 102
Error Analysis 104
Multilabel Classification 108
Multioutput Classification 109
Exercises 110
4. Training Models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Linear Regression 114
The Normal Equation 116
Computational Complexity 119
Gradient Descent 119
Batch Gradient Descent 123
Stochastic Gradient Descent 126
Mini-batch Gradient Descent 129
Polynomial Regression 130
Learning Curves 132
Regularized Linear Models 136
Ridge Regression 137
Lasso Regression 139
Elastic Net 142
Early Stopping 142
Logistic Regression 144
Estimating Probabilities 144
Training and Cost Function 145
Decision Boundaries 146
Softmax Regression 149
Exercises 153
5. Support Vector Machines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Linear SVM Classification 155
Soft Margin Classification 156
Nonlinear SVM Classification 159
Polynomial Kernel 160
Adding Similarity Features 161
Gaussian RBF Kernel 162
Computational Complexity 163
SVM Regression 164
Under the Hood 166
Decision Function and Predictions 166
Training Objective 167
Quadratic Programming 169
The Dual Problem 170
Kernelized SVM 171
Online SVMs 174
Exercises 175
6. Decision Trees. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Training and Visualizing a Decision Tree 177
Making Predictions 179
Estimating Class Probabilities 181
The CART Training Algorithm 182
Computational Complexity 183
Gini Impurity or Entropy? 183
Regularization Hyperparameters 184
Regression 185
Instability 188
Exercises 189
7. Ensemble Learning and Random Forests. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Voting Classifiers 192
Bagging and Pasting 195
Bagging and Pasting in Scikit-Learn 196
Out-of-Bag Evaluation 197
Random Patches and Random Subspaces 198
Random Forests 199
Extra-Trees 200
Feature Importance 200
Boosting 201
AdaBoost 202
Gradient Boosting 205
Stacking 210
Exercises 213
8. Dimensionality Reduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
The Curse of Dimensionality 216
Main Approaches for Dimensionality Reduction 218
Projection 218
Manifold Learning 220
PCA 222
Preserving the Variance 222
Principal Components 223
Projecting Down to d Dimensions 224
Using Scikit-Learn 224
Explained Variance Ratio 225
Choosing the Right Number of Dimensions 225
PCA for Compression 226
Randomized PCA 227
Incremental PCA 227
Kernel PCA 228
Selecting a Kernel and Tuning Hyperparameters 229
LLE 232
Other Dimensionality Reduction Techniques 234
Exercises 235
9. Unsupervised Learning Techniques. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Clustering 238
K-Means 240
Limits of K-Means 250
Using Clustering for Image Segmentation 251
Using Clustering for Preprocessing 252
Using Clustering for Semi-Supervised Learning 254
DBSCAN 256
Other Clustering Algorithms 259
Gaussian Mixtures 260
Anomaly Detection Using Gaussian Mixtures 266
Selecting the Number of Clusters 267
Bayesian Gaussian Mixture Models 270
Other Anomaly Detection and Novelty Detection Algorithms 274
Part II. Neural Networks and Deep Learning
10. Introduction to Artificial Neural Networks with Keras. . . . . . . . . . . . . . . . . . . . . . . . . . 277
From Biological to Artificial Neurons 278
Biological Neurons 279
Logical Computations with Neurons 281
The Perceptron 281
Multi-Layer Perceptron and Backpropagation 286
Regression MLPs 289
Classification MLPs 290
Implementing MLPs with Keras 292
Installing TensorFlow 2 293
Building an Image Classifier Using the Sequential API 294
Building a Regression MLP Using the Sequential API 303
Building Complex Models Using the Functional API 304
Building Dynamic Models Using the Subclassing API 309
Saving and Restoring a Model 311
Using Callbacks 311
Visualization Using TensorBoard 313
Fine-Tuning Neural Network Hyperparameters 315
Number of Hidden Layers 319
Number of Neurons per Hidden Layer 320
Learning Rate, Batch Size and Other Hyperparameters 320
Exercises 322
11. Training Deep Neural Networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
Vanishing/Exploding Gradients Problems 326
Glorot and He Initialization 327
Nonsaturating Activation Functions 329
Batch Normalization 333
Gradient Clipping 338
Reusing Pretrained Layers 339
Transfer Learning With Keras 341
Unsupervised Pretraining 343
Pretraining on an Auxiliary Task 344
Faster Optimizers 344
Momentum Optimization 345
Nesterov Accelerated Gradient 346
AdaGrad 347
RMSProp 349
Adam and Nadam Optimization 349
Learning Rate Scheduling 352
Avoiding Overfitting Through Regularization 356
ℓ1 and ℓ2 Regularization 356
Dropout 357
Monte-Carlo (MC) Dropout 360
Max-Norm Regularization 362
Summary and Practical Guidelines 363
Exercises 364
12. Custom Models and Training with TensorFlow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
A Quick Tour of TensorFlow 368
Using TensorFlow like NumPy 371
Tensors and Operations 371
Tensors and NumPy 373
Type Conversions 374
Variables 374
Other Data Structures 375
Customizing Models and Training Algorithms 376
Custom Loss Functions 376
Saving and Loading Models That Contain Custom Components 377
Custom Activation Functions, Initializers, Regularizers, and Constraints 379
Custom Metrics 380
Custom Layers 383
Custom Models 386
Losses and Metrics Based on Model Internals 388
Computing Gradients Using Autodiff 389
Custom Training Loops 393
TensorFlow Functions and Graphs 396
Autograph and Tracing 398
TF Function Rules 400
13. Loading and Preprocessing Data with TensorFlow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
The Data API 404
Chaining Transformations 405
Shuffling the Data 406
Preprocessing the Data 409
Putting Everything Together 410
Prefetching 411
Using the Dataset With tf.keras 413
The TFRecord Format 414
Compressed TFRecord Files 415
A Brief Introduction to Protocol Buffers 415
TensorFlow Protobufs 416
Loading and Parsing Examples 418
Handling Lists of Lists Using the SequenceExample Protobuf 419
The Features API 420
Categorical Features 421
Crossed Categorical Features 421
Encoding Categorical Features Using One-Hot Vectors 422
Encoding Categorical Features Using Embeddings 423
Using Feature Columns for Parsing 426
Using Feature Columns in Your Models 426
TF Transform 428
The TensorFlow Datasets (TFDS) Project 429
14. Deep Computer Vision Using Convolutional Neural Networks. . . . . . . . . . . . . . . . . . . 431
The Architecture of the Visual Cortex 432
Convolutional Layer 434
Filters 436
Stacking Multiple Feature Maps 437
TensorFlow Implementation 439
Memory Requirements 441
Pooling Layer 442
TensorFlow Implementation 444
CNN Architectures 446
LeNet-5 449
AlexNet 450
GoogLeNet 452
VGGNet 456
ResNet 457
Xception 459
SENet 461
Implementing a ResNet-34 CNN Using Keras 464
Using Pretrained Models From Keras 465
Pretrained Models for Transfer Learning 467
Classification and Localization 469
Object Detection 471
Fully Convolutional Networks (FCNs) 473
You Only Look Once (YOLO) 475
Semantic Segmentation 478
Exercises 482
https://www.knowledgeisle.com/wp-content/uploads/2019/12/2-Aur%C3%A9lien-G%C3%A9ron-Hands-On-Machine-Learning-with-Scikit-Learn-Keras-and-Tensorflow_-Concepts-Tools-and-Techniques-to-Build-Intelligent-Systems-O%E2%80%99Reilly-Media-2019.pdf
Monday, 31 December 2018
Foundations of Machine Learning
Contents
Preface xiii
1 Introduction 1
1.1 What is machine learning? 1
1.2 What kind of problems can be tackled using machine learning? 2
1.3 Some standard learning tasks 3
1.4 Learning stages 4
1.5 Learning scenarios 6
1.6 Generalization 7
2 The PAC Learning Framework 9
2.1 The PAC learning model 9
2.2 Guarantees for finite hypothesis sets — consistent case 15
2.3 Guarantees for finite hypothesis sets — inconsistent case 19
2.4 Generalities 21
2.4.1 Deterministic versus stochastic scenarios 21
2.4.2 Bayes error and noise 22
2.5 Chapter notes 23
2.6 Exercises 23
3 Rademacher Complexity and VC-Dimension 29
3.1 Rademacher complexity 30
3.2 Growth function 34
3.3 VC-dimension 36
3.4 Lower bounds 43
3.5 Chapter notes 48
3.6 Exercises 50
4 Model Selection 61
4.1 Estimation and approximation errors 61
4.2 Empirical risk minimization (ERM) 62
4.3 Structural risk minimization (SRM) 64
4.4 Cross-validation 68
4.5 n-Fold cross-validation 71
4.6 Regularization-based algorithms 72
4.7 Convex surrogate losses 73
4.8 Chapter notes 77
4.9 Exercises 78
5 Support Vector Machines 79
5.1 Linear classification 79
5.2 Separable case 80
5.2.1 Primal optimization problem 81
5.2.2 Support vectors 83
5.2.3 Dual optimization problem 83
5.2.4 Leave-one-out analysis 85
5.3 Non-separable case 87
5.3.1 Primal optimization problem 88
5.3.2 Support vectors 89
5.3.3 Dual optimization problem 90
5.4 Margin theory 91
5.5 Chapter notes 100
5.6 Exercises 100
6 Kernel Methods 105
6.1 Introduction 105
6.2 Positive definite symmetric kernels 108
6.2.1 Definitions 108
6.2.2 Reproducing kernel Hilbert space 110
6.2.3 Properties 112
6.3 Kernel-based algorithms 116
6.3.1 SVMs with PDS kernels 116
6.3.2 Representer theorem 117
6.3.3 Learning guarantees 117
6.4 Negative definite symmetric kernels 119
6.5 Sequence kernels 121
6.5.1 Weighted transducers 122
6.5.2 Rational kernels 126
6.6 Approximate kernel feature maps 130
6.7 Chapter notes 135
6.8 Exercises 137
7 Boosting 145
7.1 Introduction 145
7.2 AdaBoost 146
7.2.1 Bound on the empirical error 149
7.2.2 Relationship with coordinate descent 150
7.2.3 Practical use 154
7.3 Theoretical results 154
7.3.1 VC-dimension-based analysis 154
7.3.2 L1-geometric margin 155
7.3.3 Margin-based analysis 157
7.3.4 Margin maximization 161
7.3.5 Game-theoretic interpretation 162
7.4 L1-regularization 165
7.5 Discussion 167
7.6 Chapter notes 168
7.7 Exercises 170
8 On-Line Learning 177
8.1 Introduction 178
8.2 Prediction with expert advice 178
8.2.1 Mistake bounds and Halving algorithm 179
8.2.2 Weighted majority algorithm 181
8.2.3 Randomized weighted majority algorithm 183
8.2.4 Exponential weighted average algorithm 186
8.3 Linear classification 190
8.3.1 Perceptron algorithm 190
8.3.2 Winnow algorithm 198
8.4 On-line to batch conversion 201
8.5 Game-theoretic connection 204
8.6 Chapter notes 205
8.7 Exercises 206
9 Multi-Class Classification 213
9.1 Multi-class classification problem 213
9.2 Generalization bounds 215
9.3 Uncombined multi-class algorithms 221
9.3.1 Multi-class SVMs 221
9.3.2 Multi-class boosting algorithms 222
9.3.3 Decision trees 224
9.4 Aggregated multi-class algorithms 228
9.4.1 One-versus-all 229
9.4.2 One-versus-one 229
9.4.3 Error-correcting output codes 231
9.5 Structured prediction algorithms 233
9.6 Chapter notes 235
9.7 Exercises 237
10 Ranking 239
10.1 The problem of ranking 240
10.2 Generalization bound 241
10.3 Ranking with SVMs 243
10.4 RankBoost 244
10.4.1 Bound on the empirical error 246
10.4.2 Relationship with coordinate descent 248
10.4.3 Margin bound for ensemble methods in ranking 250
10.5 Bipartite ranking 251
10.5.1 Boosting in bipartite ranking 252
10.5.2 Area under the ROC curve 255
10.6 Preference-based setting 257
10.6.1 Second-stage ranking problem 257
10.6.2 Deterministic algorithm 259
10.6.3 Randomized algorithm 260
10.6.4 Extension to other loss functions 262
10.7 Other ranking criteria 262
10.8 Chapter notes 263
10.9 Exercises 264
11 Regression 267
11.1 The problem of regression 267
11.2 Generalization bounds 268
11.2.1 Finite hypothesis sets 268
11.2.2 Rademacher complexity bounds 269
11.2.3 Pseudo-dimension bounds 271
11.3 Regression algorithms 275
11.3.1 Linear regression 275
11.3.2 Kernel ridge regression 276
11.3.3 Support vector regression 281
11.3.4 Lasso 285
11.3.5 Group norm regression algorithms 289
11.3.6 On-line regression algorithms 289
11.4 Chapter notes 290
11.5 Exercises 292
12 Maximum Entropy Models 295
12.1 Density estimation problem 295
12.1.1 Maximum Likelihood (ML) solution 296
12.1.2 Maximum a Posteriori (MAP) solution 297
12.2 Density estimation problem augmented with features 297
12.3 Maxent principle 298
12.4 Maxent models 299
12.5 Dual problem 299
12.6 Generalization bound 303
12.7 Coordinate descent algorithm 304
12.8 Extensions 306
12.9 L2-regularization 308
12.10 Chapter notes 312
12.11 Exercises 313
13 Conditional Maximum Entropy Models 315
13.1 Learning problem 315
13.2 Conditional Maxent principle 316
13.3 Conditional Maxent models 316
13.4 Dual problem 317
13.5 Properties 319
13.5.1 Optimization problem 320
13.5.2 Feature vectors 320
13.5.3 Prediction 321
13.6 Generalization bounds 321
13.7 Logistic regression 325
13.7.1 Optimization problem 325
13.7.2 Logistic model 325
13.8 L2-regularization 326
13.9 Proof of the duality theorem 328
13.10 Chapter notes 330
13.11 Exercises 331
14 Algorithmic Stability 333
14.1 Definitions 333
14.2 Stability-based generalization guarantee 334
14.3 Stability of kernel-based regularization algorithms 336
14.3.1 Application to regression algorithms: SVR and KRR 339
14.3.2 Application to classification algorithms: SVMs 341
14.3.3 Discussion 342
14.4 Chapter notes 342
14.5 Exercises 343
15 Dimensionality Reduction 347
15.1 Principal component analysis 348
15.2 Kernel principal component analysis (KPCA) 349
15.3 KPCA and manifold learning 351
15.3.1 Isomap 351
15.3.2 Laplacian eigenmaps 352
15.3.3 Locally linear embedding (LLE) 353
15.4 Johnson-Lindenstrauss lemma 354
15.5 Chapter notes 356
15.6 Exercises 356
16 Learning Automata and Languages 359
16.1 Introduction 359
16.2 Finite automata 360
16.3 Efficient exact learning 361
16.3.1 Passive learning 362
16.3.2 Learning with queries 363
16.3.3 Learning automata with queries 364
16.4 Identification in the limit 369
16.4.1 Learning reversible automata 370
16.5 Chapter notes 375
16.6 Exercises 376
17 Reinforcement Learning 379
17.1 Learning scenario 379
17.2 Markov decision process model 380
17.3 Policy 381
17.3.1 Definition 381
17.3.2 Policy value 382
17.3.3 Optimal policies 382
17.3.4 Policy evaluation 385
17.4 Planning algorithms 387
17.4.1 Value iteration 387
17.4.2 Policy iteration 390
17.4.3 Linear programming 392
17.5 Learning algorithms 393
17.5.1 Stochastic approximation 394
17.5.2 TD(0) algorithm 397
17.5.3 Q-learning algorithm 398
17.5.4 SARSA 402
17.5.5 TD(λ) algorithm 402
17.5.6 Large state space 403
17.6 Chapter notes 405
Conclusion 407
A Linear Algebra Review 409
A.1 Vectors and norms 409
A.1.1 Norms 409
A.1.2 Dual norms 410
A.1.3 Relationship between norms 411
A.2 Matrices 411
A.2.1 Matrix norms 411
A.2.2 Singular value decomposition 412
A.2.3 Symmetric positive semidefinite (SPSD) matrices 412
B Convex Optimization 415
B.1 Differentiation and unconstrained optimization 415
B.2 Convexity 415
B.3 Constrained optimization 419
B.4 Fenchel duality 422
B.4.1 Subgradients 422
B.4.2 Core 423
B.4.3 Conjugate functions 423
B.5 Chapter notes 426
B.6 Exercises 427
C Probability Review 429
C.1 Probability 429
C.2 Random variables 429
C.3 Conditional probability and independence 431
C.4 Expectation and Markov’s inequality 431
C.5 Variance and Chebyshev’s inequality 432
C.6 Moment-generating functions 434
C.7 Exercises 435
D Concentration Inequalities 437
D.1 Hoeffding’s inequality 437
D.2 Sanov’s theorem 438
D.3 Multiplicative Chernoff bounds 439
D.4 Binomial distribution tails: Upper bounds 440
D.5 Binomial distribution tails: Lower bound 440
D.6 Azuma’s inequality 441
D.7 McDiarmid’s inequality 442
D.8 Normal distribution tails: Lower bound 443
D.9 Khintchine-Kahane inequality 443
D.10 Maximal inequality 444
D.11 Chapter notes 445
D.12 Exercises 445
E Notions of Information Theory 449
E.1 Entropy 449
E.2 Relative entropy 450
E.3 Mutual information 453
E.4 Bregman divergences 453
E.5 Chapter notes 456
E.6 Exercises 457
F Notation 459
Bibliography 461
Index 475
https://www.dropbox.com/s/38p0j6ds5q9c8oe/10290.pdf
Natural Language Processing Succinctly
OVERVIEW
AI assistants represent a significant frontier for development, but the complexity of such systems poses a substantial barrier for developers. In Natural Language Processing Succinctly, author Joseph Booth guides readers through designing a simple system that can interpret and provide reasonable responses to written English text. With this foundation, readers will be prepared to tackle the greater challenges of natural language development.
TABLE OF CONTENTS
Natural Language Processing
What We're Building
Extracting Sentences
Extracting Words
Tagging
Entity Recognition
Knowledge Base
Answering Questions
Cloudmersive
Google Cloud NLP API
Microsoft Cognitive Services
Other NLP Uses
Summary
Penn Treebank Tags
Universal POS Tags
About the Code
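The table of contents above outlines a pipeline that starts by extracting sentences, then words, from raw text. As a rough illustration of those first two steps, here is a minimal Python sketch (not taken from the book, which builds its own, more careful implementation; the function names are mine):

```python
import re

def extract_sentences(text):
    """Naively split text into sentences on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def extract_words(sentence):
    """Pull out word tokens, ignoring punctuation."""
    return re.findall(r"[A-Za-z']+", sentence)

text = "What is NLP? It helps software interpret written English."
for sentence in extract_sentences(text):
    print(extract_words(sentence))
```

A regex splitter like this mishandles abbreviations such as "Dr."; the later tagging and entity-recognition chapters are what turn these raw tokens into something a question-answering system can use.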
https://www.syncfusion.com/succinctly-free-ebooks/natural-language-processing-succinctly
Friday, 30 November 2018
Machine Learning For Dummies
Table of Contents
INTRODUCTION 1
About This Book 1
Foolish Assumptions 2
Icons Used in This Book 2
CHAPTER 1: Understanding Machine Learning 3
What Is Machine Learning? 4
Iterative learning from data 5
What's old is new again 5
Defining Big Data 6
Big Data in Context with Machine Learning 7
The Need to Understand and Trust your Data 8
The Importance of the Hybrid Cloud 9
Leveraging the Power of Machine Learning 9
Descriptive analytics 10
Predictive analytics 10
The Roles of Statistics and Data Mining with Machine Learning 11
Putting Machine Learning in Context 12
Approaches to Machine Learning 14
Supervised learning 15
Unsupervised learning 15
Reinforcement learning 16
Neural networks and deep learning 17
CHAPTER 2: Applying Machine Learning 19
Getting Started with a Strategy 19
Using machine learning to remove biases from strategy 20
More data makes planning more accurate 22
Understanding Machine Learning Techniques 22
Tying Machine Learning Methods to Outcomes 23
Applying Machine Learning to Business Needs 23
Understanding why customers are leaving 24
Recognizing who has committed a crime 25
Preventing accidents from happening 26
CHAPTER 3: Looking Inside Machine Learning 27
The Impact of Machine Learning on Applications 28
The role of algorithms 28
Types of machine learning algorithms 29
Training machine learning systems 33
Data Preparation 34
Identify relevant data 34
Governing data 36
The Machine Learning Cycle 37
CHAPTER 4: Getting Started with Machine Learning 39
Understanding How Machine Learning Can Help 39
Focus on the Business Problem 40
Bringing data silos together 41
Avoiding trouble before it happens 42
Getting customer focused 43
Machine Learning Requires Collaboration 43
Executing a Pilot Project 44
Step 1: Define an opportunity for growth 44
Step 2: Conducting a pilot project 44
Step 3: Evaluation 45
Step 4: Next actions 45
Determining the Best Learning Model 46
Tools to determine algorithm selection 46
Approaching tool selection 47
CHAPTER 5: Learning Machine Skills 49
Defining the Skills That You Need 49
Getting Educated 53
IBM-Recommended Resources 56
CHAPTER 6: Using Machine Learning to Provide Solutions to Business Problems 57
Applying Machine Learning to Patient Health 57
Leveraging IoT to Create More Predictable Outcomes 58
Proactively Responding to IT Issues 59
Protecting Against Fraud 60
CHAPTER 7: Ten Predictions on the Future of Machine Learning 63
Saturday, 30 December 2017
Deep Learning for Natural Language Processing
Contents
III Data Preparation 34
IV Bag of Words 61
V Word Embeddings 114
VI Text Classification 144
VII Language Modeling 189
VIII Image Captioning 244
IX Machine Translation 331
X Appendix 372
XI Conclusions 395
Thursday, 31 July 2014
Natural Language Processing with Python [html edition]
Natural Language Processing with Python
– Analyzing Text with the Natural Language Toolkit
Steven Bird, Ewan Klein, and Edward Loper
This version of the NLTK book is updated for Python 3 and NLTK 3. The first edition of the book, published by O'Reilly, is available at http://nltk.org/book_1ed/. (There are currently no plans for a second edition of the book.)
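As a quick taste of what the toolkit does (a minimal sketch, assuming NLTK 3 with its standard tokenizer and tagger models downloaded; the sample sentence is mine, but the calls are standard NLTK):

```python
import nltk

# One-time downloads of the tokenizer and part-of-speech tagger models.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

text = "NLTK makes it easy to tokenize raw text and tag parts of speech."
tokens = nltk.word_tokenize(text)  # covered in Chapter 3, Processing Raw Text
tagged = nltk.pos_tag(tokens)      # covered in Chapter 5, Categorizing and Tagging Words
print(tagged)
```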
0. Preface
1. Language Processing and Python
2. Accessing Text Corpora and Lexical Resources
3. Processing Raw Text
4. Writing Structured Programs
5. Categorizing and Tagging Words (minor fixes still required)
6. Learning to Classify Text
7. Extracting Information from Text
8. Analyzing Sentence Structure
9. Building Feature Based Grammars
10. Analyzing the Meaning of Sentences (minor fixes still required)
11. Managing Linguistic Data (minor fixes still required)
12. Afterword: Facing the Language Challenge
Bibliography
Term Index