Becoming a Data Head

How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning
Description
"Turn yourself into a Data Head. You'll become a more valuable employee and make your organization more successful."Thomas H. Davenport, Research Fellow, Author of Competing on Analytics, Big Data @ Work, and The AI AdvantageYou've heard the hype around data--now get the facts.In Becoming a Data Head: How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning, award-winning data scientists Alex Gutman and Jordan Goldmeier pull back the curtain on data science and give you the language and tools necessary to talk and think critically about it.You'll learn how to:* Think statistically and understand the role variation plays in your life and decision making* Speak intelligently and ask the right questions about the statistics and results you encounter in the workplace* Understand what's really going on with machine learning, text analytics, deep learning, and artificial intelligence* Avoid common pitfalls when working with and interpreting dataBecoming a Data Head is a complete guide for data science in the workplace: covering everything from the personalities you'll work with to the math behind the algorithms. The authors have spent years in data trenches and sought to create a fun, approachable, and eminently readable book. Anyone can become a Data Head--an active participant in data science, statistics, and machine learning. Whether you're a business professional, engineer, executive, or aspiring data scientist, this book is for you.
Table of Contents
Acknowledgments
Foreword
Introduction

Part One: Thinking Like a Data Head

Chapter 1: What Is the Problem?
Questions a Data Head Should Ask
Why Is This Problem Important?
Who Does This Problem Affect?
What If We Don't Have the Right Data?
When Is the Project Over?
What If We Don't Like the Results?
Understanding Why Data Projects Fail
Customer Perception
Discussion
Working on Problems That Matter
Chapter Summary

Chapter 2: What Is Data?
Data vs. Information
An Example Dataset
Data Types
How Data Is Collected and Structured
Observational vs. Experimental Data
Structured vs. Unstructured Data
Basic Summary Statistics
Chapter Summary

Chapter 3: Prepare to Think Statistically
Ask Questions
There Is Variation in All Things
Scenario: Customer Perception (The Sequel)
Case Study: Kidney-Cancer Rates
Probabilities and Statistics
Probability vs. Intuition
Discovery with Statistics
Chapter Summary

Part Two: Speaking Like a Data Head

Chapter 4: Argue with the Data
What Would You Do?
Missing Data Disaster
Tell Me the Data Origin Story
Who Collected the Data?
How Was the Data Collected?
Is the Data Representative?
Is There Sampling Bias?
What Did You Do with Outliers?
What Data Am I Not Seeing?
How Did You Deal with Missing Values?
Can the Data Measure What You Want It to Measure?
Argue with Data of All Sizes
Chapter Summary

Chapter 5: Explore the Data
Exploratory Data Analysis and You
Embracing the Exploratory Mindset
Questions to Guide You
The Setup
Can the Data Answer the Question?
Set Expectations and Use Common Sense
Do the Values Make Intuitive Sense?
Watch Out: Outliers and Missing Values
Did You Discover Any Relationships?
Understanding Correlation
Watch Out: Misinterpreting Correlation
Watch Out: Correlation Does Not Imply Causation
Did You Find New Opportunities in the Data?
Chapter Summary

Chapter 6: Examine the Probabilities
Take a Guess
The Rules of the Game
Notation
Conditional Probability and Independent Events
The Probability of Multiple Events
Two Things That Happen Together
One Thing or the Other
Probability Thought Exercise
Next Steps
Be Careful Assuming Independence
Don't Fall for the Gambler's Fallacy
All Probabilities Are Conditional
Don't Swap Dependencies
Bayes' Theorem
Ensure the Probabilities Have Meaning
Calibration
Rare Events Can, and Do, Happen
Chapter Summary

Chapter 7: Challenge the Statistics
Quick Lessons on Inference
Give Yourself Some Wiggle Room
More Data, More Evidence
Challenge the Status Quo
Evidence to the Contrary
Balance Decision Errors
The Process of Statistical Inference
The Questions You Should Ask to Challenge the Statistics
What Is the Context for These Statistics?
What Is the Sample Size?
What Are You Testing?
What Is the Null Hypothesis?
Assuming Equivalence
What Is the Significance Level?
How Many Tests Are You Doing?
Can I See the Confidence Intervals?
Is This Practically Significant?
Are You Assuming Causality?
Chapter Summary

Part Three: Understanding the Data Scientist's Toolbox

Chapter 8: Search for Hidden Groups
Unsupervised Learning
Dimensionality Reduction
Creating Composite Features
Principal Component Analysis
Principal Components in Athletic Ability
PCA Summary
Potential Traps
Clustering
k-Means Clustering
Clustering Retail Locations
Potential Traps
Chapter Summary

Chapter 9: Understand the Regression Model
Supervised Learning
Linear Regression: What It Does
Least Squares Regression: Not Just a Clever Name
Linear Regression: What It Gives You
Extending to Many Features
Linear Regression: What Confusion It Causes
Omitted Variables
Multicollinearity
Data Leakage
Extrapolation Failures
Many Relationships Aren't Linear
Are You Explaining or Predicting?
Regression Performance
Other Regression Models
Chapter Summary

Chapter 10: Understand the Classification Model
Introduction to Classification
What You'll Learn
Classification Problem Setup
Logistic Regression
Logistic Regression: So What?
Decision Trees
Ensemble Methods
Random Forests
Gradient Boosted Trees
Interpretability of Ensemble Models
Watch Out for Pitfalls
Misapplication of the Problem
Data Leakage
Not Splitting Your Data
Choosing the Right Decision Threshold
Misunderstanding Accuracy
Confusion Matrices
Chapter Summary

Chapter 11: Understand Text Analytics
Expectations of Text Analytics
How Text Becomes Numbers
A Big Bag of Words
N-Grams
Word Embeddings
Topic Modeling
Text Classification
Naïve Bayes
Sentiment Analysis
Practical Considerations When Working with Text
Big Tech Has the Upper Hand
Chapter Summary

Chapter 12: Conceptualize Deep Learning
Neural Networks
How Are Neural Networks Like the Brain?
A Simple Neural Network
How a Neural Network Learns
A Slightly More Complex Neural Network
Applications of Deep Learning
The Benefits of Deep Learning
How Computers "See" Images
Convolutional Neural Networks
Deep Learning on Language and Sequences
Deep Learning in Practice
Do You Have Data?
Is Your Data Structured?
What Will the Network Look Like?
Artificial Intelligence and You
Big Tech Has the Upper Hand
Ethics in Deep Learning
Chapter Summary

Part Four: Ensuring Success

Chapter 13: Watch Out for Pitfalls
Biases and Weird Phenomena in Data
Survivorship Bias
Regression to the Mean
Simpson's Paradox
Confirmation Bias
Effort Bias (aka the "Sunk Cost Fallacy")
Algorithmic Bias
Uncategorized Bias
The Big List of Pitfalls
Statistical and Machine Learning Pitfalls
Project Pitfalls
Chapter Summary

Chapter 14: Know the People and Personalities
Seven Scenes of Communication Breakdowns
The Postmortem
Storytime
The Telephone Game
Into the Weeds
The Reality Check
The Takeover
The Blowhard
Data Personalities
Data Enthusiasts
Data Cynics
Data Heads
Chapter Summary

Chapter 15: What's Next?

Index
ALEX J. GUTMAN, PhD, is a Data Scientist, Corporate Trainer, and Accredited Professional Statistician. His professional focus is on statistical and machine learning and he has extensive experience working as a Data Scientist for the Department of Defense and two Fortune 50 companies.
ISBN-13:
9781119741749
Year of publication:
2021
Publication date:
24 June 2021
Pages:
272
Author:
Alex J. Gutman
Weight:
368 g
Dimensions:
224x151x14 mm
Language:
German

38,50 €*

Delivery time: Special order - available within 10 business days
All prices include VAT | plus shipping