Data science for dummies

Monetize your company's data and data science expertise without spending a fortune on hiring independent strategy consultants to help. What if there was one simple, clear process for ensuring that all your company's data science projects achieve a high return on investment? What if you co...

Bibliographic Details
Main Author: Pierson, Lillian (Author)
Format: Electronic eBook
Language: English
Published: Hoboken, NJ : For Dummies, 2021.
Edition: Third edition.
Table of Contents:
  • Introduction
      • About This Book
      • Foolish Assumptions
      • Icons Used in This Book
      • Beyond the Book
      • Where to Go from Here
  • Part 1: Getting Started with Data Science
  • Chapter 1: Wrapping Your Head Around Data Science
      • Seeing Who Can Make Use of Data Science
      • Inspecting the Pieces of the Data Science Puzzle
          • Collecting, querying, and consuming data
          • Applying mathematical modeling to data science tasks
          • Deriving insights from statistical methods
          • Coding, coding, coding - it's just part of the game
          • Applying data science to a subject area
          • Communicating data insights
      • Exploring Career Alternatives That Involve Data Science
          • The data implementer
          • The data leader
          • The data entrepreneur
  • Chapter 2: Tapping into Critical Aspects of Data Engineering
      • Defining Big Data and the Three Vs
          • Grappling with data volume
          • Handling data velocity
          • Dealing with data variety
      • Identifying Important Data Sources
      • Grasping the Differences among Data Approaches
          • Defining data science
          • Defining machine learning engineering
          • Defining data engineering
          • Comparing machine learning engineers, data scientists, and data engineers
      • Storing and Processing Data for Data Science
          • Storing data and doing data science directly in the cloud
          • Storing big data on-premise
          • Processing big data in real-time
  • Part 2: Using Data Science to Extract Meaning from Your Data
  • Chapter 3: Machine Learning Means Using a Machine to Learn from Data
      • Defining Machine Learning and Its Processes
          • Walking through the steps of the machine learning process
          • Becoming familiar with machine learning terms
      • Considering Learning Styles
          • Learning with supervised algorithms
          • Learning with unsupervised algorithms
          • Learning with reinforcement
      • Seeing What You Can Do
          • Selecting algorithms based on function
          • Using Spark to generate real-time big data analytics
  • Chapter 4: Math, Probability, and Statistical Modeling
      • Exploring Probability and Inferential Statistics
          • Probability distributions
          • Conditional probability with Naïve Bayes
      • Quantifying Correlation
          • Calculating correlation with Pearson's r
          • Ranking variable-pairs using Spearman's rank correlation
      • Reducing Data Dimensionality with Linear Algebra
          • Decomposing data to reduce dimensionality
          • Reducing dimensionality with factor analysis
          • Decreasing dimensionality and removing outliers with PCA
      • Modeling Decisions with Multiple Criteria Decision-Making
          • Turning to traditional MCDM
          • Focusing on fuzzy MCDM
      • Introducing Regression Methods
          • Linear regression
          • Logistic regression
          • Ordinary least squares (OLS) regression methods
      • Detecting Outliers
          • Analyzing extreme values
          • Detecting outliers with univariate analysis
          • Detecting outliers with multivariate analysis
      • Introducing Time Series Analysis
          • Identifying patterns in time series
          • Modeling univariate time series data
  • Chapter 5: Grouping Your Way into Accurate Predictions
      • Starting with Clustering Basics
          • Getting to know clustering algorithms
          • Examining clustering similarity metrics
      • Identifying Clusters in Your Data
          • Clustering with the k-means algorithm
          • Estimating clusters with kernel density estimation (KDE)
          • Clustering with hierarchical algorithms
          • Dabbling in the DBScan neighborhood
      • Categorizing Data with Decision Tree and Random Forest Algorithms
      • Drawing a Line between Clustering and Classification
          • Introducing instance-based learning classifiers
          • Getting to know classification algorithms
      • Making Sense of Data with Nearest Neighbor Analysis
      • Classifying Data with Average Nearest Neighbor Algorithms
      • Classifying with K-Nearest Neighbor Algorithms
          • Understanding how the k-nearest neighbor algorithm works
          • Knowing when to use the k-nearest neighbor algorithm
          • Exploring common applications of k-nearest neighbor algorithms
      • Solving Real-World Problems with Nearest Neighbor Algorithms
          • Seeing k-nearest neighbor algorithms in action
          • Seeing average nearest neighbor algorithms in action
  • Chapter 6: Coding Up Data Insights and Decision Engines
      • Seeing Where Python and R Fit into Your Data Science Strategy
      • Using Python for Data Science
          • Sorting out the various Python data types
          • Putting loops to good use in Python
          • Having fun with functions
          • Keeping cool with classes
          • Checking out some useful Python libraries
      • Using Open Source R for Data Science
          • Comprehending R's basic vocabulary
          • Delving into functions and operators
          • Iterating in R
          • Observing how objects work
          • Sorting out R's popular statistical analysis packages
          • Examining packages for visualizing, mapping, and graphing in R
  • Chapter 7: Generating Insights with Software Applications
      • Choosing the Best Tools for Your Data Science Strategy
      • Getting a Handle on SQL and Relational Databases
      • Investing Some Effort into Database Design
          • Defining data types
          • Designing constraints properly
          • Normalizing your database
      • Narrowing the Focus with SQL Functions
      • Making Life Easier with Excel
          • Using Excel to quickly get to know your data
          • Reformatting and summarizing with PivotTables
          • Automating Excel tasks with macros
  • Chapter 8: Telling Powerful Stories with Data
      • Data Visualizations: The Big Three
          • Data storytelling for decision makers
          • Data showcasing for analysts
          • Designing data art for activists
      • Designing to Meet the Needs of Your Target Audience
          • Step 1: Brainstorm (All about Eve)
          • Step 2: Define the purpose
          • Step 3: Choose the most functional visualization type for your purpose
      • Picking the Most Appropriate Design Style
          • Inducing a calculating, exacting response
          • Eliciting a strong emotional response
      • Selecting the Appropriate Data Graphic Type
          • Standard chart graphics
          • Comparative graphics
          • Statistical plots
          • Topology structures
          • Spatial plots and maps
      • Testing Data Graphics
      • Adding Context
          • Creating context with data
          • Creating context with annotations
          • Creating context with graphical elements
  • Part 3: Taking Stock of Your Data Science Capabilities
  • Chapter 9: Developing Your Business Acumen
      • Bridging the Business Gap
          • Contrasting business acumen with subject matter expertise
          • Defining business acumen
      • Traversing the Business Landscape
          • Seeing how data roles support the business in making money
          • Leveling up your business acumen
          • Fortifying your leadership skills
      • Surveying Use Cases and Case Studies
          • Documen