ADVANCED ANALYTICS WITH PYSPARK: patterns for learning from data at scale using Python and Spark

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world dataset...

Bibliographic Details
Main Authors: Tandon, Akash (Author), Owen, Sean (Author), Wills, Josh (Author), Ryza, Sandy (Author), Laserson, Uri, 1983- (Author)
Format: eBook
Language: English
Published: [S.l.] : O'REILLY MEDIA, 2022.
Subjects: SPARK (Electronic resource); Python (Computer program language); Data mining
Online Access: CONNECT
LEADER 03353cam a22005177a 4500
001 in00006151893
006 m o d
007 cr |n|||||||||
008 220618s2022 xx o 000 0 eng d
005 20220706142806.8
035 |a 1WRLDSHRon1330690750 
040 |a YDX  |b eng  |c YDX  |d ORMDA 
020 |a 9781098103620  |q (electronic bk.) 
020 |a 1098103629  |q (electronic bk.) 
020 |z 1098103653 
020 |z 9781098103651 
035 |a (OCoLC)1330690750 
037 |a 9781098103644  |b O'Reilly Media 
050 4 |a QA76.9.D343 
082 0 4 |a 006.3/12  |2 23/eng/20220621 
049 |a TXMM 
100 1 |a Tandon, Akash,  |e author. 
245 0 0 |a ADVANCED ANALYTICS WITH PYSPARK :  |b patterns for learning from data at scale using Python and Spark /  |c Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen & Josh Wills. 
260 |a [S.l.] :  |b O'REILLY MEDIA,  |c 2022. 
300 |a 1 online resource 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
520 |a The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world datasets to teach you how to approach analytics problems using PySpark, Spark's Python API, and other best practices in Spark programming. Data scientists Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills offer an introduction to the Spark ecosystem, then dive into patterns that apply common techniques (including classification, clustering, collaborative filtering, and anomaly detection) to fields such as genomics, security, and finance. This updated edition also covers NLP and image processing. If you have a basic understanding of machine learning and statistics and you program in Python, this book will get you started with large-scale data analysis. Familiarize yourself with Spark's programming model and ecosystem. Learn general approaches in data science. Examine complete implementations that analyze large public datasets. Discover which machine learning tools make sense for particular problems. Explore code that can be adapted to many uses. 
590 |a O'Reilly Online Learning Platform: Academic Edition (SAML SSO Access) 
630 0 0 |a SPARK (Electronic resource) 
650 0 |a Python (Computer program language) 
650 0 |a Data mining. 
700 1 |a Owen, Sean,  |e author. 
700 1 |a Wills, Josh,  |e author. 
700 1 |a Ryza, Sandy,  |e author. 
700 1 |a Laserson, Uri,  |d 1983-  |e author. 
730 0 |a WORLDSHARE SUB RECORDS 
776 0 8 |i Print version:  |z 1098103653  |z 9781098103651  |w (OCoLC)1272856308 
856 4 0 |u https://go.oreilly.com/middle-tennessee-state-university/library/view/-/9781098103644/?ar  |z CONNECT  |3 O'Reilly  |t 0 
949 |a ho0 
994 |a 92  |b TXM 
998 |a wi  |d z 
999 f f |s ab355ad0-b44f-4efd-a5af-b0dc1be3c58f  |i ab355ad0-b44f-4efd-a5af-b0dc1be3c58f  |t 0 
952 f f |a Middle Tennessee State University  |b Main  |c James E. Walker Library  |d Electronic Resources  |t 1  |e QA76.9.D343   |h Library of Congress classification 
856 4 0 |3 O'Reilly  |t 0  |u https://go.oreilly.com/middle-tennessee-state-university/library/view/-/9781098103644/?ar  |z CONNECT