Scalable Data Analysis in Python with Dask [Video]

Mohammed Kashif

Build high-performance, distributed, and parallel applications in Dask

Video Description

Data analysts, machine learning professionals, and data scientists often use tools such as Pandas, Scikit-Learn, and NumPy for data analysis on a personal computer. However, when they want to apply their analyses to larger datasets, these tools fail to scale beyond a single machine, and the analyst is forced to rewrite the computation.

If you work on big data and you’re using Pandas, you know you can end up waiting up to a whole minute for a simple average of a series. And that’s just for a couple of million rows!

In this course, you’ll learn to scale your data analysis. First, you will execute distributed data science projects with Dask, from data ingestion through data manipulation to visualization. Then, you will explore the Dask framework. After that, you will see how Dask can be used with other common Python tools such as NumPy, Pandas, matplotlib, Scikit-learn, and more.

You’ll work with large datasets, performing exploratory data analysis to investigate them and present your findings. You’ll learn by applying different statistical techniques to the same massive datasets across different systems.

Throughout the course, we’ll go over the various techniques, modules, and features that Dask has to offer. Finally, you’ll learn to use its unique offering for machine learning, using the Dask-ML package. You’ll also start using parallel processing in your data tasks on your own system without moving to the distributed environment.
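As a rough illustration of parallel processing on a single machine, the sketch below uses dask.delayed to build a small task graph and run it on the local threaded scheduler. The load and partial_sum functions are hypothetical stand-ins invented for this example and are not taken from the course material.

```python
from dask import delayed


def load(part):
    # Hypothetical stand-in for reading one chunk of a larger dataset.
    return list(range(part * 5, part * 5 + 5))


def partial_sum(values):
    # Reduce one chunk to a single number.
    return sum(values)


# Build a lazy task graph: nothing runs until compute() is called.
partials = [delayed(partial_sum)(delayed(load)(i)) for i in range(4)]
total = delayed(sum)(partials)

# Execute the graph with the local threaded scheduler (no cluster needed).
result = total.compute(scheduler="threads")
print(result)
```

The four chunks are independent of one another, so the scheduler is free to process them concurrently before combining the partial sums.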

All the code files and related files are uploaded on GitHub at this link: https://github.com/PacktPublishing/-Scalable-Data-Analysis-in-Python-with-Dask

Style and Approach

This hands-on course covers the important components of Dask (arrays, bags, DataFrames, schedulers, and the Futures API) so that you can parallelize your existing Python code and perform computations in a distributed setting. It keeps theory to a minimum and focuses on practical implementation, with step-by-step instructions to get you up and running.
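To give a feel for two of the collections named above, here is a minimal sketch using a Dask array and a Dask bag. The data is made up for illustration, and the chunk and partition sizes are arbitrary assumptions rather than recommendations from the course.

```python
import numpy as np
import dask.array as da
import dask.bag as db

# Dask array: a NumPy-like array split into chunks that are processed in parallel.
x = da.from_array(np.arange(1_000_000), chunks=250_000)
array_mean = x.mean().compute()

# Dask bag: a partitioned collection of generic Python objects.
words = db.from_sequence(["dask", "numpy", "pandas", "dask"], npartitions=2)
counts = dict(words.frequencies().compute())

print(array_mean)
print(counts)
```

Both collections are lazy: the expressions only describe the work, and compute() triggers execution on whichever scheduler is active.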

What You Will Learn

  • Understand the concept of blocked algorithms and how Dask leverages them to load large data
  • Implement various examples using Dask Arrays, Bags, and Dask DataFrames for efficient parallel computing
  • Combine Dask with existing Python packages such as NumPy and Pandas
  • See how Dask works under the hood and the various built-in algorithms it has to offer
  • Leverage the power of Dask in a distributed setting and explore its various schedulers
  • Implement an end-to-end Machine Learning pipeline in a distributed setting using Dask and scikit-learn
  • Use Dask Arrays, Bags, and Dask DataFrames for parallel and out-of-core computations
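The idea behind a blocked algorithm can be sketched in plain Python, without Dask: split the data into blocks, reduce each block to a small partial result, then combine the partials. The function names below are hypothetical, and this is only an illustration of the concept, not Dask's actual implementation.

```python
def iter_blocks(values, block_size):
    """Yield consecutive slices of at most block_size items."""
    for start in range(0, len(values), block_size):
        yield values[start:start + block_size]


def blocked_mean(values, block_size):
    # Reduce each block to a (sum, count) pair; only these small
    # partial results need to be held together at the end.
    partials = [(sum(block), len(block)) for block in iter_blocks(values, block_size)]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = list(range(1, 101))
block_lengths = [len(block) for block in iter_blocks(data, 30)]
mean_value = blocked_mean(data, block_size=30)
print(block_lengths, mean_value)
```

Because each block is reduced independently, the per-block step could in principle run in parallel or on data that is read from disk one block at a time.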

Video Details

ISBN 13: 9781789808926
Course Length: 3 hours 31 minutes