Perform Data Engineering on Microsoft HDInsight (20775)

COURSE TYPE

Intermediate

Course Number

8491

Duration

5 Days

The main purpose of the course is to give students the ability to plan and implement big data workflows on HDInsight.

This is a Microsoft Official Course (MOC) delivered by a Learning Tree expert instructor.

Enhance your Microsoft skills and take your career to the next level with our Microsoft Training Success Pack.

You Will Learn How To

  • Explain Microsoft R
  • Transform and clean big data sets

Important Course Information

Requirements:

  • Programming experience using R, and familiarity with common R packages
  • Knowledge of common statistical methods and data analysis best practices
  • Basic knowledge of the Microsoft Windows operating system and its core functionality
  • Working knowledge of relational databases

Redeem Your Microsoft Training Vouchers (SATV):

Course Outline

  • Module 1: Getting Started with HDInsight

This module introduces Hadoop, the MapReduce paradigm, and HDInsight.

Lessons

  • Big Data
  • Hadoop
  • MapReduce
  • HDInsight

Lab : Querying Big Data

  • Query data with Hive
  • Visualize data with Excel

After completing this module, students will be able to:

  • Describe big data
  • Describe Hadoop
  • Describe MapReduce
  • Describe HDInsight

  • Module 2: Deploying HDInsight Clusters

At the end of this module the student will be able to deploy HDInsight clusters.

Lessons

  • HDInsight cluster types
  • Managing HDInsight Clusters
  • Managing HDInsight Clusters with PowerShell

Lab : Managing HDInsight clusters with the Azure Portal

  • Create an HDInsight Hadoop Cluster
  • Customize HDInsight using a script action
  • Customize HDInsight using Bootstrap
  • Delete an HDInsight cluster

After completing this module, students will be able to:

  • Describe HDInsight cluster types
  • Describe the creation, management, and deletion of HDInsight clusters with the Azure portal
  • Describe the creation, management, and deletion of HDInsight clusters with PowerShell

  • Module 3: Authorizing Users to Access Resources

This module covers permissions and the assignment of permissions.

Lessons

  • Non-domain Joined clusters
  • Configuring domain-joined HDInsight clusters
  • Manage domain-joined HDInsight clusters

Lab : Authorizing Users to Access Resources

  • Configure a domain-joined HDInsight cluster
  • Configure Hive policies

After completing this module, students will be able to:

  • Describe how to authorize user access to objects
  • Describe how to authorize users to execute code
  • Describe how to manage domain-joined HDInsight clusters

  • Module 4: Loading Data into HDInsight

This module covers loading data into HDInsight.

Lessons

  • HDInsight Storage
  • Data loading tools
  • Performance and reliability

Lab : Loading Data into HDInsight

  • Loading data using Sqoop
  • Loading data using AzCopy
  • Loading data using AdlCopy
  • Use HDInsight to compress data

After completing this module, students will be able to:

  • Describe HDInsight storage configurations and architectures
  • Describe options for loading data into HDInsight
  • Describe benefits of compression and pre-processing in HDInsight

  • Module 5: Troubleshooting HDInsight

This module describes how to troubleshoot HDInsight.

Lessons

  • Analyze HDInsight logs
  • YARN logs
  • Heap dumps
  • Operations Management Suite

Lab : Troubleshooting HDInsight

  • Analyze HDInsight logs
  • Analyze YARN logs
  • Monitor resources with Operations Management Suite

After completing this module, students will be able to:

  • Analyze HDInsight logs
  • Analyze YARN logs
  • Analyze Heap dumps
  • Use the Operations Management Suite to monitor resources

  • Module 6: Implementing Batch Solutions

This module describes how to implement batch solutions.

Lessons

  • Apache Hive storage
  • Querying with Hive and Pig
  • Operationalize HDInsight

Lab : Implementing Batch Solutions

  • Load data into a Hive table
  • Query data with Hive and Pig

After completing this module, students will be able to:

  • Describe Apache Hive storage
  • Query data using Hive and Pig
  • Operationalize HDInsight

  • Module 7: Design Batch ETL solutions for big data with Spark

This module describes how to design batch ETL solutions for big data with Spark.

Lessons

  • What is Spark?
  • ETL with Spark
  • Spark performance

Lab : Design Batch ETL solutions for big data with Spark

  • Create an HDInsight cluster with access to Data Lake Store
  • Use HDInsight Spark cluster to analyze data in Data Lake Store
  • Analyzing website logs using a custom library with Apache Spark cluster on HDInsight
  • Managing resources for Apache Spark cluster on Azure HDInsight

After completing this module, students will be able to:

  • Describe Spark and when to use it
  • Describe the use of ETL with Spark
  • Analyze Spark performance

  • Module 8: Analyze Data with Spark SQL

This module describes how to analyze data with Spark SQL.

Lessons

  • Implement interactive queries
  • Perform exploratory data analysis

Lab : Analyze data with Spark SQL

  • Implement interactive queries
  • Perform exploratory data analysis

After completing this module, students will be able to:

  • Implement interactive queries
  • Perform exploratory data analysis

  • Module 9: Analyze Data with Hive and Phoenix

This module describes how to analyze data with Hive and Phoenix.

Lessons

  • Implement interactive queries for big data with interactive Hive
  • Perform exploratory data analysis by using Hive
  • Perform interactive processing by using Apache Phoenix

Lab : Analyze data with Hive and Phoenix

  • Implement interactive queries for big data with interactive Hive
  • Perform exploratory data analysis by using Hive
  • Perform interactive processing by using Apache Phoenix

After completing this module, students will be able to:

  • Implement interactive queries with interactive Hive
  • Perform exploratory data analysis using Hive
  • Perform interactive processing by using Apache Phoenix

  • Module 10: Stream Analytics

This module introduces Azure Stream Analytics.

Lessons

  • Stream analytics
  • Process streaming data from stream analytics
  • Managing stream analytics jobs

Lab : Implement Stream Analytics

  • Process streaming data with stream analytics
  • Managing stream analytics jobs

After completing this module, students will be able to:

  • Describe stream analytics and its capabilities
  • Process streaming data with stream analytics
  • Manage stream analytics jobs

  • Module 11: Spark Streaming using the DStream API

This module introduces the DStream API and describes how to create Spark structured streaming applications.

Lessons

  • DStream
  • Create Spark structured streaming applications
  • Persistence and visualization

Lab : Spark streaming applications using DStream API

  • Creating Spark streaming applications using the DStream API
  • Creating Spark structured streaming applications

After completing this module, students will be able to:

  • Explain DStream
  • Create Spark structured streaming applications
  • Describe persistence and visualization

  • Module 12: Develop big data real-time processing solutions with Apache Storm

This module explains how to develop big data real-time processing solutions with Apache Storm.

Lessons

  • Persist long-term data
  • Stream data with Storm
  • Create Storm topologies
  • Configure Apache Storm

Lab : Developing big data real-time processing solutions with Apache Storm

  • Stream data with Storm
  • Create Storm topologies

After completing this module, students will be able to:

  • Persist long-term data
  • Stream data with Storm
  • Create Storm topologies
  • Configure Apache Storm

  • Module 13: Analyze Data with Spark SQL

This module describes how to analyze data with Spark SQL.

Lessons

  • Implement interactive queries
  • Perform exploratory data analysis

Lab : Analyze data with Spark SQL

  • Implement interactive queries
  • Perform exploratory data analysis

After completing this module, students will be able to:

  • Implement interactive queries
  • Perform exploratory data analysis

Convenient Ways to Attend This Instructor-Led Course

Hassle-Free Enrolment: No advance payment required to reserve your seat.
Tuition due 30 days after you attend your course.

In the Classroom

Live, Online

Private Team Training

In the Classroom — OR — Live, Online

Tuition — Standard: $3710   Government: $3260

Dec 18 - 22 (5 Days)
9:00 AM - 4:30 PM EST
New York / Online (AnyWare)

Jan 8 - 12 (5 Days)
9:00 AM - 4:30 PM EST
Herndon, VA / Online (AnyWare)

Feb 26 - Mar 2 (5 Days)
9:00 AM - 4:30 PM EST
Rockville, MD / Online (AnyWare)

Mar 26 - 30 (5 Days)
9:00 AM - 4:30 PM EDT
New York / Online (AnyWare)

Apr 9 - 13 (5 Days)
9:00 AM - 4:30 PM EDT
Herndon, VA / Online (AnyWare)

May 7 - 11 (5 Days)
9:00 AM - 4:30 PM EDT
Rockville, MD / Online (AnyWare)

Jun 18 - 22 (5 Days)
9:00 AM - 4:30 PM EDT
New York / Online (AnyWare)

Guaranteed to Run


Private Team Training

Enrolling at least 3 people in this course? Consider bringing this (or any course that can be custom designed) to your preferred location as a private team training.

For details, call 1-888-843-8733.

Tuition

In Classroom or Online: Standard $3710, Government $3260

Private Team Training: Contact Us

Course Tuition Includes:

After-Course Instructor Coaching
When you return to work, you are entitled to schedule a free coaching session with your instructor for help and guidance as you apply your new skills.


Training Hours

Standard Course Hours: 9:00 am – 4:30 pm
*Informal discussion with instructor about your projects or areas of special interest: 4:30 pm – 5:30 pm
