Hadoop Architecture & Administration for Big Data Solutions

Course Type: Intermediate
Course Number: 1252
Duration: 4 Days

The emergence of large data sets presents new opportunities and challenges to organizations of all sizes. In this Hadoop architecture and administration training course, you gain the skills to install, configure, and manage the Apache Hadoop platform and its associated ecosystem, and to build a Hadoop solution that satisfies your business requirements.

You Will Learn How To

  • Architect a Hadoop solution to satisfy your business requirements
  • Install and build a Hadoop cluster capable of processing large data sets
  • Configure and tune the Hadoop environment to ensure high throughput and availability
  • Allocate, distribute, and manage resources
  • Monitor the file system, job progress, and overall cluster performance

Important Course Information

Recommended Experience:

  • Knowledge of Linux at the level of:
  • Knowledge of Java at the level of:

Course Outline

Introduction to Data Storage and Processing

Installing the Hadoop Distributed File System (HDFS)

  • Defining key design assumptions and architecture
  • Configuring and setting up the file system
  • Issuing commands from the console
  • Reading and writing files

Setting the stage for MapReduce

  • Reviewing the MapReduce approach
  • Introducing the computing daemons
  • Dissecting a MapReduce job

Defining Hadoop Cluster Requirements

Planning the architecture

  • Selecting appropriate hardware
  • Designing a scalable cluster

Building the cluster

  • Installing Hadoop daemons
  • Optimizing the network architecture

Configuring a Cluster

Preparing HDFS

  • Setting basic configuration parameters
  • Configuring block allocation, redundancy and replication

Deploying MapReduce

  • Installing and setting up the MapReduce environment
  • Delivering redundant load balancing via Rack Awareness

Maximizing HDFS Robustness

Creating a fault-tolerant file system

  • Isolating single points of failure
  • Maintaining High Availability
  • Triggering manual failover
  • Automating failover with ZooKeeper

Leveraging NameNode Federation

  • Extending HDFS resources
  • Managing the namespace volumes

Introducing YARN

  • Critiquing the YARN architecture
  • Identifying the new daemons

Managing Resources and Cluster Health

Allocating resources

  • Setting quotas to constrain HDFS utilization
  • Prioritizing access to MapReduce using schedulers

Maintaining HDFS

  • Starting and stopping Hadoop daemons
  • Monitoring HDFS status
  • Adding and removing data nodes

Administering MapReduce

  • Managing MapReduce jobs
  • Tracking progress with monitoring tools
  • Commissioning and decommissioning compute nodes

Maintaining a Cluster

Employing the standard built-in tools

  • Managing and debugging processes using JVM metrics
  • Performing Hadoop status checks

Tuning with supplementary tools

  • Assessing performance with Ganglia
  • Benchmarking to ensure continued performance

Extending Hadoop

Simplifying information access

  • Enabling SQL-like querying with Hive
  • Installing Pig to create MapReduce jobs

Integrating additional elements of the ecosystem

  • Imposing a tabular view on HDFS with HBase
  • Configuring Oozie to schedule workflows

Implementing Data Ingress and Egress

Facilitating generic input/output

  • Moving bulk data into and out of Hadoop
  • Transmitting HDFS data over HTTP with WebHDFS

Acquiring application-specific data

  • Collecting multi-sourced log files with Flume
  • Importing and exporting relational information with Sqoop

Planning for Backup, Recovery and Security

  • Coping with inevitable hardware failures
  • Securing your Hadoop cluster

Convenient Ways to Attend This Instructor-Led Course

Hassle-Free Enrolment: No advance payment required to reserve your seat.
Tuition due 30 days after you attend your course.

In the Classroom — OR — Live, Online

Tuition — Standard: $3285   Government: $2890

Dec 12 - 15 (4 Days)
9:00 AM - 4:30 PM EST
Toronto / Online (AnyWare)

Feb 20 - 23 (4 Days)
9:00 AM - 4:30 PM EST
Herndon, VA / Online (AnyWare)

May 29 - Jun 1 (4 Days)
9:00 AM - 4:30 PM EDT
Toronto / Online (AnyWare)

Jun 26 - 29 (4 Days)
9:00 AM - 4:30 PM EDT
Herndon, VA / Online (AnyWare)

Sep 25 - 28 (4 Days)
9:00 AM - 4:30 PM EDT
Herndon, VA / Online (AnyWare)

Private Team Training

Enrolling at least 3 people in this course? Consider bringing this course (or any course that can be custom designed) to your preferred location as private team training.

For details, call 1-888-843-8733.

Tuition

In Classroom or Online: Standard $3285, Government $2890
Private Team Training: Contact Us

Course Tuition Includes:

After-Course Instructor Coaching
When you return to work, you are entitled to schedule a free coaching session with your instructor for help and guidance as you apply your new skills.

After-Course Computing Sandbox
You'll be given remote access to a preconfigured virtual machine for you to redo your hands-on exercises, develop/test new code, and experiment with the same software used in your course.

Free Course Exam
You can take your Learning Tree course exam on the last day of your course or online at any time after class and receive a Certificate of Achievement with the designation "Awarded with Distinction."

Training Hours

Standard Course Hours: 9:00 am – 4:30 pm
*Informal discussion with instructor about your projects or areas of special interest: 4:30 pm – 5:30 pm

FREE Online Course Exam (if applicable) – Last Day: 3:30 pm – 4:30 pm
By successfully completing your FREE online course exam, you will:

  • Have a record of your growth and learning results
  • Bring proof of your progress back to your organization
  • Earn credits toward industry certifications (if applicable)
