The Apache Hadoop software library is a framework for the distributed storage and analysis of large data sets across clusters of computers, using programming models that range from simple to complex.
Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library detects and handles failures at the application layer. This allows highly available services to be delivered on top of a cluster of computers, each of which may be prone to failure.
Over the past 10 years, Hadoop has evolved significantly; however, most application developers and architects use only the capabilities of its earlier versions (Hadoop 1 and, to a limited extent, Hadoop 2). This is both a problem and a competitive opportunity for software developers and architects with an aptitude for problem solving.
Common issues facing yesterday’s Hadoop champions:
Limited functionality and poor performance
Imperative-style software built on unnecessary bolt-on frameworks
Higher short-term and long-term maintenance costs from attempts to reproduce capabilities already built into Hadoop
Enterprise code is redundant, tangled, one-off and non-reusable
The purpose of this course is to reintroduce Hadoop from a modern, comprehensive perspective, with an emphasis on the advanced characteristics of Hadoop 3.
The two primary frameworks introduced in this course are YARN and HDFS, which offer memory-resident, resource-preserving capabilities. Between them, they provide volume-based balancing, centralized and declaratively managed caching, and storage policies (HDFS), node labeling (YARN), and rich utility libraries.
Hadoop 3 can be integrated transparently and declaratively into any legacy or new data infrastructure. The course also offers guidance on using Hadoop’s native performance-augmenting techniques effectively.
Participants should have development experience with Java and a previous version of Hadoop, as well as user-level proficiency with a Unix-based operating system and command-line scripts.
This course is intended for individuals who wish to understand the features of Hadoop 3 and become proficient at building highly performant Hadoop implementations.
This is a three-day class when taught on-site as instructor-led training (ILT) or via Webex as virtual instructor-led training (VILT). It is also offered on a per-module basis for online self-enablement via our LMS, Brane.
Day 1: A Comprehensive Overview of the Hadoop 3 Libraries
Day 2: Advanced HDFS Practices
Day 3: Advanced YARN Practices