Training: HDP DEVELOPER ENTERPRISE APACHE SPARK

Overview

This course is an entry point for developers who need to create applications that analyse big data stored in Apache Hadoop using Spark.

Topics include: An overview of the Hortonworks Data Platform (HDP), including HDFS and YARN; using Spark Core APIs for interactive data exploration; Spark SQL and DataFrame operations; Spark Streaming and DStream operations; data visualisation, reporting, and collaboration; performance monitoring and tuning; building and deploying Spark applications; and an introduction to the Spark Machine Learning Library.

Duration

4 days

Who is the course for

Software engineers who want to develop data streaming and in-memory applications for time-sensitive and highly iterative workloads in an enterprise HDP environment.

Prerequisites

Students should be familiar with programming principles and have previous experience in software development using either Python or Scala. Previous experience with data streaming, SQL, and HDP is helpful but not required.

Hands-On Lab Activities

Lab 0: Pre-lab Setup

About This Lab

Objective:

Set up the lab environment and confirm that HDP 2.5 is working

Lab 1: Using HDFS Commands

About This Lab

Objective:

View, add, manipulate, and remove files and directories in HDFS using hdfs dfs commands.

Lab 2: Introduction to Spark REPLs and Zeppelin

About This Lab

Objective:

Access and browse Spark REPLs and Zeppelin

File Locations:

N/A

Successful Outcome:

Use Spark REPLs and browse Zeppelin

Lab 3: Creating and Manipulating RDDs (Scala/Python)

About This Lab

Objective:

Create and manipulate RDDs using Scala and Zeppelin

Lab 4: Create and Manipulate Pair RDDs (Scala/Python)

About This Lab

Objective:

Create pair RDDs and use various functions to transform these RDDs using Scala in Zeppelin.

File Locations:

/home/zeppelin/spark/data/

Successful Outcome:

REQUIRED: Create pair RDDs and perform various operations.

OPTIONAL: Complete challenge labs performing more complex operations.

Lab 5: Basic Spark Streaming (Scala/Python)

About This Lab

Objective:

Set up basic Spark Streaming operations using the REPL

File Locations:

/root/spark/data/

Successful Outcome:

Stream data from HDFS directories and TCP sockets using Spark Streaming

Lab 6: Basic Spark Streaming Transformations (Scala/Python)

About This Lab

Objective:

Learn to use basic Spark Streaming transformations on data streams

File Locations:

/root/spark/data/

Successful Outcome:

Perform several basic transformations on streaming data

Lab 7: Spark Streaming Window Transformations (Scala/Python)

About This Lab

Objective:

Use Spark Streaming Window Transformations

File Locations:

NA

Successful Outcome:

Perform several Spark Streaming Window Transformations

Lab 8: Create and Save DataFrames & Tables (Scala/Python)

About This Lab

Objective:

Create and save DataFrames and tables

File Locations:

NA

Successful Outcome:

Use various methods to create and save DataFrames and tables

 

Lab 9: Working with DataFrames (Scala/Python)

About This Lab

Objective:

Learn to use the DataFrames API.

File Locations:

NA

Successful Outcome:

Manipulate DataFrames using the DataFrames API

Lab 10: Data Visualization, Reporting, and Collaboration using Zeppelin (Scala/Python)

About This Lab

Objective:

Learn to use Zeppelin to perform data visualizations, collaborate, and integrate visualizations into reports.

File Locations:

NA

Successful Outcome:

Use Zeppelin to perform data visualization, collaboration, and reporting tasks.

 

Lab 11: Job Monitoring (Scala/Python)

About This Lab

Objective:

Monitor Spark jobs using the Spark Application UI

File Locations:

NA

Successful Outcome:

Monitor Spark jobs.

 

Lab 12: Performance Tuning (Scala/Python)

About This Lab

Objective:

Practice performance tuning techniques

File Locations:

/home/zeppelin/spark/data/

Successful Outcome:

Implement the performance tuning techniques from the lesson in code

 

Lab 13: Build and Submit Applications to YARN (Scala/Python)

About This Lab

Objective:

Apply programming knowledge to build stand-alone applications and submit them to a YARN cluster

File Locations:

NA

Successful Outcome:

Build and submit a cluster-mode application to YARN

Lab 14: Machine Learning Walkthrough

About This Lab

Objective:

Observe and run code examples that demonstrate machine learning processes.

File Locations:

NA

Successful Outcome:

Import a preconfigured note that contains machine learning code samples, read through the note, and run those examples.


Mokhtar Sellami

I am Mokhtar Sellami, certified as an IBM Big Data Specialist with IBM BigInsights V2.1. I am a Technologist Lecturer in computer science at the High Institute of Technological Studies of Kef, Tunisia. I was the founder and CEO of the start-up Enodis Web (2008-2012), and I have worked as a web freelancer for Enode (https://www.linkedin.com/company/499839?trk=prof-exp-company-name). I am a Big Data trainer (IBM BigInsights, Cloudera) at the UIT, EPI, and ISET Kef. I took part as a Cloudera trainer in the First International Spring School of Big Data and Cloud Computing (http://www.i2sbd2c.tn/index.html).


Contact us at 27 862 155, 54 828 018, or 71 866 142


Daytime classes:

9:00 to 14:00

Evening and weekend classes:

18:30 to 21:00; Saturday and Sunday mornings, 9:00 to 13:00

Refer one person and receive a 30% discount.
Refer a second person and receive an immediate 100% discount.

For companies:
We are at your disposal to provide the documents needed to deduct training costs from the TFP (Taxe à la Formation Professionnelle, the vocational training tax).


Upcoming sessions:

Contact us


Session confirmation:

One week before the scheduled date

Recommended for:

Developers, data scientists, big data developers, project managers, entrepreneurs


About the training venue

We can also come to your premises (in-house training).

Feel free to suggest other dates or venues if those offered do not suit you.


At the end of the training:

You will receive a state-approved training certificate