Cloudera Designing & Building Big Data Applications (CDBBDA)

Detailed Course Outline

Module 1: Application Architecture
  • Scenario Explanation
  • Understanding the Development Environment
  • Identifying and Collecting Input Data
  • Selecting Tools for Data Processing and Analysis
  • Presenting Results to the User
Module 2: Defining and Using Data Sets
  • Metadata Management
  • What is Apache Avro?
  • Avro Schemas
  • Avro Schema Evolution
  • Selecting a File Format
  • Performance Considerations
Module 3: Using the Kite SDK Data Module
  • What is the Kite SDK?
  • Fundamental Data Module Concepts
  • Creating New Data Sets Using the Kite SDK
  • Loading, Accessing, and Deleting a Data Set
Module 4: Importing Relational Data with Apache Sqoop
  • What is Apache Sqoop?
  • Basic Imports
  • Limiting Results
  • Improving Sqoop’s Performance
  • Sqoop 2
Module 5: Capturing Data with Apache Flume
  • What is Apache Flume?
  • Basic Flume Architecture
  • Flume Sources
  • Flume Sinks
  • Flume Configuration
  • Logging Application Events to Hadoop
Module 6: Developing Custom Flume Components
  • Flume Data Flow and Common Extension Points
  • Custom Flume Sources
  • Developing a Flume Pollable Source
  • Developing a Flume Event-Driven Source
  • Custom Flume Interceptors
  • Developing a Header-Modifying Flume Interceptor
  • Developing a Filtering Flume Interceptor
  • Writing Avro Objects with a Custom Flume Interceptor
Module 7: Managing Workflows with Apache Oozie
  • The Need for Workflow Management
  • What is Apache Oozie?
  • Defining an Oozie Workflow
  • Validation, Packaging, and Deployment
  • Running and Tracking Workflows Using the CLI
  • Hue UI for Oozie
Module 8: Processing Data Pipelines with Apache Crunch
  • What is Apache Crunch?
  • Understanding the Crunch Pipeline
  • Comparing Crunch to Java MapReduce
  • Working with Crunch Projects
  • Reading and Writing Data in Crunch
  • Data Collection API
  • Functions
  • Utility Classes in the Crunch API
Module 9: Working with Tables in Apache Hive
  • What is Apache Hive?
  • Accessing Hive
  • Basic Query Syntax
  • Creating and Populating Hive Tables
  • How Hive Reads Data
  • Using the RegexSerDe in Hive
Module 10: Developing User-Defined Functions
  • What are User-Defined Functions?
  • Implementing a User-Defined Function
  • Deploying Custom Libraries in Hive
  • Registering a User-Defined Function in Hive
Module 11: Executing Interactive Queries with Impala
  • What is Impala?
  • Comparing Hive to Impala
  • Running Queries in Impala
  • Support for User-Defined Functions
  • Data and Metadata Management
Module 12: Understanding Cloudera Search
  • What is Cloudera Search?
  • Search Architecture
  • Supported Document Formats
Module 13: Indexing Data with Cloudera Search
  • Collection and Schema Management
  • Morphlines
  • Indexing Data in Batch Mode
  • Indexing Data in Near Real Time
Module 14: Presenting Results to Users
  • Solr Query Syntax
  • Building a Search UI with Hue
  • Accessing Impala through JDBC
  • Powering a Custom Web Application with Impala and Search
 
