Cloudera
CA-CDAT
Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop
Who Should Attend
- This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators
Knowledge of SQL is assumed, as is basic Linux command-line familiarity. Knowledge of at least one scripting language (e.g., Bash, Perl, Python, Ruby) would be helpful but is not essential. Prior knowledge of Apache Hadoop is not required.
Cloudera University’s four-day data analyst training course will teach you to apply traditional data analytics and business intelligence skills to big data tools like Apache Impala (Incubating), Apache Hive, and Apache Pig. Cloudera presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.
Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning how to:
- Acquire, store, and analyze data using features in Pig, Hive, and Impala
- Perform fundamental ETL (extract, transform, and load) tasks with Hadoop tools
- Use Pig, Hive, and Impala to improve productivity for typical analysis tasks
- Join diverse datasets to gain valuable business insight
- Perform interactive, complex queries on datasets
- Introduction
- Hadoop Fundamentals
- Introduction to Pig
- Basic Data Analysis with Pig
- Processing Complex Data with Pig
- Multi-Dataset Operations with Pig
- Pig Troubleshooting and Optimization
- Introduction to Hive and Impala
- Querying with Hive and Impala
- Hive and Impala Data Management
- Data Storage and Performance
- Relational Data Analysis with Hive and Impala
- Complex Data with Hive and Impala
- Analyzing Text with Hive and Impala
- Hive Optimization
- Impala Optimization
- Extending Hive and Impala
- Choosing the Best Tool for the Job
If you would like to know more about this course, please contact us.