Apache Flume - Ingesting log data into Hadoop and Kafka: a detailed workshop on using Flume to ingest web server logs into a live Hadoop and Kafka cluster.
Created by Durga Viswanatha Raju Gadiraju
What Will I Learn?
Understand the basics of Flume
Implement a simple Flume agent
Understand multi-agent Flume flows
Set up multiple agents using Avro as the connector
Set up an agent to get log data into HDFS
Understand sources such as netcat, avro and exec in detail
Understand channels such as memory and file in detail
Understand sinks such as logger, avro and HDFS in detail (a sample agent configuration follows this list)
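To make these components concrete, here is a minimal sketch of a single-agent configuration that wires a netcat source, a memory channel and a logger sink together. The agent name a1, the component names and the port are hypothetical placeholders for illustration, not necessarily the names used in the course.

    # Hypothetical agent "a1" with one source, one channel and one sink
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1

    # netcat source: listens on a TCP port and turns each line into an event
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = localhost
    a1.sources.r1.port = 44444

    # memory channel: buffers events in RAM between source and sink
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100

    # logger sink: writes events to the agent's log at INFO level
    a1.sinks.k1.type = logger

    # wire the source and the sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

Saved as, say, example.conf, such an agent can be started with flume-ng agent --name a1 --conf-file example.conf -Dflume.root.logger=INFO,console, and test events can be pushed in with nc localhost 44444.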
Description
As part of this session we will understand in detail how to use Apache Flume to ingest real-time streaming data.
Overview of Flume
Setting up gen_logs
Develop first Flume Agent
Understand Source, Sink and Channel
Flume Multi Agent Flows
Get data into HDFS using Flume
Limitations and Conclusion
For this demo we will be using our Big Data developer labs. You need access to an existing big data cluster, or you can sign up for our labs.
Hands-on demos:
Developing a simple Flume agent to get data from a netcat source to the agent logs
Developing a multi-agent flow where data from web server logs goes to an Avro sink and then from an Avro source to a logger (see the two-agent sketch below)
Developing a multiplexing flow where data from web server logs is written to both HDFS and Kafka (see the fan-out sketch below)
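A rough sketch of the multi-agent demo: one agent tails the web server log with an exec source and forwards events through an Avro sink, while a second agent receives them on an Avro source and prints them with a logger sink. The agent names, host name, port and log path below are assumptions for illustration; the course's gen_logs setup may use different values.

    # --- Agent on the web server host (hypothetical name "wsagent") ---
    wsagent.sources = ws
    wsagent.channels = mem
    wsagent.sinks = avrosink

    # exec source tails the web server log (path is an assumption)
    wsagent.sources.ws.type = exec
    wsagent.sources.ws.command = tail -F /opt/gen_logs/logs/access.log

    wsagent.channels.mem.type = memory

    # avro sink forwards events to the collector host over Avro RPC
    wsagent.sinks.avrosink.type = avro
    wsagent.sinks.avrosink.hostname = collectorhost.example.com
    wsagent.sinks.avrosink.port = 4545

    wsagent.sources.ws.channels = mem
    wsagent.sinks.avrosink.channel = mem

    # --- Collector agent (hypothetical name "collector") ---
    collector.sources = avrosrc
    collector.channels = mem
    collector.sinks = log

    # avro source listens on the port the first agent sends to
    collector.sources.avrosrc.type = avro
    collector.sources.avrosrc.bind = 0.0.0.0
    collector.sources.avrosrc.port = 4545

    collector.channels.mem.type = memory

    # logger sink prints received events to the collector agent's log
    collector.sinks.log.type = logger

    collector.sources.avrosrc.channels = mem
    collector.sinks.log.channel = mem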
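And a sketch of the flow that lands web server log data in both HDFS and Kafka. It uses Flume's replicating channel selector so every event is copied to both channels (a multiplexing selector would instead route events by header value). The agent and component names, log path, HDFS path, Kafka broker and topic are all placeholders, and the Kafka sink properties shown apply to Flume 1.7 and later.

    # Hypothetical agent "fanout": one source, two channels, two sinks
    fanout.sources = ws
    fanout.channels = hdfschan kafkachan
    fanout.sinks = hdfssink kafkasink

    # exec source tails the web server log
    fanout.sources.ws.type = exec
    fanout.sources.ws.command = tail -F /opt/gen_logs/logs/access.log

    # replicating selector copies each event to both channels
    fanout.sources.ws.selector.type = replicating
    fanout.sources.ws.channels = hdfschan kafkachan

    # file channel for durable buffering toward HDFS (placeholder dirs)
    fanout.channels.hdfschan.type = file
    fanout.channels.hdfschan.checkpointDir = /var/flume/checkpoint
    fanout.channels.hdfschan.dataDirs = /var/flume/data

    # memory channel toward Kafka
    fanout.channels.kafkachan.type = memory

    # HDFS sink writes plain text files under a date-partitioned directory
    fanout.sinks.hdfssink.type = hdfs
    fanout.sinks.hdfssink.hdfs.path = hdfs://nn.example.com:8020/user/flume/weblogs/%Y-%m-%d
    fanout.sinks.hdfssink.hdfs.fileType = DataStream
    fanout.sinks.hdfssink.hdfs.useLocalTimeStamp = true
    fanout.sinks.hdfssink.channel = hdfschan

    # Kafka sink publishes events to a Kafka topic
    fanout.sinks.kafkasink.type = org.apache.flume.sink.kafka.KafkaSink
    fanout.sinks.kafkasink.kafka.bootstrap.servers = broker1.example.com:9092
    fanout.sinks.kafkasink.kafka.topic = weblogs
    fanout.sinks.kafkasink.channel = kafkachan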
Who is the target audience?
- Anyone who is aspiring to build a career in Big Data
- Big Data professionals who want to ingest log data into HDFS
- Anyone who wants to explore Flume as a streaming ingestion tool