Apache Flume - Ingesting log data into Hadoop and Kafka

A detailed workshop on using Flume to ingest web server logs into a live Hadoop and Kafka cluster.
Created by Durga Viswanatha Raju Gadiraju
What Will I Learn?
Understand the basics of Flume
Implement a simple Flume agent (see the configuration sketch after this list)
Understand multi-agent Flume flows
Set up multiple agents using Avro as the connector
Set up an agent to get log data into HDFS
Understand sources such as netcat, avro, and exec in detail
Understand channels such as memory and file in detail
Understand sinks such as logger, avro, and HDFS in detail
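
For reference, here is a minimal sketch of a single-agent configuration that wires a netcat source to a logger sink through a memory channel. The agent and component names (a1, r1, c1, k1), the port, and the channel sizes are illustrative assumptions, not taken from the course material.

    # a1 is a hypothetical agent name; r1/c1/k1 are illustrative component names
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1

    # netcat source listening for lines of text on localhost:44444
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = localhost
    a1.sources.r1.port = 44444

    # in-memory channel buffering events between source and sink
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100

    # logger sink writes each event to the agent's log
    a1.sinks.k1.type = logger

    # wire the source and the sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1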
Description
As part of this session, we will understand in detail how we can use Apache Flume to ingest real-time streaming data.

Overview of Flume

Setting up gen_logs

Develop first Flume Agent

Understand Source, Sink and Channel

Flume Multi Agent Flows

Get data into HDFS using Flume

Limitations and Conclusion
For this demo we will be using our Big Data developer labs. You need access to an existing big data cluster, or you can sign up for our labs.

Hands-on demos:

Developing a simple Flume agent to get data from netcat to the agent logs
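
Assuming the single-agent configuration sketched earlier is saved as example.conf, this is roughly how such an agent is started and exercised; the configuration directory, file name, and port are assumptions:

    # start the agent named a1 with the configuration file above;
    # -Dflume.root.logger=INFO,console makes the logger sink output visible on the console
    flume-ng agent --name a1 --conf /etc/flume/conf --conf-file example.conf -Dflume.root.logger=INFO,console

    # in another terminal, connect to the netcat source and type test lines
    nc localhost 44444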

Develop a multi-agent flow where data from web server logs goes to an Avro sink and then from an Avro source to a logger
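
A sketch of the two configurations such a flow typically uses, one agent per host; the hostnames, port, web server log path, and component names are all assumptions and may differ from what the course uses:

    # --- sender agent (runs on the web server host) ---
    sender.sources = weblog
    sender.channels = mem
    sender.sinks = avro_out

    # exec source tailing the web server log (path is an assumption)
    sender.sources.weblog.type = exec
    sender.sources.weblog.command = tail -F /opt/gen_logs/logs/access.log
    sender.sources.weblog.channels = mem

    sender.channels.mem.type = memory

    # avro sink forwards events to the collector agent over the network
    sender.sinks.avro_out.type = avro
    sender.sinks.avro_out.hostname = collector.example.com
    sender.sinks.avro_out.port = 4545
    sender.sinks.avro_out.channel = mem

    # --- collector agent (runs on the receiving host) ---
    collector.sources = avro_in
    collector.channels = mem
    collector.sinks = log

    # avro source listens on the port the sender's avro sink points at
    collector.sources.avro_in.type = avro
    collector.sources.avro_in.bind = 0.0.0.0
    collector.sources.avro_in.port = 4545
    collector.sources.avro_in.channels = mem

    collector.channels.mem.type = memory

    # logger sink writes incoming events to the collector agent's log
    collector.sinks.log.type = logger
    collector.sinks.log.channel = mem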

Develop a multiplexing flow where data from web server logs is written to both HDFS and Kafka
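
One simple way to sketch writing the same data to both destinations is a single agent whose source fans out to two channels via a replicating selector, one feeding an HDFS sink and the other a Kafka sink. The HDFS path, broker address, topic name, log path, and component names below are assumptions, and the course configuration may differ:

    # fan-out agent: every event from the source is copied to both channels
    fanout.sources = weblog
    fanout.channels = hdfs_chan kafka_chan
    fanout.sinks = hdfs_sink kafka_sink

    # exec source tailing the web server log (path is an assumption)
    fanout.sources.weblog.type = exec
    fanout.sources.weblog.command = tail -F /opt/gen_logs/logs/access.log
    fanout.sources.weblog.selector.type = replicating
    fanout.sources.weblog.channels = hdfs_chan kafka_chan

    # durable file channel for HDFS, memory channel for Kafka
    fanout.channels.hdfs_chan.type = file
    fanout.channels.kafka_chan.type = memory

    # HDFS sink writing plain-text files into date-partitioned directories
    fanout.sinks.hdfs_sink.type = hdfs
    fanout.sinks.hdfs_sink.hdfs.path = /user/flume/weblogs/%Y-%m-%d
    fanout.sinks.hdfs_sink.hdfs.fileType = DataStream
    fanout.sinks.hdfs_sink.hdfs.useLocalTimeStamp = true
    fanout.sinks.hdfs_sink.channel = hdfs_chan

    # Kafka sink publishing each event to a Kafka topic
    fanout.sinks.kafka_sink.type = org.apache.flume.sink.kafka.KafkaSink
    fanout.sinks.kafka_sink.kafka.bootstrap.servers = broker1:9092
    fanout.sinks.kafka_sink.kafka.topic = weblogs
    fanout.sinks.kafka_sink.channel = kafka_chan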

Who is the target audience?

  • Anyone who is aspiring to build a career in Big Data
  • Big Data professionals who want to ingest log data into HDFS
  • Anyone who wants to explore Flume as a streaming ingestion tool

