In this post I explain how to write a Flume configuration file.
I have often seen people get stuck while configuring a Flume agent with its source, sink and channel. Here I shall make it easier with an example configuration file that moves data from a source to a sink via a channel.
Let me demonstrate with an example.
Here I use a spooling directory source (type spooldir), an HDFS sink and a memory channel, with little more than the mandatory parameters for each. Strictly speaking, fileHeader and capacity are optional and have defaults; I set them for illustration. You can add more parameters under the source, sink and channel if needed.
Agent1.sources = spooldirsource
Agent1.sinks = hdfssink
Agent1.channels = Mchannel
#Defining source
Agent1.sources.spooldirsource.type = spooldir
Agent1.sources.spooldirsource.spoolDir = <the directory from which to read files>
Agent1.sources.spooldirsource.fileHeader = <whether to add a header storing the absolute file path: true or false>
#defining sink
Agent1.sinks.hdfssink.type = hdfs
Agent1.sinks.hdfssink.hdfs.path = <path name in hdfs to store file>
#defining channel
Agent1.channels.Mchannel.type = memory
Agent1.channels.Mchannel.capacity = 10000
#Binding channel between source and sink
Agent1.sources.spooldirsource.channels = Mchannel
Agent1.sinks.hdfssink.channel = Mchannel
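As a rough sketch of what "more parameters" can look like, the lines below add a few optional settings to the same agent. The property names are standard Flume ones, but every value here is only an assumption for illustration and should be tuned for your own data. Comments sit on their own lines because a properties file does not support trailing comments after a value.

```properties
# Optional source setting: name of the header that stores the file path
# (used only when fileHeader = true; "file" is the default)
Agent1.sources.spooldirsource.fileHeaderKey = file

# Optional HDFS sink settings
# Write plain event bodies instead of the default SequenceFile format
Agent1.sinks.hdfssink.hdfs.fileType = DataStream
Agent1.sinks.hdfssink.hdfs.writeFormat = Text
# Roll to a new file every 300 seconds (illustrative value)
Agent1.sinks.hdfssink.hdfs.rollInterval = 300
# 0 disables rolling based on file size and on event count
Agent1.sinks.hdfssink.hdfs.rollSize = 0
Agent1.sinks.hdfssink.hdfs.rollCount = 0

# Optional memory channel setting: maximum events per transaction
# (must not exceed the channel capacity of 10000)
Agent1.channels.Mchannel.transactionCapacity = 1000
```

These lines can be appended to the same configuration file; the sections for source, sink and channel do not have to appear in any particular order.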
Here is a complete example for the Flume spooling directory source:
Agent1.sources = spooldirsource
Agent1.sinks = hdfssink
Agent1.channels = Mchannel
#Defining source
Agent1.sources.spooldirsource.type = spooldir
Agent1.sources.spooldirsource.spoolDir = /user/itechshree/out98
Agent1.sources.spooldirsource.fileHeader = true
#defining sink
Agent1.sinks.hdfssink.type = hdfs
Agent1.sinks.hdfssink.hdfs.path = /user/itechshree/kafka
#defining channel
Agent1.channels.Mchannel.type = memory
Agent1.channels.Mchannel.capacity = 10000
#Binding channel between source and sink
Agent1.sources.spooldirsource.channels = Mchannel
Agent1.sinks.hdfssink.channel = Mchannel
Save this configuration in a file named flume1.conf and run the command below from the CLI:
flume-ng agent -n Agent1 -c conf -f flume1.conf
Once the console shows that the agent has started successfully, the files in the out98 folder will be written to the HDFS directory.
Please also try other sources, such as avro and netcat.
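For instance, a netcat variant might look like the sketch below. The agent name Agent2, the source name netcatsource, and the bind address and port are all assumptions chosen for illustration; the sink and channel simply repeat the ones used earlier.

```properties
# Hypothetical agent "Agent2" with a netcat source instead of spooldir
Agent2.sources = netcatsource
Agent2.sinks = hdfssink
Agent2.channels = Mchannel

# Defining source: listens on a TCP port and turns each line of text into an event
Agent2.sources.netcatsource.type = netcat
Agent2.sources.netcatsource.bind = localhost
Agent2.sources.netcatsource.port = 44444

# Defining sink (same HDFS path as the example above)
Agent2.sinks.hdfssink.type = hdfs
Agent2.sinks.hdfssink.hdfs.path = /user/itechshree/kafka

# Defining channel
Agent2.channels.Mchannel.type = memory
Agent2.channels.Mchannel.capacity = 10000

# Binding channel between source and sink
Agent2.sources.netcatsource.channels = Mchannel
Agent2.sinks.hdfssink.channel = Mchannel
```

If you start this agent with -n Agent2, you should be able to send test lines to it from another terminal with a tool such as telnet or nc connected to localhost on port 44444.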
Next I shall discuss Apache Pig.
See you in my next blog!