ITechShree-Data-Analytics-Technologies

Flume Spooling directory example


This post explains how to write a Flume configuration file.
I have often seen people get stuck while configuring a Flume agent with its source, sink and channel. Here I provide an example configuration file through which you can move data from a source to a sink via a channel.


Let me demonstrate with an example.
Here I use a minimal set of parameters to configure a source, sink and channel of type spooldir, hdfs and memory respectively. Only fileHeader on the source and capacity on the channel are optional; the rest are required. You can add more parameters under the source, sink and channel if needed.


Agent1.sources = spooldirsource
Agent1.sinks = hdfssink
Agent1.channels = Mchannel

#Defining source
Agent1.sources.spooldirsource.type = spooldir
Agent1.sources.spooldirsource.spoolDir = <the directory from which to read files>
Agent1.sources.spooldirsource.fileHeader = <true or false: whether to add a header storing the absolute file path>

#Defining sink
Agent1.sinks.hdfssink.type = hdfs
Agent1.sinks.hdfssink.hdfs.path = <path in HDFS where the files are stored>

#Defining channel
Agent1.channels.Mchannel.type = memory
Agent1.channels.Mchannel.capacity = 10000

#Binding channel between source and sink
Agent1.sources.spooldirsource.channels = Mchannel
Agent1.sinks.hdfssink.channel = Mchannel
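
As a rough sketch of what "more parameters" can look like, the fragment below adds a few optional settings to the same agent. The property names come from the Flume user guide, but the values are only illustrative assumptions and should be tuned for your own data volume.

```properties
# Optional source setting: suffix appended to a file once it is fully ingested
# (.COMPLETED is already the default, shown here only for clarity)
Agent1.sources.spooldirsource.fileSuffix = .COMPLETED

# Optional sink settings: write plain text instead of the default SequenceFile
Agent1.sinks.hdfssink.hdfs.fileType = DataStream
Agent1.sinks.hdfssink.hdfs.writeFormat = Text
# Roll to a new HDFS file every 60 seconds (assumed value)
Agent1.sinks.hdfssink.hdfs.rollInterval = 60

# Optional channel setting: maximum events per transaction (assumed value)
Agent1.channels.Mchannel.transactionCapacity = 1000
```

These lines would sit alongside the required ones in the same configuration file. Comments are kept on their own lines because a properties file treats text after the value as part of the value.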





Below is a complete example for the Flume spooling directory source:







Agent1.sources = spooldirsource
Agent1.sinks = hdfssink
Agent1.channels = Mchannel

#Defining source
Agent1.sources.spooldirsource.type = spooldir
Agent1.sources.spooldirsource.spoolDir = /user/itechshree/out98
Agent1.sources.spooldirsource.fileHeader = true

#Defining sink
Agent1.sinks.hdfssink.type = hdfs
Agent1.sinks.hdfssink.hdfs.path = /user/itechshree/kafka

#Defining channel
Agent1.channels.Mchannel.type = memory
Agent1.channels.Mchannel.capacity = 10000

#Binding channel between source and sink
Agent1.sources.spooldirsource.channels = Mchannel
Agent1.sinks.hdfssink.channel = Mchannel

Save this configuration in a file named flume1.conf and run the following command from the CLI:

flume-ng agent -n Agent1 -c conf -f flume1.conf

Once the console shows that the agent has started successfully, files placed in the out98 folder will be delivered to the HDFS directory.




Try the same setup with other sources, such as avro or netcat.
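
As a starting point for that experiment, here is a minimal sketch of a netcat source definition that could stand in for the spooldir source in the configuration above. The source name netcatsource and the bind address and port are assumptions chosen for illustration; the sink, channel and binding lines stay as they are.

```properties
# Hypothetical source name; replaces spooldirsource in the agent's source list
Agent1.sources = netcatsource

# Netcat source: listens on a TCP port and turns each line of text into an event
Agent1.sources.netcatsource.type = netcat
# Assumed host and port, change to suit your machine
Agent1.sources.netcatsource.bind = localhost
Agent1.sources.netcatsource.port = 44444

# Bind the new source to the existing memory channel
Agent1.sources.netcatsource.channels = Mchannel
```

With the agent running, lines sent to that port (for example with a telnet or nc client) should flow through Mchannel to the HDFS sink.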

Next, I shall discuss Apache Pig.
See you in my next blog!!


