hadoop - map/reduce functionality


I started looking into Hadoop and, after going through some struggles, got the WordCount example working on a cluster (two datanodes).

But I have a question about the map/reduce functionality. I read that during the map step, the input files/data are transformed into a form of data that can be efficiently processed during the reduce step.

Let's say I have 4 input files (input1.txt, input2.txt, input3.txt, input4.txt), and I want to read these input files and transform them into a form of data for the reduce step.

So here is my question: if I run the application (WordCount) in a cluster environment (two datanodes), are all 4 input files read on each datanode, or are 2 input files read on each datanode? And how can I check which file is read on which datanode?

Or does map (on each datanode) read the files as some kind of block instead of reading individual files?

See, Hadoop works on the basis of blocks rather than files. If each of the 4 files is smaller than 128 MB (or 64 MB, depending on the block size), each one will be read by 1 mapper. The chunk read by a mapper is known as an InputSplit. Hope that answers your question.
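
To check for yourself which file each map task actually reads and on which node it runs, you can log the InputSplit from inside the mapper. Below is a minimal sketch of a WordCount-style mapper that does this; it assumes the default TextInputFormat, where each split is a FileSplit (the class name LoggingWordCountMapper is just for illustration):

import java.io.IOException;
import java.net.InetAddress;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class LoggingWordCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // The InputSplit says which file (and byte range) this map
        // task is responsible for; the hostname says which node the
        // task is actually running on.
        FileSplit split = (FileSplit) context.getInputSplit();
        String host = InetAddress.getLocalHost().getHostName();
        System.err.println("Processing " + split.getPath()
                + " [offset " + split.getStart()
                + ", length " + split.getLength()
                + "] on host " + host);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Standard WordCount map: emit (word, 1) for each token.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

The printed lines end up in each map task attempt's stderr log, which you can view through the ResourceManager (or JobTracker) web UI. Separately, hdfs fsck <path> -files -blocks -locations will show you which datanodes hold the blocks of each input file, which is where the scheduler tries to place the corresponding map tasks.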

