Hadoop - map/reduce functionality
I started looking into Hadoop and, after some struggles, got the WordCount example working on a cluster (two datanodes).
But I have a question about the map/reduce functionality. I read that during the map step, the input files/data are transformed into a form that can be processed efficiently during the reduce step.
Let's say I have 4 input files (input1.txt, input2.txt, input3.txt, input4.txt) and I want to read those input files and transform the data into a form for reduce.
So here is my question: if I run the application (WordCount) in a cluster environment (two datanodes), are all 4 input files read on each datanode, or are 2 input files read on each datanode? And how can I check which file is read on which datanode?
Or does map (on each datanode) read the files as some kind of blocks instead of reading individual files?
You see, Hadoop works on the basis of blocks rather than whole files. The chunk of input handed to a single mapper is known as an InputSplit, and with the default FileInputFormat a split never spans more than one file. So if each of your 4 files is smaller than the block size (128 MB, or 64 MB depending on the Hadoop version), each file becomes a single InputSplit and is read by one mapper; the framework tries to schedule each mapper on a node that holds that split's block, so your 4 mappers can land on either of the two datanodes. To check which file a mapper is reading, you can cast the split inside the mapper: ((FileSplit) context.getInputSplit()).getPath(). Hope that answers the question.
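To make the split-per-file behavior concrete, here is a minimal sketch of the default FileInputFormat split arithmetic (a simplified model, not Hadoop's actual code; the class and method names are made up for illustration): splits never cross file boundaries, and each file is chopped into block-sized pieces.

```java
public class SplitCalc {
    // Simplified model of default FileInputFormat splitting:
    // a file smaller than one block yields exactly one split,
    // a larger file yields ceil(fileLen / blockSize) splits.
    static int splitsForFile(long fileLen, long blockSize) {
        if (fileLen == 0) return 1; // an empty file still gets one (empty) split
        return (int) Math.ceil((double) fileLen / blockSize);
    }

    static int totalSplits(long[] fileLens, long blockSize) {
        int total = 0;
        for (long len : fileLens) {
            total += splitsForFile(len, blockSize); // splits never span files
        }
        return total;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // 128 MB default block size
        // four small files, like input1.txt .. input4.txt in the question
        long[] files = {10_000, 20_000, 30_000, 40_000};
        System.out.println(totalSplits(files, blockSize));          // 4 splits -> 4 mappers
        System.out.println(splitsForFile(300L * 1024 * 1024, blockSize)); // one 300 MB file -> 3 splits
    }
}
```

So with 4 small files you get 4 mappers (one per file), not 1 mapper reading all 4; only within a single large file does the block size determine how many mappers run.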