hadoop - When are the number and nodes of the reducers allocated during MapReduce job execution? -

While reading about MapReduce, I came across the following interesting lines:

"But how do the reducers know which nodes to query for their partitions? This happens through the application master. As each mapper instance completes, it notifies the application master about the partitions it produced during its run. Each reducer periodically queries the application master for mapper hosts until it has received the final list of nodes hosting its partitions."

I have a doubt here. What exactly does "each reducer" mean in this passage? Are the reducers allocated before the map phase starts, and how are the reducer nodes chosen?

Reducers can start before the maps have finished processing their data. Once they start, they can pull data from the mapper machines, but they only begin processing it after all the mappers are done.

The mapred.reduce.slowstart.completed.maps property configures this behaviour. More information on the property is available here.
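
To make the threshold concrete, here is a rough Python sketch of how a slowstart fraction translates into a number of completed map tasks. It is a simplified model, not Hadoop's actual scheduler code: the function names are made up for illustration, and it assumes the threshold is the fraction multiplied by the total number of maps, rounded up. The real application master also takes available cluster resources into account. For reference, the property is called mapreduce.job.reduce.slowstart.completedmaps in Hadoop 2 and later, and its default value is 0.05.

```python
import math


def maps_needed_before_reducers(total_maps: int, slowstart: float) -> int:
    """Completed map tasks required before reducers may be scheduled.

    Simplified model (an assumption, not Hadoop source code):
    threshold = ceil(slowstart * total_maps).
    """
    if not 0.0 <= slowstart <= 1.0:
        raise ValueError("slowstart must be a fraction between 0.0 and 1.0")
    return math.ceil(slowstart * total_maps)


def reducers_may_start(completed_maps: int, total_maps: int, slowstart: float) -> bool:
    """True once enough maps have finished for reducers to begin copying map output."""
    return completed_maps >= maps_needed_before_reducers(total_maps, slowstart)


if __name__ == "__main__":
    total = 10
    for fraction in (0.0, 0.5, 1.0):
        needed = maps_needed_before_reducers(total, fraction)
        print(f"slowstart={fraction}: reducers eligible after {needed} of {total} maps")
```

With a fraction of 1.0, reducers are not scheduled until every map has finished. With 0.0, they can be scheduled right away, although they still cannot run the reduce function until all map output has been copied.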

