hadoop - When are the number and nodes of the reducers allocated during MapReduce job execution?


While reading about MapReduce, I came across the following interesting lines:

"But how do the reducers know which nodes to query for their partitions? This happens through the application master. As each mapper instance completes, it notifies the application master about the partitions it produced during its run. Each reducer periodically queries the application master for mapper hosts until it has received the final list of nodes hosting its partitions."

I have a doubt here. What exactly does "each reducer" mean? Are the reducers allocated before the map phase starts, and if so, how are the reducer nodes chosen?

Reducers can start before the maps are done processing the data. Once started, they can only pull data from the mapper machines; they start the actual reduce processing only after all the mappers are done processing the data.
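The behaviour described above can be pictured with a small simulation. This is only a sketch, not Hadoop code: `AppMaster`, `Reducer`, `poll`, and `readyToReduce` are hypothetical names invented for illustration, and the host names are made up. It shows a reducer that is already running while maps are still in progress, fetching partitions as maps finish but becoming ready to reduce only once every map has reported.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the shuffle hand-off; all class and method names are hypothetical.
class ShuffleSketch {

    /** Stand-in for the application master's record of finished map tasks. */
    static class AppMaster {
        private final int totalMaps;
        private final Map<Integer, String> finishedMapHosts = new LinkedHashMap<>();

        AppMaster(int totalMaps) { this.totalMaps = totalMaps; }

        /** Called when a mapper finishes and reports where its output lives. */
        void mapCompleted(int mapId, String host) { finishedMapHosts.put(mapId, host); }

        /** Snapshot of the maps that have completed so far, keyed by map id. */
        Map<Integer, String> completedMaps() { return new LinkedHashMap<>(finishedMapHosts); }

        int totalMaps() { return totalMaps; }
    }

    /** Stand-in for a reducer that was started before the map phase ended. */
    static class Reducer {
        private final Map<Integer, String> fetchedFrom = new LinkedHashMap<>();

        /** One polling round: record every newly finished map; returns how many were new. */
        int poll(AppMaster am) {
            int newlyFetched = 0;
            for (Map.Entry<Integer, String> e : am.completedMaps().entrySet()) {
                if (fetchedFrom.putIfAbsent(e.getKey(), e.getValue()) == null) {
                    newlyFetched++;
                }
            }
            return newlyFetched;
        }

        /** The reduce step may begin only when output from every map has been fetched. */
        boolean readyToReduce(AppMaster am) { return fetchedFrom.size() == am.totalMaps(); }
    }

    public static void main(String[] args) {
        AppMaster am = new AppMaster(3);
        Reducer reducer = new Reducer();

        am.mapCompleted(0, "node-a");
        System.out.println("fetched " + reducer.poll(am) + ", ready: " + reducer.readyToReduce(am));

        am.mapCompleted(1, "node-b");
        am.mapCompleted(2, "node-a");
        System.out.println("fetched " + reducer.poll(am) + ", ready: " + reducer.readyToReduce(am));
    }
}
```

The point of the model is the separation between fetching and reducing: an early-started reducer holds a slot while it waits, which is the trade-off to keep in mind when tuning how early reducers start.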

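The mapred.reduce.slowstart.completed.maps property configures this behaviour (in Hadoop 2 and later it is named mapreduce.job.reduce.slowstart.completedmaps). It sets the fraction of the job's map tasks that must complete before reducers are scheduled. More information on the property is available here.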
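To make the effect of the slow-start fraction concrete, here is a rough sketch of the threshold check. It is a simplified model and not Hadoop's scheduler code: the class `SlowstartSketch` and its methods are hypothetical, and rounding the threshold up is an assumption of this sketch.

```java
// Simplified model of the reduce slow-start check; names are hypothetical.
class SlowstartSketch {

    /** Number of map tasks that must finish before reducers become eligible (rounded up, by assumption). */
    static int mapsNeededBeforeReducers(int totalMaps, double slowstartFraction) {
        if (totalMaps < 0 || slowstartFraction < 0.0 || slowstartFraction > 1.0) {
            throw new IllegalArgumentException("totalMaps must be >= 0 and fraction within [0, 1]");
        }
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    /** True once enough maps have completed for reducers to be scheduled. */
    static boolean reducersMayStart(int completedMaps, int totalMaps, double slowstartFraction) {
        return completedMaps >= mapsNeededBeforeReducers(totalMaps, slowstartFraction);
    }

    public static void main(String[] args) {
        int totalMaps = 200;
        for (double fraction : new double[] {0.0, 0.5, 1.0}) {
            System.out.println("fraction " + fraction + ": reducers eligible after "
                    + mapsNeededBeforeReducers(totalMaps, fraction) + " of " + totalMaps + " maps");
        }
    }
}
```

In this model a fraction of 0.0 makes reducers eligible immediately, while 1.0 holds them back until every map has finished, so no reducer sits idle waiting for map output.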

