MapReduce

Created: 2020 Mar 27 9:16
Creator: Seonglae Cho
Edited: 2025 Nov 3 13:40

A distributed parallel computing software framework for big data

  • The core of MapReduce is two functions: map and reduce (a word-count sketch follows this list)
  • Big data processing should be kept as simple as possible
  • The reduce operation must be commutative and associative so that partial results can be merged in any order
  • MapReduce is a method for distributing tasks across multiple nodes
  • For large tasks, repeatedly fork to divide the work, then join to merge the results once the pieces are small enough
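
A minimal word-count sketch of those two functions in plain Python, outside any real framework (the names `map_phase`, `shuffle`, and `reduce_phase` are illustrative assumptions, not a framework API). Addition is commutative and associative, so per-key partial counts can be merged in any order, which is exactly the property required above.

```python
from collections import defaultdict
from functools import reduce

def map_phase(document: str) -> list[tuple[str, int]]:
    # Map: emit an intermediate (word, 1) pair for every word in a split
    return [(word, 1) for word in document.split()]

def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    # Shuffle: group intermediate values by key before reduction
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups: dict[str, list[int]]) -> dict[str, int]:
    # Reduce: fold each key's values with +, which is commutative and
    # associative, so partial sums can be merged in any order
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}

splits = ["the quick brown fox", "the lazy dog", "the fox"]
intermediate = [pair for split in splits for pair in map_phase(split)]
print(reduce_phase(shuffle(intermediate)))
# {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```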

Limitations of MapReduce

  • Memory cannot be wasted holding the metadata of a large number of small data sets
  • The reduce phase cannot start until every map task is complete (see the sketch after this list)
  • Starting a new map task before the reduce task of the previous application completes is not possible in standard MapReduce
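
A small sketch of that barrier, using Python's `multiprocessing.Pool` as a stand-in for a cluster (an assumption for illustration; real MapReduce schedules tasks across machines). `Pool.map` returns only after every map task has finished, so the reduce step cannot overlap with the map phase.

```python
from multiprocessing import Pool

def count_words(document: str) -> int:
    # Map task: count the words in one input split
    return len(document.split())

if __name__ == "__main__":
    splits = ["the quick brown fox", "the lazy dog", "the fox"]
    with Pool(processes=3) as pool:
        # Barrier: map() blocks until ALL map tasks complete,
        # so no reduction can begin while any mapper is still running
        partial_counts = pool.map(count_words, splits)
    total = sum(partial_counts)  # reduce phase starts only after the barrier
    print(total)  # 9
```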
MapReduce Notion

Jeff Dean's paper: MapReduce: Simplified Data Processing on Large Clusters (Jeffrey Dean and Sanjay Ghemawat, OSDI 2004)