Recently I noticed that I tend to talk too much when I speak with others, so I want to practice summarizing how I solve problems concisely. Here are some questions I ran into while reading books about Big Data.

  • No 1. Why doesn't Spark use MapReduce?

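Disk I/O is too slow for large-scale data analysis. In MapReduce, the shuffle step regroups the mapped data by key so that the reduce work can run in parallel, and the intermediate results pass through disk along the way. Spark splits a huge problem into parallel tasks in a similar way, but keeps the intermediate data in memory whenever it can.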
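To make the three phases concrete, here is a toy word count in plain Python. It is only a sketch of the idea, not Hadoop's or Spark's API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are made up for illustration, and everything runs in one process. The point is that the intermediate pairs stay in memory between phases, whereas Hadoop MapReduce would write them to disk.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    return [(word, 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    # All intermediate results are ordinary in-memory Python objects.
    return reduce_phase(shuffle_phase(map_phase(lines)))


counts = word_count(["big data", "big memory"])
print(counts)
```

After the shuffle, each key has its own list of values, so in a real cluster different keys could be reduced on different machines at the same time.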

  • No 2. How does Hadoop append data?

Just use