Comments:
- distributed architecture for computing & storage
- can run in 3 modes: standalone, pseudo-distributed, fully distributed
- master/slave architecture:
Master : runs NameNode + JobTracker
Slave : runs DataNode + TaskTracker
- execution relies on MapReduce, whose map and reduce tasks run on the slave side
- HDFS is the filesystem used in Hadoop. Each data block is kept in 3 different places by default, split across two racks (two replicas on one rack, the third on another), so losing a single rack does not lose the block
- HBase is a database that runs on top of HDFS and can be used for saving data after the Reduce step
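
To make the map / reduce flow in the notes above concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are made up for illustration, and the word-count job is just an assumed example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit one (word, 1) pair per word in the input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence inside one process."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(["hadoop stores data", "hadoop maps data"])
print(counts)
```

In a real cluster the map calls would run on different slave nodes, close to the HDFS blocks they read, and the reduce output is what could then be written to something like HBase.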
Reference book:
- Windoop應用實作指南-掌握Hadoop翱翔雲端 (Windoop Application Practice Guide: Master Hadoop and Soar in the Cloud) by 許清榮, 林奇暻, 買大誠
Thinking:
- if our target data is distributed everywhere, one option may be to process it locally where it sits and then migrate only the results.
- the point is map & key: the application's work must be divisible into pieces and indexed via a key.
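
A rough sketch of the two thoughts above, again in plain Python rather than Hadoop: each site reduces its own records to per-key totals, and only those small summaries are moved and merged by key. The site data, the function names, and the CRC32-based partitioning are all assumptions chosen for illustration.

```python
import zlib
from collections import Counter


def local_summary(records):
    """Runs where the data lives: reduce raw (key, value) records to per-key totals."""
    totals = Counter()
    for key, value in records:
        totals[key] += value
    return totals


def partition(key, num_partitions):
    """Pick a partition from the key, so the same key always lands in the same place."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def merge_summaries(summaries, num_partitions):
    """Move only the small summaries and merge them per key partition."""
    partitions = [Counter() for _ in range(num_partitions)]
    for summary in summaries:
        for key, total in summary.items():
            partitions[partition(key, num_partitions)][key] += total
    return partitions


# Hypothetical data held at two separate sites.
site_a = [("tw", 3), ("jp", 1), ("tw", 2)]
site_b = [("jp", 4), ("us", 5)]

summaries = [local_summary(site_a), local_summary(site_b)]
partitions = merge_summaries(summaries, num_partitions=2)
combined = sum(partitions, Counter())
print(dict(combined))
```

This only works because every record carries a key: the key decides both how the work is split and where partial results meet again.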