distributed_computing:data_processing:hadoop:hdfs:small_files

Differences

distributed_computing:data_processing:hadoop:hdfs:small_files [2019/10/25 21:33] phreazer
distributed_computing:data_processing:hadoop:hdfs:small_files [2019/10/25 21:55] (current) – [Solutions] phreazer
Line 11:
   * Consolidator
   * HBase: Stores data in indexed SequenceFiles (HBase)
 +  * Spark compaction: https://github.com/KeithSSmith/spark-compaction
 +  * Filecrush: https://github.com/asdaraujo/filecrush
  • distributed_computing/data_processing/hadoop/hdfs/small_files.1572032001.txt.gz
  • Last modified: 2019/10/25 21:33
  • by phreazer