Data Compaction Job

Over time, Iceberg tables can slow down and require data compaction to clean them up.
iomete provides a built-in job that runs data compaction for each table. This job triggers the following Iceberg processes:

  1. Expire Snapshots - See Maintenance - Expire Snapshots
  2. Delete Orphan Files - See Maintenance - Delete Orphan Files
  3. Rewrite Data Files - See Maintenance - Rewrite Data Files
  4. Rewrite Manifests - See Maintenance - Rewrite Manifests
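The four maintenance steps above map to Iceberg's built-in Spark procedures, invoked via SQL `CALL` statements. The sketch below only constructs the statements so it can run standalone; the catalog name `spark_catalog` and the table name `db.events` are assumptions, and in a real Spark job each statement would be passed to `spark.sql(...)`.

```python
# Hypothetical table name used for illustration only.
TABLE = "db.events"

# The Iceberg Spark procedures corresponding to the four maintenance steps.
# `spark_catalog` is an assumed catalog name; replace it with your own.
maintenance_calls = [
    f"CALL spark_catalog.system.expire_snapshots(table => '{TABLE}')",
    f"CALL spark_catalog.system.remove_orphan_files(table => '{TABLE}')",
    f"CALL spark_catalog.system.rewrite_data_files(table => '{TABLE}')",
    f"CALL spark_catalog.system.rewrite_manifests(table => '{TABLE}')",
]

for stmt in maintenance_calls:
    # In a real job: spark.sql(stmt)
    print(stmt)
```

Running the procedures in this order is deliberate: snapshots are expired first so that the orphan-file cleanup and rewrites operate on the trimmed table history.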

To enable the data compaction Spark job, follow these steps:

  1. In the left sidebar menu, choose Spark Jobs
  2. Create a new job
  3. Fill in the form with the following values:

Field Name

Schedule (the example runs the job every Sunday at 12:00; feel free to change the value)

0 12 * * SUN

Docker Image


Main application file


Main class

Leave empty

Instance Size (ICU) (feel free to increase)

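The schedule field uses standard cron syntax: `0 12 * * SUN` means minute 0, hour 12, any day of month, any month, on Sundays. A minimal stdlib sketch of what that schedule resolves to (the function name `next_sunday_noon` is just for illustration):

```python
from datetime import datetime, timedelta

def next_sunday_noon(now: datetime) -> datetime:
    """Next firing time of the cron schedule `0 12 * * SUN`."""
    # Python weekday(): Monday=0 ... Sunday=6
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=12, minute=0, second=0, microsecond=0
    )
    if candidate <= now:  # this Sunday's noon has already passed
        candidate += timedelta(days=7)
    return candidate

# 2024-03-01 is a Friday; the next run lands on Sunday 2024-03-03 at 12:00.
print(next_sunday_noon(datetime(2024, 3, 1, 9, 30)))
```

Any five-field cron expression works here, so a busier table could be compacted nightly with, for example, `0 2 * * *`.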
See the example screenshot below.



We've created an initial data-compaction job that should be sufficient in most cases. Feel free to fork it and build a new data compaction image based on your company's requirements.
View in GitHub
