Binary Activity Aggregation: Storage Comparison

Efficient strategies for managing high-frequency activity data at scale (Max 1,000 Employees).

How It Works

Activities Table (Real-time, Temporary)

  • Stores individual activity pings throughout the day
  • One row created every time employee activity is detected
  • Data persists during business hours only
  • Cleaned up daily after batch job runs
-- Schema
CREATE TABLE activities (
    id          BIGINT PRIMARY KEY,
    employee_id BIGINT,      -- foreign key to the employee record
    ping_minute TIMESTAMP,   -- minute in which activity was detected
    created_at  TIMESTAMP,
    updated_at  TIMESTAMP
);
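
A minimal sketch of how pings might be written to and cleared from this table, using Python's built-in sqlite3 module. The in-memory database, the function names log_ping and cleanup_day, and the choice to truncate each timestamp to the minute are assumptions made for illustration, not details from the design above.

```python
import sqlite3
from datetime import datetime

# Assumption: an in-memory SQLite database stands in for the real store.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE activities (
           id INTEGER PRIMARY KEY,   -- SQLite auto-assigns this value
           employee_id BIGINT,
           ping_minute TIMESTAMP,
           created_at TIMESTAMP,
           updated_at TIMESTAMP)"""
)

def log_ping(employee_id: int, seen_at: datetime) -> None:
    """Insert one activity row, with the timestamp truncated to the minute."""
    minute = seen_at.replace(second=0, microsecond=0).isoformat(sep=" ")
    now = datetime.now().isoformat(sep=" ", timespec="seconds")
    conn.execute(
        "INSERT INTO activities (employee_id, ping_minute, created_at, updated_at) "
        "VALUES (?, ?, ?, ?)",
        (employee_id, minute, now, now),
    )

def cleanup_day(day: str) -> int:
    """Delete every ping for the given date (YYYY-MM-DD); return rows removed."""
    cur = conn.execute("DELETE FROM activities WHERE date(ping_minute) = ?", (day,))
    return cur.rowcount
```

In a real deployment the cleanup would run only after the day's summary row has been written successfully.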

Attendance Table (Daily, Permanent)

  • Stores aggregated daily summary for each employee
  • One row per employee per day
  • Contains compressed binary representation
  • Stores entry time (first) and exit time (last)
  • Permanent historical record
-- Schema
CREATE TABLE attendance (
    id              BIGINT PRIMARY KEY,
    employee_id     BIGINT,
    attendance_date DATE,
    activity_binary BLOB,    -- 30-50 bytes
    entry_time      TIME,    -- first activity of the day
    exit_time       TIME,    -- last activity of the day
    absent          BOOLEAN
);
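
Because the detail lives inside the blob, reading it back means decompressing it first. The sketch below assumes the blob is a Gzip-compressed string of '0' and '1' characters, one per minute, and that the day starts at 09:00. Neither the exact encoding nor the start time is specified above, and the function name active_minutes is hypothetical.

```python
import gzip

def active_minutes(blob: bytes, day_start_minute: int = 9 * 60) -> list[str]:
    """Decompress an activity blob and list the active minutes as HH:MM."""
    bits = gzip.decompress(blob).decode("ascii")
    return [
        f"{(day_start_minute + i) // 60:02d}:{(day_start_minute + i) % 60:02d}"
        for i, bit in enumerate(bits)
        if bit == "1"
    ]

# Example blob: active in slots 0, 1 and 539 of a 540-minute day.
sample = gzip.compress(("11" + "0" * 537 + "1").encode("ascii"))
print(active_minutes(sample))  # ['09:00', '09:01', '17:59']
```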

The Daily Process

  1. During business hours: Activities are logged to the temporary table (540+ rows per employee).
  2. End of day: A batch job generates a binary string from each employee's activities.
  3. Compression: The binary string is compressed with Gzip (down to about 40 bytes).
  4. Persistence: The compressed blob is stored, with the first and last activity as entry and exit times.
  5. Cleanup: All temporary activity logs for that day are deleted.
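
Steps 2 to 4 above can be sketched in Python as follows. This is an illustration under stated assumptions, not the production job: the 09:00 start, the 540-slot window, the '0'/'1' character encoding, and the names build_activity_string and summarize_day are all hypothetical.

```python
import gzip
from datetime import datetime, time

# Assumed working window: 09:00 to 18:00, i.e. 540 one-minute slots.
DAY_START = time(9, 0)
SLOTS = 540

def build_activity_string(pings: list[datetime]) -> str:
    """Return a string of '0'/'1' characters, one per minute of the window."""
    bits = ["0"] * SLOTS
    start = DAY_START.hour * 60 + DAY_START.minute
    for ping in pings:
        slot = ping.hour * 60 + ping.minute - start
        if 0 <= slot < SLOTS:  # pings outside the window are ignored
            bits[slot] = "1"
    return "".join(bits)

def summarize_day(pings: list[datetime]) -> dict:
    """Aggregate one employee's pings into the fields of an attendance row."""
    if not pings:
        return {"activity_binary": None, "entry_time": None,
                "exit_time": None, "absent": True}
    activity = build_activity_string(pings)
    return {
        "activity_binary": gzip.compress(activity.encode("ascii")),
        "entry_time": min(pings).time(),  # first activity of the day
        "exit_time": max(pings).time(),   # last activity of the day
        "absent": False,
    }
```

The compressed size depends on how regular the day's activity pattern is, so the sketch makes no claim about the exact byte count.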

Storage Efficiency: Before vs After

BEFORE: Individual Activity Rows (81 KB/Employee/Day)

Employees   Daily      Monthly     Yearly     5 Years     10 Years
100         8.1 MB     243 MB      2.96 GB    14.8 GB     29.6 GB
250         20.25 MB   607.5 MB    7.39 GB    36.95 GB    73.9 GB
1,000       81 MB      2.43 GB     29.6 GB    148 GB      296 GB

AFTER: Compressed Binary (40 Bytes/Employee/Day)

Employees   Daily      Monthly     Yearly     5 Years     10 Years
100         4 KB       120 KB      1.46 MB    7.3 MB      14.6 MB
1,000       40 KB      1.2 MB      14.6 MB    73 MB       146 MB
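
The figures in both tables follow from simple multiplication. The sketch below reproduces them, assuming decimal units (1 KB = 1,000 bytes), a 30-day month, and a 365-day year, which is what the table values imply. The helper names project and human are hypothetical.

```python
# Period lengths in days, inferred from the table values.
DAYS = {"daily": 1, "monthly": 30, "yearly": 365,
        "5 years": 5 * 365, "10 years": 10 * 365}

def project(bytes_per_employee_day: int, employees: int) -> dict[str, int]:
    """Return projected storage in bytes for each period in DAYS."""
    return {label: bytes_per_employee_day * employees * days
            for label, days in DAYS.items()}

def human(n: float) -> str:
    """Format a byte count using decimal units (1 KB = 1,000 bytes)."""
    for unit in ("bytes", "KB", "MB", "GB"):
        if n < 1000 or unit == "GB":
            return f"{n:g} {unit}"
        n /= 1000

# Before: 81 KB per employee per day; after: 40 bytes per employee per day.
for label, size in project(81_000, 1_000).items():
    print("before", label, human(size))
for label, size in project(40, 1_000).items():
    print("after", label, human(size))
```

The tables round the longer projections, so the unrounded output differs slightly in the last digit for the yearly and multi-year columns.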

Storage Reduction Impact: 99.95%

  • Storage Factor: 2,025x smaller requirement
  • 10 Year Savings: 295.8 GB free space retained
  • Max Row Count: 1 row per day / employee
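
The headline numbers can be checked directly from the per-employee figures. The working below assumes decimal units (1 KB = 1,000 bytes) and uses the rounded 10-year totals from the tables for 1,000 employees.

```latex
\begin{align*}
\text{storage factor} &= \frac{81{,}000\ \text{bytes}}{40\ \text{bytes}} = 2{,}025 \\
\text{reduction} &= 1 - \frac{40}{81{,}000} \approx 0.9995 = 99.95\% \\
\text{10-year savings} &\approx 296\ \text{GB} - 0.146\ \text{GB} \approx 295.85\ \text{GB}
\end{align*}
```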

Key Insights

Efficiency: From 540 temporary rows to 1 permanent row per employee per day.

Density: Data reduced from 81 KB to 40 bytes (99.95% reduction).

Integrity: Gzip compression is lossless, so the activity detail is preserved while space is saved.

Scalability: Minimal storage growth even at 1,000 employee scale.

Want to optimize your data storage?

I specialize in building high-performance, scalable software solutions with an emphasis on extreme efficiency.
