Binary Activity Aggregation: Storage Comparison
An efficient strategy for storing high-frequency activity data at scale (up to 1,000 employees).
How It Works
Activities Table (Real-time, Temporary)
- Stores individual activity pings throughout the day
- One row created every time employee activity is detected
- Data persists during business hours only
- Cleaned up daily after batch job runs
-- Schema
id (BIGINT PRIMARY KEY)
employee_id (BIGINT FOREIGN KEY)
ping_minute (TIMESTAMP)
created_at (TIMESTAMP)
updated_at (TIMESTAMP)
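As a rough illustration of how this temporary table might be used, the sketch below creates it in an in-memory SQLite database, logs pings truncated to the minute, and deletes one day's rows after the batch job. SQLite, the text-based timestamp storage, and the helper names `log_ping` and `cleanup_day` are assumptions made for the example; the source does not name a database engine.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE activities (
        id          INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL,
        ping_minute TEXT NOT NULL,   -- ISO timestamp truncated to the minute
        created_at  TEXT NOT NULL,
        updated_at  TEXT NOT NULL
    )
""")


def log_ping(employee_id: int, seen_at: datetime) -> None:
    """Insert one row for a detected activity (hypothetical helper)."""
    minute = seen_at.replace(second=0, microsecond=0).isoformat()
    now = seen_at.isoformat()
    conn.execute(
        "INSERT INTO activities (employee_id, ping_minute, created_at, updated_at) "
        "VALUES (?, ?, ?, ?)",
        (employee_id, minute, now, now),
    )


def cleanup_day(day: str) -> int:
    """Delete all pings for one day (YYYY-MM-DD); returns the rows removed."""
    cur = conn.execute(
        "DELETE FROM activities WHERE substr(ping_minute, 1, 10) = ?", (day,)
    )
    return cur.rowcount
```

Calling `cleanup_day("2024-01-15")` after the batch job would remove only that day's rows and leave any later pings in place.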
Attendance Table (Daily, Permanent)
- Stores aggregated daily summary for each employee
- One row per employee per day
- Contains compressed binary representation
- Stores entry time (first) and exit time (last)
- Permanent historical record
-- Schema
id (BIGINT PRIMARY KEY)
employee_id (BIGINT)
attendance_date (DATE)
activity_binary (BLOB) -- 30-50 bytes
entry_time (TIME)
exit_time (TIME)
absent (BOOLEAN)
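One way to enforce the "one row per employee per day" rule is a uniqueness constraint on the employee and date columns. The sketch below shows this with SQLite. The engine, the `UNIQUE` constraint, and the `store_day` helper are assumptions for illustration, since the schema above lists only column names and types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attendance (
        id              INTEGER PRIMARY KEY,
        employee_id     INTEGER NOT NULL,
        attendance_date TEXT NOT NULL,      -- YYYY-MM-DD
        activity_binary BLOB,               -- compressed activity data
        entry_time      TEXT,
        exit_time       TEXT,
        absent          INTEGER NOT NULL DEFAULT 0,
        UNIQUE (employee_id, attendance_date)  -- assumed constraint
    )
""")


def store_day(employee_id, day, blob, entry_time, exit_time) -> bool:
    """Insert the daily summary; returns False if that day already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO attendance (employee_id, attendance_date, "
                "activity_binary, entry_time, exit_time, absent) "
                "VALUES (?, ?, ?, ?, ?, ?)",
                (employee_id, day, blob, entry_time, exit_time, int(blob is None)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With the constraint in place, a batch job that runs twice for the same day cannot create a duplicate row.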
The Daily Process
- During business hours: Activities are logged to the temporary table (540+ rows per employee per day).
- End of day: A batch job generates a binary string from that day's activities.
- Compression: The binary string is compressed with Gzip (to about 40 bytes).
- Persistence: The compressed blob is stored, with the first and last activity recorded as entry and exit times.
- Cleanup: All temporary activity rows for that day are deleted.
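The aggregation and compression steps above can be sketched as follows. This is a minimal illustration, not the actual batch job: it assumes a 9-hour day starting at 09:00 (540 one-minute slots), packs one bit per minute into 68 bytes, and uses hypothetical function names (`build_bitmap`, `summarize_day`, `active_slots`). The source does not specify the bit layout or the business hours.

```python
import gzip
from datetime import datetime, time

DAY_START = time(9, 0)  # assumed start of business hours
SLOTS = 540             # assumed 9-hour day, one bit per minute


def build_bitmap(pings):
    """Pack activity into SLOTS bits; slot 0 is the most significant bit."""
    bits = 0
    for ts in pings:
        slot = (ts.hour - DAY_START.hour) * 60 + (ts.minute - DAY_START.minute)
        if 0 <= slot < SLOTS:  # ignore pings outside business hours
            bits |= 1 << (SLOTS - 1 - slot)
    return bits.to_bytes((SLOTS + 7) // 8, "big")


def summarize_day(pings):
    """Build the values for one attendance row from a day's pings."""
    if not pings:
        return {"activity_binary": None, "entry_time": None,
                "exit_time": None, "absent": True}
    return {
        "activity_binary": gzip.compress(build_bitmap(pings), mtime=0),
        "entry_time": min(pings).time(),   # first activity of the day
        "exit_time": max(pings).time(),    # last activity of the day
        "absent": False,
    }


def active_slots(blob):
    """Decompress a stored blob back into the list of active minute slots."""
    bits = int.from_bytes(gzip.decompress(blob), "big")
    return [s for s in range(SLOTS) if (bits >> (SLOTS - 1 - s)) & 1]
```

The Gzip format adds a fixed 18 bytes of header and trailer, so the final blob size depends on how regular the activity pattern is. Long runs of active or idle minutes compress well, while an irregular pattern can produce a blob larger than the 68-byte raw bitmap.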
Storage Efficiency: Before vs After
BEFORE: Individual Activity Rows (81 KB/Employee/Day)
| Employees | Daily | Monthly | Yearly | 5 Years | 10 Years |
|---|---|---|---|---|---|
| 100 | 8.1 MB | 243 MB | 2.96 GB | 14.8 GB | 29.6 GB |
| 250 | 20.25 MB | 607.5 MB | 7.39 GB | 36.95 GB | 73.9 GB |
| 1,000 | 81 MB | 2.43 GB | 29.6 GB | 148 GB | 296 GB |
AFTER: Compressed Binary (40 Bytes/Employee/Day)
| Employees | Daily | Monthly | Yearly | 5 Years | 10 Years |
|---|---|---|---|---|---|
| 100 | 4 KB | 120 KB | 1.46 MB | 7.3 MB | 14.6 MB |
| 1,000 | 40 KB | 1.2 MB | 14.6 MB | 73 MB | 146 MB |
Storage Reduction Impact: 99.95%
- Storage factor: 2,025x smaller requirement
- 10-year savings: 295.8 GB of free space retained (at 1,000 employees)
- Max row count: 1 row per employee per day
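The headline figures can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 KB = 1,000 bytes) and a 365-day year, which is what the tables appear to use. The 150-byte row size is not stated in the source; it is simply what 81 KB spread over 540 rows implies.

```python
BEFORE_BYTES = 81_000  # 81 KB per employee per day (decimal KB assumed)
AFTER_BYTES = 40       # compressed blob per employee per day
ROWS_PER_DAY = 540     # temporary activity rows per employee per day


def projected_bytes(per_employee_day, employees, days):
    """Total storage for a given headcount and number of days."""
    return per_employee_day * employees * days


storage_factor = BEFORE_BYTES / AFTER_BYTES                    # how many times smaller
reduction_pct = round((1 - AFTER_BYTES / BEFORE_BYTES) * 100, 2)
implied_row_bytes = BEFORE_BYTES / ROWS_PER_DAY                # derived, not from the source

# Yearly totals at 1,000 employees, in GB (before) and MB (after)
yearly_before_gb = projected_bytes(BEFORE_BYTES, 1000, 365) / 1e9
yearly_after_mb = projected_bytes(AFTER_BYTES, 1000, 365) / 1e6

print(storage_factor, reduction_pct, implied_row_bytes)
print(round(yearly_before_gb, 1), yearly_after_mb)
```

The multi-year columns in the tables are built from these rounded yearly values, so unrounded results differ slightly in the last digit.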
Key Insights
- Efficiency: From 540 temporary rows to 1 permanent row per employee per day.
- Density: Data reduced from 81 KB to 40 bytes per employee per day (99.95% reduction).
- Integrity: Gzip compression is lossless, so the stored activity detail is preserved while space is optimized.
- Scalability: Minimal storage growth even at the 1,000-employee scale.