elementary.event_freshness_anomalies
Monitors the freshness of event data over time, measured as the time it takes each event to load:
that is, the time between when the event actually occurs (the event timestamp) and when it is loaded into the
database (the update timestamp).
This test complements the freshness_anomalies test and is primarily intended for data that is updated in a continuous / streaming fashion.
The test can work in one of two modes:
- If only an event_timestamp_column is supplied, the test measures over time the difference between the current timestamp (“now”) and the most recent event timestamp.
- If both an event_timestamp_column and an update_timestamp_column are provided, the test measures over time the difference between these two columns.
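The two modes above can be sketched in a few lines of Python. This is only an illustration of the measurement, not Elementary's implementation: the function name `freshness_seconds` is hypothetical, and taking the maximum per-row delay in the second mode is an assumption made for the sketch.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence


def freshness_seconds(
    event_timestamps: Sequence[datetime],
    update_timestamps: Optional[Sequence[datetime]] = None,
    now: Optional[datetime] = None,
) -> float:
    """Return one freshness measurement in seconds (hypothetical helper)."""
    if not event_timestamps:
        raise ValueError("at least one event timestamp is required")

    if update_timestamps is None:
        # Mode 1: only event timestamps are available, so measure how far
        # "now" is from the most recent event.
        now = now or datetime.now(timezone.utc)
        return (now - max(event_timestamps)).total_seconds()

    if len(update_timestamps) != len(event_timestamps):
        raise ValueError("expected one update timestamp per event")

    # Mode 2: both columns are available, so measure the per-row load delay.
    # Aggregating with max() is an assumption of this sketch.
    delays = [
        (loaded - occurred).total_seconds()
        for occurred, loaded in zip(event_timestamps, update_timestamps)
    ]
    return max(delays)
```

Computing this value once per time bucket gives the series that the test then monitors for anomalies.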
Test configuration
Required configuration: event_timestamp_column
Default configuration: anomaly_direction: spike, so the test alerts only on delays.
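To see why a spike-only direction alerts on delays but not on unusually fresh data, here is a minimal one-sided check in Python. It is a sketch under stated assumptions: this section does not describe how anomalies are scored, so the z-score rule, the function name `is_spike`, and the default threshold of 3.0 are all illustrative choices.

```python
from statistics import mean, stdev
from typing import Sequence


def is_spike(training: Sequence[float], latest: float, sensitivity: float = 3.0) -> bool:
    """Flag `latest` only when it is unusually HIGH compared to `training`."""
    if len(training) < 2:
        return False  # not enough history to estimate the spread

    mu = mean(training)
    sigma = stdev(training)
    if sigma == 0:
        # Perfectly constant history: any increase counts as a spike.
        return latest > mu

    z = (latest - mu) / sigma
    # One-sided check: large positive deviations (delays) alert;
    # large negative deviations (fresher than usual) do not.
    return z > sensitivity
```

With a training series of load delays around 300 seconds, a new value of 600 would be flagged, while a value of 10 would not, because only the upper tail is checked.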
tests:
  - elementary.event_freshness_anomalies:
      event_timestamp_column: column name
      update_timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      detection_period:
        period: [hour | day | week | month]
        count: int
      training_period:
        period: [hour | day | week | month]
        count: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      seasonality: day_of_week
      detection_delay:
        period: [hour | day | week | month]
        count: int
      ignore_small_changes:
        spike_failure_percent_threshold: int
        drop_failure_percent_threshold: int
      anomaly_exclude_metrics: [SQL expression]
models:
  - name: < model name >
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: < timestamp column > # Mandatory
          update_timestamp_column: < timestamp column > # Optional
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >