A flow ingest code update applied to this cluster at around 23:00 UTC caused our hot restart logic to put all agent-based ingest over HTTPS into a 10-minute hold-down period. We rolled back immediately when we saw the drop in flow, which triggered an additional 10-minute hold-down period, after which our ingest capabilities returned to a nominal state.
The offending bug in our hot restart logic has been resolved and will not recur in future ingest updates.
Posted Oct 25, 2022 - 16:55 UTC
This incident has been resolved.
Posted Oct 25, 2022 - 00:00 UTC
The rollback is complete and we are continuing to monitor.
Posted Oct 24, 2022 - 23:30 UTC
Beginning at approximately 23:00 UTC, Kentik engineering rolled out an ingest layer update that adversely affected all agent-based ingest. We are rolling back.