Incident on 2023-07-21 - VPC CNI not allocating IP addresses
- Key events:
  - First detected: 2023-07-21 08:15
  - Incident declared: 2023-07-21 09:31
  - Repaired: 2023-07-21 12:42
  - Resolved: 2023-07-21 12:42
 
- Time to repair: 4h 27m 
- Time to resolve: 4h 27m 
- Identified: A user reported issues with new deployments in #ask-cloud-platform
- Impact: Service availability for CP applications may have been degraded or at increased risk of failure.
- Context:
  - 2023-07-21 08:15 - A user reported issues with new deployments (pods stuck in ContainerCreating)
  - 2023-07-21 09:00 - Team started compiling a list of all affected namespaces
  - 2023-07-21 09:31 - Incident declared
  - 2023-07-21 09:45 - Team identified that the issue affected 6 nodes, added new nodes, and began to cordon and drain the affected nodes
  - 2023-07-21 12:35 - Compared CNI settings on a 1.23 test cluster with live and found that one setting differed
  - 2023-07-21 12:42 - Enabled Prefix Delegation on the live cluster
  - 2023-07-21 12:42 - Incident repaired
  - 2023-07-21 12:42 - Incident resolved
 
- Resolution:
  - The issue was caused by a setting that was missing on the live cluster: Prefix Delegation was not enabled for the VPC CNI. The team enabled it on the live cluster, which resolved the issue.
 
- Review actions:
  - Add a test/check to ensure that IP address allocation is working as expected #4669
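
As an illustration of the kind of check the review action describes, here is a minimal Python sketch. It is not the team's actual check: the function names (`stuck_pods`, `daemonset_env`, `env_diff`) and the 10-minute threshold are hypothetical. It assumes the input is the parsed JSON from `kubectl get pods -A -o json` and `kubectl get daemonset aws-node -n kube-system -o json`. The variable name `ENABLE_PREFIX_DELEGATION` comes from the upstream AWS VPC CNI documentation, not from this report.

```python
from datetime import datetime, timedelta, timezone


def stuck_pods(pod_list, now=None, max_age=timedelta(minutes=10)):
    """Return (namespace, name, node) for pods stuck in ContainerCreating.

    pod_list is assumed to be the parsed output of
    `kubectl get pods -A -o json`. The 10-minute default is an arbitrary
    illustrative threshold, not a value taken from the incident.
    """
    now = now or datetime.now(timezone.utc)
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        # Kubernetes timestamps look like 2023-07-21T08:15:00Z.
        created = datetime.fromisoformat(
            meta["creationTimestamp"].replace("Z", "+00:00")
        )
        if now - created < max_age:
            continue  # still young enough to be starting normally
        statuses = pod.get("status", {}).get("containerStatuses", [])
        reasons = [
            s.get("state", {}).get("waiting", {}).get("reason") for s in statuses
        ]
        if "ContainerCreating" in reasons:
            node = pod.get("spec", {}).get("nodeName")
            stuck.append((meta["namespace"], meta["name"], node))
    return stuck


def daemonset_env(daemonset, container="aws-node"):
    """Return {name: value} for one container's env in a parsed DaemonSet."""
    for c in daemonset["spec"]["template"]["spec"]["containers"]:
        if c["name"] == container:
            return {e["name"]: e.get("value") for e in c.get("env", [])}
    return {}


def env_diff(reference, live):
    """Return {name: (reference_value, live_value)} for settings that differ."""
    names = sorted(set(reference) | set(live))
    return {
        n: (reference.get(n), live.get(n))
        for n in names
        if reference.get(n) != live.get(n)
    }
```

A scheduled job could call `stuck_pods` and alert when the result is non-empty, while `env_diff` applied to the `aws-node` DaemonSet of a reference cluster and of live would surface a difference such as the Prefix Delegation setting without a manual comparison.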