fix: delete cni statefile when unable to be parsed #3551

Draft
wants to merge 1 commit into
base: master
from

Conversation

QxBytes
Contributor

@QxBytes QxBytes commented Apr 1, 2025

Reason for Change:

In some scenarios (usually on Windows), an OS crash can leave null bytes written to the state and log files. When the CNI then tries to restore its state, it cannot parse the statefile and fails. Every subsequent retry fails as well, because the state is unrecoverable. This PR changes that behavior: if reading the statefile produces a syntax error (for example, because the file is filled with null bytes), the entire CNI statefile is deleted, since manual intervention would be needed to recover anyway. So far the null statefile issue only seems to appear in the pipelines on Windows nodes.

Issue Fixed:

See above

Requirements:

Notes:
This issue appears sporadically.

@QxBytes QxBytes self-assigned this Apr 1, 2025
@QxBytes QxBytes added fix Fixes something. ci Infra or tooling. labels Apr 1, 2025
@@ -192,13 +193,19 @@ func (nm *networkManager) restore(isRehydrationRequired bool) error {
// Read any persisted state.
err := nm.store.Read(storeKey, nm)
if err != nil {
Contributor

Does it make sense to log the file contents, invalid bytes and all, to assist with troubleshooting?

Labels
ci Infra or tooling. fix Fixes something.
2 participants