✨ Support for BootstrapSelfManagedAddons flag for EKS cluster creation #5222

Open
wants to merge 1 commit into
base: main
from

Conversation

jas-nik
Contributor

@jas-nik jas-nik commented Nov 20, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

Adds a flag for BootstrapSelfManagedAddons so that a bare EKS cluster can be provisioned without the default add-ons (CoreDNS, kube-proxy, Amazon VPC CNI).

https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html

By default, EKS installs multiple networking add-ons during cluster creation. This includes the Amazon VPC CNI, CoreDNS, and kube-proxy.

If you’d like to disable the installation of these default networking add-ons, set the bootstrapSelfManagedAddons parameter to false when creating the cluster. This may be used for alternate CNIs, such as Cilium. Review the EKS API reference for more information.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Add flag to support BootstrapSelfManagedAddons to provision a bare EKS cluster without the default add-ons (CoreDNS, kube-proxy, Amazon VPC CNI)

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 20, 2024
@k8s-ci-robot k8s-ci-robot added needs-priority needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 20, 2024
@k8s-ci-robot
Contributor

Hi @jas-nik. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@adriananeci
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 20, 2024
@jas-nik
Contributor Author

jas-nik commented Nov 20, 2024

"failed to create new managed VPC: failed to create vpc: The maximum number of VPCs has been reached"

😞

@jas-nik
Contributor Author

jas-nik commented Nov 20, 2024

/retest

@JonnieDoe

/lgtm

@k8s-ci-robot
Contributor

@JonnieDoe: changing LGTM is restricted to collaborators

In response to this:

/lgtm


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 20, 2024
@JonnieDoe

/retest

@jas-nik
Contributor Author

jas-nik commented Nov 25, 2024

/retest

@nrb
Contributor

nrb commented Dec 5, 2024

@jas-nik Sorry about the delay here, and thank you for this contribution.

I've been troubleshooting EKS CI issues, and this behavior is very welcome :)

I do ask that you add some tests for this case. We'll need at least a cluster template and to point to that template in the e2e test config.

@jas-nik
Contributor Author

jas-nik commented Jan 27, 2025

@nrb Apologies, this fell off my radar. I'll get the tests added.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@damdo
Member

damdo commented Feb 4, 2025

Hey @jas-nik CAPA doesn't do merge commits, would you be able to rebase instead?
Thanks!

@k8s-ci-robot k8s-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 4, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 4, 2025
@nrb
Contributor

nrb commented Feb 4, 2025

/test pull-cluster-api-provider-aws-test

Hit VPC limit.

@jas-nik
Contributor Author

jas-nik commented Feb 4, 2025

Still the same VPC limit issue.

@jas-nik
Contributor Author

jas-nik commented Feb 4, 2025

/retest

@damdo
Member

damdo commented Feb 5, 2025

/test pull-cluster-api-provider-aws-test

@jas-nik
Contributor Author

jas-nik commented Feb 6, 2025

@damdo @nrb @richardcase would you be able to help with the VPC limit issue?

@damdo
Member

damdo commented Feb 6, 2025

/test pull-cluster-api-provider-aws-test

@nrb
Contributor

nrb commented Feb 6, 2025

@jas-nik Unfortunately we can't allocate more VPCs right now. The best we can do is retry the tests at off-peak times.

@nrb
Contributor

nrb commented Feb 6, 2025

/retest

@richardcase
Member

@jas-nik @nrb @damdo - are we getting the "VPC limit" error when running just the unit tests or the e2e tests? If it's the unit tests (i.e. via /test pull-cluster-api-provider-aws-test), which I guess is the case from the discussion, then that should be within our control to change, as the unit tests shouldn't be hitting AWS and this may be coming from our "resource counting" code.

Also, if it is the e2e tests, then potentially we can increase the service limits.

@k8s-ci-robot
Contributor

@jas-nik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-aws-test 49f825a link true /test pull-cluster-api-provider-aws-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@richardcase
Member

Are we sure that's the real error? We have a test that checks for the maximum number of VPCs:

g.Expect(err.Error()).To(ContainSubstring("The maximum number of VPCs has been reached"))

I will have a look at the logs in the morning to check.
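
To illustrate why that message can appear in the logs of a run whose real failure lies elsewhere: a negative test deliberately triggers the limit error and asserts on its text, so the message is produced even when that test passes. The sketch below uses only the standard library; createVPC, expectsLimitError, and the quota numbers are hypothetical stand-ins, not CAPA code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// vpcLimitMsg is the error text quoted earlier in this thread.
const vpcLimitMsg = "The maximum number of VPCs has been reached"

// createVPC is a hypothetical fake: it fails once the fake quota is used up.
func createVPC(existing, limit int) error {
	if existing >= limit {
		return fmt.Errorf("failed to create vpc: %w", errors.New(vpcLimitMsg))
	}
	return nil
}

// expectsLimitError mirrors the assertion quoted above: it reports success
// only when the limit error is actually returned.
func expectsLimitError(err error) bool {
	return err != nil && strings.Contains(err.Error(), vpcLimitMsg)
}

func main() {
	err := createVPC(5, 5) // quota exhausted on purpose
	fmt.Println("logged error:", err)
	fmt.Println("negative test passes:", expectsLimitError(err))
}
```

If the real suite behaves like this, the limit message alone would not show which test failed; the FAIL lines in the log are the more reliable signal.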

@richardcase
Member

richardcase commented Feb 6, 2025

Looking at the logs, there seem to be multiple issues with the tests:

  • FAIL: TestDefaultingWebhook (0.59s) (and sub tests)
  • FAIL: TestCreateCluster/cluster_create_with_2_subnets (0.00s)
  • FAIL: TestCreateIPv6Cluster (0.01s)

It's worth running just these tests locally to see what's going on.

@jas-nik
Contributor Author

jas-nik commented Feb 6, 2025

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_cluster-api-provider-aws/5336/pull-cluster-api-provider-aws-test/1887504193065848832 - The build for one of the latest PRs passed even though it logged the VPC limit failure, so that error might be a red herring after all. Thank you for chiming in. I still need to investigate what the actual failure is.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 28, 2025
@k8s-ci-robot
Contributor

PR needs rebase.


Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-priority needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
7 participants