-
Notifications
You must be signed in to change notification settings - Fork 195
On-boarding Kata Containers into OpenShift CI #2802
Conversation
Enable openshift-ci on runtime repository. Fixes: kata-containers#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
0414d9f
to
ba2695f
Compare
Thanks @wainersm for this PR. To solve the |
Thanks @chavafg ! I've added an entry on There is another issue however. On my readme I am referencing urls that should be available only when openshift/release#9009 gets merged. It is a chicken-and-eggs problem. How usually you guys overcome this? |
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Hi @wainersm - thanks for raising! Travis is failing as the new file isn't referenced by any others:
Please add a link to the new file in the Note that looking at the doc, there are some words our spell checker will complain about. Once you've re-pushed Travis will run that for you, but you might want to run the spell checker locally until you get a clean pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising @wainersm. A few comments and suggestions...
.ci/openshift-ci/README.md
Outdated
@@ -0,0 +1,63 @@ | |||
# OpenShift CI files | |||
|
|||
The OpenShift CI (openshift-ci) allows for components to be built and tested in clusters provided by the OpenShift community. An entitled component should follow the [onboarding process](https://github.com/openshift/ci-tools/blob/master/ONBOARD.md) which involve its configuration in the [ci-operator](https://github.com/openshift/ci-operator/blob/master/README.md). So this directory contains scripts and files used by the OpenShift ci-operator to test kata-containers in that CI environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: very long lines, which makes diff harder to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm re-writing the README to keep the lines under 80-chars limit.
.ci/openshift-ci/README.md
Outdated
|
||
The ci-operator behavior is defined in yaml files which are placed at the [config directory](https://github.com/openshift/release/tree/master/ci-operator/config) in a per-component/repository/branch basis. This directory is organized such as the operator can infer the components and their repositories that should be tested. And the yaml configuration files should be named in a way that it knows the branch as well. | ||
|
||
Therefore, there is the *kata-containers* directory (find it [here](https://github.com/openshift/release/tree/master/ci-operator/config/kata-containers)) to host the configuration files for our project. The configuration files are named as `kata-containers/<repository>/kata-containers-<repository>-<branch>.yaml`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spell checker won't like this I think so I'd suggest changing to kata-containers/
.
.ci/openshift-ci/README.md
Outdated
@@ -0,0 +1,63 @@ | |||
# OpenShift CI files | |||
|
|||
The OpenShift CI (openshift-ci) allows for components to be built and tested in clusters provided by the OpenShift community. An entitled component should follow the [onboarding process](https://github.com/openshift/ci-tools/blob/master/ONBOARD.md) which involve its configuration in the [ci-operator](https://github.com/openshift/ci-operator/blob/master/README.md). So this directory contains scripts and files used by the OpenShift ci-operator to test kata-containers in that CI environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ci-operator" isn't a dictionary word to you either need to add it to our spell checker dictionary (see https://github.com/kata-containers/tests/tree/master/cmd/check-spelling), or -- simpler -- just refer to it as "CI operator" instead maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I changed the README to use "CI operator" instead.
.ci/openshift-ci/README.md
Outdated
|
||
By the means of [templates](https://github.com/openshift/release/tree/master/ci-operator/templates) the ci-operator allows for preparing the test environment. Commonly those templates are made to instruct the operator on how to deploy and configure OpenShift in a given cloud provider, then run end-to-end tests. | ||
|
||
In special for kata-containers the used template needs to enable nested virtualization in the test cluster. See the template currently used in the kata-containers's ci-operator configuration file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The project should always be referred to as "Kata Containers" in documentation. But here I think you are referring to the directory? In which case, put it in back-ticks: kata-containers/
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-wrote the README to use the project's name correctly as "Kata Containers".
ds_pods=($(oc get pods | awk '{if ($1 ~ "kata-deploy") { print $1 } }')) | ||
cnt=5 | ||
while [[ ${#ds_pods[@]} -gt 0 && $cnt -ne 0 ]]; do | ||
sleep 120 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a long time; does it need to be this long?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran this script in OpenShift CI many times, the kata-deploy daemonset never finished before 2min. Usually it takes 3 iterations to finish (~6 min). So I don't think 2 min on this sleep is a long time.
sleep 120 | ||
for i in ${!ds_pods[@]}; do | ||
info "Check daemonset ${ds_pods[i]} is running" | ||
rc=$(oc exec ${ds_pods[i]} -- cat /tmp/kata_install_status 2> /dev/null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: odd indentation (tabs vs. spaces?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've rewrite the script to fix those tabs vs spaces and long-than-80-chars lines. Thanks!
oc logs ${ds_pods[i]} | ||
unset ds_pods[i] | ||
else | ||
info "Running" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing break
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't need a break
here. I check the number of running ds_pods
in the begin of the loop.
oc describe pods $p | ||
oc logs $p | ||
done | ||
# TODO: exit here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please resolve and remove the comment, or add a ref to a GitHub issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
# Apply the CRI-O configuration | ||
info "Configuring kata runtime for CRI-O" | ||
if [ -z "$KATA_CRIO_CONF_BASE64" ]; then | ||
export KATA_CRIO_CONF_BASE64=$(echo $(cat $configs_dir/crio_kata.conf|base64) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is starting to look a bit like lisp :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I found it a little bit uncommon (so to speak). But I wanted the config files versioned in plain-text so that any change would be properly visualized with diff. So the only solution I've found was to convert from text to base64 on run-time. Perhaps the if [ -z "$KATA_CRIO_CONF_BASE64" ]; then
is not needed....
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
ba2695f
to
83c1aa5
Compare
Hi @jodh-intel , thanks for your careful initial review! I tried to address as much as possible the comments you made on this new version. One important thing I added was the logic to wait the worker nodes to reboot after a change in the system configuration. I see that Travis warns me about missing license on some files, I will address it in a following version. v1 -> v2:
|
83c1aa5
to
5ec7e70
Compare
v2 -> v3:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wainersm. A few more comments.
.ci/openshift-ci/README.md
Outdated
These files can be automatically generated from the CI operator configuration | ||
files as follows: | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You could specify ```sh
or ```bash
to get syntax highlighting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
.ci/openshift-ci/README.md
Outdated
|
||
1. `pipeline:src` - wraps the source repository | ||
* The source repository is copied to `$KATA_SRC/<repo>` in the image. Assume | ||
that `$KATA_SRC` is `/go/github.com/src/kata-containers` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: missing periods at the end of the full sentences in these bullets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
.ci/openshift-ci/README.md
Outdated
|
||
## Install Kata Containers | ||
|
||
In this stage the test cluster should get Kata Containers installed in. In other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/In other to achieve/In order to achieve/ ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yet another good catch. thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
.ci/openshift-ci/README.md
Outdated
to the test cluster. Then the `$KATA_SRC/tests/.ci/openshift-ci/test.sh` script | ||
is called, but it delegates the installation process to the `$KATA_SRC/tests/.ci/openshift-ci/cluster/install_kata.sh` script. | ||
|
||
In a nutshell, it is used a Kubernetes *DaemonSet* to install Kata Containers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a nutshell
🌰 This might be confusing for non-native English speakers 🌰 <-- That's a nut folks, not what you might be thinking! 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehehehehe Good to know. I always assumed 'in a nutshell' were an expression widely understood in technical communities. Maybe because the first computer books I got access were the ' in a nutshell' series of O'REILLY? Oh, I am getting old... :D
# Wait all worker nodes reboot. | ||
# | ||
# Params: | ||
# $1 - timeout in seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (defensive programming): We try to check params at the start of a function:
wait_for_reboot() {
local timeout="${1:-}"
[ -z "$timeout" ] && die "need timeout"
# ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to change to local delta="${1:-1800}"
. So I don't need to pass a repeated value on the two calls made in this script, yet let the function configurable for future usage.
$(cat $configs_dir/crio_kata.conf|base64) | sed -e 's/\s//g') | ||
fi | ||
envsubst < ${deployments_dir}/machineconfig_kata_runtime.yaml.in | oc apply -f - | ||
if [ $(oc get machineconfigs|grep 50-kata|wc -l) != "0" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line looks potentially fragile - where is 50-kata
defined and is it guaranteed to ever change? Maybe "surface" it by creating a readonly
variable at the top of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 50-kata
is defined inside ${deployments_dir}/machineconfig_kata_runtime.yaml.in. I just learned that I can get the return code of oc get -f ${deployments_dir}/machineconfig_kata_runtime.yaml.in
instead.
envsubst < ${deployments_dir}/machineconfig_selinux.yaml.in | oc apply -f - | ||
oc get machineconfig/51-kata-selinux || die "SELinux machineconfig not found" | ||
# The new SELinux configuration will trigger another reboot. | ||
wait_for_reboot 1800 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: duplicated timeout value - maybe define a variable at the top of the file to avoid duplication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Now the wait_for_reboot
has 1800 as default timeout.
The install_runtime.sh can configure Docker with the kata runtime. In some circumstances Docker is not installed and so the script fail. Let's avoid it by checking if Docker is installed. Fixes kata-containers#2600 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
5ec7e70
to
df028f0
Compare
@jodh-intel Hi James, version 4 just came out of the oven ;) v3 -> v4:
|
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
/cc @fidencio |
export DESTDIR="$1" | ||
info "Build and install Kata Containers at ${DESTDIR}" | ||
|
||
[ "$ID" != "centos" ] && die "Expect the build root to be CentOS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to add a comment here explaining why you expect CentOS. I had the same reaction as @jodh-intel when I saw that test.
# The build root container has already golang installed, let's remove it | ||
# so that it will use the version required by kata. | ||
yum remove -y golang | ||
"${cidir}/install_go.sh" -p -f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions about this
- What happens if you try to use the CentOS golang? (just curiosity on my part)
- The install script installs go using
curl
. Should you addyum install curl
for safety where you also installsudo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions about this
1. What happens if you try to use the CentOS golang? (just curiosity on my part)
Good question. I don't remember If I ever tried the CentOS golang. What I remember is that something used to fail if the environment had the CentOS golang package installed.
The fact is that I tried to adhere this build script as much as possible with the process used in all the Kata Jenkins jobs. And they don't use the distro's golang, instead it is installed the version suggested (runtime/versions.yaml
) via tests/.ci/install_go.sh
.
2. The install script installs go using `curl`. Should you add `yum install curl` for safety where you also install `sudo`?
Yes, maybe. In the CentOS image used as build root container has curl
installed so I didn't bothered.
# The default /usr prefix makes the deployment on Red Hat CoreOS (rhcos) more | ||
# complex because that directory is read-only by design. Another prefix than | ||
# /opt/kata is problematic too because QEMU experimental eventually got from | ||
# Kata Containers Jenkins is uncompressed in that directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this sentence. Even grammatically.
Do you mean that the CI also installs qemu, and that we need to add our stuff in the same spot (if so, why?). Is it because the prefix will be used for the configuration files, and passing /opt/kata
"magically" generates configuration files that point to the right qemu? If so, it really needs to be explained. That's what I understood reading the sentence, but clearly. the sed
below adjusts the configuration to point back to /usr/bin
. so I'm a bit confused.
From a CI point of view, does that mean that what we test with the CI is not installed in the path where it would be installed on an RHCOS system? What are the implications, if any?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this sentence. Even grammatically.
Reading it again, I even didn't understand what I wrote. :)
I will rephrase it.
Do you mean that the CI also installs qemu, and that we need to add our stuff in the same spot (if so, why?). Is it because the prefix will be used for the configuration files, and passing
/opt/kata
"magically" generates configuration files that point to the right qemu? If so, it really needs to be explained. That's what I understood reading the sentence, but clearly. thesed
below adjusts the configuration to point back to/usr/bin
. so I'm a bit confused.
The outcome of buildall_install.sh
is a full installation of Kata Containers under runtime/_out/build_install
. This is later packaged in a container image along with a install script. By the use of Kubernetes DaemonSets, the content of runtime/_out/build_install
is rsync'ed with the node's filesystem.
That said, some facts:
- By default Kata Containers build to
/usr/
andPREFIX=/usr
- On the cluster node (rhcos):
/usr
is read-only and should not be changed. (see rhcos architecture)/opt
is a symlink to/var/opt
/var
is writable but, for the sake of support, it should be changed only by machineconfig objects.
- Kata CI jobs use statically built QEMU rather than of the distros. The CI script will simply fetch and uncompress the tarball in the test machine.
- Kata provides two flavours of QEMU for CI:
- "experimental" QEMU and virtiofs (see qemu-experimental-nightly-x86_64 job) built for
/opt/kata
prefix. - "standard" QEMU (see the qemu-nightly-x86_64 job) which unlike the experimental QEMU is built for
/usr/
- "experimental" QEMU and virtiofs (see qemu-experimental-nightly-x86_64 job) built for
- Kata CI jobs also use built kernel (guest OS) and rootfs (initrd or ramdisk)
- Alike QEMU those files are fetched from Jenkins and uncompressed in the system. Moreover, they will be uncompressed following the
PREFIX
value.
- Based on the value of
PREFIX
the CI scripts will configure Kata'sconfiguration.toml
so that it contain the right paths to QEMU (virtiofsd, if the case), Kernel and rootfs.
In buildall_install.sh
script PREFIX
is set to /opt/kata
so that Kata is built to /opt/kata
and then I avoid the fact 2
(/usr
being readonly). The kernel and rootfs are properly copied to under /opt/kata
(fact 4
), the "standard" QEMU goes to /usr
(fact 3
) -- here is the problem -- and then the configuration.toml
file gets mis-configured (fact 5
). So I change the configuration.toml
to fix that as you can see below.
You might ask:
Q: How do you install QEMU in /usr
given it is readonly?
A: Dirty hack. Notice that on runtime/.ci/openshift-ci/images/entrypoint.sh
script the /usr
is re-mounted as read-write
Q: QEMU is static, so can't you simply copy the binary to /opt/kata/bin
?
A: It is possible but there is another issue. It is built the firmwares as well, and the QEMU binary still expect them under /usr
. Perhaps that can be fixed by changing to firmware location on configuration.toml
too.
Q: Can't it be solved differently then?
A: Yes, if we have a "standard" QEMU built with /opt/kata
prefix and cached in Jenkins the problem will likely to be over. Honestly, I plan to work on it after this PR gets merged.
Q: Why don't you use QEMU from rhcos?
A: It is trick and brings in a completely different string of issues. We can discuss that in details later but, briefly speaking, rhcos doesn't package QEMU. We could use, say, the CentOS package but on openshift 4.5 it doesn't install smoothly (conflicts with glusterfs....).
All in all, there are plenty of improvements and important decisions to be taken but I am deferring them to after I get this merged. Because I think what I propose here works pretty much well as the v0 to get Kata on-boarded on Openshift CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kata provides two flavours of QEMU for CI:
- "experimental" QEMU and virtiofs (see qemu-experimental-nightly-x86_64 job) built for
/opt/kata
prefix.- "standard" QEMU (see the qemu-nightly-x86_64 job) which unlike the experimental QEMU is built for
/usr/
That's good news. I thought it was only testing the experimental one.
In
buildall_install.sh
scriptPREFIX
is set to/opt/kata
so that Kata is built to/opt/kata
and then I avoid thefact 2
(/usr
being readonly). The kernel and rootfs are properly copied to under/opt/kata
(fact 4
), the "standard" QEMU goes to/usr
(fact 3
) -- here is the problem -- and then theconfiguration.toml
file gets mis-configured (fact 5
). So I change theconfiguration.toml
to fix that as you can see below.
A meta Q about that. If RHCOS is supposed to be a read-only system, why does it let you install stuff in /opt at all? 😃
Q: How do you install QEMU in
/usr
given it is readonly?
A: Dirty hack. Notice that onruntime/.ci/openshift-ci/images/entrypoint.sh
script the/usr
is re-mounted as read-write
That part I was aware of, but it's definitely useful to mention it in the comments (it's not obvious at all IMO to someone not familiar with RHCOS)
Q: QEMU is static, so can't you simply copy the binary to
/opt/kata/bin
?
A: It is possible but there is another issue. It is built the firmwares as well, and the QEMU binary still expect them under/usr
. Perhaps that can be fixed by changing to firmware location onconfiguration.toml
too.
A longer term solution might be to mount some subset of the container root fs into the target location (to avoid copies while preserving the right paths)
Q: Can't it be solved differently then?
A: Yes, if we have a "standard" QEMU built with/opt/kata
prefix and cached in Jenkins the problem will likely to be over. Honestly, I plan to work on it after this PR gets merged.
Sounds good.
Q: Why don't you use QEMU from rhcos?
A: It is trick and brings in a completely different string of issues. We can discuss that in details later but, briefly speaking, rhcos doesn't package QEMU. We could use, say, the CentOS package but on openshift 4.5 it doesn't install smoothly (conflicts with glusterfs....).
That's another Q I did not have personally, but that would be a valuable addition to the commit message.
All in all, there are plenty of improvements and important decisions to be taken but I am deferring them to after I get this merged. Because I think what I propose here works pretty much well as the v0 to get Kata on-boarded on Openshift CI.
I think that your Q / A is an excellent addition to the commit msg. Thanks a bunch.
fi | ||
done | ||
|
||
# The standard QEMU cached in Jenkins is built with /usr/bin prefix, thus it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does that qemu come from? Is that the one built with kata, which we are trying to get rid of?
Shouldn't we try to go with a yum install qemu
and get the dependencies that way instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered those questions on previous reply.
The OpenShift CI allows for components to be built and tested in clusters provided by the OpenShift community. This adds the minimum required on kata's side to get it onboarded in that CI system. More precisely, it introduces the buildall_install.sh script which is used to build and install kata-containers into a container image. Fixes kata-containers#2527 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
df028f0
to
26f9262
Compare
v4 -> v5:
|
@jodh-intel , @c3d , what else should I do to get those changes merged? |
@wainersm - thanks for updating. There are a number of unanswered questions on this PR though. I think they should all be relatively quick to resolve but please could you respond to them? |
This adds the implementation of a mini-operator which is used to install kata into the OpenShift test cluster. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
This adds a "Hello World" suite which provides a smoke test for kata and its installation in the test cluster. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
Add the README.md file. Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
26f9262
to
970a589
Compare
v5 -> v6:
|
Hi @jodh-intel , I scan through the question on this PR, and either I answered or change the code properly. Should I "resolve conversation" for every and each conversation? Sorry, I am not used to the review process for Kata... so I might be missing some steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @wainersm.
lgtm
/test |
Thanks @jodh-intel ! I appreciated your careful review! |
@devimc Hi Julio, since you reviewed the required changes on runtime side (kata-containers/runtime#2891) ...mind to review this one as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - thanks @wainersm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've spoken with Wainer Yesterday and this part is the first thing we need merged.
Once it's merged, we can work on the OpenShift side and then it'll be good to actually start running it.
@wainersm, thanks a ton for the immense work done on this!
@GabyCT, @chavafg, I'd really feel more comfortable if one you would be merging this PR!
Thanks again for this @wainersm! |
Enable openshift-ci on runtime repository. Fixes: github.com/kata-containers/tests#2527 Depends-on: github.com/kata-containers/tests#2802 Signed-off-by: Wainer dos Santos Moschetta <[email protected]>
This is an initial implementation on kata-containers's side of the requirements to get it on-boarded into the OpenShift CI. On the README file added on this PR there are more detailed technical information about the openshift-ci and how the scripts/files proposed on this PR are used. On openshift-ci's side there will be needed to have openshift/release#9009 merged too. Please, refer to that README (and the openshift-ci documentation pointers it provides) for further information.
In the issue #2527 I describe the ultimate goal with these changes which is to run e2e openshift 4 tests with latest kata-containers code. However this initial implementation provides only a smoke test suite that I used for the purpose of testing the CI code itself. So once this initial backbone is merged I will be able to work on the execution of e2e tests properly saying.
On my development cluster the proposed here pipeline installs kata correctly and the smoke tests pass. However, on the openshift-ci environment the pipeline seems to install kata correctly but the smoke test fails. I've debugged it but I don't have a solution so far and the problem seems alike cri-o/cri-o#2984 .
Also there is a known issue (and a colleague is helping me to solve):
build_install.sh
basically exports the variables to produce the desired kata installation (underruntime/_out/build_install
) then calls thetests/.ci/Install_kata.sh
script. It tries to fetch pre-built artifacts from Jenkins and when the cache fails it attempts to build them locally. The problem is that in openshift-ci build environment it is not allowed to evoke docker because it is already in a containerized context (i.e. no nested containers allowed).Regardless the limitations and known issues I explained before, I would like to have a general review on this PR because I've spent too much time on the code and likely there are stuffs that I missed (and I don't want to spend much more time fixing it :)