-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create image.md #293
Create image.md #293
Conversation
|
||
## Notational Conventions | ||
|
||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this and other documents of the specification are to be interpreted as described in [RFC2119](http://tools.ietf.org/html/rfc2119). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the phrasing from #260 already covers this, so we can remove this section here.
Feedback on tactics: I'm not sure why we have all the explicit ordering requirements Without a reproducible way to go from a filesystem to a hash, I don't Feedback on strategy: This sounds like a fancy way to construct and distribute a root
|
The suggested changes imply that the runtime would have to construct a rootfs that could be used in the container (what we pivot the root to) either by using a union filesystem or just copying the files to a new root. Should we clarify that further? |
On Tue, Jan 05, 2016 at 03:22:55PM -0800, Mrunal Patel wrote:
In the current commit (e021a7f), unpacking and image to a filesystem |
I think large design things like this should be discussed on the mailing lists and with Google Docs. GitHub is a very difficult forum for this. Also, here is a proposal I worked up that tries to break an image format into a number of different layers with hashes at each of those layers: https://docs.google.com/document/d/1JPQqH9m9fwZIVsvpB5FSyPJjH0RZLnFkI2J2gohfPu0/edit My concern here is that the image format is too monolithic and includes the host specific config.json file by default. |
I agree that we should discuss this on the mailing list first. |
|
||
An image's `ID` MUST be a string of the form `sha512-<hash>`. | ||
|
||
QUESTION: Do we really want to mandate sha512? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a solid default. For a majority of x86{,_64} chips, it is faster than sha256. Though for architectures like ARM, this could be a hit.
Here is my laptop. Both are speedy, but sha512 is faster:
$ dd if=/dev/urandom of=rand.img bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB) copied, 9.47004 s, 7.1 MB/s
$ time sha256sum rand.img
2d4e6875974352af656630e8ed76ee036847888d69c507f73809d672c58f439d rand.img
real 0m1.068s
user 0m1.026s
sys 0m0.041s
$ time sha512sum rand.img
ba914581b051a017d9f7fc3169bd6036bfa24666eb692d891107ce237d60a091919f0cbce0408221c452bd0217ee1388a72ca33a2b65e6bb791714217f85a4d3 rand.img
real 0m0.692s
user 0m0.654s
sys 0m0.036s
Here is a raspberry pi. The sha512 is an order of magnitude slower:
$ dd if=/dev/urandom of=rand.img bs=1M count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB) copied, 54.8794 s, 1.2 MB/s
$ time sha256sum rand.img
d286218629fccb13c9c1568b6ddb9f2fc5452299a4ee307e3cff353836d8901a rand.img
real 0m5.552s
user 0m4.850s
sys 0m0.440s
$ time sha512sum rand.img
8dcaddd1337cc9dcef273b4a53243563a619747214256a7c6285e6e7a94c671d06121ba08c87ae8c0a33bde4eb142d80413782913f79d9fd2686799174be1061 rand.img
real 1m38.730s
user 1m33.010s
sys 0m1.120s
But like the pattern of <algorithm>-<digest>
could allow folks to choose alternates that suite their needs better, or once there is the next great algorithm to be switched to.
923813e
to
c501cdb
Compare
a842fc9
to
16e4366
Compare
|
||
1. Layers. | ||
Each embedded layer MUST have a corresponding image (tar) file, at the root of the image. | ||
Each layer's tar file MUST be named with its ID followed by the appropriate file-extension based on how the tar file is written (e.g. `.tar` for uncompressed). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File extensions don't matter, we should just use magic numbers like ACI does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't focusing on the extension for any purpose, other than to say it needs to have one after the ID part of the name.
Not sure what your "why?" was referring to, but if its the ID part of the filename then I did that because we needed a way to go from the "layers" to the embedded tar file in a consistent fashion. We can't use the "name" of the image because it might have characters that are not permitted on all filesystems - like a "/" - so having a tar file with that name might not be possible. Using, and requiring an ID, also forces us to ensure there's a crypto key for each embedded image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you guys are reading too much into the extension - all files (well most) have extensions, tar/tgz/etc...
In that sentence I was defining the format of the base filename nothing more - I can remove the comment about extensions - I just added it to make it clear that its still a normal file and whatever extension would normally be there should still be there.. I still expect there to be the appopriate magic number in the file - but that's out of scope of this doc - that's defined by 'tar.
|
||
1. A `manifest.json` file describing the contents of the image. | ||
This file is REQUIRED to be present in the image. | ||
This file extends the [`config.json`](config.md) file with some additional properties: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling on this.
As the config.json
has evolved to essentially be a instance configuration for a container, I don't see how the manifest.json
is particularly an extension. This comes back to the F2F conversations about having templates, or "suggests" or "requests" for resources. But the config.json is really not portable and would come with a list of caveats and exceptions, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to think of this as a separate discussion. Whether the json in an image becomes the exact values we see in config.json or a template, or ranges, or something else is something that we can work on in parallel. To me the key question for us in this PR is whether we're ok with a single json file (manifest.json) that contains: 1) what will be used to construct the config.json, and 2) extra fields to help process the image. Or do we want to split these into multiple files. I don't yet see value in splitting them. Then in a different PR we can twiddle the json to be what we want based on whatever usecases we're looking to support.
Perhaps people could enumerate the usecases for something more verbose than the existing config.json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Mon, Feb 15, 2016 at 06:40:49PM -0800, Doug Davis wrote:
Or do we want to split these into multiple files. I don't yet see
value in splitting them.
The value I see in splitting files is clear separation of concerns. A
manifest file contains only the bits needed for fetching and
validation. A config.json contains only the bits needed by the
‘start’ command. A process.json contains only the bits needed by the
‘exec’ command 1. The state JSON contains only the bits needed to
manipulate (join, signal, tear down) a container. I'm ok allowing
folks to wedge all this into one file 2, but it seems like it would
be easier to write the individual tools and to understand the results
of a particular action if each action had a separate config file.
Unless there's some overlap between the actions, I see no benefit to
configuring multiple actions in the same file. And “start a container
from this OCI-bundle directory” and “create and OCI-bundle directory
from this tarball” don't seem like they have any overlap.
Subject: Re: Labels and extension meta data in containers #108
Date: Wed, 30 Dec 2015 20:54:37 -0800
Message-ID: <[email protected]>
An `ID` MUST be a string of the form `<algorithm>-<hash value>`. | ||
The `algorithm` part of the string MUST be in lower-case. | ||
For example, `sha512` not `SHA512`. | ||
It is RECOMMENDED that implementations use the sha512 algorithm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should perhaps back this with some reasoning and references and also suggest alternatives.
Signed-off-by: Doug Davis <[email protected]>
Signed-off-by: Doug Davis <[email protected]>
Signed-off-by: Doug Davis <[email protected]>
rebased |
This work is moving into the newly formed OCI Image Format. https://github.com/opencontainers/image-spec |
This is far from complete but I wanted to get it out there in time for people to look it over prior to the f2f next week. Its kind of a merging of docker's image format and appc's image format.
Signed-off-by: Doug Davis [email protected]