-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Breaking] "scope" member now defaults to parent of "start_url" #647
Conversation
The default URL is the start URL with its filename, query and fragment removed. This is consistent with Chrome's implementation and several sites rely on this. It seems to be the most logical behaviour. This means it is not possible to get an "unbounded" scope. Any reference to unbounded scope in the document is now obsolete.
Anybody know why pr-preview hasn't kicked in yet? (There is no 647 page on pr-preview.s3.amazonaws.com.) I was hoping to be able to share this around with a link to the nicely formatted Diff page. |
That's odd... no idea why it failed. Maybe send again? |
index.html
Outdated
<div class="note" data-link-for="WebAppManifest"> | ||
<p> | ||
If the <a>scope</a> member is not present in the manifest, it | ||
defaults to the directory containing the <a>start_url</a> member. For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that there is not such thing as a "directory" in URLs (I know you mean path segments here)... it's still somewhat problematic, because the path segments are kinda meaningless or used to mean different things (unlike file system directories).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, I was trying to avoid using the word "directory" but I just ran with it. Only in non-normative text. I can change it but do you have any suggested wording?
it's still somewhat problematic
What do you mean? The wording is problematic, or the concept? I think the concept of cutting off the last part of the path is very well defined on the web (it's the standard URL resolution algorithm). The reason resolution works this way is because it allows HTML pages to reference a file in the same directory just by name, which is why I described it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change it but do you have any suggested wording?
We should follow terminology from the URL spec (i.e., path segment).
What do you mean? The wording is problematic, or the concept?
Apologies, I accidentally missed this line:
https://github.com/w3c/manifest/pull/647/files#diff-eacf331f0ffc35d4b482f1d15a887d3bR1742
Which addresses the issue. There too, we might just point to the URL spec "single-dot path segment".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I didn't want to s/directory/path segment, because a path segment is just one piece, not the whole prefix. (For example, if the path is /x/y/z.html
, the scope is /x/y
, not just y
.)
I've reworded it to refer to path terminology, not directory.
There too, we might just point to the URL spec "single-dot path segment".
Not sure what you mean. Do you think the actual algorithm should say "single-dot path segment" instead of "."? I don't think so, because "." is a specific string, while "single-dot path segment" is a pattern match.
Hrr, GitHub doesn't let me create a new PR when one already exists. Also you've started commenting already :) A shame, because the whole reason I waited until today to send the PR is so that the pr-preview Diff would look right. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, with a few nits on terminology, this seems reasonable to me. I'd be happy to change have us change Gecko behaviour to match.
As this is a significant change, we'd naturally want to get an ok from a few other implementers before we merge. @tobie, sorry to bother you, but for some reason we didn't get the preview generated? Any chance you could check what happened? |
@mgiuca, it MAY regenerate the pr-preview once you push another commit (I confirmed it's working, as you probably saw my "test" pull request). You could push an empty commit even, and then we can squash it later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've uploaded a patch set; let's see if we get a diff.
Edit: We did, yay.
index.html
Outdated
<div class="note" data-link-for="WebAppManifest"> | ||
<p> | ||
If the <a>scope</a> member is not present in the manifest, it | ||
defaults to the directory containing the <a>start_url</a> member. For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I didn't want to s/directory/path segment, because a path segment is just one piece, not the whole prefix. (For example, if the path is /x/y/z.html
, the scope is /x/y
, not just y
.)
I've reworded it to refer to path terminology, not directory.
There too, we might just point to the URL spec "single-dot path segment".
Not sure what you mean. Do you think the actual algorithm should say "single-dot path segment" instead of "."? I don't think so, because "." is a specific string, while "single-dot path segment" is a pattern match.
Excellent, thanks for also checking the URL stuff. Let’s get this in front of the various implementers and hopefully get consensus quickly. |
Is the Gecko behaviour in-line with the spec as it stands now? (i.e., you never navigate away from the app if the scope is unspecified). It would be good to make sure Mozilla engineering is happy to make this change before we commit to the spec change. I have also contacted Apple to discuss, since Safari has shipped this in a public beta. |
<a>scope</a> is missing, the navigation scope will be | ||
<code>/pages/</code> on the same origin. If <a>start_url</a> is | ||
<code>/pages/</code> (the trailing slash is important!), the | ||
navigation scope will be <code>/pages/</code>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically something like a base path or so. Isn't this defined somewhere?
If not, why not define it and create an algorithm, basically:
base = (url) => url.substr(0, url.lastIndexOf('/') + 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also what happens with /hi/there/../cool/youcouldmakeit/
- I guess for these cases it would be nice with an algorithm.
function base(path) {
const resolvedPath = new URL(path, document.location.href).pathname;
return resolvedPath.substr(0, resolvedPath.lastIndexOf('/') + 1);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically something like a base path or so. Isn't this defined somewhere?
Yes, in the URL standard. This section is a non-normative explanation with examples, for readers who want to understand practically what the default scope will be, without going into the URL parsing algorithm (which is practically unreadable).
The real (normative) text is in the "scope" member section, where the default is defined as "a new URL using "." as input and start URL as the base URL". That invokes the URL parser which then uses the standard resolution algorithm, the same as if you had <a href=".">
to link to the containing directory of the current page. I did that because I don't think duplicating the resolution algorithm is a good idea, but it's a little arcane, hence the non-normative note.
If you want to explicitly define the logic here, instead of using the existing URL resolution algorithm, you need to:
- Drop the query and fragment.
- Drop everything after the last slash in the path.
The algorithm needs to operate on a URL object (not a string), so the path is represented as a list of segments, implicitly slash-separated. You need to be careful that the path ends with a blank segment, to represent a trailing slash. The algorithm would be this:
- Let default be a new URL.
- Set default’s username to start URL's username, default’s password to start URL's password, default’s host to start URL's host, default’s port to start URL's port, default’s path to a copy of start URL's path.
- Remove default’s path’s last item, if any.
- Append the empty string to default's path.
(This is basically a specialization of the URL Parser with input hard-coded as ".").
But I would rather not duplicate the existing logic.
(Aside: One of my main gripes about the URL standard is that almost the entire thing is a monolithic token-consume parser for the entire URL, rather than being broken into distinct concepts that you can easily point to, like a "resolve" algorithm which is what we need here. Formally, there is no such thing as a "relative URL" and there is no such thing as "resolution"; it's all just built-in functionality of the URL parser, which makes it hard to reference from outside, as we need to do here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if @annevk has some comments for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The language about "new URL" isn't quite correct. You actually need to invoke the URL parser. Otherwise I tend to agree with @mgiuca on the approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @annevk but I was trying to be consistent with the writing in the rest of this document (in fact on the very next line: "Let scope URL be a new URL using value as input and manifest URL as the base URL."). There are six existing instances of this "new URL using X as input" in the Manifest spec.
I'll do a pass over it afterwards and fix up the invocations of URL parsing, but I think it's more urgent to land this to keep implementation divergence from getting out of control.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with this, do you want me to land this now?
</p> | ||
<p> | ||
Developers should take care, if they rely on the default behaviour, | ||
that all of the application's page URLs begin with the parent path of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even better if you could like to the "parent path" algorithm here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed, given the above? Technically, I do, because if you click on "scope" in that sentence, it links to the parent path algorithm defined in scope member.
@mgiuca talk to Andreas Bovens from Mozilla - he is the PR. |
@marcoscaceres Can you ping Andreas on this (or whoever is most appropriate on the product or eng side; i.e., who would be an approver for changing the implementation). |
OK, already approved by Marcos, so let's land this! 💯 |
This removes the concept of "unbounded scope", which was special behavior if "scope" was omitted. Now, if "scope" is omitted, it defaults to the parent directory of "start_url", which is the most sensible default.
This is a breaking normative change. Sites relying on the old behavior may find that the scope is now more restrictive. In particular, if the "start_url" is deep within the path hierarchy, the rest of the site may no longer be considered part of the app. Having said that, a major implementation (Chrome for Android) already had this policy, so this is expected to bring the spec in-line with current expectations, rather than break existing sites.
Closes #550.
Preview | Diff