[Breaking] "scope" member now defaults to parent of "start_url" #647

mgiuca · 2018-01-31T00:02:47Z

This removes the concept of "unbounded scope", which was special behavior if "scope" was omitted. Now, if "scope" is omitted, it defaults to the parent directory of "start_url", which is the most sensible default.

This is a breaking normative change. Sites relying on the old behavior may find that the scope is now more restrictive. In particular, if the "start_url" is deep within the path hierarchy, the rest of the site may no longer be considered part of the app. Having said that, a major implementation (Chrome for Android) already had this policy, so this is expected to bring the spec in-line with current expectations, rather than break existing sites.

Closes #550.

Preview | Diff

The default URL is the start URL with its filename, query and fragment removed. This is consistent with Chrome's implementation and several sites rely on this. It seems to be the most logical behaviour. This means it is not possible to get an "unbounded" scope. Any reference to unbounded scope in the document is now obsolete.

mgiuca · 2018-01-31T02:42:06Z

Anybody know why pr-preview hasn't kicked in yet? (There is no 647 page on pr-preview.s3.amazonaws.com.) I was hoping to be able to share this around with a link to the nicely formatted Diff page.

marcoscaceres · 2018-01-31T03:48:04Z

That's odd... no idea why it failed. Maybe send again?

marcoscaceres · 2018-01-31T03:51:51Z

index.html

+      <div class="note" data-link-for="WebAppManifest">
+        <p>
+          If the <a>scope</a> member is not present in the manifest, it
+          defaults to the directory containing the <a>start_url</a> member. For


The problem is that there is not such thing as a "directory" in URLs (I know you mean path segments here)... it's still somewhat problematic, because the path segments are kinda meaningless or used to mean different things (unlike file system directories).

I know, I was trying to avoid using the word "directory" but I just ran with it. Only in non-normative text. I can change it but do you have any suggested wording?

it's still somewhat problematic

What do you mean? The wording is problematic, or the concept? I think the concept of cutting off the last part of the path is very well defined on the web (it's the standard URL resolution algorithm). The reason resolution works this way is because it allows HTML pages to reference a file in the same directory just by name, which is why I described it this way.

I can change it but do you have any suggested wording?

We should follow terminology from the URL spec (i.e., path segment).

What do you mean? The wording is problematic, or the concept?

Apologies, I accidentally missed this line:
https://github.com/w3c/manifest/pull/647/files#diff-eacf331f0ffc35d4b482f1d15a887d3bR1742

Which addresses the issue. There too, we might just point to the URL spec "single-dot path segment".

OK. I didn't want to s/directory/path segment, because a path segment is just one piece, not the whole prefix. (For example, if the path is /x/y/z.html, the scope is /x/y, not just y.)

I've reworded it to refer to path terminology, not directory.

There too, we might just point to the URL spec "single-dot path segment".

Not sure what you mean. Do you think the actual algorithm should say "single-dot path segment" instead of "."? I don't think so, because "." is a specific string, while "single-dot path segment" is a pattern match.

mgiuca · 2018-01-31T03:55:25Z

That's odd... no idea why it failed. Maybe send again?

Hrr, GitHub doesn't let me create a new PR when one already exists. Also you've started commenting already :)

A shame, because the whole reason I waited until today to send the PR is so that the pr-preview Diff would look right.

marcoscaceres

Overall, with a few nits on terminology, this seems reasonable to me. I'd be happy to change have us change Gecko behaviour to match.

marcoscaceres · 2018-01-31T04:11:09Z

As this is a significant change, we'd naturally want to get an ok from a few other implementers before we merge.

@tobie, sorry to bother you, but for some reason we didn't get the preview generated? Any chance you could check what happened?

marcoscaceres · 2018-01-31T04:21:05Z

@mgiuca, it MAY regenerate the pr-preview once you push another commit (I confirmed it's working, as you probably saw my "test" pull request). You could push an empty commit even, and then we can squash it later.

mgiuca

I've uploaded a patch set; let's see if we get a diff.

Edit: We did, yay.

mgiuca · 2018-01-31T05:05:09Z

index.html

+      <div class="note" data-link-for="WebAppManifest">
+        <p>
+          If the <a>scope</a> member is not present in the manifest, it
+          defaults to the directory containing the <a>start_url</a> member. For


OK. I didn't want to s/directory/path segment, because a path segment is just one piece, not the whole prefix. (For example, if the path is /x/y/z.html, the scope is /x/y, not just y.)

I've reworded it to refer to path terminology, not directory.

There too, we might just point to the URL spec "single-dot path segment".

Not sure what you mean. Do you think the actual algorithm should say "single-dot path segment" instead of "."? I don't think so, because "." is a specific string, while "single-dot path segment" is a pattern match.

marcoscaceres · 2018-01-31T06:12:40Z

Excellent, thanks for also checking the URL stuff. Let’s get this in front of the various implementers and hopefully get consensus quickly.

mgiuca · 2018-01-31T08:07:07Z

Overall, with a few nits on terminology, this seems reasonable to me. I'd be happy to change have us change Gecko behaviour to match.

Is the Gecko behaviour in-line with the spec as it stands now? (i.e., you never navigate away from the app if the scope is unspecified). It would be good to make sure Mozilla engineering is happy to make this change before we commit to the spec change.

I have also contacted Apple to discuss, since Safari has shipped this in a public beta.

kenchris · 2018-01-31T10:30:04Z

index.html

+          <a>scope</a> is missing, the navigation scope will be
+          <code>/pages/</code> on the same origin. If <a>start_url</a> is
+          <code>/pages/</code> (the trailing slash is important!), the
+          navigation scope will be <code>/pages/</code>.


So basically something like a base path or so. Isn't this defined somewhere?

If not, why not define it and create an algorithm, basically:

base = (url) => url.substr(0, url.lastIndexOf('/') + 1)

Also what happens with /hi/there/../cool/youcouldmakeit/ - I guess for these cases it would be nice with an algorithm.

function base(path) { const resolvedPath = new URL(path, document.location.href).pathname; return resolvedPath.substr(0, resolvedPath.lastIndexOf('/') + 1); }

So basically something like a base path or so. Isn't this defined somewhere?

Yes, in the URL standard. This section is a non-normative explanation with examples, for readers who want to understand practically what the default scope will be, without going into the URL parsing algorithm (which is practically unreadable).

The real (normative) text is in the "scope" member section, where the default is defined as "a new URL using "." as input and start URL as the base URL". That invokes the URL parser which then uses the standard resolution algorithm, the same as if you had <a href="."> to link to the containing directory of the current page. I did that because I don't think duplicating the resolution algorithm is a good idea, but it's a little arcane, hence the non-normative note.

If you want to explicitly define the logic here, instead of using the existing URL resolution algorithm, you need to:

Drop the query and fragment.

Drop everything after the last slash in the path.

The algorithm needs to operate on a URL object (not a string), so the path is represented as a list of segments, implicitly slash-separated. You need to be careful that the path ends with a blank segment, to represent a trailing slash. The algorithm would be this:

Let default be a new URL.

Set default’s username to start URL's username, default’s password to start URL's password, default’s host to start URL's host, default’s port to start URL's port, default’s path to a copy of start URL's path.

Remove default’s path’s last item, if any.

Append the empty string to default's path.

(This is basically a specialization of the URL Parser with input hard-coded as ".").

But I would rather not duplicate the existing logic.

(Aside: One of my main gripes about the URL standard is that almost the entire thing is a monolithic token-consume parser for the entire URL, rather than being broken into distinct concepts that you can easily point to, like a "resolve" algorithm which is what we need here. Formally, there is no such thing as a "relative URL" and there is no such thing as "resolution"; it's all just built-in functionality of the URL parser, which makes it hard to reference from outside, as we need to do here.)

I wonder if @annevk has some comments for this?

The language about "new URL" isn't quite correct. You actually need to invoke the URL parser. Otherwise I tend to agree with @mgiuca on the approach.

I agree with @annevk but I was trying to be consistent with the writing in the rest of this document (in fact on the very next line: "Let scope URL be a new URL using value as input and manifest URL as the base URL."). There are six existing instances of this "new URL using X as input" in the Manifest spec.

I'll do a pass over it afterwards and fix up the invocations of URL parsing, but I think it's more urgent to land this to keep implementation divergence from getting out of control.

I am fine with this, do you want me to land this now?

kenchris · 2018-01-31T10:43:13Z

index.html

+        </p>
+        <p>
+          Developers should take care, if they rely on the default behaviour,
+          that all of the application's page URLs begin with the parent path of


Even better if you could like to the "parent path" algorithm here

Is this still needed, given the above? Technically, I do, because if you click on "scope" in that sentence, it links to the parent path algorithm defined in scope member.

kenchris · 2018-01-31T10:45:10Z

@mgiuca talk to Andreas Bovens from Mozilla - he is the PR.

mgiuca · 2018-01-31T23:36:41Z

@marcoscaceres Can you ping Andreas on this (or whoever is most appropriate on the product or eng side; i.e., who would be an approver for changing the implementation).

kenchris · 2018-02-02T10:20:33Z

OK, already approved by Marcos, so let's land this! 💯

mgiuca added 3 commits January 31, 2018 10:51

Remove all references to unbounded scope.

aa6db2c

Rewrite note explaining the default behaviour of scope.

e15babe

mgiuca requested review from marcoscaceres and kenchris January 31, 2018 00:03

mgiuca mentioned this pull request Jan 31, 2018

pr-preview bot doesn't seem to be working tobie/pr-preview#15

Closed

marcoscaceres reviewed Jan 31, 2018

View reviewed changes

marcoscaceres approved these changes Jan 31, 2018

View reviewed changes

Avoid the term 'directory' when describing a path.

a665904

mgiuca commented Jan 31, 2018

View reviewed changes

kenchris reviewed Jan 31, 2018

View reviewed changes

kenchris merged commit 352396e into w3c:gh-pages Feb 2, 2018

mgiuca mentioned this pull request Mar 9, 2018

Clarify rule of navigating away from the navigation scope #646

Closed

mgiuca deleted the default-scope branch June 13, 2018 05:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Breaking] "scope" member now defaults to parent of "start_url" #647

[Breaking] "scope" member now defaults to parent of "start_url" #647

mgiuca commented Jan 31, 2018 •

edited by pr-preview bot

Loading

mgiuca commented Jan 31, 2018

marcoscaceres commented Jan 31, 2018

marcoscaceres Jan 31, 2018

mgiuca Jan 31, 2018

marcoscaceres Jan 31, 2018

mgiuca Jan 31, 2018

mgiuca commented Jan 31, 2018

marcoscaceres left a comment

marcoscaceres commented Jan 31, 2018

marcoscaceres commented Jan 31, 2018

mgiuca left a comment •

edited

Loading

mgiuca Jan 31, 2018

marcoscaceres commented Jan 31, 2018

mgiuca commented Jan 31, 2018

kenchris Jan 31, 2018

kenchris Jan 31, 2018 •

edited

Loading

mgiuca Jan 31, 2018

kenchris Feb 1, 2018

annevk Feb 1, 2018

mgiuca Feb 2, 2018

kenchris Feb 2, 2018

kenchris Jan 31, 2018

mgiuca Jan 31, 2018

kenchris commented Jan 31, 2018

mgiuca commented Jan 31, 2018

kenchris commented Feb 2, 2018

[Breaking] "scope" member now defaults to parent of "start_url" #647

[Breaking] "scope" member now defaults to parent of "start_url" #647

Conversation

mgiuca commented Jan 31, 2018 • edited by pr-preview bot Loading

mgiuca commented Jan 31, 2018

marcoscaceres commented Jan 31, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mgiuca commented Jan 31, 2018

marcoscaceres left a comment

Choose a reason for hiding this comment

marcoscaceres commented Jan 31, 2018

marcoscaceres commented Jan 31, 2018

mgiuca left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marcoscaceres commented Jan 31, 2018

mgiuca commented Jan 31, 2018

Choose a reason for hiding this comment

kenchris Jan 31, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kenchris commented Jan 31, 2018

mgiuca commented Jan 31, 2018

kenchris commented Feb 2, 2018

mgiuca commented Jan 31, 2018 •

edited by pr-preview bot

Loading

mgiuca left a comment •

edited

Loading

kenchris Jan 31, 2018 •

edited

Loading