com-1nl

str-2nl

str-1nl

str-0nl

string-0nl

url-2nl

url-1nl

url-0nl

url-full-0nl

rgb-2nl

rgb-1nl

rgb-0nl

rgb-full-0nl

cnt-2nl

cnt-1nl

cnt-0nl

cnt-full-0nl

Expected results if the below proposals were to be adopted: All of the above text is green. ‘Str1 Str2 ’ appears at the start of the fourth & fifth lines (the lines that have ‘0nl’), but neither Str1 nor Str2 appears on the preceding two lines (the lines that have ‘1nl’ or ‘2nl’). Two green dots appear at the start of each of the url lines. Each of the rgb lines is surrounded by a dashed border of a brighter green than the text of the rgb lines. Each of the cnt lines is preceded by ‘00’.


BAD_COMMENT vs. unexpected end of style sheet

Result: All browsers tested give precedence to the unexpected-end-of-style-sheet rules over the BAD_COMMENT rules. If we accept this as correct behaviour, then I believe a BAD_COMMENT token can never occur (an unterminated comment necessarily extends to the end of the style sheet), and it should therefore be removed from the tokenizer description so as not to conflict with the “unexpected end of style sheet” provisions.
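To illustrate why BAD_COMMENT appears unreachable under that reading, here is a minimal Python sketch. The helper name `scan_comment` and its return values are made up for this illustration; it is not the tokenizer of any tested UA, nor the spec's grammar. The point is only that the one way a comment can lack its closing ‘*/’ is by running to the end of the style sheet, which is exactly the situation the end-of-style-sheet rules already cover.

```python
from typing import Tuple


def scan_comment(css: str, start: int = 0) -> Tuple[str, int]:
    """Scan a comment that opens at css[start:start + 2] == '/*'.

    Returns a (kind, end_index) pair. Hypothetical helper for illustration.
    """
    if not css.startswith("/*", start):
        raise ValueError("no comment opens at this position")
    # Search from start + 2 so that the '*' of the opening '/*' cannot be
    # mistaken for the start of a closing '*/' (e.g. in the input '/*/').
    end = css.find("*/", start + 2)
    if end == -1:
        # No closing '*/' anywhere: the comment necessarily extends to the
        # end of the style sheet, whatever newlines it contains.
        return ("unterminated-at-EOF", len(css))
    return ("COMMENT", end + 2)
```

For an unterminated comment the returned end index always equals the length of the input, so no character is left over after it that could distinguish a BAD_COMMENT from an implicitly closed COMMENT.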

BAD_STRING vs. unexpected end of style sheet

All browsers tested (on Unix) agree that a trailing newline determines whether the BAD_STRING or the unexpected-end-of-style-sheet rules apply, but they differ in how they handle the unexpected-end-of-stylesheet case: Gecko implicitly terminates the unterminated string, whereas the Konqueror family discards the unterminated string token (partial token).

The “unexpected end of style sheet” part of §4.2 seems fairly clear that “close all open constructs” includes closing partial tokens as well as closing grammatical constructs, and I haven't seen any reading under which the Konqueror family's behaviour can be considered conforming.

Proposed text change: Require Gecko's behaviour: Change the BAD_STRING token definition such that it's required to be followed by a character (as distinct from end-of-style-sheet); and change the example in the "unexpected end of stylesheet" text to clarify that the correct behaviour depends on whether any \r, \n or \f character comes between the ‘Hello’ and the end of the stylesheet, because such a character would make it a BAD_STRING token instead of merely an unfinished STRING.
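The proposed rule can be sketched in Python as follows. This is only an illustration under stated assumptions: `classify_string` is a hypothetical name, backslash escapes are not modelled at all, and the input is assumed to begin with the opening quote.

```python
from typing import Tuple

NEWLINES = "\n\r\f"  # the characters that end a BAD_STRING


def classify_string(css: str) -> Tuple[str, str]:
    """Classify the string token starting at css[0] (a quote character).

    Assumption: escapes are ignored, so a backslash is an ordinary character.
    """
    quote = css[0]
    if quote not in "\"'":
        raise ValueError("input must start with a quote character")
    for i in range(1, len(css)):
        ch = css[i]
        if ch == quote:
            # Properly terminated string.
            return ("STRING", css[: i + 1])
        if ch in NEWLINES:
            # A newline character follows the partial string, so under the
            # proposal this is a BAD_STRING (ending before the newline).
            return ("BAD_STRING", css[:i])
    # End of style sheet reached with the string still open: close it
    # implicitly, as Gecko does.
    return ("STRING", css + quote)
```

With this sketch, `classify_string('"Hello')` yields an implicitly closed STRING, while `classify_string('"Hello\n')` yields a BAD_STRING, which is the distinction the amended example would need to spell out.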

BAD_URI vs. unexpected end of style sheet

Given ‘url(foo’ followed by EOF, Gecko implicitly closes the URI token rather than treating it as a BAD_URI token, much as it does for strings. If a newline is inserted before the EOF, however, Gecko considers it a BAD_URI token, even though a URI token is allowed to have newlines before the closing paren. I can see the spec being read in favour of either the BAD_URI or the end-of-stylesheet rules, but behaving differently depending on whether a newline is present seems like a bug.

Konqueror, WebKit and Chromium behave as if the final (partial?) token weren't present at all, rather than as if there were either a URI or a BAD_URI token. Given that the tokenizer in the spec does now have a BAD_URI token, this doesn't seem consistent with the spec; though it may well be just that I'm not testing recent enough versions (as BAD_URI was added fairly recently).

Proposed text change: If end-of-stylesheet processing should take precedence over BAD_STRING (per the above proposal), then presumably it should similarly take precedence over BAD_URI. Thus, I would suggest changing the BAD_URI token definition such that it's required to be followed by a character (as distinct from end-of-style-sheet). However, note that this would mean that none of the tested UAs would pass all the tests. I don't consider interoperable handling of premature end-of-stylesheet to be particularly important, so I wouldn't object to CSS2.1 giving some flexibility in how premature end-of-stylesheet is handled, perhaps with a comment that a future level of CSS may tighten this.

Unexpected end of style sheet and closing FUNCTION ... ')' constructs

(As far as the spec is concerned, I don't see any reason for counter() and rgb() to be treated any differently from each other. I tested both only because I know that some UAs treat them quite differently internally; and indeed they do produce different results in the Konqueror family.)

Gecko interprets "all open constructs" to include FUNCTION ... ')', whether FUNCTION is "rgb(" or "counter(". The number of newlines immediately prior to end-of-stylesheet makes no difference here, as one would expect from the spec.

For the rgb case, Konqueror discards the unterminated rgb() construct without considering this to yield an "invalid value" for purposes of §4.2. Given that an rgb() construct consists of several tokens, I don't see any basis for this behaviour in the spec. For the counter() case, by contrast, Konqueror does consider it an invalid value. Konqueror's behaviour for the counter() tests could be considered defensible if one accepts the interpretation that "open construct" might not include the case of a FUNCTION token with no following ')' token.

In WebKit and Chromium, the unterminated rgb() construct results in discarding the whole @media rule for some reason. Conceivably this is a result of interpreting the phrase "invalid statements" in the second sentence of §7.2.1 to mean "invalid @media statements" rather than "invalid statements within an @media statement". Is it worth inserting ‘within an @media statement’ or similar?

However, for some reason things behave differently in the case of an unterminated counter() construct when there's no newline immediately before the end-of-stylesheet. I don't see any basis in the text for a newline here making a difference to behaviour.

Proposed text change: Add the FUNCTION case to the list of example "open constructs", which I believe would match most people's interpretation of the intended behaviour. This would match Gecko's behaviour.
I tend to think that no textual change is needed to be sufficiently clear that the whole @media statement shouldn't be discarded, though it would be good to hear other people's opinion, particularly from WebKit people.
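To make the “close all open constructs” interpretation concrete (with FUNCTION ... ')' counted as an open construct, as in Gecko), here is a rough Python sketch. The name `closing_suffix` is hypothetical, and the sketch makes simplifying assumptions: comments and backslash escapes are not modelled, and an unmatched closing character is simply ignored.

```python
def closing_suffix(css: str) -> str:
    """Return the characters a UA would implicitly append at end of style
    sheet to close every construct still open.

    Assumptions: no comment or escape handling; stray closers are ignored.
    """
    closers = {"(": ")", "[": "]", "{": "}"}
    stack = []    # closers still owed, innermost last
    quote = None  # quote character of the currently open string, if any
    for ch in css:
        if quote is not None:
            # Inside a string: it ends at the matching quote, or at a
            # newline character (the BAD_STRING case).
            if ch == quote or ch in "\n\r\f":
                quote = None
            continue
        if ch in "\"'":
            quote = ch
        elif ch in closers:
            # '(' covers FUNCTION tokens such as 'rgb(' and 'counter('.
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
    # Close an open string first, then the enclosing constructs from the
    # innermost outwards.
    return (quote or "") + "".join(reversed(stack))
```

For the input `@media screen { p:before { content: 'Hello` this returns `'}}`, and for `p { color: rgb(0, 128, 0` it returns `)}`; in both cases trailing newlines in the input do not change the result.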