Some people have construed this as an attack on IE7. It is absolutely not. I’m trying to be helpful. Microsoft has always taken great care not to break things for their customers when releasing new browser versions. I’m just trying to point out an issue I think they may have missed. The title summarises the issue.

The IE7 blog just announced Microsoft’s intention to change the way browser preferences for Accept-Language are set up by default. Basically, your preferences will no longer, by default, be set to fr if you’re French, but to fr-FR instead, i.e. your locale as determined by Windows settings.

I think this is going to cause major problems with content negotiation on the Web.

To give a practical example:
Set your language settings to just es-MX and/or es-ES and point your browser to this article on the W3C site (an article explaining how to set language preferences).

You’ll get back the English version, even though there’s a Spanish version there. Someone with es set in IE6, Opera or Firefox will see the Spanish version automatically – even if their preferences are es-MX then es.

This is down to the way language negotiation is done on the Apache server.

In the article linked to above we explain that “Some of the server-side language selection mechanisms require an exact match to the Accept-Language header. If a document on the server is tagged as fr (French) then a request for a document matching fr-CH (Swiss French) will fail. To ensure success you should configure your browser to request both fr-CH and fr.”
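To make the failure concrete, here is a minimal Python sketch of a server that insists on an exact match. It is not Apache's code; the function names and the AVAILABLE set of document languages are made up for illustration.

```python
# Hypothetical set of language variants that exist for one document.
AVAILABLE = {"en", "es", "fr"}


def parse_accept_language(header):
    """Return the language ranges in an Accept-Language header, best first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a range without a q parameter has quality 1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unreadable q value as "not acceptable"
        ranges.append((tag.strip().lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order.
    ordered = sorted(ranges, key=lambda r: -r[1])
    return [tag for tag, q in ordered if q > 0]


def pick_exact(header, available=AVAILABLE, default="en"):
    """Strict matching: a variant wins only if its tag equals a requested range."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
    return default


print(pick_exact("es-MX, es-ES"))     # en
print(pick_exact("es-MX, es;q=0.5"))  # es
```

With only es-MX and es-ES in the header, nothing equals the es tag on the document, so the sketch falls through to the English default. Adding a bare es is enough to get the Spanish version.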

This is from the Apache 2 documentation:

The server will also attempt to match language-subsets when no other match can be found. For example, if a client requests documents with the language en-GB for British English, the server is not normally allowed by the HTTP/1.1 standard to match that against a document that is marked as simply en. (Note that it is almost surely a configuration error to include en-GB and not en in the Accept-Language header, since it is very unlikely that a reader understands British English, but doesn’t understand English in general. Unfortunately, many current clients have default configurations that resemble this.)

Apache 2 introduces “some exceptions … to the negotiation algorithm to allow graceful fallback when language negotiation fails to find a match”, but those using Apache 1 don’t have that luxury.
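The kind of graceful fallback described there can be pictured roughly as follows. This is only a loose Python illustration of the idea of retrying with the language subset, not Apache 2's actual algorithm; pick_with_fallback and its arguments are names I have invented for the example.

```python
def pick_with_fallback(requested, available, default="en"):
    """Try exact matches first, then retry with subtags removed from the right."""
    # First pass: exact matches, in the user's order of preference.
    for tag in requested:
        if tag.lower() in available:
            return tag.lower()
    # Second pass: shorten each requested tag ("en-GB" becomes "en") and retry.
    for tag in requested:
        parts = tag.lower().split("-")
        while len(parts) > 1:
            parts.pop()
            candidate = "-".join(parts)
            if candidate in available:
                return candidate
    return default


print(pick_with_fallback(["es-MX", "es-ES"], {"en", "es"}))  # es
print(pick_with_fallback(["de-AT"], {"en", "es"}))           # en
```

A server that behaves like this rescues the es-MX user, but a server without such a fallback never gets as far as the second pass.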

Apart from the fact that most users wouldn’t even know that they can set their browser preferences differently, not to mention knowing how to do that, IE7 CR1 doesn’t even provide a preset selection for es rather than es-ES – you have to enter it manually. Not likely to happen much.

It seems to me that a simple fix for this would be for IE7 to set the user’s default preferences to *also* include the bare language code (i.e. es-ES, es for Spain; fr-FR, fr for France; etc.). Then, when a file such as qa-lang-priorities.fr-fr.html is not found, the server will find qa-lang-priorities.fr.html and return a French file. Those people who want to know where the user’s browser is (likely to be) physically located can still use the fr-FR information to get the locale.
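As a rough sketch of what that default could look like, the Python function below builds an Accept-Language value from a locale name. The function name is hypothetical, and the q=0.5 weight on the bare language is my own assumption; any value below 1 would express the same order of preference.

```python
def default_accept_language(locale):
    """Build a header value listing the locale first and then its bare language."""
    tag = locale.replace("_", "-")  # accept "es_ES" as well as "es-ES"
    language = tag.split("-")[0].lower()
    if language == tag.lower():
        # The locale is already a bare language code, so there is nothing to add.
        return language
    # q=0.5 is an assumed weight, chosen only to rank the bare code second.
    return f"{tag},{language};q=0.5"


print(default_accept_language("es-ES"))  # es-ES,es;q=0.5
print(default_accept_language("fr-FR"))  # fr-FR,fr;q=0.5
```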

I think that the result of ignoring this is that many people will be confused about why they no longer see a page in Spanish, when they did before, and a lot of hard work by content developers will go unnoticed on the Web. In short, I think Microsoft is about to introduce a serious bug into IE7.

Note, in passing, that the rules for specifying the lang attribute in HTML and xml:lang in XHTML are described by BCP 47. The latest syntax and matching specifications are RFC 4646 and RFC 4647, which obsolete RFC 3066 and RFC 1766, and which tell you to go to the IANA Language Subtag Registry at http://www.iana.org/assignments/language-subtag-registry to find out what language codes to use, rather than the ISO code lists. For more information, see http://www.w3.org/International/articles/language-tags/

Btw, I tried posting this as a comment on the IE7 blog page, but it didn’t work (site busy) so I did it here.