February 19, 2011
On the WWW, I see a lot of confusion about the appropriate use of ALT texts in HTML. Although the finer points could be argued, I believe the general principles are more or less as I have set them out in this note. At least, I commend them to you, and look forward to reasoned discussion if you disagree.
ALT text is meant to be alternative text, primarily for use when the image is not being displayed. The most common mistake (if used at all!) is to provide a description of the image, without considering what job the image was doing on the page, leading to results that can range from the incongruous to the absurd. The ALT text is intended to be a suitable textual alternative to the purpose of the image: sometimes that might turn out to be a description of the image, but in practice that choice seems to be wrong far more often than it's right.
Some readers prefer to see my collection of howlers first.
The first principle of HTML authoring, it seems to me, is to convey information to the reader about some "topic of discourse". That's a high-falutin' way of saying that one is writing a story, advertising a product, offering scientific results, giving a tutorial on basket-weaving, recipes for cooking wild mushrooms, or whatever the "topic of discourse" happens to be. (I'm not considering the special case of writing about HTML or about the WWW - that just confuses the issue.)
Reference to mechanics of the World Wide Web or a particular browser is generally a distraction. There are differences between browsers and platforms, and trying to tell the user how to use the browser with which you, the author, are familiar, can just confuse users of a different browser/version/platform. I recommend authors, as a general rule, to presume that users already know how to operate their browsers, or can find out (Help files etc.), and are hungry for your information about the "topic of discourse". 
This note is about web pages that have substantial text content, whose images are an adjunct rather than being the whole purpose of the page. In other words, the content is inherently accessible to text-only readers, and it's the author's job to facilitate such readers to get the best that they can out of the document, in spite of the absence of images. I don't try to cover the kind of WWW document whose information is, for whatever reason, primarily dependent on images, such as say a photo gallery, although some of the points made here can certainly be usefully applied there too.
This note is emphatically not asking you to "dumb-down" your documents in order to make them work on text-mode browsers. What it is doing is asking you to give thought to how your documents will come across in a wide range of browsing situations, and to follow an authoring style that will make best use of whatever combination of resources each reader has at their disposal. That is the big difference between the concept of the WWW, and most other ways of publishing information.
Caveat: If you are mandated to produce fully-accessible documents, then obviously your accessibility guide takes precedence. Please also visit the Web Accessibility Initiative pages at W3C.
ALT is no longer the only attribute available for conveying text-mode information. With HTML4 we have three or four attributes available, reducing the need for compromise in the way that each is used: the title attribute on the image itself, the title attribute on any link which encloses the image, and the longdesc link to a page with a long description of the image. MSIE versions, as well as a number of specialised browsers, have offered support for some of these features, and Mozilla/Netscape 6+ is now on-board with them. Advice on how to exploit these features (aimed at disability situations, and broadly helpful in general text-mode browsing too), can be found at the W3C WAI: I encourage authors to phase-in the use of these features in ways that would be helpful to those whose browsers support them, but avoiding a complete reliance on them.
Look on the ALT text as an opportunity to reinforce your message, as discussed in this item in the W3C WAI email archives, which I thoroughly recommend pondering over: it's a win-win option, which does no harm to those who are already getting your point, but rates to help those readers who would be less-effectively reached without it. See also Ian Hickson's ALT text miniFAQ, his idea of a specification for client agent behaviour, and the Images tests in his Eviltests repertoire.
HTML mechanisms for offering images
In HTML, you can offer an inline IMG, or a regular link (anchor) that points to an image. You may decide to use one or other, or both, of these. If you are offering a link to an image, then your options are to include, within the scope of that link anchor, either an IMG or some normal text, or both. So there is a wide choice of combinations, each of which could be appropriate in particular circumstances.
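The combinations just listed might be sketched as follows. The file names are invented for illustration; the wording of the link text and the ALT text is borrowed from the frigate example later in this note.

```html
<!-- frigate.png and frigate-small.png are hypothetical file names -->

<!-- 1. An inline image -->
<IMG SRC="frigate-small.png"
     ALT="Warships at that time usually had two rows of cannon">

<!-- 2. A link to an image, with normal text in the scope of the anchor -->
<A HREF="frigate.png">Frigate, circa 1800, 560kB PNG</A>

<!-- 3. A link to an image, with both an IMG and normal text in the anchor;
     the text already carries the message, so the IMG takes an empty ALT -->
<A HREF="frigate.png"><IMG SRC="frigate-small.png" ALT="">
Frigate, circa 1800, 560kB PNG</A>
```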
The ALT attribute of the IMG is intended to be used as alternative text for situations where the IMG is not being displayed. The idea is expressed well in the HTML4 recommendation (section 13.8): "Several non-textual elements (IMG, AREA, APPLET, and INPUT) require authors to specify alternate text to serve as content when the element cannot be rendered normally."
Note that key phrase: "...to serve as content...". In section 13.2 there is a very sloppy remark quoted from the comments in the DTD, saying that "you need to provide a description with ALT": we will see in this discussion that a description of the image is often a poor choice as the alternative text, precisely because it does not "serve as content" in the particular context. An insightful tip that I learned from WAI discussions was, instead of thinking of the ALT text as a substitute for the image representation, think of the text and the image as alternative representations for content.
The corollary of that is that when someone asks "what's the right alt text for this image?", the appropriate first response is "what job is that image meant to be doing on the page: what content does it represent?". Once the answer to that has been given, then choosing the right alt text should follow naturally. Omitting to ask that question, and just composing an alt text by rote without attention to context, produces absurdities of the kind we see in the howlers.
ALT text itself is not allowed to be marked up by including HTML tags within the text; on the other hand the IMG (and thus its ALT text) is governed by any tags that enclose it, so, for example, if your IMG is a heading, or part of a heading, be sure to enclose it between the <Hn>...</Hn> tags. The use of &#number; mechanisms is legal, and supported by any reasonable browser now. Text-mode browsers will typically flow the actual text string onto two or more displayed lines if needed, so this is not a problem. Graphical browsers, I am afraid, don't do a particularly good job with ALT texts, and for years it seemed to get worse with each edition of the 'popular' browsers, but there isn't much that you can do about that as a document author. Opera and, now, Mozilla-based browsers, commendably reverse that trend.
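As a small sketch of the heading case, assuming a hypothetical logo.gif that stands in for the company name: the ALT text carries no markup of its own, but because the IMG sits inside the heading tags, a text-mode browser can present the whole line as a heading.

```html
<!-- logo.gif and the heading wording are hypothetical -->
<H1><IMG SRC="logo.gif" ALT="Foo Corporation"> Annual Report</H1>
```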
Hint: avoid putting line breaks in your source HTML within your ALT texts, as some browsers display them badly. If necessary, start on a fresh source line before the ALT= attribute, and end the source line after the closing quotation mark, but don't break the source line anywhere between.
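One way to lay out the source along the lines of this hint is sketched below. The file name is hypothetical; the point is only that the ALT attribute occupies a source line of its own, with no break between its quotation marks.

```html
<!-- frigate-small.png is a hypothetical file name -->
<IMG SRC="frigate-small.png"
     ALT="Warships at that time usually had two rows of cannon"
     >
```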
Some people have tried the plausible trick of including &#number; references for carriage-return and/or line-feed in their ALT text in hope of getting line-breaks in its display, but I'm afraid this isn't part of the specification and doesn't really work.
HTML4 introduces the LONGDESC attribute, for linking to a document offering a long description, which may be supported by specialised browsers, at least: there was a well-established convention for unobtrusively offering such a link using existing mechanisms.
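A minimal sketch of LONGDESC follows, with invented file names and wording. The second fragment shows one convention built from existing mechanisms, a small text link placed beside the image; that this is the convention meant above is an assumption, not something the text confirms.

```html
<!-- chart.gif, chart-desc.html and all wording here are hypothetical -->

<!-- HTML4 mechanism: LONGDESC points at a separate description page -->
<IMG SRC="chart.gif" ALT="Sales rose steadily through the year"
     LONGDESC="chart-desc.html">

<!-- Existing-mechanism alternative (assumed): an unobtrusive text link -->
<IMG SRC="chart.gif" ALT="Sales rose steadily through the year">
<A HREF="chart-desc.html" TITLE="Long description of the chart">D</A>
```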
Why should authors bother with ALT texts?
Well, from the fact that you're reading this article, I hope you already think it's a good idea, but I have written some notes.
Some of the biggest "casualties" on the information dirt-track are documents whose authors didn't take the indexing robots seriously. Every step that you take towards text-mode accessibility is also a step towards being friendly to those indexing robots, so (whether or not you care about minority audiences such as the blind or users of text mode terminals) I'd say it's in your own interest to keep text-mode accessibility in mind.
Appropriate choice of ALT text
Callie at Writepage commented: Many authors haven't figured out exactly what they are trying to present; they don't know what it is about the image that's important to the page's intended audience. The reason you can't figure out why their alt [texts] aren't working is that they don't know why the images are there. Every graphic has a reason for being on that page: because it either enhances the theme/ mood/ atmosphere or it is critical to what the page is trying to explain. Knowing what the image is for ... makes the labels easier to write.
When you write predominantly-textual material in HTML, you address three different kinds of user:
- I. Those with image loading enabled.
- Example: any graphical browser
- II. Those browsing in text mode, but having image display available if they so choose.
- Examples: graphical browser with auto image loading off; Lynx with a graphical viewer available as helper application
- III. Those who have text mode only, and cannot display images at all.
- Examples: character mode terminal; readers who use a speaking machine. Indexing robots!
The speaking machine isn't only for blind readers, although why people would want to put additional difficulties in the way of blind people accessing their textual material I can't imagine; a sighted reader might use a speaking machine while resting their eyes or otherwise occupied. Motor-impairment can also mean that an otherwise well-adjusted person cannot use a conventional user-interface (such as the precision needed for a typical imagemap).
When you use an inline image, the ALT text is your tool - not very precise, but a serviceable tool nevertheless - to get your message over to readers of types II and III. There are several different reasons why you might be making images available to your reader, so, not surprisingly, there are several different approaches to choosing an alt text. I found it helpful to categorise four main types of image. These are not meant to be in any order of priority: each use has its proper field of applicability.
- a) "Page decorations"
If your reader doesn't display these, there is nothing that an alt text can usefully add to the topic of discourse, and a reader running in text mode is unlikely to want to view or download the decoration. So, code ALT="" in most cases. If you are using one as a link anchor, be sure to include some text in the scope of the anchor too (it is good authoring style to make the significant text be the link, rather than some insignificant bullet or, so help me, "click here"). Spacer images (until this unpleasant crutch has been phased out) should have an ALT attribute which also serves as a spacer (maybe
For cautionary, interrogatory etc. icons the natural choice for text-mode browsers would be something like "[!]" or "[?]", and for bullets ALT="*" etc.; however, in general ASCII-art may produce intrusive results on a speaking browser. Something like this may be worth considering:
<EM><IMG ... ALT="Warning: "></EM>
As far as company logos are concerned, well, if the name of the company is already on the page in clear text, as is often the case, then the logo can be treated as decoration, and ALT="" is appropriate; if the logo were being used instead of the company name, "Foo Corporation", then I recommend ALT="Foo Corporation" as the right choice; in fairness I should note that there are differing opinions held on this topic, and only the author can truly know whether they intended the logo to be a visual branding mark on the page, such that no textual replacement would be meaningful, or as a significant element that should be brought to the attention of non-image readers. (But if you sell logos, and are exhibiting specimens of your work, then your logos are your content!)
The often-seen variations on ALT="Logo of Foo Corporation", or even ALT="Medium size GIF of logo" (!) are incomprehensible: the author is supposed to be providing the reader with information, not with meta-information (descriptions of information). A text description of the logo is generally felt to be inappropriate as ALT text: for those readers who wanted it there are accepted conventions of offering a separate description.
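The two logo cases might look like this in source. The file name foo-logo.gif and the surrounding sentence are invented for illustration.

```html
<!-- foo-logo.gif and the sentence wording are hypothetical -->

<!-- Company name already on the page in clear text: logo is decoration -->
<P><IMG SRC="foo-logo.gif" ALT=""> Foo Corporation welcomes you.</P>

<!-- Logo used instead of the company name: ALT text stands in for it -->
<P><IMG SRC="foo-logo.gif" ALT="Foo Corporation"> welcomes you.</P>
```

Read without images, both paragraphs should come out as the same sentence, which is a reasonable test that the ALT text is serving as content.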
With the increased emphasis on separation of Content (marked up in HTML) from Presentation (proposed by CSS), it's been persuasively argued that "decorative images" should be treated as "presentational", and so should be taken out of the HTML, and proposed (for appropriate media types) from the stylesheet. The issue of selecting an alt text for those images then disappears.
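A sketch of that approach, under assumed names (the class "fancy", the file swirl.gif and the heading text are all invented): the decoration is proposed from a stylesheet restricted to screen media, so the HTML holds no IMG and needs no ALT text.

```html
<!-- Hypothetical class name, image file and heading text.
     The STYLE element belongs in the document HEAD. -->
<STYLE TYPE="text/css" MEDIA="screen">
  /* Decoration proposed from the stylesheet, for screen media only */
  H1.fancy {
    background: url(swirl.gif) no-repeat left center;
    padding-left: 40px;
  }
</STYLE>

<H1 CLASS="fancy">Historic Warships</H1>
```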
- b) "Navigation Icons"
Here, the text equivalent is usually simple enough, being a short statement of the target ("Adventure", "Science Fiction") or function ("Contents", "Next Chapter", "Previous", "Foo Corporation Home Page"). Please bear in mind that browsers already have their own meaning for terms such as "Back", "Home", "Forward", so it is best to avoid the potential confusion that results when the author makes these terms mean something else. Please avoid those "Return to (xyz)" links, that are so irritating to someone who went directly to your page, and has never visited (xyz) before.
By all means help your readers to understand your site layout by using terms such as "Previous", "Up", "Next", or texts which state the destination explicitly ("About Quarks"); but I still say avoid the confusion of the terms "Return" or "Back".
Some authors prefer to offer alternative text-mode links separately from their graphical links, in which case ALT="" would seem plausible; but I prefer to avoid the confusion of two sets of navigation links pointing to the same targets. Also, accessibility checkers rightly complain when they find an image used alone as a link and given ALT="", and some browsers will detect this situation and overrule the ALT="" in the interests of accessibility.
Thumbnails, for "navigating" to a fullsize image of the same thing, are discussed under (c).
Imagemaps are a special case of this category. In some situations, imagemaps play a role that cannot directly be substituted by anything else (geographical maps, for example), but some alternative means of navigation such as an A-Z index or a search can be useful to all kinds of users; you should not devalue it as an extra chore that's only for text-mode readers. Often, though, on the WWW, one sees imagemaps used instead of simple links, presumably on the grounds that it's more complex and so demonstrates the author's prowess. This is a pity, if the author doesn't also have the prowess to make their page usable for all text-mode readers.
Client-side imagemaps, for which browser support is now widespread, are more adaptable for use by text-mode users than the older server-side maps were, so long as you provide them with ALT texts on their AREA tags. Still, as a navigation tool, a row of simple ALT texts can do the job in an effective manner, that adapts better to changes of window size.
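A client-side imagemap with ALT texts on its AREA tags could be sketched like this. The image file, map name, coordinates and target URLs are all hypothetical, and the ALT text on the IMG itself is just one plausible choice.

```html
<!-- navbar.gif, the map name, coordinates and URLs are hypothetical -->
<IMG SRC="navbar.gif" ALT="Site navigation" USEMAP="#navmap">
<MAP NAME="navmap">
  <AREA SHAPE="rect" COORDS="0,0,99,39"
        HREF="contents.html" ALT="Contents">
  <AREA SHAPE="rect" COORDS="100,0,199,39"
        HREF="previous.html" ALT="Previous">
  <AREA SHAPE="rect" COORDS="200,0,299,39"
        HREF="next.html" ALT="Next Chapter">
</MAP>
```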
I offer a page about text-friendly imagemaps. Did I say you should provide separate graphical and text-only pages? - I did not: in general I don't believe you need to, and those people who keep yelling "I can't afford the time to make separate text mode versions of my pages" are just looking for some excuse for their inability to make a web page that is both visually attractive on a graphical browser, and accessible to text mode users.
- c) "Supplemental or Interesting"
These are graphics that the user may find helps their appreciation of the text, but are not mandatory to it. I suggest that there are two ways of going about this.
Provide only links to them; show the reader what's on offer with a brief description. Now, the principle says not to fuss about details of the WWW, but in this case, what with limited bandwidth and the possibility of not all image formats being accepted by all browsers, we can make an exception and warn the reader what it is that we are offering here. So an example could be
Frigate, circa 1800, 560kB PNG
Provide an inline IMG, with an ALT text that summarises the major feature that you wanted to bring to the reader's attention, e.g.
ALT="Warships at that time usually had two rows of cannon"
You should be able to word the body of your text so that it doesn't pre-suppose the reader is also viewing the image alongside. As long as readers of "type II" are aware that an image is available, they can make their own decision whether to load it. Giving readers of "type III" the impression that you are commanding them to load an image will only frustrate and annoy.
Of course, you could combine an inline IMG, with its ALT text, together with a link to an out of line image (typically a larger, more detailed version of it). Then, take care to put the information in its proper place, whether as clear text to be seen by all readers, or in the ALT text aimed chiefly at those who are not loading images. But these are minor details, compared with the major abuse that we see out on the WWW.
If you feel that the IMG needs additional alt text, provide it; if not, then put ALT="". This alt text could typically supply the chief piece of information for which you had provided the picture, e.g. following the above example, you might describe the major relevant feature of the vessel illustrated by the picture. The test of appropriateness, as ever, is to imagine the HTML document viewed without the picture - or imagine reading the document to someone over the telephone - and ask yourself whether the ALT text supplies useful information about the field of discourse. (Technical information about the missing images is, as I say, tolerated if it's for good reason, but in an article on historic ships, the reader primarily wants information about ships, not technical woffle about the WWW.)
- d) Critical for understanding the page
In some fields, this situation rarely arises. In others (e.g. science and engineering, and mathematics so long as we have to put equations in as inline images), it's quite a common occurrence. In this case, provided you have somehow made the reader aware that the document will unfortunately be meaningless without them loading the inline images, there may be nothing useful you can do with the ALT text.
Further discussion and thoughts
I don't think cases (a) and (b) really need a great deal of discussion. (c) and (d) are trickier, and it's not always obvious which of the two we're dealing with. One reader's essential illustrations are another reader's optional extras, and in the more borderline cases it's possible to make an image seem to be either essential or supplemental depending on just how you word the text (as Toby Speight pointed out).
In a situation where there is some agreed scheme for a textual representation, then you could use that as the ALT text. If your audience is accustomed to reading mathematical equations in LaTeX notation, you could use that as alt text for the image of the equation. If you are dealing with heraldry, then you might use the appropriate heraldic description or "blazon", Three Seaxes Argent in pale on a field Vert. Similarly if the image represents musical notation, needlework, or anything else for which there is also an accepted textual notation. In fairness, there are cases where a graphic is absolutely essential to the meaning, and no reasonable amount of text can possibly replace it. But this is no excuse for providing useless alt texts in those situations where a useful one could be provided.
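For instance, an equation offered as an inline image might carry its LaTeX source as the ALT text. The file name and the choice of equation are invented for illustration, and the approach assumes an audience that reads LaTeX notation.

```html
<!-- einstein.gif and the sentence wording are hypothetical -->
<P>The mass-energy relation is
<IMG SRC="einstein.gif" ALT="$E = mc^2$">.</P>
```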
We had an example a little while back in which someone had suggested (in the context of holiday offers) an alt text that said ALT="Picture of Hotel". I say this is inappropriate because it tells us nothing about the topic of discourse - instead, it tells us chiefly about the mechanics of the WWW. What the reader, particularly a reader of "type III", wants to know is - what does the picture show that's relevant to the topic of discourse? The picture might show, for example: ALT="The Pines Hotel, a fine old stone building in extensive grounds". This is suggested as an alt text, rather than as a caption, because those readers who can see the picture will already be able to see it for themselves. If you want to also offer them a link to the picture, then do so, in one of the ways mentioned above. (Even a blind reader might want to download the picture, to show it to a friend later.)
Readers of "type II" already have browser facilities that allow them to retrieve the image if they so choose (Lynx puts the inlines only a keystroke "*" away); there is nothing extra that the author really needs to do about it - and after several discussions of what the author could do, nobody seems to have come up with anything that really provides worthwhile additional help to the text-only user without needlessly distracting the graphics-based user. (As so often on the WWW, the important thing is to mark up the information honestly for what it is, rather than trying to out-guess the browser designer; if there are limitations in the facilities that one or other browser offers, then the place to remedy them is in the browser design, not in the author's HTML source.) When I mentioned in a usenet discussion my dissatisfaction with the above text, "Picture of Hotel", someone helpfully suggested "Download picture of Hotel". I hope that by now I've made it clear why, far from being an improvement, this seems to me to be even worse: it concentrates yet again on the mechanics of the WWW rather than on the "topic of discourse".
One suggestion was to capitalise on the typical text-mode browsers' [IMAGE] notation by putting the ALT text within square brackets, so that text-mode users would associate this with an image: ALT="[The Pines Hotel, a fine old stone building...]". This seems a good idea for sighted users, but again the extra punctuation might be intrusive for voice browsers.
As said before: if the image is mandatory to your presentation, then say so plainly. If not, then don't pester the reader to load it: tell them what information it contains relevant to the "topic of discourse", and leave them to take action as they consider appropriate.
"This page has been visited [Counter Image] times"
"Accessed [a bitmapped number] times since 12/1/95"
(and several variations on this theme).
Well, this has nothing to do with the "topic of discourse". I don't think any of the variations could be claimed to be good style (not forgetting that 12/1/95 means something different to European readers than what it means to USAns). But I can't work up any enthusiasm for page counters, nor can Jeff Goldberg.
ALT text as "tooltips"?
Version by version, popular graphical browsers got worse and worse in their display of ALT texts when auto image loading was off. Then they seem to have hit upon the idea of displaying the ALT texts as "tooltips" when the mouse pointer was on the image location. Plenty of authors seem to have reacted by using the ALT text to specify their desired tooltip text, regardless of the text being entirely inappropriate for use as the "alternative text" described in the HTML specifications.
Well, HTML4 has an answer to this: the TITLE attribute. The HTML4 spec says explicitly that it would be appropriate for the TITLE attribute to be displayed as a "tooltip", so it all falls into place. Use the ALT text for the purpose of providing alternative text, for example along the lines discussed in this article, and use the TITLE attribute to title the image, in a way that would be appropriate for a tooltip. Support for the TITLE attribute was introduced in MSIE4, and in Opera and other browsers: the major problem was Netscape 4.* releases, but Netscape 6 finally supports it.
More about this in ALT text as popups: a critical response.
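The division of labour between the two attributes might be sketched as follows. The file name and the TITLE wording are invented; the ALT text is the hotel example used earlier in this note.

```html
<!-- pines.jpg and the TITLE wording are hypothetical -->
<IMG SRC="pines.jpg"
     ALT="The Pines Hotel, a fine old stone building in extensive grounds"
     TITLE="The Pines Hotel, seen from the south lawn">
```

Here the ALT text replaces the picture for readers who are not seeing it, while the TITLE merely titles the picture for readers who are.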
The Decorative Horizontal Rule
Authors ask, reasonably enough, to use an IMG of a decorative rule in the graphical display, that falls back to some kind of separator in a text mode display: several ways of doing that have been suggested, but none are without shortcomings.
Style sheets are the least harmful way: see an earlier suggestion of mine using styles with an HR.
In theory, inserting your decorative image with OBJECT, and supplying HR as the fallback, would be another solution entirely within the philosophy of HTML, but sadly not well implemented by even the latest crop of browsers.
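The OBJECT idea could be written roughly as below, with a hypothetical file name. As the text warns, this is the in-theory form and browsers of the period may not render it as intended.

```html
<!-- rule.gif is a hypothetical file name -->
<OBJECT DATA="rule.gif" TYPE="image/gif">
  <!-- Fallback content, used when the image is not rendered -->
  <HR>
</OBJECT>
```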
Discussion of the various solutions on usenet brought a number of strongly-held but mutually incompatible views, so if none of these solutions appeals to you, it might be advisable to re-cast the design so that the problem doesn't need to be solved.
Spacing between alt texts
Consider some images crammed together, for example as navigation buttons:
<IMG ... ALT="The University"><IMG SRC=town.gif ALT="The Town">...
When viewed on a text-mode browser, this is going to read:
The UniversityThe Town...
Various solutions may be considered:
- The use of nothing more than white space to separate links, such as
"Left ", "Right ", "Index ", "Config " etc.
is considered unsatisfactory in general, as it could be confusing just where the link boundaries are; and note that HTML4 warns against using leading/trailing "white space" on attribute values.
- Vertical bars are a popular alternative: to get e.g.
|Left|Right|Index|Config|
the ALT texts might be respectively:
"|Left|", "Right|", "Index|", "Config|"
- Brackets balance well and they reflect a long-standing text browser idiom of representing IMG with [LINK], [INLINE] etc.: providing ALT texts such as "[The University]", "[The Town]", "[Main Index]", etc., produces the obvious result:
[The University][The Town][Main Index]
The above ideas are serviceable in quite a range of browsing situations, though I have to admit that some speaking browsers are less than ideal, for example it's no fun listening to left square bracket, right square bracket while the links are being read out (example from an earlier version of IBM Home Page Reader).
Nowadays it might be a more productive approach to provide text-mode links as the primary navigation mechanism, and use CSS stylesheets to propose a decorative presentation for them (for suitable media, of course).
Navigation to be text-friendly and accessible
Some accessibility guidelines recommend having more than just space(s) separating adjacent link texts. But most users of graphical browsers don't need or want such separators. Here's a suggestion: as ALT texts for the images within the scope of the links, use the appropriate texts without additional separators. Then, between those images, outside of the scope of the links, place one-pixel transparent GIFs with their alt text set to "|" to act as text mode separators. See the navigation bar at the foot of this page where this is used in practice.
Note that the buttons do not have their height and width specified, whereas the separator pixels have their height and width (=1) specified. I've played around with this on a range of browsers, and it seems to me to be a nice compromise. When image loading is enabled, graphical browsers produce a result that is practically indistinguishable from the earlier constructs. On graphical browser/versions with image loading disabled, some gave good results while others were visually rather poor, but all were at least serviceable. The results on text browsers (Lynx and emacs-w3) were entirely acceptable. (I used to suggest alt=" | ", with white space either side of the bar, but I now know that this produces undesirable effects in IBM Home Page Reader, for example.)
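The suggestion can be sketched as follows. All file names and URLs are hypothetical, including dot.gif, which stands for the one-pixel transparent GIF. Line breaks are placed inside the tags so that no white space separates adjacent elements.

```html
<!-- All file names and URLs are hypothetical.
     dot.gif stands for the one-pixel transparent GIF. -->
<A HREF="univ.html"><IMG SRC="univ.gif" ALT="The University"></A><IMG
   SRC="dot.gif" ALT="|" WIDTH="1" HEIGHT="1"><A
   HREF="town.html"><IMG SRC="town.gif" ALT="The Town"></A><IMG
   SRC="dot.gif" ALT="|" WIDTH="1" HEIGHT="1"><A
   HREF="index.html"><IMG SRC="index.gif" ALT="Main Index"></A>
```

With images off, the intent is that the links read as The University|The Town|Main Index, with the bars lying outside the scope of the links.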
OK, this is just one suggested solution: maybe other authors will find a better compromise; and, as specialised browsers are developed further, the need for some of these workarounds may fade away. Again one might consider whether it was better to offer a simple text menu in the HTML, that would be serviceable from any browsing situation, optionally modified from the stylesheet to give visual styling for appropriate media types.
HEIGHT and WIDTH?
There had been a long-running debate on c.i.w.a.html about the wisdom of providing correct HEIGHT and WIDTH attributes on IMG tags. The idea is that the browser can reserve space for the image before the image is retrieved, and in consequence can present the normal text properly formatted on the page as soon as the text is available, simply slotting the images into place later as they arrive.
Browser versions differ in how they support this when image loading is turned off. Some (e.g. Netscape 4.* versions) will display the ALT text only partially, or not at all, if the specified rectangle is too small. MSIE can display the whole ALT text, but it is not the default (Tools/ Internet Options/ Advanced/ Accessibility/ "Always expand ALT text for images"). The behaviour while waiting for image loading to be completed (some browsers display the ALT text during this interval) may or may not be the same as the behaviour when image loading is turned off. Too many variations of behaviour have been seen, between browsers and between versions, to be able to give an account of them here. On balance I think the only advice I can offer is to normally include correct HEIGHT and WIDTH information; but if these are likely to be too small for the text, then you might decide to deliberately omit those attributes.
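That advice might translate into source like this. The file names and pixel sizes are invented for illustration.

```html
<!-- File names and pixel sizes are hypothetical -->

<!-- Large image: correct dimensions let the browser reserve the space,
     and the rectangle is big enough to hold the ALT text -->
<IMG SRC="pines.jpg" WIDTH="300" HEIGHT="200"
     ALT="The Pines Hotel, a fine old stone building in extensive grounds">

<!-- Small button whose rectangle would be too small for its ALT text:
     dimensions deliberately omitted -->
<IMG SRC="next.gif" ALT="Next Chapter">
```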
The colour of ALT text is discussed separately.
There's also a separate page about INPUT TYPE=IMAGE.
Reader comments and questions
A correspondent suggests that many of the misuses described here are being caused by the "authoring tools" which authors are using to create their WWW documents. But that isn't much of an excuse, is it? HTML markup is simple enough already: certainly it makes sense to use an appropriate software tool to save drudgery, and I'm all in favour of that as a principle. But when the tool prevents you from producing documents that conform to good authoring style, then you should be questioning your original decision to rely on that particular tool.
A reader asks: Someone is now trying to tell me that alt text can actually block search engine spiders from indexing a site. Is this true?
I find this hard to believe. Omitting ALT attributes offends against web accessibility guidelines, which surely count for something, even amongst those who do not take HTML syntax rules seriously. There are various surveys of search engine properties, but the search engines do change their procedures rather quickly so it's not useful to try to summarise them here: what is certainly true is that at various times some of them have, and some of them have not, indexed "alt" texts. But I've seen nothing credible to say that ALT texts, in themselves, would be harmful to web indexing, and it would indeed seem perverse (and would draw adverse comment in relation to accessibility) for one of them to do so. I certainly wouldn't let it discourage me from proper use of ALT, and I'd soon add my signature to the petition if such a search engine is found! Maybe your informant is thinking of the misuse of ALT attributes to "spam" the indexing robot with keywords, but then it would be the keyword spamming which was the abuse (and would apply to any other means of feeding the indexer with keywords which were otherwise not seen by the normal reader) rather than the use of ALT attributes as such.
By the way, I have never taken any active steps to get my own web pages registered with search engines. But when I search for topics that are of interest to me, my own pages keep coming up, relatively high on the list. So I must be doing something right.
- Reality check - some typical scenarios in a text mode browser:
The online edition of a prestigious US newspaper was seen promoting its partners thus:
P A R T N E R S : [LINK]-[USEMAP] spacer [LINK] spacer [LINK] spacer
This was seen on a unix system for which the recommended browser is not in fact available:
Our site is best experienced with: [LINK]Click to Get It!
Then, there was this great piece of self-advertising (spelling as in original):
Another fine Web sight from [Company Logo]
- An English Borough Council making its mark:
- Welcome to... Borough Crest The Borough of On-Line Having problems accessing our site? [Click Here]
- An anonymised extract from the front page of a German web site service provider:
- [LINK]-visual fubarGmbH direct rubriken [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] Privat [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] [space.gif] Aktuell [space.gif] [space.gif] [space.gif] [space.gif] [spitze-links.gif]
(My informant tells me they were immediately taken out of consideration for the contract.)
- Artistic licence(?):
- siterequirements.jpg notice.jpg inorderto.jpg getquicktime.jpg getflash.jpg plugintester.jpg agreement.jpg
This artist had used a whole job-lot of images (almost exclusively images of text), each solemnly furnished with an ALT text which repeated the image's file name, like so:
img src="site_requirements_images/inorderto.jpg" alt="inorderto.jpg"
- "This site is [LINK]etscape [INLINE]"
Way back, I browsed the results of a web search for the string "etscape", and found some HTML sources varying from the slightly silly to the sheer demented! Interestingly, when I reviewed this in 2002, it looks as if most of the "etscape-enhanced" sites are from 1997 or earlier. How fashions fade!
ALT="Large Yellow Bullet"
So we get to read (or blind readers get to hear):
Large Yellow Bullet Introduction Large Yellow Bullet The Problem Small Red Bullet Historical Analysis Small Red Bullet Current Situation Large Yellow Bullet The Solution
The web site of the US Embassy Belgrade had this prizewinning example:
Small red bullet Response to Terrorism
ALT="This image is mapped, please download it"
For text-mode browsers, this does not help. I've already made some more-appropriate suggestions above. See my accompanying imagemap page for more detail and references.
ALT="Turn on image loading, damnit!"
One wonders just how many potential visitors have given that site a miss because it "welcomed" the indexing robot in such a brusque and uninformative fashion.
ALT="Imagemap of various flags"
What would a text-only user want with an image of some unspecified flags? This is supposed to be a navigation tool, not a guessing game!
Alt="(Sorry, Not Available With Your Web Client)"
Nonsense! I was using Netscape with image loading turned off. Even if I had been using Lynx, who are you to say that I can't fire up a helper application to see this image?
- [LINK]-- Self explanatory
An image of some "self-explanatory" text, with no
ALT="Put your alt text here"
- "Oldtown University arms Physics Department"
Gosh, could they be Weapons of Mass Distraction?... Ah: this is their Physics Department's web page, decorated at the left with the University's coat-of-arms (not ours: but name changed to protect the guilty...).
- "Photo of a bull in the water canoeing"
I beg your pardon? Ah, here's what went wrong:
<IMG SRC="bull.jpg" ALT="Photo of a bull in the water">
<IMG SRC="canoe.jpg" ALT="canoeing">
The original site, which featured a scenic Canadian river, showed two perfectly reasonable, but unrelated, pictures. Later, a reader of this article, "Michael T.", sent me a fine illustration of this howler!
- "Academic departments are indicated by pink bullets".
Unusable in text mode, or on a monochrome display. Choose a graphic icon that can carry the message by its shape - in this case it could be a little mortarboard (no harm in making it coloured too); and choose distinctive text markers to use for the
Academic departments are indicated
by: <IMG SRC="mortarboard.gif"
(making the corresponding adjustment to the list itself, of course).
This was "leakage", where the author had made the mistake of referring from the text to some aspect ("pink bullets") of the presentation that would only be perceived by a subset of readers. The WWW, by its very nature, will separate content from presentation in this way: a careful author can choose a style of authoring that does not fall into this kind of trap.
ALT="Loading... Please Wait"
Nice try! But who guarantees that the browser is, in fact, loading images at this time? As Jukka K says, 'Should I wait until I get some mental disorder that makes me click on the "show images" button?'
ALT="Plot of fish population against date"
This is a more subtle problem, but the poor text-mode reader is left wondering "what the heck is that graph-plot supposed to be telling me?" A more helpful text might be "Graph: fish population fell dramatically through 1980s until fishing moratorium imposed", or whatever the graph actually illustrates.
Oh dear! Is nothing sacred? - this came from (an earlier version of) the welcome page of the W3C themselves! Several images had been displayed together, without any provision for spacing or punctuating their
- My favourite howler went something like this
- <CENTER> <FONT SIZE=6>Our Classrooms and Staff</FONT> <IMG SRC="rule.gif" ALT="fancy horizontal rule"> </CENTER>
Instead of using <H1> for this first level header, they had simply marked it up with a font size: there was no linebreak implied between the text and the image. With image loading on, this was not a problem: the "fancy horizontal rule" was so big that it automatically went onto a new line. However, with text-mode browsers this whole thing was quaintly rendered as
Our Classrooms and Staff fancy horizontal rule
Certainly they should have used H1 markup for the header. One way to handle a decorative horizontal rule is mentioned above.
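The run-together howlers above ("a bull in the water canoeing", "Our Classrooms and Staff fancy horizontal rule") can be reproduced with a very rough model of a text-mode browser. This is only a sketch under stated assumptions: the function name `renderTextMode` and the node format are made up for illustration, and real text browsers do considerably more than this.

```javascript
// Hypothetical model of a text-mode browser: every inline image is replaced
// by its ALT text, nothing is inserted between neighbours, and runs of
// whitespace collapse to a single space (as in normal HTML rendering).
function renderTextMode(nodes) {
  return nodes
    .map((node) => (node.type === "img" ? node.alt : node.text))
    .join("")
    .replace(/\s+/g, " ")
    .trim();
}

// Two unrelated pictures separated only by a line break in the source.
const river = [
  { type: "img", alt: "Photo of a bull in the water" },
  { type: "text", text: "\n" },
  { type: "img", alt: "canoeing" },
];
console.log(renderTextMode(river));
// Photo of a bull in the water canoeing

// A heading marked up with FONT instead of H1, followed by an inline image.
const heading = [
  { type: "text", text: "Our Classrooms and Staff" },
  { type: "text", text: " " },
  { type: "img", alt: "fancy horizontal rule" },
];
console.log(renderTextMode(heading));
// Our Classrooms and Staff fancy horizontal rule
```

Reading the output of a model like this aloud is a cheap way to notice that two ALT texts need punctuation between them, or that a decorative image should not speak at all.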
(Comment: you may be able to find some of the examples with search engines which include alt text; others have been adapted, or some examples taken and a composite "howler" made. Search engines which index ALT texts still have no problem finding plenty of pages containing "small red bullet" or similar, and they aren't all web authoring tutorials advising you not to do it, unfortunately.)
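Several of the howlers above follow patterns mechanical enough to be caught by a simple check. The sketch below is one possible approach, not a recommendation from this note: the function name `suspiciousAlt`, the two rules it applies, and the file names `bullet.gif` and `fish.gif` are all assumptions made for illustration. It can only flag likely mistakes; it cannot tell you what a good ALT text would be.

```javascript
// Hypothetical checker: returns a reason string if the ALT text looks like
// one of the howlers described above, or null if nothing obvious is wrong.
function suspiciousAlt(src, alt) {
  // Last path segment of the SRC, e.g. "inorderto.jpg".
  const fileName = src.split("/").pop().toLowerCase();
  const text = alt.trim().toLowerCase();
  if (text !== "" && text === fileName) return "repeats the file name";
  // Describes the decoration rather than standing in for its purpose.
  if (/\b(bullet|spacer)\b/.test(text)) return "describes decoration";
  return null;
}

const samples = [
  { src: "site_requirements_images/inorderto.jpg", alt: "inorderto.jpg" },
  { src: "bullet.gif", alt: "Large Yellow Bullet" }, // file name assumed
  { src: "fish.gif", alt: "Graph: fish population fell dramatically" }, // file name assumed
];

for (const { src, alt } of samples) {
  console.log(alt, "->", suspiciousAlt(src, alt));
}
```

A check like this is deliberately narrow. Texts such as "Loading... Please Wait" pass it untouched, which is a reminder that thinking about the job the image does on the page cannot be automated away.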
I don't believe that any particular familiarity with a text browser, nor indeed with a speaking machine, was required in order to select a useful ALT text for most examples. If the document had been marked up by giving appropriate thought to the content that is to be communicated to the reader, rather than getting side-tracked by the mechanics of the WWW, then it could have "worked" on every browser - and searcher and indexer. And without in any way degrading its visual appearance in the common graphical browsing situations.
To sum up:
- Think what information will be presented to each of the three types of user
- Consider how that information illuminates the "topic of discourse"
- And then all should be clear.
A final thought. When your paperback edition is published, does it include an "ALT text" that tells the reader that they should have bought the hardback edition with the eight extra illustrations, and the handsome dustcover? I think not. Please don't address your text-mode web readers as if they were second class citizens, either.
- Overview: Tabular summary of this page's conclusions.
- More: ALT background materials, various musings and rants
- Up: Text-friendly authoring topics
My thanks and best regards to all who have contributed to the discussions on earlier drafts of this note.
"Plagiarism is the sincerest form of flattery"? Apart from a version of this article which was featured at the WDG site with my agreement and full co-operation, and extracts which have been used with my permission in a couple of other places, web search engines have found several different sites that had "lifted" substantial parts of this article without asking permission or acknowledging their source, including one that had applied their own copyright notice onto my material. I suppose I should be pleased that the ideas are finding such an echo, but...
Breaking the Web with hash-bangs
Tuesday, February 08, 2011
Update 10 Feb 2011: Tim Bray has written a much shorter, clearer and less technical explanation of the broken use of hash-bang URLs. I thoroughly recommend reading and referencing it.
Update 11 Feb 2011: Another very insightful (and balanced) response, this one from Ben Ward (Hash, Bang, Wallop.), which does a great job of separating the wheat from the chaff.
Every URL on Lifehacker now looks like this: http://lifehacker.com/#!5753509/hello-world-this-is-the-new-lifehacker. Before Monday the URL was almost the same, but without the #!. So what?
# is a special character in a URL: it marks the rest of the URL as a fragment identifier, so everything after it refers to an HTML element id, or a named anchor, in the current page. The current page here is the Lifehacker homepage.
So on Sunday Lifehacker was a 1 million page site; today it's a one page site with 1 million fragment identifiers.
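To make the point about fragment identifiers concrete, here is a small sketch using the standard `URL` class available in browsers and Node.js. The helper name `requestTarget` is made up for illustration; it simply rebuilds the part of the URL that is sent to the server, since the fragment never leaves the client.

```javascript
// Hypothetical helper: the part of a URL that an HTTP client actually
// requests. The fragment (everything from "#") is kept on the client side.
function requestTarget(href) {
  const url = new URL(href);
  return url.origin + url.pathname + url.search;
}

const before = "http://lifehacker.com/5753509/hello-world-this-is-the-new-lifehacker";
const after = "http://lifehacker.com/#!5753509/hello-world-this-is-the-new-lifehacker";

console.log(requestTarget(before));
// http://lifehacker.com/5753509/hello-world-this-is-the-new-lifehacker
console.log(requestTarget(after));
// http://lifehacker.com/
console.log(new URL(after).hash);
// #!5753509/hello-world-this-is-the-new-lifehacker
```

Every hash-bang article URL therefore produces a request for the same resource, the homepage, and only client-side script can tell the articles apart.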
Why? I don't know. Twitter's response, when faced with this question on launching "New Twitter", was that it lets Google index individual tweets. True, but Google could do that with the previous, proper URL structure too, with much less overhead.
A solution to a problem
The #!-baked URL (hash-bang) syntax first came into the general web developer spotlight when Google announced a method web developers could use to allow Google to crawl Ajax-dependent websites.
Although Google spent many laborious hours trying to crack this problem, they eventually admitted defeat and tackled the problem in a different manner. Instead of trying to find this mythical content, let's get website owners to tell us where the content actually is, and they produced a specification aimed at doing just that.
If you’re starting from scratch, one good approach is to build your site’s structure and navigation using only HTML. Then, once you have the site’s pages, links, and content in place, you can spice up the appearance and interface with Ajax. Googlebot will be happy looking at the HTML, while users with modern browsers can enjoy your Ajax bonuses.
The #! URL syntax was especially geared for sites that got the fundamental web development best practices horribly wrong, and gave them a lifeline to getting their content seen by Googlebot.
And today, this emergency rescue package seems to be regarded as the One True Way of web development by engineers from Facebook, Twitter, and now Lifehacker.
Google’s specification calls the #!-patterned URLs pretty URLs, and they are transformed by Googlebot (and other crawlers supporting Google’s lifeline specification) into something more grotesque.
On Sunday, Lifehacker’s URL scheme looked like this:
Not bad. The 7-digit number in the middle is the only unclean thing about this URL, and Gawker’s content system needs that as a unique identifier to map to the actual article. So it’s a mostly clean URL.
Today, the same piece of content is now addressable via this URL:
This is less clean than before: the addition of the #! fundamentally changes the structure of the URL:
- The path
- A new fragment identifier of
What does this achieve? Nothing. And the URL mangling doesn’t end there.
Google’s specification says that it will transform the hash-bang URL into a query string parameter, so the example URL above becomes:
That uglier URL actually returns the content of the article. So this is the canonical reference to this piece of content. This is the content that Google indexes. (This is also the same with Twitter’s hash-bang URLs.)
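The transformation a crawler performs can be sketched in a few lines. This is an approximation of the mapping described in Google's Ajax crawling specification, not a complete implementation: the function name `toEscapedFragment` is an assumption, and only a handful of characters are percent-escaped here, whereas the specification lists a wider range.

```javascript
// Approximate sketch of the crawler-side rewrite: everything after "#!"
// moves into an _escaped_fragment_ query parameter.
function toEscapedFragment(prettyUrl) {
  const marker = prettyUrl.indexOf("#!");
  if (marker === -1) return prettyUrl; // not a hash-bang URL: leave untouched
  const base = prettyUrl.slice(0, marker);
  const fragment = prettyUrl.slice(marker + 2);
  // Escape characters that would be ambiguous inside a query string.
  // (Simplified: the specification covers more characters than these.)
  const escaped = fragment.replace(/[%#&+ ]/g, (ch) => encodeURIComponent(ch));
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + escaped;
}

console.log(
  toEscapedFragment("http://lifehacker.com/#!5753509/hello-world-this-is-the-new-lifehacker")
);
// http://lifehacker.com/?_escaped_fragment_=5753509/hello-world-this-is-the-new-lifehacker
```

Note that the rewrite has to happen in the crawler. A client that simply follows the specifications for HTTP and URLs has no reason to perform it, which is the root of the problems described below.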
This URL scheme looks a lot like:
Lifehacker/Gawker have thrown away a decade’s worth of clean URL experience, and ended up with something that actually looks worse than the typical templated Classic ASP site. (How much more Frontpage can you get?)
Clean? Not on your life!
What’s the problem?
Far more complicated than a simple URL, far more error prone, and far more brittle.
So, requesting the URL assigned to a piece of content doesn’t result in the requestor receiving that content. It’s broken by design. LifeHacker is deliberately preventing crawlers from following links on the site towards interesting content. Unless you jump through a hoop invented by Google.
Why is this hoop there?
The why of hash-bang
So why use a hash-bang if it’s an artificial URL, and a URL that needs to be reformatted before it points to a proper URL that actually returns content?
Out of all the reasons, the strongest one is “Because it’s cool”. I said strongest not strong.
Engineers will mutter something about preserving state within an Ajax application. And frankly, that’s a ridiculous reason for breaking URLs like that. The URL of an
At the risk of invoking the wrath of Jamie Zawinski, Lifehacker can keep its mostly clean URL of last week (http://lifehacker.com/5753509/hello-world-this-is-the-new-lifehacker) and obtain the mangled version by this regular expression:
var mangledUrl = this.href.replace(/(\d+)/, "#!$1");
Disallow all bots (except Googlebot)
All non-browser user-agents (crawlers, aggregators, spiders, indexers) that completely support both HTTP/1.1 and the URL specification (RFC 2396, for example) cannot crawl any Lifehacker or Gawker content. Except Googlebot.
This has ramifications that need to be considered:
- Caching is now broken: since intermediary servers have no canonical representation of content, they are unable to cache it. This results in Lifehacker being perceived as slower. It means Gawker don’t save bandwidth costs through edge caching of chunks of content, and they are on their own in dealing with spikes of traffic.
- HTTP/1.1 and RFC-2396 compliant crawlers now cannot see anything but an empty homepage shell. This has knock-on effects on the applications and services built on such crawlers and indexers.
- The potential use of Microformats (and upper-case Semantic Web tools) has now dropped substantially - only browser-based aggregators or Google-led aggregators will see any Microformatted data. This removes Lifehacker and other Gawker sites from being used as datasources in Hackdays (rather ironic, really).
- Facebook Like widgets that use page identifiers now need extra work to allow articles to be liked. (by default, since the homepage is the only page referenceable by a non-mangled URL, and all mangled URLs resolve down to being the homepage)
- A debugging console.log line accidentally left in the source will cause Gawker’s site to fail when the visitor’s browser doesn’t have the developer tools installed and enabled (Firefox, Safari, Internet Explorer)
Such brittleness, with no real reason and no benefit that outweighs the downside. There are far better methods than the one Gawker adopted: even HTML5’s History API (with appropriate polyfills) would be a better solution.
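As a rough illustration of the History API alternative, the sketch below builds a click handler that keeps ordinary, crawlable links in the markup and only takes over when `pushState` is available. It is a minimal sketch, not Gawker's code or anyone's production approach: `makeClickHandler` and `loadArticle` are hypothetical names, and the `history` object is passed in as a parameter so that the logic can be exercised outside a browser (in a page it would be `window.history`).

```javascript
// Hypothetical progressive enhancement: links keep their clean href, and
// script only intercepts the click when the History API is present.
function makeClickHandler(history, loadArticle) {
  return function onClick(event) {
    const href = event.target.href;
    // No History API: do nothing, so the browser follows the real link.
    if (!history || typeof history.pushState !== "function") return false;
    event.preventDefault(); // stay on the current page
    history.pushState({ href: href }, "", href); // address bar shows the clean URL
    loadArticle(href); // fetch the content and swap it in
    return true;
  };
}

// Stand-ins for browser objects, for demonstration only.
const visited = [];
const fakeHistory = { pushState: (state, title, url) => visited.push(url) };
const loaded = [];
const handler = makeClickHandler(fakeHistory, (href) => loaded.push(href));

let prevented = false;
handler({
  target: { href: "http://lifehacker.com/5753509/hello-world-this-is-the-new-lifehacker" },
  preventDefault: () => { prevented = true; },
});
console.log(visited, loaded, prevented);
```

The design choice worth noting is the fallback: when the handler declines to act, the visitor, the crawler and the cache all see the same clean URL returning the same content.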
An Architectural Nightmare
Gina Trapani tweets: Lay down your pitchforks and give @Lifehacker’s redesign a week before you swear it off and insist that the staff doesn’t care about you. A week won’t solve Gawker’s architectural nightmare.
Updates (9th February 2011)
Wow. I (and my VPS) am overwhelmed by the conversation this post has sparked. Thank you for contributing towards a constructive discussion. Some of the posts that caught my eye today:
The Next Web reports that Gawker blogs have disappeared from Google News searches. A Gawker media editor is quoted as saying that they hope to have it resolved soon. They are listed again, but using the _escaped_fragment_ form of the URL. So much for clean URLs. Though, the link seems intermittently broken claiming the URL requested is not available (with a redirect to
I did like this tl;dr summary of this post over on theawl.com by mrmcd.
Webmonkey have a summary story, but link off to some very handy resources for clean URL strategies. (I first learnt HTML from Webmonkey back in the previous century)
Danny Thorpe talks about Side effects of hash-bang URLs, including URL Cache equivalence. Oliver Nightingale has a nicely worked example using HTML5's pushState in a progressively enhanced way (great job!)
The very short geeky summary of this post (try curling a Lifehacker article canonical URL):
$ curl http://lifehacker.com//hello-world-this-is-the-new-lifehacker | grep "Hello"
$
or as Ben Ward put it: If site content doesn’t load through
Broken HTTP Referrers
Watching my logfiles I'm seeing a number of inbound links to this post from gawker.com and kotaku.com - from the homepage (i.e. the fragment identifier is stripped out). So somewhere on those sites there's a discussion going on about my post, and there's no way of finding it thanks to Gawker's use of hash-bang URLs.
Adrian Chen — 21-year-old Michigan resident Evan Emory currently faces 20 years in prison for "manufacturing child sexual abusive material". His crime: He posted a YouTube video that made it appear he was singing an explicit song to a classroom of elementary students.
Emory tricked administrators at Beechnau Elementary School into letting him perform a song for the kids on video, claiming he wanted to build his portfolio. He sang an innocent song in front of the kids, but when the room was empty recorded a sexually explicit song. ("I like the way you make your body move. C'mon, girl...See how long it takes to make your panties mine...I'll add some foreplay in just to make it fun. I want to stick my index finger in your anus.")
Through trick editing, Emory made it appear that he had been singing the song to the kids while they smiled and laughed along. He included a disclaimer—"No children were exposed to the 'graphic content' of this video"—and posted it on YouTube earlier this week.
On Wednesday, Emory was arrested on charges of manufacturing "child sexual abusive material". Said the county prosecutor:
"The bottom line in this case is that he walked into a classroom and took advantage and victimized every single child in that classroom," Tague said.
"This case is very disturbing to law enforcement officials. We have launched a full-fledged investigation with the sheriff."
At his arraignment, outraged parents of the kids in the video appeared at the courthouse to rally for jail time.
We can understand why the parents and school would be upset. But these are clearly laws designed to punish hardcore sex offenders—not some bro who came up with a misguided idea for a prank. In the end, the video appears to have been online for about a day or two and was probably seen by a few hundred people at most. This is a very broad definition of "victimization!" One law professor says the charges are likely unconstitutional.
As Radley Balko points out, the hysteria is fueled by the volatile combination of children + sex + The Internet. Add to that an overreaction by a humiliated school district. Here's hoping the judge realizes this, too.
Note: The embedded video is another one of Emory's pranks—not the video in question
Periodic Table of the Elements
Elements for html5advent.com
html (1): Document root element.
col: Columns in a table.
table: Table of multi-dimensional data.
head (1): First element of the HTML document. Contains document metadata.
fieldset: Set of form controls grouped by theme.
h1 (25): Heading for the current section.
tr: A row of cells.
pre: Text that is preformatted in the HTML code.
meter: Control for entering a numeric value in a known range.
select: Control for selecting from multiple options.
caption: Title of a table.
meta (6): Document metadata that can't be represented with other elements.
rt dfn em
ins: Text that has been inserted during document editing.
div (86): Container with no semantic meaning.
optgroup: Group of options.
h3 (21): Heading for the current section.
menu: Set of commands.
base: Specifies URL which non-absolute URLs are relative to.
rp: Contains semantically meaningless markup for browsers that don't understand ruby annotations.
abbr: Abbreviation or acronym.
del: Text that has been removed during document editing.
s: Text that is outdated or no longer accurate.
label: Caption for a form control.
option: Single option within a select control.
datalist: Define sets of options.
h4 (3): Heading for the current section.
command: Command the user can perform, such as publishing an article.
tbody: Contains rows that hold the table's data.
noscript: Contains elements that are part of the document only if scripting is disabled.
var: Mathematical or programming variable.
kbd: Example input (usually keyboard) for a program.
Opportunity for a line break.
dt: Term which will be described.
input: Generic form input.
output: Contains the results of a calculation.
keygen: Generates private-public key pairs.
thead: Contains rows with table headings.
style: Styling defined inline data.
script (6): Inline or linked client side scripts.
cite: Title of a referenced piece of work.
bdo: Defines directional formatting for content.
code: Fragment of code.
dd: Description for the preceding term.
textarea: Multiline free-form text input.
progress: Control for displaying progress of a task.
h6: Heading for the current section.
details: Contains additional information, such as the contents of an accordion view.
tfoot: Contains rows with summary of data.
area: Hyperlink area in an image map.
map: Image map for adding hyperlinks to parts of an image.
embed: Reference to non-HTML content.
source: Alternative sources for parent video or audio elements.
iframe: Nested browser frame.
canvas: Bitmap which is editable by client side scripts.
track*: Specifies external timing track for media elements. This element is still being drafted.
Allows scripts to access devices such as a webcam. This element is still being drafted.
Metadata and scripting