I recently stumbled across an old copy of the Demoroniser (which my American-trained sense of spelling keeps trying to turn into demoronizer), a script designed to correct some of the broken HTML generated by Microsoft Office. Aside from flat-out coding errors, Office would use non-standard characters for things such as curly quotes or em-dashes that would only show up on Windows computers. If you viewed the resulting pages on a Mac, a Linux box, a Palm, etc., they would seem to be missing punctuation everywhere. The script’s solution was to convert these characters to their plain-ASCII equivalents.
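To make the idea concrete, here is a minimal Python sketch of that kind of conversion. It is not the Demoroniser itself: the mapping table and the function name `to_plain_ascii` are my own illustration, and it assumes the input bytes are in the Windows-1252 encoding, where those extra punctuation characters live.

```python
# Hypothetical mapping for illustration; the real script handles many more cases.
ASCII_EQUIVALENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}


def to_plain_ascii(raw: bytes) -> str:
    """Decode Windows-1252 bytes and swap "smart" punctuation for ASCII."""
    # Assumption: the source is Windows-1252. Undefined bytes are replaced
    # with U+FFFD instead of raising an error.
    text = raw.decode("cp1252", errors="replace")
    return text.translate(str.maketrans(ASCII_EQUIVALENTS))


if __name__ == "__main__":
    # 0x93 and 0x94 are the curly double quotes in Windows-1252.
    print(to_plain_ascii(b"\x93Hello\x94"))
```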

Over the last year or so, WordPress and A List Apart have converted me from “stick with the lowest common denominator” to “let’s show real typography.” Since the days of the Demoroniser, Unicode has become a standard part of HTML, so modern browsers* can either display a full range of characters or convert them to something they can display. You probably won’t be able to see Chinese text in Lynx, but a properly encoded curly quote—“ or ”—will show up as a plain old ".

For one thing, real typography looks much nicer.

After updating some links, the following dialogue occurred to me:

Sallah: Indy, why does the web… move?
Indiana: Give me the URL.
(The location looks like a Python script)
Indiana: Snakes. Why did it have to be snakes?
Sallah: ASP. Very dangerous. You go first.

(Actually, I have to credit Katie for the Python reference. The first and last lines just popped into my head, though.)

Well, I signed up with Gravatar, mainly so I could test the plugin.

Basically the idea is that you can define an avatar that will follow you around the Internet, anywhere you post. All that’s necessary is for the site you’re commenting on to be Gravatar-enabled at the time someone visits.

The one thing I’m not entirely thrilled about is that it uses your email address as the basis for your ID. They really didn’t have many options to choose from, since most blog comment forms only have space for your name (not always unique), email address, and website (not everyone has one). To avoid publishing addresses accidentally, they hash each one with MD5. (MD5 is a hash function, not encryption: two systems can each generate an MD5 signature from the same data and compare the results, but you can’t restore the original from the signature.)
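As a rough sketch of how that matching works, here is the hashing step in Python using the standard `hashlib` module. The function name `email_fingerprint` is made up for this example, and trimming and lowercasing the address first is my assumption about how two systems would agree on the same input, not something taken from Gravatar’s documentation.

```python
import hashlib


def email_fingerprint(email: str) -> str:
    """Return the MD5 hex digest of an email address."""
    # Assumption: normalize so that "User@Example.com " and
    # "user@example.com" produce the same signature.
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two systems hashing the same address get the same 32-character ID,
    # but neither can turn the ID back into the address.
    a = email_fingerprint("someone@example.com")
    b = email_fingerprint("  Someone@Example.com ")
    print(a == b, len(a))
```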

If you’re interested in Gravatars, head over to their site and see if you agree with their policies. Once you’ve signed up, if you enter your email address when commenting (don’t worry, current and future WordPress versions never display it outside of the admin area), your avatar will show up next to your comments.

Anyway, once I had gravatars showing up, I had to find a layout that (a) looked good and (b) worked in IE. (Yes, that again.)

All the Linux desktop action these days is in KDE and GNOME, but on older hardware, servers, or anything else where you need to squeeze every last ounce of performance from the box, something lighter is needed.

[Screenshot of a WindowMaker desktop] My Linux box at work, a 300 MHz Pentium II, runs WindowMaker. It’s familiar, it stays out of the way, and it doesn’t tie up the memory or CPU that a modern version of KDE or GNOME (or Windows, for that matter) would. But you need to add applets like a clock or a desktop pager. You can find them easily enough (I ended up using the aptly named wmclock and wmpager), but there’s a significant problem with both. WindowMaker lets you change the size of the dock icons, but when I shrank the dock to get more space I discovered that both applets have a hard-coded size of 64×64 pixels.

[Pair of WM Applets, first at default 64x64 size (they look fine), then at 48x48 (they don't adjust and edges get cut off)] As you can see, a 64×64 applet just doesn’t work in a 48×48 space. It surprised me, though, since these dockapps are designed specifically for WindowMaker, and it’s WindowMaker itself that lets you change the size. You open up Preferences, change the size, and restart WM. Just menus and buttons. No config files, no registry, no third-party add-on. This isn’t an esoteric hack that takes serious effort to find; it’s a basic feature. You might as well design a Mac program that assumes the Dock is on the bottom of the screen. For most people it will be, but it’s not rocket science to move it.

In my ICS classes, they always discouraged us from using “magic numbers” — just throwing a number in the code without identifying or abstracting it. There are two very good reasons for this. The first is that you might forget what this 64 is doing. The second is that you might decide to change it later on, and it’s much easier to change one SIZE=64 definition than to track down every 64 and hope you’ve neither missed any you need to change nor changed any you need to leave alone.
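Here is a small Python sketch of the difference. None of this comes from the actual wmclock or wmpager source; the constant `DEFAULT_ICON_SIZE`, the function `applet_geometry`, and the 1/16 margin ratio are all invented for illustration.

```python
# One named definition instead of a bare 64 scattered through the code.
DEFAULT_ICON_SIZE = 64  # pixels per side; hypothetical default


def applet_geometry(icon_size: int = DEFAULT_ICON_SIZE,
                    margin_ratio: float = 1 / 16) -> tuple[int, int]:
    """Return (margin, inner_size) for a square applet tile.

    Everything is derived from icon_size, so a dock that is resized
    only has to pass in a different number.
    """
    margin = round(icon_size * margin_ratio)
    inner_size = icon_size - 2 * margin
    return margin, inner_size


if __name__ == "__main__":
    print(applet_geometry())    # default tile
    print(applet_geometry(48))  # smaller dock, same code
```

Because the margin and drawing area are computed from the tile size, a 48-pixel dock gets a proportionally smaller applet instead of a clipped 64-pixel one.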

Those dock applets are stuck at 64×64 pixels because the programmers were thinking in terms of the pixel grid, not in terms of actual display size.

Some people browse collections. I collect browsers. Mostly I just want to see what they’ll do to my web site, but I have a positively ridiculous number of web browsers installed on my Linux and Windows computers at work and at home, and I’ve installed a half-dozen extra browsers on our PowerBook.

One project I’ve worked on since my days at UCI was a script to identify a web browser. In theory this should be simple, since every browser sends its name along when it requests a page. In practice, it’s not, because there’s no standard way to describe that identity.

Actually, that’s not quite true. There is a standard (described in the specs for HTTP 1.0 and 1.1: RFC 1945 and RFC 2068), but for reasons I’ll get into later, it’s not adequate for more than the basics, and even those have been subverted. That standard says a browser (or, in the broader sense, a “user agent,” since search robots, downloaders, news readers, proxies, and other programs might access a site) should identify itself in the following format:

  • Name/version more-details

Additional details often include the operating system or platform the browser is running on, and sometimes the language.
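As a minimal sketch, a script could split a string in that format into its three parts like this. The regular expression, the function name `parse_user_agent`, and the sample string `ExampleBrowser/1.2 (Linux; en-US)` are all hypothetical; this only handles the basic shape described above, not the many real-world variations.

```python
import re

# Name, a slash, a version, then optional free-form details.
PRODUCT_RE = re.compile(
    r"^(?P<name>[^/\s]+)/(?P<version>\S+)(?:\s+(?P<details>.*))?$"
)


def parse_user_agent(ua: str):
    """Split 'Name/version more-details' into a 3-tuple, or return None."""
    match = PRODUCT_RE.match(ua.strip())
    if match is None:
        return None  # does not follow the basic format
    return (
        match.group("name"),
        match.group("version"),
        match.group("details") or "",
    )


if __name__ == "__main__":
    # Made-up identity string for illustration.
    print(parse_user_agent("ExampleBrowser/1.2 (Linux; en-US)"))
```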

Now here are some examples of what browsers call themselves: Continue reading

A few weeks ago I was looking at the website error logs and noticed some attempts to access images with names like /flash/images/%20%20%20%20%20%20%20ans3.jpg. I got around to looking at it today, and all of them are the same name, all of them from browsers looking at my profile of the Teen Titans, which includes an image called teentitans3.jpg.

I finally realized what’s going on. Some moronic filter has broken up the name not as “teen titans” but as “teen tit ans,” decided it must be porn, and replaced the “offending” words with spaces (%20 is the code for a space in a URL).
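Here is a quick Python sketch of that theory. The `naive_filter` function and its word list are my guess at what such a filter might be doing, not the behavior of any real product; `quote` and `unquote` come from the standard `urllib.parse` module.

```python
from urllib.parse import quote, unquote


def naive_filter(name: str, banned=("teen", "tit")) -> str:
    """Blank out each banned substring with the same number of spaces.

    Hypothetical: this imitates the kind of filter guessed at above.
    """
    for word in banned:
        name = name.replace(word, " " * len(word))
    return name


if __name__ == "__main__":
    mangled = naive_filter("teentitans3.jpg")
    # Spaces are not legal in a URL path, so they are sent as %20.
    print(quote("/flash/images/" + mangled))
    # Going the other way: decode what showed up in the error log.
    print(repr(unquote("/flash/images/%20%20%20%20%20%20%20ans3.jpg")))
```

Four characters of "teen" plus three of "tit" make seven spaces, which matches the seven %20 sequences in the logged requests.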

It really makes me wonder how badly mangled the page looks to these people, especially if it turns out that every instance of the team’s name gets pointlessly erased.

Further reading: The Censorware Project, Peacefire, Electronic Frontier Foundation.

I just caught a reference to Arve Bersvendsen’s EvilML file. What is it? It’s an HTML document designed to exploit the fact that HTML is, technically, an application of SGML, which has all kinds of strange shortcuts you can use. Of course, no one has ever bothered to make a web browser that actually handles all these shortcuts.

It’s hard to describe it. The code is barely readable. The first line of text looks like this: <body<h1<em>Emphasized</> in &lt;h1&gt;</>. No browser in existence is likely to display it correctly, and yet — amazingly enough — it validates…

I already thought that moving to the more rigidly-defined XHTML was a good idea, but suddenly it makes a lot more sense!
