whitehouse.gov moves to Drupal, recovery.gov migrated from Drupal

Whoops! Chris Wilson made one good point in this article, but he also had some big misses. I guess this is a danger for a writer: for example, the “PHP” input filter setting allows JavaScript by default, but Chris didn’t select that setting.

The biggest problem is his claim that Drupal is impenetrable. It is: for many beginners, it has a steep learning curve. But he never makes the connection to the people who actually visit the site: why should your site visitors care? If millions of them show up, and your site keeps working well because it was built on a solid operational foundation rather than on something with a cute-but-heavy GUI on the backend, don’t they benefit? It looks to me like Chris has unfortunately conflated the needs of end-users with the needs of site developers.

Also, I’d like to take my hat off to the organization that landed the $18M contract to migrate recovery.gov into SharePoint. That’s a lot of money for a site laid out with HTML tables and containing leftover hidden cruft like “this Web Part Page has been personalized. As a result, one or more Web Part properties may contain confidential information. Make sure the properties contain information that is safe for others to read. After exporting this Web Part…”


Flush DNS cache in Ubuntu

Interested in flushing your Ubuntu DNS cache? Note: I’m running Jaunty Jackalope as of the date of this post.

Well, Ubuntu doesn’t cache DNS by default. Any cached answers live in your router or on your assigned DNS servers. You could restart your router, if you have access to it. Or wait until the time-to-live has expired.

You can install a local resolver that will cache DNS lookups, if you like. It will speed up your Web access slightly, since name lookups from your Web browser will be answered from the local cache first. I imagine the time you save will be measured in milliseconds.

Do that with:

sudo apt-get update && sudo apt-get install nscd

And to clear your local cache, restart the service:

sudo /etc/init.d/nscd restart
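
If restarting the whole service feels heavy-handed, nscd can also be told to drop just its hosts table. Here is a minimal sketch, assuming nscd is installed, is on your PATH, and that you have sudo rights. The function name flush_dns_cache is made up for this post:

```shell
#!/bin/sh
# flush_dns_cache: hypothetical helper name, not part of nscd or Ubuntu.
# Invalidates only nscd's "hosts" table (the cached DNS lookups) instead of
# restarting the whole daemon.
flush_dns_cache() {
    if ! command -v nscd >/dev/null 2>&1; then
        # No local caching daemon found, so there is nothing to flush here.
        echo "nscd is not installed; no local DNS cache to flush" >&2
        return 1
    fi
    # -i (--invalidate) takes the name of the table to clear; needs root.
    sudo nscd -i hosts
}
```

Run flush_dns_cache after sourcing the snippet. If nscd isn’t installed, it just says so and returns an error.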


Recursively find and list filesize and full path on the command line

Can’t beat the command line for flexibility and power in accomplishing system administration tasks. Here’s one way to recursively list the filesizes and full paths of files with a particular extension from the command line:

nice find . -name "*.swf" -type f -print0 | xargs -0r ls -skS | less

This is a succinct way to say:
“At reduced CPU priority, find all Flash files in the current directory hierarchy, descending to unlimited depth. Print each file’s path (relative to the current directory) on standard output, followed by a null character. Hand those filenames in batches to the ‘ls’ command, which will look up each file’s size, print it in 1K blocks followed by the filename, and sort the results largest first. (If there aren’t any results from the first command, don’t even run the ‘ls’ command, since that would just give us a list of all the files in the current directory.) Finally, send all that output to the ‘less’ command, which will allow me to page through and view it easily.”
To get absolute paths rather than paths starting with ‘./’, start the search from "$PWD" instead of ‘.’.

EDIT: Added -r switch to xargs command to ensure we don’t see a list of all files, if the first ‘find’ command doesn’t find any. That sort of thing could be confusing.
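
One caveat: xargs may split a very long file list across several ‘ls’ runs, so the size sort only holds within each batch. An alternative, sketched here assuming GNU find (its -printf option is not in POSIX), lets find print the size itself and hands the whole list to sort. The function name list_by_size is just for illustration:

```shell
#!/bin/sh
# list_by_size: illustrative helper name. Usage: list_by_size DIR PATTERN
# Prints "<size in 1K blocks><TAB><path>" for every matching regular file,
# largest first. Requires GNU find for -printf.
list_by_size() {
    dir=${1:-.}
    pattern=${2:-*.swf}
    # %k = disk space used in 1K blocks, %p = path as find reached it.
    # Passing an absolute DIR (such as "$PWD") yields absolute paths.
    find "$dir" -type f -name "$pattern" -printf '%k\t%p\n' | sort -rn
}
```

For example, list_by_size "$PWD" "*.swf" | less gives one sorted list with full paths. Filenames containing newlines would confuse the output, so treat this as a convenience rather than something bulletproof.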


Still filtering spam primarily using the “From:” header? Then read this.

I’m working with an organization that has been refusing “share this” e-mails from our Web site; specifically, e-mails that originate at our Web server that have that organization’s domain name in the “From:” header.

Here’s the problem with this. Let’s say that Joe Bloggs works at Bloggy Spot, and his e-mail address is “joe@bloggyspot.com.” His coworker Carl really wants to forward him a relevant article from the Time Magazine Web site, so he fills out the form, enters his e-mail address (which is required), and Joe’s, and hits “send.”

But since that message from Time Magazine does not originate from inside your network — as far as you can tell — and it claims to come from Joe’s coworker Carl (“carl@bloggyspot.com”), you refuse that message. “Sorry, can’t deliver to Joe,” you say. “There’s no way you could be Carl. Carl wouldn’t send e-mail from anywhere other than here.”

Don’t refuse those e-mails. Allow them. Use other, more reliable filtering methods, and be happy.

Why shouldn’t you base your filtering on the From: header?

For two reasons.

First, you’re trying to fight against something that has been part of the nature of e-mail since its beginning, and second, you’re trying to fight against the nature of the Web today.

  1. This has been the nature of e-mail since its beginning.
    The e-mail protocol standard has always allowed e-mail clients, and hence people, to put whatever they want in the “from” box. So from the beginning, conscientious system administrators have had to rely on much more robust methods of content and spam filtering. Looking in the “From:” header for an e-mail supposedly sent from “bloggyspot.com,” and refusing e-mail on that basis, will only make things harder for users. The system administrator at the organization I’m negotiating with did point out that they already have multiple other layers of filtering and spam protection in place. I argued that since that was the case, and since those methods are much more reliable, they should be relying on those instead.

    Perhaps you see the issue: a system that relied only on this level of filtering would be quite easy to defeat, and a system that relied on more filtering than this wouldn’t need this type of quasi-effective filtering anyway.

  2. This is the nature of the Web today.
    When you visit a Web site and forward an article to someone you know, your message in the vast majority of cases comes “from” your e-mail address. Obviously, this is done so that the recipient will be more likely to accept the e-mail when it arrives. The Web’s most popular sites all follow this practice.

    The New York Times, Time Magazine, CNN, and Fox News sites, for example, allow — and in the case of the Times, require — a user to enter their own e-mail address as the “From:” address. Yahoo!, the Web’s third most visited site, does this as well. I’m sure there are many, many more examples.
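
To make the first point concrete: the “From:” line is just text that the sending program writes into the message. Here is a rough sketch of what a “share this” form might generate, using the example addresses from above. The function name and the article URL are made up, and nothing is actually sent:

```shell
#!/bin/sh
# build_share_message: hypothetical helper. Usage:
#   build_share_message FROM_ADDRESS TO_ADDRESS ARTICLE_URL
# Prints a minimal message to standard output. It does not send anything.
build_share_message() {
    from=$1
    to=$2
    url=$3
    # The From: header is whatever the form user typed. Nothing in the
    # message format ties it to the server that generated the message.
    printf 'From: %s\n' "$from"
    printf 'To: %s\n' "$to"
    printf 'Subject: An article you might like\n'
    printf '\n'
    printf 'Your coworker thought you would want to read this:\n%s\n' "$url"
}

build_share_message "carl@bloggyspot.com" "joe@bloggyspot.com" \
    "http://www.example.com/article"
```

A filter that only inspects that first line can’t tell this message from one a spammer built the same way, which is why the other layers of filtering matter more.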

Spam is a big problem for organizations, but when filtering spam, you’ve got to choose your battles carefully. If you hamstring your users too much, the costs probably won’t be worth the benefits.