dtm | Entries tagged with tech

So many, many, many moons ago (over seven years!) I posted about a Greasemonkey script that provided a comment killfile. (See that page if you're unfamiliar with the concept.)

It was popular and useful for a while, but mostly succumbed to bit rot while I was at Google and I hadn't had a chance to play with the code in a modern web environment until recently.

I now have an experimental version of my old killfile redone as a Chrome extension, and am looking for a few volunteer early testers to find the places where it should work but doesn't yet. I'm also interested to see if the Chrome sync stuff works for people who aren't me. (In theory, all your Chrome devices should know about who you've tagged as a troll.)

Now I'm providing even less support for this version than I provided for that greasemonkey script, so don't even bother asking unless you feel comfortable following the steps on this page under the heading "Steps on adding extensions from other websites".

But if you'd like to test this thing send me an email at martin -at- snowplow -dot- org.

Yeah, yeah. First post in a long time, first post of the new year, first post with a new US President, etc. Assume I've groveled sufficiently for not posting in ages.

So the other day I visited a website belonging to a friend and my web-browser completely freaked out, blocking the site and saying that it linked to all sorts of disreputable places. That is to say, it warned that my friend's site was including material from sites that attempted to install all sorts of malware.

Now, this was odd, but occasionally a bad ad can get into an ad network, and then everyone showing those ads is accidentally displaying malware. I figured that was what had happened, since when I went back and visited the site again, all was fine.

Only, after reloading her page I discovered that my friend doesn't run ads on her site.

( A more detailed description of what was going on, probably of interest only to techies )

Yeah, yeah. Haven't posted in two months, blah blah blah.

And this isn't going to be a life update post either. Sorry. People who are not CS types (and this is much more academic CS than practical CS stuff) should just skip this post.

Now, I present some gloriously ridiculous, useless code, done primarily to show that I could do this in java, and do so in a typesafe manner.
( A demonstration of the Y-combinator in java )
Well, that was a bit interesting. Get your head around how it works, if you've never worked all the way through the Y combinator before. The relevant Wikipedia article might be useful. The translation from Scheme to Java was fairly straightforward; the trick was getting all the types to work out properly.

But wait, there's more! That code there isn't really reusable at all. What's really needed is a version that works with arbitrary types:
( A generic version )
Oh, and for those of you worried that I'm giving away internal Google secrets by using the interface com.google.common.base.Function, that's already been open-sourced as part of the Google Collections Library.
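For anyone who wants the shape of the trick without expanding the cuts, here is a minimal sketch in Python. It is not the Java code from the post and it sidesteps the typing problem entirely, since Python is dynamically typed. Because Python evaluates arguments eagerly, it uses the eta-expanded variant of the combinator (often called Z); the names `Y` and `fact_step` are made up for this illustration.

```python
# Fixed-point combinator for a strict language (the "Z" variant of Y).
# The inner "lambda v: x(x)(v)" delays the self-application so that
# building the recursive function does not loop forever.
Y = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# One "step" of factorial. It never refers to itself by name; the function
# to recurse with is handed in as the argument rec.
fact_step = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

factorial = Y(fact_step)
print(factorial(5))  # prints 120
```

The point of the exercise is that `fact_step` contains no recursion of its own; all of the self-reference lives inside `Y`.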

Some people might deduce from my posts this past month that either I have abandoned livejournal or that nothing happened during June. The latter is not true, and I don't think that the former is either, but I still can't get into the regular blogging habit.

#include <std/apologies/long_journal_update_lag.h>

So now for something completely different; namely, a rant about a particular instance of idiocy common as dirt on the web:

Dear web page designers:

See all the hits this search generates? Almost without exception, every result there is either telling you how to do something that's a bad idea or asking about details of how to implement a bad idea. What I'm talking about is this: most of the time when you need to type in an actual physical address online, there will be a text box for your name, one for your street address, one for your city, one for your zipcode, and a drop-down box for your state.

Now, every state has this convenient little two-letter abbreviation, and anyone typing in an address in a particular state already has the abbreviation quite literally at their fingertips. Why is it that I can't type "NJ" into the state box? Why do I need to type in "NNNN" or "NNNNN" or "NNNNNN" to get my state? (Depending on whether the creator of the box has included Canadian provinces in the drop-down or not, and on whether they sorted the options in the dropdown by abbreviation or by state/province name) I can't even get a standard bunch of keypresses memorized to type my state!

This is not a new issue; there was a use-it.com article about this in 2000. The big boys (aka amazon) do this correctly, with a text box. And yet still, you'll find advice like this (from a page supposedly made in 2005):

Sometimes you may want to replace text fields with drop-down menus. This might be because selecting from a menu is easier than typing. But it could also be because the script that handles the form can't interpret just any text entry.
For example, you will often be asked to choose your state from a drop-down menu. This might be because picking it from the menu is easier than typing the name of the state.
Along the same line, you may often asked to enter the 2 letter initials of your state from a drop-down menu as well.
This could prevent confusion for the script that handles the form input. If, say, the script was programmed to only accept capital letters, then a drop-down menu would secure that no invalid entries were made.

Aaaaaaaaaaaaaaaah!

Look, if the script on the back end blows up when given input like that, then fix the script. Or make a wrapper around it to validate data first, re-presenting the form to the user if they type in a bad state abbreviation. (Since for security reasons, you need to do that already anyway, and you know it.)
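A wrapper like that is only a few lines. The following is a rough sketch in Python, assuming a form handler that receives its fields as a dictionary; `VALID_STATES`, `normalize_state`, and `handle_form` are names invented for this example, and the state list is deliberately cut short.

```python
# Two-letter codes accepted by this sketch. A real form would list all of
# them; this sample is abbreviated on purpose.
VALID_STATES = {"NJ", "NY", "PA", "CA", "TX"}

def normalize_state(raw):
    """Return the canonical two-letter code, or None if it isn't recognized."""
    code = raw.strip().upper()  # accept "nj", " NJ ", "Nj", ...
    return code if code in VALID_STATES else None

def handle_form(fields):
    """Hypothetical handler: returns ("ok", cleaned) or ("redisplay", errors)."""
    state = normalize_state(fields.get("state", ""))
    if state is None:
        # Show the form again with a message instead of blowing up downstream.
        return ("redisplay", {"state": "Please enter a two-letter state abbreviation."})
    cleaned = dict(fields, state=state)
    return ("ok", cleaned)

print(handle_form({"city": "Trenton", "state": "nj"}))
```

Normalizing case in the wrapper also takes care of a back end that only accepts capital letters.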

Don't make me guess how many "N"s to press, and don't make me jump back to the mouse, especially when you lay out your form like a standard US postal address so that I'm jumping back to the keyboard for the zipcode anyway.

Is this a small annoyance? Absolutely. It's trivial in the grand scheme of things, and even fairly small in the world of web-based annoyances. But it's totally unnecessary. In fact, implementing the drop-down is probably more work than putting another text box there. Please - go play an extra minute or so of solitaire instead of making yet another state drop-down box. Your time will be better spent.

From the quiz solution summary on http://rubyquiz.com/quiz122.html: (ellipses in the original)

You have shown me the light and it tells me... Daniel Martin is crazy. I'll leave it to him to explain his own solution, as punishment for the time it took me to puzzle it out. I had to print that Array inside of the inject() call during each iteration to see how it built up the answer.

Update: I did in fact accept my punishment and post an explanation of my twisted implementation of the Luhn algorithm.
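For reference, the Luhn check itself is simple when written the boring way. This is a plain textbook sketch in Python, not the inject()-based Ruby solution the quiz summary is talking about; the function name `luhn_valid` is just a label chosen here.

```python
def luhn_valid(number):
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]  # ignore spaces and dashes
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0

print(luhn_valid("79927398713"))
```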

Update: Sorry; anonymous comments aren't shown on this entry any more. Three spam deletions is my limit.

Continuing my pattern of occasional technical posts just so that my journal won't be completely dormant, here's another one:

If you do much web development at all, you probably work with a template language of some kind. You know, the kind of thing where you write HTML with various placeholders in spots that get filled in by the web application - examples include jsp pages, Django's template system, Smarty templates, PHP pages, or HTML::Mason.

Anyway, the problem with virtually every HTML templating language out there is that they make it easier for the person writing HTML templates to add an XSS hole than to avoid it. This isn't a matter of making it possible for page writers to shoot themselves in the foot - that's always going to be possible, given any reasonable system - it's a matter of making it easier to do than to avoid.
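One way to read "easier to avoid than to do" is a template layer that escapes every value unless the page writer explicitly says otherwise. The toy below is a sketch of that idea in Python, not any of the systems named above; `render` and `Raw` are names invented here, and it leans on the standard library's `html.escape`.

```python
import html

class Raw:
    """Explicit opt-out: wrap a value in Raw() only if it is trusted markup."""
    def __init__(self, markup):
        self.markup = markup

def render(template, **values):
    """Fill {name} placeholders, HTML-escaping every value by default."""
    filled = {}
    for name, value in values.items():
        if isinstance(value, Raw):
            filled[name] = value.markup             # writer asked for no escaping
        else:
            filled[name] = html.escape(str(value))  # the safe path is the lazy path
    return template.format(**filled)

print(render("<p>Hello, {name}</p>", name="<script>alert(1)</script>"))
```

With this arrangement the XSS hole requires extra typing (`Raw(...)`), which is the reverse of the usual situation.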

( More for people who've ever worked in such environments )

I started, and at some point may continue, a big long livejournal post about a rather technical topic - ways in which people make themselves vulnerable to XSS attacks - when I ran across this example that is just too horrid not to post about on its own.

( How to achieve triple vulnerability to XSS attacks )

Update: I had a technical detail wrong, which must make writing browsers painful in trying to parse tag-soup HTML.

So I've thought about several different journal entries I could write lately, but I somehow just don't feel I have the requisite will-to-type to write any of them. So here are some scattered thoughts I'm not journaling about:

Katherine: ( 3 topics, 107 words )
Personal: ( 4 topics; 418 words )
Technical: ( 5 topics; 345 words )

I call this a parable because although I'm sure that there's a lesson to be learned from this, I'm not quite sure what it is. I do have certain points in this story that feel important, and I've labeled those as "parable events". As I said, though, I leave the conclusions to the reader.

My employer (call them "company E") has over the past year tightened up the corporate network, including restricting outgoing connections to nothing other than ftp and web browsing.

The TCP-literate will wonder if what I actually mean is that company E is restricting outgoing connections to ports 20, 21, 80, and 443. In fact, that's exactly what I mean. Connections to those port numbers are allowed and others are not. In theory, one can submit tech. requests to network engineering if there is a business reason to allow some other type of access to a certain location.
( Read more... )

This is a story about an exploit that didn't happen. Mostly, because I chickened out.

The short version is this: for a while, there was a bug in the way Google accepted user preferences that meant that it was possible to create an <img ...> tag such that anyone who looked at a page containing the image would have their Google preferences changed. Think about this for a second. See that blank box above the smiley face? If Google were still vulnerable to this exploit, you'd see a second little smiley face in that box. Also, merely by loading your friends page with this entry on it, your Google preferences would have been changed to whatever I picked. In this case, to have the Google buttons and all explanatory text switch to Arabic, to search only for pages in Chinese or Japanese, and to display only one result per page. You'd see this when next you used Google, whether you used it directly by visiting http://www.google.com or through some browser plugin.

Now, freerepublic.com (no, I'm not linking to them) is frequently visited by people who would at the least be freaked out by something like this. Furthermore, it's full of people I wouldn't mind freaking out. Also, it encourages semi-anonymous users to post images in the comments. At least as of Halloween, Google hadn't fixed this exploit. Think about it: right before the election some of the more rabid online right-wing activists have their ability to use Google taken away from them in what looks like an islamofascist plot...

Anyway, as I said, I chickened out. I don't know if there's some sort of moral or lesson here - except that web application security is so difficult that even Google can get it wrong in potentially embarrassing ways - but it kind of seems like there ought to be. If anyone cares about the technical details behind the flaw you can read about it by googling "google setprefs xsrf" and see more details about my specific way to exploit it by looking at what http://xrl.us/rv5j/smile.gif gets you when you feed it through wget.

(And yes, I'd reported this to Google on September 25th, but I wasn't the original discoverer of the flaw itself. Encoding the evil into an image tag was my own creation, as was the exploration of how much evil could be encoded into one little picture.)

Or, poke yourself, or something. Whatever's appropriate.

I've got a big huge problem to solve at work and I'd like to look up what's in the literature on solving problems like it, but I'm sufficiently distant from academic computer science that I don't know what the term is for this type of problem or where to begin.

Or, as mizkit said, “Help me LJ-wan Kenobi! You're my only hope!”

( The problem description )

This is another computer-geek post, because I was thinking about it today, and thought it deserved to get documented. Most of my friends will not care, so...
( How to allow users to change their own passwords when using subversion *without* apache )

So one thing I discovered as I went back over my old posts is that I used to do some serious computer geeking in my posts. I mean, seriously.

And I haven't done any computer geeking on livejournal in a while, so:
( The insanity that is the jvm's shutdown sequence )

This is a technical/work entry. I'll do a Katherine update at some point in the not-too-distant future.

Occasionally, even the most insulated-from-the-customer programmer must write some sort of documentation. (I'm actually not nearly as insulated from the customer as I might pretend to be) Recently, a piece of install documentation that I had been maintaining in reStructuredText was officially transferred to the publications group, and since then it has been a Word document, though I am still expected to submit updates. At one point I noted that I would be much happier maintaining that documentation in the plain text format I had been using before, and was asked by publications “Dan, what do you want to do with the doc in plain text that you can't do if it's in Word format?” — this is my reply to that question, with some slight rewording and elaboration.
( Why I find Word inadequate for maintaining technical documentation )

Well, I've almost let another month go by without writing anything here.

I do possibly have some interesting news (that people on TooMUSH are already aware of), but I'm not going to say anything about that here at this time until I have more to report. (How's that for cryptic?)

In the meantime, though, I'm going to make a prediction and record it here so that it's all properly timestamped:

Software aimed at the average programmer will succeed or fail based on whether or not someone in India can find the manual online and print it out.

This means that having the only useful English documentation to your product be in a book selling for $30 at the local Barnes and Noble is not sufficient. It also means that webbook-style manuals alone are not sufficient. Sybase, which has had its documentation in webbook-style manuals forever, has figured this out - note how they have here both the online and pdf versions; contrast this to Microsoft, who seems to think that it's okay to make documentation freely available, but not easily printable. PDF files that are locked so that they can't be printed are likewise worthless for getting your documentation to that printer in India.

I reach this conclusion based on the number of hits I receive to my personal website from www.google.co.in from people searching for "ant manual pdf". It's been steadily growing, and I haven't been updating the pdf manual I have - it's still the manual for ant 1.4, some three or four years out of date at this point. Those programmers in India really want the documentation in printed form.

This isn't to say that I think it would be worth translating most documentation into Hindi - first off, most programmers in Bangalore are as likely to have Kannada or Gujarati as their home language, and secondly it's not really about India per se. India's just a convenient term for "wherever people are outsourcing programming to these days". I'm also getting a tolerable number of hits from Singapore, Romania, and Hungary. I even got a hit from Sri Lanka this past month, with a search for "Ant documentation Manual PDF".

So in the effort to get me posting about anything, and prevent my journal from idling for another month, I'm going to talk about a toy I bought myself on eBay a few weeks ago.

It's a slide rule.

It's a moderately nice one, though the sight on it is slightly askew, such that if I don't touch it, the line isn't quite vertical. It's only a 10 inch model; the really accurate 20-inch and longer ones aren't selling on eBay, or are too pricey.

It has scales L, C, D, A, B, K, Ci, CF, DF, CiF, S, T, ST, LL1, LL2, LL3, LLO, and LLOO.

Just think - there was a time when every engineering student would know what all of those scales meant. I've included a discussion of what those scales mean, and what it means I can compute with this, below the cut.
( The rest gets rather math-heavy )

In many ways, people use blogs much as usenet was used back in the day. That is to say, some of the common behavior patterns that used to show up on usenet appear to have shown up again in blog comment threads.

Specifically, I'm talking about trolls - people who show up in a comment thread to derail the discussion, and do so again and again and again. Examples might include someone who shows up in the comment threads of a blog discussing recent work in microbiology to argue yet again for Intelligent Design theory to be taught in the schools, or someone who shows up repeatedly on a feminist-oriented blog to snipe about how a woman who is raped after voluntarily consuming alcohol deserves what happened to her.

Trolls are not necessarily bad people (though some probably are), but they are behaving obnoxiously in context. In usenet days, there was something known as a "killfile" which would cause your news reader to simply not display posts from users you didn't like for one reason or another. Wouldn't it be nice if you, the blog reader, could simply mark certain commenters as annoying, and automatically have their blog comments hidden when you looked at your favorite blog's comments page?

Well now you can

That is, you can if you are using Firefox to browse the web and feel like installing a few things. First off, go install the latest version of greasemonkey. Then, go install this greasemonkey script. This provides the ability to ignore people on some weblogs where I wanted that ability. So far, it covers:

livejournal, though I haven't tested all the obscure different styles available there yet.
pharyngula, and other blogs hosted on scienceblogs.com
slacktivist, and other blogs on typepad.com
The Panda's Thumb
Alas, A Blog
Feministe

It's pretty easy to add a new blog at this point, assuming that the HTML around comments is well structured, so expect the list of supported blogs to grow. Among other things, I intend to add support for comments hosted at haloscan.com (which covers several big blogs, such as Eschaton and Digby's blog) by the time the weekend is over.

Update: Haloscan-based comments are now handled, as are livejournal community and syndicated feed pages. (Which were only excluded because of a bug before)

Update again: Pandagon is now covered and there is experimental support for killfiling people from your livejournal friendslist (e.g. if you like to read a community except for one particular poster). Note that the friendslist support is disabled by default; to enable it you'll have to edit the script to remove the comment markers inside the variable "scenariolist" from the beginning of the line that says "ljfriendsScenario". (You can edit installed scripts from the "Tools-Manage User Scripts..." menu)

As I may have mentioned here before, I work from home four days out of five. Recently, everyone in the office was forced to switch email programs to Outlook because we're now using the corporate Outlook server.

Now, this morning at around 9:15, the office network went down, which naturally meant I was kicked off VPN. Because I'm using Outlook and was composing a message when the network went down:

I can't finish and send my message, and have it just sitting in the outbox until the network comes back up.
I can't access any email that was stored on the server, though I can still see all the subject lines and who sent stuff to me.
Every now and then while I was typing the rest of my message Outlook would freeze for a couple of seconds as it desperately tried to communicate with home base.

Now, if I am disconnected from the network when I start Outlook, I can choose to work offline. However, when I do that (stopping and restarting Outlook), I no longer have any email sent after about 2pm yesterday. Specifically, I don't have the message I was replying to when the network went down, nor do I have my partially typed response. (Oh, I still have the text, cut and pasted into Notepad, but it's pointless without what I was responding to.) This despite the fact that Outlook supposedly did a synchronization this morning at 8, unless the "Send/Receive" button and messages in the status bar about "Synchronizing..." mean something different from what they should.

Life was much better when all email was on the local IMAP server and those of us who wanted to used Mozilla. Disconnected operation just plain worked. Why oh why are "enterprise" solutions so fragile?

So going back to my post about pow.png, I find myself in the odd position of hosting an image file that occasionally shows up in the comments section of freerepublic.com.

Now, wanting to tweak the freepers, and considering that I have at my disposal the full power of mod_rewrite (which would, I suppose, allow me to redirect to a cgi script, and could therefore do anything), I ask you: what should I do?

I'm thinking that really effective tweakage would require me to replace pow.png only selectively; say, once per day per client IP address, and of course only if the referrer setting was either missing or something on freerepublic.
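The decision logic, at least, is easy to pin down. Here is a sketch in Python of what such a cgi script might check before swapping the image, assuming it is handed the client IP, the referrer header, and today's date; the class name `SubstitutionPolicy` and the in-memory set are illustrative choices, and a real script would need storage that survives between requests.

```python
from datetime import date
from urllib.parse import urlparse

class SubstitutionPolicy:
    """Decide whether a request for pow.png should get a substitute image."""

    def __init__(self):
        # (client_ip, day) pairs that have already been served the substitute.
        self._served = set()

    def should_substitute(self, client_ip, referrer, today):
        # Only act when the referrer is missing or points at freerepublic.
        if referrer:
            host = (urlparse(referrer).hostname or "").lower()
            if host != "freerepublic.com" and not host.endswith(".freerepublic.com"):
                return False
        # At most once per day per client IP.
        key = (client_ip, today)
        if key in self._served:
            return False
        self._served.add(key)
        return True

policy = SubstitutionPolicy()
print(policy.should_substitute("192.0.2.7", "http://www.freerepublic.com/", date(2005, 1, 10)))
```

Anyone who reloads to double-check gets the ordinary pow.png for the rest of the day.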

For instance, what should pow be replaced with in this thread?

I was also thinking that before I did this, I'd place a nice little circle-C mark on pow.png, just to make it absolutely clear when someone does rip it that they're committing copyright violation. Not that they're likely to care, but people should at least make their own stupid little word balloons. It's not hard.

Update: I was lazy, and just went with the antimagnet image. I also haven't put a little circle C on pow.png because, well, what I really want is to annoy the freepers out of my access logs, and I don't think it's worth it going to extraordinary lengths to beat them about the head. Besides, the circle-C hasn't been required to confer copyright for years.

So this is my first entry in a while and, sorry, but I'm not going to talk about Katherine. (she turned 1 last Monday, and there was cake and a multitude of relatives and cameras, and maybe I'll post a Katherine update later today)

Instead, I'm going to talk about stuff I found in the webserver log.

Recently (late last week) I cleaned out my old account at the Johns Hopkins Math department. Among the other things I did was to drop a little Redirect directive in the .htaccess file there to point to my current web home. This means that requests for files that used to be in my math.jhu.edu account are now bounced automatically to my snowplow.org account.

As I never had access to the webserver logs on math.jhu.edu, I never knew if my pages were being looked at, or, if so, which pages were popular, in terms of being linked to from other places. Well, as I've now found out, apparently the most interesting thing in my old web account (in terms of hits) was this stupid image. That's right, it's just a "POW!" with a border around it. I'm not even sure when or why I made it - playing around in gimp, I suspect.

So now I get to see from the logs what pages link to that image, and it's a little bizarre how it's spread. Below I give the referrer and the first time the snowplow.org logs see someone looking for pow.png from that page: (not necessarily when requests started, since I don't see what went to math.jhu.edu) (All times Eastern Standard Time)

Referrer	First request seen (EST)
a google image search for "pow"	07/Jan/2005:13:58:44
user profile at http://www.myspace.com/	07/Jan/2005:14:28:18
http://cactokaur.blog.com/ (in Spanish)	07/Jan/2005:17:19:34
http://boards.theforce.net/ Revenge of the Sith discussion	07/Jan/2005:17:43:52
The freepers use "pow" too	07/Jan/2005:21:28:24
Something on driven2modify.com	08/Jan/2005:00:24:55
Blog for "kasha_teishu" on xanga.com	08/Jan/2005:02:08:30
http://maroc-chat.de/forum-thread-539--.html	08/Jan/2005:09:40:40
a post on www.blackcatbone.34sp.com	08/Jan/2005:14:25:15
a post on forum.surfermag.com	08/Jan/2005:17:08:31
sealiontavern.nobledead.com	11/Jan/2005:12:07:06
Another xanga.com blog (PermitaSer)	11/Jan/2005:18:29:53
an old xanga.com blog post (warning: background music and probably NSFW)	11/Jan/2005:20:38:19
A post (currently unreachable) on gaiaonline.com	11/Jan/2005:23:02:53
Someone's hotmail email	12/Jan/2005:10:01:28
ekok.nl post	12/Jan/2005:10:49:51
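A list like the one above can be pulled out of an access log with a few lines of script. The sketch below is in Python and assumes the server writes Apache's "combined" log format; the regular expression, the function name `first_seen`, and the sample log line (including its IP address and the `/~dtm/` path) are all made up for illustration, not taken from the real snowplow.org logs.

```python
import re
from datetime import datetime

# Matches the parts of an Apache "combined" log line that matter here.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<when>[^\]]+)\] "\S+ (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)"'
)

def first_seen(lines, path_suffix="/pow.png"):
    """Map each referrer to the earliest time it requested the image."""
    earliest = {}
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or not m.group("path").endswith(path_suffix):
            continue  # unparseable line, or a request for something else
        when = datetime.strptime(m.group("when"), "%d/%b/%Y:%H:%M:%S %z")
        ref = m.group("referrer")
        if ref not in earliest or when < earliest[ref]:
            earliest[ref] = when
    return earliest

sample = [
    '192.0.2.1 - - [07/Jan/2005:14:28:18 -0500] "GET /~dtm/pow.png HTTP/1.1" '
    '200 1234 "http://www.myspace.com/" "Mozilla/4.0"',
]
for ref, when in sorted(first_seen(sample).items(), key=lambda item: item[1]):
    print(ref, when.strftime("%d/%b/%Y:%H:%M:%S"))
```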

"Pow" gets around. (I omitted other google image searches which found it, such as "POW", "Pow", and "pow!") It also probably points to the idea that "most links to" is not necessarily a good measure of "most interesting", "most valuable", or "most useful". Or maybe I'm just deluding myself about the usefulness of anything else I've ever put up on the web.


Daniel Martin


Page generated Jul. 12th, 2025 11:11 am


My personal blog, and not in any way the opinions of my employer


Killfile, reloaded.

Something noticed in malware distribution - one-time malware redirects

A bit of truly ridiculous java geekery

So nothing for June, and a rant

It's official

Most HTML templating languages are written incorrectly

How to make your webapp amazingly vulnerable to XSS attacks

Scattered thoughts

A short parable on network security

ObNonHack

Poke that computer scientist sitting next to you

How we set up password changing in subversion

How Java shuts down

Why I don't like Word for writing documentation

Another month

Obsolete technology geekery

Got Trolls? Usenet-style killfile for lj comments

Reason I hate Outlook number 3241

What images could I use?

The story of pow