As a point of note, "cn" as a word suffix (regexp search: cn$) doesn't appear at all in /usr/share/dict/words on this here unix system.
So some sort of check for /cn["\/]/ or something like that (that is, "cn" followed directly by a quote or a slash) ought to weed out everything in the .cn TLD.
This isn't much use if you want to link to legitimate Chinese sites, but realistically it ought to work as a red flag for auto-moderation of blog comments and similar.
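
For what it's worth, here is a rough sketch of that kind of check in Python. The function name looks_like_cn_link is made up for illustration, and the pattern is my own tightening of the idea rather than anything tested against real spam: it assumes a hostname ends at a quote, a slash, a port or query delimiter, whitespace, or the end of the string, and it requires the leading dot so that things like "cnn.com" don't trip it.

```python
import re

# Assumption: ".cn" counts as a TLD only when it is immediately followed by
# something that can end a hostname (quote, slash, colon, ?, #, whitespace)
# or by the end of the string.
CN_TLD = re.compile(r'\.cn(?=["\'/:?#\s]|$)', re.IGNORECASE)


def looks_like_cn_link(text: str) -> bool:
    """Return True if the text appears to contain a link into the .cn TLD."""
    return CN_TLD.search(text) is not None


if __name__ == "__main__":
    comment = '<a href="http://example.cn/page">cheap stuff</a>'
    if looks_like_cn_link(comment):
        print("hold comment for moderation")
```

A hit here is probably best treated as a reason to queue the comment for review rather than to drop it outright, since legitimate .cn links will match too.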
Re: gstats code getting sneakier