Tuesday, September 25, 2007

Browsing the Afribone... To run yourself!

Using Google's translation engine, a web site from Guinea reads like a medieval thriller!

Burden



The strike of syndicaleux paralyses the country. Forces of the ogre and demonstrators clash. Tabassés citizens, prefects driven out, ransacked residences, died and interpellations. To run yourself!



Rehandling



On January 19 at the evening, Fory Coco got rid of Fodé Bangoura, its ex Petit President. Large
snub for trusty servants. To run yourself!




RTG



Significant ultra hearts, shocked by the way in which Radio operator Gbantama and the TV Coconut covers
burden, are not long in being moved. To run yourself!



Futurelec business - Guinean State Necessary is to seize or save?




Source URL:
http://www.afribone.net.gn/lynx/

Translation URL:
http://translate.google.com/translate?hl=en&sl=fr&u=http://www.afribone.net.gn/lynx/&sa=X&oi=translate&resnum=19&ct=result&prev=/search%3Fq%3Dsite:.gn%26num%3D20%26hl%3Den%26safe%3Doff

Monday, September 24, 2007

Art is...

Someone needs to send a memo over to the folks at ArtIsAnalCheese.com: Your domain name, strategically sliced and diced, makes for a pretty unsavory idea. Art is anal, but why bring cheese into the mix?

I'm reminded of another unfortunate URL I saw in Manhattan: The One Ill Building. I don't think I'd want to live in the only ill building.

Friday, September 21, 2007

.htaccess: Invalid command 'UUUUUUUUUUUUUUUUUUUUUU'

I've never seen this error before. It appears that this error was logged 6 times during the split second it took to save a change to my .htaccess file. During that split second, 6 attempted page views returned 500 Internal Server Error responses. Weird.

[Fri Sep 21 12:36:08 2007] [alert] [client xx.xx.xx.xx] /var/www/html/.htaccess: Invalid command 'UUUUUUUUUUUUUUUUUUUUUU', perhaps mis-spelled or defined by a module not included in the server configuration

Thursday, September 13, 2007

&^@_=

Strange how some arbitrary strings produce empty results pages, but others do not. I was honestly looking for pages containing strings of = characters as separators.

=====

+++++

&&&&&

@@@@@

_____

Monday, September 10, 2007

The Things Google Remembers



I did not know that Google saved search records for so long. Today is September 10, 2007, and I used Google to search the name of an author. I thought her name sounded familiar, and now I know why. I visited a page of hers almost 2 years ago, as Google helpfully pointed out. I guess I knew I was doing this -- allowing Google to save my searches and to remember which pages I actually clicked on -- but somehow I thought that this search data was being stored locally on my PC. That does not seem likely since this PC is new as of January, 2007.

Friday, September 7, 2007

phx.gbl, keymachine.de, et al.

Gazing at my access_logs this week, I learned a couple of interesting things.

Evidently, Microsoft is dabbling in the world of referer_spam and bogus hostnames to clog access_logs with confusing junk. This is apparently being done as some kind of quality test.

I have noticed this for months, and even though I think I understand what might be going on behind the scenes at Microsoft, I am still puzzled by this rather blunt implementation.

Long story short, if you see hostnames like this in your access_log, it is evidently Microsoft running some kind of s00per-s3kr3t QA script comparing its live.com search results with the pages in its index.

bl2sch1081908.phx.gbl
bl2sch1082210.phx.gbl
bl2sch1081904.phx.gbl

(basically, anything appearing to come from the *absolutely meaningless* .gbl top-level domain)
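
If you want to pull these out of your own logs, matching on the hostname suffix is probably enough. This is only a rough sketch: it assumes the server logs resolved hostnames (not bare IP addresses) as the first field of each line, and the function name gbl_hosts is made up for illustration.

```python
def gbl_hosts(log_lines):
    """Return the distinct client hostnames ending in .gbl, in first-seen order.

    Assumes the client hostname is the first whitespace-separated field
    of each log line (true only when hostname lookups are logged).
    """
    seen = []
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0].lower().rstrip(".")
        if host.endswith(".gbl") and host not in seen:
            seen.append(host)
    return seen
```

Feeding it the lines of an access_log (for example, an open file object) gives back a short list of the bogus hostnames to eyeball.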

I am happy to know that Microsoft is working to improve its Live.com search results, but the vague and clumsily stealthy way of doing it is, well, puzzling.

Another thing I learned this week is that there are a lot more spam drones masquerading as Googlebot than I realized. I usually filter out Googlebot traffic from my access_logs, as I know my sites are well indexed and I do not feel a need to monitor Googlebot traffic to the sites on a daily basis.

I turn that filter off once in a while, though, and I was a little surprised yesterday to see hits coming from a rogue drone at (where else?) keymachine.de claiming to be Googlebot:

ns.km30217.keymachine.de - - [06/Sep/2007:14:55:37 -0400] "GET / HTTP/1.1" 206 14484 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ns.km30217.keymachine.de - - [06/Sep/2007:14:55:39 -0400] "GET / HTTP/1.1" 206 14484 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ns.km30217.keymachine.de - - [06/Sep/2007:14:55:41 -0400] "GET / HTTP/1.1" 206 14484 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

The hits all return status 206 (Partial Content), the response to a request for only part of a page, which is unusual for small HTML pages.

Keymachine.de (owned by Keyweb.de) is a German hosting company that I know only for the referer_spam and abusive drone traffic it sends my way. It is mostly banned from my sites but I had not noticed its reappearance as a phony Googlebot until recently.
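
A rough way to flag impostors like this is to compare the logged hostname against the domains real Googlebot crawlers resolve to. The sketch below assumes combined-format log lines like the ones above with resolved hostnames in the first field; the names LOG_RE, GOOGLE_SUFFIXES and is_fake_googlebot are my own. It trusts whatever hostname is in the log, so it is a first-pass filter only: it does not do the forward DNS lookup needed to confirm that a hostname really maps back to the client's IP address.

```python
import re

# Combined log format: host ident user [time] "request" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)

# Assumption: genuine Googlebot hosts reverse-resolve under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_fake_googlebot(line):
    """True if the user agent claims Googlebot but the host is not Google's."""
    m = LOG_RE.match(line)
    if not m:
        return False  # unparseable line: do not flag it
    host, _status, agent = m.groups()
    if "googlebot" not in agent.lower():
        return False  # not claiming to be Googlebot at all
    return not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)
```

Run against the three keymachine.de lines above, each one is flagged, while a hit from a hostname under googlebot.com with the same user agent is not.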