Wednesday, November 11, 2009

Xgrid: finale

It is just too weird for words. I spent at least 10 hours between yesterday and today, trying everything I could think of to figure out why Xgrid was knackered.

The jobs I try to run in Terminal hang, while those I try to submit give an id which then yields nothing when I request results. They are visible in Xgrid Admin as "pending" but nothing can persuade them to change that status.

• I found out that xgrid logs to /var/log/system.log

Depending on what I've been doing, I get a variety of errors there. At one point I had (in part):

Nov 11 17:04:50 localhost xgridagentd[20]: Notice: agent connected to controller "localhost" address "" port "4111"
Nov 11 17:04:50 localhost xgridagentd[20]: Warning: agent error opening connection to controller "localhost" (error = Close requested)
Nov 11 17:04:57 localhost xgridcontrollerd[19]: Notice: controller role changed to MASTER

As I say, a huge variety of errors and no consistency. Sometimes complaints about passwords, sometimes about databases. Sometimes things closing on their own. The kitchen sink.
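To get a handle on that kind of variety, it can help to tally the entries by daemon and severity rather than eyeballing them. Here is a minimal Python sketch, assuming every Xgrid entry in system.log has the same shape as the three lines quoted above (timestamp, host, daemon[pid], level, message). The regular expression and the function names are my own illustration, not anything Xgrid provides, and other entries may well be formatted differently.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   Nov 11 17:04:50 localhost xgridagentd[20]: Notice: agent connected ...
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<daemon>xgrid\w+)\[(?P<pid>\d+)\]: "
    r"(?P<level>\w+): "
    r"(?P<message>.*)$"
)


def parse_xgrid_lines(lines):
    """Return one dict per line written by an xgrid* daemon; skip the rest."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def count_by_daemon_and_level(entries):
    """Tally entries by (daemon, level), e.g. ('xgridagentd', 'Warning')."""
    return Counter((entry["daemon"], entry["level"]) for entry in entries)


# The three lines quoted in this post, used as sample input.
SAMPLE = [
    'Nov 11 17:04:50 localhost xgridagentd[20]: Notice: agent connected to controller "localhost" address "" port "4111"',
    'Nov 11 17:04:50 localhost xgridagentd[20]: Warning: agent error opening connection to controller "localhost" (error = Close requested)',
    'Nov 11 17:04:57 localhost xgridcontrollerd[19]: Notice: controller role changed to MASTER',
]

if __name__ == "__main__":
    counts = count_by_daemon_and_level(parse_xgrid_lines(SAMPLE))
    for (daemon, level), n in sorted(counts.items()):
        print(daemon, level, n)
```

To run it against the real log, pass `open("/var/log/system.log")` to `parse_xgrid_lines` in place of `SAMPLE`.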

It does not seem that the agent needs to be idle:

sudo cat /etc/xgrid/agent/

(But note that this value does not track with what is set in System Preferences.) And anyway, I activated the screensaver and let it run for 15 minutes. No jobs ran.

And it is not a password issue. I got rid of all password requirements using both Xgrid Sharing and XgridLite, and confirmed it by looking with my own eyes at the plist files.

But miracle of miracles, I sit down to write a final post telling Apple that it's over between us, and leave Xgrid Admin open on my Desktop. I look over from Blogger and see this:

It fixed itself! I'm speechless.
