There are lots of resources available; see the list here. I described in that post how I started up the xgridcontroller daemon and obtained Xgrid Admin, which gives me a view of Xgrid jobs. The daemon is visible in Activity Monitor (choose "Other User Processes") or from Terminal with ps -A. It's curious that it comes back after a reboot.
The Xgrid controller can be administered from the command line or by using Xgrid Admin. Note that the earlier examples used sudo for xgrid commands, but this is not necessary. In browsing the manual for xgrid, I discovered a way (for my bash shell) to set the controller host and password once, so now I can skip the -h 127.0.0.1 -p <password> stuff in my commands.
Before we get to the subject of this post, let's talk a little bit about where Xgrid runs and what you're allowed to do. According to the Apple docs, the jobs run on the agent as user "nobody." The docs discuss users and permissions here, but don't say much about user "nobody" beyond that it "provide(s) minimal permissions."
According to this FAQ from the mailing list:
The second ("nobody") provides only very minimal privileges, since it assumes that the agent doesn't trust the client. This is the most common reason why jobs that attempt to read or write outside of, e.g., /tmp, will get a permission error.
And if you look at this file:
/usr/share/sandbox/xgridagentd_task_nobody.sb
there are a couple of fairly complicated regular expressions
(allow file-read* (regex "^/(bin|dev|(private/)?(etc|tmp|var)|usr|System|Library)(/|$)"))
(allow file-read* file-write* (regex "^/(private/)?(tmp|var)(/|$)"))
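which I can't decipher completely, but I interpret them as restricting where "nobody" can read and write. The first allows reads under a handful of system directories (/bin, /dev, /etc, /tmp, /var, /usr, /System, /Library), and the second allows both reads and writes only under /tmp and /var, with or without the /private prefix. And according to the mailing list entry, "the best solution to this problem is to enable Kerberos."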
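To get a feel for what the two patterns quoted above admit, here is a small Python 3 sketch that tries them against a few paths. It assumes the sandbox's regex engine treats these particular patterns the same way Python's re module does, and it only looks at these two rules; the real profile may contain others, so "no match" here does not prove that access is denied. The function name and the sample paths are made up for illustration.

```python
import re

# The two patterns quoted above from xgridagentd_task_nobody.sb.
READ_ONLY = re.compile(r"^/(bin|dev|(private/)?(etc|tmp|var)|usr|System|Library)(/|$)")
READ_WRITE = re.compile(r"^/(private/)?(tmp|var)(/|$)")

def access(path):
    """Report which of the two rules (if either) matches an absolute path."""
    if READ_WRITE.search(path):
        return "read+write"
    if READ_ONLY.search(path):
        return "read"
    return "no match"

# Hypothetical sample paths, chosen only to exercise the patterns.
samples = [
    "/tmp/out.txt",
    "/private/var/tmp",
    "/usr/bin/python",
    "/Library",
    "/binary",                 # not /bin: the pattern requires "/" or end of string
    "/Users/te/Desktop/temp",  # home directories are not covered by either rule
]
for p in samples:
    print("%-26s %s" % (p, access(p)))
```

The trailing (/|$) in each pattern is what keeps /binary from being treated as /bin, and the optional (private/)? covers the fact that /tmp, /var and /etc are symlinks into /private on OS X.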
But we're getting ahead of ourselves. The question for today is: how do we get data and code to the agent, and data back again? According to the Apple docs:
You have the option of supplying an input file or a directory of files. If you supply an input directory, it is copied to each agent and becomes the working directory for the executable file.
In an example I used earlier, there is a one-line Python script with:
print 'Hello Python world!'
in a directory temp on my Desktop. Working from the Desktop directory, I submitted it with xgrid. As the docs say:
Important: You have the option of providing a relative path or an absolute path when specifying executable files, input files and directories, and output files and directories. When a relative path is used, the executable and the input files or directories are copied to the agents, and the output files or directories are created for every agent and collected by the controller. If you specify an absolute path to the executable, input, or output files or directories, those files are assumed to exist on the agent computers, or to be available to the agents as part of a shared file system, at the path location specified. They are not copied or created.
Executive summary:
• relative path: copied to the agent
• absolute path: assumed to exist on the agent
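The rule in the summary above is simple enough to write down as a tiny Python 3 sketch. This is only a restatement of the documentation's description, not anything xgrid itself exposes; the function name is hypothetical, and posixpath is used so the check follows OS X path rules wherever the sketch is run.

```python
import posixpath

def staging(path):
    """Summarize how the docs say a path argument to xgrid is treated."""
    if posixpath.isabs(path):
        return "assumed to exist on the agent (not copied)"
    return "copied to the agent"

# Paths taken from the examples in this post.
for p in ["temp", "temp/script.py", "/Users/te/Desktop/temp"]:
    print(p, "->", staging(p))
```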
The version of the above example that I posted the other day, and another here, are subtly different. They provided a full path /Users/te/Desktop/temp to the temp directory, which should then be "assumed to exist on the agent computers, or..." What we want is a relative path. According to the man page for xgrid, there are more options.
The docs say:
Use the -in parameter to pass an input directory. This directory is copied to each agent and becomes the working directory on the agent’s host computer. You can include anything needed in the working directory, such as additional input files, libraries, and executables. The executable file is run in this directory.
Thinking about this, I realized there is another issue with the xgrid command above. Even though it works, it should really name the script without the directory in front of it, and that version also works. Since we're in temp on the agent, we don't need the directory before script.py.
I placed a file with some DNA sequence named seq.txt in the temp directory. The script xgrid.script.py opens the file and prints the data.
What about getting data back again? Well, I mean, other than the way we've been doing it :)
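As an aside on the script that reads seq.txt: its listing is not reproduced here, but a script of that kind might look like the Python 3 sketch below. It relies only on what the docs say about the input directory becoming the working directory, so the file is opened by its bare name. The function name and the "not found" message are hypothetical.

```python
import os

def read_sequence(filename="seq.txt"):
    """Return the text of a file in the working directory, or None if it is missing."""
    if not os.path.exists(filename):
        return None
    with open(filename) as handle:
        return handle.read().strip()

if __name__ == "__main__":
    data = read_sequence()
    if data is None:
        # Printing the working directory helps show where the agent ran the job.
        print("seq.txt not found in " + os.getcwd())
    else:
        print(data)
```

Whatever the script prints goes to standard output, which is the way results have been coming back so far.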
It works! We'll save stderr (se) for next time, with BLAST.