Main concepts

The goal of the job scheduler is to run interdependent jobs, i.e. jobs that may need one or several outputs from other jobs.

  • Resources: Any object is a resource, whether it is data, a job, or a server. Resources are located on different hosts that can be specified by a URI. For example, file:///a/b/c denotes a local folder or file, while xpm:token:user@hostname.org corresponds to a token used to limit the number of launched processes on a given computer.

  • Connectors: Connectors specify how a resource can be accessed and how processes can be launched. Single connectors (i.e., localhost and ssh) are built-in. Composite connectors (e.g., one describing a cluster of computers) can be built from single connectors.

  • Resource state: A resource can be in one of the following states:

    • WAITING (waiting for dependencies to be met),
    • HOLD (waiting for a user action), as a consequence of a dependency being in the ERROR or HOLD state,
    • DONE (completed),
    • ERROR.

    For tasks that can be run, two other states are possible:

    • READY (waiting to be run),
    • RUNNING.

  • Groups: It should be possible to assign a group to a set of experiments. For example, if I run several series of experiments and call them "trec.test1" and "trec.test2", I can then operate on all the resources of a specific group, e.g. "trec" or "trec.test1".
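
The states and groups described above can be sketched in a few lines of Python. This is only an illustration, not the experimaestro implementation: the names `State`, `next_job_state` and `in_group` are hypothetical, and the rule that group names nest on the "." separator is an assumption inferred from the "trec" / "trec.test1" example.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"   # waiting for dependencies to be met
    HOLD = "hold"         # waiting for a user action
    DONE = "done"         # completed
    ERROR = "error"
    READY = "ready"       # runnable tasks only: waiting to be run
    RUNNING = "running"   # runnable tasks only


def next_job_state(dependency_states):
    """Hypothetical rule giving the state of a job that has not started yet."""
    states = list(dependency_states)
    # A dependency in the ERROR or HOLD state puts the job on hold.
    if any(s in (State.ERROR, State.HOLD) for s in states):
        return State.HOLD
    # At least one dependency is not complete yet.
    if any(s is not State.DONE for s in states):
        return State.WAITING
    # All dependencies are met: the job can be run.
    return State.READY


def in_group(resource_group, selected_group):
    """True if resource_group is selected_group or is nested under it.

    Assumes group names are hierarchical and dot-separated.
    """
    return (resource_group == selected_group
            or resource_group.startswith(selected_group + "."))
```

With these definitions, selecting the group "trec" matches resources in "trec.test1" and "trec.test2", while selecting "trec.test1" matches only the first series.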

General architecture

Resources

We have the following types of resources:

  • Data: the output of one job (one job can have several outputs). Some data can also be already generated by an external process (e.g. a data collection) and be declared to experimaestro as read. Data can be in three states: WAITING, HOLD or DONE.

  • Task:

    • Job: a task to be run, that produces a given set of resources.
    • Server: a task that needs to be run; however, we don't wait for the server run to complete.

Status

Every resource (see above) has a unique ID which is a path to a directory containing information about the resource on the host.

Based on the file ${FILE}, several paths are defined (not all of these files are necessarily present):

  • ${FILE}.lock locks write access to the status (it can also be used as an exclusive lock on the resource). This file is used whenever exclusive access is needed.
  • ${FILE}.status contains the PID of the running process, as two space-separated columns PID MODE, where MODE is r or w. This file is used when the resource can be accessed by a single writer and multiple readers.
  • ${FILE}.run is the script needed to execute the job
  • ${FILE}.code corresponds to the error code at the end of the execution of the job
  • ${FILE}.done is created when the job was successfully executed or the data successfully generated
  • ${FILE}.err contains the error log output (jobs only)
  • ${FILE}.out contains the standard log output (jobs only)
  • ${FILE}.input contains the standard input (jobs only)
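
As a rough illustration of this layout, the following Python sketch derives the companion paths from a base path and reads the status file. The helper names (`companion_paths`, `read_status`, `is_done`) are hypothetical, and the sketch assumes the status file holds one "PID MODE" pair per line, which the description above does not state explicitly.

```python
from pathlib import Path

# Suffixes listed above; not all files need to exist for a given resource.
SUFFIXES = ("lock", "status", "run", "code", "done", "err", "out", "input")


def companion_paths(base):
    """Map each suffix to the path ${FILE}.<suffix> for the base path."""
    return {suffix: Path(f"{base}.{suffix}") for suffix in SUFFIXES}


def read_status(base):
    """Parse ${FILE}.status into (pid, mode) pairs; empty list if absent.

    Assumption: one "PID MODE" pair per line, MODE being "r" or "w".
    """
    path = Path(f"{base}.status")
    if not path.exists():
        return []
    entries = []
    for line in path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        pid, mode = line.split()
        entries.append((int(pid), mode))
    return entries


def is_done(base):
    """A resource counts as done once ${FILE}.done exists."""
    return Path(f"{base}.done").exists()
```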

Using experimaestro

Configuration

The server and clients are configured by a simple property file, settings.ini, located (by default) in the .experimaestro directory in the user's home directory.

{% highlight ini %}

[server]
; Port for the Web server (and the XML-RPC server)
port = 12345
; Experimaestro will store its data in this folder
database = /path/to/a/valid/folder

USERNAME = PASSWORD, user

[client]
local.url = http://USERNAME:PASSWORD@localhost:12345/xmlrpc
local.default = true

local.url = http://USERNAME:PASSWORD@localhost:12345
local.default = true

{% endhighlight %}
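
To check that a settings file of this shape parses as intended, it can be read with Python's standard `configparser` module. The sketch below is only a sanity check under stated assumptions: it uses a trimmed sample containing just the `[server]` and `[client]` keys shown above, and the function name `load_settings` is hypothetical. Experimaestro itself may read the file differently.

```python
import configparser

# Trimmed sample for illustration; a real file would be read from
# ~/.experimaestro/settings.ini with parser.read(path) instead.
SAMPLE = """
[server]
; Port for the Web server (and the XML-RPC server)
port = 12345
; Experimaestro will store its data in this folder
database = /path/to/a/valid/folder

[client]
local.url = http://USERNAME:PASSWORD@localhost:12345/xmlrpc
local.default = true
"""


def load_settings(text):
    """Parse INI text; interpolation is disabled so '%' in values is kept."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


settings = load_settings(SAMPLE)
port = settings.getint("server", "port")
database = settings.get("server", "database")
url = settings.get("client", "local.url")
is_default = settings.getboolean("client", "local.default")
```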

Starting experimaestro

The experimaestro script can be used to start or stop the server, add jobs and resources. Type:

experimaestro --help

to get some help on available commands.

XML-RPC

The Experimaestro server can be reached through XML-RPC calls.
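
As a minimal sketch, a client can be built with Python's standard `xmlrpc.client` module, using the URL form from the configuration example (credentials in the URL, `/xmlrpc` path). The helper names are hypothetical, and the available remote methods are not listed here: the sketch only tries the generic `system.listMethods` introspection call, which the server may or may not support.

```python
import xmlrpc.client


def make_client(username, password, host="localhost", port=12345):
    """Build an XML-RPC proxy; no connection is made until a call is issued."""
    url = f"http://{username}:{password}@{host}:{port}/xmlrpc"
    return xmlrpc.client.ServerProxy(url)


def list_remote_methods(client):
    """Ask the server which methods it exposes, if introspection is supported.

    Returns None when the server is unreachable or rejects the call.
    """
    try:
        return client.system.listMethods()
    except (xmlrpc.client.Fault, xmlrpc.client.ProtocolError, OSError) as error:
        print(f"Could not list methods: {error}")
        return None


client = make_client("USERNAME", "PASSWORD")
```

Calling `list_remote_methods(client)` against a running server is a quick way to discover what can be invoked before writing real calls.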