Globus is a royalty-free, open source toolkit for building grid applications.
It also provides command-line tools, for example to log in to a system, transfer files and submit jobs.
At LRZ, Globus services are provided on the following machines:
- Linux cluster
  - login node: lxgt2.lrz-muenchen.de
  - visualization servers: gvs1.lrz-muenchen.de and gvs2.lrz-muenchen.de
- HLRB2 supercomputer
  - login node: a01.hlrb2.lrz-muenchen.de
  - visualization server: rvs1.lrz-muenchen.de
The visualization servers provide only an interactive login service.
The login nodes also provide file transfer and job submission services.
The contact port of each service is given below in the section for that service.
Using Globus commands at LRZ
To use Globus commands on these machines, you have to load the globus module, which sets the required environment variables.
- on the Linux cluster:
  module load globus
- on HLRB2:
  - normal users:
    module load globus java/1.6.0
  - DEISA users:
    module load deisa globus
Certificate and UNIX account
You must have a personal certificate to use the grid.
Instructions for applying for and installing certificates are here.
For some sites of the German research network it is possible to use the Short Lived Credential Service (SLCS).
Slides on using GSI-SSH Term with SLCS are available (in German).
Please note that on slide 7 you have to select your home institute (not LRZ).
The unique distinguished name (DN) of your certificate has to be registered on the target machine before you can use Globus services.
This is done automatically for D-Grid and DEISA project users, who also have an account ready on the machine. HLRB2 is for DEISA users and the Linux cluster is for D-Grid users.
Other users have to state which machine they want to use and register their personal certificate.
A grid user also needs an account (in German only) on the target machine.
Registering IP address for the Internet firewall (HLRB2 only)
HLRB2 services can be reached from the Internet only if the user's IP address is registered at LRZ. For this and other grid-related issues, please use the following contacts:
- Normal LRZ users: please contact grid-support @ lrz.de to register an IP address and for grid-related questions.
- D-Grid users: please use the mailing list dgrid-support @ lrz.de
- DEISA users are kindly asked to use the DEISA trouble ticket system.
To learn more about the options of any Globus command, you can always pass the -help option to the command.
Links for further reading are at the end of the page.
Downloading Globus
You can use Globus services without a local installation by running the Globus commands on an LRZ machine.
It is also possible to use Java-based login and GridFTP clients to transfer files (please see the interactive login and GridFTP sections below).
To install the whole Globus Toolkit, you can find source code and compiled binaries at http://www.globus.org/toolkit/ .
LRZ provides precompiled packages of some versions for SuSE Linux and Scientific Linux.
Precompiled GSI-SSH-only packages are also available for specific versions of SuSE Linux and Scientific Linux.
Proxy certificate
You need a proxy certificate to use Globus commands. First you need to have your personal certificate, as described above.
Please read the Interactive login section below to see how to create a proxy with the Java GSI-SSH tool or with Globus commands.
You have to do this only once; the proxy can then be used until it expires.
DEISA users should also read the interactive access chapter of the DEISA Primer.
It also explains how to use the handy deisa_service script, which removes the need to know the addresses and ports of the Globus services.
Visualisation server users also have a remote visualisation guide.
Port
The port for contacting the GSI-SSH service is usually 2222.
With the Globus gsissh command it can be specified with -p 2222.
Login from a workstation to LRZ
You can log in from your workstation to LRZ with certificate-based login (GSI-SSH).
In that case you do NOT need to transfer your personal certificate and private key to your home directory on the LRZ machine.
The easiest way to log in is to use the Java-based client with a graphical user interface.
It is also possible to install Globus on your machine and use its command-line client, but this is not as simple as using the Java client.
Using gsissh from LRZ
If you first log in to LRZ with ssh, you must have usercert.pem and userkey.pem in the .globus directory in your home directory.
Then you can create a proxy certificate with the default lifetime of 12 hours:
grid-proxy-init
If you wish to delete the proxy later, you can call:
grid-proxy-destroy
With the -hours switch of grid-proxy-init you can change the proxy lifetime to suit your needs.
Example of using a Globus command: gsissh hlrb2.lrz-muenchen.de -p 2222
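The proxy and login steps can be combined in a small shell helper. The following is only a sketch: the function names ensure_proxy and gsissh_command are hypothetical, and it assumes that grid-proxy-info -exists returns a non-zero exit status when no valid proxy is present.

```shell
# Hypothetical helpers; only grid-proxy-info, grid-proxy-init and gsissh
# are Globus commands.

# Create a proxy only if no valid one exists.
# $1: proxy lifetime in hours (defaults to 12, like grid-proxy-init)
ensure_proxy() {
    hours=${1:-12}
    if grid-proxy-info -exists >/dev/null 2>&1; then
        return 0
    fi
    grid-proxy-init -hours "$hours"
}

# Print the gsissh command line for a host.
# $1: host name, $2: port (2222 is the usual GSI-SSH port at LRZ)
gsissh_command() {
    printf 'gsissh -p %s %s\n' "${2:-2222}" "$1"
}
```

For example, ensure_proxy 24 followed by the command printed by gsissh_command hlrb2.lrz-muenchen.de would create a 24-hour proxy if needed and then log in to HLRB2.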
MyProxy
It is also possible to use the MyProxy service to store a proxy certificate on a server.
Then your personal key and certificate do not need to be stored on the machine where you want to use Globus commands.
Please see the user guide here.
Using graphical application and myproxy service with GSI-SSH
To enable X11 forwarding so that graphical user interfaces can be run, Linux users have to use the -X command-line option and Mac users the -Y option.
GridFTP file transfer service
Port
The default port for the GridFTP service is 2811.
From the DEISA network, HLRB2 is reached at a01-deisa.hlrb2.lrz-muenchen.de:2813
Usage of globus-url-copy
DEISA users should read the GridFTP chapter in the DEISA Primer.
It is possible to copy single files or directories. The file:///path/file syntax refers to the file system of the client.
Files on the GridFTP server must be referred to with the gsiftp://host/path/file syntax.
For example, to copy a file from the current directory (environment variable $PWD) to the home directory (~) on HLRB2:
globus-url-copy file:///$PWD/my_sourcefile.txt gsiftp://a01.hlrb2.lrz-muenchen.de/~/.
It is also possible to copy from one GridFTP server to another. Then both addresses must use the gsiftp syntax.
If the server uses a port other than the default 2811, the port can be appended to the server address. For example, a01-deisa.lrz-muenchen.de:2813 is used in the DEISA network.
To copy a directory recursively (-r) and to create the target directory (-cd):
globus-url-copy -cd -r file:///$PWD/mydirectory/ gsiftp://lxgt2.lrz-muenchen.de/~/mydirectory/
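The address rules above (gsiftp syntax, optional port after the host) can be captured in a small helper. This is a minimal sketch: gsiftp_url is a hypothetical function name, the file name results.dat is only an illustration, and the final line prints a server-to-server copy command rather than running it.

```shell
# gsiftp_url is a hypothetical helper, not part of Globus.
# $1: host, $2: path on the server (e.g. '~/data/' or /tmp/file)
# $3: optional port; omit it to use the default GridFTP port
gsiftp_url() {
    path=${2#/}   # drop one leading slash so it is not doubled in the URL
    if [ -n "${3:-}" ]; then
        printf 'gsiftp://%s:%s/%s\n' "$1" "$3" "$path"
    else
        printf 'gsiftp://%s/%s\n' "$1" "$path"
    fi
}

# Print (do not run) a copy between two GridFTP servers.
src=$(gsiftp_url lxgt2.lrz-muenchen.de '~/results.dat')
dst=$(gsiftp_url a01.hlrb2.lrz-muenchen.de '~/results.dat' 2811)
echo "globus-url-copy $src $dst"
```

The tilde is quoted so that the local shell does not expand it; it is meant to be interpreted by the GridFTP server.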
Graphical client
A graphical client is also available. See the SGGC user guide.
gscp
On DEISA machines such as HLRB2, the gscp script makes copying files easier than globus-url-copy. Please see gscp -help
Reliable file transfer (rft)
For long-lasting transfers, for example over mobile Internet access, one might prefer to use reliable file transfer (RFT).
It is not as simple to use as globus-url-copy, because it needs a script.
You can use the following script as a template. Only the last two lines, where the source and target are defined as in globus-url-copy, need to be modified.
#true=binary false=ascii
true
#Block size in bytes
16000
#TCP Buffer size in bytes
16000
#Notpt (No thirdPartyTransfer)
false
#Number of parallel streams
1
#Data Channel Authentication (DCAU)
true
# Concurrency of the request
1
#Transfer all or none of the transfers
false
#Maximum number of retries
10
#Source/Dest URL Pairs
gsiftp://source.host:2811/target/path/source_file.txt
gsiftp://target.host:2811/target/path/target_file.txt
The RFT client connects to a Globus container. By default it uses the container's default port 8443.
Example command to run the script above: rft -h a01.hlrb2.lrz-muenchen.de -f myscript.txt
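Since only the last two lines of the template change between transfers, the file can be generated. The following sketch assumes a hypothetical helper name write_rft_request; the parameter values are copied unchanged from the template.

```shell
# write_rft_request is a hypothetical helper name.
# $1: source URL, $2: destination URL, $3: file to write
write_rft_request() {
    cat > "$3" <<EOF
#true=binary false=ascii
true
#Block size in bytes
16000
#TCP Buffer size in bytes
16000
#Notpt (No thirdPartyTransfer)
false
#Number of parallel streams
1
#Data Channel Authentication (DCAU)
true
# Concurrency of the request
1
#Transfer all or none of the transfers
false
#Maximum number of retries
10
#Source/Dest URL Pairs
$1
$2
EOF
}
```

A call such as write_rft_request gsiftp://source.host:2811/path/in.txt gsiftp://target.host:2811/path/out.txt myscript.txt would produce a file that can be passed to rft with -f.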
Job submission
Ports
Default port is 8443.
- HLRB2
  - Globus 4.0.x from Internet: a01.hlrb2.lrz-muenchen.de:8443
  - Globus 4.0.x from DEISA network: a01-deisa.lrz-muenchen.de:8444
  - Globus 4.2.x from Internet: a01.hlrb2.lrz-muenchen.de:8445
  - Globus 4.2.x from DEISA network: a01-deisa.lrz-muenchen.de:8446
- Linux cluster
  - Globus 4.0.x from Internet: lxgt2.lrz-muenchen.de:8443
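The endpoint list above can also be written as a lookup function. This is a sketch only: the function name gram_endpoint and the short keys (hlrb2, linux, internet, deisa) are hypothetical, while the host names and ports are copied from the list.

```shell
# gram_endpoint is a hypothetical helper name.
# $1: machine (hlrb2 or linux), $2: Globus version (4.0 or 4.2)
# $3: network (internet or deisa)
gram_endpoint() {
    case "$1/$2/$3" in
        hlrb2/4.0/internet) echo a01.hlrb2.lrz-muenchen.de:8443 ;;
        hlrb2/4.0/deisa)    echo a01-deisa.lrz-muenchen.de:8444 ;;
        hlrb2/4.2/internet) echo a01.hlrb2.lrz-muenchen.de:8445 ;;
        hlrb2/4.2/deisa)    echo a01-deisa.lrz-muenchen.de:8446 ;;
        linux/4.0/internet) echo lxgt2.lrz-muenchen.de:8443 ;;
        *) echo "unknown combination: $1 $2 $3" >&2; return 1 ;;
    esac
}
```

The printed value has the address:port form, so it could be stored in a variable and reused wherever a job submission address is needed.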
Job submission command
The job submission command is globusrun-ws.
Submission is done with the -submit parameter.
The target address is specified with the -F parameter, using the syntax address:port.
Batch scheduling system
On the Linux cluster the batch scheduling system is Sun Grid Engine (SGE) and on HLRB2 it is PBSPro.
Both machines also have Fork available to run single-process jobs. Fork is also the default way to run jobs.
Fork is not appropriate for real computation, since the process runs on the login node, which should not be used for computation.
The switch for SGE is -Ft SGE and for PBSPro -Ft PBS.
Port
The default port for the job submission service is 8443. On HLRB2 the port for the DEISA network is 8444.
On HLRB2 a Globus 4.2.1 service is also available on port 8445 (Internet) and port 8446 (DEISA).
GT 4.0.x vs. 4.2.x
Note that the job submission commands of Globus 4.0.x and 4.2.x are not compatible with each other.
The user will get a clear error message if the wrong version is used.
GT 4.x vs. 5.x
Globus 5 no longer supports WS-GRAM; instead it uses the mechanism that was used in Globus 2.
GridFTP and GSI-SSH are compatible with the previous versions. For RFT, Globus 4 versions must be used. However, globus-url-copy in Globus 5 has new switches that make transfers more robust.
Currently Globus 5 is not installed at LRZ for production usage.
Simple job
The simplest way to submit a job is directly from the command line (without a script). The executable is specified with the -c parameter.
The absolute path to the command is required.
For example:
globusrun-ws -submit -F lxgt2.lrz-muenchen.de -s -c /bin/date
The -s switch enables streaming of the output to the command line. Without it, the previous command would not show the result of the date command.
To write stdout and stderr to specific files, add -so stdout_file.txt -se stderr_file.txt.
Job script
Non-trivial jobs need a script written in a Globus-specific XML language. The only mandatory field inside the <job> tags is <executable>.
The -f switch specifies the script.
Note that -S is needed if the script contains <fileCleanUp> tags.
Example: globusrun-ws -submit -S -F lxgt2.lrz-muenchen.de -Ft SGE -f my.rsl
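As an illustration of the smallest possible job script, the following sketch writes a file that contains only the mandatory <executable> field. The helper name write_minimal_job is hypothetical, and the check for an absolute path simply mirrors the requirement stated for the -c parameter.

```shell
# write_minimal_job is a hypothetical helper name.
# $1: absolute path of the executable, $2: job script file to write
write_minimal_job() {
    case $1 in
        /*) ;;   # absolute path, accepted
        *) echo "executable must be an absolute path: $1" >&2
           return 1 ;;
    esac
    cat > "$2" <<EOF
<job>
    <executable>$1</executable>
</job>
EOF
}
```

After write_minimal_job /bin/date my.rsl, the resulting file could be submitted with globusrun-ws -submit -s -F lxgt2.lrz-muenchen.de -f my.rsl.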
Commonly used tags are:
- <executable>: the command to be run, with its absolute path
- <directory>: the working directory; the default is $GLOBUS_USER_HOME
- <count>: number of processes
- <maxWallTime>: length of the job in minutes
- <jobType>: use mpi for MPI jobs
- <fileStageIn>: contains one or more <transfer> entries. Each entry contains a <sourceUrl> address, which needs to be in gsiftp:// syntax, and a <destinationUrl> for the target. The target can be a file:///path/target.txt address, i.e. the location where the file is needed on that host.
- <fileStageOut>: also contains <sourceUrl> and <destinationUrl>, but in this direction <sourceUrl> can use file:/// syntax and <destinationUrl> gsiftp:// syntax.
- <fileCleanUp>: to remove files such as input files, list them inside <fileCleanUp> tags. Each file is surrounded by <file> tags and referred to with file:/// syntax.
Extension tag
For example, to load a module that sets environment variables, one can use the extensions and preamble tags.
<extensions>
  <preamble>
    module load java
  </preamble>
</extensions>
This preamble feature is not implemented in Globus by default.
For lxgt2 the target architecture can also be specified in the extensions. The Intel architecture is ia64 and the AMD architecture is x86_64.
For example:
<extensions>
  <arch>ia64</arch>
</extensions>
It is also possible to disable the e-mails of the batch scheduling system by adding the following tags:
<extensions>
  <emailontermination>no</emailontermination>
  <emailonexecution>no</emailonexecution>
  <emailonsuspend>no</emailonsuspend>
  <emailonabort>no</emailonabort>
</extensions>
The default directory is the user's home directory, which $GLOBUS_USER_HOME refers to. That variable can be used in other tags too.
The order of the XML items in the script is important, so please use the following examples as templates.
The only mandatory tag inside the <job> tag is <executable>.
Example of an MPI job using 3 processor cores and 130 minutes:
<job>
  <executable>/home/path/helloHost</executable>
  <argument>2</argument>
  <directory>${GLOBUS_USER_HOME}</directory>
  <stdout>${GLOBUS_USER_HOME}/test.stdout</stdout>
  <stderr>${GLOBUS_USER_HOME}/test.stderr</stderr>
  <count>3</count>
  <maxWallTime>130</maxWallTime>
  <jobType>mpi</jobType>
  <fileStageIn>
    <transfer>
      <sourceUrl>gsiftp://my.gridftp.server/path/file</sourceUrl>
      <destinationUrl>file:///${GLOBUS_USER_HOME}/path/</destinationUrl>
    </transfer>
  </fileStageIn>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///${GLOBUS_USER_HOME}/output.txt</sourceUrl>
      <destinationUrl>gsiftp://my.gridftp.server/path/</destinationUrl>
    </transfer>
  </fileStageOut>
  <fileCleanUp>
    <deletion>
      <file>file:///${GLOBUS_USER_HOME}/test.stdout</file>
    </deletion>
    <deletion>
      <file>file:///${GLOBUS_USER_HOME}/test.stderr</file>
    </deletion>
  </fileCleanUp>
</job>
Batch job
To free the shell window, especially for a long-lasting job, one can use the -batch switch. To get access to the job later, use the -o switch with a file name as its parameter:
globusrun-ws -submit -batch -o epr.xml -S -F lxgt2.lrz-muenchen.de -f myscript
Once you have the EPR file, globusrun-ws can be used to monitor the status of the job and to kill it:
-status -j epr_file_name and -kill -j epr_file_name.
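The batch workflow (submit, check status, kill) can be wrapped in a few shell functions. This is a sketch under stated assumptions: the function names and the GLOBUSRUN variable are hypothetical, and the switches are the ones described in this section.

```shell
# Hypothetical wrappers around globusrun-ws. GLOBUSRUN can be set to
# another command before calling them, e.g. to print instead of run.
GLOBUSRUN=${GLOBUSRUN:-globusrun-ws}

# Submit a job in batch mode and save its EPR for later access.
# $1: address[:port], $2: scheduler (SGE or PBS), $3: job script, $4: EPR file
submit_batch() {
    "$GLOBUSRUN" -submit -batch -o "$4" -S -F "$1" -Ft "$2" -f "$3"
}

# Query the status of a job. $1: EPR file
job_status() {
    "$GLOBUSRUN" -status -j "$1"
}

# Kill a job. $1: EPR file
job_kill() {
    "$GLOBUSRUN" -kill -j "$1"
}
```

For example, submit_batch lxgt2.lrz-muenchen.de SGE myscript epr.xml followed later by job_status epr.xml.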
Globus v. 5 has GRAM5 as its job submission service. It is NOT compatible with the Globus v. 4 WS-GRAM.
At LRZ, GRAM5 is available on HLRB2 for both the Internet and the DEISA network, in both cases running on the default port (2119).
To use Globus 5 client commands on HLRB2, you can load the Globus 5 module (module load globus/5.0). By default the Globus 4 module is loaded when module load globus is run.
A short introduction to the GRAM5 client commands is given in a PDF slide set. It was part of a wider Globus 5 Workshop (see the Further reading section below).
The official Globus Toolkit User Guide contains specific user guides for each service: