Argonne Getting Started

Log in to surveyor and then execute the following at your bash prompt to set up your environment:
export BGPPROFILEDIR=/bgsys/argonne-utils/profiles
export BGPROFILE=kh
export PATH=$PATH:$BGPPROFILEDIR/$BGPROFILE/opt/bin
export PATH=$PATH:$BGPPROFILEDIR/$BGPROFILE/scripts
export PATH=$PATH:$BGPPROFILEDIR/$BGPROFILE/bin
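
As a quick sanity check after setting the environment, something like the following sketch can report which tools are not yet reachable through PATH. The function name missing_tools is hypothetical, and the list of tool names is simply the set of commands mentioned on this page; it is an assumption that all of them live in the directories added above.

```shell
# Print the name of every tool given as an argument that is NOT found on PATH.
# Prints nothing when all tools are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names taken from this page; adjust the list as needed.
missing="$(missing_tools khqsub khget khdo testramdisk khmyaddr)"
if [ -n "$missing" ]; then
    echo "not on PATH:" $missing
else
    echo "kh tools found"
fi
```

If anything is reported as missing, re-check the values of BGPPROFILEDIR and BGPROFILE before going further.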


From here you should be able to boot a block with the following command (the first argument is the run time in minutes, the second is the number of nodes):
khqsub 30 64

khqsub is just a wrapper that constructs the qsub command line invocation and then monitors the boot
so that it can report back to you the khctlserver associated with your boot. The khctlserver is the node
that controls your block. The rest of the kh tools, such as khget, talk to this node to acquire nodes and create
communication domains for users. If you want to override the qsub arguments, simply pass more than two arguments;
these are then used as the qsub arguments you would like.
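
To make the wrapper's role concrete, here is a minimal sketch of how the two-argument form could map onto a qsub command line. It is not the khqsub source: the function name build_qsub_cmd is hypothetical, the third and fourth parameters are assumptions added for illustration, and the flag values simply mirror the "will invoke:" line in the boot example on this page. The override form with more than two arguments is not modeled.

```shell
# Hypothetical sketch: print the qsub command for a given run time and node count.
# $1 = time in minutes, $2 = number of nodes,
# $3 = project name (assumed parameter, default copied from the transcript),
# $4 = path to the ELF image handed to qsub (assumed parameter).
build_qsub_cmd() {
    minutes=$1
    nodes=$2
    project=${3:-BGPPlan9Meas}
    image=${4:-$HOME/null.elf}
    printf 'qsub -n %s -t %s -e stderr -o stdout --debuglog debuglog -q default -A %s --kernel kh %s\n' \
        "$nodes" "$minutes" "$project" "$image"
}

# Only prints the command; nothing is submitted.
build_qsub_cmd 60 128
```

Note the argument order: khqsub takes the time first and the node count second, while the generated qsub line puts -n (nodes) before -t (time).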

The following is an example of a boot:

jappavoo@login3.surveyor:~> cat Work/kh.env 
export BGPPROFILEDIR=/bgsys/argonne-utils/profiles
export BGPROFILE=kh
export PATH=$PATH:$BGPPROFILEDIR/kh/opt/bin
export PATH=$PATH:$BGPPROFILEDIR/$BGPROFILE/scripts
export PATH=$PATH:$BGPPROFILEDIR/$BGPROFILE/bin
jappavoo@login3.surveyor:~> . Work/kh.env
jappavoo@login3.surveyor:~> khqsub 60 128
will invoke: qsub -n 128 -t 60 -e stderr -o stdout --debuglog debuglog -q default -A BGPPlan9Meas --kernel kh /home/jappavoo/null.elf
174364
kh booted: info in /home/jappavoo/7118 : khctlserver=172.16.4.4


You can now set the khctlserver environment variable to the value returned and invoke the khget command to create communication domains and acquire nodes.
e.g.
jappavoo@login1.surveyor:~/Work> export khctlserver=172.16.4.4
jappavoo@login3.surveyor:~> khget
root@172.16.4.4's password: 
ERROR:  must specify user
USAGE: khctl acquireNodes [-n netid[,netid,...]] [-p num] [-i] [-x] [-c credentials] user [num of nodes]
     -n netid[,netid,...] : add the nodes to the existing Private networks identified
                by netid
     -p num   : create num new Private networks for the nodes
     -i       : add nodes to the Internal public network
     -x       : add nodes to the External network
     -c cred  : associated credentials with user
     user     : user to get the nodes for
  num of nodes: number of nodes to get default is 1


The password you will need is "kh". This will be improved in time ;-)
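
Copying the khctlserver address by hand can be avoided with a small filter. The sketch below assumes the status line has exactly the "kh booted: ... khctlserver=ADDRESS" form shown in the boot transcript on this page; the function name parse_khctlserver is hypothetical.

```shell
# Extract the address after "khctlserver=" from a "kh booted:" status line.
# Reads text on stdin and prints the first address found.
parse_khctlserver() {
    sed -n 's/^kh booted:.*khctlserver=\([0-9.]*\).*$/\1/p' | head -n 1
}

# Example using the status line shown in the boot transcript.
line='kh booted: info in /home/jappavoo/7118 : khctlserver=172.16.4.4'
khctlserver="$(printf '%s\n' "$line" | parse_khctlserver)"
export khctlserver
echo "khctlserver=$khctlserver"
```

If khqsub writes its status line to standard output (an assumption, not verified here), its output could be piped through the same filter instead of the fixed example line.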

From here you can build environments using khget. But as a quick start you can boot an appliance as follows:

jappavoo@login3.surveyor:~> khqsub 60 128
will invoke: qsub -n 128 -t 60 -e stderr -o stdout --debuglog debuglog -q default -A BGPPlan9Meas --kernel kh /home/jappavoo/null.elf
174365
kh booted: info in /home/jappavoo/9471 : khctlserver=172.16.4.0
jappavoo@login3.surveyor:~> export khctlserver=172.16.4.0
jappavoo@login3.surveyor:~> testramdisk 1 http://kittyhawk.bu.edu/Appliances/sshd.cpio
...
node info in 9724.nodes.1
jappavoo@login3.surveyor:~> cat 9724.nodes.1 | khdo writecon  "echo \"$(cat /etc/hosts)\" > /etc/hosts"
jappavoo@login3.surveyor:~> cat 9724.nodes.1 | khdo writecon  "echo \"$(khmyaddr)\" >> /etc/hosts"
jappavoo@login3.surveyor:~> cat 9724.nodes.1 | khdo writecon  "echo \"$(cat /etc/resolv.conf)\" > /etc/resolv.conf"
jappavoo@login3.surveyor:~> cat 9724.nodes.1 | khdo peripcmd "xterm -bg black -fg white -e ssh root@%ip% &"


The above will boot a 128 node kh pool for 60 minutes and then fetch a simple sshd appliance and boot it on 1 node. The testramdisk command should fetch the appliance, allocate a node, boot it and cause a console window to open on your display attached to the node. The next three lines update the node's /etc/hosts and /etc/resolv.conf files with local values via the
console connection. The last line will open an xterm with an ssh session to the node. FIXME: Sorry, I don't yet know why an ssh login takes so long
to start at Argonne; this will eventually be fixed.
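
A note on the quoting in the writecon lines: the $(cat /etc/hosts) sits inside double quotes, so the shell on the login node expands it before khdo runs, and the node only receives an echo command with the text already filled in. The sketch below shows the string that results. The helper name build_writecon_cmd is hypothetical, and it only prints the command rather than sending it anywhere.

```shell
# Build the command string that the transcript passes to "khdo writecon".
# $1 = text to write, $2 = redirection operator (> or >>), $3 = target path on the node.
# Hypothetical helper for illustration; it only prints the string and sends nothing.
build_writecon_cmd() {
    printf 'echo "%s" %s %s\n' "$1" "$2" "$3"
}

# The text is captured here, on the local machine, before any command is built.
local_text='172.16.4.4 khctl'   # made-up stand-in for the output of: cat /etc/hosts
build_writecon_cmd "$local_text" '>' /etc/hosts
```

Because the text is pasted into a double-quoted echo, file contents that themselves contain double quotes, dollar signs or backticks would presumably be reinterpreted on the node, so this approach is best kept to simple files such as the two used here.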

The following will acquire and configure another 100 nodes from the pool. Note that we specify the appliance file fetched the last time, so that we don't have to get it again from the remote site:
jappavoo@login3.surveyor:~> testramdisk 100 sshd.cpio.gz.uimg
node info in 9971.nodes.100
jappavoo@login3.surveyor:~> cat 9971.nodes.100 | khdo writecon  "echo \"$(cat /etc/hosts)\" > /etc/hosts"
jappavoo@login3.surveyor:~> cat 9971.nodes.100 | khdo writecon  "echo \"$(khmyaddr)\" >> /etc/hosts"
jappavoo@login3.surveyor:~> cat 9971.nodes.100 | khdo writecon  "echo \"$(cat /etc/resolv.conf)\" > /etc/resolv.conf"
jappavoo@login3.surveyor:~> cat 9971.nodes.100 | khdo peripcmd "echo %ip%"


The last line just lists the external IP addresses of all the nodes rather than opening 100 xterms to them ;-)
P.S. Copying the /etc/hosts and /etc/resolv.conf files might produce strange output on the broadcast console ... but that is ok and expected ;-)
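
For readers wondering what the %ip% placeholder in the peripcmd lines does, the following is a rough sketch of per-node template expansion. It is not the khdo implementation: the function name expand_ip_template is hypothetical, the addresses are made up, and reading one address per line from standard input is an assumption chosen to keep the example self-contained.

```shell
# Hypothetical sketch of per-node template expansion; not the khdo implementation.
# $1 = command template containing %ip%; addresses are read from stdin, one per line.
expand_ip_template() {
    template=$1
    while IFS= read -r ip; do
        [ -n "$ip" ] || continue
        # Substitute every %ip% in the template with the current address.
        printf '%s\n' "$template" | sed "s/%ip%/$ip/g"
    done
}

# Example with two made-up addresses; the resulting commands are printed, not run.
printf '172.16.4.10\n172.16.4.11\n' | expand_ip_template 'ssh root@%ip% uptime'
```

Printing the expanded commands first, as done here, is a cheap way to check a template before running it against 100 nodes.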