Dear fimm cluster users:

We have recently added two extra login nodes (login2 and login3) to fimm.bccs.uib.no. The current login1 will go under maintenance for a short period of time and will be added back afterwards.

We kindly ask you to save your current work on login1, log off from login1, and log in again; you will then land on either login2 or login3. When you run "ssh fimm", the DNS server picks a login node according to round-robin scheduling.
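
To illustrate what round-robin selection means here, the following is a minimal sketch in Python, not the actual DNS configuration. The node list and the function names are assumptions made for illustration. Real round-robin DNS rotates the order of the returned address records, and resolver caching can mean that consecutive logins do not alternate strictly.

```python
from itertools import cycle

# Assumed pool for illustration: login1 is out of rotation during its maintenance.
LOGIN_NODES = ["login2", "login3"]


def assign_sessions(nodes, count):
    """Return the node each of `count` consecutive connections would get
    under a strict round-robin rotation over `nodes`."""
    picker = cycle(nodes)  # repeats the node list endlessly, in order
    return [next(picker) for _ in range(count)]


if __name__ == "__main__":
    for i, node in enumerate(assign_sessions(LOGIN_NODES, 4), start=1):
        print(f"connection {i} -> {node}")
```

Once login1 is back, the same rotation would simply run over three names instead of two.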

Eventually we will have three login nodes on fimm (login1, login2, login3). All of them have identical hardware, and you can ssh between the login nodes.

We have done this to increase redundancy and uptime.

Let us know if this causes any problems for you.

Hexagon is going to have scheduled maintenance on April 23rd. The
maintenance will start at 9:30. The expected downtime is about 12 hours.

During the maintenance we are going to do the following:

* Upgrade the compute node Linux to 4.2UP02.
* Upgrade the management station base OS and Cray software release.
* Apply various security patches.
* Upgrade the storage firmware.

All running jobs will be terminated. The job submission system has a
reservation in place; it will not start jobs that would not be able
to finish before the maintenance starts.
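
The effect of the reservation can be sketched as a simple check on the requested walltime. This is an illustrative Python sketch, not the scheduler's actual logic; the function name and the year in the date are assumptions.

```python
from datetime import datetime, timedelta

# Maintenance start from the announcement; the year is assumed for illustration.
MAINTENANCE_START = datetime(2014, 4, 23, 9, 30)


def can_start(now, walltime, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` with the requested `walltime`
    would finish no later than `maintenance_start`."""
    return now + walltime <= maintenance_start


if __name__ == "__main__":
    evening_before = datetime(2014, 4, 22, 21, 0)
    print(can_start(evening_before, timedelta(hours=24)))  # would overlap the maintenance
    print(can_start(evening_before, timedelta(hours=12)))  # would end before it
```

Under this model, a job that requests a shorter walltime may still be able to start before the maintenance window.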

Update 20:10: The maintenance is over. The machine is back online.

Hexagon has updated compilers and libraries.

Please read the full description/changelog in this announcement:

http://docs.cray.com/books/S-9407-1403//S-9407-1403.pdf

Updated software:

cray-mpich 6.2.2 -> 6.3.0
pmi 5.0.2 -> 5.0.3

cce 8.2.4 -> 8.2.5
pgi 14.1.0 -> 14.2.0
cray-ccdb 1.0.1 -> 1.0.2
xt-asyncpe 5.25 -> 5.26
totalview 8.12.0.1 -> 8.13.0

papi 5.2.0 -> 5.3.0
perftools (craypat) 6.1.3 -> 6.1.4

cray-libsci 12.1.3 -> 12.2.0
cray-petsc 3.4.2.3 -> 3.4.3.1
cray-tpsl 1.3.04 -> 1.4.0
cray-ga 5.1.0.3 -> 5.1.0.4
cray-trilinos 11.4.1.0 -> 11.6.1.0

In addition, the following modules were removed:

chapel 1.4.0, 1.5.0, 1.7.0, 1.7.0.1
totalview 8.9.2, 8.10.0
pgi 11.10, 12.9, 13.6
xt-asyncpe 5.07 -> 5.15
cce 8.2.0
fftw 3.3.0.2
intel 13.1.163

We will have maintenance in the machine room on the 5th of
April. An electrician will work on the power line in the machine room,
which requires the electricity to be switched off completely.

Therefore the fimm.bccs.uib.no cluster and the grunch server will be
shut down for 3 hours. We have reserved the cluster for maintenance, which
means that submitted jobs which cannot finish by that time
will not run, and jobs which are already running but cannot
finish by that time will be killed.

Maintenance will start at 09:00 in the morning. We advise
you to save all your work on fimm.bccs.uib.no and
grunch.bccs.uib.no by that time.

We are sorry for the inconvenience and appreciate your understanding.

Hexagon has updated software and libraries.

Please see http://docs.cray.com/books/S-9407-1402//S-9407-1402.pdf for full release notes.

cce 8.2.3 -> 8.2.4
pgi 13.10.0 -> 14.1.0
xt-asyncpe 5.24 -> 5.25
cray-ccdb 1.0.0 -> 1.0.1
cray-lgdb 2.2.3 -> 2.2.4

cray-mpich 6.2.1 -> 6.2.2

cray-ga 5.1.0.2 -> 5.1.0.3
cray-hdf5 1.8.11 -> 1.8.12
cray-netcdf 4.3.0 -> 4.3.1
cray-parallel-netcdf 1.3.1.1 -> 1.4.0

The grunch maintenance is finally over.

* Firmware has been updated to the latest version.
* OS has been updated to CentOS 6.4.
* grunch has been added to the new fimm cluster.

grunch users can now use the software installed on fimm with the "module" command.

Please let me know if you have problems logging in or if you need more software installed.

One of the OSSes crashed during the night, leaving /work-common unavailable.
We are working to fix it ASAP.

Update 10:25: We have disabled ost25. Access to /work-common/shared/bjerknes is not possible; we will try to restore access ASAP.

Update 12:53: The failed ost25 has problems with its RAID controller. We expect a Dell technician to replace the controller tomorrow.

Update 28 Jan 12:02: The RAID controller was replaced, ost25 is back in the system, and access to /work-common/shared/bjerknes should be restored.