Fast Write Cache batteries on ssa0, ssa1 and ssa2 on node TO were replaced while a plumber worked on the cooling water. No problems.
Downtime on the regatta and linux cluster:
20040701 08:00-10:10 = 2 hours, 10 minutes
Author Archives: lsz075
Totalview 6.5
Totalview has been upgraded to version 6.5 on the linux cluster and the IBM regatta. On the regatta, use the command 'module load totalview' to set up the environment for totalview. On linux it is already set up.
libgoto.a
The high performance BLAS library from Kazushige Goto has been installed in /usr/local/lib/libgoto.a.
For more information see http://www.cs.utexas.edu/users/flame/goto/.
Scheduled maintenance Thursday, July 1, 08:00-12:00
The regatta and linux cluster will be down for maintenance on Thursday, July 1, 08:00-12:00. Jobs still running on Thursday morning will be killed.
The maintenance consists of replacing the disk cache memory batteries and doing some work on the cooling system in the machine room.
node32 in linux cluster back online
The power supply in node32 in the linux cluster failed last week. IBM has replaced it, and the node is now back online.
XL Fortran compiler upgrade
The May 2004 XL Fortran V8.1 Compiler and Runtime PTF was installed.
Power outage – machines down
On Friday, May 28 at 11:45, both tre.ii.uib.no and fire.ii.uib.no crashed due to a power outage (an electrician in the neighbouring machine room mistakenly triggered an emergency power stop).
tre.ii.uib.no came back online at 17:11; downtime 5 hours, 26 minutes.
After the reboot, 6 nodes on the cluster fire.ii.uib.no were down, and the queueing system was down as well. fire.ii.uib.no came back online on Monday, May 31 at 20:50 (3 nodes were still down); downtime 81 hours, 5 minutes.
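The downtime figures above can be checked with a short calculation. The following is only a minimal sketch: the helper name `downtime` is hypothetical, and the year 2004 is an assumption taken from the other dated entries in this log.

```python
from datetime import datetime


def downtime(start: datetime, end: datetime) -> str:
    """Return the elapsed time between start and end as 'H hours, M minutes'."""
    total_minutes = int((end - start).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours, {minutes} minutes"


# Times as reported in the announcement (year 2004 is assumed).
outage = datetime(2004, 5, 28, 11, 45)     # Friday, power outage
tre_back = datetime(2004, 5, 28, 17, 11)   # tre.ii.uib.no back online
fire_back = datetime(2004, 5, 31, 20, 50)  # Monday, fire.ii.uib.no back online

print("tre.ii.uib.no: ", downtime(outage, tre_back))
print("fire.ii.uib.no:", downtime(outage, fire_back))
```

Both results agree with the figures reported for tre.ii.uib.no and fire.ii.uib.no.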
Intel compilers upgraded on fire.ii.uib.no
The Intel Fortran and C/C++ compilers were upgraded to the latest releases on the linux cluster. This should fix the problems with writing files larger than 2 GB.
intel-icc8-8.0-57
intel-ifort8-8.0-55
M_Map matlab toolbox installed on fire.ii.uib.no
The M_Map toolbox, with TerrainBase and the GSHHS high-resolution coastline database, was installed in MATLAB on the linux cluster. For more information see:
http://www2.ocgy.ubc.ca/~rich/private/mapug.html
http://www2.ocgy.ubc.ca/~rich/map.html
UPS batteries replaced
The UPS batteries and the DC capacitors in the UPS have been replaced. We should now have a fully functional UPS again, giving us 15-30 minutes of battery-backed power.