Upgrade

On Tuesday, June 24th, Hexagon will be taken down for the final quad-core upgrade.

During the upgrade we will bring up parts of the machine so that short jobs can be run.

Updates:
Tuesday 24th, 08:00: Hexagon is shut down for upgrading.

Tuesday 24th, 09:00: Half of hexagon is started, while the other half is upgraded. The rest of the machine will be turned off tomorrow morning (Wednesday) at 08:00 for upgrading. The last two racks will then be turned on and made available until 14:00, when the entire machine will be taken down for the final upgrade. From then on hexagon, including the file system, will be unavailable until the diagnostics and checkout procedures have been completed.

Wednesday 25th, 08:00: Only the last two racks are now running.

Wednesday 25th, 14:00: The entire machine is now down for the upgrade. We will update this page when the diagnostics are completed.

Wednesday 25th, 20:00: The machine is now booted with the final hardware configuration, but is not available to users due to diagnostics and checkout procedures.

Thursday 26th, 23:00: The machine is still going through checkout procedures. Benchmarking for the acceptance test of the system will start tomorrow. More information on when the system will be available to users will come Friday at 11:00.

Friday 27th, 11:00: Hexagon is currently running benchmarks. These are scheduled to complete by 18:00 today, at which point users will be allowed to log in.

Friday 27th, 18:00: Hexagon is now available for users. Note that it has a scheduled slot for further benchmarking on Tuesday, July 8th, starting at 16:00. Jobs need to request a walltime short enough to finish before then.
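
As a rough illustration of the walltime limit above, the following sketch computes the longest walltime a job can request and still finish before the benchmarking slot. The function name max_walltime, the 5 minute safety margin and the year 2008 are assumptions made for this example; they are not taken from the scheduler configuration.

```python
from datetime import datetime, timedelta


def max_walltime(now, reservation_start, margin=timedelta(minutes=5)):
    """Return the longest walltime (HH:MM:SS) that ends before the reservation.

    The margin is a hypothetical safety buffer, not a scheduler setting.
    """
    remaining = reservation_start - now - margin
    if remaining <= timedelta(0):
        # No time left before the reserved slot starts.
        return "00:00:00"
    total_seconds = int(remaining.total_seconds())
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Assumed year: 2008 (the announcement itself gives no year).
reservation = datetime(2008, 7, 8, 16, 0)   # Tuesday July 8th, 16:00
submitted = datetime(2008, 6, 27, 18, 0)    # Friday 27th, 18:00

print(max_walltime(submitted, reservation))  # 261:55:00
```

A job submitted closer to the slot gets a correspondingly shorter limit, and a result of 00:00:00 means the job has to wait until the benchmarking is over.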

On Friday, May 9th, the backup system will be unavailable for a short time because of an upgrade of our system. File systems like /migrate and /bcmhsm will be unavailable during this upgrade, which will start at 12:00 and is scheduled to finish at 15:00.

Update: 15:30: Upgrade is finished.

Early on March 26th hexagon will be shut down for the initial quad-core upgrade. We hope to have parts of the machine up while the second half is upgraded. Even so, the entire machine will have to be taken down first, before being booted to a smaller size. The physical upgrade will probably take three days. There will then be some more days of tuning and reconfiguring.

One very important consequence is that ALL programs and libraries will have to be re-compiled when hexagon is booted up after the upgrade is finished.

Wednesday, 09:00: Upgrade has started. Machine is now down for a while for diagnostics.

Wednesday, 12:30: Half of the machine is now running again, while the other half is being upgraded to quad-core. We expect to take the entire machine down Friday morning. Please consider the machine to be in a testing state, so unannounced downtime might occur.

Wednesday, 16:45: The upgrade is ahead of schedule, so the machine will be taken down tomorrow around 10am.

Thursday, 12:00: Two racks are now running and will stay up until tomorrow morning, Friday 28th, when the entire machine will be shut down at 8am. The machine will then stay down until Monday at the earliest.

Friday, 08:00: The hardware part of the upgrade is now finished. The machine is unavailable until the software upgrade, diagnostics and testing have finished.

Saturday, 17:00: The main part of the software upgrade is finished. The machine is running, but is unavailable due to testing.

Tuesday, April 1st, 18:00: Hexagon is now available again. See http://www.parallaw.uib.no/syslog/153 for more details.

The scheduled maintenance of the fimm cluster is now (mostly) complete. Please note the following changes:

- Cluster is now running Rocks 4.3 which is based on CentOS 4.5
- Login to fimm.bccs.uib.no now ends up on one of the compute nodes acting as a login node. Currently this is called compute-1-14.
- Compilers are upgraded to Intel 10.0 and PGI 7.0
- Totalview is upgraded to 8.2
- MPI libraries are upgraded and located in /local
- Several libraries and programs in /local are upgraded

All jobs that were waiting on the old queue need to be submitted again into the new queue after the upgrade.

Send questions to support-uib@notur.no

Because of newly installed security updates, tre must be rebooted. This will hopefully also solve the problems with the Totalview debugger.
Expected downtime: 1h (starting from Mon, 10:00)

Update: Mon, 12:45 - A disk import problem caused a longer downtime. Everything should be up and running again.


Downtime: 2h 45'