System maintenance is still ongoing and will continue throughout today.
Update 2014.11.25 18:00 Due to unexpected behaviour during the update, we regret to inform you that the maintenance has to be extended. We will follow up with further updates.
Update 2014.11.25 21:27 We have to postpone the opening of Hexagon due to issues with the scheduling system. We are working closely with Cray to fix this issue.
Update 2014.11.26 20:33 Issues with the job submission system require us to delay the opening. The system may well not be opened before next week. We are trying to fix it as soon as possible.
Update 2014.11.27 11:24 The majority of the issues have been resolved and Hexagon is now available. One of the main remaining issues is interactive job submission, which will be handled next week without stopping the machine for extra maintenance.
Author Archives: lsz075
Hexagon: /work-common maintenance finished
The scheduled maintenance of /work-common has finished, and the file system is available again on Fimm.
Hexagon: maintenance started
Hexagon maintenance has started as planned. Maintenance work is ongoing.
As part of the maintenance, /work-common will be unmounted, which also makes it unavailable on Fimm.
Hexagon: scheduled maintenance on November 24th, 9:00
Hexagon will undergo planned maintenance on November 24th and 25th.
The maintenance will start on the 24th at 9:00, and we expect it to take
around 2 days.
The job submission system has a reservation in place, so jobs that
cannot finish before the maintenance begins will not be started.
We are planning to do the following during the maintenance window:
* Update Cray Management Software to 7.2.UP02
* Update Cray Linux Environment to 5.2.UP02
* Update Lustre file system on /work and /work-common
* Apply different firmware updates and patches
* Install newer libraries, compilers and tools
IMPORTANT: After the maintenance, the default MPT will be 7.x (MPICH
3.1). All software will have to be recompiled. We will provide
details and options after the maintenance.
Hexagon: removed software
In order to prepare space for the software updates, we've removed the following packages:
* trilinos: 11.4.1.0, 11.2.2.0, 10.12.1.1, 11.0.3.0
* libsci: 12.0.00, 12.0.01, 12.1.00
* petsc: 3.3.04, 3.4.2.0, 3.4.3.0
* PGI: 13.8.0, 14.2.0
* xt-libsci: 11.0.05, 11.1.00
* gcc: 4.6.1, 4.6.2, 4.7.0, 4.7.1, 4.7.2
The notification was sent on 17 Sep 2014 15:04:17.
Hexagon: SSH server and ECDSA key updated
We have updated the SSH server and the ECDSA key for security reasons.
Since the ECDSA key has changed, you might get a warning when connecting to Hexagon, similar to the following:
---------------------------------------------------------------------------------------------------------------------------------
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
-- LINE REMOVED BY SYSADMIN --.
Please contact your system administrator.
Add correct host key in ~/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in ~/.ssh/known_hosts:1
remove with: ssh-keygen -f "~/.ssh/known_hosts" -R hexagon.bccs.uib.no
ECDSA host key for hexagon.bccs.uib.no has changed and you have
requested strict checking.
Host key verification failed.
-------------------------------------------------------------------------------------------------------------------------------
Please note that the other keys were not changed. If you get a similar error that does not refer to the ECDSA key, do not follow the procedure below.
Please run the following commands:
ssh-keygen -f ~/.ssh/known_hosts -R hexagon.bccs.uib.no
ssh-keygen -f ~/.ssh/known_hosts -R hexagon
If you still encounter problems connecting to Hexagon, please run your ssh command with the "-v" option for verbose output and send the output to HPC support.
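The procedure above only applies when the warning names the ECDSA key, so it can help to check the warning text before removing anything. The following is a minimal sketch of that check, not an official tool: the function names `is_ecdsa_key_warning` and `remove_hexagon_entries` are hypothetical, and the matched phrase is taken from the sample warning shown above. The script uses `$HOME` because the shell does not expand `~` inside double quotes.

```shell
#!/bin/sh
# Sketch only: both function names are made up for illustration.

# Succeeds only if the warning text on stdin refers to the ECDSA host key,
# using the wording of the sample warning ("ECDSA host key for ... has changed").
is_ecdsa_key_warning() {
    grep -q 'ECDSA host key for .* has changed'
}

# Removes the stale entries for both host names from a known_hosts file.
# The file defaults to $HOME/.ssh/known_hosts; a different path can be
# passed as the first argument.
remove_hexagon_entries() {
    known_hosts="${1:-$HOME/.ssh/known_hosts}"
    for host in hexagon.bccs.uib.no hexagon; do
        ssh-keygen -f "$known_hosts" -R "$host"
    done
}
```

One possible way to use it is to save the failed connection attempt's output to a file, for example `warning.txt` (an assumed name), and then run `is_ecdsa_key_warning < warning.txt && remove_hexagon_entries`. If the warning names a different key type, nothing is removed.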
Hexagon: /work down because of an OSS crash
One of the OSSes crashed this morning, making /work unavailable.
We are working on fixing it.
Update 10:57: The OSS was recovered and /work is available for use.
Update 11:35: The file system became unstable once again. We are working on finding a solution.
Update 13:15: The issues were remedied and the /work file system has been stable since 12:15.
Hexagon: scheduled maintenance on October 2nd, 9:00
The maintenance will start at 9:00, and we expect it to take around 12 hours.
The job submission system has a reservation in place, so jobs that cannot finish before the maintenance begins will not be started.
During this timeslot we are going to do the following:
* Install newer libraries, compilers and tools
* Update Cray Management Software to 7.2.UP00
* Update Cray Linux Environment to 5.2.UP00
* Apply different firmware updates and patches
IMPORTANT: After the maintenance, the default MPT will be 7.0.x (MPICH 3.1). All software will have to be recompiled. We will provide details and options after the maintenance.
The detailed list of the new software being installed:
* CCE 8.3.3
* MPT 7.0.3
* PMI 5.0.5
* Perftools 6.2.1
* PAPI 5.3.2
* LibSci 13.0.1
* Trilinos 3.5.1.0
* GCC 4.9.1
* PGI 14.7.0
* HDF5 1.8.13
* Netcdf 4.3.2
* Parallel-NetCDF 1.5.0
* Craype 2.2.0
* ATP 1.7.5
* LGDB 2.3.2
* Stat 2.1.0.1
* Dwarf 14.2.0
* CCDB 1.0.3
* TotalView 8.14
Hexagon: high-speed network down
Hexagon is down because the high-speed network went down. We are working to fix it.
Update 21.08 13:05 Hexagon is up.
Hexagon: down because of cooling
Hexagon was shut down because of issues with cooling. We suspect that the cooling failed because of a thunderstorm.
We are working to fix the problem ASAP.
Update 03.08 00:50 Hexagon is up.