Hexagon will undergo scheduled maintenance on June 16th, starting at 9:00. We expect to finish in the evening of the same day.
During this maintenance slot we are going to upgrade the queue system and perform some extra tasks, including replacing an IO card on the metadata server.
Access to the machine will be closed and all running jobs will be terminated during this maintenance window. The queuing system has a reservation in place so that jobs which cannot finish before the maintenance will not start. We expect that idle jobs in the scheduler will not be affected.
Update: 2015-06-16 09:15 - Scheduled maintenance has started.
Update: 2015-06-16 23:48 - Maintenance has finished. We had to clear all jobs from the queue system, including idle and blocked ones. Please resubmit your jobs.
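The effect of the maintenance reservation mentioned in the announcement above can be illustrated with a minimal sketch. This is not the scheduler's actual logic; the function name `can_start` and the example submission time are hypothetical, and the only assumption is that a job is held when its requested walltime would run past the start of the reservation.

```python
from datetime import datetime, timedelta


def can_start(now: datetime, walltime: timedelta, reservation_start: datetime) -> bool:
    """Return True if a job started at `now` would end by `reservation_start`.

    Hypothetical helper for illustration only; a real scheduler also
    considers node availability, priorities and other policies.
    """
    return now + walltime <= reservation_start


# Maintenance start taken from the announcement: June 16th, 9:00.
maintenance_start = datetime(2015, 6, 16, 9, 0)

# Hypothetical moment at which the scheduler considers the job.
now = datetime(2015, 6, 15, 21, 0)

print(can_start(now, timedelta(hours=8), maintenance_start))   # True
print(can_start(now, timedelta(hours=24), maintenance_start))  # False
```

In practice this means that requesting a shorter walltime may allow a job to run before the maintenance window, while a job requesting more time than remains stays queued.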
We have problems with the /work filesystem and are looking into them.
Update 13:05: The issues were remediated and the filesystem is available again. Please contact us if you still encounter issues accessing it.
Cray Debugging Support Tools - CDST 15.04 - ATP 1.8.1, CCDB 1.0.6, lgdb 2.4.2
Cray Scientific and Math Libraries - CSML 15.04 - PETSc 3.5.3.0, Trilinos 11.12.1.2, TPSL 1.4.4
Cray Environment Setup and Compiling support - CENV 15.04 - craypkg-gen 1.3.1, craype 2.3.0, cray-modules 3.2.10.3
Please find more details about Chapel here and other packages here.
We will soon change the default versions to the ones from the previous PE install; a separate notice will follow.
Someone managed to kill login3 and login4 by oversubscribing memory. As a result, the jobs started from these login nodes were killed.
We've restarted these login nodes and will investigate the cause tomorrow.
We have occasionally seen NFS timeouts on different login nodes; logged-in users experience them as short hangs. This has been going on for the last few weeks, but not that often. In the last week the timeouts have become very frequent and affect almost all nodes.
We've applied a patch which is supposed to fix this issue. For the changes to be picked up, we need to restart Hexagon.
Update 15:30: Hexagon is up again.