Some of the OSTs serving the /work filesystem have become full, causing a few jobs to fail. We are working on rebalancing usage between the OSTs, but this is fairly difficult since /work is 87% used at the moment. We have notified the top users of the /work filesystem and asked them to clean up unnecessary files.
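If you want to check your own usage before cleaning up, the following is a minimal sketch rather than an official tool. The helper name list_large_files, the default path /work/$USER and the 1G threshold are assumptions for illustration; adjust them to your own directory layout.

```shell
# Hypothetical helper: list files larger than a given size under a directory,
# largest first (sizes in KiB as reported by du).
# The default directory /work/$USER is an assumption, not a documented path.
list_large_files() {
    dir="${1:-/work/$USER}"      # directory to scan
    min_size="${2:-1G}"          # size threshold in GNU find syntax, e.g. 500M or 1G
    find "$dir" -type f -size +"$min_size" -exec du -k {} + 2>/dev/null | sort -rn
}
```

For example, list_large_files /work/$USER 10G would print files above 10 GiB. On a Lustre client, lfs df -h /work shows the fill level of each OST.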
Due to several bugs in the queuing system, mostly affecting OpenMP jobs, the nodes and ppn directives are deprecated.
The new way of submitting OpenMP jobs is covered on the HPC docs site, available here:
https://docs.hpc.uib.no/wiki/Job_execution_(Hexagon)#Parallel.2FOpenMP_jobs
If you have any questions on how to change your script, please contact us.
Scheduled maintenance on the UPS and UPS power lines in the HPC server room will take place on Saturday, 20th Feb. All HPC resources will be stopped at 8:30; we expect the maintenance to finish before 17:00 the same day.
Hexagon, FImm, Grunch and other resources connected to them will be unavailable. The Hexagon queuing system has a reservation in place, so jobs that cannot finish before the maintenance will not be started.
Update:
2016-02-20 07:45 System maintenance has started.
2016-02-20 16:30 The /work-common filesystem storage was damaged; recovery is in progress.
2016-02-21 07:30 System maintenance has finished, HPC systems are functional again.
Grunch had issues with the network connection. It was rebooted and should be available very soon.
We have installed new libraries, compilers and tools on Hexagon.
Below you will find the complete list of the newly installed software:
- CCE 8.4.3
- Chapel 1.12.0
- Craype 2.5.1
- GCC 5.2.0
- FFTW 3.3.4.6
- Intel Compiler 16.0.1
- HDF5 1.8.16
- LibSCI 1.13.0
- MPI 7.3.1
- PGI 15.10.0
- Totalview 8.15.10
We have disabled passing users' environment over SSH due to multiple issues.
The default environment settings are:
LANG=en_US.UTF-8
LC_ALL=en_US.UTF-8
You can override them by adjusting your ~/.profile file.
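As a minimal sketch of such an override, the lines below could be added to ~/.profile. The choice of the C locale is only an illustrative assumption; use whichever locale your workflow needs.

```shell
# Example ~/.profile fragment (illustrative values, not a site recommendation).
# Replaces the default en_US.UTF-8 settings with the plain C locale.
export LANG=C
export LC_ALL=C
```

The change takes effect at the next login, and running locale afterwards shows the active settings.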
The deadline for NOTUR applications for period 2016.1 is 26 January.
For more info, see: https://www.sigma2.no/content/reminder-call-proposals-20161
The scheduler system on Hexagon does not work well with the mppdepth directive.
Due to increased usage of OpenMP on the machine, we had to stop supporting mppdepth.
The new way of running OpenMP is as simple as it was before with mppdepth.
Updated documentation on how to run an OpenMP job can be found here:
https://docs.hpc.uib.no/wiki/Job_execution_(Hexagon)#Parallel.2FOpenMP_jobs
If you have any questions on how to change your script, please contact us.
Hexagon went down due to a brief power interruption.
Update: 2015-12-10 20:47 Machine is up again.
We experienced a High Speed Network link error caused by cabinet fall-outs.
The cabinets fell out around 02:10 on 14-11-2015, most likely due to power spikes. We are still investigating the issue.
Update: the system has been up again since 04:45 on 14-11-2015.