Author Archives: Alexander Oltu

Most of the login nodes currently have a high disk (I/O) load, mostly due to ongoing copy processes.

You can find less busy nodes with the following workaround:

 module load pdsh
 pdsh -w login[1-5] uptime
login2: 11:05am up 14 days 19:06, 18 users, load average: 4.62, 4.55, 3.98
login3: 11:05am up 14 days 19:06, 7 users, load average: 2.47, 2.96, 2.89
login1: 11:05am up 14 days 19:06, 9 users, load average: 16.21, 11.97, 13.34
login4: 11:05am up 14 days 19:06, 13 users, load average: 0.68, 0.31, 0.21
login5: 11:05am up 14 days 19:06, 8 users, load average: 40.72, 35.99, 23.38

In this example login4 is the least busy and login5 is heavily overloaded, so you can ssh to login4 and try working there.
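If you do not want to compare the numbers by eye, the same output can be filtered automatically. The following is only a sketch: the helper name `least_loaded` is made up for illustration, and it assumes the Linux `uptime` format shown above ("load average: a, b, c") with a dot as decimal separator.

```shell
# Print the host with the lowest 1-minute load average.
# Reads pdsh-style lines ("host: ... load average: a, b, c") on stdin.
# least_loaded is a hypothetical helper name, not part of pdsh.
least_loaded() {
    awk -F'load average: ' '
        NF == 2 {
            split($1, head, ":")      # head[1] is the host name prefixed by pdsh
            split($2, loads, ", ")    # loads[1] is the 1-minute load average
            if (best == "" || loads[1] + 0 < min) {
                min = loads[1] + 0
                best = head[1]
            }
        }
        END { if (best != "") print best }
    '
}

# Usage on the cluster (after "module load pdsh"):
#   pdsh -w login[1-5] uptime | least_loaded
```

The 1-minute value reacts fastest to changes; if you prefer a steadier picture, compare `loads[2]` or `loads[3]` (the 5- and 15-minute averages) instead.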

We will look into how to reduce the effect of file transfers on interactive user sessions. As a general rule, we recommend running file transfers at night to reduce the disk load on the login nodes during interactive sessions.

Due to physical rearrangements in the server room the tape robot hosting /migrate and /bcmhsm will be unavailable today after 12:00 for several hours. Updates will be posted here.

Update 2017-09-11:

Uni Computing is experiencing trouble with the backend holding /migrate and /bcmhsm, and it is not yet known when this will be fixed. As these file systems were supposed to be decommissioned in June this year, we will not mount them back in their ordinary place even after they are healthy again. However, we will finish the transfer of IMR/HI files, as agreed, as soon as the file systems are healthy. We will issue a separate update for this.
Users other than IMR/HI who need files from these file systems are advised to contact the Uni Computing helpdesk at trouble@computing.uni.no.

The following new software has been INSTALLED:
  • Intel Compilers 2017 Update 4 (module: intel/17.4.056, non-default)

The following unused software has been REMOVED:

  • PGI
    • 13.9.0
    • 14.2.0
    • 14.3.0
  • Intel Compilers 12.1.5.339
  • Chapel
    • 1.6.0
    • 1.7.0.2
    • 1.8.0
    • 1.9.0
    • 1.10.0
  • Cray-Libsci
    • 12.0.02
    • 12.2.0
    • 13.3.0
  • GCC
    • 4.6.1
    • 4.8.1
    • 4.8.2
    • 5.2.0
  • Cray-Trilinos
    • 11.12.1.2
    • 11.12.1.5
Something may have stopped working for you because of extra dependencies on the removed software. If that is the case, please let us know via regular support; we will be happy to help you.
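To check in advance whether your own job scripts still refer to a removed version, a search along the following lines may help. This is a sketch under assumptions: the `name/version` module strings are guessed from the list above and may not match the actual module names on the system, and `find_removed_modules` is a hypothetical helper.

```shell
# Module strings assumed from the removal list; verify the real names before relying on them.
removed='
pgi/13.9.0 pgi/14.2.0 pgi/14.3.0
intel/12.1.5.339
chapel/1.6.0 chapel/1.7.0.2 chapel/1.8.0 chapel/1.9.0 chapel/1.10.0
cray-libsci/12.0.02 cray-libsci/12.2.0 cray-libsci/13.3.0
gcc/4.6.1 gcc/4.8.1 gcc/4.8.2 gcc/5.2.0
cray-trilinos/11.12.1.2 cray-trilinos/11.12.1.5
'

# Recursively list file:line:text for every mention of a removed module under a directory.
find_removed_modules() {
    dir=$1
    for m in $removed; do
        # -F: match the string literally; "|| true" keeps going when nothing matches
        grep -rnF -- "$m" "$dir" || true
    done
}

# Usage (hypothetical directory holding your job scripts):
#   find_removed_modules "$HOME/jobs"
```

No output means none of the listed strings were found; it does not rule out indirect dependencies, for example through a default module that used to resolve to one of these versions.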

Maintenance on the UPS and UPS power lines in the HPC server room is scheduled for Saturday, 20th Feb. All HPC resources will be stopped at 8:30; we expect the maintenance to finish before 17:00 the same day.

Hexagon, FImm, Grunch and other resources connected to them will be unavailable. The Hexagon queuing system has a reservation in place, so jobs that cannot finish before the maintenance will not be started.

Update:
2016-02-20 07:45 System maintenance has started.
2016-02-20 16:30 The /work-common filesystem storage got damaged; recovery is in progress.
2016-02-21 07:30 System maintenance has finished, HPC systems are functional again.