Downtime

Update 2018-12-03 12:36:
  • Hexagon is up now.
  • The interconnect errors have been cleared, and the /work file system is up and functional again.
  • Unfortunately, the previously submitted jobs had to be canceled. Please resubmit your jobs.

Dear Hexagon User,

We must reboot Hexagon due to repeated errors on the interconnect.
We will update this case when Hexagon is up and functional again.

login5 ran out of memory yesterday (27.02.2017) around 18:16 and took about 15 minutes to recover.

During this time, the compute nodes were unable to contact the application scheduler running on login5, and some jobs might have crashed.
A typical error message for this case is: "aprun: Apid nnnnnnn: close of the compute node connection after app startup barrier".
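
To check whether one of your jobs was affected, you can search the job output files for this message. The following is only a minimal sketch, not an official tool: it assumes that "nnnnnnn" in the message stands for a numeric application ID, and the function names and the file paths passed on the command line are illustrative.

```python
import re
import sys

# Pattern built from the error message quoted in this notice. Assumption:
# the "nnnnnnn" placeholder stands for a numeric application ID (Apid).
APRUN_ERROR = re.compile(
    r"aprun: Apid (\d+): close of the compute node connection "
    r"after app startup barrier"
)


def find_affected_apids(text):
    """Return the Apids of all matching aprun errors found in text."""
    return APRUN_ERROR.findall(text)


def scan_files(paths):
    """Map each given file path to the list of Apids found in it."""
    hits = {}
    for path in paths:
        # errors="replace" keeps the scan going on undecodable bytes.
        with open(path, errors="replace") as handle:
            apids = find_affected_apids(handle.read())
        if apids:
            hits[path] = apids
    return hits


if __name__ == "__main__":
    # Usage (hypothetical file names): python check_jobs.py job1.out job2.out
    for path, apids in scan_files(sys.argv[1:]).items():
        print(f"{path}: affected (Apid {', '.join(apids)})")
```

Any file listed by the script contains the error above, so the corresponding job is a candidate for resubmission.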

We apologise for any inconvenience caused.