Software

Several libraries and compilers have been updated on hexagon.

NOTE: We have found that the module xtpe-barcelona was not loaded by default for a time. If you did not load it manually during that period, your programs will not be fully optimized for hexagon. Please log out and in again and re-compile your programs.
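
One way to confirm that the module is loaded before re-compiling is to inspect the LOADEDMODULES variable maintained by the modules package. The following is a minimal sketch for a POSIX shell; the helper name is_module_loaded is hypothetical and not part of any installed tool.

```shell
# Hypothetical helper: succeed if a module whose name starts with $1
# appears in the colon-separated LOADEDMODULES list.
is_module_loaded() {
    case ":${LOADEDMODULES:-}" in
        *":$1"*) return 0 ;;
        *)       return 1 ;;
    esac
}

if is_module_loaded xtpe-barcelona; then
    echo "xtpe-barcelona is loaded"
else
    echo "xtpe-barcelona is NOT loaded: log out and in again, or run 'module load xtpe-barcelona'"
fi
```

The match is a prefix match, so it also accepts versioned entries such as a name followed by a slash and a version number.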

Note also that "xt-atp" has been renamed to "atp".

Updated libraries/compilers:

* xt-asyncpe 3.7
Bug Fixes and support for the CCE 7.2 compilers with DSLs.
* Libsci 10.4.2
OpenMP/SMP support and dynamic shared library support for
the CCE compiler.
* Trilinos 10.0.1
Performance enhancements.
* hdf5-netcdf 1.7
Support for the CCE C++ ABI-compliant compiler.
* MPT 4.0.2
Support for the CCE C++ ABI-compliant compiler.
* Cray Debugger tools
ATP 1.0.1
STAT 1.0.0
MRNet 2.2.0.1
Initial release of statview as part of STAT. Bug fixes to
ATP and MRNet.
* PGI 10.1.0 and 10.2.0
Bug Fix releases of PGI.
* GCC 4.4.3
Bug fix release of GCC.

More information:

xt-libsci:

Xt-libsci 10.4.2 contains dynamic shared libraries for the Cray compiler.
This release also contains new dynamic shared libraries for barcelona,
istanbul and mc12 hardware.

The multi-threaded libsci implementation has been significantly enhanced
for shared-memory parallel programs. The new implementation uses
OpenMP, so the previous environment variable GOTO_NUM_THREADS is
no longer used.
Speedups of 2X or more are common for multi-threaded Level 2 BLAS
routines, and Level 3 BLAS routines are also significantly faster,
when running with OMP_NUM_THREADS greater than 1.
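
Job scripts that still set GOTO_NUM_THREADS need to set OMP_NUM_THREADS instead. The sketch below shows one possible way to carry an old setting over; the helper name migrate_goto_threads is hypothetical, and the aprun line in the comment is only an illustration of a launch with one core reserved per thread.

```shell
# Hypothetical helper: copy an old GOTO_NUM_THREADS setting to
# OMP_NUM_THREADS unless OMP_NUM_THREADS is already set.
migrate_goto_threads() {
    if [ -z "${OMP_NUM_THREADS:-}" ] && [ -n "${GOTO_NUM_THREADS:-}" ]; then
        OMP_NUM_THREADS="$GOTO_NUM_THREADS"
        export OMP_NUM_THREADS
    fi
    # The OpenMP-based libsci no longer reads this variable.
    unset GOTO_NUM_THREADS
}

migrate_goto_threads
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"
# A launch line in a job script could then look like (illustrative only):
#   aprun -n 1 -d "$OMP_NUM_THREADS" ./a.out
```

An existing OMP_NUM_THREADS value always wins, so the helper can be added to a script without changing runs that were already configured for OpenMP.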

Loader options for OpenMP support:
To use the OpenMP libraries, use the link-time options shown below.
The examples are for the Barcelona (quad-core) processor, as selected
by the xtpe-barcelona module.

module load xtpe-barcelona
PGI
cc -mp foo.c *.o -lsci_quadcore_mp
ftn -mp foo.f90 *.o -lsci_quadcore_mp
GNU
cc -fopenmp foo.c *.o -lsci_quadcore_mp
ftn -fopenmp foo.f90 *.o -lsci_quadcore_mp
INTEL
cc -openmp foo.c *.o -lsci_quadcore_mp
ftn -openmp foo.f90 *.o -lsci_quadcore_mp
PATHSCALE
cc -mp foo.c *.o -lsci_quadcore_mp
ftn -mp foo.f90 *.o -lsci_quadcore_mp
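
For build scripts that must work with more than one compiler, the per-compiler OpenMP flags listed above can be collected in a small lookup. This is a sketch only; the helper name omp_flag_for is hypothetical, and it takes the compiler labels exactly as they are written in the listing.

```shell
# Hypothetical helper: print the OpenMP flag for a compiler label,
# following the per-compiler link lines listed above.
omp_flag_for() {
    case "$1" in
        PGI|PATHSCALE) echo "-mp" ;;
        GNU)           echo "-fopenmp" ;;
        INTEL)         echo "-openmp" ;;
        *) echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the resulting link line for a C program.
flag=$(omp_flag_for GNU) &&
    echo "cc $flag foo.c -lsci_quadcore_mp"
```

Printing the command instead of running it keeps the sketch safe to try on a machine without the Cray compiler wrappers.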

Trilinos:

Trilinos is an object-oriented and componentized framework for
scientific computation, and as such allows greater flexibility,
control, portability and performance than a collection of custom
or independent solvers. The CASK library (Cray Adaptive Sparse
Kernels) is integrated with Trilinos to provide extra performance
with no additional involvement required by the user. The Cray
Trilinos package therefore enables the full productivity advantages
of the Trilinos framework while providing solvers tuned specifically
to the Cray XT hardware.

The Trilinos release 10.0.1 includes improved Cray Adaptive Sparse
Kernels (CASK) routines for sparse matrix vector multiplication with
multiple vectors. Applications using Epetra will gain some performance
benefits from this improvement.

The following software/libraries have been updated on Hexagon:

* Trilinos 10.0.0
Initial release of the Trilinos libraries.
* PETSc 3.0.0.9
New dynamic libraries for the Cray Compiler.
* FFTW 2.1.5.2
Add .so libraries for dynamic linking. No other code changes.
* MPT 4.0.1
Bug fixes.
* xt-asyncpe 3.6
Bug fixes.
* Intel Compilers 11.1.064
New release.

Features and Bug fixes in updates

Trilinos 10.0.0
Initial release of Trilinos by Cray Inc.
Documentation:
http://trilinos.sandia.gov/index.html

PETSc 3.0.0.9
New MUMPS 4.9.2
http://mumps.enseeiht.fr/index.php?page=dwnld#cl

Known Problems:
For the PETSc, netCDF, and HDF5 libraries:

Loading a programming environment modulefile modifies the LD_LIBRARY_PATH environment variable, so that it can be used to determine the versions of dynamic shared libraries an executable uses at runtime. An open issue exists for some programming environment modulefiles: LD_LIBRARY_PATH is not updated accordingly when a PrgEnv modulefile is swapped from one compiler to another. For example:

$ module load PrgEnv-pgi
$ module load petsc
$ module swap PrgEnv-pgi PrgEnv-gnu

The LD_LIBRARY_PATH will continue to point to the PGI version of the PETSc library instead of the expected GCC version of the PETSc library. The programming environment library modulefiles affected by this issue are: acml, hdf5, hdf5-parallel, netcdf, netcdf-hdf5parallel, petsc, and petsc-complex. Programming libraries that are loaded by the PrgEnv modulefile, such as xt-mpt and xt-libsci, are not affected by this issue.

The LD_LIBRARY_PATH environment variable can be correctly set by unloading and loading the programming library modulefile. For example:

$ module load PrgEnv-pgi
$ module load petsc
$ module swap PrgEnv-pgi PrgEnv-gnu
$ module unload petsc
$ module load petsc
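
The workaround can be wrapped in a small shell function so that the reload step is not forgotten. The sketch below assumes the usual "module" shell command is available; the helper names swap_prgenv and show_lib_paths are hypothetical.

```shell
# Hypothetical helper: swap the PrgEnv, then unload and reload each
# listed library module so that LD_LIBRARY_PATH is rebuilt.
swap_prgenv() {
    old=$1
    new=$2
    shift 2
    module swap "$old" "$new"
    for lib in "$@"; do
        module unload "$lib"
        module load "$lib"
    done
}

# Hypothetical helper: list the LD_LIBRARY_PATH entries that mention $1.
show_lib_paths() {
    printf '%s\n' "${LD_LIBRARY_PATH:-}" | tr ':' '\n' | grep -- "$1"
}

# Usage on a system with the modules package:
#   swap_prgenv PrgEnv-pgi PrgEnv-gnu petsc
#   show_lib_paths petsc
```

Checking the output of show_lib_paths after the swap is a quick way to see whether the path still points at the library built for the previous compiler.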

FFTW 2.1.5.2
The shared object library files are included in this package for use when dynamically linking applications. This fixes bug:
755718 .so files in fftw 2.1.5.1 missing from both MOM nodes and login

MPT 4.0.1
Bugs Fixed in MPT 4.0.1:
755075 MPICH2 threads/comm/ctxdup.c fails with "Too many communicators" in 4.0.0.3 vs 3.5.1
755698 MPI_Allgatherv hangs when using thread-safety
*NOTE: will work after we update to CLE2.2 (Feb. 8th)

xt-asyncpe 3.6
Bugs fixed in the xt-asyncpe 3.6 update:
755715 trilinos module doesn't swap after PrgEnv-swap

Intel Compilers 11.1.064
New version

The default ssh daemon and client on the hexagon login nodes is now OpenSSH with the HPN patch enabled (http://www.psc.edu/networking/projects/hpn-ssh/).

This was done primarily to allow faster data transfers to and from hexagon. To benefit on the client side, users should use an HPN-patched OpenSSH version and additional flags.

Please read the FAQ.

Update: 04/01 11:27 The switch has been suspended. In the meantime, HPN-enabled OpenSSH can be used as before on port 4222.

Update: 13/01 11:00 HPN-enabled OpenSSH is now back on port 22 as the default ssh server.
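
To see whether a local client is HPN-patched, one option is to look at its version string. The sketch below assumes that HPN builds add an "hpn" tag to the output of "ssh -V", which is typical but should be verified on your own client; the helper name is_hpn_version is hypothetical. The flags to use for transfers are described in the FAQ and are not repeated here.

```shell
# Hypothetical helper: succeed if a version string carries the "hpn" tag
# that HPN-patched builds typically add (an assumption, see above).
is_hpn_version() {
    case "$1" in
        *[Hh][Pp][Nn]*) return 0 ;;
        *)              return 1 ;;
    esac
}

# ssh -V writes its version to stderr, so capture both streams.
version=$(ssh -V 2>&1 || true)
if is_hpn_version "$version"; then
    echo "HPN tag found: $version"
else
    echo "no HPN tag found in: $version"
fi
```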

The PGI compiler has been updated on hexagon to version 10.0.0.

Features of PGI 10.0.0 are documented at:
http://www.pgroup.com/doc/pgiwsrn100.pdf

The following bugs are fixed in the PGI 10.0.0 release.
730617 IVDEP BEFORE ARRAY ASSIGNMENT STATEMENTS [TPR 3425]
738243 LOOP WITH !PGI$ IVDEP NOT VECTORIZED [TPR 4161]
745962 An OpenMP code compiled using PGI 7.2.5 aborts if running on more than one thread.
751609 PGI pgcc with -Xa or -Xc options terminates with signal 11 for misspelled omp_set_num_treads(2);[TPR 16165]
751779 pointer allocation fails with PGI [16024]
752199 PGI OpenMP C++ output is not ordered using a random access iterator. [TPR 16119]
752705 pgcc produces incorrect run-time message for int main(void) {int n=4; int vla[n]; vla[0] = 1;} compiled with -Mbounds[16118]
752946 PGI OpenMP pgcpp issues "PGCC-S-0155-Illegal context for ordered " for pgm compiling OK with pgcc [TPR 16170]
752956 PGI OpenMP C++ omp for schedule(static,1) not being observed for C++ iterators [TPR 16171]
753210 pgCC issues 'warning: variable "buf" was declared but never referenced' when not deserved for OpenMP pgm [TPR 16172]
753339 PGI OpenMP ftn/cc omp_get_schedule returns incorrect output when env var OMP_SCHEDULE=auto [16144]
753349 PGI pgf90 include file omp_lib.h missing omp_sched... parameters [16145]
753520 pgf90 OpenMP - Definitions for omp_sched_* missing from omp_lib.h (new to ver 3 API) [16145]
753786 pgf901: TERMINATED by signal 11, if -O arg is > 1, while compiling Dynamo [16190]
754678 incorrect value for setenv OMP_SCHEDULE auto

The following software/libraries have been updated on Hexagon:

* MPT 4.0.0
Feature release and bug fixes.
* libpmi 1.0
Initial release. This was previously released as part of MPT.
* libpmi-devel 1.0
Initial release. This was previously released as part of MPT.
* xt-asyncpe 3.5
Bug fixes.
* hdf5-netcdf 1.6
Update to version 1.8.4 of hdf5 and bug fixes.
* Java 6.0.17
Security updates.

The following products were removed:

* MPT 3.4.0 and 3.4.1
* Java 6.0.15

Features and Bug fixes in updates

MPT 4.0.0
Features:
* The MPICH2 version on which this MPI is based was upgraded from version
1.0.6p1 to 1.1.1p1 and contains the following main features:
- MPI 2.1 Standard support (except dynamic process management)
- MPI-IO supports MPI_Type_create_resized and
MPI_Type_create_indexed_block datatypes
- Many bug fixes from ANL

* Faster MPI_Allgatherv below 2KB msgs
(10x - 2000x faster based on runs up to 96K pes)

* Faster MPI_Scatterv above 2KB msgs (20% - 80% faster)

* Improved shmem_clear_lock (4x improvement)

* Added SHMEM_ABORT_ON_ERROR env variable for SHMEM programs

* Medium Memory Model is now supported when using the shared libraries
(exception is for the CCE compiler which is planned for a future
release)

* Intel MPI header file compatibility - Initial step in providing
binary compatibility with Intel MPI built applications

* The Process Manager Interface (PMI) which is used to launch MPI
and SHMEM applications is now released as a separate product

Bugs Fixed:
735083 - REQUEST FOR MPICH BUILT FOR -MCMODEL=MEDIUM
744296 - MPI_Type_create_f90_*/MPI_Type_get_envelope not returning
expected value
748615 - If mpi_info_get KEY doesn't exist routine returns FLAG=false,
and output VALUE should be unchanged, but isn't
749487 - MPI_Type_commit dies with Assertion failed
.../datatype/dataloop/segment_ops.c at line 351:
*blocks_p > 0
749504 - Non-commutative right_op in MPI_Reduce_scatter gives
incorrect output in test redscat2.c
749707 - C MPI datatypes not defined in include files mpif.h and
mpi.mod.
751119 - MPT 3+: MPT_Cancel assertion following MPI_Irecv call with
MPI_PROC_NULL as source
754531 - Segfault when MPI_Cancel[ing] an MPI_Irecv from MPI_NULL_PROC
754969 - mpi_info_get' behavior conflicts with MPI standard

libpmi and libpmi-devel 1.0
Libpmi and libpmi-devel were previously released as part of the MPT libraries. Release 1.0 is the first release packaged independently of MPT.

xt-asyncpe 3.5
Bugs fixed in the xt-asyncpe 3.5 update:

753825 xt-asyncpe 3.3 uses NETCDF_DIR which reintroduces bug 745951
754712 xt-asyncpe/3.2 and higher not supplying -default64 shmem
library for generic shmem routines
755409 MPT 4.0.0.4 testing -- undefined reference to
`PMI_Get_universe_size' - due to driver support
754643 The CADES xt-pe needs to add the /opt/xt-pe path to
LD_LIBRARY_PATH
755593 PE drivers do not supply an alps runtime path

hdf5-netcdf 1.6
Update to HDF5 1.8.4 and bug fixes.
CCE built real64 netcdf libraries have been added.

Bugs Fixed:
755293 - Need a version of netcdf built with cce using -sreal64.

Java 6.0.17
Security fixes.
Bugs fixed:
754848 Multiple Vulnerabilities in Sun Java JDK/JRE

The module package has been updated on hexagon to version 3.1.6.5

New Features:
The "module avail" command is enhanced with the filtering options -U, -L, -T, -P and -D. These options filter the "avail" listing by product type. For a complete explanation of their usage and of some configuration options, read "man module".

Bugs fixed in this release:
742630 'SET-ALIAS' MODULEFILE COMMAND DOESN'T WORK
749121 Have PE build and distribute modules as an async product rpm
750364 Suse modules-3.1.6 rpm has a bug
751441 Would like CRAY to repackage these files into a more up-to-date version of the modules
752915 Init files for "modules" needs to include setting up its man path
754456 Add options to "module avail" for more productive listings

The following software/libraries have been updated on Hexagon:

* Libsci 10.4.0

Cray Adaptive Fast Fourier Transform (CRAFFT) 2.0

* PETSc 3.0.0.7

Bug fix.

* MPT 3.5.0

MPI I/O Collective buffering enhancement and Bug Fixes.

* FFTW 3.2.2.1

Bug Fix.

* PGI 9.0.4

Bug fixes.

* Intelsup 11.1.056

Module file support for the Intel 11.1.056 compilers.

* netcdf 3.6.2

Re-release of netCDF with a name change to netcdf.

* hdf5-netcdf 1.5

Bug fix.


xt-libsci
The xt-libsci 10.4.0 release contains CRAFFT 2.0.
Cray Adaptive Fast Fourier Transform (CRAFFT) 2.0, packaged
with xt-libsci, adds new functionality to calculate 2d and 3d
double precision, complex-to-complex distributed memory Fourier
transforms. Compared to other parallel FFT libraries, CRAFFT
offers a simpler interface to improve application developer
productivity. In many cases the performance of the CRAFFT 2.0
distributed transforms is better than FFTW2 MPI transforms.
For example, for a 2d FFT with transposed output and power-of-two
sizes, performance can be 10% to 50% better than FFTW2 MPI.
Users requiring more information on usage should see the
intro_crafft manpage.

PETSc
Bug fixed in PETSc 3.0.0.7: 753164 CASK Performance problem

xt-mpt
Features:

SHMEM_SWAP_BACKOFF enabled by default

A backoff algorithm has been in the shmem_swap and shmem_cswap
routines since MPT 3.0.0 but was not enabled by default.
It is now enabled by default with a multiplier value of 100.
This multiplier can be adjusted using the SHMEM_SWAP_BACKOFF
environment variable. The number of shmem_swap and shmem_cswap
calls and the number of backoffs done can be displayed by setting
SHMEM_SWAP_BACKOFF_BACKOFF_STATS to a value greater than 1.
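
As a minimal sketch, the multiplier could be changed in a job script before the application is launched. The value 200 is an arbitrary illustration, not a recommendation; suitable values depend on the application.

```shell
# Double the default backoff multiplier of 100 (illustrative value only).
SHMEM_SWAP_BACKOFF=200
export SHMEM_SWAP_BACKOFF
echo "SHMEM_SWAP_BACKOFF=$SHMEM_SWAP_BACKOFF"
```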

MPI I/O Collective buffering enhanced for read

The collective buffering algorithm number 2, which is default,
has been enhanced for reads. This improves read performance in
some cases.

Bugs Fixed:

752391 No libmpich_threadm.a for PrgEnv-gnu
753298 shmem_set_lock fails when -N > 1
753540 MPI-IO related error with xt-mpt/3.3.0 and above

pgi
Features of PGI 9.0.4 are documented at:

http://www.pgroup.com/doc/pgiwsrn904.pdf

The following bugs are fixed in the PGI 9.0.4 release.

730860 SUPPORT FORTRAN 2003 "PROCEDURE" STATEMENT IN PGF90
COMPILER [TPR 3450]
752119 pgf90 -gopt produces symbols that gdb can't process [16040]
752407 PGI internal error when using ipa=fast [16068]
752456 PGI 9 compilation fails with long path to fail [16061]

hdf5 netcdf
Bugs Fixed: 753300 - HDF5 1.8.3.0 missing libraries compared to 1.8.2.3