PACE clusters (mostly) ready for research

Greetings,

We’ve made substantial progress on our activities and are releasing jobs.  A number of compute nodes still need to be brought online; however, all clusters have some resources available and are running jobs.  We will continue to work through these issues later today, after some sleep.

 

Major upgrade to DDN & new scratch storage

All data was migrated successfully to the new front ends, and additional disks have been added for the upcoming scratch storage.  We experienced substantial delays because the processes that join compute nodes to the new GPFS cluster ran much longer than anticipated.  This work is still ongoing.  Benchmarking suggests a slight performance improvement for those of you with project directories in GPFS.

New PACE router and firewall hardware & additional core network capacity

Successfully completed without incident.

Panasas scratch filesystem maintenance

Successfully completed without incident.

Migration of home directories

Successfully completed without incident.

Migration of /usr/local storage

Successfully completed without incident.

Begin transition away from diskless compute nodes

Approximately 100 compute nodes have been migrated.  Some of these still have issues with GPFS, as described above.