Access Information
How to make use of NIWA's High Performance Computing Facility (HPCF).
The operational principles for the HPCF are:
- The HPCF is a major strategic asset for NIWA and NZ, much like the research vessel Tangaroa. Its primary purpose is to enable NZ scientists to carry out nationally and internationally significant research and run time-critical production workloads that require access to a capability-class supercomputer, and to enhance international research collaborations.
- HPCF user information services, including documentation, real-time system information and issue tracking, will be seamlessly accessible to NIWA and external users.
- User requirements will inform management of the HPCF.
- HPCF Projects will contribute towards the running costs of the HPCF on a fair and equitable basis.
These principles translate into the following requirements.
Job Sizes and Priorities
- The minimum job size will be 16 physical (or 32 logical) cores. A smaller job is acceptable only if it forms part of a sequence of jobs that make up a whole and uses less than ~2% of the wall-clock time needed to run the whole sequence. The majority of jobs are expected to consume the resources of multiple p575/p6 nodes, with the number of nodes used limited by scalability or by the total number of nodes available.
- Jobs running in queues will have access to resources for up to 4 hours, after which they will exit and resubmit themselves to the same queue.
- Operational production jobs that must run on time will have precedence over research jobs when there is contention for resources, or when the HPCF is operating in a degraded mode.
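As an illustration only, the size and run-time rules above can be sketched in Python. This is not HPCF or LoadLeveler configuration. The function and constant names are hypothetical, and the reading of the "~2%" exemption (a small step is allowed if its wall-clock time is under 2% of the whole sequence's wall-clock time) is an assumption about how the rule would be applied.

```python
import math

# Values taken from the requirements above; the names are hypothetical.
MIN_PHYSICAL_CORES = 16      # minimum job size in physical cores
SMALL_STEP_FRACTION = 0.02   # "~2%" of the whole sequence's wall-clock time
QUEUE_LIMIT_HOURS = 4.0      # a job exits and resubmits itself after this long


def meets_minimum_size(physical_cores, step_hours=None, sequence_hours=None):
    """Return True if a job satisfies the minimum-size rule.

    Assumed interpretation: a job below 16 physical cores is acceptable only
    when it is one step of a larger sequence and its wall-clock time is less
    than ~2% of the wall-clock time of the whole sequence.
    """
    if physical_cores >= MIN_PHYSICAL_CORES:
        return True
    if step_hours is None or sequence_hours is None or sequence_hours <= 0:
        return False
    return step_hours / sequence_hours < SMALL_STEP_FRACTION


def segments_needed(total_hours):
    """Number of queue submissions needed to cover a run of total_hours,
    given that each submission holds resources for at most 4 hours."""
    return max(1, math.ceil(total_hours / QUEUE_LIMIT_HOURS))
```

For example, under these assumptions a 10-hour simulation would pass through the queue three times (`segments_needed(10)`), so the code being run would need to checkpoint and restart at each 4-hour boundary.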
Information Sharing
- HPCF user information will be shared via web, wiki and issue-tracking pages that are seamlessly accessible to both NIWA and external users. However, not all of these pages will be accessible to all users.
- All documentation, i.e. language manuals, systems manuals (AIX, LoadLeveler, TSM, TSM-SM, etc.) and relevant Redbooks (including previous versions and the latest revisions), will be locally accessible from the HPCF web/wiki pages.
- Unless commercially sensitive, HPCF projects and a summary of science outputs will be published on the HPCF website.
- Operational issues, key science advances, insights, problems experienced, etc. will be communicated via a monthly video conference forum (open to HPCF users and NIWA staff), where users and HOG staff will be encouraged to share their insights into the use of the HPCF. Similarly, an annual “all-hands” meeting is proposed to provide a wider forum for exchanging ideas and to showcase HPCF science outputs.
- The HPCF Sys Admins will be available to provide systems-level training for users, will be directly accessible to users (i.e. in person, by telephone, etc.), and will be available to work on science-related projects. Work likely to take more than a day or two of effort will need to be agreed in advance through the HPCF Manager.
- The HPCF Scientific Programmer will mentor users on OpenMP / MPI / language options, debugging tools, etc.; provide an agreed level of computer science (rather than science) support for key codes, set in consultation with the HPCF Manager, the Scientific Programmer and the relevant model user group; and work with users to optimise their codes for running on the p575/p6 and BladeCenter hardware. This person will likely be a key member of relevant HPCF Project science teams.
- External users will be able to export their data back to their home institutions for post-HPCF analysis.
Key People
- HPCF Manager: Dr Michael Uddstrom
- Systems Engineer: Mr Chris Edsall
- Systems Engineer: TBA
- Scientific Programmer: TBA