Cirrus Service Status

Current System Load

The plot below shows the status of the CPU nodes on the current Cirrus service for the past day.

A description of each of the status types is provided below the plot.

CPU

Cirrus Node Status graph

  • alloc: Nodes running user jobs
  • idle: Nodes available for user jobs
  • resv: Nodes in reservation and not available for standard user jobs
  • down, drain, maint, drng, comp: Nodes unavailable for user jobs
  • mix: Nodes in multiple states
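
As a rough illustration of how the status types above group together, the following Python sketch tallies a list of node state labels into the categories used in the legend. It assumes the states are already available as plain strings; the function name, category names and sample input are hypothetical and are not part of any Cirrus tooling.

```python
from collections import Counter

# Mapping taken from the legend above. The category labels on the right
# are illustrative names chosen for this sketch.
STATE_CATEGORY = {
    "alloc": "running",
    "idle": "available",
    "resv": "reserved",
    "mix": "mixed",
    "down": "unavailable",
    "drain": "unavailable",
    "maint": "unavailable",
    "drng": "unavailable",
    "comp": "unavailable",
}


def summarise_states(states):
    """Count nodes per legend category; unrecognised labels go to 'unknown'."""
    summary = Counter()
    for state in states:
        key = state.strip().lower()
        summary[STATE_CATEGORY.get(key, "unknown")] += 1
    return dict(summary)


if __name__ == "__main__":
    # Hypothetical sample input, not real Cirrus data.
    sample = ["alloc", "alloc", "idle", "drng", "mix", "resv"]
    for category, count in sorted(summarise_states(sample).items()):
        print(f"{category}: {count}")
```

Grouping the five unavailable states under one label mirrors the legend, which treats them as a single line.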

Service Alerts

Status | Start | End | Scope | Impact | Reason
Planned | 2026-06-09 10:00 | 2026-06-26 18:00 | Full system | Possible interruption to login access and compute node availability | Work to expand Cirrus to 640 compute nodes

Recently Resolved Service Alerts

This table lists the last five resolved service alerts. A full list of historical resolved service alerts is available.

Status | Start | End | Scope | Impact | Reason
Resolved | 2026-05-29 11:00 | 2026-05-29 12:15 | Compute nodes | No new jobs can start on Cirrus. Running jobs continue as usual. | Issues with cooling infrastructure
Complete | 2026-05-18 09:00 | 2026-05-20 18:00 | ceph Work Storage system | Possible slower than normal responses whilst the update works take place | Maintenance on the ceph /work storage system
Resolved | 2026-05-01 13:20 | 2026-05-01 14:05 | Cooling | New work has been prevented from starting to reduce load on the cooling system | Issues with cooling infrastructure
Resolved | 2026-04-30 15:55 | 2026-04-30 18:00 | Cooling | New work has been prevented from starting to reduce load on the cooling system | Issues with cooling infrastructure
Resolved | 2026-04-30 10:30 | 2026-04-30 12:30 | Login access | No login access to Cirrus | An emerging security threat

Service Maintenance Sessions

We keep maintenance downtime to a minimum on the service but do occasionally need to perform essential work on the system. Maintenance sessions are used to ensure that:

  • software versions are kept up to date;
  • firmware levels on HPE and third-party peripheral equipment are kept up to date;
  • essential security patches are applied;
  • failed/suspect hardware can be replaced;
  • new software can be installed;
  • periodic essential maintenance on HPE electrical and mechanical support equipment (refrigeration systems, air blowers and power distribution units) can be undertaken safely.

Additional maintenance sessions can be scheduled for: major hardware or software updates; major upgrades to facility plant and infrastructure; acceptance testing following major service upgrades; and statutory electrical testing.

Status | Start | End | Scope | Impact | Reason
Planned | 2026-06-22 08:30 | 2026-06-24 13:00 | Compute nodes | All Cirrus compute nodes will be unavailable. Login access will remain available at risk. | Acceptance testing of expanded Cirrus system - dates may change depending on installation progress
Planned | 2026-06-02 08:30 | 2026-06-09 10:00 | Compute nodes | All Cirrus compute nodes will be unavailable. Login access will remain available at risk. | Work to expand Cirrus to 640 compute nodes

A list of all previous maintenance sessions is available.