Maintenance Sessions
List of all maintenance sessions for Cirrus
| Status | Start | End | Scope | Impact | Reason | 
|---|---|---|---|---|---|
| Completed | 2025-08-29 09:00 | 2025-09-18 15:40 | | Users will not be able to connect to Cirrus or access data on any of the Cirrus file systems. The system will be drained of jobs ahead of the power outage and jobs will not run during this period. Any queued jobs will remain in the queue during the outage and will start once the service is returned. SAFE and the Cirrus website will be available. | Due to a significant Health and Safety risk associated with the power supply to the site, action is required at the Advanced Computing Facility (ACF). There will be a full power outage to the site during this period. Specialised external contractors will be working on a 24/7 basis for the outage period, replacing switchgear. |
| Completed | 2025-08-24 12:00 | 2025-09-11 21:00 | Full Cirrus system | No login access, no access to storage systems, no jobs running | Major electrical work at the ACF datacentre |
| Completed | 2025-02-03 08:00 | 2025-02-03 16:00 | | No login access. No access to any data on the system. Jobs will not run. | Essential work on the E1000, which hosts the Cirrus work file system. |
| Completed | 2024-06-03 08:00 | 2024-06-04 11:00 | | No login access. No access to any data on the system. Jobs will not run. | Cooling Distribution Unit (CDU) maintenance for Cirrus. The system will be brought back with a new boot image which includes an updated CUDA driver. |
| Completed | 2024-03-12 09:00 | 2024-03-12 17:00 | | No login access. No access to any data on the system. Jobs will not run, and queued jobs will be deleted. | Migration to E1000, including the change in authentication protocol and the addition of a new file system. |
| Completed | 2023-09-18 09:00 | 2023-09-22 11:55 | | No login access. No access to any data on the system. Jobs will continue to run, and queued jobs will be started as usual. The SAFE will be available during the outage, but some functionality, such as resetting passwords or creating new accounts, will be reduced because the connection to ARCHER2 will be unavailable. | Upgrade of network. |
| Completed | 2023-07-25 14:00 | 2023-07-26 09:20 | | Cirrus will not be available to users. This includes the login nodes, compute nodes and access to the filesystems. We will notify users when Cirrus is returned to service. | A fix to the volume issue on the Cirrus CXFS /scratch file system, which will be performed by the vendor, HPE. |
| Completed | 2023-05-15 16:00 | 2023-05-19 18:00 | | The solid-state storage (/scratch) will be unavailable on Cirrus from Monday 15th May at 16:00. We expect the disk to be unavailable until Friday 19th May, but we will notify users once it is available again. This means that users will not be able to access any data on the /scratch filesystem during this time. | The maintenance is to improve the resiliency and reliability of the solid-state storage (/scratch) by applying software updates, implementing a failover policy and deploying additional packages. |
| Completed | 2023-02-07 09:00 | 2023-02-07 17:00 | | CPU and GPU compute nodes will be unavailable. Login access and access to data will still be available. | Essential maintenance to the Cirrus liquid cooling system. |
| Completed | 2022-12-07 09:00 | 2022-12-07 17:00 | | Cirrus will not be available to users. This includes the login nodes, compute nodes and access to the filesystems. We will notify users when it is returned to service. | Upgrade to the Slurm batch scheduler. |
| Completed | 2022-02-21 09:00 | 2022-03-16 17:00 | | There will be a full rebuild of the Cirrus service. It will be unavailable during the maintenance session. | Attach new storage and bring the system software up to date. |
| Completed | 2021-12-01 09:00 | 2021-12-01 17:00 | | The full system will be unavailable during the maintenance session. | Third-party maintenance on the cooling system. |
| Completed | 2021-10-27 09:00 | 2021-10-27 17:00 | | A period of up to 30 minutes when external connections are not possible. Compute nodes will continue to run jobs. | Network upgrade at the Advanced Computing Facility (ACF). |
 
