Safe, sanitized Bash examples for planning and executing a GPFS / IBM Spectrum Scale filesystem expansion. The scripts are written as readable operational examples for a Linux Infrastructure Engineer: conservative defaults, clear validation, dry-run behavior, and explicit operator confirmation before changes.
These scripts are examples. Exact GPFS commands, flags, quorum practices, failure-group design, and storage naming standards vary by Spectrum Scale version and site policy.
- **Cluster** - the Spectrum Scale administrative domain containing the nodes, daemon configuration, quorum policy, filesystems, and NSDs.
- **Node** - a server participating in the GPFS cluster. Nodes may be clients, NSD servers, quorum nodes, manager-capable nodes, or a mix of roles.
- **Quorum** - the voting mechanism that protects the cluster from split-brain conditions. Expansion work should not proceed during quorum instability.
- **Filesystem** - the GPFS namespace and data layout presented to clients, backed by one or more NSDs.
- **NSD** - Network Shared Disk, the GPFS abstraction for a disk or LUN that is served to the cluster.
- **Failure group** - a placement hint that tells GPFS which disks share a failure domain, such as an enclosure, rack, site, controller pair, or storage array.
- **Storage pool** - a named pool of NSDs used for placement and lifecycle policy, commonly `system` plus optional data pools.
- **Restripe/rebalance** - the operation that redistributes data after disks are added. It can be I/O intensive and should run only in an approved change window.
## Required Tools
Common GPFS / Spectrum Scale tools expected in production include:
-`mmgetstate`
-`mmlscluster`
-`mmlsfs`
-`mmlsdisk`
-`mmlsnsd`
-`mmcrnsd`
-`mmadddisk`
-`mmrestripefs`
The toolkit also uses common Linux tools such as `df`, `lsblk`, `findmnt`, `journalctl`, and `dmesg` where available. Missing optional commands are reported as `WARNING` and skipped.
## Safety Model
- Default mode is dry-run.
- Real GPFS modifications require `--execute`.
- Destructive or high-impact steps also prompt for `EXECUTE`.
- Disk detection is read-only and never partitions, formats, wipes, or modifies devices.
- Device selection must always be confirmed with the storage team and cluster owners.
- The scripts do not assume production disk names.
Output uses a consistent status format:
-`OK`
-`WARNING`
-`CRITICAL`
Exit codes:
-`0` - OK
-`1` - operational validation failure
-`2` - invalid input or missing requirement
## Scripts
-`00_env.sh` - shared configuration and helper functions.
Review the generated stanza with the storage and cluster teams. Confirm device identity, LUN masking, multipath naming, failure group placement, and site standards before continuing.