Job description for HPC Storage Engineer at Techdirect
Responsible for the administration, performance tuning, capacity planning, monitoring, troubleshooting, and support of enterprise parallel file systems and storage platforms that serve HPC workloads.
Key Responsibilities
Parallel File System Administration
Manage HPC storage infrastructure
Administer Lustre, IBM Spectrum Scale (GPFS), BeeGFS, or equivalent parallel file systems
Storage provisioning
Quota management
Namespace administration
Performance Management
Storage performance tuning
I/O bottleneck analysis
Throughput optimization
Latency troubleshooting
Capacity forecasting
Data Protection
Snapshots
Backup and restore
Data integrity checks
Failover testing
Disaster recovery support
Storage Operations
Daily health checks
Scrub monitoring
Error counter analysis
OEM advisory implementation
Storage incident management
Capacity Planning
Utilization monitoring
Growth forecasting
Risk reporting
Capacity recommendations
Mandatory Skills
HPC Storage
Must have experience with one or more of the following:
Lustre
IBM Spectrum Scale (GPFS)
BeeGFS
CephFS
Enterprise Storage
Dell PowerScale / Isilon
NetApp
Pure Storage
HPE Alletra
Lenovo ThinkSystem
Linux
RHEL
Rocky Linux
Linux storage management
Performance Analysis
IOPS analysis
Throughput monitoring
Storage benchmarking
Capacity analytics
Experience Requirements
3–5 years of storage engineering experience preferred
Experience supporting HPC environments
Experience with enterprise storage platforms
Certifications
Mandatory:
ITIL Foundation
Preferred:
Dell Storage Certification
NetApp NCDA
IBM Spectrum Scale Certification
Red Hat Storage Certification
