HPC System Engineer

Techdirect
SGD5,500 - 7,500
Contract · On-site
3 - 5 years of experience

Job description for HPC System Engineer at Techdirect

Responsible for the operation, administration, maintenance, troubleshooting, and optimization of NSCC's High Performance Computing (HPC) infrastructure, including Linux servers, compute nodes, login nodes, HCI platforms, virtualization systems, monitoring, backups, patching, and security hardening.

Key Responsibilities

Linux & HPC Infrastructure Administration

Administer Red Hat Enterprise Linux (RHEL) based HPC environments

Manage HPC compute nodes and login nodes

Create and maintain golden OS images

Perform server provisioning and decommissioning

Manage kernel and driver upgrades

Conduct system health checks and performance monitoring

Virtualization & HCI

Support HCI clusters and virtualization platforms

VM lifecycle management

Template creation and maintenance

Backup and restore validation

Disaster recovery testing

Configuration & Change Management

Maintain system configuration baselines

Detect and remediate configuration drift

Execute approved change requests

Produce operational documentation and runbooks

Security & Compliance

Linux hardening

Vulnerability remediation

Patch management

Certificate and credential management

Audit support and evidence collection

Operations Support

Incident troubleshooting and resolution

Root Cause Analysis (RCA)

Performance tuning

Capacity management

Vendor escalation management


Mandatory Skills

Operating Systems

Red Hat Enterprise Linux (RHEL)

Rocky Linux / AlmaLinux

Ubuntu Linux

Virtualization

One or more:

VMware vSphere

Nutanix AHV

OpenShift Virtualization

KVM

Scripting & Automation

Bash

Python

Ansible

Git

Monitoring

Grafana

Prometheus

Zabbix

ELK

Splunk

Storage Knowledge

NFS

POSIX Filesystems

SAN/NAS concepts

Networking Knowledge

TCP/IP

DNS

VLAN

Linux networking

Experience Requirements

Minimum 3 years Linux Systems Administration experience

Experience supporting mission-critical environments

Experience with virtualization technologies

Experience with automation tools

Experience supporting large-scale infrastructure

Certifications

Mandatory:

ITIL Foundation

RHCSA (Red Hat Certified System Administrator)

Preferred:

RHCE

VMware VCP

Red Hat OpenShift Certification

Nutanix NCP

