Job description for Data Centre IT Engineer at Cloudengine Digital Pte. Ltd.
Infrastructure Operations Engineer
Job Summary
We are looking for an Infrastructure Operations Engineer to support the deployment, operation, maintenance, and troubleshooting of server and network infrastructure.
Key Responsibilities:
1. Server Infrastructure Support
* Deploy, install, configure, and maintain server infrastructure, including GPU servers, x86 servers, storage systems, and related hardware.
* Perform hardware diagnostics and troubleshooting for servers, storage devices, memory modules, disks, power supplies, and other components.
* Assist in hardware replacement, repair, and preventive maintenance activities.
* Maintain asset records, maintenance logs, and spare parts inventory.
* Coordinate with vendors and manufacturers on warranty and RMA cases when required.
* Provide remote and onsite technical support for infrastructure-related issues.
2. Data Center Operations
* Support data center deployment activities, including rack installation, equipment mounting, structured cabling, labeling, and power connection.
* Assist with installation and commissioning of servers, network devices, and storage systems.
* Perform routine inspections and health checks of data center equipment.
* Ensure compliance with operational and safety procedures within the data center environment.
3. Network & Infrastructure Maintenance
* Support the operation and maintenance of network devices, including switches, routers, and firewalls.
* Monitor infrastructure performance and respond to incidents promptly.
* Troubleshoot connectivity, hardware, and infrastructure-related issues.
* Escalate complex technical issues to senior engineers or vendors when necessary.
4. Documentation & Reporting
* Maintain accurate technical documentation, maintenance records, and inventory information.
* Prepare incident reports and maintenance reports.
* Update operational procedures and knowledge base documentation when required.
5. Continuous Improvement
* Participate in preventive maintenance activities to improve infrastructure reliability.
* Assist in identifying recurring issues and implementing corrective actions.
* Stay up to date with server, storage, networking, and AI infrastructure technologies.
Requirements:
1. Education
* Diploma or Bachelor's Degree in Information Technology, Computer Science, Telecommunications, Electronics Engineering, or related fields.
2. Experience
* Years of experience in server, network, or data center operations is preferred.
* Experience with enterprise servers, storage systems, and networking equipment.
* Experience supporting GPU servers or AI infrastructure is an advantage.
3. Soft Skills
* Good troubleshooting and analytical skills.
* Strong communication and teamwork abilities.
* Willingness to work onsite at customer locations or data centers when required.
