
Part-Time AI Software & Hardware Optimization Engineer (Remote)
Requirements
Skills
Hardware Engineering
Software Engineering
Artificial Intelligence
This job listing is managed by Livit International
Job description: Part-Time AI Software & Hardware Optimization Engineer (Remote), Livit International
About Us:
We have a cutting-edge AI chatbot powered by Llama 3.2 in production. As we scale, reducing GPU infrastructure costs is a critical priority. Our current setup runs on an Nvidia A100 80GB via Runpod, but we're actively exploring new hardware options, including Nvidia Project DIGITS, Apple M4 with unified memory, Tenstorrent, and other emerging AI chips.
We’re looking for an AI Software & Hardware Optimization Engineer who can analyze, adapt, and optimize our existing CUDA-based AI models to run efficiently across different hardware architectures. This is a unique opportunity to work at the intersection of AI software, performance engineering, and next-generation AI hardware.
Key Responsibilities:
✅ Optimize AI Model Performance Across Different Hardware
- Adapt and optimize CUDA-dependent AI models for alternative architectures such as Apple M4 (Metal), Tenstorrent, and other non-Nvidia accelerators.
- Implement low-level performance optimizations for AI inference across different memory architectures (GDDR6, unified memory, LPDDR5X, etc.).
- Convert and optimize models for various inference runtimes (e.g., TensorRT, ONNX, Metal Performance Shaders, Triton Inference Server, vLLM); see the sketch after this list.
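To make the scope concrete, here is a minimal sketch of one such conversion path: exporting a small PyTorch module to ONNX and running it with ONNX Runtime on a non-CUDA execution provider. TinyModel, the file name, and the provider list are illustrative assumptions, not our production stack; a Llama 3.2 deployment would go through an LLM-aware runtime such as vLLM or TensorRT-LLM rather than this toy flow.

# Sketch only: export a toy PyTorch module to ONNX, then run it with ONNX Runtime,
# preferring a non-CUDA execution provider when the installed build offers one.
import torch
import torch.nn as nn
import onnxruntime as ort

class TinyModel(nn.Module):  # hypothetical stand-in for the production model
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

    def forward(self, x):
        return self.net(x)

model = TinyModel().eval()
example = torch.randn(1, 128)

# Export to ONNX; dynamic_axes lets the runtime accept variable batch sizes.
torch.onnx.export(
    model, example, "tiny_model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# Prefer a platform-specific provider (e.g. CoreML on Apple Silicon), fall back to CPU.
wanted = ["CoreMLExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in wanted if p in ort.get_available_providers()]
session = ort.InferenceSession("tiny_model.onnx", providers=providers)
logits = session.run(["logits"], {"input": example.numpy()})[0]
print(logits.shape)

The general pattern is the same at larger scale: keep the model definition framework-native, export once, and let each target's execution provider or runtime supply the hardware-specific kernels.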
✅ AI Hardware Benchmarking & Cost Reduction
- Conduct rigorous benchmarking of AI workloads on Nvidia CUDA GPUs, Apple Silicon, AMD ROCm, and specialized AI chips.
- Compare memory bandwidth, latency, power efficiency, and inference throughput across different architectures (a minimal timing harness is sketched after this list).
- Identify cost-effective alternatives to high-cost cloud GPUs without sacrificing performance.
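As an illustration of the benchmarking side, the sketch below is a minimal, backend-agnostic timing harness. The matmul is a hypothetical stand-in for a real inference call; the numbers it produces are meaningless until the workload, batch sizes, and device synchronization match the production setup.

# Sketch only: warm up, time repeated calls, and report p50/p95 latency plus
# rough throughput. On GPU or accelerator backends, the timed callable must
# include an explicit sync (e.g. torch.cuda.synchronize()) or the numbers lie.
import statistics
import time

import torch

def benchmark(fn, warmup=10, iters=100):
    for _ in range(warmup):  # warm up allocators, caches, and any lazy compilation
        fn()
    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    return {
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[int(0.95 * len(latencies_ms)) - 1],
        "throughput_per_s": 1000.0 / statistics.mean(latencies_ms),
    }

if __name__ == "__main__":
    a, b = torch.randn(512, 512), torch.randn(512, 512)
    print(benchmark(lambda: a @ b))  # stand-in workload, not a real model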
✅ Model Optimization for Efficient Deployment
- Implement quantization (INT8, FP16, BF16) and model distillation to enhance efficiency (see the quantization sketch after this list).
- Develop custom AI kernels optimized for different hardware types.
- Improve multi-threading, batching, and caching strategies to reduce inference latency.
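Two of the levers named above can be sketched in a few lines. The module here is an illustrative stand-in; a production Llama 3.2 stack would normally use an LLM-aware quantization toolchain (e.g. GPTQ/AWQ-style weight quantization) rather than PyTorch's generic dynamic quantization.

# Sketch only: dynamic INT8 quantization of Linear layers (weights stored as
# INT8, activations quantized on the fly) and a plain bfloat16 cast for
# hardware with native BF16 support. The model is a hypothetical stand-in.
import copy

import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 256)).eval()

int8_model = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
bf16_model = copy.deepcopy(model).to(torch.bfloat16)

x = torch.randn(8, 1024)
print(int8_model(x).shape)                     # INT8 weights, FP32 interface
print(bf16_model(x.to(torch.bfloat16)).shape)  # BF16 end to end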
✅ Infrastructure & Deployment
- Deploy AI models efficiently using Docker, Kubernetes, and serverless AI inference platforms.
- Implement compilation pipelines (TVM, XLA, MLIR) to target diverse hardware backends.
- Work closely with DevOps to integrate inference optimization techniques into production workflows.
Required Skills & Experience:
🔹 Deep AI Model Optimization Experience
- Strong expertise in PyTorch, TensorFlow, and JAX with deep understanding of model transformation for different backends.
- Experience optimizing AI models with CUDA, Metal, ROCm, and other accelerator-specific libraries.
🔹 Hardware & System-Level Knowledge
- Expert understanding of GPU architectures, unified memory models, tensor cores, and AI-specific accelerators.
- Experience working with alternative AI hardware, such as Apple Silicon, Tenstorrent, Graphcore, or Groq.
- Deep knowledge of memory architectures (GDDR6, LPDDR5X, HBM, Unified Memory) and their impact on AI workloads.
🔹 Inference Optimization & Acceleration
- Hands-on experience with TensorRT, ONNX Runtime, Triton Inference Server, vLLM, and Hugging Face Optimum.
- Knowledge of low-level parallelism (SIMD, VLIW, MIMD) and AI chip architectures.
🔹 Benchmarking & Profiling
- Experience with AI performance profiling tools (Nsight, ROCm SMI, Metal Profiler, perf).
- Ability to analyze power efficiency, latency, memory bandwidth, and FLOPS utilization across different chips.
Nice-to-Have Skills:
- Experience with LLM-specific optimizations, such as speculative decoding, paged attention, and tensor parallelism.
- Knowledge of compiler optimization techniques (MLIR, XLA, TVM, Glow) for AI workloads.
- Familiarity with emerging AI accelerators beyond mainstream options.
Why Join Us?
🚀 Work on cutting-edge AI infrastructure & next-gen hardware.
🌍 Fully remote, flexible work environment.
💰 Competitive salary with potential bonuses for cost reductions.
🎯 Opportunity to shape the future of AI model deployment.
Join an innovative team in the AI industry! Our client is seeking an AI Software & Hardware Optimization Engineer to contribute to their dynamic organization. Further information will be disclosed as you advance in the recruitment process. If you’re passionate about optimizing AI workloads across diverse hardware ecosystems and want to push the limits of AI performance, we’d love to hear from you!

We have a trained eye for skilled talent.
Over our 5+ years of experience, we've learned the intricacies of the talent market: how to spot high performers, assess organizational culture fit, pin down key position requirements, and more.
We’ve helped build remote teams across regions and industries.
Livit has successfully recruited talent for tech, finance, sales, HR and marketing positions, for companies of different sizes from across the globe.
We have an expertise-focused, flat-fee approach.
Regular recruiters who charge percentage commissions tend to be more interested in making profits than in finding the right person. Our fixed-rate structure allows us to focus on finding the ideal candidate for the job.
We become part of your team.
Regular recruiters also act like contractors, pursuing their own interests. Our approach is to become your temporary partner, helping you increase happiness and productivity within your organization.
We have a strategic methodology.
We strive to ensure long-term success through all of our services. We focus on gaining a deep understanding of our partner’s goals before making any suggestions.
Jl. Bumi Ayu Gg. Pungut Sari No.6, Sanur, Denpasar Selatan, Kota Denpasar, Bali 80228


