GPU Kernel Engineer

Software Engineering

Full Time

Ho Chi Minh

We’re seeking a GPU Kernel Engineer to design, build, and optimize the low-level compute kernels that power our large-scale, GPU-accelerated AI inference platform. This is a deeply technical, high-impact role in which you will write GPU code and implement advanced optimizations. As part of our engine team, you will contribute directly to the company’s proprietary inference engine, which supports over 450,000 models on Hugging Face. You will work with the inventors of continuous batching and collaborate with the platform team to deploy your work into production.

I. Key Responsibilities

  • Design, implement, and optimize high-performance GPU kernels for AI inference (e.g., GEMM, attention, routing)
  • Develop and maintain GPU code in CUDA and C++, including low-level assembly when needed
  • Implement reduced-precision and quantized kernels (FP8/FP4) for low-latency or high-throughput inference
  • Benchmark kernels on NVIDIA and AMD hardware and ensure cross-vendor performance parity
  • Contribute to internal GPU libraries and tune performance-critical components
  • Accelerate multi-modal model pipelines
  • Investigate and integrate next-generation GPU features

II. Job Qualifications

Must have:

  • 3+ years of experience in GPU programming, HPC, or performance-critical systems
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
  • Strong proficiency in CUDA for NVIDIA GPUs or ROCm/HIP for AMD GPUs
  • Deep understanding of GPU architecture: warps, threads, memory hierarchy, synchronization, and latency-throughput trade-offs
  • Proficiency in C++
  • Experience with GPU profiling and performance tuning
  • Strong numerical background with understanding of precision trade-offs and quantization techniques

Nice to have:

  • Experience optimizing transformer, multi-modal, or Mixture-of-Experts (MoE) architectures at the kernel level
  • Familiarity with the latest GPU libraries and frameworks (CUTLASS, Triton, …)
  • Inter-GPU communication programming experience
  • Open-source contributions related to GPU performance or ML acceleration
  • Research or conference presentations on GPU optimization, HPC, or numerical computing

III. Why You’ll Love Working Here

1. Workplace

  • Join a vibrant, young, and dynamic team working on cutting-edge projects and emerging technologies.
  • Collaborate with global experts and top tech talent to enhance your skills.
  • Thrive in an open, forward-thinking, innovation-driven culture that encourages you to reach your full potential.

2. Benefits

  • Competitive salary, 13th-month salary, and attractive performance bonuses.
  • Flexible hybrid working model combining office and home working (WFH 2 days per week).
  • Premium healthcare and accident insurance.
  • Annual health check package.
  • Free parking and allowances: lunch, marriage, newborn baby, bereavement, and others.
  • A spacious pantry fully equipped with a coffee maker, fridge, microwave, and more for a comfortable lunch break.
  • A wide range of sport and social activities: Yoga, Football, Badminton, Tech clubs, etc.
  • Annual company trip and team building.
  • Quarterly and annual recognition awards for individuals, teams, long-term service, etc.
  • Advanced English and soft skills training to support your career development.
  • Engaging monthly events: Happy Gathering, Mini Game, Team Birthday Celebrations, Company Year-end Party, etc.
  • Exclusive company support funds to ease personal loans for home, vehicle, tuition, etc.

IV. Additional information

  • Location: QTSC 1 Building, Quang Trung Software City, Trung My Tay Ward (District 12), Ho Chi Minh City
  • Working Time: 8:30 AM – 6:00 PM from Monday to Friday (Lunch break from 12:00 PM to 1:30 PM)

Gentle notice to our candidates about Decree No.13/2023/NĐ-CP

In accordance with Decree No.13/2023/NĐ-CP on personal data protection, LARION applies a “Personal Data Processing Agreement” to all candidates to ensure compliance with the decree.

By submitting this application to LARION, you agree to allow LARION to process the information you provide in accordance with the “Personal Data Processing Agreement”, and you confirm that you have read, fully understood, and agreed to its entire content at https://larion.com/privacy-policy/
