AIML - Senior Machine Learning Infrastructure Engineer - ML Compute, ML Platform & Technology

App*** · Santa Clara, CA, United States
Santa Clara, CA, United States · Exp: 4+ yrs · 147,400-272,100 USD/yearly · Remote
Remuneration
147,400-272,100 USD/yearly
Location
Santa Clara, CA, United States
Visa sponsorship
Sponsors visa

Job summary

App*** is seeking a Senior Engineer on the ML Compute Team to design and deliver critical features for ML compute workloads, collaborate with teams across App*** on ML tasks, and understand industry trends to develop new technologies. This role involves building and maintaining compute infrastructure for ML workloads, focusing on stability, reliability, efficiency, and cost-effectiveness.

Benefits

  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Discounted products and free services
  • Reimbursement for educational expenses

Qualifications

  • Bachelor's degree in Computer Science, engineering, or a related field
  • 4+ years of hands-on experience building scalable backend systems for training and evaluation of machine learning/deep learning models
  • Proficient in programming languages like Python or Go
  • Strong expertise in distributed systems, reliability, scalability, containerization, and cloud platforms
  • Proficient in cloud computing and data processing infrastructure and tools such as Kubernetes, Ray, Beam, and Flink
  • Ability to clearly and concisely communicate technical and architectural problems, while working with partners to iteratively find solutions
  • Advanced degree in Computer Science, engineering, or a related field (preferred)
  • Proficient in working with and debugging accelerators such as GPUs, TPUs, and AWS Trainium (preferred)
  • Proficient in ML training and deployment frameworks such as JAX, TensorFlow, PyTorch, TensorRT, and vLLM (preferred)

Responsibilities

  • Collaborate with teams across App*** on ML workloads such as training, inference, and fine-tuning
  • Drive the design and delivery of critical features to facilitate ML compute workloads
  • Effectively communicate complex features and systems in detail
  • Understand industry and company-wide trends to help assess and develop new technologies
  • Scope, architect, and deliver innovative high-quality solutions
  • Code using Go and Python
  • Conduct code reviews
  • Onboard new team members, provide mentorship, and enable successful ramp-up on the team's codebases

Skills

  • Go
  • AWS
  • Kubernetes
  • Python

Degrees

  • Bachelor's degree in Computer Science, engineering, or a related field
  • Advanced degree in Computer Science, engineering, or a related field

Languages

  • Go
  • Python

Relocation

Yes