AI Research Engineer (Kernel & Inference Optimization) — Tether Operations

Tether Operations · Zürich, Zürich (ZH)
Category: Engineering
Contract: remote
Salary: CHF 73'500 - 111'500
Posted: 31 days ago

Role overview

Join Tether and Shape the Future of Digital Finance

At Tether, we’re not just building products; we’re pioneering a global financial revolution.

Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains.

By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost.

Main responsibilities

  • Your responsibilities include engineering robust inference pipelines, establishing comprehensive performance metrics, and identifying and resolving bottlenecks in production environments.
  • The ultimate goal is scalable AI performance with high throughput, low latency, and a low memory footprint, delivering tangible value in dynamic, real-world scenarios.
  • Design and deploy state-of-the-art model serving architectures that deliver high throughput and low latency while optimizing memory usage.
  • Ensure these pipelines run efficiently across diverse environments, including resource-constrained devices and edge platforms.
  • Establish clear performance targets such as reduced latency, improved token response, and minimized memory footprint.
  • Build, run, and monitor controlled inference tests in both simulated and live production environments.
  • Track key performance indicators such as response latency, throughput, memory consumption, and error rates, with special attention to metrics specific to resource-constrained devices.
  • Document iterative results and compare outcomes against established benchmarks to validate performance across platforms.
  • Identify and prepare high-quality test datasets and simulation scenarios tailored to real-world deployment challenges, specifically those encountered on low-resource devices.
  • Set measurable criteria to ensure that these resources effectively evaluate model performance, latency, and memory utilization under various operational conditions.

Role details and requirements

  • Your work will focus on optimizing model deployment and inference strategies to deliver highly responsive, efficient, and scalable performance across real-world applications.
  • You will work on a wide spectrum of systems, ranging from resource-efficient models designed for limited hardware environments to complex, multi-modal architectures that integrate data such as text, images, and audio.
  • We expect you to have deep expertise in designing and optimizing model serving pipelines and inference frameworks as well as a strong background in advanced model architectures.
  • You will adopt a hands-on, research-driven approach to develop, test, and implement novel serving strategies and inference algorithms.
  • Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines designed for edge and on-device applications.
  • Define clear success metrics, such as improved real-world performance, low error rates, robust scalability, and optimal memory usage, and ensure continuous monitoring and iterative refinement for sustained improvement.
  • A degree in Computer Science or a related field.
  • Ideally a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with strong publications at A* conferences).
  • Must have knowledge of Metal Shading Language (MSL).

Contacts

  • Double-check email addresses.

Additional details

  • Apply only through our official channels.

Frontaliere Ticino discovered this opportunity through company monitoring.
