The fine-tuning API for developers.
Upload your training data, pick a model (Llama, Mistral, Gemma), and get a hosted inference endpoint. No infrastructure, no ML team, no Python required.
import { Turing } from "@turing-compute/sdk";

const client = new Turing({ apiKey: process.env.TURING_API_KEY });

// 1. Upload your training data
const file = await client.files.upload("./data/training.jsonl");

// 2. Start a fine-tuning job on any open-source model
const job = await client.fineTuning.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  trainingFile: file.id,
  suffix: "support-classifier-v1",
});

// 3. Poll until ready, or listen via webhook
await client.fineTuning.waitForCompletion(job.id);

// 4. Use your hosted endpoint immediately
const response = await client.chat.completions.create({
  model: job.endpoint,
  messages: [{ role: "user", content: "Is this a billing question?" }],
});

console.log(response.choices[0].message.content);
TypeScript is the top language on GitHub (2025 Octoverse)
ARR validated by Together AI — the fine-tuning market is real
GPU providers behind every job: Together AI, Fireworks, Replicate
How it works
From training data to deployed endpoint in three steps.
Every competitor separates fine-tuning from inference. Turing ships them together — the endpoint is ready when training finishes.
Upload your training data
Drop a JSONL file with your examples. Use any chat-completion format — the same structure you already send to OpenAI.
Supports JSONL · Up to 1 GB · Automatic validation
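Before uploading, it can help to run a quick local check on the file. The sketch below is not part of the Turing SDK; `validateChatJsonl` is a hypothetical helper, and it assumes each line is a JSON object with a `messages` array of `{ role, content }` entries, as in OpenAI's chat fine-tuning format. The accepted role set is also an assumption and may be narrower than what the service's automatic validation allows.

```typescript
// Hypothetical local pre-check for a chat-format JSONL file.
// Assumption: one JSON object per line, shaped like { "messages": [{ role, content }, ...] }.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface LineError {
  line: number; // 1-based line number within the JSONL text
  reason: string;
}

const ROLES: readonly string[] = ["system", "user", "assistant"];

function isChatMessage(value: unknown): value is ChatMessage {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  return (
    typeof m.role === "string" &&
    ROLES.indexOf(m.role) !== -1 &&
    typeof m.content === "string"
  );
}

function validateChatJsonl(text: string): LineError[] {
  const errors: LineError[] = [];
  text.split(/\r?\n/).forEach((raw, index) => {
    if (raw.trim() === "") return; // tolerate blank lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      errors.push({ line: index + 1, reason: "not valid JSON" });
      return;
    }
    const messages = (parsed as { messages?: unknown } | null)?.messages;
    if (!Array.isArray(messages) || messages.length === 0) {
      errors.push({ line: index + 1, reason: 'missing non-empty "messages" array' });
      return;
    }
    if (!messages.every(isChatMessage)) {
      errors.push({ line: index + 1, reason: "each message needs a known role and string content" });
    }
  });
  return errors;
}
```

An empty result means every non-blank line passed this local check; it does not guarantee the upload will be accepted.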
Pick your base model
Choose from Llama 3.1, Mistral, Gemma 2, Qwen, and more. We route the job to the best available GPU provider transparently.
LoRA · Full fine-tuning · Quantization options
Get a hosted endpoint
When training completes, an inference endpoint spins up. Call it exactly like OpenAI: same client, same message format, your model.
OpenAI-compatible API · Auto-scaling · Pay per token
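The quick start waits for the job with `client.fineTuning.waitForCompletion(job.id)`. For readers curious what poll-until-ready can look like, here is a minimal sketch using capped exponential backoff. It is not the SDK's implementation: `pollUntilDone`, the `getStatus` callback, and the status names are all hypothetical.

```typescript
// Hypothetical status values; the real service may use different names.
type JobStatus = "queued" | "running" | "succeeded" | "failed";

interface PollOptions {
  initialDelayMs?: number; // first wait between checks
  maxDelayMs?: number; // upper bound on the wait between checks
  timeoutMs?: number; // give up after this long overall
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls getStatus repeatedly until the job reaches a terminal state,
// doubling the wait after each check up to maxDelayMs.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  { initialDelayMs = 1000, maxDelayMs = 30000, timeoutMs = 3600000 }: PollOptions = {}
): Promise<JobStatus> {
  const deadline = Date.now() + timeoutMs;
  let delay = initialDelayMs;
  for (;;) {
    const status = await getStatus();
    if (status === "succeeded" || status === "failed") return status;
    if (Date.now() + delay > deadline) {
      throw new Error("timed out waiting for the job to finish");
    }
    await sleep(delay);
    delay = Math.min(delay * 2, maxDelayMs); // exponential backoff, capped
  }
}
```

Backoff keeps request volume low for long-running jobs; a webhook avoids polling altogether.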
Why Turing
Built for the developer who doesn't have an ML team.
TypeScript-first SDK
Every competitor is Python-first. Turing was designed from day one for the TypeScript and JavaScript ecosystem — typed responses, first-class async/await, and a client that feels native.
OpenAI drop-in replacement
The Turing endpoint speaks the OpenAI chat completions format. If you already call OpenAI, swapping the base URL, API key, and model ID is the entire migration.
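To illustrate what "drop-in" means at the HTTP level, the sketch below builds a chat completions request for any OpenAI-compatible base URL. `buildChatRequest` is a hypothetical helper, and the `turing.example` URL and model ID are placeholders, not real values; check the docs for the actual base URL.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a request in the OpenAI chat completions format.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`, // drop trailing slashes
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const exampleMessages: ChatMessage[] = [
  { role: "user", content: "Is this a billing question?" },
];

// Same message payload, different base URL, key, and model ID.
const openAiRequest = buildChatRequest(
  "https://api.openai.com/v1",
  "OPENAI_KEY_PLACEHOLDER",
  "some-openai-model", // placeholder
  exampleMessages
);
const fineTunedRequest = buildChatRequest(
  "https://turing.example/v1", // placeholder base URL (assumption)
  "TURING_KEY_PLACEHOLDER",
  "my-fine-tuned-model", // placeholder model ID
  exampleMessages
);
// To send: await fetch(fineTunedRequest.url, fineTunedRequest.init)
```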
No infrastructure to manage
No GPUs to provision, no CUDA drivers to configure, no inference servers to maintain. Turing routes training jobs to best-in-class providers and returns an endpoint when the work is done.
Transparent, usage-based pricing
Pay for training tokens and inference tokens. No seat licenses, no enterprise tier gating access to the good models. Every feature is self-serve from day one.
Supported models
Every major open-source family. More added regularly.
You pick the model. We pick the best GPU provider for the job — transparently, with no lock-in to any single infrastructure vendor.
Full model catalog at docs.turingcompute.com/models
Comparison
Same underlying compute.
Better developer experience.
Based on public documentation as of mid-2025. Subject to change.
Pricing
Transparent pricing.
No enterprise sales required.
Training and inference are billed separately, per token. No seat licenses, no tier gating on models, no minimum spend.
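Because the two meters are separate, a rough budget is simple arithmetic. The sketch below is illustrative only: `estimateCost` is a hypothetical helper, rates are passed in as parameters, and it assumes a flat per-million-token price for each meter. It contains no actual Turing prices.

```typescript
interface Rates {
  trainingPerMillionTokens: number; // USD per 1M training tokens (caller-supplied)
  inferencePerMillionTokens: number; // USD per 1M inference tokens (caller-supplied)
}

interface Usage {
  trainingTokens: number;
  inferenceTokens: number;
}

interface CostBreakdown {
  training: number;
  inference: number;
  total: number;
}

// Computes each meter independently, then sums them.
function estimateCost(usage: Usage, rates: Rates): CostBreakdown {
  const training = (usage.trainingTokens / 1_000_000) * rates.trainingPerMillionTokens;
  const inference = (usage.inferenceTokens / 1_000_000) * rates.inferencePerMillionTokens;
  return { training, inference, total: training + inference };
}
```

Substitute the published rates for the model you choose; how training tokens are counted (for example, across epochs) depends on the service and is not modeled here.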
up to $20 in training credits
Ideal for experiments and proof-of-concept fine-tunes.
- ✓ 3 fine-tuning jobs / month
- ✓ Up to 100K training tokens
- ✓ Hosted endpoint with a 7-day TTL
- ✓ Community support
pay only for what you train and infer
For production apps and teams shipping custom models.
- ✓ Unlimited fine-tuning jobs
- ✓ All models, including 70B+
- ✓ Persistent hosted endpoints
- ✓ Webhooks + TypeScript SDK
- ✓ Email support
Fine-tune. Deploy. Done.
Your first fine-tuning job is on us. No credit card, no sales call, no infrastructure to provision.
npm i @turing-compute/sdk