How to Self-Host Traceway as a Free Datadog Alternative

Deploy Traceway, an OpenTelemetry-native observability platform, with Docker Compose in under 2 minutes. Get logs, traces, metrics, and session replay without vendor lock-in.

Observability bills sneak up on you. One day you’re logging a few services, the next you’re staring at a Datadog invoice that rivals your cloud spend. Traceway flips that model on its head: it’s MIT licensed, OpenTelemetry-native, and you can have the full stack running on your own hardware in about two minutes.

What is Traceway?

Traceway is an open-source observability platform that handles logs, traces, metrics, session replay, exceptions, and AI tracing under one roof. It ingests data directly over OTLP/HTTP, so you skip the collector, the vendor SDKs, and the glue code. Point any OpenTelemetry SDK at it, and data starts flowing.

The project sits at about 290 stars on GitHub and ships under a clean MIT license. No BSL, no “open core” tier that reserves features for paid plans. Every feature is in the box.

What’s in the box

| Feature | What it does |
| --- | --- |
| Logs | Structured, trace-linked search with sub-second performance |
| Traces | End-to-end span waterfalls across every service |
| Metrics | Host, runtime, and custom metrics with configurable dashboard widgets |
| Exceptions | SHA-256 normalized stack traces grouped into ranked issues, source-mapped for JS bundles |
| Session Replay | Watch user sessions leading up to errors across web (any JS framework) and Flutter |
| AI Observability | Track LLM cost, tokens, latency, and full conversations across providers |
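
To make the "SHA-256 normalized stack traces" idea concrete, here is a minimal sketch of the general technique: strip the volatile parts of a stack trace, then hash what is left so the same bug always lands in the same group. Traceway's real normalization rules are not described in this article, so the function name and the two regexes below are illustrative assumptions, not its implementation.

```python
import hashlib
import re

# Hypothetical normalization rules; Traceway's real ones may differ.
_LINE_NO = re.compile(r"line \d+")
_HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")


def fingerprint(stack_trace: str) -> str:
    """Return a SHA-256 hex digest of a stack trace with volatile parts removed."""
    normalized_lines = []
    for line in stack_trace.strip().splitlines():
        line = _LINE_NO.sub("line N", line.strip())   # drop concrete line numbers
        line = _HEX_ADDR.sub("0xADDR", line)          # drop memory addresses
        normalized_lines.append(line)
    normalized = "\n".join(normalized_lines)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


trace_a = '''Traceback (most recent call last):
  File "app.py", line 24, in error
ZeroDivisionError: division by zero'''

trace_b = '''Traceback (most recent call last):
  File "app.py", line 31, in error
ZeroDivisionError: division by zero'''

# Same bug, different line numbers: both map to one group key.
same_group = fingerprint(trace_a) == fingerprint(trace_b)
```

Hashing the normalized text rather than the raw trace is what keeps a redeploy that shifts line numbers from splitting one issue into many.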

Why Traceway over the alternatives

| | Enterprise (Datadog/New Relic) | DIY OSS (Prometheus + Loki + Tempo) | Traceway |
| --- | --- | --- | --- |
| Pricing | Per-event, per-host, per-seat | Free but heavy ops time | Self-host free, fixed cloud tiers |
| Setup | Vendor SDK per language | Glue 6 tools together | docker compose up -d |
| License | Proprietary | Mixed, some BSL/open-core | MIT, no asterisks |
| OTel | Wrapped in vendor SDK | Collector required | Native OTLP/HTTP ingest |
| Replay + traces + AI | Three separate products | Wire it yourself | One system, one trace ID |

New but functional

Traceway is a younger project compared to the big names. It’s under active development and patches land frequently. For production use, keep an eye on releases and test upgrades before rolling them out.

Architecture overview

Traceway runs as three main components when deployed standalone, with the database layer split across two engines:

| Service | Technology | Role |
| --- | --- | --- |
| Backend | Go 1.25, Gin | OTLP ingest, REST API, alerts, migrations |
| Frontend | SvelteKit 2, Svelte 5, Tailwind CSS v4 | Dashboard SPA |
| Database | ClickHouse + PostgreSQL | Telemetry storage (ClickHouse) and relational data (PostgreSQL) |

Everything communicates over the internal Docker network. The backend exposes OTLP/HTTP endpoints for traces, metrics, and logs at /api/otel/v1/traces, /api/otel/v1/metrics, and /api/otel/v1/logs.
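
If you want to see what travels over those endpoints without installing an SDK, the sketch below builds a single-span request body in the OTLP JSON encoding and posts it with the standard library. Two assumptions to flag: that Traceway accepts the JSON encoding (OTLP/HTTP also allows protobuf), and that no access-token header is needed, since the header name is not covered here. The service, scope, and span names are made up for the example.

```python
import json
import secrets
import time
import urllib.request

# Path taken from the article; the host is a placeholder.
TRACES_URL = "http://your-server-ip/api/otel/v1/traces"


def build_span_payload(service_name: str, span_name: str, duration_ms: int) -> dict:
    """Build a one-span OTLP/JSON export request body."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-smoke-test"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send(payload: dict, url: str = TRACES_URL) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_span_payload("smoke-test", "hello-traceway", duration_ms=25)
# send(payload)  # uncomment once the URL points at a running instance
```

In practice an SDK does all of this for you; the point is only that the wire format is plain HTTP, which is why no collector has to sit in between.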

Prerequisites

  • A Linux VPS or dedicated server with Docker and Docker Compose v2 installed
  • At least 4 GB RAM (ClickHouse and PostgreSQL run alongside the backend and frontend)
  • Root or sudo access
  • Port 80 available for the dashboard

Deploy Traceway with Docker Compose

This is the core of it. Clone, compose up, done.

Step 1: Clone the repo

git clone https://github.com/tracewayapp/traceway
cd traceway

Step 2: Start the stack

docker compose up -d

The dashboard comes up at http://your-server-ip. That’s it.

Under the hood, Docker Compose pulls the Traceway images, boots ClickHouse and PostgreSQL, runs any pending migrations, and starts the API and frontend. The first boot takes maybe 90 seconds while databases initialize.
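
Because the first boot is not instant, a provisioning script may want to wait for the dashboard before moving on. Here is a small sketch that polls a URL until it answers. It assumes the dashboard root responds on plain HTTP; the function names, timeout, and interval are arbitrary choices, not anything Traceway ships.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str) -> bool:
    """Return True if the URL answers with any non-5xx HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=3) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # 4xx still means the server is up and answering.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: not ready yet.
        return False


def wait_until_ready(url, probe=http_ok, timeout_s=180.0, interval_s=5.0, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `timeout_s` of waiting has elapsed."""
    waited = 0.0
    while True:
        if probe(url):
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Example: wait_until_ready("http://your-server-ip")
```

The probe and sleep functions are parameters so the loop can be exercised without a live server.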

Step 3: Create your first project

Open the dashboard, log in with the default credentials, and create a project. The UI walks you through it. You’ll get an ingest endpoint and access token for each project.

Change default credentials

The default login ships as admin@localhost.com / admin. Change the password immediately through the UI, or set your own defaults with environment variables before first boot.

Practical example: Instrument a Node.js app

Let’s walk through wiring up a real application. This is what the transition from “I deployed Traceway” to “I can see my app’s data” looks like.

Set up a basic Express app

If you don’t have an app handy, scaffold one:

mkdir traceway-demo && cd traceway-demo
npm init -y
npm install express @opentelemetry/api @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http

Wire up OpenTelemetry

Create a file called tracing.js:

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

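const sdk = new NodeSDK({
  // Send spans straight to Traceway's OTLP/HTTP trace endpoint.
  traceExporter: new OTLPTraceExporter({
    url: 'http://your-server-ip/api/otel/v1/traces',
    // If ingest requires the project's access token, pass it through the
    // exporter's `headers` option. The header name is not covered here.
  }),
  // Auto-instrument Express, HTTP, and other supported libraries.
  instrumentations: [getNodeAutoInstrumentations()],
});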

sdk.start();

Create the server

Create index.js:

require('./tracing');

const express = require('express');
const app = express();

app.get('/', (req, res) => {
  res.json({ message: 'Traceway is watching' });
});

app.get('/slow', async (req, res) => {
  await new Promise(resolve => setTimeout(resolve, 2000));
  res.json({ message: 'This one took a while' });
});

app.listen(3000, () => console.log('App running on port 3000'));

Run it with node index.js, then curl both endpoints a few times:

curl http://localhost:3000/
curl http://localhost:3000/slow

Switch back to the Traceway dashboard. Traces appear within seconds. Spans break down each request, the /slow endpoint visibly takes longer, and you can click through the waterfall to see exactly where time goes.

Practical example: Instrument a Python app

Python works the same way. Here’s a minimal Flask setup:

mkdir traceway-python && cd traceway-python
python3 -m venv venv && source venv/bin/activate
pip install flask opentelemetry-distro opentelemetry-instrumentation-flask opentelemetry-exporter-otlp-proto-http

Create app.py:

from flask import Flask
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.flask import FlaskInstrumentor

# Point the exporter at your Traceway instance
exporter = OTLPSpanExporter(
    endpoint="http://your-server-ip/api/otel/v1/traces"
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)

@app.route("/")
def home():
    return {"message": "Traceway sees this too"}

@app.route("/error")
def error():
    1 / 0  # Intentional — shows up in exceptions
    return "never reached"

if __name__ == "__main__":
    app.run(port=5000)

Run it with python app.py and request both endpoints (http://localhost:5000/ and http://localhost:5000/error). Back in Traceway, you’ll see spans for each Flask route plus a grouped exception entry for the division by zero.

Practical example: Frontend session replay

Traceway includes session replay out of the box. For any JavaScript frontend, add the Traceway browser SDK:

<script src="https://cdn.jsdelivr.net/npm/@tracewayapp/browser@latest/dist/traceway.min.js"></script>
<script>
  Traceway.init({
    projectToken: 'your-project-token',
    endpoint: 'http://your-server-ip/api/otel/v1/traces',
    sessionReplay: true,
  });
</script>

That snippet captures user sessions, console logs, network requests, and errors. When an exception fires, the error detail page includes a replay of the user’s session leading up to it. You watch exactly what they clicked, typed, and saw.

The browser SDK works with React, Vue, Svelte, Next.js, and plain HTML. There’s also a Flutter package for mobile apps.

Production hardening

Once you’ve confirmed everything works, lock it down:

  • Set up HTTPS with Nginx or Caddy as a reverse proxy in front of the Traceway dashboard
  • Change the default password before exposing anything to the network
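  • Restrict the OTLP ingest paths (/api/otel/v1/...) to the services that need them. In the examples here ingest shares port 80 with the dashboard, so filter by path at the reverse proxy rather than by port, and keep ingest reachable from browsers if you use session replay
  • Mount ClickHouse and PostgreSQL data directories to named volumes so data survives container re-creation, such as docker compose down or an image upgrade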
  • Set resource limits on containers so ClickHouse doesn’t starve the API under heavy ingest
  • Schedule regular backups of both databases

Updating Traceway

cd traceway
git pull
docker compose pull && docker compose up -d

Migrations run automatically when the new backend starts. Check the release notes on GitHub before pulling; breaking changes are called out.

FAQ

How does Traceway compare to Grafana + Loki + Tempo?

Grafana’s stack is battle-tested but requires you to configure and maintain Loki for logs, Tempo for traces, Prometheus for metrics, and Grafana for dashboards. Each piece has its own configuration language, storage backend, and upgrade path. Traceway gives you the same capabilities — logs, traces, metrics — in one Docker Compose stack with one configuration surface. If you already run the Grafana stack and it works for you, Traceway probably isn’t worth the migration. If you’re starting fresh or tired of stitching things together, it’s worth a look.

Can I use Traceway with existing OpenTelemetry instrumentation?

Yes. That’s the whole point. If your services already export OTLP, just repoint the exporter URL at your Traceway instance. No code changes needed beyond the endpoint URL.
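
For services that read their exporter settings from the environment, the repoint can happen at launch time. The sketch below assembles the standard OpenTelemetry environment variables; with OTLP/HTTP, spec-compliant SDKs append v1/traces, v1/metrics, and v1/logs to the base endpoint, which lines up with the Traceway paths shown earlier. The helper name and the "checkout" service name are hypothetical, and a URL hardcoded in your exporter setup takes priority over these variables.

```python
import os


def traceway_otel_env(base_url: str, service_name: str) -> dict:
    """Return standard OpenTelemetry env vars that point OTLP/HTTP export at `base_url`."""
    return {
        # Spec-compliant SDKs append v1/traces, v1/metrics and v1/logs to this base.
        "OTEL_EXPORTER_OTLP_ENDPOINT": base_url.rstrip("/"),
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_SERVICE_NAME": service_name,
    }


# Merge the overrides into the current environment for a child process.
env = {**os.environ, **traceway_otel_env("http://your-server-ip/api/otel/", "checkout")}
# Pass `env` to subprocess.run([...], env=env) when launching the instrumented service.
```

Setting only the base endpoint keeps all three signals pointed at the same place with one variable.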

What's the storage footprint?

It depends heavily on your traffic. ClickHouse compresses well — expect roughly 1-3 GB per million spans for typical HTTP service data. PostgreSQL stores project configuration and user data, which is negligible in comparison. Plan for at least 20 GB of disk if you’re running in production with a handful of services, and monitor growth over the first few weeks.
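
To turn that rule of thumb into a number for your own setup, multiply daily span volume by retention and by the per-million figure. The workload below (500,000 spans per day, 30 days of retention) is a made-up example, and the 1 to 3 GB range is simply the estimate quoted above, so treat the result as a starting point for monitoring rather than a guarantee.

```python
def estimate_disk_gb(spans_per_day: int, retention_days: int, gb_per_million_spans: float) -> float:
    """Estimate ClickHouse disk usage for span data over a retention window."""
    total_spans = spans_per_day * retention_days
    return total_spans / 1_000_000 * gb_per_million_spans


# Hypothetical workload: 500,000 spans per day kept for 30 days.
low = estimate_disk_gb(500_000, 30, gb_per_million_spans=1.0)
high = estimate_disk_gb(500_000, 30, gb_per_million_spans=3.0)
print(f"Plan for roughly {low:.0f} to {high:.0f} GB of span storage")
# prints "Plan for roughly 15 to 45 GB of span storage"
```

Logs, metrics, and session replay data come on top of this, so leave headroom.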

Does Traceway support alerting?

Yes. You can set up alerts for error rate thresholds, latency spikes, or custom metric conditions. Notifications go to Slack, GitHub issues, email, or any webhook endpoint.

Is there a managed cloud option?

Traceway Cloud exists at cloud.tracewayapp.com if you’d rather not self-host. It runs the same MIT code. Pricing is fixed-tier rather than per-event, so your bill stays predictable.

Wrapping up

Traceway scratches a real itch. Observability shouldn’t require a procurement process and a four-figure monthly commitment just to know why your API is slow. The project gives you logs, traces, metrics, exception tracking, and session replay in a stack you control, with an MIT license that doesn’t pull the rug later.

The trade-off is maturity. Datadog has a decade of polish behind it. Traceway is early and moving fast. But for small teams, side projects, and anyone who’d rather spend money on servers than dashboards, it’s worth your evening to try.
