DataLook Docs

Self-hosting

Why DataLook isn't self-hostable in V1, what that means for your data, and what to ask if you genuinely need it.

The honest answer up front: DataLook is not self-hostable in V1, and that's a deliberate choice — not an oversight or a paywall. This page explains the reasoning so you can decide whether the hosted product fits, and tells you exactly what to ask us if it doesn't.

Why not

We're a two-person team chasing a focused beta. Every hour spent making the stack portable — documenting four backing services, supporting arbitrary Postgres/ClickHouse/Redis versions, writing an upgrade path, and answering "it doesn't start on my distro" tickets — is an hour not spent making the product better for the founders we're building it for.

The architecture is also genuinely involved for something you'd run yourself:

  • ClickHouse for the event store, with hot/cold storage tiering, materialized views, and load-bearing memory caps: mis-tune them and the app gets OOM-killed.
  • Redis Streams as the ingest buffer, with a long-lived consumer process that has to stay healthy (XAUTOCLAIM, dead-letter handling, poison-row bisection).
  • Postgres for app data, with schema migrations.
  • A reverse proxy terminating TLS, plus a CDN edge in front.

That's four backing services and a long-lived worker, tuned to fit on one box. We run it so you don't have to think about any of it.
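
For a sense of what one of those worker duties involves, here is a minimal sketch of poison-row bisection: when a batch insert fails, split the batch in half and retry each half until the bad rows are isolated. This is an illustration of the general technique, not DataLook's actual worker code. The names bisect_poison_rows, insert_batch, and fake_insert are hypothetical, and insert_batch is assumed to be all-or-nothing and to raise on a bad batch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]


def bisect_poison_rows(
    rows: Sequence[Row],
    insert_batch: Callable[[Sequence[Row]], None],
) -> Tuple[List[Row], List[Row]]:
    """Insert as many rows as possible and return (inserted, poisoned).

    insert_batch is a hypothetical all-or-nothing insert that raises
    when any row in the batch is rejected.
    """
    if not rows:
        return [], []
    try:
        insert_batch(rows)
        return list(rows), []
    except Exception:
        if len(rows) == 1:
            # A single row that still fails is a poison row; a real worker
            # would route it to a dead-letter store instead of retrying.
            return [], list(rows)
        mid = len(rows) // 2
        ok_left, bad_left = bisect_poison_rows(rows[:mid], insert_batch)
        ok_right, bad_right = bisect_poison_rows(rows[mid:], insert_batch)
        return ok_left + ok_right, bad_left + bad_right


def fake_insert(batch: Sequence[Row]) -> None:
    # Stand-in for a real database insert: rejects any batch that
    # contains a row without a "ts" key.
    if any("ts" not in row for row in batch):
        raise ValueError("bad row in batch")


batch = [{"ts": 1}, {"ts": 2}, {"name": "no timestamp"}, {"ts": 4}]
inserted, poisoned = bisect_poison_rows(batch, fake_insert)
```

In this run the three valid rows end up in inserted and the row without a timestamp ends up in poisoned. A production worker would also need to tell transient failures (the database being down) apart from bad data before bisecting, which is part of why this is more work than it looks.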

This is the same call Plausible and Fathom made early on: hosted-first, self-hosting later (if at all). It keeps a tiny team fast.

What this means for your data

Not self-hosting doesn't mean locked-in. The product is built so you can leave whenever you want:

  • Export any time. Every table on screen exports to CSV or JSON, and there's a date-ranged bulk events endpoint (GET /api/export/events?from=…&to=…&format=csv|json). Your data is yours.
  • Cookieless and PII-light by design. We don't store raw IPs, we strip PII keys server-side, and the visitor id is a rotating daily hash. There's less of your data sitting on our box than with most analytics tools to begin with.
  • The SDK is readable. The browser script is an unobfuscated IIFE you can read end to end — see Security & CSP.
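
As a rough sketch of how a bulk export request might be assembled, the snippet below builds a URL for the events endpoint. Only the path and the from, to, and format parameter names come from this page. The host datalook.example is a placeholder, the ISO 8601 date format (YYYY-MM-DD) is an assumption, and authentication is not shown; check the API reference for the real values.

```python
from datetime import date
from urllib.parse import urlencode, urlunsplit


def build_export_url(host: str, start: date, end: date, fmt: str = "csv") -> str:
    """Build a bulk-export URL for GET /api/export/events.

    Assumption: dates are sent as ISO 8601 (YYYY-MM-DD). The page does not
    specify the expected date format.
    """
    if fmt not in ("csv", "json"):
        raise ValueError("format must be 'csv' or 'json'")
    if start > end:
        raise ValueError("start must not be after end")
    query = urlencode(
        {"from": start.isoformat(), "to": end.isoformat(), "format": fmt}
    )
    return urlunsplit(("https", host, "/api/export/events", query, ""))


# Placeholder host; substitute the hostname of your own DataLook account.
url = build_export_url(
    "datalook.example", date(2024, 1, 1), date(2024, 1, 31), "json"
)
```

Exporting in date-ranged chunks like this keeps each response small enough to retry on its own if a download fails partway through.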

"But ad blockers / my CSP / my compliance team…"

Most of the reasons people reach for self-hosting are already solved without it:

Concern | Hosted answer
Ad blockers hide my analytics | First-party proxy: serve the SDK and collector from your own domain. Nothing third-party to block.
Strict CSP | The proxy install collapses your directives to script-src 'self'; connect-src 'self'. See Security & CSP.
Data residency / compliance | Talk to us (below). EU-targeting + a DPA are on the v1.1 roadmap.
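
If you want to confirm that a proxied install really leaves you with the script-src 'self'; connect-src 'self' policy mentioned above, a small check like the following can help. It is a simplified sketch, not a full CSP parser: the function names parse_csp and is_self_only are made up for this example, and it does not model default-src fallback or other parts of the CSP specification.

```python
from typing import Dict, List, Sequence


def parse_csp(header: str) -> Dict[str, List[str]]:
    """Split a Content-Security-Policy value into {directive: [sources]}.

    Simplified: keeps the first occurrence of a repeated directive and
    ignores everything else the CSP specification defines.
    """
    directives: Dict[str, List[str]] = {}
    for part in header.split(";"):
        tokens = part.split()
        if tokens:
            directives.setdefault(tokens[0].lower(), tokens[1:])
    return directives


def is_self_only(
    header: str,
    names: Sequence[str] = ("script-src", "connect-src"),
) -> bool:
    """Return True if each named directive is present and lists only 'self'."""
    directives = parse_csp(header)
    return all(directives.get(name) == ["'self'"] for name in names)


proxied = "script-src 'self'; connect-src 'self'"
third_party = "script-src 'self' https://cdn.example; connect-src 'self'"

proxied_ok = is_self_only(proxied)
third_party_ok = is_self_only(third_party)
```

Here proxied_ok is True and third_party_ok is False, because the second policy still allows a script origin other than your own domain.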

If you genuinely need to self-host

Some teams have a hard requirement — air-gapped networks, a contractual data-residency clause, a security policy that forbids third-party script origins even when proxied. If that's you, email us with:

  1. The actual constraint — "compliance requires X" beats "we'd prefer it." It helps us understand whether the proxy install already satisfies it.
  2. Your scale — events/month and number of sites. Self-hosting only makes sense above a certain volume.
  3. Who operates it — do you have someone who'll own a ClickHouse instance, or are you hoping for one-click?

We're not against self-hosting forever — it's just not a V1 commitment. Enough well-scoped requests move it up the roadmap.

What's next

  • Beat ad blockers without self-hosting: First-party proxy setup.
  • Read exactly what the script does: Security & CSP.
  • Pull your data out any time: the export menu on every dashboard table.
