Mar 2020 – 2023 · King, Stockholm

A/B Testing Platform

Core frontend engineer — 199 commits, led delivery of metrics system, segmentation, and data layer refactor

TypeScript · React · Highcharts · MobX · Ant Design · Statistics

Overview

King runs hundreds of A/B tests across its games at any given time. Data scientists and product managers need to create experiments, monitor them in real time, and analyse results with statistical rigour — confidence intervals, power analysis, multiple comparison corrections. This platform is what they use every day.

I joined as a core frontend engineer and over three years delivered several major features: a complete metrics management system, a user segmentation system, a React-Query data layer refactor, and continuous improvements to the statistical visualisation layer used to interpret experiment results.

Interactive Demo — Statistical Analysis

The hardest part of building this platform was translating statistics into decisions. Try it yourself — pick a scenario, collect more data, and see how the chart guides you to a clear answer.

Step 1 — What is the real effect?

Pick a scenario. You know the truth — but the platform doesn't yet.

Step 2 — Collect more data

Drag right = more users in the experiment = narrower range on the chart.

5,000 users collected

Getting there — picture is becoming clearer

Step 3 — Read the result


Can't decide yet. The new tutorial might be better, but we've only tested 5,000 players — not enough to be sure. Keep the experiment running.

Experiment: New player tutorial vs old tutorial. Does the new version get more players through level 1?

Each point shows how many extra players completed level 1 compared to the old tutorial. The shaded band is the uncertainty range; it shrinks as you collect more data. When the band stays above the dashed line, the new tutorial is confirmed to be better.
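The decision rule the demo illustrates can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the type names, the `estimateUplift` function, and the use of a normal-approximation interval for a difference in completion rates are all assumptions made for the example.

```typescript
// Hypothetical types and names; the platform's real API is not shown here.
interface GroupResult {
  users: number;       // players exposed to this variant
  conversions: number; // players who completed level 1
}

type Verdict = "better" | "worse" | "undecided";

interface UpliftEstimate {
  uplift: number; // difference in completion rate (variant minus control)
  lower: number;  // lower bound of the confidence band
  upper: number;  // upper bound of the confidence band
  verdict: Verdict;
}

// z = 1.96 corresponds to a two-sided 95% interval under a normal approximation.
function estimateUplift(
  control: GroupResult,
  variant: GroupResult,
  z = 1.96
): UpliftEstimate {
  const pC = control.conversions / control.users;
  const pV = variant.conversions / variant.users;
  const uplift = pV - pC;
  // Standard error of the difference between two independent proportions.
  const se = Math.sqrt(
    (pC * (1 - pC)) / control.users + (pV * (1 - pV)) / variant.users
  );
  const lower = uplift - z * se;
  const upper = uplift + z * se;
  // The band must clear zero (the dashed line) before a verdict is given.
  const verdict: Verdict =
    lower > 0 ? "better" : upper < 0 ? "worse" : "undecided";
  return { uplift, lower, upper, verdict };
}

// Same observed rates (60% vs 62%), different amounts of data.
const small = estimateUplift(
  { users: 2500, conversions: 1500 },
  { users: 2500, conversions: 1550 }
);
const large = estimateUplift(
  { users: 25000, conversions: 15000 },
  { users: 25000, conversions: 15500 }
);
console.log(small.verdict, large.verdict);
```

With 5,000 players in total the band still straddles zero, so the verdict stays undecided; with ten times the data and the same observed rates, the band sits above zero.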

The Challenge

The hardest part was making statistical concepts accessible to non-statisticians. A p-value or confidence interval means nothing to a product manager who just wants to know if the test "worked". The UI had to present complex statistical outputs — uplift percentages, confidence bands, smoothed time series, power curves — in a way that guides the right decision without hiding the underlying rigour.

Technical Highlights

Statistical visualisation layer

The core of the platform is a Highcharts-based chart layer that renders time series with confidence intervals, cumulative uplift curves, and per-segment breakdowns. Charts support 10,000+ data points via Boost mode, display smoothed trend lines, and adapt to the user's chosen confidence level (95% or 99%). Every chart is deeply integrated with the filter system — changing a parameter re-fetches and re-renders in real time.
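As a rough sketch of how a confidence band can be expressed in Highcharts, the function below builds a plain options object with a line series for the uplift and an `arearange` series for the band. The `UpliftPoint` shape, the function name, and the threshold value are assumptions for illustration; `arearange` needs the `highcharts-more` module, and `boostThreshold` only takes effect when the Boost module is loaded.

```typescript
// Hypothetical data shape; the platform's real response format is not shown here.
interface UpliftPoint {
  timestamp: number; // epoch milliseconds
  uplift: number;    // point estimate
  lower: number;     // lower confidence bound
  upper: number;     // upper confidence bound
}

function buildUpliftChartOptions(
  points: UpliftPoint[],
  confidenceLevel: 95 | 99
) {
  return {
    title: { text: `Cumulative uplift (${confidenceLevel}% confidence)` },
    xAxis: { type: "datetime" as const },
    yAxis: {
      title: { text: "Uplift" },
      // Dashed reference line at zero: no difference between variants.
      plotLines: [{ value: 0, width: 1, dashStyle: "Dash" as const }],
    },
    // Illustrative threshold: series longer than this switch to boosted rendering.
    plotOptions: { series: { boostThreshold: 5000 } },
    series: [
      {
        type: "line" as const,
        name: "Uplift",
        data: points.map((p) => [p.timestamp, p.uplift]),
      },
      {
        // arearange points are [x, low, high].
        type: "arearange" as const,
        name: "Confidence band",
        data: points.map((p) => [p.timestamp, p.lower, p.upper]),
        linkedTo: ":previous",
        lineWidth: 0,
        fillOpacity: 0.2,
      },
    ],
  };
}

const chartOptions = buildUpliftChartOptions(
  [{ timestamp: 1000, uplift: 0.02, lower: -0.01, upper: 0.05 }],
  95
);
console.log(chartOptions.series.length);
```

Keeping the options builder a pure function of the data and the chosen confidence level means a filter change only has to re-fetch the points and call it again.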

Self-service metrics system

I designed and built the metrics management module from scratch: a CRUD interface backed by a SQL editor for defining custom metric queries, a measurement units layer, and a validation pipeline. Data scientists can now define new metrics themselves without involving the backend team. As part of this work, I migrated the module's state management from MobX to React-Query, which made it simpler, cacheable, and easier to test.

Custom report builder (ETL UI)

The Custom Table Builder lets analysts define their own data extraction queries, schedule them as background jobs, and export results. The UI manages complex ETL task state — pending, running, failed, completed — with a DAG visualisation showing task dependencies (built with Dagre-D3). Analysts can share report templates with colleagues via URL.
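One way to model the task states named above is a string union plus small pure helpers over the dependency graph. The sketch below is illustrative only: the `EtlTask` shape, the precedence used to summarise a job, and the notion of "blocked by an upstream failure" are assumptions, not the platform's actual model.

```typescript
type TaskStatus = "pending" | "running" | "failed" | "completed";

// Hypothetical task shape; dependsOn holds the ids of upstream tasks.
interface EtlTask {
  id: string;
  status: TaskStatus;
  dependsOn: string[];
}

// Overall job status: a failure wins, then running, then pending.
function summarise(tasks: EtlTask[]): TaskStatus {
  if (tasks.some((t) => t.status === "failed")) return "failed";
  if (tasks.some((t) => t.status === "running")) return "running";
  if (tasks.some((t) => t.status === "pending")) return "pending";
  return "completed";
}

// Ids of tasks that cannot run because an upstream task failed,
// directly or transitively. Repeats until no new task is marked.
function blockedByFailure(tasks: EtlTask[]): string[] {
  const failed = new Set(
    tasks.filter((t) => t.status === "failed").map((t) => t.id)
  );
  const blocked = new Set<string>();
  let changed = true;
  while (changed) {
    changed = false;
    for (const t of tasks) {
      if (failed.has(t.id) || blocked.has(t.id)) continue;
      if (t.dependsOn.some((d) => failed.has(d) || blocked.has(d))) {
        blocked.add(t.id);
        changed = true;
      }
    }
  }
  return tasks.filter((t) => blocked.has(t.id)).map((t) => t.id);
}

const pipeline: EtlTask[] = [
  { id: "extract", status: "completed", dependsOn: [] },
  { id: "transform", status: "failed", dependsOn: ["extract"] },
  { id: "load", status: "pending", dependsOn: ["transform"] },
  { id: "export", status: "pending", dependsOn: ["load"] },
];
console.log(summarise(pipeline), blockedByFailure(pipeline));
```

Helpers like these keep the rendering code simple: the graph view only needs each node's status and whether it is blocked.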

Power calculator

Before running an experiment, teams need to know how long to run it and how many users they need to detect a meaningful effect. The power calculator interface takes inputs (MDE, population size, number of test groups, alpha, beta) and returns the required sample size and estimated run duration — helping teams design statistically sound experiments before they start.
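The calculation behind such a tool can be sketched with the textbook normal-approximation formula for comparing two proportions. This is not the platform's actual method: the `baselineRate` and `dailyEligibleUsers` inputs, the absolute MDE, and the absence of any multiple-comparison adjustment are assumptions made to keep the example small.

```typescript
// Standard normal CDF via the Abramowitz-Stegun erf approximation (7.1.26).
function normalCdf(x: number): number {
  const z = Math.abs(x) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * z);
  const poly =
    t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
    t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-z * z);
  return x >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Inverse CDF by bisection; adequate precision for sample-size planning.
function normalQuantile(p: number): number {
  let lo = -10;
  let hi = 10;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// Hypothetical input shape for the example.
interface PowerInputs {
  baselineRate: number;       // conversion rate of the control group
  mde: number;                // minimum detectable effect, absolute
  alpha: number;              // two-sided significance level
  beta: number;               // type II error rate (power = 1 - beta)
  groups: number;             // number of test groups, including control
  dailyEligibleUsers: number; // users entering the experiment per day
}

function planExperiment(input: PowerInputs) {
  const p1 = input.baselineRate;
  const p2 = p1 + input.mde;
  const zAlpha = normalQuantile(1 - input.alpha / 2);
  const zBeta = normalQuantile(1 - input.beta);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const perGroup = Math.ceil(
    ((zAlpha + zBeta) ** 2 * variance) / input.mde ** 2
  );
  const total = perGroup * input.groups;
  const days = Math.ceil(total / input.dailyEligibleUsers);
  return { perGroup, total, days };
}

const plan = planExperiment({
  baselineRate: 0.1,
  mde: 0.02,
  alpha: 0.05,
  beta: 0.2,
  groups: 2,
  dailyEligibleUsers: 1000,
});
console.log(plan);
```

Halving the MDE roughly quadruples the required sample, which is why the calculator is most useful before an experiment is launched.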

Data layer architecture

The codebase had grown to use three different data-fetching approaches: legacy SWR, MobX stores with direct API calls, and ad-hoc fetch calls. I led a systematic refactor of key modules to React-Query 3, establishing a consistent pattern: declarative hooks, stale-time management, and devtools visibility. The platform connects to five separate backend services, all behind a unified API gateway layer.
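A pattern along these lines might look as follows. The sketch builds the options object that React-Query 3's `useQuery` accepts in its object form (`queryKey`, `queryFn`, `staleTime`); the endpoint path, the `ExperimentSummary` type, the injected `fetchJson` helper, and the 30-second stale time are all hypothetical.

```typescript
// Hypothetical response type for the example.
interface ExperimentSummary {
  id: string;
  name: string;
  status: "draft" | "running" | "stopped";
}

// The fetcher is injected so the options stay testable without a network.
type Fetcher = (url: string) => Promise<unknown>;

// Central key factory: every module derives cache keys the same way.
const experimentKeys = {
  all: ["experiments"] as const,
  detail: (id: string) => ["experiments", id] as const,
};

function experimentQueryOptions(id: string, fetchJson: Fetcher) {
  return {
    queryKey: experimentKeys.detail(id),
    queryFn: () =>
      fetchJson(
        `/api/experiments/${encodeURIComponent(id)}`
      ) as Promise<ExperimentSummary>,
    staleTime: 30_000, // treat cached data as fresh for 30 seconds
  };
}

// Inside a component, the options would be passed straight to the hook:
//   const { data, isLoading } = useQuery(experimentQueryOptions(id, fetchJson));

let requestedUrl = "";
const sampleOptions = experimentQueryOptions("exp 42", async (url) => {
  requestedUrl = url;
  return { id: "exp 42", name: "Tutorial test", status: "running" };
});
void sampleOptions.queryFn();
console.log(sampleOptions.queryKey, requestedUrl);
```

Keeping keys and options in one place makes cache invalidation predictable, since every caller refers to the same key for the same resource.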

Stack

TypeScript · React · MobX · React-Query · Highcharts · Ant Design · Dagre-D3 · Swagger Codegen · Vega · SQL