unitcheck

Changelog

Changelog

All notable changes to UnitCheck — methodology, features, data coverage, and public-facing operations — are documented here. This file is the source for the public /changelog page on the site.

The format follows Keep a Changelog with categories adapted to UnitCheck's context:

  • Added — new features, new categories, new retailers, new content
  • Changed — updates to existing behavior
  • Methodology — changes to calculations, formulas, or data-quality rules
  • Data — coverage updates (new locales, expanded catalog, quality fixes)
  • Fixed — bug fixes and data corrections
  • Deprecated — features or behaviors being phased out
  • Removed — features or content removed
  • Security — security-related updates

Entries under [Unreleased] are accumulated between releases and moved into a dated release block when published.


[Unreleased]

Added

  • Initial project scaffolding (SPEC, DECISIONS, RUNBOOK, METHODOLOGY, CLAUDE, README)

  • Brand identity locked: canonical UnitCheck, lowercase wordmark, unitcheck.co domain, @getunitcheck social handles

  • Tagline: "Check every price, per unit."

  • Core library src/lib/: currency helpers, protein unit-price calculator, Best Value composite score, quality-flag rules (hard exclusions + soft flags), Amazon affiliate URL builder, LLMClient interface, RetailerAdapter interface, Kysely Database schema type. All pure-logic, TDD-covered (38+ tests).

  • First concrete LLMClient: AnthropicLLMClient (workers/ingestion/llm/) targeting Claude Haiku 4.5 with Sonnet 4.6 fallback. Tool-use forced-choice structured output (tool input schema generated via native z.toJSONSchema), multiplicative confidence (model self-report × validation factor), ephemeral prompt caching on the system block. Additive systemPrompt field on LLMExtractOptions.

  • First concrete RetailerAdapter scaffold: AmazonCreatorsAdapter (workers/ingestion/retailers/). buildAffiliateUrl and parseStructuredData live; searchProducts / getProductDetails throw NotImplementedError until Amazon Creators API access clears the 10-qualifying-sales gate.

  • Cloudflare Pages project live: auto-deploy from main via Cloudflare's GitHub integration, custom domains unitcheck.co + www.unitcheck.co wired.

  • Protein extraction prompt v1 (prompts/extractions/protein-v1.md), Zod schema mirroring METHODOLOGY §3 (src/lib/extractions/), 10-entry synthetic test corpus (tests/extractions/protein-corpus.json), mocked unit tests, and an ENV-gated real-API runner (scripts/run-extraction-corpus.ts). Baseline accuracy: 95.6% (docs/ingestion/accuracy-v1.md).

  • Ingestion pipeline (workers/ingestion/pipeline.ts + Cloudflare Worker entry + local pnpm ingest:seed runner). Extracts protein facts via Claude, applies METHODOLOGY §6 confidence gates (accept ≥0.80, exclude <0.60), computes $/g protein / $/serving / $/100g protein, and writes products + extractions + unit prices + price history to Supabase. Cron is defined but disabled during bootstrap; operator triggers runs via pnpm ingest:seed or POST /admin/ingest (Bearer auth via ADMIN_TOKEN).

  • Seed tooling: pnpm seed:add (LLM-assisted interactive CLI — paste SiteStripe URL + page copy, script splits into structured fields), pnpm seed:validate (Zod), and data/seed-products.json as the canonical hand-curated input for bootstrap phase.

  • ESLint rule no-restricted-imports on src/ forbidding imports from workers/. CI fails if the site bundle ever pulls in Anthropic or Supabase SDKs.

  • wrangler.toml scaffolded; pnpm wrangler deploy --dry-run passes. Production deploy (secrets + wrangler deploy) is an explicit operator step.

  • SEO + GEO strategy established as a core build concern: deep-research synthesis + codebase-mapped P0–P2 implementation plan (docs/seo-geo-strategy.md), recorded as ADR-014. Headline: UnitCheck is the ideal GEO challenger (its computed $/unit is a citeable statistic); own the metric via Dataset + DefinedTerm structured data, lead every leaderboard with a direct-answer + statistics + dated freshness, stay penalty-safe at scale, and expose machine-readable feeds for AI/agent discovery.

  • Renamed the ingestion Worker config wrangler.tomlwrangler.ingestion.toml (feature/go-redirect, internal): once the /go Pages Function added a functions/ directory, Cloudflare Pages began trying to consume the root wrangler.toml (a Worker-only config) as Pages config and failed every deploy. A non-reserved filename makes Pages ignore it; worker commands now take -c wrangler.ingestion.toml. [internal]

  • Faster navigation + image loads (feature/perf-prefetch, internal): Speculation Rules prefetch likely-next pages on pointer intent (excludes affiliate /go links), and product-image pages preconnect to Amazon's image CDN. Native browser features, no new dependencies, no caching/staleness risk. [internal]

  • Production database provisioned (unitcheck-prod, internal): applied migrations 00010006 and mirrored the 20-product seed catalog (+ unit prices) from dev, so the production site builds against its own database. Cloudflare Pages Production reads unitcheck-prod; Preview reads unitcheck-dev. [internal]

  • Keep-warm cron (.github/workflows/keep-warm.yml, internal): pings both free-tier Supabase projects every 3 days so neither auto-pauses — a paused project breaks the production build. [internal]

  • First-party affiliate click path live (feature/go-redirect): the /go/[id] link behind every "View on Amazon" button now resolves. Built as a Cloudflare Pages Function that 302-redirects to the product's affiliate URL and records a cookieless, PII-free click (source page, product/retailer, country, coarse referrer category, plus campaign/creator/ASIN attribution) — closing the gap where outbound clicks previously only worked without JavaScript. No tracking cookies, IPs, or raw user agents (per SPEC.md §15). [internal]

  • DB foundation (feature/db-foundation, internal): migrations made clean, sequential, and complete so prod can be provisioned from committed files alone. New 0004_seed_protein_subcategories.sql (the 4 whey/casein/plant/blend sub-categories, idempotent) and 0005_click_logs_attribution.sql (adds utm_*, creator_slug, promoted_asin to click_logs for the /go worker + social attribution); renamed the colliding second 0003_* to 0006_sub_category_summaries.sql. Ingestion now tags category_id/sub_category_id at write time via a pure proteinTypeToSubCategorySlug mapper, eliminating the manual backfill that previously gated leaderboard rendering after every run. [internal]

  • Social-media growth strategy established (ADR-015, docs/social-strategy.md): a data-native content engine that turns the price dataset into native content across TikTok/Pinterest/Reddit (Tier 1) + Instagram/YouTube/Threads (Tier 2), running on the SAME backend as SEO/GEO + the /go worker (per-category feeds, OG image endpoint, ingestion price-move events, UTM/cid/ASIN-aware click logging). Researched via a 12-angle Sonnet workflow with Opus synthesis; includes a costed AI/automation stack and a 30/60/90 rollout.

  • CLAUDE.md: added operating principle #6 "Built to be cited (SEO + GEO)" and a pointer to docs/seo-geo-strategy.md, so SEO/GEO scaffolding is treated as first-class when building any page or template.

  • CLAUDE.md + DECISIONS.md: pointer to docs/social-strategy.md (ADR-015) for the social/creator growth strategy; TODO.md gained a "Social distribution & content automation" section.


[0.1.0] — TBD (MVP launch target)

Status: Pre-build. This release block reserved for MVP launch.

Added (planned)

  • Protein powder category live (whey, casein, plant-based, blend sub-categories)
  • Amazon retailer integration across US, UK, CA locales
  • Category leaderboard pages sortable by $/g protein, $/g powder, $/serving, total price, Best Value, rating, review count
  • Full filter set: protein type, flavor, dietary flags, servings range, rating floor, review count floor, Prime only, in-stock only
  • Product detail pages with live unit-price calculations and "last synced" timestamps
  • Launch editorial content: 6–10 buying guides covering whey value rankings, plant-based comparisons, grass-fed analysis, subscribe-and-save math, shrinkflation trends
  • Public methodology page documenting every formula
  • Dark mode
  • Cookieless analytics via Cloudflare Web Analytics
  • First-party affiliate click tracking

Methodology (planned)

  • Primary metric: $/g protein (novel; no competitor calculates this)
  • Best Value composite score: unit-price + log(review count) × rating × Prime bonus
  • Confidence-gated extraction (≥0.80 accept; 0.60–0.79 re-extract with Sonnet; <0.60 exclude)
  • 6-hour ingestion refresh cadence
  • Honest "last synced" timestamps everywhere prices are displayed

Release format (reference — future releases)

Future releases follow this template:

## [X.Y.Z] — YYYY-MM-DD

Brief one-line summary of the release.

### Added
- New feature or content

### Changed
- Behavioral change to existing feature

### Methodology
- Formula or rule change, with rationale

### Data
- Coverage expansion or correction

### Fixed
- Bug fix description

Version scheme: semantic versioning (major.minor.patch).

  • Major: Breaking changes to public methodology or URL structure
  • Minor: New categories, new retailers, new features
  • Patch: Bug fixes, content updates, minor improvements

Public vs. internal changes

Not every change in this repo appears in the public /changelog. The rule:

Public-facing (included in /changelog page):

  • Methodology changes
  • Category or retailer launches
  • User-facing feature additions
  • Data corrections affecting rankings
  • Material UI or URL changes
  • Privacy Policy or Terms updates

Internal-only (in this file but not rendered publicly, or in a separate internal changelog):

  • Refactors
  • Dependency updates
  • CI configuration
  • Editorial content edits (typos, clarifications)
  • Backend performance improvements invisible to users

When adding an entry, tag it [internal] if it should not render on the public page. Build tooling can filter accordingly.


Contributing to the changelog

Every PR that changes user-visible behavior, methodology, or data coverage must include a changelog entry under [Unreleased]. The entry should be:

  • Written for a non-technical reader. The public changelog is read by users and AI engines looking for freshness signals.
  • Specific about the change. "Updated protein calculation" is too vague. "Corrected grams-to-ounces conversion for single-unit listings to use exact 28.3495 conversion factor" is better.
  • Linked to rationale where relevant. For methodology changes, link to the updated METHODOLOGY.md section.
  • Dated only when released. Entries accumulate under [Unreleased] and get dated when the release is cut.

End of changelog.