From 64d62e79ce8b8d8b6cdfb3820e38e53c0661d762 Mon Sep 17 00:00:00 2001
From: mkelvers <mikkelelvers@outlook.com>
Date: Sat, 6 Jun 2026 15:54:10 +0200
Subject: [PATCH] refactor: remove docs folder

---
 docs/recommendation-architecture.md | 183 ----------------------------
 1 file changed, 183 deletions(-)
 delete mode 100644 docs/recommendation-architecture.md

diff --git a/docs/recommendation-architecture.md b/docs/recommendation-architecture.md
deleted file mode 100644
index c2a8434..0000000
--- a/docs/recommendation-architecture.md
+++ /dev/null
@@ -1,183 +0,0 @@
-# Recommendation Architecture
-
-This document defines the long-term shape of the `Top Pick for You`
-recommendation system.
-The goal is to keep the current implementation simple enough to operate inside
-the existing Go application while preserving a clean path toward a larger
-recommender system.
-
-## Current Serving Model
-
-The current `Top Pick for You` implementation is a bounded hybrid ranker:
-
-- builds weighted seeds from user watch history
-- uses Jikan recommendation edges as collaborative candidates
-- uses watchlist-derived genres, themes, studios, and demographics as profile
-  search candidates
-- excludes anime already present in the watchlist
-- boosts candidates that match user taste signals
-- reranks the final list to reduce genre pileups
-
-The online request path stays intentionally small:
-
-1. load recent watchlist state
-2. derive strong seeds
-3. build a weighted taste profile from those seeds
-4. fetch bounded collaborative and profile-search candidate sets
-5. score candidates
-6. rerank for diversity
-7. return top results
-
-## Target System Shape
-
-The future recommender should keep four stable layers:
-
-1. event collection
-2. feature aggregation
-3. candidate generation
-4. ranking and reranking
-
-That separation matters more than the specific model used at each stage.
-
-## Event Collection
-
-Recommendations should eventually be driven by behavior events, not only by
-watchlist state.
-
-Important events:
-
-- `impression`
-- `click`
-- `add_to_watchlist`
-- `start_watch`
-- `progress_update`
-- `complete`
-- `drop`
-- `hide_recommendation`
-- `search`
-
-Event capture should preserve:
-
-- `user_id`
-- `anime_id`
-- `event_type`
-- `occurred_at`
-- `source`
-- contextual metadata as JSON
-
-## Feature Aggregation
-
-Online requests should not recompute the full user profile from raw events.
-Instead, background jobs should maintain aggregated feature snapshots.
-
-Useful profile features:
-
-- genre affinity
-- theme affinity
-- studio affinity
-- demographic affinity
-- completion rate by genre
-- abandonment rate by genre
-- preference for airing vs finished anime
-- preference for recent vs older anime
-- short-term interest profile
-- long-term stable taste profile
-
-These features should eventually live in a durable profile snapshot table so
-the serving path remains cheap.
-
-## Candidate Generation
-
-Candidate generation should be modular. Each source should produce:
-
-- `anime_id`
-- `source`
-- `source_score`
-- explanation metadata
-
-Primary candidate sources:
-
-- item-item recommendation edges
-- related anime and sequel chains
-- content-similar anime from genres, themes, studios, and demographics
-- trending titles inside the user taste envelope
-- seasonal titles aligned with recent behavior
-- editorial or promoted rails when needed
-
-Candidate generation should stay bounded. Ranking the full catalog online is
-not a viable long-term approach.
-
-## Ranking
-
-The current ranker is heuristic by design. That is the correct starting point.
-
-Near-term ranking inputs:
-
-- collaborative recommendation weight
-- watch history status weight
-- recency decay
-- progress-based engagement
-- genre overlap
-- theme overlap
-- studio overlap
-- demographic overlap
-- airing or freshness alignment
-- popularity moderation
-
-The ranking API should remain stable even if the scoring model changes later.
-That allows a future move to gradient-boosted trees or other learned rankers
-without rewriting candidate generation or serving.
-
-## Reranking
-
-The final serving stage should apply product constraints that raw ranking will
-not handle well on its own:
-
-- genre diversity
-- franchise caps
-- duplicate suppression
-- hide or negative-feedback suppression
-- maturity filtering
-- freshness and exploration budget
-
-This is intentionally a separate concern from relevance scoring.
-
-## Data Tables
-
-The first recommendation-specific schema additions should support:
-
-- append-only event capture
-- recommendation impression tracking
-- cached user profile snapshots
-
-These tables are created in migration `024_add_recommendation_foundation.sql`.
-
-## Roadmap
-
-### V1
-
-- bounded hybrid ranker in request path
-- uses watchlist history and Jikan metadata
-- no offline jobs required
-
-### V2
-
-- capture user recommendation and watch behavior events
-- persist user profile snapshots
-- precompute candidate caches
-- add explicit feedback controls such as hide or not interested
-
-### V3
-
-- split retrieval from ranking
-- precompute similarity graphs and user candidate pools
-- run offline evaluation on impressions, clicks, starts, and completes
-- introduce learned ranking only when enough behavior data exists
-
-## Operational Rules
-
-- keep request-time fanout bounded
-- keep scoring explainable
-- log recommendation impressions before introducing heavier models
-- prefer replaceable modules over one large recommendation function
-- treat data collection as the foundation for later ML, not an optional extra