Zum Hauptinhalt springen

YouTube Coverage Analyse

Wissensbasis fuer die fortlaufende Optimierung der YouTube-Daten-Coverage. Alle Queries laufen direkt gegen die Production-DB (TablePlus oder psql) und sind die Grundlage fuer Plan 56 (Intelligente Tier-Aktivierung) und seine Folge-Plaene.

Konvention: Realdaten-Snapshots werden woechentlich automatisch in youtube_coverage_weekly_snapshots persistiert (siehe Plan 56 Phase 11). Die hier dokumentierten Queries sind die manuelle Variante der gleichen Aggregation — fuer Adhoc-Analysen, neue Hypothesen-Tests, oder bei Plan-Anpassungen.


Snapshot 2026-04-28 (Erstmessung)

Bestand

MetrikWert
YouTube-Profile gesamt545.644
Tracking aktiv544.163
PRO (aktiv getrackt + Sync)21.055 (3,9%)
Free (kein Per-Video-Sync)523.108 (96,1%)
youtube_video_syncs.auto_sync_enabled = true21.064
Sync mit 3+ Fehl-Tagen45
PRO ohne Sync (defekt)0
Free mit Sync (Fehlkonfiguration)0

→ PRO ↔ Per-Video-Sync ist 1:1 sauber. Free-Profile bekommen aktuell gar keine Per-Video-Snapshots.

Aktivitaets-Verteilung der 21K PRO-Profile

Letzter Video-UploadPRO-ProfileAnteil
≤ 7 Tage14.73770,0%
≤ 30 Tage2.37811,3%
30–60 Tage8344,0%
60–90 Tage4522,1%
90–180 Tage7353,5%
180–365 Tage6223,0%
> 365 Tage1.2926,1%
Videos in 30 TagenPRO-Profile
20+7.601
10–193.206
5–92.692
3–41.617
1–21.999
0~3.940

Per-Video-Snapshot-Realitaet

MetrikWert
Daily Snapshots (Mittel letzte 14 Tage)~190.000
Daily Snapshots (Schwankungsbreite)124.481 – 270.266
Unique Videos mit Snapshots in 30 Tagen2.694.981
Total Snapshots in 30 Tagen5.152.491
Average Snapshots pro Video / 30 Tage1,9
Storage youtube_video_daily_metrics2.086 MB / 7,26 Mio Rows

→ Kein Video wird taeglich gesnappt. Im Mittel 1–2× pro Monat. Das ist konsistent mit der Plan-57-Capacity-Begrenzung.

Snapshot-Freshness im 180d-Fenster

StatusVideosAnteil
Frisch (≤ 1 Tag)443.75615%
≤ 7 Tage342.16012%
≤ 30 Tage (potenziell stale)1.505.25952%
> 30 Tage (stale)433.36115%
Kein Snapshot145.7335%

67% der Videos haben Snapshots ≥ 7 Tage alt. Publishing-Analytics-Aggregation arbeitet de facto schon mit z.T. veralteten View-Counts.

Channel-Kategorie

Videos in 180dChannelsTotal VideosAnteil Videos
100+2.7522.365.18482%
51–1002.715190.2027%
21–507.328234.9188%
6–204.88272.1212%
1–51.4593.829<1%
01.91600%

2.752 Channels (13%) verursachen 82% der gesamten Video-Last.


Adhoc-Queries (TablePlus / psql)

Block A — Bestand & Volumen

-- A1: Gesamtverteilung YouTube-Profile
SELECT
COUNT(*) AS total_yt_profiles,
COUNT(*) FILTER (WHERE tracking_enabled = true) AS tracking_enabled,
COUNT(*) FILTER (WHERE tracking_enabled = false) AS tracking_disabled,
COUNT(*) FILTER (WHERE archived_at IS NOT NULL) AS archived,
COUNT(*) FILTER (WHERE blocked_at IS NOT NULL) AS blocked,
COUNT(*) FILTER (WHERE pro_enabled = true) AS pro_enabled,
COUNT(*) FILTER (WHERE pro_enabled = true AND tracking_enabled = true) AS pro_active
FROM social_profiles
WHERE platform = 'youtube';

-- A2: Video-Sync-Status (Plan 47 Stat-Boxen)
SELECT
COUNT(*) AS total_syncs,
COUNT(*) FILTER (WHERE auto_sync_enabled = true) AS active_sync,
COUNT(*) FILTER (WHERE auto_sync_enabled = false) AS inactive_sync,
COUNT(*) FILTER (WHERE consecutive_fail_days >= 3) AS failing_sync
FROM youtube_video_syncs;

-- A3: PRO ↔ Sync-Konsistenz
SELECT
COUNT(*) FILTER (WHERE sp.pro_enabled AND yvs.auto_sync_enabled) AS pro_with_active_sync,
COUNT(*) FILTER (WHERE sp.pro_enabled AND (yvs.auto_sync_enabled IS NULL OR yvs.auto_sync_enabled = false)) AS pro_without_sync,
COUNT(*) FILTER (WHERE NOT sp.pro_enabled AND yvs.auto_sync_enabled) AS free_with_active_sync
FROM social_profiles sp
LEFT JOIN youtube_video_syncs yvs ON yvs.social_profile_id = sp.id
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true;

Block B — Aktivitaets-Verteilung

-- B1: Histogramm "Letzter Video-Upload vor X Tagen"
WITH last_video AS (
SELECT sp.id, MAX(yv.published_at) AS last_published
FROM social_profiles sp
LEFT JOIN youtube_videos yv ON yv.social_profile_id = sp.id AND yv.is_removed = false
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id
)
SELECT
CASE
WHEN last_published IS NULL THEN '00_no_videos'
WHEN last_published >= NOW() - INTERVAL '7 days' THEN '01_<=7d'
WHEN last_published >= NOW() - INTERVAL '30 days' THEN '02_<=30d'
WHEN last_published >= NOW() - INTERVAL '60 days' THEN '03_<=60d'
WHEN last_published >= NOW() - INTERVAL '90 days' THEN '04_<=90d'
WHEN last_published >= NOW() - INTERVAL '180 days' THEN '05_<=180d'
WHEN last_published >= NOW() - INTERVAL '365 days' THEN '06_<=365d'
ELSE '07_>365d'
END AS bucket,
COUNT(*) AS profiles
FROM last_video
GROUP BY bucket
ORDER BY bucket;

-- B2: Histogramm "Videos in letzten 30 Tagen"
WITH recent_videos AS (
SELECT sp.id,
COUNT(yv.video_id) FILTER (
WHERE yv.published_at >= NOW() - INTERVAL '30 days' AND yv.is_removed = false
) AS videos_30d
FROM social_profiles sp
LEFT JOIN youtube_videos yv ON yv.social_profile_id = sp.id
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id
)
SELECT
CASE
WHEN videos_30d = 0 THEN '0_videos'
WHEN videos_30d BETWEEN 1 AND 2 THEN '1-2_videos'
WHEN videos_30d BETWEEN 3 AND 4 THEN '3-4_videos'
WHEN videos_30d BETWEEN 5 AND 9 THEN '5-9_videos'
WHEN videos_30d BETWEEN 10 AND 19 THEN '10-19_videos'
ELSE '20+_videos'
END AS bucket,
COUNT(*) AS profiles
FROM recent_videos
GROUP BY bucket
ORDER BY bucket;

-- B3: Follower-Delta letzte 30 Tage (Channel-Lebenszeichen)
WITH follower_movement AS (
SELECT sp.id,
MAX(spdm.followers_count) - MIN(spdm.followers_count) AS follower_delta_30d,
COUNT(spdm.id) AS metric_days_30d
FROM social_profiles sp
LEFT JOIN social_profile_daily_metrics spdm
ON spdm.social_profile_id = sp.id
AND spdm.date >= CURRENT_DATE - INTERVAL '30 days'
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id
)
SELECT
CASE
WHEN metric_days_30d = 0 OR follower_delta_30d IS NULL THEN '0_no_metrics'
WHEN follower_delta_30d = 0 THEN '1_zero_delta'
WHEN follower_delta_30d <= 10 THEN '2_<=10'
WHEN follower_delta_30d <= 100 THEN '3_<=100'
WHEN follower_delta_30d <= 1000 THEN '4_<=1000'
ELSE '5_>1000'
END AS bucket,
COUNT(*) AS profiles
FROM follower_movement
GROUP BY bucket
ORDER BY bucket;

Block C — Tier-Simulation (Schwellenwert-Tuning)

-- C1: Default-Schwellen 90d/60d/5 — Was wuerde Plan 56 heute klassifizieren?
WITH signals AS (
SELECT
sp.id, sp.pro_enabled,
MAX(yv.published_at) AS last_video_at,
COUNT(yv.video_id) FILTER (
WHERE yv.published_at >= NOW() - INTERVAL '30 days' AND yv.is_removed = false
) AS videos_30d,
COALESCE(MAX(spdm.followers_count) - MIN(spdm.followers_count), 0) AS follower_delta_30d
FROM social_profiles sp
LEFT JOIN youtube_videos yv ON yv.social_profile_id = sp.id AND yv.is_removed = false
LEFT JOIN social_profile_daily_metrics spdm
ON spdm.social_profile_id = sp.id
AND spdm.date >= CURRENT_DATE - INTERVAL '30 days'
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id, sp.pro_enabled
)
SELECT
CASE
WHEN pro_enabled AND (last_video_at IS NULL OR last_video_at < NOW() - INTERVAL '180 days')
THEN 'paused (PRO dormant)'
WHEN pro_enabled AND videos_30d >= 5
THEN 'full (PRO + high)'
WHEN pro_enabled AND videos_30d BETWEEN 1 AND 4
THEN 'standard (PRO + moderate)'
WHEN pro_enabled AND last_video_at >= NOW() - INTERVAL '90 days'
THEN 'light (PRO + recent but no 30d)'
WHEN pro_enabled
THEN 'paused (PRO + 90-180d)'
ELSE 'free (no per-video-sync)'
END AS expected_tier,
COUNT(*) AS profiles,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct
FROM signals
GROUP BY expected_tier
ORDER BY profiles DESC;

→ Variiere INTERVAL '90 days' / INTERVAL '60 days' / videos_30d >= 5 um strengere/laschere Heuristiken zu testen.

Block D — Quota-Footprint

-- D1: Daily Per-Video-Snapshot-Volumen (letzte 14 Tage)
SELECT date,
COUNT(*) AS snapshots,
COUNT(DISTINCT youtube_video_id) AS unique_videos
FROM youtube_video_daily_metrics
WHERE date >= CURRENT_DATE - INTERVAL '14 days'
GROUP BY date
ORDER BY date DESC;

-- D2: 30-Tage-Aggregat — wie viele Videos werden ueberhaupt regelmaessig gesnappt?
SELECT
COUNT(DISTINCT youtube_video_id) AS videos_with_snapshots_last_30d,
COUNT(*) AS total_snapshots_last_30d,
ROUND(COUNT(*)::numeric / NULLIF(COUNT(DISTINCT youtube_video_id), 0), 1) AS avg_snapshots_per_video
FROM youtube_video_daily_metrics
WHERE date >= CURRENT_DATE - INTERVAL '30 days';

-- D3: Storage-Footprint
SELECT
pg_size_pretty(pg_total_relation_size('youtube_video_daily_metrics')) AS total_size,
(SELECT COUNT(*) FROM youtube_video_daily_metrics) AS total_rows;

Block E — Publishing-Analytics-Impact

-- E1: Channel-Verteilung im 180d-Fenster
WITH video_per_channel AS (
SELECT sp.id, sp.handle,
COUNT(*) FILTER (
WHERE yv.published_at >= NOW() - INTERVAL '180 days' AND yv.is_removed = false
) AS videos_180d
FROM social_profiles sp
JOIN youtube_videos yv ON yv.social_profile_id = sp.id
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id, sp.handle
)
SELECT
CASE
WHEN videos_180d = 0 THEN '0_videos'
WHEN videos_180d BETWEEN 1 AND 5 THEN '1-5'
WHEN videos_180d BETWEEN 6 AND 20 THEN '6-20'
WHEN videos_180d BETWEEN 21 AND 50 THEN '21-50'
WHEN videos_180d BETWEEN 51 AND 100 THEN '51-100'
ELSE '100+'
END AS bucket,
COUNT(*) AS channels,
SUM(videos_180d) AS total_videos_in_bucket
FROM video_per_channel
GROUP BY bucket
ORDER BY bucket;

-- E2: Was wuerden wir bei light-Tier verlieren?
WITH light_candidates AS (
SELECT sp.id
FROM social_profiles sp
LEFT JOIN youtube_videos yv
ON yv.social_profile_id = sp.id
AND yv.published_at >= NOW() - INTERVAL '30 days'
AND yv.is_removed = false
WHERE sp.platform = 'youtube' AND sp.tracking_enabled = true
GROUP BY sp.id
HAVING COUNT(yv.video_id) = 0
)
SELECT
COUNT(DISTINCT yvdm.youtube_video_id) AS videos_we_would_stop_tracking,
COUNT(*) AS daily_snapshots_we_would_stop
FROM light_candidates lc
JOIN youtube_videos yv ON yv.social_profile_id = lc.id
JOIN youtube_video_daily_metrics yvdm ON yvdm.youtube_video_id = yv.video_id
WHERE yvdm.date >= CURRENT_DATE - INTERVAL '7 days';

-- E3: Snapshot-Freshness pro Video im 180d-Fenster
WITH video_recency AS (
SELECT yv.video_id, yv.social_profile_id,
MAX(yvdm.date) AS last_snapshot_date,
CURRENT_DATE - MAX(yvdm.date) AS days_since_snapshot
FROM youtube_videos yv
LEFT JOIN youtube_video_daily_metrics yvdm ON yvdm.youtube_video_id = yv.video_id
WHERE yv.published_at >= NOW() - INTERVAL '180 days' AND yv.is_removed = false
GROUP BY yv.video_id, yv.social_profile_id
)
SELECT
CASE
WHEN last_snapshot_date IS NULL THEN '0_no_snapshot'
WHEN days_since_snapshot <= 1 THEN '1_<=1d (frisch)'
WHEN days_since_snapshot <= 7 THEN '2_<=7d'
WHEN days_since_snapshot <= 30 THEN '3_<=30d (potenziell stale)'
ELSE '4_>30d (stale)'
END AS snapshot_freshness,
COUNT(*) AS videos
FROM video_recency
GROUP BY snapshot_freshness
ORDER BY snapshot_freshness;

Schluesselerkenntnisse fuer Plan 56 und Folgeplaene

  1. PRO ↔ Per-Video-Sync ist 1:1. Plan 56 wirkt fast ausschliesslich auf den PRO-Pool von 21K Profilen — die 523K Free-Profile sind de facto schon light (kein Per-Video-Sync).

  2. Quota-Einsparung im Per-Video-Pool ist marginal (~3.000 Snapshots/Tag potentielle Einsparung = vernachlaessigbar). Plan 56 sollte nicht primaer Quota sparen, sondern Coverage-Capacity fairer verteilen.

  3. Echter Quota-Hebel liegt ausserhalb Plan 56: 227.311 Profile mit zero_delta in 30d bekommen taeglich channels.list (~227K Quota-Units/Tag). Throttling auf woechentlich = -194K Units/Tag. Gehoert in einen Schwester-Plan (Channel-Stats-Throttle).

  4. Coverage ist heute schon nicht-taeglich: avg 1,9 Snapshots pro Video pro 30 Tage. Plan-56-Tier-Definitionen muessen relative Prioritaet in der Coverage-Pipeline definieren, nicht absolute Frequenz.

  5. Publishing-Analytics-Impact gering, wenn light-Tier weiterhin woechentliche Top-Video-Snapshots macht. Aggregation laeuft ueber 180d-Window mit MAX(view_count) — robust gegen einzelne stale Werte.


Trend-Tracking ueber Zeit (Plan 56 Phase 11)

Die youtube_coverage_weekly_snapshots Tabelle persistiert die obigen Aggregate woechentlich (Sonntag 04:00 UTC) — siehe Plan 56. Die Admin-Seite /admin/youtube-coverage-trends rendert die Zeitreihen als ApexCharts.

Beispiel-Trend-Queries:

-- Wachstum der getrackten Videos ueber Zeit
SELECT snapshot_date,
videos_with_snapshots_30d,
avg_snapshots_per_video,
channels_pro_active
FROM youtube_coverage_weekly_snapshots
ORDER BY snapshot_date DESC
LIMIT 26; -- letzte 6 Monate

-- Tier-Verteilung-Trend
SELECT snapshot_date,
tier_full_count,
tier_standard_count,
tier_light_count,
tier_paused_count
FROM youtube_coverage_weekly_snapshots
ORDER BY snapshot_date;

→ Diese Daten dienen als Basis fuer kuenftige Optimierungen. Bei jeder Plan-Anpassung (Schwellenwerte, neue Tiers) im Admin-UI vorher/nachher visuell pruefen.