Scoring System
After fetching content from all sources, Horizon uses an AI model to score each item on a 0-10 scale. This determines what appears in the daily summary.
Pipeline
- Batch processing: Items are scored in batches of 10 with a progress bar. Failed items receive a score of 0.
- Content preparation: For each item, the content is truncated (800 chars if comments are present, 1000 otherwise) and engagement metrics are assembled from metadata (HN score, Reddit upvote ratio, etc.).
- AI analysis: The prepared content is sent to the configured AI model (temperature 0.3) with a system prompt defining the scoring criteria.
- Response parsing: The AI response is parsed as JSON (with fallbacks for code-block-wrapped JSON). Each item gets ai_score (float), ai_reason (string), ai_summary (string), and ai_tags (list).
- Retry: Failed AI calls are retried up to 3 times with exponential backoff (2-10 seconds).
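The preparation, parsing, and retry steps above can be sketched roughly as follows. This is a minimal illustration, not Horizon's actual code: the function names, the response keys (score, reason, summary, tags), and the reading of "up to 3 times" as three attempts in total are all assumptions.

```python
import json
import re
import time

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
_FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def prepare_content(text, has_comments):
    """Truncate to 800 chars when comments are present, 1000 otherwise."""
    limit = 800 if has_comments else 1000
    return text[:limit]


def parse_ai_response(raw):
    """Parse a JSON reply, falling back to JSON wrapped in a code block."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        match = _FENCED_JSON.search(raw)
        if match is None:
            raise  # nothing recoverable; let the caller handle the failure
        data = json.loads(match.group(1))
    # The response key names below are hypothetical.
    return {
        "ai_score": float(data["score"]),
        "ai_reason": str(data.get("reason", "")),
        "ai_summary": str(data.get("summary", "")),
        "ai_tags": list(data.get("tags", [])),
    }


def call_with_retry(fn, attempts=3, min_wait=2.0, max_wait=10.0, sleep=time.sleep):
    """Call fn, waiting 2s, 4s, ... (clamped to 2-10s) between failed attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            sleep(min(max_wait, max(min_wait, 2.0 ** attempt)))
```

A caller that wants the documented behavior for failed items could catch the final exception and assign a score of 0.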
Scoring Scale
| Score | Tier | Description |
|---|---|---|
| 9-10 | Groundbreaking | Major breakthroughs, paradigm shifts, major version releases, significant research breakthroughs |
| 7-8 | High Value | Important developments, technical deep-dives, novel approaches, insightful analysis, valuable tools |
| 5-6 | Interesting | Incremental improvements, useful tutorials, moderate community interest |
| 3-4 | Low Priority | Minor updates, common knowledge, overly promotional |
| 0-2 | Noise | Spam, off-topic, trivial updates |
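Because ai_score is a float while the table lists integer ranges, a lookup has to decide where fractional scores fall. The sketch below assumes each tier starts at its lower bound (so 8.5 is still High Value); the helper name score_tier is hypothetical and not part of Horizon.

```python
# Lower bound of each tier, highest first (taken from the table above).
TIERS = [
    (9.0, "Groundbreaking"),
    (7.0, "High Value"),
    (5.0, "Interesting"),
    (3.0, "Low Priority"),
    (0.0, "Noise"),
]


def score_tier(score):
    """Return the tier name for a score on the 0-10 scale."""
    if not 0 <= score <= 10:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, name in TIERS:
        if score >= lower_bound:
            return name
    raise AssertionError("unreachable: 0.0 is the lowest bound")
```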
Scoring Factors
The AI evaluates each item based on:
- Technical depth and novelty: original ideas, new techniques, research contributions
- Potential impact: how broadly this affects software engineering, AI/ML, or systems research
- Quality of writing/presentation: clarity, structure, thoroughness
- Community discussion: insightful comments, diverse viewpoints, substantive debates
- Engagement signals: high upvotes/favorites paired with substantive discussion (not just raw numbers)
Engagement metadata is source-specific: HN provides score and comment count, Reddit provides upvote ratio and comment count.
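One way to assemble source-specific engagement text for the prompt is a per-source field map. The source names and metadata keys below (hackernews, reddit, score, comment_count, upvote_ratio) are illustrative assumptions; the real keys depend on how each fetcher stores its metadata.

```python
# Hypothetical mapping: source name -> (metadata key, label shown to the model).
ENGAGEMENT_FIELDS = {
    "hackernews": [("score", "score"), ("comment_count", "comments")],
    "reddit": [("upvote_ratio", "upvote ratio"), ("comment_count", "comments")],
}


def format_engagement(source, metadata):
    """Build a short engagement string from whichever known fields are present."""
    fields = ENGAGEMENT_FIELDS.get(source, [])
    parts = [f"{label}: {metadata[key]}" for key, label in fields if key in metadata]
    return ", ".join(parts)
```

Sources without an entry in the map simply produce an empty string, so the scoring prompt can omit the engagement line for them.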
Filtering
After scoring, items are filtered by filtering.ai_score_threshold (default: 7.0) and sorted by score descending. Only items meeting the threshold appear in the daily summary.
{
  "filtering": {
    "ai_score_threshold": 7.0,
    "time_window_hours": 24
  }
}
Items scoring 9.0 or above are featured in the "Today's Highlights" section of the summary.
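The threshold filter, descending sort, and highlight cutoff can be sketched as below. This assumes items are plain dicts carrying an ai_score key and that "meeting the threshold" means greater than or equal to it; the function names are hypothetical.

```python
def select_items(items, threshold=7.0):
    """Keep items at or above the threshold, highest score first."""
    kept = [item for item in items if item["ai_score"] >= threshold]
    return sorted(kept, key=lambda item: item["ai_score"], reverse=True)


def pick_highlights(items, cutoff=9.0):
    """Return the subset of items scoring at or above the highlight cutoff."""
    return [item for item in items if item["ai_score"] >= cutoff]


scored = [
    {"title": "a", "ai_score": 6.5},
    {"title": "b", "ai_score": 9.2},
    {"title": "c", "ai_score": 7.0},
]
summary_items = select_items(scored)            # items "b" and "c", in that order
highlighted = pick_highlights(summary_items)    # item "b" only
```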
Enrichment
Items that pass the score threshold go through a second AI pass for enrichment (src/ai/enricher.py):
- Concept extraction: AI identifies 1-3 technical concepts in the item that may need explanation.
- Web search: Each concept is searched via DuckDuckGo to gather grounding context.
- Structured analysis: The item content and search results are sent to AI, which produces:
  - whats_new: what specifically happened or changed
  - why_it_matters: significance and impact
  - key_details: notable technical details or caveats
  - background: background knowledge for readers without deep domain expertise
These fields are combined into a detailed_summary stored in the item's metadata and used in the final daily summary.
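How the four fields are merged is not specified here, so the following is only one plausible shape: labeled paragraphs joined in a fixed order, skipping any field the model left empty. The section labels and the helper name build_detailed_summary are assumptions, not the behavior of src/ai/enricher.py.

```python
# Field order follows the list above; the display labels are hypothetical.
SECTIONS = [
    ("whats_new", "What's new"),
    ("why_it_matters", "Why it matters"),
    ("key_details", "Key details"),
    ("background", "Background"),
]


def build_detailed_summary(analysis):
    """Join the non-empty analysis fields into one labeled text block."""
    parts = []
    for key, label in SECTIONS:
        value = (analysis.get(key) or "").strip()
        if value:
            parts.append(f"{label}: {value}")
    return "\n\n".join(parts)


item = {"metadata": {}}
item["metadata"]["detailed_summary"] = build_detailed_summary(
    {"whats_new": "Version 2.0 released.", "background": ""}
)
```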