# Clavis Rating Methodology Changelog
All notable changes to the Clavis Rating methodology are documented here. Versions follow semantic-style conventions:
- **Major** (vN.0.0): pillar added or removed, composite formula reshaped
- **Minor** (v0.N.0): pillar calculation changes meaningfully
- **Patch** (v0.0.N): parameter tuning within an existing calculation
Every change references the spec doc (`vN.N.N.md`) and YAML (`vN.N.N.yaml`) that define it.
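
As a rough illustration of the convention above, the sketch below classifies the step between two version tags and derives the spec and YAML file names. The helper names (`parse_version`, `bump_kind`, `artifact_names`) are hypothetical and are not part of the Clavis codebase.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


_VERSION_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Version:
    """Parse a tag such as 'v0.1.0' into its three numeric parts."""
    match = _VERSION_RE.match(tag)
    if match is None:
        raise ValueError(f"not a methodology version tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def bump_kind(old: str, new: str) -> str:
    """Classify the step from old to new as 'major', 'minor', or 'patch'."""
    a, b = parse_version(old), parse_version(new)
    if b <= a:  # tuples compare field by field, major first
        raise ValueError(f"{new} does not come after {old}")
    if b.major != a.major:
        return "major"
    if b.minor != a.minor:
        return "minor"
    return "patch"


def artifact_names(tag: str) -> tuple[str, str]:
    """Return the spec doc and YAML names implied by the vN.N.N convention."""
    parse_version(tag)  # validation only; the tag itself is the file stem
    return f"{tag}.md", f"{tag}.yaml"
```

Under this sketch, `bump_kind("v0.0.0", "v0.1.0")` is a minor step, which matches the reshaped pillar calculation described in the v0.1.0 entry.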
---
## v0.1.0 — 2026-06-01 (ready for shadow-score evaluation)
**Reframe from deduction model to three-pillar composite. Implemented end-to-end.**
### Implementation
- `clearview/scoring/v01.py` — three-pillar Safety/Trajectory/Sentiment
computation with acuity, density, and vintage corrections.
- `clearview/scoring/sentiment.py` — Sentiment LLM extraction interface
with recency-weighted aggregation and DB persistence.
- `clearview/scoring/shadow.py` — shadow-score diff tool. Compares any
two methodology versions across the full facility corpus and emits
a markdown distribution/migration report.
- `clearview/scoring/tinker.py` — CLI for interactive parameter
experimentation against real or synthetic facilities.
- `backend/routes/methodology.py` — `/api/methodology` endpoints
exposing the active methodology, spec doc, and changelog.
- `frontend/src/app/methodology/page.tsx` — public methodology page.
- `clearview/scripts/calibrate_acuity_benchmarks.py` — calibration
tool for the acuity benchmarks; placeholders shipped pending
real-corpus calibration.
### Schema (migration d9e0f1a2b3c4)
- `facilities.year_built` (Integer, nullable) — vintage-control input.
- `facility_scores` extended with per-pillar columns plus adjustment
flags (`safety_acuity_adjusted`, `safety_density_normalized`,
`safety_vintage_adjusted`), `insufficient_data` gate flag,
`care_type_used`, and Trajectory/Sentiment sub-component scores.
- New `community_sentiment_scores` table — per-dimension Sentiment
scores keyed on facility + dimension + methodology version.
### Launch-posture decisions documented in spec
- **Publication gate relaxed:** `require_at_least_one_non_safety_pillar`
set to `false` for v0.1.0 launch because Sentiment data does not yet
exist for the corpus. Safety alone is publishable. Flip back to
`true` in v0.1.1 once Sentiment is populated for the majority of
facilities.
- **Acuity benchmarks shipped as placeholders** (IL 0.04, AL 0.10,
MC 0.18, SNF 0.31) pending real-corpus calibration via the
`calibrate_acuity_benchmarks` script. Recalibration will be a
v0.1.1 patch.
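
The publication gate described above could be read roughly as follows. This is a minimal sketch, not the shipped logic: the function name `is_publishable` is hypothetical, and treating a missing Safety score as never publishable is an assumption. Only the flag name `require_at_least_one_non_safety_pillar` comes from the spec.

```python
from typing import Optional


def is_publishable(
    safety: Optional[float],
    trajectory: Optional[float],
    sentiment: Optional[float],
    require_at_least_one_non_safety_pillar: bool = False,  # v0.1.0 launch value
) -> bool:
    """Decide whether a facility's rating may be published."""
    # Assumption: without a Safety score nothing is published.
    if safety is None:
        return False
    # Relaxed gate (v0.1.0 launch): Safety alone is enough.
    if not require_at_least_one_non_safety_pillar:
        return True
    # Strict gate: at least one of Trajectory or Sentiment must be present.
    return trajectory is not None or sentiment is not None
```

With the launch setting, a facility that has only a Safety score passes. Flipping the flag to `true` would hold that same facility back until Trajectory or Sentiment data exists.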
### Added
- Three-pillar composite: Safety, Trajectory, Sentiment.
- Care-type-specific composite weights (IL, AL, MC, SNF).
- Care-type-specific deficiency benchmarks calibrated from CA 2023-2025 corpus.
- Inspection-density correction to normalize for state regulator inspection cadence differences.
- Vintage control on physical-plant deficiencies for buildings older than 10 years.
- Time-to-correct sub-component (median days from `date_found` to `date_corrected`).
- Repeat-violation-rate sub-component (same regulation cited within the prior 18 months).
- Volume-gated dimensional Sentiment scoring (LLM extraction from Google reviews on 8 dimensions).
- Care-type-specific dimension weights inside Sentiment.
- Confidence bands and publication rules for sparse-data communities.
- Weight redistribution rules when Trajectory or Sentiment data is insufficient.
- `methodology_version` stamp on every score row.
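
One plausible reading of the weight redistribution rule is proportional renormalization: drop the pillars that lack data and rescale the remaining weights so they sum to 1. The sketch below assumes that reading. The weights in the usage note are made-up illustrations, not the care-type weights from the spec, and the function names are hypothetical.

```python
from __future__ import annotations

from typing import Optional


def redistribute_weights(weights: dict[str, float], available: set[str]) -> dict[str, float]:
    """Rescale the weights of available pillars so they sum to 1."""
    kept = {pillar: w for pillar, w in weights.items() if pillar in available}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no weighted pillar has data")
    return {pillar: w / total for pillar, w in kept.items()}


def composite(scores: dict[str, Optional[float]], weights: dict[str, float]) -> float:
    """Weighted composite over the pillars that actually have a score."""
    available = {pillar for pillar, score in scores.items() if score is not None}
    effective = redistribute_weights(weights, available)
    return sum(effective[pillar] * scores[pillar] for pillar in effective)
```

For example, with hypothetical weights `{"safety": 0.5, "trajectory": 0.25, "sentiment": 0.25}` and no Sentiment score, Safety would carry 2/3 of the composite and Trajectory 1/3.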
### Changed
- Severity, Frequency, Recency, Complaints, Inspections (the v0.0.0 components) are now sub-components inside the Safety pillar rather than direct contributors to the overall score.
- Frequency benchmark is now care-type-specific instead of universal 0.10/bed.
- Trajectory is no longer change-in-severity, which double-counted Safety. It now combines time-to-correct and repeat-violation rate, both independent of absolute deficiency count.
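
To make the two Trajectory inputs concrete, here is a minimal sketch of how the raw sub-components might be computed from deficiency records. The `Deficiency` record and function names are hypothetical. Only `date_found`, `date_corrected`, and the 18-month window come from this changelog, and how the raw values map to a pillar score is not shown.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deficiency:
    regulation: str                       # hypothetical regulation identifier
    date_found: date
    date_corrected: Optional[date] = None


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before d, clamping the day."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def median_days_to_correct(deficiencies: list) -> Optional[float]:
    """Median days from date_found to date_corrected; uncorrected rows are skipped."""
    durations = [
        (d.date_corrected - d.date_found).days
        for d in deficiencies
        if d.date_corrected is not None
    ]
    return median(durations) if durations else None


def repeat_violation_rate(deficiencies: list, window_months: int = 18) -> Optional[float]:
    """Share of citations whose regulation was also cited within the prior window."""
    ordered = sorted(deficiencies, key=lambda d: d.date_found)
    if not ordered:
        return None
    repeats = 0
    last_seen = {}  # regulation -> most recent earlier date_found
    for d in ordered:
        previous = last_seen.get(d.regulation)
        # Assumption: a same-day citation of the same regulation also counts.
        if previous is not None and previous >= months_before(d.date_found, window_months):
            repeats += 1
        last_seen[d.regulation] = d.date_found
    return repeats / len(ordered)
```

Neither function looks at how many deficiencies a facility has in absolute terms beyond using the count as a denominator, which is the property that keeps Trajectory from double-counting Safety.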
### Removed
- Universal deficiency-rate benchmark (replaced by care-type benchmarks).
- Implicit assumption that all deficiencies are equally weighted regardless of building age (vintage control now applied to physical-plant subset).
### Migration notes
- All existing `facility_scores` rows remain valid and are backfilled with `methodology_version = "v0.0.0"`.
- v0.0.0 scores remain queryable for historical comparison.
- Shadow-score evaluation window: 14 days of dual-computation before v0.1.0 becomes the published source of truth.
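
During the dual-computation window, each facility has a score under both versions. The sketch below shows one way such a comparison could be summarized as markdown. It is an illustration only and does not reproduce `clearview/scoring/shadow.py`; the function name, the 5-point threshold, and the report layout are all assumptions.

```python
from statistics import mean


def shadow_diff(old, new, old_label="v0.0.0", new_label="v0.1.0", threshold=5.0):
    """Summarize score movement between two versions as a markdown report.

    `old` and `new` map facility id -> overall score under each version.
    """
    shared = sorted(old.keys() & new.keys())  # only facilities scored by both
    if not shared:
        return "No facilities were scored under both versions."
    deltas = {fid: new[fid] - old[fid] for fid in shared}
    moved = [fid for fid in shared if abs(deltas[fid]) >= threshold]
    lines = [
        f"## Shadow diff: {old_label} vs {new_label}",
        "",
        f"- Facilities compared: {len(shared)}",
        f"- Mean delta: {mean(list(deltas.values())):+.2f}",
        f"- Moved by at least {threshold:g} points: {len(moved)}",
        "",
        "| Facility | Old | New | Delta |",
        "| --- | --- | --- | --- |",
    ]
    # List only the facilities whose score moved by at least the threshold.
    for fid in moved:
        lines.append(f"| {fid} | {old[fid]:.1f} | {new[fid]:.1f} | {deltas[fid]:+.1f} |")
    return "\n".join(lines)
```

Facilities present in only one of the two inputs are ignored here; a fuller report would probably list them separately.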
### Open questions deferred to v0.2.0+
- Stratified Safety scoring (per care type within mixed buildings).
- Renovation events in vintage control.
- Operator-level Clavis Rating as a parallel product.
- CMS Care Compare integration → SNF Outcomes pillar.
- Sentiment confidence intervals on volume.
- EHR-driven Outcomes pillar (long horizon).
- Pricing transparency / Access pillar (when rate-card coverage broadens).
---
## v0.0.0 — production through 2026-05-25
**The deduction model. Current production.**
Five components summed with fixed weights:
```
Overall = 0.30 × Severity
+ 0.20 × Frequency
+ 0.20 × Recency
+ 0.15 × Complaints
+ 0.15 × Inspections
```
Universal deficiency-rate benchmark (0.10/bed). No acuity adjustment. No inspection-density correction. No vintage control. No trajectory signal. No consumer sentiment input. Same calculation across IL/AL/MC/SNF.
Implementation lives in `clearview/scoring/engine.py` and `clearview/scoring/config.py`. Preserved unchanged for historical comparison. New score rows after v0.1.0 launch use the new methodology; old rows retain the v0.0.0 stamp.
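
For reference, the v0.0.0 weighted sum above can be written as a few lines of Python. The weights are the ones listed in the formula; the function name, the dictionary keys, and the assumption that all five components share a common scale are illustrative and do not mirror `clearview/scoring/engine.py`.

```python
# Fixed v0.0.0 weights, taken from the formula in this entry.
V000_WEIGHTS = {
    "severity": 0.30,
    "frequency": 0.20,
    "recency": 0.20,
    "complaints": 0.15,
    "inspections": 0.15,
}


def overall_v000(components: dict) -> float:
    """Weighted sum of the five v0.0.0 components (assumed to share one scale)."""
    missing = V000_WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weight * components[name] for name, weight in V000_WEIGHTS.items())
```

Because the weights sum to 1, a facility that scores the same value on every component gets that value as its overall score, regardless of care type.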