Score vs bytes
Track B runs only. Green points are timeline-backed promotions, red points are explicit rejections, and blue points are runs that were measured but that the current summary logic neither promotes nor explicitly rejects.
Lower scores are better. This plot is generated directly from scorer-backed artifacts in the repository.
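The three-way coloring described above can be sketched as a small classifier. This is a minimal illustration, not the repository's summary logic: the function name, the set-based inputs, and the rule that a promotion takes precedence over a rejection are all assumptions.

```python
from enum import Enum


class PointColor(Enum):
    GREEN = "promoted"   # timeline-backed promotion
    RED = "rejected"     # explicit rejection
    BLUE = "measured"    # measured, neither promoted nor rejected


def classify_run(run_id, promoted_ids, rejected_ids):
    """Return the plot color for one measured run.

    Assumption: promotion is checked first, so a run that appears in
    both sets would be drawn green.
    """
    if run_id in promoted_ids:
        return PointColor.GREEN
    if run_id in rejected_ids:
        return PointColor.RED
    return PointColor.BLUE


# Example inputs taken from run names that appear in this report.
promoted = {"robust_current-432x324-cpu-2026-04-04"}
rejected = {"robust_current-dynamic-main-roi-cpu-2026-04-05"}

print(classify_run("robust_current-432x324-cpu-2026-04-04", promoted, rejected))
print(classify_run("robust_current-464x348-cpu-2026-04-04", promoted, rejected))
```

Any run that is in neither set falls through to blue, which matches the "measured only" reading of the legend.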
Promotion ladder
- 3.62 — 512x384, lanczos/bicubic, crf 23, keyint 32, bframes 4, ref 4
- 3.56 — 448x336, lanczos/bicubic, crf 23, keyint 32, bframes 4, ref 4
- 3.56 — 448x336, lanczos/bicubic, crf 23, keyint 48, bframes 4, ref 4
- 3.54 — 448x336, lanczos/lanczos, crf 23, keyint 48, bframes 4, ref 4
- 3.33 — 432x324, lanczos/lanczos, crf 23, keyint 48, bframes 4, ref 4
- 3.25 — 424x318, lanczos/lanczos, crf 23, keyint 48, bframes 4, ref 4
Useful negative results
- 5.73 — robust_current-roi-two-pass-cpu-2026-04-04
- 4.47 — robust_current-dynamic-main-roi-cpu-2026-04-05
- 3.44 — robust_current-464x348-cpu-2026-04-04
- 3.44 — robust_current-cand-416x312-crf23-g48-b4-r4-cpu-2026-04-05
- 3.43 — robust_current-cand-432x324-crf23-g48-b3-r4-cpu-2026-04-05
- 3.38 — robust_current-cand-432x324-crf23-g64-b4-r4-cpu-2026-04-05
- 3.38 — robust_current-cand-426x320-crf23-g48-b4-r4-cpu-2026-04-06
- 3.32 — robust_current-cand-428x320-crf23-g48-b4-r4-cpu-2026-04-05
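The candidate run names above appear to encode their encoder settings. The sketch below parses them under the assumption, inferred from the timeline entries rather than from any documented schema, that `g` is keyint, `b` is bframes, and `r` is ref. The pattern and function name are hypothetical.

```python
import re

# Assumed layout: robust_current-cand-<W>x<H>-crf<N>-g<N>-b<N>-r<N>-<device>-<date>
CAND_PATTERN = re.compile(
    r"^robust_current-cand-(?P<width>\d+)x(?P<height>\d+)"
    r"-crf(?P<crf>\d+)-g(?P<keyint>\d+)-b(?P<bframes>\d+)-r(?P<ref>\d+)"
    r"-(?P<device>[a-z]+)-(?P<date>\d{4}-\d{2}-\d{2})$"
)


def parse_candidate(run_id):
    """Return a dict of settings for a 'cand' run name, or None if it does not match."""
    match = CAND_PATTERN.match(run_id)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("width", "height", "crf", "keyint", "bframes", "ref"):
        fields[key] = int(fields[key])
    return fields


info = parse_candidate("robust_current-cand-432x324-crf23-g64-b4-r4-cpu-2026-04-05")
print(info)
```

Names that do not follow the `cand` layout, such as `robust_current-464x348-cpu-2026-04-04`, return `None` instead of being guessed at.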
What to open first
If you only have a minute, open the judges' one-pager first. If you need to validate a claim, jump directly to the evidence index or promotion accounting.
Current default reading order:
- Headline metrics
- Score vs bytes
- Promotion ladder
- Useful negative results
- Promotion accounting
Promotion accounting
Rule-faithful values below are local estimates computed from scorer distortions plus honest byte counts. They are not officially published scores.
| Run | Scale | Filters | Current workflow score | Current bytes | Rule-faithful score estimate | Rule-faithful bytes |
|---|---|---|---|---|---|---|
| robust_current-medium23-cpu-2026-04-03 | 512x384 | lanczos/bicubic | 3.62 | 2,819,374 | 3.618 | 2,822,418 |
| robust_current-448x336-medium23-cpu-2026-04-03 | 448x336 | lanczos/bicubic | 3.56 | 1,978,141 | 3.563 | 1,981,185 |
| robust_current-keyint48-cpu-2026-04-04 | 448x336 | lanczos/bicubic | 3.56 | 1,901,606 | 3.562 | 1,904,650 |
| robust_current-lanczos-lanczos-cpu-2026-04-04 | 448x336 | lanczos/lanczos | 3.54 | 1,901,606 | 3.546 | 1,904,650 |
| robust_current-432x324-cpu-2026-04-04 | 432x324 | lanczos/lanczos | 3.33 | 1,781,129 | 3.330 | 1,787,266 |
| robust_current-cand-424x318-crf23-g48-b4-r4-cpu-2026-04-05 | 424x318 | lanczos/lanczos | 3.25 | 1,669,984 | 3.275 | 1,704,163 |
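To see how far the two byte columns diverge, the differences can be computed directly from the table. This is a minimal sketch: the short row labels and the function name are illustrative, and the byte counts are copied from the table above.

```python
# (label, current bytes, rule-faithful bytes), copied from the table above.
ROWS = [
    ("512x384 lanczos/bicubic", 2_819_374, 2_822_418),
    ("448x336 lanczos/bicubic", 1_978_141, 1_981_185),
    ("448x336 keyint48", 1_901_606, 1_904_650),
    ("448x336 lanczos/lanczos", 1_901_606, 1_904_650),
    ("432x324 lanczos/lanczos", 1_781_129, 1_787_266),
    ("424x318 lanczos/lanczos", 1_669_984, 1_704_163),
]


def byte_deltas(rows):
    """Return (label, rule_faithful_bytes - current_bytes) for each row."""
    return [(label, honest - current) for label, current, honest in rows]


for label, delta in byte_deltas(ROWS):
    print(f"{label}: +{delta:,} bytes")
```

The first four rows differ by 3,044 bytes each, the 432x324 row by 6,137 bytes, and the 424x318 row by 34,179 bytes.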
Working theses
- Bitrate placement and resolution remained the strongest honest levers in this scorer region.
- BAT00 became useful as a research-only ranking lane, not as a score authority.
- ROI-style multi-stream ideas were informative negative results but too expensive in the tested forms.
- The 424x318 / keyint48 floor (3.25) appears locally stable: nearby follow-ups scored 3.27 (bframes 3), 3.26 (keyint 64), 3.44 (416x312), 3.32 (428x320), and 3.38 (426x320), and all were rejected.
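The local-stability claim in the last thesis can be checked mechanically. The sketch below assumes that "locally stable" means every tested neighbor scored strictly worse (higher) than the floor. The helper names are hypothetical, and the scores are the ones reported in the timeline.

```python
FLOOR_SCORE = 3.25  # 424x318 / crf23 / g48 / b4 / r4

# Follow-up scores reported in the timeline for neighbors of the floor.
NEIGHBORS = {
    "424x318 bframes3": 3.27,
    "424x318 keyint64": 3.26,
    "416x312": 3.44,
    "428x320": 3.32,
    "426x320": 3.38,
}


def is_local_floor(floor, neighbors):
    """True if every neighbor scored strictly higher (worse) than the floor."""
    return all(score > floor for score in neighbors.values())


def smallest_margin(floor, neighbors):
    """Return the closest neighbor and its score gap to the floor, rounded to 2 places."""
    label, score = min(neighbors.items(), key=lambda item: item[1])
    return label, round(score - floor, 2)


print(is_local_floor(FLOOR_SCORE, NEIGHBORS))
print(smallest_margin(FLOOR_SCORE, NEIGHBORS))
```

The closest neighbor is the keyint64 variant, only 0.01 above the floor, so the stability holds for the tested points but with a thin margin.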
Key turning points
- First honest floor — robust_current-baseline-cpu-2026-04-03 (4.06)
- First big win — robust_current-medium23-cpu-2026-04-03 (3.62)
- ROI failure — robust_current-dynamic-main-roi-cpu-2026-04-05 (4.47)
- Current best floor — robust_current-cand-424x318-crf23-g48-b4-r4-cpu-2026-04-05 (3.25)
Timeline
| UTC | Type | Summary |
|---|---|---|
| 2026-04-04T17:46:16.346354+00:00 | research | Tiny fixed-ROI two-pass prototype cut bytes but regressed badly on the official scorer, so it stays rejected. |
| 2026-04-04T19:11:30.675723+00:00 | promotion | Promoted robust_current 432x324 / medium / 23 / keyint48 / lanczos+lanczos after the tiny local resolution revisit. |
| 2026-04-05T05:53:25.231667+00:00 | research | Measured dynamic main-ROI prototype rejected: main ROI was preserved explicitly, but score regressed to 4.47 and bytes rose to about 2.66 MB. |
| 2026-04-05T06:35:49.299885+00:00 | verification | Fresh local CPU regression confirmed the restored promoted floor still scores 3.33. |
| 2026-04-05T06:35:49.299885+00:00 | research | BAT00 surrogate v2 remained noisy on the full mixed set, but the codec-only subset improved enough to use as a research-only ranking aid. |
| 2026-04-05T07:42:28.847777+00:00 | research | BAT00 codec-only surrogate ranked a small codec shortlist, then the top two local CPU candidates were tested sequentially. |
| 2026-04-05T07:42:28.847777+00:00 | decision | Local CPU test of surrogate-ranked #1 candidate (432x324 / crf23 / g48 / b3 / r4) scored 3.43 and was rejected. |
| 2026-04-05T07:42:28.847777+00:00 | decision | Local CPU test of surrogate-ranked #2 candidate (432x324 / crf23 / g64 / b4 / r4) scored 3.38 and was rejected. |
| 2026-04-05T08:13:15.198319+00:00 | promotion | Local CPU test of surrogate-ranked #3 candidate (424x318 / crf23 / g48 / b4 / r4) scored 3.25 and became the new promoted floor. |
| 2026-04-05T08:48:02.864256+00:00 | decision | Nearby follow-up on the new 424x318 floor (bframes3) scored 3.27 and was rejected. |
| 2026-04-05T09:47:41.998084+00:00 | decision | Nearby follow-up on the 424x318 floor (keyint64) scored 3.26 and was rejected. |
| 2026-04-05T14:29:14.555005+00:00 | decision | Lower-resolution BAT00-ranked follow-up at 416x312 scored 3.44 and was rejected. |
| 2026-04-05T15:10:07.425100+00:00 | decision | BAT00-ranked nearby-scale follow-up at 428x320 scored 3.32 and was rejected. |
| 2026-04-05T16:07:20.968299+00:00 | decision | BAT00-ranked nearby-scale follow-up at 426x320 scored 3.38 and was rejected. |
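For readers who want to audit the timeline programmatically, the rows can be parsed with the standard library. This is a sketch under assumptions: the three sample rows use abbreviated summaries, the function name is hypothetical, and the parser assumes no summary contains a `|` character.

```python
from collections import Counter
from datetime import datetime

# Three sample rows in the same layout as the table above (summaries abbreviated).
SAMPLE = """\
| UTC | Type | Summary |
|---|---|---|
| 2026-04-05T08:13:15.198319+00:00 | promotion | 424x318 scored 3.25 and became the new promoted floor. |
| 2026-04-05T08:48:02.864256+00:00 | decision | bframes3 follow-up scored 3.27 and was rejected. |
| 2026-04-05T09:47:41.998084+00:00 | decision | keyint64 follow-up scored 3.26 and was rejected. |
"""


def parse_timeline(text):
    """Return (timestamp, type, summary) tuples, skipping header and separator rows."""
    events = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 3:
            continue
        utc, kind, summary = cells
        try:
            stamp = datetime.fromisoformat(utc)
        except ValueError:
            continue  # header or separator row, not a timestamp
        events.append((stamp, kind, summary))
    return events


events = parse_timeline(SAMPLE)
counts = Counter(kind for _, kind, _ in events)
print(counts)
```

Because the timestamps carry an explicit `+00:00` offset, the parsed values are timezone-aware and can be sorted or compared directly.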