whiskr-vision-eval
Vision-grading evaluation suite for the Whiskr Inc retrieval stack. 6 active runs, 312 total.
Active runs
6
across 4 nodes
Best eval/accuracy
0.861
run_91aa0 · step 50,000
GPU hours (30d)
1,284.5
+12.4% vs prior 30d
Queue depth
3
est. wait 00:18:40
Metrics (smoothing 0.6 · last 42,000 steps)
eval/accuracy
0.847
Chart: eval/accuracy vs. step (x-axis 0 to 42k, y-axis 0.45 to 0.90). Series: run_8f3a2, run_7c91d, run_3d44a.
step 29,400
run_8f3a2 0.847 runs/run_8f3a2/eval.log:1976
run_7c91d 0.821 runs/run_7c91d/eval.log:1911
run_3d44a 0.792 runs/run_3d44a/eval.log:1794
train/loss
0.213
Chart: train/loss vs. step (x-axis 0 to 42k, y-axis 0.0 to 2.4). Series: run_8f3a2, run_7c91d, run_55c19 (failed).
throughput
12,408 tok/s
Chart: cluster aggregate throughput over the last 60 minutes (y-axis 4k to 16k tok/s).
gpu/memory
38.2 GB
Chart: gpu/memory over the last 60 minutes (y-axis 0 to 48 GB). Series: node-a3, node-b1.
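The metrics panels are drawn with smoothing 0.6, but the page does not say how that value is applied. A common convention in experiment dashboards is an exponential moving average in which the smoothing value is the weight on the previous smoothed point. The sketch below assumes that convention; `ema_smooth` is a hypothetical helper and the sample values are invented for illustration, not taken from any run.

```python
def ema_smooth(values, weight=0.6):
    """Exponential moving average: s_t = weight * s_{t-1} + (1 - weight) * x_t.

    Assumption: this is how the dashboard interprets "smoothing 0.6".
    The first point is passed through unchanged.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    last = None
    for x in values:
        last = x if last is None else weight * last + (1.0 - weight) * x
        smoothed.append(last)
    return smoothed


# Invented raw readings, only to show the effect of the weight.
raw = [0.80, 0.90, 0.70]
print(ema_smooth(raw, weight=0.6))
```

With a weight of 0.6 each plotted point leans more on history than on the newest reading, so a smoothed value shown on a panel can lag the raw value written to the eval log.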
Runs (showing 8 of 312)
| Run | Status | Owner | Steps | eval/acc | train/loss | tok/s | GPU h | Tags | Updated |
|---|---|---|---|---|---|---|---|---|---|
| run_8f3a2 | running | Kitty Park | 42,000 | 0.847 | 0.213 | 12,408 | 181.4 | bf16 lr3e-4 | 12:48:07 |
| run_7c91d | running | Ya-ong Kim | 41,200 | 0.821 | 0.246 | 11,952 | 176.9 | bf16 ema | 12:47:31 |
| run_3d44a | running | Calico Lee | 38,500 | 0.792 | 0.281 | 10,114 | 158.2 | fp32 | 12:45:02 |
| run_b2e07 | running | Kitty Park | 12,800 | 0.703 | 0.512 | 12,011 | 54.7 | bf16 aug-v2 | 12:44:48 |
| run_55c19 | failed | Ya-ong Kim | 9,100 | 0.611 | 0.844 | — | 31.0 | oom | 11:58:19 |
| run_e0d63 | queued | Calico Lee | 0 | — | — | — | 0.0 | sweep-7 | 11:31:55 |
| run_91aa0 | finished | Kitty Park | 50,000 | 0.861 | 0.198 | 12,377 | 214.6 | bf16 best | 09:12:40 |
| run_4c7f8 | finished | Ya-ong Kim | 50,000 | 0.853 | 0.204 | 11,840 | 209.1 | bf16 | 08:47:13 |
Selected run: run_8f3a2
Configuration
config.yaml
model: cattower-7b-v3
dataset: whiskr-corpus-v3
optimizer: adamw
learning_rate: 3.0e-4
batch_size: 256
precision: bf16
seed: 42
nodes: 4 × node-a
Event log
tail -n 8
12:48:07 ckpt saved checkpoint step_42000
12:47:31 eval eval/accuracy 0.847 (+0.003)
12:44:48 eval run_b2e07 eval pass started
12:30:12 ckpt saved checkpoint step_41000
12:12:55 info lr decayed to 1.8e-4
11:58:19 fail run_55c19 CUDA OOM on node-b2
11:31:55 queue run_e0d63 enqueued (sweep-7)
11:02:40 ckpt saved checkpoint step_40000
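The event log lines above appear to share a fixed `time kind message` layout. The sketch below parses one such line, assuming that layout holds for every entry; `Event` and `parse_event` are hypothetical names used for illustration and are not part of any Whiskr tooling.

```python
import re
from typing import NamedTuple


class Event(NamedTuple):
    time: str     # HH:MM:SS as printed in the log
    kind: str     # e.g. ckpt, eval, info, fail, queue
    message: str  # remainder of the line, unparsed


# Assumed layout: timestamp, one-word kind, free-text message.
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\w+)\s+(.*)$")


def parse_event(line: str) -> Event:
    """Split one event-log line into its three fields."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized event line: {line!r}")
    return Event(*match.groups())


event = parse_event("11:58:19 fail run_55c19 CUDA OOM on node-b2")
print(event.kind, "|", event.message)
```

Keeping the message as free text avoids guessing at a structure the log does not guarantee; filtering by `kind` (for example, `fail`) is usually enough for triage.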
Notes
updated 12:40
objective: acc ≥ 0.860 @ 50k
baseline: run_91aa0 (0.861)
reviewer: Ya-ong Kim
next sweep: lr ∈ {1e-4, 3e-4}
owner team: Pawment Co · eval
budget: 240 GPU h remaining
data card: whiskr-corpus-v3.md
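The notes pair an accuracy objective with a remaining GPU budget. The sketch below checks both against the two finished 50,000-step runs in the runs table. It assumes a future 50k-step run would cost about the mean of those two runs, which the dashboard does not state; the function names are hypothetical.

```python
import math

OBJECTIVE_ACC = 0.860   # from Notes: acc >= 0.860 @ 50k
BUDGET_GPU_H = 240.0    # from Notes: 240 GPU h remaining

# Finished 50,000-step runs, copied from the runs table.
finished = {
    "run_91aa0": {"acc": 0.861, "gpu_h": 214.6},
    "run_4c7f8": {"acc": 0.853, "gpu_h": 209.1},
}


def runs_meeting_objective(runs, threshold=OBJECTIVE_ACC):
    """Names of runs whose eval accuracy reaches the threshold."""
    return [name for name, r in runs.items() if r["acc"] >= threshold]


def affordable_runs(budget_gpu_h, cost_per_run_gpu_h):
    """Whole number of runs that fit in the budget at a fixed per-run cost."""
    if cost_per_run_gpu_h <= 0:
        raise ValueError("cost per run must be positive")
    return math.floor(budget_gpu_h / cost_per_run_gpu_h)


# Assumption: a new 50k-step run costs the mean of the finished ones.
mean_cost = sum(r["gpu_h"] for r in finished.values()) / len(finished)

print(runs_meeting_objective(finished))          # ['run_91aa0']
print(affordable_runs(BUDGET_GPU_H, mean_cost))  # 1
```

Under that cost assumption, the remaining budget covers one full 50k-step run, which is fewer than the two learning rates listed for the next sweep.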