FMP: Fast Meow Prediction
Real-time vocalization forecasting for domestic cats. One forward pass, 0.3 ms ahead of the meow.
Better task performance than the strongest published baseline, by our own definition of better.
MeowBench-1M · fictional
More training-compute efficient at matched MeowScore — four orders of magnitude fewer cat-FLOPs.
Log scale, Fig. 2 · fictional
End-to-end latency from purr stream to onset probability, on one unremarkable CPU core.
Measured nowhere · fictional

Subject 01. Reference meower, calibration set. Fictional.
Cats vocalize on their own schedule; humans react too late. FMP (Fast Meow Prediction) forecasts the onset of a domestic-cat vocalization up to 1.2 seconds before it happens, reading nothing but a continuous purr-band audio stream. A frozen encoder, a two-layer temporal mixer, and a 41 k-parameter meow head reach 99.1 MeowScore on MeowBench-1M while training on 10,000× less compute than the strongest baseline. All numbers in this paper are fictional, which is also why they are so good.
Three parts, one pass.
The encoder never updates, the mixer only looks backward, and the head is small enough to read aloud. Everything runs in a single forward pass per audio frame — no lookahead, no retries.
Figure 1: FMP architecture. A frozen convolutional encoder summarizes the purr stream, a two-layer temporal mixer carries context forward in time, and a 41 k-parameter meow head emits an onset probability every 0.3 ms. The whisker-attention skip path routes fast transients around the mixer.
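The single-pass, backward-looking design in Figure 1 can be pictured as a small streaming loop. This is a minimal illustration, not the FMP implementation: the toy encoder, the exponential-moving-average mixer, the decay constants, and every name (`encode`, `StreamingMixer`, `meow_head`, `run_stream`) are assumptions, and the sizes are far smaller than the 41 k-parameter head described here.

```python
import math

def encode(frame):
    """Toy stand-in for the frozen encoder: fixed features, nothing trainable."""
    mean = sum(frame) / len(frame)
    energy = sum(x * x for x in frame) / len(frame)
    return [mean, energy]

class StreamingMixer:
    """Two causal layers; each keeps a running state built from past frames only."""
    def __init__(self, dim, decays=(0.9, 0.5)):
        self.decays = decays
        self.states = [[0.0] * dim for _ in decays]

    def step(self, features):
        x = features
        for layer, decay in enumerate(self.decays):
            state = self.states[layer]
            # Exponential moving average: uses the current and earlier inputs only.
            state = [decay * s + (1.0 - decay) * v for s, v in zip(state, x)]
            self.states[layer] = state
            x = state
        return x

def meow_head(mixed, skip, weights, bias):
    """Linear head plus sigmoid; `skip` is the path that bypasses the mixer."""
    z = bias + sum(w * v for w, v in zip(weights, mixed + skip))
    return 1.0 / (1.0 + math.exp(-z))

def run_stream(frames, weights, bias):
    """One forward pass per frame, in arrival order, with no lookahead."""
    mixer = StreamingMixer(dim=2)
    probs = []
    for frame in frames:
        feats = encode(frame)
        mixed = mixer.step(feats)
        probs.append(meow_head(mixed, feats, weights, bias))
    return probs
```

Because the mixer state is updated only from frames already seen, changing a later frame cannot alter an earlier onset probability, which is the property the "no lookahead" claim relies on.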
The head is the entire trainable budget at deployment. Freezing the encoder means a new cat is onboarded by fitting only the head — about four seconds on the hardware we made up.
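Onboarding a new cat by fitting only the head can be pictured as logistic regression on features from a frozen encoder. A hedged sketch follows: the feature vectors, labels, learning rate, and the name `fit_head` are illustrative assumptions, and plain gradient descent stands in for whatever optimizer the authors actually use.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_head(features, labels, lr=0.5, epochs=200):
    """Fit a linear head by gradient descent on log loss; encoder features stay fixed."""
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_w = [g + err * v for g, v in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical frozen-encoder outputs for one new cat (1 = meow onset follows).
features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
labels = [0, 0, 1, 1]
weights, bias = fit_head(features, labels)
predictions = [
    round(sigmoid(bias + sum(w * v for w, v in zip(weights, x)))) for x in features
]
```

Only `weights` and `bias` change during fitting; the features are treated as constants, which is what keeps per-cat onboarding cheap.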
Up and to the left.
Both protocol plots use the same splits, the same seeds, and the same imaginary hardware. Baselines were re-run under identical conditions, then politely outperformed.
Figure 2: MeowScore vs training compute (log scale). FMP exceeds the strongest baseline while training on four orders of magnitude less compute. Baselines re-run under the same protocol. All axes, models, and numbers are fictional.
Figure 3: Accuracy vs latency. Grey points are baselines; the dashed line is their Pareto front. FMP sits far up-left of the front at 99.1% top-1 and 0.3 ms. Fictional, like everything else on this page.
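The front in Figure 3 is the set of baselines that no other baseline beats on both axes at once. A small sketch of that computation follows; the baseline coordinates are placeholders invented for illustration (the page does not list them), and only the FMP point (0.3 ms, 99.1%) comes from the caption.

```python
def dominates(a, b):
    """a and b are (latency_ms, accuracy). a dominates b if it is no worse on both
    axes (lower latency, higher accuracy) and strictly better on at least one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

def pareto_front(points):
    """Return the non-dominated points, sorted by latency."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(front)

# Placeholder baselines (latency_ms, top-1 %); not taken from the paper.
baselines = [(5.0, 80.0), (10.0, 85.0), (12.0, 83.0), (20.0, 90.0), (8.0, 78.0)]
fmp = (0.3, 99.1)  # the point quoted in the Figure 3 caption

front = pareto_front(baselines)
fmp_beats_front = all(dominates(fmp, p) for p in front)
```

A point that dominates every member of the front lies up and to the left of the dashed line, which is the sense in which the caption places FMP.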
Remove one part, learn its job.
Each bar deletes exactly one component and retrains the head. Pretraining does the heavy lifting; the mixer does the timing.
Figure 4: Ablations on MeowBench-1M (val). Removing the temporal mixer costs 14.8 points; removing purr pretraining nearly halves the score. Bars, splits, and points are fictional.
Every ablation keeps the parameter budget constant by widening the remaining parts. The ordering held across all three seeds, which is easy when you invent the seeds.
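The ablation deltas reduce to subtraction from the full-model score. The sketch below uses only the two figures stated on this page (99.1 MeowScore for the full model, a 14.8-point drop without the temporal mixer) and assumes the headline 99.1 is also the reference score for the validation-split ablations. The helper name `ablated_score` is hypothetical, and the pretraining ablation is left out because the page gives it only as "nearly halves".

```python
FULL_SCORE = 99.1   # full FMP on MeowBench-1M, from the text
MIXER_DROP = 14.8   # points lost when the temporal mixer is removed

def ablated_score(full, drop):
    """Score remaining after an ablation, rounded to one decimal like the text."""
    return round(full - drop, 1)

no_mixer = ablated_score(FULL_SCORE, MIXER_DROP)
relative_drop = round(MIXER_DROP / FULL_SCORE, 3)  # fraction of the full score lost
```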
Forecasts you can look at.
Each pair shows the model's predicted spectrogram above what the cat actually said. The differences are small; the cats were unavailable for comment.
Predicted vs. actual · Food bowl · t+0.2 s · match 0.97
Predicted vs. actual · Door opens · t+0.4 s · match 0.95
Predicted vs. actual · Vacuum sighted · t+0.1 s · match 0.99
Predicted vs. actual · Midnight zoomies · t+1.2 s · match 0.91
Figure 5: Predicted vs observed spectrogram excerpts, sampled up to 1.2 s before onset. Tiles are decorative gradients standing in for real spectrograms; like the cats, fictional.
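The page does not define the "match" score shown on each tile. One plausible reading, offered purely as an assumption, is cosine similarity between the flattened predicted and observed spectrograms. The function name `cosine_match` and the tiny 2×3 arrays below are made up for illustration.

```python
import math

def cosine_match(predicted, actual):
    """Cosine similarity between two equally shaped 2-D spectrograms (lists of rows)."""
    p = [v for row in predicted for v in row]
    a = [v for row in actual for v in row]
    if len(p) != len(a):
        raise ValueError("spectrograms must have the same shape")
    dot = sum(x * y for x, y in zip(p, a))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in a))
    return dot / norm if norm else 0.0

# Illustrative magnitudes only; not data from the paper.
predicted = [[0.1, 0.8, 0.3], [0.0, 0.6, 0.2]]
actual = [[0.1, 0.7, 0.4], [0.1, 0.6, 0.2]]
match = cosine_match(predicted, actual)
```

Under this reading a score of 1.0 means the two spectrograms point in the same direction, and values near the quoted 0.91 to 0.99 indicate small differences in shape.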
Figure 6: Reviewer 2, on our latency numbers. We interpret the gesture as acceptance. Reaction fictional; reviewer hypothetical; tears unrelated to the rebuttal.
Cite the cat.
If you cite this, you are citing a fictional cat. We are at peace with that.
@article{park2026fmp,
  title   = {FMP: Fast Meow Prediction},
  author  = {Park, Kitty and Kim, Ya-ong and Lee, Calico},
  journal = {CatTower LTD Research Preprints},
  volume  = {26},
  number  = {14},
  year    = {2026},
  note    = {Fictional. Please do not deploy near real cats.}
}
Questions about FMP go to the meow head, care of CatTower LTD Research. Replies arrive 0.3 ms before you finish asking.