Speculative Decoding & Draft Models
Propose an implementation of speculative decoding that pairs a 1–3B-parameter draft model with a 13–34B-parameter target model. Define the acceptance criteria, the trade-off between throughput gains and quality loss, token-level metrics, and fallbacks for failure cases; include cost curves and an ablation plan.
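As a starting point for the acceptance criteria and cost curves, here is a minimal sketch of the standard speculative-sampling acceptance rule (accept a drafted token with probability min(1, p/q), otherwise resample from the normalized residual max(0, p - q)), together with an idealized throughput model. All names (`verify_draft`, `expected_tokens_per_step`, `estimated_speedup`) are hypothetical, the distributions are passed in as plain lists rather than produced by real models, and the speedup formula assumes a constant, independent per-token acceptance rate, which real workloads only approximate.

```python
import random


def verify_draft(draft_tokens, q_probs, p_probs, rng=random):
    """Check drafted tokens against the target model's distributions.

    draft_tokens: token ids sampled from the drafter (so q_probs[i][tok] > 0).
    q_probs[i]:   drafter distribution at draft position i.
    p_probs[i]:   target distribution at position i; must contain one extra
                  entry at the end, used for the bonus token when every
                  drafted token is accepted.
    Returns (emitted_tokens, number_of_accepted_draft_tokens).
    """
    emitted = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_probs[i], q_probs[i]
        # Accept with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            emitted.append(tok)
            continue
        # Rejected: resample from the residual max(0, p - q), renormalized
        # implicitly by random.choices. Fall back to p if the residual is
        # numerically empty.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        weights = residual if sum(residual) > 0 else p
        emitted.append(rng.choices(range(len(p)), weights=weights)[0])
        return emitted, i
    # Every drafted token was accepted: sample one bonus token from the target.
    last = p_probs[len(draft_tokens)]
    emitted.append(rng.choices(range(len(last)), weights=last)[0])
    return emitted, len(draft_tokens)


def expected_tokens_per_step(alpha, gamma):
    """Expected tokens emitted per target forward pass.

    alpha: per-token acceptance rate (assumed constant and independent).
    gamma: number of tokens drafted per step.
    """
    if alpha >= 1.0:
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def estimated_speedup(alpha, gamma, cost_ratio):
    """Idealized wall-clock speedup over plain autoregressive decoding.

    cost_ratio: drafter forward-pass time divided by target forward-pass time.
    Each step costs gamma drafter passes plus one target pass.
    """
    return expected_tokens_per_step(alpha, gamma) / (gamma * cost_ratio + 1.0)
```

Sweeping `alpha`, `gamma`, and `cost_ratio` through `estimated_speedup` gives a first-pass cost curve to compare against measured throughput; a value at or below 1.0 indicates a regime where falling back to plain target decoding would be the safer choice.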