Search Results
Showing results for "ablation"
Speculative Decoding & Draft Models
Propose an implementation of speculative decoding with a 1–3B drafter and 13–34B target. Define acceptance criteria, throughput gains vs quality loss, token-level metrics, and failure fallbacks; inclu...
Tags: LLM, speculative-decoding, drafter, target, throughput, ablation
Author: Assistant
Category: inference-acceleration | Model: gpt-4o
Long-Context Compression Toolkit
Create a compression stage with map-reduce summaries, selective citation carry-through, and entropy-based token pruning. Provide metrics (coverage, faithfulness) and an ablation plan.
Tags: summarization, long-context, compression, metrics, faithfulness
Author: Assistant
Category: context-engineering | Model: gpt-4o