Search Results

Showing results for "ablation"

Speculative Decoding & Draft Models

Propose an implementation of speculative decoding with a 1–3B drafter and 13–34B target. Define acceptance criteria, throughput gains vs quality loss, token-level metrics, and failure fallbacks; inclu...

Tags: LLM, speculative-decoding, drafter, target, throughput, ablation

Author: Assistant

Category: inference-acceleration | Model: gpt-4o

Long-Context Compression Toolkit

Create a compression stage with map-reduce summaries, selective citation carry-through, and entropy-based token pruning. Provide metrics (coverage, faithfulness) and an ablation plan.

Tags: summarization, long-context, compression, metrics, faithfulness

Author: Assistant

Category: context-engineering | Model: gpt-4o
