Prompt Cards

Toolformer-Style Tool Use

Design a tool-use curriculum: function signatures, schema discovery, tool reliability scoring, and retry/backoff policy. Include sandboxing and cost guards.

Tags: LLM, tools, function-calling, reliability, sandbox, cost
Author: Assistant
Created at: 2025-12-18 00:00:00
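
As a rough illustration of the retry/backoff policy and cost guard this card asks for, here is a minimal sketch. The names `backoff_delays` and `call_with_retry`, the flat per-call cost model, and the choice to retry on any exception are all assumptions made for the example; the card does not specify them.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(base: float, factor: float, cap: float, retries: int) -> List[float]:
    """Delay before each retry: base * factor**i, clipped at cap."""
    return [min(cap, base * factor ** i) for i in range(retries)]


def call_with_retry(
    tool: Callable[[], T],
    delays: List[float],
    cost_per_call: float,
    budget: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `tool`, retrying on failure while total spend stays within `budget`."""
    spent = 0.0
    last_error: Optional[Exception] = None
    # The first attempt has no delay; each later attempt waits for its backoff delay.
    for delay in [0.0] + delays:
        if spent + cost_per_call > budget:
            raise RuntimeError("cost guard tripped") from last_error
        if delay:
            sleep(delay)
        spent += cost_per_call
        try:
            return tool()
        except Exception as err:  # assumption: a real policy would retry only transient errors
            last_error = err
    raise RuntimeError("retries exhausted") from last_error
```

Passing `sleep` as a parameter keeps the policy testable without real waiting.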

Hallucination Detection & Abstain

Create a hallucination detector using entailment+attribution signals. Define abstention thresholds, user messaging, and a re-query strategy with targeted retrieval.

Tags: LLM, hallucination, entailment, abstention, UX, grounding
Author: Assistant
Created at: 2025-12-18 00:00:00

Retrieval Eval Harness

Build an eval harness: recall@k, calibrated precision, answer faithfulness, and human-time-to-verify. Include topic-aware test buckets and data drift alarms.

Tags: LLM, retrieval, eval, faithfulness, drift, metrics
Author: Assistant
Created at: 2025-12-18 00:00:00
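
One way to read the recall@k and topic-bucket requirements is sketched below. The row layout `(bucket, retrieved, relevant)` and the function names are assumptions made for illustration; recall@k is computed as the fraction of relevant documents that appear in the top k results.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant document ids found in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_by_bucket(
    rows: Iterable[Tuple[str, Sequence[str], Set[str]]], k: int
) -> Dict[str, float]:
    """Average recall@k per topic bucket; each row is (bucket, retrieved, relevant)."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for bucket, retrieved, relevant in rows:
        scores[bucket].append(recall_at_k(retrieved, relevant, k))
    return {bucket: sum(vals) / len(vals) for bucket, vals in scores.items()}
```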

RAG 2.0: Freshness & Faithfulness

Architect a retrieval stack with hybrid search, temporal decay, dedup, and passage-level citation anchors. Define fact-grounding checks and failure messages; include freshness reindex cadence.

Tags: LLM, RAG, hybrid, temporal-decay, citations, freshness
Author: Assistant
Created at: 2025-12-18 00:00:00
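
Temporal decay can take many forms; the sketch below assumes an exponential half-life, where a document's relevance score is halved every `half_life_days`. The function names, the tuple layout, and the 30-day default are illustrative assumptions, not something the card prescribes.

```python
from typing import List, Tuple


def decayed_score(relevance: float, age_days: float, half_life_days: float) -> float:
    """Relevance multiplied by 0.5 ** (age / half_life)."""
    return relevance * 0.5 ** (age_days / half_life_days)


def rerank(docs: List[Tuple[str, float, float]], half_life_days: float = 30.0) -> List[str]:
    """Sort (doc_id, relevance, age_days) tuples by decayed score, best first."""
    ranked = sorted(
        docs,
        key=lambda d: decayed_score(d[1], d[2], half_life_days),
        reverse=True,
    )
    return [doc_id for doc_id, _, _ in ranked]
```

With a 30-day half-life, a 90-day-old passage scoring 0.9 decays to 0.1125 and ranks below a fresh passage scoring 0.7.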

Synthetic Data: Self-Play & Critique

Propose a self-play generation strategy where a teacher model drafts, a critic model scores, and a curator enforces diversity/novelty. Provide leakage and drift monitors.

Tags: LLM, synthetic-data, self-play, critic, curation, drift
Author: Assistant
Created at: 2025-12-18 00:00:00

Instruction Mining @ Scale

Design a pipeline to mine high-quality instructions/solutions from forums, docs, and code. Include classifier-based filtering, self-checking, and multilingual normalization.

Tags: LLM, instruction-mining, classification, self-check, ETL, multilingual
Author: Assistant
Created at: 2025-12-18 00:00:00

Data Governance & Decontamination

Write a data governance spec: license screening, PII scrubbing, near-duplicate collapse, contamination checks vs eval sets, and audit trails. Provide rejection reasons and exception handling.

Tags: LLM, data-governance, PII, decontamination, licensing, audit
Author: Assistant
Created at: 2025-12-18 00:00:00
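
A contamination check against eval sets is often approximated by word n-gram overlap. The sketch below assumes whitespace tokenization, lowercasing, and a default window of 8 tokens; all three are illustrative choices, and the function names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Set of lowercase word n-grams in `text` (empty if the text is shorter than n)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_ratio(sample: str, eval_texts: Iterable[str], n: int = 8) -> float:
    """Share of the sample's n-grams that also occur in any eval text."""
    sample_grams = ngrams(sample, n)
    if not sample_grams:
        return 0.0
    eval_grams: Set[Tuple[str, ...]] = set()
    for text in eval_texts:
        eval_grams |= ngrams(text, n)
    return len(sample_grams & eval_grams) / len(sample_grams)
```

A governance spec would then pair this ratio with a rejection threshold and a logged rejection reason.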

Fine-Tune Stack: SFT→DPO/ORPO→RLHF

Specify a training stack with SFT on curated data, preference optimization (DPO/ORPO), and optional RLHF. Include reward hacking tests, guardrails, and evals that predict production behavior.

Tags: LLM, SFT, DPO, ORPO, RLHF, alignment, evaluation
Author: Assistant
Created at: 2025-12-18 00:00:00
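
For the preference-optimization stage, the standard DPO objective for a single preference pair can be sketched as follows. Inputs are summed sequence log-probabilities under the policy and a frozen reference model; the function name and the default `beta = 0.1` are assumptions chosen for the example.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """-log(sigmoid(beta * margin)) for one (chosen, rejected) pair."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) = log(1 + exp(-z)); branch on sign to avoid overflow.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

The loss equals log 2 when the policy matches the reference and falls as the policy favors the chosen response more than the reference does.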

Long-Context Attention Variants

Compare FlashAttention-3, RingAttention, Multi-Query, and Hybrid-Selective attention for 128k+ contexts. Provide kernel-level trade-offs and memory footprints, and document a migration path.

Tags: LLM, attention, long-context, FlashAttention, RingAttention, MQA
Author: Assistant
Created at: 2025-12-18 00:00:00
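
To make the memory-footprint comparison concrete, the sketch below estimates KV-cache size as 2 (keys and values) × layers × KV heads × head dimension × sequence length × bytes per element. The model shape used in the usage note (32 layers, 32 heads, head dimension 128, fp16) is a hypothetical configuration, not one taken from the card.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_elem: int = 2,  # assumption: fp16/bf16 cache
    batch_size: int = 1,
) -> int:
    """Bytes needed to cache keys and values for every layer and position."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem * batch_size


GIB = 2 ** 30
mha_gib = kv_cache_bytes(32, 32, 128, 131072) / GIB  # one KV head per query head
mqa_gib = kv_cache_bytes(32, 1, 128, 131072) / GIB   # Multi-Query: a single shared KV head
```

Under these assumptions a 128k-token context needs 64 GiB of cache with full multi-head attention and 2 GiB with Multi-Query attention.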

KV Offload & Memory Tiers

Engineer a KV-cache offload strategy spanning HBM→CPU RAM→NVMe. Define admission/eviction, compression, and reuse heuristics; simulate hit rates across context lengths (8k–256k).

Tags: LLM, KV-cache, offload, NVMe, memory, context-length
Author: Assistant
Created at: 2025-12-18 00:00:00
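
A hit-rate simulation can start from a single tier before modeling the full hierarchy. The sketch below assumes LRU eviction and admit-on-miss over a trace of KV block ids; both policies and the function name are illustrative assumptions, since the card leaves admission and eviction open.

```python
from collections import OrderedDict
from typing import Hashable, Sequence


def lru_hit_rate(accesses: Sequence[Hashable], capacity: int) -> float:
    """Fraction of accesses served from an LRU cache holding `capacity` blocks."""
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True            # admit on miss
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses) if accesses else 0.0
```

Sweeping `capacity` over the block counts implied by each context length gives a first hit-rate curve for the fastest tier.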

Quantization Suite: INT8/INT4/NF4

Create a quantization evaluation suite (GPTQ/AWQ/RTN): perplexity, zero-shot accuracy, calibration set selection, and layer-wise sensitivity. Output deployment guidelines by architecture and hardware target.

Tags: LLM, quantization, INT8, INT4, NF4, AWQ, GPTQ
Author: Assistant
Created at: 2025-12-18 00:00:00
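
Of the methods listed, RTN (round-to-nearest) is the simplest baseline. The sketch below assumes symmetric per-tensor INT8 quantization with a single scale; GPTQ and AWQ add calibration-driven corrections that are not shown here. Function names are hypothetical.

```python
from typing import List, Tuple


def rtn_quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest INT8: q = round(w / scale), scale = max|w| / 127."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate float weights."""
    return [q * scale for q in quantized]


def max_abs_error(weights: List[float]) -> float:
    """Largest reconstruction error after a quantize/dequantize round trip."""
    quantized, scale = rtn_quantize_int8(weights)
    restored = dequantize(quantized, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))
```

Running `max_abs_error` per layer gives a crude layer-wise sensitivity signal; the error is bounded by half the scale.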

MoE Routing & Load Balancing

Design an expert-parallel MoE serving topology: gate calibration, capacity factor, expert sharding, and interconnect constraints (NVLink/IB). Provide hot-spot diagnostics and expert-drop policies for brownout resilience.

Tags: LLM, MoE, experts, routing, capacity, NVLink, InfiniBand
Author: Assistant
Created at: 2025-12-18 00:00:00
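
The capacity factor sets how many tokens each expert may accept per batch: roughly `capacity_factor * num_tokens / num_experts`. The sketch below assumes top-1 routing, rounding up, and first-come admission, with overflow tokens dropped; these choices and the function names are illustrative assumptions.

```python
import math
from typing import List, Tuple


def expert_capacity(num_tokens: int, num_experts: int, capacity_factor: float) -> int:
    """Per-expert token limit, rounded up."""
    return math.ceil(capacity_factor * num_tokens / num_experts)


def route_with_capacity(
    assignments: List[int], num_experts: int, capacity: int
) -> Tuple[List[int], List[int], List[int]]:
    """Admit tokens in order until each expert is full; return (kept, dropped, load)."""
    load = [0] * num_experts
    kept: List[int] = []
    dropped: List[int] = []
    for token_idx, expert in enumerate(assignments):
        if load[expert] < capacity:
            load[expert] += 1
            kept.append(token_idx)
        else:
            dropped.append(token_idx)  # overflow: token skips the expert layer
    return kept, dropped, load
```

The returned `load` list doubles as a hot-spot diagnostic: an expert pinned at capacity while others sit near zero indicates routing imbalance.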

Curio AI Brain