Prompt Cards

Toolformer-Style Tool Use
Design a tool-use curriculum: function signatures, schema discovery, tool reliability scoring, and retry/backoff policy. Include sandboxing and cost guards.
Tags: LLM, tools, function-calling, reliability, sandbox, cost
Author: Assistant
Created at: 2025-12-18 00:00:00
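For the retry/backoff policy this card asks for, one possible starting point is exponential backoff with full jitter. This is a minimal sketch under assumptions: the names `backoff_delays` and `call_with_retry`, the default `base` and `cap` values, and the choice to retry on any exception are illustrative and not part of the card.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Upper bound for each retry delay: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_retry(
    tool: Callable[[], T],
    max_retries: int = 3,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `tool`; on failure wait a random time up to the bound, then retry."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return tool()
        except Exception:
            if attempt == max_retries:
                raise  # retry budget exhausted, surface the last error
            # Full jitter: sample uniformly in [0, bound] to avoid retry storms.
            sleep(random.uniform(0.0, delays[attempt]))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the policy can be tested without waiting. A cost guard could wrap `tool` in the same way, refusing the call once a per-session budget is spent.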
Hallucination Detection & Abstain
Create a hallucination detector using entailment+attribution signals. Define abstention thresholds, user messaging, and a re-query strategy with targeted retrieval.
Tags: LLM, hallucination, entailment, abstention, UX, grounding
Author: Assistant
Created at: 2025-12-18 00:00:00
Retrieval Eval Harness
Build an eval harness: recall@k, calibrated precision, answer faithfulness, and human-time-to-verify. Include topic-aware test buckets and data drift alarms.
Tags: LLM, retrieval, eval, faithfulness, drift, metrics
Author: Assistant
Created at: 2025-12-18 00:00:00
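Of the metrics listed in this card, recall@k is the simplest to pin down: the fraction of relevant documents that appear in the top k results. The sketch below assumes documents are identified by string IDs. The function names are illustrative.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(queries: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    if not queries:
        raise ValueError("need at least one query")
    return sum(recall_at_k(ret, rel, k) for ret, rel in queries) / len(queries)
```

Grouping the `queries` list by topic before calling `mean_recall_at_k` gives the per-bucket numbers the card mentions.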
RAG 2.0: Freshness & Faithfulness
Architect a retrieval stack with hybrid search, temporal decay, dedup, and passage-level citation anchors. Define fact-grounding checks and failure messages; include freshness reindex cadence.
Tags: LLM, RAG, hybrid, temporal-decay, citations, freshness
Author: Assistant
Created at: 2025-12-18 00:00:00
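Temporal decay in a retrieval stack is often modeled as an exponential half-life applied to the relevance score. The sketch below assumes that form. The 30-day default half-life and the `(doc_id, relevance, age_days)` tuple layout are hypothetical choices for illustration.

```python
def decayed_score(relevance: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Halve the relevance score for every `half_life_days` of document age."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return relevance * 0.5 ** (max(age_days, 0.0) / half_life_days)


def rerank(hits: list[tuple[str, float, float]], half_life_days: float = 30.0) -> list[str]:
    """Order doc IDs by decayed score, best first. Each hit is (doc_id, relevance, age_days)."""
    scored = [(decayed_score(rel, age, half_life_days), doc_id) for doc_id, rel, age in hits]
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: s[0], reverse=True)]
```

With a 30-day half-life, a 90-day-old passage keeps one eighth of its score, so a fresh passage with moderate relevance can outrank a stale one with high relevance.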
Synthetic Data: Self-Play & Critique
Propose a self-play generation strategy where a teacher model drafts, a critic model scores, and a curator enforces diversity/novelty. Provide leakage and drift monitors.
Tags: LLM, synthetic-data, self-play, critic, curation, drift
Author: Assistant
Created at: 2025-12-18 00:00:00
Instruction Mining @ Scale
Design a pipeline to mine high-quality instructions/solutions from forums, docs, and code. Include classifier-based filtering, self-checking, and multilingual normalization.
Tags: LLM, instruction-mining, classification, self-check, ETL, multilingual
Author: Assistant
Created at: 2025-12-18 00:00:00
Data Governance & Decontamination
Write a data governance spec: license screening, PII scrubbing, near-duplicate collapse, contamination checks vs eval sets, and audit trails. Provide rejection reasons and exception handling.
Tags: LLM, data-governance, PII, decontamination, licensing, audit
Author: Assistant
Created at: 2025-12-18 00:00:00
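One common way to approach the contamination check in this card is word n-gram overlap between a training candidate and the eval items. The sketch below assumes whitespace tokenization and lowercase normalization. The window size `n` and any rejection threshold applied to the ratio are assumptions to tune, not values from the card.

```python
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Set of lowercase word n-grams in `text` (empty if text is shorter than n words)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_ratio(candidate: str, eval_items: list[str], n: int = 8) -> float:
    """Share of the candidate's n-grams that also occur in any eval item."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    eval_grams: set[tuple[str, ...]] = set()
    for item in eval_items:
        eval_grams |= ngrams(item, n)
    return len(cand & eval_grams) / len(cand)
```

A governance pipeline could log the ratio with the matching eval item as the rejection reason, which also feeds the audit trail.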
Fine-Tune Stack: SFT→DPO/ORPO→RLHF
Specify a training stack with SFT on curated data, preference optimization (DPO/ORPO), and optional RLHF. Include reward hacking tests, guardrails, and evals that predict production behavior.
Tags: LLM, SFT, DPO, ORPO, RLHF, alignment, evaluation
Author: Assistant
Created at: 2025-12-18 00:00:00
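For the preference-optimization stage, the DPO objective for a single preference pair can be written in a few lines. The sketch below takes summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model. The argument names and the default `beta` of 0.1 are illustrative assumptions.

```python
import math


def dpo_loss(
    policy_chosen: float,
    policy_rejected: float,
    ref_chosen: float,
    ref_rejected: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one pair: -log(sigmoid(beta * margin)).

    margin = (log-ratio of chosen) - (log-ratio of rejected), where each
    log-ratio is policy log-prob minus reference log-prob.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # -log(sigmoid(x)) = log(1 + exp(-x)), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy equals the reference the margin is zero and the loss is log 2. The loss falls as the policy raises the chosen response relative to the rejected one.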
Long-Context Attention Variants
Compare FlashAttention-3, RingAttention, Multi-Query Attention (MQA), and Hybrid-Selective attention for 128k+ contexts. Provide kernel-level trade-offs and memory footprints, and document a migration path.
Tags: LLM, attention, long-context, FlashAttention, RingAttention, MQA
Author: Assistant
Created at: 2025-12-18 00:00:00
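For the memory-footprint part of this comparison, KV-cache size per sequence is plain arithmetic: two tensors (keys and values) per layer, each of shape KV heads × head dimension × sequence length. The sketch below shows how Multi-Query Attention shrinks that footprint. The model shape used in the usage note is a hypothetical configuration, not a specific model.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_elem: int = 2,  # 2 bytes for fp16/bf16
) -> int:
    """KV-cache size in bytes for one sequence: K and V for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB."""
    return num_bytes / 2**30
```

For a hypothetical 32-layer model with 32 heads of dimension 128 at a 131,072-token context in fp16, `kv_cache_bytes(32, 32, 128, 131072)` is 64 GiB with full multi-head KV, and `kv_cache_bytes(32, 1, 128, 131072)` is 2 GiB with a single shared KV head.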
KV Offload & Memory Tiers
Engineer a KV-cache offload strategy spanning GPU HBM→CPU RAM→NVMe. Define admission/eviction, compression, and reuse heuristics; simulate hit rates across context lengths (8k–256k).
Tags: LLM, KV-cache, offload, NVMe, memory, context-length
Author: Assistant
Created at: 2025-12-18 00:00:00
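A hit-rate simulation for tiered KV offload can begin with a toy model: a small fast tier with LRU eviction that demotes victims to an unbounded slow tier. The sketch below is a simplification under assumptions. It tracks block IDs only, uses two tiers, and the class name `TwoTierLRU` is hypothetical.

```python
from collections import OrderedDict


class TwoTierLRU:
    """Fast tier with LRU eviction; evicted blocks are demoted to a slow tier."""

    def __init__(self, fast_capacity: int) -> None:
        self.fast: OrderedDict[str, bool] = OrderedDict()
        self.slow: set[str] = set()
        self.fast_capacity = fast_capacity
        self.fast_hits = 0
        self.slow_hits = 0
        self.misses = 0

    def access(self, block_id: str) -> None:
        if block_id in self.fast:
            self.fast.move_to_end(block_id)  # mark as most recently used
            self.fast_hits += 1
            return
        if block_id in self.slow:
            self.slow.discard(block_id)  # promote back to the fast tier
            self.slow_hits += 1
        else:
            self.misses += 1  # block must be recomputed
        self.fast[block_id] = True
        if len(self.fast) > self.fast_capacity:
            victim, _ = self.fast.popitem(last=False)  # least recently used
            self.slow.add(victim)
```

Replaying an access trace through `access` and reading the three counters gives per-tier hit rates. Extending the slow tier with its own capacity and a third tier follows the same pattern.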
Quantization Suite: INT8/INT4/NF4
Create a quantization evaluation suite (GPTQ/AWQ/RTN): perplexity, zero-shot accuracy, calibration set selection, and layer-wise sensitivity. Output deployment guidelines by architecture and hardware target.
Tags: LLM, quantization, INT8, INT4, NF4, AWQ, GPTQ
Author: Assistant
Created at: 2025-12-18 00:00:00
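Of the three methods named in this card, round-to-nearest (RTN) is the baseline and is small enough to show in full. The sketch below assumes symmetric per-tensor quantization over a flat list of weights. Real suites usually quantize per channel or per group, and the function names here are illustrative.

```python
def rtn_quantize(weights: list[float], bits: int = 4) -> tuple[list[int], float]:
    """Symmetric round-to-nearest quantization; returns (integer codes, scale)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for INT4, 127 for INT8
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def max_abs_error(weights: list[float], bits: int = 4) -> float:
    """Worst-case reconstruction error, a simple layer-sensitivity signal."""
    codes, scale = rtn_quantize(weights, bits)
    return max((abs(w - d) for w, d in zip(weights, dequantize(codes, scale))), default=0.0)
```

Running `max_abs_error` per layer at each bit width gives a first-pass sensitivity ranking before moving on to perplexity and zero-shot evals.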
MoE Routing & Load Balancing
Design an expert-parallel MoE serving topology: gate calibration, capacity factor, expert sharding, and interconnect constraints (NVLink/IB). Provide hot-spot diagnostics and expert-drop policies for brownout resilience.
Tags: LLM, MoE, experts, routing, capacity, NVLink, InfiniBand
Author: Assistant
Created at: 2025-12-18 00:00:00
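The capacity factor in this card is commonly defined for top-1 routing as a multiplier on the even share of tokens per expert, with overflow tokens dropped. The sketch below assumes that convention. The first-come, first-served drop order and the function names are illustrative choices, not a prescribed policy.

```python
import math
from collections import Counter


def expert_capacity(num_tokens: int, num_experts: int, capacity_factor: float = 1.25) -> int:
    """Max tokens one expert may take: capacity_factor times the even share."""
    return math.ceil(capacity_factor * num_tokens / num_experts)


def route_with_capacity(
    assignments: list[int], num_experts: int, capacity_factor: float = 1.25
) -> tuple[list[int], list[int]]:
    """Split token indices into (kept, dropped) given each token's top-1 expert."""
    cap = expert_capacity(len(assignments), num_experts, capacity_factor)
    load: Counter[int] = Counter()
    kept: list[int] = []
    dropped: list[int] = []
    for token_idx, expert in enumerate(assignments):
        if load[expert] < cap:
            load[expert] += 1
            kept.append(token_idx)
        else:
            dropped.append(token_idx)  # expert is full, token overflows
    return kept, dropped
```

The per-expert `load` counts are also the raw material for hot-spot diagnostics: an expert that regularly hits its cap while others sit idle points to a gate calibration problem.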
