Prompt Cards

Latency Decomposition & SLOs
Produce a latency decomposition (queue → prefill → decode → post-processing). Propose fixes for p95/p99 tail latency: micro-batching, admission control, and early-termination heuristics.
Tags: LLM, latency, SLO, micro-batch, admission-control
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
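As an illustration of the decomposition the Latency Decomposition & SLOs card asks for, here is a minimal sketch that reports p50/p95/p99 per stage and for the end-to-end total. The stage keys (`queue`, `prefill`, `decode`, `post`), the millisecond unit, and the nearest-rank percentile method are assumptions made for the example, not part of the card.

```python
# Hypothetical stage names; a real trace may use different keys.
STAGES = ("queue", "prefill", "decode", "post")


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list; p is an integer in 1..100."""
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def decompose(requests):
    """requests: list of dicts mapping each stage name to milliseconds spent."""
    series = {stage: [r[stage] for r in requests] for stage in STAGES}
    series["total"] = [sum(r[s] for s in STAGES) for r in requests]
    return {
        name: {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}
        for name, samples in series.items()
    }
```

Comparing each stage's p99 against the total's p99 shows which stage dominates the tail, which in turn suggests whether admission control (queue), micro-batching (prefill), or early termination (decode) is the more promising fix.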
Cost Guardrails & Budgets
Implement per-tenant and per-feature budgets: rate limits, max prompt length, and fallback models. Provide business rules and alerts to prevent runaway spend.
Tags: LLM, cost, rate-limits, budgets, fallback, tenants
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
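One way the budget rules in the Cost Guardrails & Budgets card could be expressed is sketched below. The `Budget` fields, the 80% alert threshold, and the model names `large-model` and `small-model` are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Per-tenant (or per-feature) budget; all defaults are illustrative."""
    limit_usd: float
    spent_usd: float = 0.0
    max_prompt_tokens: int = 8000
    alert_fraction: float = 0.8  # switch to the fallback model past this share


def route(budget, prompt_tokens, est_cost_usd,
          primary="large-model", fallback="small-model"):
    """Return (decision, model) for one request without mutating the budget."""
    if prompt_tokens > budget.max_prompt_tokens:
        return ("reject", None)  # prompt-length guardrail
    projected = budget.spent_usd + est_cost_usd
    if projected > budget.limit_usd:
        return ("reject", None)  # hard cap: would exceed the budget
    if projected >= budget.alert_fraction * budget.limit_usd:
        return ("fallback", fallback)  # soft cap: degrade and raise an alert
    return ("allow", primary)
```

The caller is assumed to add the actual cost to `spent_usd` after the request completes and to emit an alert whenever the decision is not `allow`.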
Observability: Tokens, Tools, Truth
Define observability signals: token-usage distributions, tool-call success rates, citation density, and hallucination alerts. Provide redaction-safe logs and dashboards.
Tags: LLM, observability, telemetry, citations, alerts, logging
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
Contextual Caching & Prefix Trees
Engineer prompt prefix trees and semantic caches to cut latency and cost for recurring tasks. Provide hit-rate models and an invalidation policy.
Tags: LLM, caching, prefix, semantic, latency, cost
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
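To make the prefix-tree idea in the Contextual Caching & Prefix Trees card concrete, the following sketch stores previously seen prompts in a trie keyed by token and reports how many leading tokens of a new prompt are already cached. The class name `PrefixCache` and the token-level hit rate are assumptions for illustration; semantic caching and invalidation are not covered.

```python
class PrefixCache:
    """Trie over token sequences; each node is a dict of child tokens."""

    def __init__(self):
        self.root = {}

    def insert(self, tokens):
        """Record a prompt so its prefixes can be reused later."""
        node = self.root
        for token in tokens:
            node = node.setdefault(token, {})

    def longest_prefix(self, tokens):
        """Number of leading tokens that match a cached prompt."""
        node, matched = self.root, 0
        for token in tokens:
            if token not in node:
                break
            node = node[token]
            matched += 1
        return matched

    def hit_rate(self, tokens):
        """Share of this prompt's tokens covered by the cached prefix."""
        return self.longest_prefix(tokens) / len(tokens) if tokens else 0.0
```

For example, after inserting `["sys", "you", "are", "helpful"]`, a prompt starting with the same three tokens and then diverging would reuse three of its tokens.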
Knowledge Distillation Plan
Distill a 70B teacher into a 7–13B student: loss mixing (logits+features+policies), curriculum, and temperature tuning. Provide downstream eval deltas.
Tags: LLM, distillation, teacher-student, curriculum, losses
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
LoRA/QLoRA Strategy
Recommend when to use LoRA/QLoRA versus full fine-tuning. Define rank search, target layers, and quantization-aware adapters. Include memory/performance tables per GPU class.
Tags: LLM, LoRA, QLoRA, finetuning, adapters, GPU
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
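The rank search mentioned in the LoRA/QLoRA Strategy card is easier to reason about with the adapter parameter count in hand: a rank-r adapter on a weight matrix of shape d_out × d_in adds r · (d_in + d_out) trainable parameters. The sketch below only does this arithmetic; the 4096 × 4096 layer shape in the usage note is an assumed example, not a measurement for any particular model or GPU.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters added by one LoRA adapter (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Adapter parameters relative to the frozen full weight matrix."""
    return lora_params(d_in, d_out, rank) / (d_in * d_out)
```

For a hypothetical 4096 × 4096 projection with rank 8, `lora_params(4096, 4096, 8)` gives 65,536 adapter parameters against 16,777,216 frozen ones, about 0.4% of the layer.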
Safety Red Team & Taxonomy
Create a safety taxonomy (harm classes) and a multilingual red-team plan with auto-generation of adversarial prompts. Provide block/transform policies and human review paths.
Tags: LLM, safety, red-team, taxonomy, policy, multilingual
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
Privacy: DP-SGD & Redaction
Outline a privacy strategy: DP-SGD variants for SFT, selective redaction layers, privacy evals (membership inference), and logging minimization.
Tags: LLM, privacy, DP-SGD, redaction, membership-inference, logging
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
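For the DP-SGD part of the Privacy: DP-SGD & Redaction card, the core step is per-example gradient clipping followed by Gaussian noise. The sketch below shows that step on plain Python lists; the function name, the argument names, and the use of the standard-library `random` module are assumptions for illustration, and it does no privacy accounting.

```python
import math
import random


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """Clip each example's gradient to clip_norm, sum, add noise, then average."""
    rng = rng or random.Random()
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            total[i] += x * scale
    std = noise_multiplier * clip_norm  # noise scales with the clipping bound
    batch_size = len(per_example_grads)
    return [(t + rng.gauss(0.0, std)) / batch_size for t in total]
```

Setting `noise_multiplier` to 0 disables the noise and leaves only clipping, which is useful for checking the clipping logic but provides no differential privacy guarantee.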
Multilingual Alignment @ Scale
Design a multilingual alignment plan (zh/ja/hi/id/pt/en): shared subword policy, cross-lingual instructions, and locale-specific refusal tuning. Provide leakage checks.
Tags: LLM, multilingual, alignment, tokenization, refusal-tuning
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
Multi-Task Multi-Domain Evals
Create a senior-grade eval battery: reasoning (math/code), instruction-following, safety, multilingual QA, and tool-use. Include uncertainty intervals and power analysis for A/Bs.
Tags: LLM, evaluation, multidomain, statistics, AB-testing
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
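The power analysis requested in the Multi-Task Multi-Domain Evals card can be approximated with the standard two-proportion z-test sample-size formula. The sketch below assumes a two-sided test on pass rates with equal-sized arms; the function name and the default `alpha` and `power` values are choices made for the example.

```python
import math
from statistics import NormalDist


def samples_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate examples needed per arm to detect the given pass-rate difference."""
    if p_control == p_treatment:
        raise ValueError("the two pass rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)
```

Halving the detectable difference roughly quadruples the required sample size, which is worth checking before committing to an eval set size for an A/B comparison.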
Structured Output Contracts
Define JSON schema contracts with type coercion, partial output recovery, and EBNF constraints. Provide test-time correction and repair strategies.
Tags: LLM, structured-output, JSON, EBNF, validation, repair
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
Agents: Planner–Executor–Critic
Specify a lightweight agent loop with decomposition, execution, and critique. Provide termination conditions, trace logging, and loop iteration limits.
Tags: LLM, agents, planning, critique, traces, governance
Author: Assistant
Created at: 2025-12-18 00:00:00
Average Rating:
Total Ratings:
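The loop described in the Agents: Planner–Executor–Critic card might look like the sketch below. The `plan`, `execute`, and `critique` callables are hypothetical stand-ins for model or tool calls, and the `"accept"` verdict string and `max_steps` default are assumptions for the example.

```python
def run_agent(task, plan, execute, critique, max_steps=5):
    """Plan, execute, and critique until accepted or the step limit is reached."""
    trace = []  # one entry per iteration, kept for logging and review
    for step in range(1, max_steps + 1):
        subtasks = plan(task, trace)               # decomposition
        results = [execute(s) for s in subtasks]   # execution
        verdict = critique(task, results)          # critique
        trace.append({"step": step, "subtasks": subtasks,
                      "results": results, "verdict": verdict})
        if verdict == "accept":                    # termination condition
            return {"status": "done", "trace": trace}
    return {"status": "step_limit", "trace": trace}  # iteration limit hit
```

Passing the trace back into `plan` lets the planner see earlier critiques, and the explicit `step_limit` status keeps a run that never converges distinguishable from one that finished.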
