Prompt Cards

Latency Decomposition & SLOs

Produce a latency decomposition (queue→prefill→decode→post). Propose tail-p95/p99 fixes: micro-batching, admission control, and early-termination heuristics.

Tags: LLM, latency, SLO, micro-batch, admission-control
Author: Assistant
Created at: 2025-12-18 00:00:00
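
One way to sketch the decomposition this card asks for, assuming each request is logged with per-stage durations in milliseconds. The field names and sample timings are hypothetical, and the percentile uses the nearest-rank method rather than any particular monitoring tool's definition.

```python
import math

STAGES = ("queue", "prefill", "decode", "post")

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def decompose(requests, pct):
    """Return the pct-th percentile of each stage and of end-to-end latency."""
    report = {s: percentile([r[s] for r in requests], pct) for s in STAGES}
    # The total is computed per request first, then ranked.
    report["total"] = percentile([sum(r[s] for s in STAGES) for r in requests], pct)
    return report

# Hypothetical per-request stage timings in milliseconds.
requests = [
    {"queue": 5, "prefill": 40, "decode": 300, "post": 3},
    {"queue": 8, "prefill": 42, "decode": 310, "post": 3},
    {"queue": 250, "prefill": 45, "decode": 320, "post": 4},
    {"queue": 6, "prefill": 41, "decode": 900, "post": 3},
]
p95 = decompose(requests, 95)
```

Per-stage tail values generally do not sum to the end-to-end tail, because different requests are slow in different stages. In this sample the queue tail and the decode tail come from different requests, which hints that admission control and early termination would target separate problems.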

Cost Guardrails & Budgets

Implement per-tenant and per-feature budgets: rate limits, max prompt length, and fallback models. Provide business rules and alerts to prevent runaway spend.

Tags: LLM, cost, rate-limits, budgets, fallback, tenants
Author: Assistant
Created at: 2025-12-18 00:00:00
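
A minimal sketch of a per-tenant budget rule, assuming spend is tracked in USD and each request carries a cost estimate. The class, field names, and the 80% fallback threshold are illustrative assumptions, not values from the card.

```python
from dataclasses import dataclass

@dataclass
class TenantBudget:
    monthly_limit_usd: float
    spent_usd: float = 0.0
    max_prompt_tokens: int = 8000   # hypothetical prompt length cap
    fallback_ratio: float = 0.8     # hypothetical: use the cheaper model past 80% of budget

def route(budget, prompt_tokens, est_cost_usd):
    """Return 'reject', 'fallback', or 'primary' for one request."""
    if prompt_tokens > budget.max_prompt_tokens:
        return "reject"
    projected = budget.spent_usd + est_cost_usd
    if projected > budget.monthly_limit_usd:
        return "reject"
    if projected > budget.fallback_ratio * budget.monthly_limit_usd:
        return "fallback"
    return "primary"
```

The same projected-spend comparison can drive alerts, for example by notifying the tenant owner the first time a request is routed to the fallback.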

Observability: Tokens, Tools, Truth

Define observability: token usage distributions, tool call success, citation density, and hallucination alerts. Provide redaction-safe logs and dashboards.

Tags: LLM, observability, telemetry, citations, alerts, logging
Author: Assistant
Created at: 2025-12-18 00:00:00
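
A small sketch of a redaction-safe log record, assuming email addresses are the only sensitive pattern. The regular expression is deliberately simple and the record fields are hypothetical; a real deployment would need a broader redaction pass.

```python
import re

# Simplified email pattern; an assumption for illustration, not a full RFC 5322 matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def redact(text):
    """Replace every email-like substring with a fixed placeholder."""
    return EMAIL.sub("[EMAIL]", text)

def log_record(request_id, prompt, prompt_tokens, completion_tokens, tool_calls_ok, tool_calls_total):
    """Build a log entry that keeps usage metrics but only a redacted prompt preview."""
    return {
        "request_id": request_id,
        "prompt_preview": redact(prompt)[:80],
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "tool_success_rate": tool_calls_ok / tool_calls_total if tool_calls_total else None,
    }

record = log_record("r-1", "Email ada@example.com the report", 12, 40, 3, 4)
```

Redacting before truncating keeps a cut-off address from slipping through as a partial match.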

Contextual Caching & Prefix Trees

Engineer prompt prefix trees and semantic caches to cut latency/cost for recurring tasks. Provide hit-rate models and invalidation policy.

Tags: LLM, caching, prefix, semantic, latency, cost
Author: Assistant
Created at: 2025-12-18 00:00:00
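
The prefix tree idea can be sketched as a trie over prompt tokens that reports how much of a new prompt is already cached. The class names are hypothetical, and the sketch leaves out semantic matching and invalidation.

```python
class PrefixNode:
    def __init__(self):
        self.children = {}
        self.cached = False  # True if a cache entry ends exactly at this node

class PrefixCache:
    def __init__(self):
        self.root = PrefixNode()

    def insert(self, tokens):
        """Record that the full token sequence has a cache entry."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, PrefixNode())
        node.cached = True

    def longest_cached_prefix(self, tokens):
        """Return the length of the longest cached prefix of tokens (0 if none)."""
        node, best = self.root, 0
        for i, tok in enumerate(tokens, start=1):
            node = node.children.get(tok)
            if node is None:
                break
            if node.cached:
                best = i
        return best

cache = PrefixCache()
cache.insert(["<sys>", "You", "are"])
hit = cache.longest_cached_prefix(["<sys>", "You", "are", "helpful"])
```

Averaging the ratio of cached prefix length to prompt length over a traffic sample gives a rough empirical hit rate to check a hit-rate model against.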

Knowledge Distillation Plan

Distill a 70B teacher into a 7–13B student: loss mixing (logits+features+policies), curriculum, and temperature tuning. Provide downstream eval deltas.

Tags: LLM, distillation, teacher-student, curriculum, losses
Author: Assistant
Created at: 2025-12-18 00:00:00
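
A minimal sketch of the logit part of loss mixing, assuming a single example with plain Python lists of logits. It mixes hard-label cross-entropy with a temperature-softened KL term; the feature and policy terms the card mentions are not covered, and the default `temperature` and `alpha` values are arbitrary.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """alpha * cross-entropy + (1 - alpha) * T^2 * KL(teacher || student)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-target term on a scale comparable to the hard-label term.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

When student and teacher logits match, the KL term vanishes and only the weighted cross-entropy remains, which makes a quick sanity check.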

LoRA/QLoRA Strategy

Recommend when to use LoRA/QLoRA vs full finetune. Define rank search, target layers, and quantization-aware adapters. Include memory/perf tables per GPU class.

Tags: LLM, LoRA, QLoRA, finetuning, adapters, GPU
Author: Assistant
Created at: 2025-12-18 00:00:00
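
As a starting point for rank search, the arithmetic below counts the trainable parameters LoRA adds to one weight matrix: two low-rank factors of shapes `rank x d_in` and `d_out x rank`. The 4096-wide layer is a hypothetical example, not a measurement for any specific model or GPU.

```python
def lora_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors for one adapted weight matrix."""
    return rank * (d_in + d_out)

def trainable_fraction(d_in, d_out, rank):
    """Adapter parameters relative to the frozen full matrix."""
    return lora_params(d_in, d_out, rank) / (d_in * d_out)

# Hypothetical 4096 x 4096 projection adapted at a few candidate ranks.
table = {r: (lora_params(4096, 4096, r), trainable_fraction(4096, 4096, r)) for r in (4, 8, 16, 64)}
```

At rank 8 this hypothetical layer trains 65,536 adapter parameters, about 0.39% of the full matrix. Activation and optimizer memory still have to be measured per GPU class.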

Safety Red Team & Taxonomy

Create a safety taxonomy (harm classes) and a multilingual red-team plan with auto-generation of adversarial prompts. Provide block/transform policies and human review paths.

Tags: LLM, safety, red-team, taxonomy, policy, multilingual
Author: Assistant
Created at: 2025-12-18 00:00:00

Privacy: DP-SGD & Redaction

Outline a privacy strategy: DP-SGD variants for SFT, selective redaction layers, privacy evals (membership inference), and logging minimization.

Tags: LLM, privacy, DP-SGD, redaction, membership-inference, logging
Author: Assistant
Created at: 2025-12-18 00:00:00
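
The core DP-SGD step can be sketched in plain Python: clip each example's gradient to a fixed L2 norm, sum, add Gaussian noise scaled to the clip norm, and average. This illustrates the mechanism only; it does no privacy accounting, and the function and parameter names are assumptions.

```python
import math
import random

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return the noisy average of per-example gradients clipped to clip_norm."""
    dim = len(per_example_grads[0])
    clipped_sum = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Scale down only when the gradient exceeds the clip norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(grad):
            clipped_sum[i] += x * scale
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in clipped_sum]
    return [x / len(per_example_grads) for x in noisy]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
update = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Setting `noise_multiplier` to 0 turns the function into plain clipped averaging, which is handy for unit tests but provides no privacy guarantee.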

Multilingual Alignment @ Scale

Design a multilingual alignment plan (zh/ja/hi/id/pt/en): shared subword policy, cross-lingual instructions, and locale-specific refusal tuning. Provide leakage checks.

Tags: LLM, multilingual, alignment, tokenization, refusal-tuning
Author: Assistant
Created at: 2025-12-18 00:00:00

Multi-Task Multi-Domain Evals

Create a senior-grade eval battery: reasoning (math/code), instruction-following, safety, multilingual QA, and tool-use. Include uncertainty intervals and power analysis for A/Bs.

Tags: LLM, evaluation, multidomain, statistics, AB-testing
Author: Assistant
Created at: 2025-12-18 00:00:00
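
For the power analysis, a common approximation is the per-arm sample size of a two-sided two-proportion z-test. The sketch below assumes the eval metric is a pass rate and that the two arms are independent; the example rates are hypothetical.

```python
import math
from statistics import NormalDist

def samples_per_arm(p_a, p_b, alpha=0.05, power=0.8):
    """Approximate per-arm sample size to detect p_a vs p_b with a two-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_a - p_b) ** 2)

# Hypothetical: baseline pass rate 70%, candidate 75%.
n = samples_per_arm(0.70, 0.75)
```

Halving the detectable difference roughly quadruples the required sample size, so it helps to fix the minimum effect worth shipping before sizing the eval set.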

Structured Output Contracts

Define JSON schema contracts with type coercion, partial output recovery, and EBNF constraints. Provide test-time correction and repair strategies.

Tags: LLM, structured-output, JSON, EBNF, validation, repair
Author: Assistant
Created at: 2025-12-18 00:00:00
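
One possible repair strategy for partial output recovery is to track open strings, objects, and arrays in a truncated response and append the missing closers before parsing. The helper name is hypothetical, and the sketch only handles truncation inside a value; a cut after a comma, a colon, or in the middle of a key still raises `json.JSONDecodeError`.

```python
import json

def close_truncated_json(text):
    """Append the closers a truncated JSON prefix is missing, then parse it."""
    stack, in_string, escaped = [], False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            # Remember which closer this opener will need.
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    repaired = text + ('"' if in_string else "") + "".join(reversed(stack))
    return json.loads(repaired)

recovered = close_truncated_json('{"name": "Ada", "tags": ["a", "b')
```

The recovered object should still be validated against the schema contract, since a truncated string value may be syntactically valid but semantically incomplete.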

Agents: Planner–Executor–Critic

Specify a lightweight agent loop with decomposition, execution, and critique. Provide termination conditions, trace logging, and loop unroll limits.

Tags: LLM, agents, planning, critique, traces, governance
Author: Assistant
Created at: 2025-12-18 00:00:00
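
A minimal sketch of the loop with an explicit step limit and trace logging. The `plan`, `execute`, and `critique` callables are hypothetical stand-ins for model calls, and the status strings are assumptions chosen for illustration.

```python
def run_agent(goal, plan, execute, critique, max_steps=5):
    """Plan once, then execute and critique each step until done, failed, or out of steps."""
    trace = []
    steps = plan(goal)
    for i, step in enumerate(steps):
        if i >= max_steps:
            # Unroll limit reached before the plan finished.
            return {"status": "step_limit", "trace": trace}
        result = execute(step)
        verdict = critique(step, result)
        trace.append({"step": step, "result": result, "verdict": verdict})
        if verdict == "fail":
            return {"status": "failed", "trace": trace}
    return {"status": "done", "trace": trace}

# Hypothetical stand-ins for model calls.
def plan(goal):
    return [f"{goal}: step {n}" for n in range(3)]

def execute(step):
    return step.upper()

def critique(step, result):
    return "ok" if result else "fail"

outcome = run_agent("summarize", plan, execute, critique)
```

Returning the trace with every exit status keeps failed and truncated runs as inspectable as successful ones.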

Curio AI Brain