Search Results
Showing results for "refusals"
Self-Improving Refusal Handling: When Not to Edit
Create criteria for refusing changes: ambiguous requirements, missing tests, high risk without approval, or insufficient evidence. Include user messaging templates.
Tags: refusal, policy, ambiguity, safety, communication
Author: Assistant
Category: safe-self-improving-ai | Model: gpt-5.2
Safety Policy: What the Bot Will Refuse or Escalate
Draft a safety policy for the research bot: sensitive topics, privacy boundaries, defamation risk, and when to ask the user for confirmation. Include a user-facing explanation style.
Tags: safety-policy, governance, privacy, refusals, trust
Author: Assistant
Category: research-bot | Model: GPT-5.2
Hallucination Reduction Plan (RAG + Verification)
Design a hallucination reduction plan: retrieval augmentation, answer verification steps, consistency checks, and refusal behaviors. Include evaluation metrics and regression tests.
Tags: hallucination, RAG, verification, consistency, testing
Author: Assistant
Category: recursive-ai-safety | Model: GPT-5.2
Safety-Focused Prompting Style Guide
Create a prompting style guide for internal prompts: structure, tool usage rules, refusal patterns, and safety reminders. Include examples and a checklist reviewers can apply.
Tags: prompting, style-guide, internal, guardrails, review
Author: Assistant
Category: recursive-ai-safety | Model: GPT-5.2
Safety Regression Suite (What Must Never Break)
Create a safety regression suite: prompt injection tests, data leakage tests, refusal/guardrail tests, and policy adherence checks. Include how to maintain and evolve the suite over time.
Tags: safety-regression, testing, prompt-injection, privacy, guardrails
Author: Assistant
Category: recursive-ai-safety | Model: GPT-5.2
Cultural Pragmatics: Requests and Politeness
Design a lesson on pragmatics: making requests, refusing politely, apologizing. Include role-plays for Japan↔US contexts and common pitfalls.
Tags: pragmatics, politeness, roleplay, culture, ESL
Author: Assistant
Category: language-teaching | Model: gpt-4o
A2A Negotiation: Contracting Tasks Between Agents
Create an A2A negotiation mechanism: task contracts, acceptance criteria, SLAs, and cost budgets. Include how agents refuse tasks they cannot verify or safely perform.
Tags: A2A, task-contracts, delegation, budgets, reliability
Author: Assistant
Category: agent-architecture | Model: GPT-5.2
Multilingual Alignment @ Scale
Design a multilingual alignment plan (zh/ja/hi/id/pt/en): shared subword policy, cross-lingual instructions, and locale-specific refusal tuning. Provide leakage checks.
Tags: LLM, multilingual, alignment, tokenization, refusal-tuning
Author: Assistant
Category: internationalization-LLM | Model: gpt-4o
Domain-Safe Modes (Medical/Legal)
Specify constrained modes with scope limits, consent prompts, guideline links, and mandatory citation density. Provide refusal templates and domain-specific disclaimers.
Tags: governance, medical, legal, safety, mode-switch
Author: Assistant
Category: policy-governance | Model: gpt-4o
Guardrail Injection
Take a prompt and inject 10 practical guardrails: forbidden content notes, schema validation, refusal policy, and fallbacks. Provide both a short and long version. Output as two Markdown blocks.
Tags: prompt, guardrails, refusal, schema, fallbacks
Author: Curioforce Corp.
Category: Prompt-Improvement | Model: gpt-5-thinking