Every prompt is optimized in seven stages
before any model sees it.
PromptCraft lives in your menu bar and silently engineers every prompt through a seven-stage pipeline. The model receives structured intent, preserved entities, context from your active app, and a style-specific final prompt. Run local with Ollama or in the cloud with major providers, all from the same interface.
Balanced general-purpose optimization
The pipeline, in full.
Scroll through 01 to 07 to explore each stage. Above 01 and below 07, normal scrolling resumes.
Intent Decomposer
Parses what you actually meant, not what you typed.
Your raw input is tokenized and analyzed to extract the primary verb, subject, object, and desired outcome. Intent Decomposer distinguishes between imperative instructions ("build X"), exploratory questions ("how does Y work"), and creative briefs ("write Z in the style of"). This classification routes your request to the appropriate assembly strategy in stage 5.
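The classification step above can be sketched in a few lines. This is an illustrative Python sketch, not PromptCraft's actual Swift implementation; the keyword lists are hypothetical:

```python
import re

# Hypothetical keyword sets -- illustrative only, not PromptCraft's real rules.
IMPERATIVE = {"make", "build", "create", "write", "add", "fix", "refactor"}
EXPLORATORY = {"how", "why", "what", "when", "explain"}
CREATIVE_MARKERS = ("in the style of", "as a poem", "as a short story")

def classify_intent(raw: str) -> str:
    """Classify a raw request as CREATE, EXPLORE, CREATIVE, or GENERAL."""
    lowered = raw.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    if not tokens:
        return "UNKNOWN"
    # Creative briefs are checked first: "write Z in the style of ..."
    # starts with an imperative verb but should route to CREATIVE.
    if any(marker in lowered for marker in CREATIVE_MARKERS):
        return "CREATIVE"
    if tokens[0] in EXPLORATORY:
        return "EXPLORE"
    if tokens[0] in IMPERATIVE:
        return "CREATE"
    return "GENERAL"
```

The key design point the sketch illustrates: the creative-brief check runs before the imperative check, because a creative brief often opens with an imperative verb.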
make a login endpoint with proper error handling
intent: CREATE
domain: backend-api
subject: authentication-endpoint
constraints: implied[security, error-handling, input-validation]
Entity Extractor
Locks every technical term so nothing gets paraphrased away.
Named entity recognition identifies and pins all domain-critical tokens: proper nouns, version numbers, library names, acronyms, quoted strings, and technical identifiers. These entities receive immutable flags so no downstream stage synonymizes, paraphrases, or drops them. What you named is what the model sees.
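Entity pinning of this kind can be illustrated with a small sketch. The patterns below (class and method references, quoted strings, version numbers) are assumptions for illustration, not the app's actual recognizer:

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class PinnedEntity:
    text: str               # the exact token as the user typed it
    kind: str               # e.g. class-ref, method-ref, quoted, version
    immutable: bool = True  # downstream stages must not rewrite it

def extract_entities(raw: str) -> list[PinnedEntity]:
    """Pin domain-critical tokens so later stages cannot paraphrase them."""
    pinned: list[PinnedEntity] = []
    # Class.method() references like UserAuthService.login()
    for m in re.finditer(r"\b([A-Z]\w*)\.(\w+\(\))", raw):
        pinned.append(PinnedEntity(m.group(1), "class-ref"))
        pinned.append(PinnedEntity(m.group(2), "method-ref"))
    # Quoted strings
    for m in re.finditer(r'"([^"]+)"', raw):
        pinned.append(PinnedEntity(m.group(1), "quoted"))
    # Version numbers like 3.5 or v1.2.3
    for m in re.finditer(r"\bv?\d+\.\d+(?:\.\d+)?\b", raw):
        pinned.append(PinnedEntity(m.group(0), "version"))
    return pinned
```

Marking each entity `frozen` and `immutable` is the point: assembly stages may reorder and rephrase everything around these tokens, but never the tokens themselves.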
write a test for the UserAuthService.login() method
PINNED: [UserAuthService] [login()]
type: class-ref, method-ref
immutable: true
scope: test-coverage
Complexity Classifier
Scores the request on four dimensions and routes accordingly.
Four orthogonal dimensions are scored 0–1: scope, depth, ambiguity, and technicality. The combined score selects one of 12 assembly strategies and determines how much context the Context Engine injects in stage 4. A casual request gets a different treatment than a deep expert-level task.
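A minimal sketch of how scores could map to a strategy. The thresholds and most strategy names here are invented for illustration; only pedagogical-depth comes from the example below:

```python
def select_strategy(scope: float, depth: float,
                    ambiguity: float, technicality: float) -> str:
    """Map four 0-1 complexity scores to an assembly strategy."""
    combined = (scope + depth + ambiguity + technicality) / 4
    if depth > 0.7 and technicality > 0.5:
        return "pedagogical-depth"
    if ambiguity > 0.6:
        return "clarify-first"       # hypothetical strategy name
    if combined < 0.3:
        return "lightweight"         # hypothetical strategy name
    return "balanced"                # hypothetical strategy name
```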
explain how transformers work
scope: 0.72
depth: 0.81
ambiguity: 0.38
technicality: 0.64
strategy: pedagogical-depth
Context Engine
Reads what you are working on and injects it silently.
The Context Engine reads the macOS accessibility layer to determine your active application and selected text, queries the local clipboard history ring for recent content, and applies domain priors derived from your style profile. All injected as structured annotations in the assembly buffer without you writing any of it manually.
(active: Xcode, selection: 42 lines Swift, clipboard: UserDefaults key)
context.lang = swift
context.framework = swiftui
context.recently_edited = UserDefaults
context.scope = mobile-development
Prompt Assembler
Builds the final prompt from every upstream signal.
The Assembler takes classified intent, pinned entities, complexity score, and injected context and renders the final prompt using the active style template. Engineering adds role specification, constraint enumeration, format requirements, and test directives. Every template is composable, not static.
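A composable template can be sketched as a format string filled from upstream signals. The field names and template text below are illustrative assumptions, not the app's real template format:

```python
# Hypothetical Engineering-style template; field names are illustrative.
ENGINEERING_TEMPLATE = (
    "You are a senior {role}.\n"
    "Task: {task}\n"
    "Constraints: {constraints}\n"
    "Format: {fmt}"
)

def assemble_prompt(role: str, task: str,
                    constraints: list[str], fmt: str) -> str:
    """Render the final prompt from upstream signals and a style template."""
    return ENGINEERING_TEMPLATE.format(
        role=role,
        task=task,
        constraints=", ".join(constraints),
        fmt=fmt,
    )
```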
pinned entities + intent + context + style=Engineering
You are a senior iOS engineer. Task: refactor selected Swift code using SwiftUI best practices. Constraints: preserve UserDefaults key compatibility, add unit test stubs for each public method. Format: annotated code + change rationale.
Model Execution
Routes to your provider and streams back without delay.
The assembled prompt routes to your configured endpoint: local Ollama, or a cloud provider via PromptCraft Cloud. Responses stream token by token, with timeouts, transient-error retries, and request deduplication handled automatically.
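Retry-on-transient-error around a streaming request might look like the following. A minimal sketch, not PromptCraft's client: it restarts the stream from the beginning on failure, which is exactly why the real pipeline also needs request deduplication:

```python
import time

def stream_with_retry(request_fn, max_retries: int = 3, backoff: float = 0.5):
    """Yield tokens from request_fn(), retrying transient connection errors.

    Illustrative only: a retry restarts the stream from the beginning,
    so callers must deduplicate requests to avoid repeated tokens.
    """
    for attempt in range(max_retries + 1):
        try:
            yield from request_fn()
            return
        except ConnectionError:
            if attempt == max_retries:
                raise
            time.sleep(backoff * 2 ** attempt)  # exponential backoff
```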
assembled prompt to configured provider
provider: Claude 3.5 Sonnet
latency: 178ms to first token
streaming: active @ 48 tok/s
status: 200 OK
Post Processor
Strips model artifacts before output reaches you.
The raw response passes through format validation, artifact stripping, verbosity enforcement, and structural cleanup. The result lands in the output area with copy, export, and clear actions ready immediately.
raw model response with preamble + filler
preamble: stripped
format: validated
verbosity: enforced
output: ready to copy
The AI already has your context.
You just don't have to write it.
Context you never have to write.
The Context Engine reads your active application, selected text, and clipboard history before assembling any prompt. Coding in Xcode with a SwiftUI file open? It knows. Writing in Notion? It adjusts tone. The pipeline always knows what you were working on — without you ever saying it.
Seven optimization styles. Immediately.
Engineering adds role specification, constraint enumeration, format requirements, and test stubs. Research adds academic framing and source awareness. Content adds narrative structure and audience tone. You switch styles with one click — and the same rough input produces a completely different, equally precise result.
write a test for the login method
You are a senior backend QA engineer. Task: write an integration test suite for the login endpoint. Constraints: Jest, supertest, test success (200), bad credentials (401), rate limit (429), locked account (423)...
Write a comprehensive test for the login function covering the happy path and at least three failure scenarios with appropriate assertions...
Faster than you can blink. Private by design.
The entire seven-stage pipeline runs on your Mac in under 200ms. Your prompts never reach an external server until they touch your chosen AI provider — and with Ollama, that never happens at all. One read-only Accessibility permission. No telemetry. No logging. No account required for the app itself.
"I got tired of starting every AI conversation with a three-paragraph context dump. I was spending more time writing the setup than writing the actual work. PromptCraft started as a one-file Swift script. When it started saving me forty minutes a day, I turned it into a real app."
First prototype — a menu bar app that prepended a role prompt to every request. One file of Swift. Worked immediately.
Added the style selector and context engine. Response quality was obviously better. Started showing it to other engineers.
Full 7-stage pipeline, clipboard history, and the cloud proxy for provider routing without API key management.
Published. Still the only AI tool I actually keep open. Used every day, built on ever since.
$99 once.
No annual renewal trap.
Pro is a one-time purchase. You own it, it updates forever, and you never get an invoice again. Cloud adds managed AI routing for people who don't want to manage API keys.
One-time · yours forever
- All 7 pipeline stages, every request
- Runs fully local — no data leaves your Mac
- 7 built-in optimization styles: General, Engineering, Research, Content, Analysis, Academic, Creative
- Custom style templates: define and save your own prompt architecture
- Unlimited clipboard history: every prompt and result stored locally, searchable
- Any Ollama model (local, offline): llama3.2, mistral, deepseek, phi-3, qwen, and every future model
- Global keyboard shortcut: Cmd+Shift+Space from any app, any window, any Space
- Activate on up to 3 Macs
- All future updates
Monthly · cancel any time
- Everything in Pro
- Cloud routing: Claude, GPT-4o, Gemini (no API key required)
- DeepSeek V3 and Mistral Large
- Priority response queue
- Activate on up to 5 Macs
- Cancel any time
Stop writing context.
Start getting answers.
macOS 14 Sonoma or later. Apple Silicon and Intel. 14-day free trial. No account, no credit card, no setup. License key by email after purchase.