Prompt Engineering Articles

Android On-device AI Prompt Engineering: Token Budgets, Few-shot Compression, and TTFT Control

A practical guide to prompt engineering for on-device LLMs on Android, showing how token budgeting, few-shot template compression, and dynamic budget switching reduced time to first token (TTFT) from 8.7 seconds to under 2 seconds.

Prompt Cost Optimization: When to Write Long and When to Write Short

Detailed prompts are not always cheaper. This post weighs token pricing, context decay, and human effort to give a measurable way of deciding when a prompt should be long and when it should be short.

Prompt Engineering: From Core Principles to Frontier Practice

A practical guide to prompt engineering, covering KERNEL design principles, few-shot and Chain-of-Thought prompting, promptfoo evaluation, DSPy automation, and prompt injection defense.