Prompt engineering tricks for LLMs - Blog by Grzegorz Kossakowski

Most people overcomplicate prompt engineering. There are 100-page-long prompt cookbooks that can be distilled into a few techniques that have the highest leverage. Below is my list of techniques developed for solving tasks in a knowledge- and reasoning-heavy domain, using GPT-4.

LLMs love bullet lists.

A great trick for leveraging an LLM’s “world model” is role-playing. Instead of giving detailed instructions and programming in English, you say “You’re a senior lawyer planning work for junior lawyers”, “You’re an AI agent excellent at correcting OCR errors”, “You’re an editor quoting material using Chicago style”, etc.
Long prompts (30k+ tokens) require anchors for LLMs to make sense of your inputs. You get a very large performance boost by structuring the inputs to your prompt with Markdown. Headers are especially useful. This might be the single most powerful technique for boosting GPT-4’s performance.
Chain of thought is known to work well when reasoning skills are required. You can make a chain of thought work across prompt invocations: the first prompt asks the LLM to make a plan for solving a problem or doing the work. The next LLM invocation(s) execute the plan. A side benefit, if you combine this with Markdown formatting, is that there’s a good chance the plan focuses on only some parts of the original document; you can discard the rest.
Give an LLM an “escape hatch”. LLMs are trained to follow instructions. What happens if you accidentally feed them the wrong input? By default, they will try to come up with some answer, no matter how nonsensical. Design your prompt so that the LLM can signal that something is off, e.g. include a “### Notes on the input document” section in the output where the LLM can comment on the input. You can quickly look at those notes and spot that something is off. This is especially powerful for prompts with long inputs.
A generalization of the “escape hatch” is an “internal notes” or “reflection” section in the output. It serves both as a space for “tokens to think” (Andrej Karpathy’s term) and as a debugging tool: you can look at the notes and see whether the LLM’s thinking process follows your intent behind the task.
Examples of the expected output work much better than detailed instructions.
You can prepare the output examples by sketching out the shape of what you want (e.g. the Markdown sections you want in the output), and letting the LLM do the low-level grunt work of filling in the blanks, and even brainstorming. Rerun the sketch prompt 10-20 times and you’ll get a great set of potential examples.
You can “transfer intelligence” between LLMs. Delegate coming up with 10 examples of solving tasks to a powerful model like GPT-4 or Claude Opus, and pass these as few-shot learning examples to a less powerful model like Haiku.
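To make a few of these tricks concrete, here is a minimal Python sketch that combines role-playing, Markdown-structured inputs, few-shot output examples, and the “escape hatch” section. It only builds the prompt string and parses a reply; it does not call any LLM API. The helper names `build_prompt` and `split_sections`, the “### Answer” header, the sample contract text, and the hard-coded sample response are my own assumptions for illustration, not part of any library.

```python
import re


def build_prompt(role, task, documents, examples=()):
    """Assemble a Markdown-structured prompt (hypothetical helper).

    documents: mapping of title -> text; each goes under its own header.
    examples:  example outputs to include as few-shot examples.
    """
    parts = [role, "", "# Task", task, "", "# Input documents"]
    for title, text in documents.items():
        parts += [f"## {title}", text.strip(), ""]
    if examples:
        parts.append("# Examples of the expected output")
        for i, example in enumerate(examples, start=1):
            parts += [f"## Example {i}", example.strip(), ""]
    # The "Notes" section is the escape hatch: a place for the model
    # to say that something about the input looks wrong.
    parts += [
        "# Output format",
        "Reply with exactly these two sections:",
        "### Notes on the input document",
        "(anything odd about the input: wrong document, missing pages, ...)",
        "### Answer",
        "(the result of the task)",
    ]
    return "\n".join(parts)


def split_sections(response):
    """Split a reply into {header: body} for its '### ' sections."""
    sections = {}
    current = None
    for line in response.splitlines():
        match = re.match(r"^###\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


prompt = build_prompt(
    role="You're a senior lawyer planning work for junior lawyers.",
    task="List the follow-up tasks implied by the contract.",
    documents={"Contract": "The supplier shall deliver the goods within 30 days."},
)

# A made-up reply, standing in for what a model might return.
sample_response = """### Notes on the input document
The input looks like a recipe, not a contract.
### Answer
No tasks could be derived."""

sections = split_sections(sample_response)
print(sections["Notes on the input document"])
```

In practice you would send `prompt` to whatever model you use and run `split_sections` on the real reply. A non-empty notes section is a cheap signal that the input deserves a second look before you trust the answer.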

Combined, these techniques let you squeeze way more from LLMs than might appear on the surface. Happy prompt engineering!