
What makes large language models tick?



Understanding how LLMs work is crucial to getting the most from this powerful form of AI 

GPT-4, the most advanced LLM (a type of AI) from OpenAI, has enabled the development of professional-grade generative AI tools for several industries, including AI legal assistants such as CoCounsel. This latest generation of AI, more sophisticated and intuitive than ever and designed to respond to natural-language instruction, is increasingly relied upon by lawyers in practice.

But the newest AI is only as useful as its users are skilled. And that skill cannot develop without what we like to call AI literacy: evaluating and deciding how to apply LLM-powered tools requires at least a fundamental understanding of how LLMs work. In this two-part post, we explore AI prompting for legal professionals, with an emphasis on how to ensure accurate and consistent output in accordance with lawyers' professional and ethical obligations.

In part one, we discuss how LLMs “think” and why the quality of their output is contingent on the quality of the prompts they receive. In part two, we offer prompting tips for legal professionals to ensure optimal use of LLMs in practice.

What is an AI prompt?

At its most basic, an AI prompt is any instruction or request submitted to an AI. AI prompting has been part of many people's daily routines for years: those requests to Alexa or Siri are AI prompts.

The request, however, is only one of several parts of a prompt, and those parts depend on the type of AI being used. Different types of AI include additional engineering to ensure each request is correctly routed to the AI's specific functions. That engineering, or routing, depends on whether you're using a general-use AI like OpenAI's ChatGPT or a specific-use AI (we break down the difference here).

As an example, requests submitted to ChatGPT follow a simple route to GPT-4 (the LLM powering ChatGPT), while requests submitted to a specific-use AI like CoCounsel take a more complex route: the request is routed through several discrete functions, such as legal research (which consults a database of case law, regulations, and statutes) or document review (which reads each document provided). As a result, the specific-use LLM receives a much more sophisticated total prompt with multiple parts: the request plus domain-specific content, as well as additional back-end prompting.
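To make this concrete, here is a minimal sketch in Python of how a specific-use tool might assemble such a total prompt. The function name, system instructions, and retrieved excerpt are hypothetical illustrations of the general pattern, not CoCounsel's actual implementation.

```python
# A minimal sketch of assembling a "total prompt" for a specific-use AI tool.
# All names and content here are hypothetical illustrations of the pattern.

def build_total_prompt(user_request: str, retrieved_docs: list[str]) -> list[dict]:
    """Combine back-end instructions, domain-specific content, and the user's request."""
    system_instructions = (
        "You are a legal research assistant. Answer only from the "
        "provided sources and cite them."
    )
    # Domain-specific content, e.g., case law pulled from a research database
    domain_context = "\n\n".join(retrieved_docs)
    return [
        {"role": "system", "content": system_instructions},  # back-end prompting
        {"role": "user",
         "content": f"Sources:\n{domain_context}\n\nRequest: {user_request}"},
    ]

messages = build_total_prompt(
    "Summarize the standard for preliminary injunctions.",
    ["Winter v. NRDC, 555 U.S. 7 (2008): ..."],  # placeholder excerpt
)
```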

How do large language models “think”?

You now know the parts of an AI prompt. But why does the structure of a prompt matter so much when using LLMs? Knowing how LLMs "think," and their limitations, is key to understanding why the quality of their output depends on the quality of the prompts they receive. This understanding also helps you get the best possible output rather than mediocre, half-useful responses.

Today's LLMs can perform at a human level on various professional and academic benchmarks (GPT-4 has passed the bar exam), but even the most advanced LLMs are less capable than humans in important ways. For one, LLMs lack the full abstract reasoning capabilities that humans have.

LLMs are pattern-recognizing and pattern-generating machines that have been trained on billions of data points and can generate novel content as a result of that training (hence the name "generative AI"). An LLM tries to predict what a human might conclude, based on the data it's trained on.

It's for precisely this reason that prompting is so important. If you've used AI voice assistants like Alexa or Siri, you know the clarity and specificity of your prompts matter. With LLMs, grammar is a particularly important factor: the wording and punctuation you choose significantly affect the clarity of your prompts. And as lawyers, you already know that failure to use punctuation properly, especially commas, can alter the meaning of language or result in costly ambiguity or misinterpretation.

Let’s say you want to use AI to review a document and find references to Apple. You submit the following prompt: Does the document contain references to apple? 

This prompt may be too ambiguous for an LLM. The LLM cannot discern whether you're asking it to find references to the fruit or to Apple products such as the iPhone. LLMs are predictive models, and while an LLM might correctly predict what you intended (the brand Apple), this ambiguity leaves room for inconsistent or inaccurate results.

Now imagine you make a different request: Does the document contain apple pie? The AI understands you're referring to a dessert, as opposed to the fruit or the brand Apple. You've communicated a more specific message that reduces ambiguity. Asking "Does the document contain apple pie recipes?" provides even more context to the AI (for instance, that it is searching a recipe book for a specific recipe).
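For the technically inclined, here is a minimal sketch of how you might compare the ambiguous and specific phrasings side by side using OpenAI's Python client. The document text and prompt wording are placeholders, not a prescribed approach.

```python
# A minimal sketch contrasting an ambiguous prompt with a specific one,
# using OpenAI's Python client. The document text is a placeholder.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
document = "..."   # the document under review

ambiguous = "Does the document contain references to apple?"
specific = (
    "Does the document contain references to Apple Inc., the technology "
    "company (e.g., iPhone, MacBook)? Ignore references to the fruit."
)

for prompt in (ambiguous, specific):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Document:\n{document}\n\n{prompt}"}],
    )
    print(response.choices[0].message.content)
```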

These examples illustrate how LLMs think and why the specificity of your prompt matters. There are also limits on context, or how much information an LLM can handle. When humans read, we read letters and words as separate units, each individually assigned a particular meaning; the words, taken together, communicate a more sophisticated message. When LLMs read, they break language down into a series of tokens, which likewise form meaning when taken together.

LLMs can only consider a limited number of tokens at any given time, and tokens vary greatly in length, ranging from one character to one word. As a result, LLMs can only handle a limited amount of context. An LLM's limit on the amount of information retained in its memory at one time is known as its context window, and its context window can significantly impact the quality of the output you receive.
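You can see tokenization in action with OpenAI's tiktoken library. This short sketch counts and displays the tokens in a sample prompt, showing that token boundaries rarely line up neatly with words.

```python
# A minimal sketch of tokenization using OpenAI's tiktoken library,
# illustrating that tokens range from single characters to whole words.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
text = "Does the document contain apple pie recipes?"
tokens = enc.encode(text)

print(len(tokens))                        # number of tokens the model "sees"
print([enc.decode([t]) for t in tokens])  # token boundaries rarely match words exactly
```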

Context windows are always in motion. When a context window is full and new tokens (information) arrive, the LLM releases the oldest tokens in the window to make room for the new ones. Once this older information is expelled, it's completely forgotten by the AI. This limited memory is another way the AI falls short of human capability.
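Here is a minimal sketch of that first-in, first-out behavior. The token budget and trimming logic are simplified illustrations, not any vendor's actual implementation.

```python
# A minimal sketch of a first-in, first-out context window: when the token
# budget is exceeded, the oldest messages are dropped and effectively forgotten.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the conversation fits the context window."""
    while messages and sum(len(enc.encode(m)) for m in messages) > max_tokens:
        messages = messages[1:]  # the expelled message is gone for good
    return messages
```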

This limit on the AI's ability to retain information affects how we should interact with it. Think of your prompts in two categories: requests and refinements. Requests are the initial instructions and queries to the AI; refinements are follow-up directions, additional details, or corrections that fine-tune the AI's output.
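A minimal sketch of the request-and-refinement pattern, again using OpenAI's Python client; the prompts here are hypothetical examples. The refinement is sent in the same conversation, so the model still has the original request in its context window.

```python
# A minimal sketch of the request/refinement pattern. The refinement rides in
# the same message history, so the request stays within the context window.
from openai import OpenAI

client = OpenAI()
messages = [
    # Request: the initial instruction
    {"role": "user", "content": "Summarize the attached lease's termination clauses."},
]
first = client.chat.completions.create(model="gpt-4", messages=messages)
messages.append({"role": "assistant", "content": first.choices[0].message.content})

# Refinement: a follow-up that fine-tunes the output
messages.append({"role": "user", "content": "Limit the summary to early-termination fees."})
second = client.chat.completions.create(model="gpt-4", messages=messages)
print(second.choices[0].message.content)
```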

The golden rule for writing AI prompts

Humans are capable of translating imprecise instructions into actionable directions (think back to the Apple example). They also benefit from longer-term memory, or a much larger context window. AI has a much shorter context window but a much more expansive knowledge base than humans.

An LLM won't perform well in the face of ambiguity. It requires specific instructions within its comparatively limited context window. For this reason, it's important to be intentional and to write clear, unambiguous questions to get accurate output. And the amount and type of words you include impact your results, too.

If you take away one rule about AI prompting, it should be this: How you write your request will determine how the prompt is interpreted. 

Today's unprecedented AI is unlike most traditional technology in that it knows how to read, comprehend, and write, and it can learn. Learning how to prompt an LLM is analogous to getting to know a new colleague: LLMs need our human experience and direction, in the form of effective instructions and queries, to produce the right results.

In our next post, we’ll share specific AI prompting techniques for lawyers.


