7 Things You Should Know When Using LLMs in 2024

Understand the pitfalls of LLMs


5 min read

7 Things You Should Know When Using LLMs in 2024

Generated with Microsoft Copilot

The year 2023 was the year of Large Language Models (LLMs)! No one has any doubt about that. Everyone was discussing the immense potential of generative AI, including LLMs, and for good reason. LLMs have revolutionized many areas in the industry, from customer service to image generation.

However, this powerful new technology also brings new challenges with it. In this article, we'll explore some common pitfalls we learned last year when working with LLMs.

We've no time to waste. Let's dive in!

#1 The problem with citing sources

You're probably familiar with it - you ask a chatbot like ChatGPT something, and the chatbot provides you with an answer. But where does the answer come from? Where are the sources?

Some chatbots return source information. This was not yet the case at the beginning of 2023. However, are these sources correct? No, LLMs cannot cite sources in the correct way. The reason for this is that some LLMs have no real-time access to the internet. Other so-called search-augmented LLMs like Microsoft Copilot have access to the internet. In addition, they do not have the ability to remember where their training data come from.

As a result, the outputs of LLMs provide sources that might be correct, but the sources are often wrong. We also see this limitation with search-augmented LLMs such as Microsoft Copilot. You should always be aware of this limitation. This is a huge problem, especially for scientific papers.

#2 Dependency on user prompts

Any LLM application is only as good as the user prompt! Of course, there are some techniques that developers use in order to be able to handle as much user input as possible. But this only works to a certain extent.

LLM applications work best when you give them clear and specific instructions. This way, the LLM will output the desired response. In addition, it reduces the probability of irrelevant or incorrect answers.

Keep in mind that a short prompt is not necessarily a clear prompt. In most cases, longer prompts provide more clarity and context for the model. Imagine you want to generate a picture with an image generator. Then, you have to provide the model with a lot of information so that the image shows what you want. A good user prompt is the result of several iterations.

📘 Get our e-book LangChain for Finance

#3 Hallucinations of LLMs

A big issue with LLMs is hallucinations. In this case, the LLM generates false information when you ask them a question. That's great in many domains where you need creative answers. It is not desirable in fact-based use cases.

This can lead to misinformation and could become a problem, especially in search engines such as Bing or Google. There are already approaches such as fact-checkers, although these only work to a limited extent.

#4 Consistency of the output

The output of LLMs is often not reliable and consistent. That's a problem when you plan to use an LLM in an existing workflow. There are many things you can do to achieve a more consistent output.

For example, you can use a prompt template. In addition, you should write clear and specific prompts. You should also set the temperature parameter to zero. You can use this parameter to control the level of randomness or creativity. A lower temperature results in more conservative LLM answers. A higher temperature leads to more unexpected or random LLM answers.

All these measures will tame the LLM, but there is no 100% guarantee that it will behave consistently.

#5 Bias in the training data

LLMs are trained on large datasets from the internet that can contain biased information. You can try to prevent this with safety measures. Nevertheless, it can happen that LLMs generate sexist or racist content. OpenAI, for example, offers the Moderation API, which gives developers ways to prevent this.

This issue is especially critical in applications for customer support or recruiting.

#6 Coding vs. Prompt Engineering

Some people say that prompt engineering could be the future of software development. That's definitely wrong. The future of software engineering is not Prompt Engineering! It's only a part of it!

We need a reliable way to build software projects. Programming languages are the most powerful tool for this. Natural Language isn't. The problem with LLMs for coding is that you'll always get different outputs. It's always a surprise. A good example is GPTEngineer. We wrote an article about this last year.

But that doesn't mean that you shouldn't use AI for coding. A software engineer with an AI as sparring partner is the most powerful combination. AI will not replace software engineers. A software engineer who uses AI will replace the others without AI.

#7 Prompt Hacking

Through Prompt Hacking, it is possible that users can "hack" LLMs to generate harmful content. That is critical, especially in public applications. It's important to know that when using LLMs.

There are many ways to reduce this behavior. But you can never avoid it at all. We wrote an article about Prompt Engineering for Developers. Check it out if you are interested.

That are the pitfalls we have learned while experimenting and working with LLMs. It is a lot of fun to work with LLMs. However, we need to address all these challenges to get more consistent outputs of LLMs.

✍🏽 What would you add? Write it in the comments. We are looking forward to it!

👉🏽 Join our free weekly Magic AI newsletter for the latest AI updates!

Did you enjoy our content and find it helpful? If so, be sure to check out our premium offer! Don't forget to follow us on X. 🙏🏽🙏🏽

Thanks so much for reading. Have a great day!

🔍 Useful Resources