Uploading documents can be a great way to give your bot important information without overwhelming it with data it may not need for every response. Uploads are powerful, but we need to be careful about how much data we ask AI to process. As in most cases, I recommend starting small and making sure AI is pulling in data properly from your documents. This article shows you, step by step, where to look to figure out how AI used your knowledge storage. If you want a deeper dive into how knowledge uploads work, check out this article.
When you click a message in the dashboard, you can see each stage the bot went through to build its response. Here is an overview of those stages and how to use them to troubleshoot the information your bot pulled in as context.
Summarize Needs
This stage summarizes the needs of the contact. We use this summary later to compare against the text that is brought in. Here I can see that even though my question, "what is hyde?", is pretty vague, the AI did a good job of summarizing my needs.
Troubleshooting: The first step is to check whether this summary accurately captures what the contact is asking for.
Library Summary
Next, the system looks at the conversation and pulls in the parts of your uploaded text that it thinks are relevant. It pulls that text in as chunks. This relevant information becomes the CONTEXT, which the system then attempts to summarize based on the result of the Summarize Needs stage above.
Troubleshooting: Check what is being pulled in as CONTEXT and make sure it is relevant. If it's not what you're expecting, you may have too much irrelevant information in your library uploads. If you don't have this section at all, maybe you forgot to attach your uploads to your bot or tried to scrape a web page that wasn't accessible.
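To get a feel for why irrelevant text in your uploads can crowd out the good stuff, here is a small Python sketch. It is not how the platform actually selects chunks (that isn't described here); it assumes a simple word-overlap score as a stand-in, and the function names and sample chunks are made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rank_chunks(needs_summary, chunks, top_k=2):
    """Score each chunk by the fraction of needs-summary words it contains.

    Assumption: plain word overlap stands in for whatever relevance
    scoring the real system uses.
    """
    needs = tokenize(needs_summary)
    scored = []
    for chunk in chunks:
        overlap = len(needs & tokenize(chunk))
        score = overlap / len(needs) if needs else 0.0
        scored.append((score, chunk))
    # Highest score first; the sort is stable, so ties keep upload order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Invented sample data: one useful chunk and two pieces of page clutter.
needs_summary = "The contact wants to know what Hyde is"
chunks = [
    "Hyde is the example product described on the scraped page.",
    "Subscribe to our newsletter for weekly deals.",
    "Cookie policy: we use cookies to improve your experience.",
]

top = rank_chunks(needs_summary, chunks)
for score, chunk in top:
    print(f"{score:.3f}  {chunk}")
```

Notice that the second slot goes to newsletter clutter simply because it shares a common word with the summary. The more filler your uploads contain, the more often something like that can end up in CONTEXT.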
Main
Here in Main we can see where our summary has been injected into the Main prompt and processed to give us our final answer.
Troubleshooting: Make sure you see the result from the Library Summary stage here and verify that it's an accurate summary of the CONTEXT that the AI found previously.
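The three troubleshooting checks can be written down as a quick checklist. The Python sketch below assumes you have copied the text of each stage out of the dashboard into a dictionary; the key names (`summarize_needs`, `library_context`, `library_summary`, `main_prompt`) are hypothetical and not part of any real export format. It only checks that each piece is present, so you still need to read each stage yourself to judge whether it is accurate.

```python
def first_stage_to_check(stages):
    """Return the name of the first stage that looks incomplete, or None.

    `stages` is a dict of text copied by hand from the dashboard.
    All key names here are hypothetical.
    """
    needs = stages.get("summarize_needs", "").strip()
    context = stages.get("library_context", "").strip()
    library_summary = stages.get("library_summary", "").strip()
    main_prompt = stages.get("main_prompt", "")

    if not needs:
        return "Summarize Needs"
    if not context:
        # No CONTEXT at all: uploads may not be attached to the bot,
        # or a scraped page may not have been accessible.
        return "Library Summary"
    if not library_summary or library_summary not in main_prompt:
        # The summary should appear inside the Main prompt.
        return "Main"
    return None

# Invented example: CONTEXT was found, but its summary never reached Main.
example = {
    "summarize_needs": "The contact wants to know what Hyde is.",
    "library_context": "Hyde is the example product described on the page.",
    "library_summary": "Hyde is the example product.",
    "main_prompt": "You are a helpful assistant.",
}
print(first_stage_to_check(example))
```

Working through the stages in order matters: a bad needs summary will throw off everything after it, so there is no point inspecting Main until the earlier stages look right.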
Conclusion
If AI is not pulling in information accurately, we need to check our uploads. The most common way things go wrong is the formatting of the document. Direct text input will always be the best option, not only because text is easier for AI to process but also because it will be shorter and more direct. Remember, in all our prompting we want to give AI only the essentials it needs to know and nothing more.
CSV files and PDF files are usually easy for AI to process, but they tend to contain very large amounts of data, which can cause issues. For this article, the document I used came from a scraped website, and in my testing it pulled in relevant data properly. Websites vary in reliability depending on how "messy" the site is with pictures, ads, and scattered information. When uploading scraped sites or any other documents, it's important to test thoroughly and make sure the information is accurate. Now you can always refer to this article to play detective and understand AI!