How to easily add unique, persistent knowledge to an AI chatbot
ChatGPT GPTs, Claude Projects, Perplexity Spaces & NotebookLM
Out of the box, Large Language Models (LLMs) produce generic responses based on the model of language they’ve distilled from patterns within their training data (plus some fine-tuning from their handlers).
Retrieval Augmented Generation (RAG) has emerged as an effective technique for getting LLMs to consult an external knowledge store before returning a response to the user.
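To make the idea concrete, here is a toy sketch of the retrieve-then-generate pattern in Python. It is not how any of the products below actually implement it: real systems typically use embeddings and a vector index, whereas this sketch ranks passages by simple word overlap. The function names and the prompt wording are my own assumptions for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, passage: str) -> int:
    """Count the word occurrences shared by the query and the passage."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[word], p[word]) for word in q)


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return up to k passages that share at least one word with the query."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return [p for p in ranked[:k] if score(query, p) > 0]


def build_prompt(query: str, passages: list[str], k: int = 2) -> str:
    """Put the retrieved passages in front of the question (hypothetical wording)."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The string returned by `build_prompt` is what would be sent to the LLM in place of the bare question, which is why the model's answer can draw on documents it was never trained on.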
AI companies are increasingly using RAG (and similar techniques) to enable users to create tailored versions of their chatbots that draw on documents provided by the user, often accompanied by the option to add custom instructions specific to that instance (e.g. asking the AI to adopt a particular role or tone).
ChatGPT has named these ‘GPTs’, Claude ‘Projects’, Perplexity ‘Spaces’ and NotebookLM, er, ‘notebooks’.
So, how do you add knowledge to each and what are their relative strengths and limitations?
To test them out, I added the same knowledge to all of them (the 79 posts I’ve written on this blog over the last 3 years - around 62,000 words) and asked them the same question: “What have I previously written about custom GPTs?”
ChatGPT GPTs
Whilst anyone can access custom GPTs, creating them requires a paid plan, which starts at $20 per month (for the Plus plan).
Here’s how to create a GPT and add knowledge for it to draw upon:
Go to chatgpt.com
Click on your profile image (top right) and select ‘My GPTs’
Select ‘Create a GPT’
Give your GPT a name and add Instructions detailing how you want it to behave (e.g. “You are an experienced researcher. Thoroughly check the uploaded knowledge and provide accurate responses using only that information”)
Scroll down to the ‘Knowledge’ section and click ‘Upload files’
Upload up to 20 files. Each file can be up to 512MB. These limits mean it’s a good idea to consolidate text into a single file where possible (I uploaded the 79 blog posts in one doc)
Decide whether you want the GPT to be able to browse the web or not and check/uncheck ‘Web Browsing’ in the Capabilities section
Click ‘Additional Settings’ and uncheck ‘Use conversation data in your GPT to improve our models’
Click ‘Create’ and select ‘Only me’ (unless you want to share with others in which case select ‘Anyone with the link’)
Start interrogating ChatGPT’s newly acquired knowledge
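If your posts live as separate text files, the consolidation suggested in the upload step above could be scripted rather than done by hand. This is a minimal sketch, assuming a folder of UTF-8 `.txt` files. The `consolidate` function, the folder layout and the `# title` separator are my own illustrative choices, not anything ChatGPT requires.

```python
from pathlib import Path


def consolidate(source_dir: str, output_file: str, pattern: str = "*.txt") -> int:
    """Merge every file matching `pattern` in `source_dir` into one text file.

    Each file's contents are preceded by a heading taken from its filename.
    Returns the number of files merged.
    """
    out_path = Path(output_file).resolve()
    # Sort for a stable order and skip the output file if it sits in the same folder.
    files = sorted(
        p for p in Path(source_dir).glob(pattern) if p.resolve() != out_path
    )
    sections = []
    for path in files:
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"# {path.stem}\n\n{body}")
    out_path.write_text("\n\n".join(sections) + "\n", encoding="utf-8")
    return len(files)
```

Calling `consolidate("posts", "all-posts.txt")` (both names hypothetical) would produce a single file to upload. Keeping a title line above each post may also make it easier for a chatbot to name the post it is drawing on, though I haven't tested that.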
ChatGPT gave an accurate and succinct response (one of my overall custom instructions is “be succinct”), but didn’t provide any citations or reference the titles of the relevant blog posts.
Best for: large files, analysing data sources (e.g. spreadsheets)
Limitations: 20 files per GPT, accepts audio and video uploads but can’t make sense of them
NB. By default, ChatGPT uses your interactions for AI model training. To opt-out go to Settings > Data controls and toggle off ‘Improve the model for everyone’.
Claude Projects
Projects are only available to paid subscribers. The cheapest plan (Pro) is £18 a month.
Here’s how to create a Project with custom knowledge:
Go to claude.ai
Select ‘Create Project’ (under the Projects section of the main menu)
Give your project a name
Describe what you want to achieve and click ‘Create Project’
In the ‘Project Knowledge’ box, click ‘Add Content’ (‘Add Knowledge’ in the mobile app) and upload text-based documents under 30MB
Add any custom instructions detailing how you would like Claude to respond in chats within the project
Start interrogating Claude’s newly acquired knowledge
Claude produced an accurate and clearly formatted response and was the only chatbot to reference the titles of the relevant blog posts. However, it didn’t provide any citation links.
Best for: smaller text-based projects, clear formatting of responses
Limitations: no web access, meagre file size/token limits, sharing only possible on the Team plan
Perplexity Spaces
Spaces are only available to paid subscribers. Perplexity Pro costs $20 per month.
Here’s how to create a Space with custom knowledge:
Go to perplexity.ai
Select ‘Spaces’ from the left-hand menu and click ‘Create a Space’
Title your Space
Add Custom Instructions that describe how you want the AI to respond in that Space
Click on your newly-created Space and click the + icon under Files
Upload files under 25MB
Start interrogating Perplexity’s newly acquired knowledge
Perplexity gave a thorough response to my question with citations, although it delivered it as two chunky paragraphs (something I could address by adding a new custom instruction).
Best for: the ability to toggle web search off and on
Limitations: 25MB file size limit
NB. By default, Perplexity uses your interactions for AI model training. To opt-out go to Settings > Account and toggle ‘AI Data Retention’ off.
NotebookLM notebooks
Google’s flagship AI chatbot, Gemini, recently rolled out a feature called ‘Gems’, which is equivalent to ChatGPT GPTs and Claude Projects. However, it doesn’t yet support document upload. Fortunately, Google also has a standalone product, NotebookLM, which is tailored to research/document analysis and does support uploads.
Here’s how to add files to a Notebook:
Go to notebooklm.google.com
Click/tap ‘New Notebook’
Upload up to 50 files (each under 200MB) or paste in text
Start interrogating your notebook’s newly acquired knowledge
NotebookLM created a clearly formatted response with citations (which preview the relevant section on hover) and key words in bold, but didn’t give me the titles of the relevant posts. It also incorrectly stated that GPTs “do not impact ChatGPT's default behavior or responses unless Memory is activated” - a handy reminder that none of these tools are hallucination-free.
Best for: ease of use, analysing audio, citation previews
Limitations: no web search, won’t accept MS Office files, Google Sheets or CSVs, custom instructions can only be applied to audio overviews (not text chat)
What about Microsoft?
Microsoft launched Copilot GPT Builder for Copilot Pro subscribers in January but then retired it in July. Microsoft 365 Business or Enterprise customers can create ‘agents’ with custom knowledge stores using Copilot Studio but prices start at $360 per user per year and there’s no equivalent feature for individual Copilot Pro subscribers.
Conclusions
If you’re wanting to experience the power of an LLM when applied to your own document set without parting with any cash then NotebookLM is a great place to start. It’s easy to use, it can cope with chunky file sizes, it can analyse audio as well as text and Google pinky promise they won’t use your data for model training.
However, if you’re wanting your chatbot to adopt a particular tone or role or to search the web as well as the documents you’ve provided, you’re going to need to pony up some cash.
If you don’t need web search and you’re only looking to add a small amount of text, then Claude Projects is a good option.
If your document set includes spreadsheets then ChatGPT GPTs are your best bet.
If you’re wanting to easily toggle web search on and off then Perplexity Spaces is the way to go.
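For anyone who prefers their decision trees in code, the recommendations above can be sketched as a small Python filter. The limits and flags are the ones quoted in this post and may well have changed since. The `Tool` class and `candidates` function are my own illustrative names, and Claude's token limit isn't modelled because I haven't given a figure for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    max_files: Optional[int]  # None where this post doesn't state a file-count limit
    max_file_mb: int          # per-file limit, treated as inclusive for simplicity
    free: bool                # can you create one without a paid plan?
    web_search: bool


# Figures as quoted in this post, not checked against current documentation.
TOOLS = [
    Tool("ChatGPT GPTs", 20, 512, free=False, web_search=True),
    Tool("Claude Projects", None, 30, free=False, web_search=False),
    Tool("Perplexity Spaces", None, 25, free=False, web_search=True),
    Tool("NotebookLM", 50, 200, free=True, web_search=False),
]


def candidates(
    file_sizes_mb: list[float],
    need_web_search: bool = False,
    free_only: bool = False,
) -> list[str]:
    """Return the names of the tools that could take these files."""
    names = []
    for tool in TOOLS:
        if need_web_search and not tool.web_search:
            continue
        if free_only and not tool.free:
            continue
        if tool.max_files is not None and len(file_sizes_mb) > tool.max_files:
            continue
        if any(size > tool.max_file_mb for size in file_sizes_mb):
            continue
        names.append(tool.name)
    return names
```

For example, `candidates([100])` leaves only ChatGPT GPTs and NotebookLM, since a single 100MB file is over the Claude and Perplexity limits quoted above.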
Here’s a summary of the cost and capabilities of the different options:
Dan’s Media & AI Sandwich is free to read. If you’ve found this post interesting, please consider liking and/or sharing it. If you’d like to chip in to support my writing, you can now do so by becoming a paid subscriber. Contributions make it easier for me to dedicate more time to writing. Thanks to those of you who’ve already become paid subscribers. If you’d prefer to make a one-off contribution you can do so at buymeacoffee.com/dantaylorwatt.