We're thrilled to announce the release of Nomic GPT4All v3.5.0, featuring important upgrades to both the user experience and GPT4All's core infrastructure.
A key feature in v3.5 is Chat Editing, which gives you precise control over your conversations. You can edit any message in your chat history to clarify an ambiguous question or refine an unclear response. For example, prompting with "Tell me about the llama family of llms" returns information about the animals; editing the prompt to "Tell me about the Llama family of LLMs" guides the chat toward Meta's language models instead.
Chat editing lets you explore different conversation paths by editing earlier messages, without needing to start over from the beginning. You can try variations of your message mid-conversation to guide the chat in new directions while keeping the existing chat context intact. To implement chat editing without sacrificing latency, GPT4All performs live surgery on the KV cache with every edit!
Step 1: Click edit on a message
Step 2: Enter your new message
Step 3: See your updated chat
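For the curious, here is a rough sketch of the KV cache handling described above: when a message is edited, the cache is rolled back to the last token shared with the original conversation, and only the tokens from the edit onward are re-evaluated. The names below are illustrative stand-ins, not GPT4All's real internals.

```python
# Illustrative sketch only: `ctx`, `kv_truncate`, and `eval_tokens` are
# hypothetical stand-ins, not GPT4All's actual API.

def common_prefix_len(a: list[int], b: list[int]) -> int:
    """Count the leading tokens the old and edited conversations share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def apply_chat_edit(ctx, cached_tokens: list[int], edited_tokens: list[int]) -> None:
    """Re-evaluate only what changed after an edit, reusing the KV cache."""
    keep = common_prefix_len(cached_tokens, edited_tokens)
    ctx.kv_truncate(keep)                  # discard cache entries past the edit point
    ctx.eval_tokens(edited_tokens[keep:])  # recompute just the edited suffix
```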
Prefix caching is a technique that brings a speed upgrade to LLM chats for any model you use in GPT4All. It lets a model reuse pre-computed attention results stored in its KV cache instead of recomputing them, which can significantly speed up the time-to-first-token of an LLM generation.
You don't need to do anything differently to get the benefits of prefix caching: it's a behind-the-scenes improvement to LLM generation for all models running in GPT4All!
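To see why this helps time-to-first-token, consider a long chat where only a short new message is appended. The token counts below are made up purely for illustration; the point is the amount of prompt processing saved.

```python
# Hypothetical illustration of prefix caching; token counts are invented.
system_and_history = list(range(1900))       # tokens already evaluated and sitting in the KV cache
new_user_message   = list(range(1900, 2000)) # 100 new tokens appended to the chat

full_prompt = system_and_history + new_user_message

# Without prefix caching: every prompt token is re-evaluated before generation starts.
work_without_cache = len(full_prompt)                          # 2000 tokens

# With prefix caching: only tokens not already in the KV cache are evaluated.
work_with_cache = len(full_prompt) - len(system_and_history)   # 100 tokens

print(f"prompt tokens processed without cache: {work_without_cache}")
print(f"prompt tokens processed with cache:    {work_with_cache}")
```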
As of this version, you can now attach .txt, .md (Markdown), and .rst (reStructuredText) files to your chats.
Previously, these file types could only be used through GPT4All LocalDocs collections. Now you can attach them directly to your chats!
We've added Jinja templating support for chat templates, which standardizes how GPT4All formats chats and makes customization simpler for developers. You can learn more about how to work with and customize these templates in our documentation.
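As a toy illustration (not the template GPT4All ships with), a Jinja chat template is just a small program that turns a list of messages into the prompt string a model expects. You could render one yourself with the jinja2 package:

```python
from jinja2 import Template

# A toy chat template in the style used by many Hugging Face models.
# Illustrative example only, not GPT4All's built-in template.
CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "<|{{ message['role'] }}|>\n{{ message['content'] }}\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|assistant|>\n{% endif %}"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about the Llama family of LLMs."},
]

prompt = Template(CHAT_TEMPLATE).render(messages=messages, add_generation_prompt=True)
print(prompt)
```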
Jinja templating enables broader compatibility with models found on Hugging Face and lays the foundation for agentic tool-calling support.
This release lays the groundwork for an exciting future feature: comprehensive tool calling support. The new templating infrastructure will enable seamless integration with external tools, direct data fetching capabilities, and safe code execution within your private LLM chats.