Introducing the GPT4All Reasoning System and the GPT4All Reasoner v1 model, bringing cutting-edge inference-time compute to on-device AI. This Reasoning release strengthens local AI, empowering users with advanced features such as Code Interpreter, Tool Calling, and Code Sandboxing, all running securely on your own hardware.
Inference-time compute enables LLMs to iterate on their outputs during execution, improving reasoning, accuracy, and context comprehension. Previously limited to server-side LLMs, this technique is now available directly on your laptop, unlocking a new level of on-device AI performance.
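At its core, the loop is simple: the model generates, any tool call it emits is executed, and the result is fed back so the model can refine its answer. Here is a minimal sketch of that iteration in JavaScript; `generate` and `runInSandbox` are hypothetical, caller-supplied stand-ins, not GPT4All's actual internal API:

```javascript
// Minimal sketch of an inference-time compute loop (hypothetical API,
// not GPT4All's internals). `generate` produces the model's next turn;
// `runInSandbox` executes a tool call in isolation.
async function reasonWithTools({ generate, runInSandbox, prompt, maxSteps = 5 }) {
  let transcript = prompt;
  for (let step = 0; step < maxSteps; step++) {
    const turn = await generate(transcript);
    const code = extractCode(turn);           // did the model emit a tool call?
    if (code === null) return turn;           // no tool call: this is the answer
    const result = await runInSandbox(code);  // execute in the isolated sandbox
    transcript += `${turn}\nTool result: ${result}\n`; // let the model iterate
  }
  return transcript; // stop after maxSteps rounds of iteration
}

// Pull the body out of the first fenced JavaScript code block, if any.
function extractCode(text) {
  const match = text.match(/```javascript\s*([\s\S]*?)```/);
  return match ? match[1] : null;
}
```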
How many 'r's are there in the word strawberry?
This release introduces the GPT4All Javascript Sandbox, a secure and isolated environment for executing code tool calls. When using Reasoning models equipped with Code Interpreter capabilities, all code runs safely in this sandbox, ensuring user security and multi-platform compatibility.
The Javascript Sandbox is the backbone of tool-based workflows in GPT4All, enabling safe, isolated execution of model-generated code and consistent behavior across every platform GPT4All supports.
To start using the new capabilities of GPT4All Reasoner v1:

1. Download and install the latest version of GPT4All.
2. From the Models page, download the GPT4All Reasoner v1 model.
3. Select Reasoner v1 in a new chat and start asking questions that benefit from tool use.
Note: GPT4All Reasoner v1 is a modified version of Qwen2.5-Coder 7B adapted to work with the GPT4All Reasoning System.
The following examples illustrate tasks that GPT4All's Reasoning system accomplishes but that the underlying 7B-parameter post-trained model cannot perform on its own.
How many days left until Christmas?
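Date arithmetic is unreliable for a model that can only predict tokens, but it is trivial once the model can run code. Here is an illustrative example of the kind of snippet a Reasoning model can hand to the sandbox (not the model's verbatim output):

```javascript
// Compute the number of days from today until the next December 25th.
const now = new Date();
let christmas = new Date(now.getFullYear(), 11, 25); // month 11 = December
if (christmas < now) {
  christmas = new Date(now.getFullYear() + 1, 11, 25); // already passed this year
}
const msPerDay = 24 * 60 * 60 * 1000;
console.log(`${Math.ceil((christmas - now) / msPerDay)} days until Christmas`);
```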
Counting the number of 'r's in 'strawberry'-related queries.
Counting is easy when you can tool call into the code sandbox!
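For instance, the model can emit a snippet like the following, counting the letters directly rather than guessing from its tokenized view of the word (illustrative, not verbatim model output):

```javascript
// Count occurrences of a letter by inspecting each character explicitly.
const word = "strawberry";
const count = [...word].filter((ch) => ch === "r").length;
console.log(`'${word}' contains ${count} 'r's`); // 3
```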
Approximating Integrals
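Instead of recalling an integral's value, the model can compute it numerically in the sandbox. As an illustration of the kind of program involved (the integral here is our choice, not necessarily the demo's), a trapezoidal-rule approximation:

```javascript
// Approximate the integral of f over [a, b] with the trapezoidal rule.
function trapezoid(f, a, b, n = 10000) {
  const h = (b - a) / n;
  let sum = (f(a) + f(b)) / 2;
  for (let i = 1; i < n; i++) sum += f(a + i * h);
  return sum * h;
}

// Integral of exp(-x^2) from 0 to 1; the true value is about 0.746824.
console.log(trapezoid((x) => Math.exp(-x * x), 0, 1));
```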
Playing with Prime Numbers
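Prime-number questions likewise reduce to running a classic algorithm. An illustrative, sandbox-sized example using the Sieve of Eratosthenes:

```javascript
// Sieve of Eratosthenes: list all primes up to a given bound.
function primesUpTo(limit) {
  const isPrime = new Array(limit + 1).fill(true);
  isPrime[0] = isPrime[1] = false;
  for (let p = 2; p * p <= limit; p++) {
    if (!isPrime[p]) continue;
    for (let m = p * p; m <= limit; m += p) isPrime[m] = false; // mark multiples
  }
  return isPrime.flatMap((prime, n) => (prime ? [n] : []));
}

console.log(primesUpTo(50)); // [2, 3, 5, 7, 11, 13, ..., 47]
```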
Synthetic Data Generation with a Remote Endpoint
You can use the GPT4All Reasoning system in conjunction with any model hosted at an OpenAI-compatible endpoint. To set this up, add the endpoint as a remote model provider in GPT4All, supply its base URL and API key, and then select the remote model in a new chat.
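For reference, "OpenAI-compatible" means the endpoint accepts a standard chat-completions request. A minimal sketch of such a request; the base URL, model name, and API key below are placeholders to replace with your provider's values:

```javascript
// Placeholder endpoint, model name, and API key: substitute your provider's.
const response = await fetch("https://api.example.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.API_KEY}`,
  },
  body: JSON.stringify({
    model: "your-model-name",
    messages: [{ role: "user", content: "Generate one synthetic FAQ entry." }],
  }),
});
const data = await response.json();
console.log(data.choices[0].message.content);
```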
On-device inference-time scaling improves local LLM capabilities without increasing model size. Any open-source language model can be configured to work with the GPT4All Reasoning system.
The first version of Reasoning demonstrates that small language models, equipped with inference-time compute infrastructure, punch far above their parameter class, accomplishing tasks usually reserved for larger models.
Subsequent versions will introduce expanded tool-use capabilities and improvements to inference-time iteration algorithms.
We do not provide a comprehensive quantitative evaluation of the system at this time but hope to do so in the near future.