Protect your AI privacy by using private LLMs: a simple step-by-step guide


We are living in a world in which both individual and corporate privacy erode day by day. We freely give up our data in exchange for convenience, but there is a huge price to be paid, at both the individual and the corporate level. It is one thing to run a Google search for your favourite Chipotle restaurant, and quite another to have rather intimate conversations with ChatGPT, or, even worse, for company devs to submit trade secrets in order to solve specific coding problems. There is an inverse relationship between convenience and privacy: the more convenient a system is, the less private and secure it will be. In today's age, both people and companies tend to sacrifice their privacy for convenience, but I believe this approach is often suboptimal, especially for companies in competitive markets that are trying to keep an edge. Luckily, from my recent interactions with BrainboxBI clients, I can sense a change of sentiment around privacy, indicating rising awareness in the space.

So I decided to write this short guide on how to run powerful open-source Large Language Models (LLMs) on your local machine, privately and securely by definition. You don't have to be at the mercy of OpenAI's CEO or the Chinese Communist Party if you are thinking of using either GPT-4o or some variant of a Deepseek model. For best performance I recommend a machine with at least 16 GB of RAM, 8+ CPU cores and an RTX 3060 graphics card, although you can still run LLMs on less capable computers, albeit much more slowly. In this post we are going to build a local, private and fully working LLM chatbot application on a Linux system using the smallest Deepseek model.

Theoretical background about the app

Our project will have a frontend for actual user interaction with the model, and a backend server for handling user requests and serving responses through an API endpoint. Normally it is good practice to host the backend and frontend separately. However, we will keep the backend and frontend in a single repository (i.e. use a monorepo approach), which is acceptable for smaller-scale projects. There are three main components to this app:

- Ollama, which is a platform infrastructure for managing and deploying open-source LLMs
- Next.js, which we will utilise for the frontend user interface (UI) for interacting with the LLM
- Node.js, which we will use as a backend that calls the Ollama JS API to stream the LLM text output (see the sketch after this list)
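
To make the backend's role concrete, here is a minimal sketch of the streaming idea: a bare Node HTTP endpoint that forwards a prompt to Ollama and writes tokens back as they arrive. This is an illustration of the pattern rather than the repo's exact code; it assumes the ollama npm package is installed (pnpm add ollama), and the /api/chat route and port 4000 are made up for the example.

// server.mjs: a minimal sketch, not the repo's exact code
import http from 'node:http';
import ollama from 'ollama';

const server = http.createServer(async (req, res) => {
  // Hypothetical route for this sketch; the real repo wires this up differently
  if (req.method === 'POST' && req.url === '/api/chat') {
    let body = '';
    for await (const chunk of req) body += chunk; // read the JSON request body
    const { prompt } = JSON.parse(body);

    // stream: true makes ollama.chat() return an async iterable of partial replies
    const stream = await ollama.chat({
      model: 'deepseek-r1:1.5b',
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    });

    res.writeHead(200, { 'Content-Type': 'text/plain; charset=utf-8' });
    for await (const part of stream) {
      res.write(part.message.content); // forward each token to the client as it arrives
    }
    res.end();
  } else {
    res.writeHead(404).end();
  }
});

server.listen(4000); // hypothetical port for the sketch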

Step 1: Installing Ollama

We will be using Ollama for fetching and running the Deepseek model. To install Ollama, update your system and run the installation script:
sudo apt update && sudo apt upgrade -y
curl -fsSL https://ollama.com/install.sh | sh

Now is the time to test the Ollama installation. Simply type `ollama` in the terminal and you should see the list of available commands. Next we want to pull the smallest Deepseek model:
ollama pull deepseek-r1:1.5b

Once the download is complete, you can get information about the model with:
ollama list
ollama show deepseek-r1:1.5b

NAME                ID              SIZE      MODIFIED
deepseek-r1:1.5b    a42b25d8c10a    1.1 GB    3 minutes ago

  Model
    architecture        qwen2
    parameters          1.8B
    context length      131072
    embedding length    1536
    quantization        Q4_K_M

  Parameters
    stop    "<|begin▁of▁sentence|>"
    stop    "<|end▁of▁sentence|>"
    stop    "<|User|>"
    stop    "<|Assistant|>"

  License
    MIT License
    Copyright © 2023 DeepSeek

As you can see, our model has 1.8B parameters and is 1.1 GB in size. Not bad for an LLM!

At this point you could start talking to the model straight away! To do that, run:
ollama run deepseek-r1:1.5b

Once you have had enough fun playing with the LLM through the command line, exit with the command:
/bye
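
Under the hood, `ollama run` talks to a local server that Ollama starts on port 11434, and your own scripts can call the same HTTP API. As a quick sanity check (runnable once Node is installed in Step 2), here is a tiny script that queries the model through Ollama's /api/generate endpoint; save it as check-ollama.mjs and run it with node check-ollama.mjs:

// check-ollama.mjs: quick sanity check of the local Ollama REST API
// Node 18+ ships fetch globally, so no extra dependencies are needed
const res = await fetch('http://localhost:11434/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'deepseek-r1:1.5b',
    prompt: 'Say hello in five words.',
    stream: false, // ask for one JSON object instead of a stream of chunks
  }),
});
const data = await res.json();
console.log(data.response); // the model's full reply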

Step 2: Install Node.js and pnpm package manager

Now it is time to install our Node environment. In your terminal run:

# Download and install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

# Source nvm manually for the current session if necessary
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"

# Install Node.js
nvm install 22

# Verify the Node.js version:
node -v
nvm current

# Install pnpm
npm install -g pnpm

# Check the pnpm version
pnpm -v

Step 3: Clone the monorepo directory

Now we will clone the GitHub monorepo that contains the backend and frontend. This repo is a fork I made of an original project, with minor modifications to the backend code.
# Install git if you don’t have it
sudo apt update
sudo apt install git

# Clone the monorepo
git clone https://github.com/msxakk89/local-llm.git

Step 4: Install dependencies

Change into the repository directory and install the dependencies:
cd local-llm
pnpm install

Step 5: Run the project in dev mode
pnpm dev

Step 6: Enjoy your local, private LLM!

To start using the app, open your browser and navigate to http://localhost:3000.
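
If you are curious how the streamed answer reaches the page, it boils down to reading the response body incrementally in the browser. Conceptually, the frontend's reading loop looks something like the sketch below (it assumes the hypothetical /api/chat endpoint from the earlier backend sketch, not the repo's exact code):

// Sketch: consuming a streamed text response in the browser
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: 'Why is the sky blue?' }),
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let answer = '';
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  answer += decoder.decode(value, { stream: true }); // append partial tokens
  // update the UI here with the growing answer
}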

Thank you

One thing that I'm curious about is whether the open-source Deepseek models are as good as the "gold standard" GPT-4o or other well-performing LLMs from the GPT family. To find out, I am planning to run a qualitative benchmark comparing Deepseek LLMs of incrementally increasing sizes against the off-the-shelf GPT-4o. The main areas I plan to explore are:

1) Coding tasks
2) Creative tasks
3) Mundane tasks

I will share what I find in the next post!

If you like this article, give it a thumbs up and feel free to connect with me on LinkedIn, visit my website and follow me on Medium.
