Running the llama3 Large Language Model Locally with Ollama and Open WebUI

Author: YiYu
Published: May 8, 2024
Category: AI

Learn AI series:
[1. Setup ubuntu server 20.04 for machine learning](https://www.helloyiyu.com/index.php/ai/28.html)
[2. Try run PyTorch Quickstart demo](https://www.helloyiyu.com/index.php/ai/32.html)
[3. Running the llama3 Large Language Model Locally with Ollama and Open WebUI](https://www.helloyiyu.com/index.php/ai/39.html)

---

### Introduction to Ollama

![ollama logo.png](https://www.helloyiyu.com/usr/uploads/2024/05/298398074.png)

Project repository: [ollama](https://github.com/ollama/ollama)
Official website: [Ollama](https://ollama.com/)
Model library: [library (ollama.com)](https://ollama.com/library)

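Ollama is an open-source tool written in Go for running large language models. It uses llama.cpp as its inference backend, so think of it as a Go wrapper around llama.cpp rather than a rewrite. It supports mixed GPU/CPU execution, so you can choose a quantized version of a model that fits your laptop's GPU, GPU memory, CPU, and RAM.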

Installing Ollama is straightforward. The [official site](https://ollama.com/download/linux) provides installation methods for each platform:

![download_ollama.png](https://www.helloyiyu.com/usr/uploads/2024/05/2237377114.png)

If you are familiar with Docker, using the [official Docker image](https://ollama.com/blog/ollama-is-now-available-as-an-official-docker-image) is even more convenient.

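### Deploying Ollama Locally (CPU only)

I was not sure whether later steps would involve a Python environment, so I first created a conda env to keep things isolated. In hindsight, this step is not needed.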

```shell
conda create -n ollama python=3.10
conda activate ollama
```

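Start Ollama with `docker run`. My machine only has a very basic graphics card that is good for little more than driving a display, so I use the CPU-only command here: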

```shell
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

The first time I ran Docker on Ubuntu Server, I hit the following error:

```shell
(ollama) yiyu@eastforest:~$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Head "http://%2Fvar%2Frun%2Fdocker.sock/_ping": dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
```

The cause is that the current user has no permission to access the Docker daemon. The fix is to add the user to the `docker` group:

```shell
(ollama) yiyu@eastforest:~$ sudo gpasswd -a ${USER} docker
[sudo] password for yiyu:
Adding user yiyu to group docker
```

Log out of the current shell and log back in so the group change takes effect, then run Ollama again:

```shell
(ollama) yiyu@eastforest:~$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Unable to find image 'ollama/ollama:latest' locally
latest: Pulling from ollama/ollama
3c645031de29: Pull complete
2fc4741feb27: Pull complete
8ce449dca7ea: Pull complete
Digest: sha256:03072b8f3e80777ecd1dabeca80e7af32ed85f59bbcd60b6b41b233951c40faba
Status: Downloaded newer image for ollama/ollama:latest
5c2236d1770523163e9105cff8c90588a2cb9f5df893c6858e3e60e83c26d604
```
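
If the permission error still appears after logging back in, it helps to confirm that the new group is active in the current session. A minimal sketch: `has_group` is a helper name made up for this example, while `id -nG` is the standard command that prints the session's group names.

```shell
# Succeeds if the space-separated group list in $1 contains the group named in $2.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# `id -nG` prints the groups of the current login session.
if has_group "$(id -nG)" docker; then
  echo "docker group is active in this session"
else
  echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```

`newgrp docker` starts a subshell with the group applied, which avoids a full re-login.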

Once Ollama is up, open `server_ip:11434` in a browser. You should see a message confirming that it is running:

```
Ollama is running
```
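
The same check can be scripted from a terminal. A minimal sketch, assuming `curl` is installed and Ollama is reachable at `http://localhost:11434`; `OLLAMA_URL` and `ollama_is_up` are names made up for this example.

```shell
# OLLAMA_URL is an assumed variable name; point it at your own server.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Succeeds if the root endpoint at $1 answers with the "Ollama is running" banner.
ollama_is_up() {
  # -s: silent, -f: fail on HTTP errors, --max-time: give up after 5 seconds
  curl -sf --max-time 5 "$1" | grep -q "Ollama is running"
}

if ollama_is_up "$OLLAMA_URL"; then
  echo "Ollama is up at $OLLAMA_URL"
else
  echo "Ollama is not reachable at $OLLAMA_URL"
fi
```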

With Ollama running, the next step is to get familiar with the basic `ollama` commands:

```
ollama list: list the local models
ollama show: show information about a model
ollama pull: pull a model
ollama push: push a model
ollama cp: copy a model
ollama rm: remove a model
ollama run: run a model
```
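
Because Ollama runs inside a container in this setup, each of these commands has to be passed through `docker exec`. A small wrapper can save typing. This is only a sketch: the function name `ollama_in_docker` and the `OLLAMA_CONTAINER` variable are assumptions for illustration, and the default `ollama` matches the `--name` used in the `docker run` command above.

```shell
# Hypothetical helper: forwards its arguments to the ollama CLI inside the container.
# OLLAMA_CONTAINER is an assumed variable; it defaults to the container name "ollama".
ollama_in_docker() {
  docker exec -it "${OLLAMA_CONTAINER:-ollama}" ollama "$@"
}
```

With this in place, `ollama_in_docker list` runs `ollama list` inside the container. The `-t` flag needs an interactive terminal, so drop it when calling the helper from a script.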

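Running a model takes a single `ollama` command. The available models are listed at [library (ollama.com)](https://ollama.com/library).

Take [llama3](https://ollama.com/library/llama3:8b) as an example. Models with different parameter counts have different memory requirements. In the figure below, 8b and 70b stand for 8 billion and 70 billion parameters, and the two models need 4.7 GB and 40 GB of storage respectively.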
![llama3_library.png](https://www.helloyiyu.com/usr/uploads/2024/05/2221836207.png)
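
The storage figures can be sanity-checked with simple arithmetic: file size is roughly the parameter count times the bits per weight, divided by 8. The sketch below assumes about 4.5 bits per weight, which is typical for 4-bit quantization with per-block scale factors. The exact quantization of a given tag may differ, and `estimate_gb` is a made-up helper name.

```shell
# Rough size in GB of a quantized model:
#   parameters (in billions) x bits per weight / 8
# The 4.5 bits-per-weight default is an assumption, not a value from the library page.
estimate_gb() {
  awk -v p="$1" -v b="${2:-4.5}" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_gb 8    # llama3:8b,  prints 4.5
estimate_gb 70   # llama3:70b, prints 39.4
```

Both estimates land close to the 4.7 GB and 40 GB shown in the library, with the real files slightly larger than this rough figure.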

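As for RAM during inference, I would budget at least 16 GB for llama3:8b and 64 GB or more for llama3:70b. My machine has 64 GB of RAM in total, so I try llama3:8b first.

Commands inside the container are run with `docker exec -it <container name> <command>`: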

```shell
(ollama) yiyu@eastforest:~$ docker exec -it ollama ollama run llama3:8b
pulling manifest
pulling 00e1317cbf74... 100% ▕█████████████████████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕█████████████████████████████████▏  12 KB
pulling 8ab4849b038c... 100% ▕█████████████████████████████████▏  254 B
pulling 577073ffcc6c... 100% ▕█████████████████████████████████▏  110 B
pulling ad1518640c43... 100% ▕█████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
removing any unused layers
success

>>> Send a message (/? for help)
```

Once the model has been downloaded and started, you land directly at an interactive prompt and can begin chatting:

```shell
>>> hello
Hello! It's nice to meet you. Is there something I can help you with, or would you like to
chat?

>>> who are you
I'm LLaMA, an AI assistant developed by Meta AI that can understand and respond to human
input in a conversational manner. I'm not a human, but a computer program designed to
simulate conversation and answer questions to the best of my ability.

I was trained on a massive dataset of text from the internet and can generate human-like
responses to a wide range of topics and questions. My purpose is to assist users like you
with information and tasks, while also learning and improving my language abilities over
time.

I don't have personal opinions or emotions, but I'm always happy to chat and help with any
questions or topics you'd like to discuss!

>>> 你会说中文吗
😊
我可以说中文！ although my proficiency may not be as high as a native speaker, I can still unders
understand and respond in Mandarin Chinese. Feel free to ask me anything in Chinese, and
I'll do my best to help! 💬

>>> Send a message (/? for help)
```
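
Besides the interactive prompt, the same model can be queried over Ollama's HTTP API on port 11434. A minimal sketch using the `/api/generate` endpoint with streaming turned off. It assumes `curl` is installed, and `OLLAMA_URL`, `generate_payload`, and `ask_ollama` are names invented for this example.

```shell
# Build the JSON body for Ollama's /api/generate endpoint.
# Simplification: the prompt must not contain double quotes or backslashes,
# because this sketch does no JSON escaping.
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send one prompt ($2) to a model ($1) and print the raw JSON reply.
# OLLAMA_URL is an assumed variable; it defaults to a local server.
ask_ollama() {
  curl -s "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
    -d "$(generate_payload "$1" "$2")"
}
```

Calling `ask_ollama llama3:8b "hello"` prints a JSON object whose `response` field holds the model's reply.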

CPU-only inference puts a fairly heavy load on the machine. That is fine for learning and experiments, but a production setup would still need a GPU.
![llama3:8b loading.png](https://www.helloyiyu.com/usr/uploads/2024/05/988946445.png)

### Web UI

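Many web UIs support Ollama, so pick whichever one you like. For this trial run I used [Open WebUI](https://openwebui.com/).

Open WebUI is a user-friendly web interface for LLMs that supports Ollama and OpenAI-compatible APIs. It simplifies the interaction between the client and the Ollama API, with an intuitive interface, responsive design, fast responses, and easy setup. It also offers code highlighting, Markdown and LaTeX support, RAG integration, web browsing, preset prompts, RLHF annotation, model management, multi-model and multimodal chats, conversation history management, voice input, and advanced parameter tuning.

Install it by following [🏡 Home | Open WebUI](https://docs.openwebui.com/):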

```shell
(ollama) yiyu@eastforest:~$ docker run -d -p 8087:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Unable to find image 'ghcr.io/open-webui/open-webui:main' locally
main: Pulling from open-webui/open-webui
b0a0cf830b12: Pull complete
72914424168c: Pull complete
d12a047f1c7e: Pull complete
ab33f1a2f662: Pull complete
94510a1366bc: Pull complete
9a14a4fb94c9: Pull complete
7e7af197da4f: Pull complete
6c9990fae9e7: Pull complete
650e1a5b00f2: Pull complete
9c4050d38b84: Pull complete
d3c0bb584512: Pull complete
12af5268b23d: Pull complete
5206f51306e9: Pull complete
ee142353b3f4: Pull complete
b1db2e5d884d: Pull complete
Digest: sha256:ab62ecc09c2f9a64f7bcdfd585d9cb25c38f4b430639e1fb03fe6155cfd02e43
Status: Downloaded newer image for ghcr.io/open-webui/open-webui:main
f56a607b4af202f6b7c3496d0bd52c8bf763f1f7e0f55df37340206b1effb7f9
```

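After installation, open `server_ip:8087`. (I mapped host port 8087 to the container's port 8080 when starting the container. If you used a different mapping, use that host port instead.)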
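
The port to open in the browser is the left-hand (host) side of the `-p` mapping, not the container port. As a small illustration, with `webui_url` being a hypothetical helper:

```shell
# Given a server address ($1) and a docker -p mapping "HOST_PORT:CONTAINER_PORT" ($2),
# print the URL to open in a browser. Only the host port matters from outside.
webui_url() {
  printf 'http://%s:%s\n' "$1" "${2%%:*}"
}

webui_url localhost 8087:8080   # prints http://localhost:8087
```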

Select a model (the drop-down lists the models installed in Ollama) and start chatting:

![Open Web UI.png](https://www.helloyiyu.com/usr/uploads/2024/05/3940119013.png)

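### Running Other Models

llama3 does not handle Chinese very well, but with Ollama, switching to another model is easy.
We can pull the Tongyi Qianwen (Qwen) models with a single `ollama` command, then select them in Open WebUI and test them in a chat.

Here I pull qwen:4b (via the default `qwen` tag) and qwen:7b to compare them: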

```shell
(ollama) yiyu@eastforest:~$ docker exec -it ollama ollama pull qwen
pulling manifest
pulling 46bb65206e0e... 100% ▕████████████████████████████████████████████████████▏ 2.3 GB
pulling 41c2cf8c272f... 100% ▕████████████████████████████████████████████████████▏ 7.3 KB
pulling 1da0581fd4ce... 100% ▕████████████████████████████████████████████████████▏  130 B
pulling f02dd72bb242... 100% ▕████████████████████████████████████████████████████▏   59 B
pulling b861bd365e67... 100% ▕████████████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
(ollama) yiyu@eastforest:~$ docker exec -it ollama ollama pull qwen:7b
pulling manifest
pulling 87f26aae09c7... 100% ▕████████████████████████████████████████████████████▏ 4.5 GB
pulling 7c7b8e244f6a... 100% ▕████████████████████████████████████████████████████▏ 6.9 KB
pulling 1da0581fd4ce... 100% ▕████████████████████████████████████████████████████▏  130 B
pulling f02dd72bb242... 100% ▕████████████████████████████████████████████████████▏   59 B
pulling c0312cf22ef0... 100% ▕████████████████████████████████████████████████████▏  483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
(ollama) yiyu@eastforest:~$ docker exec -it ollama ollama list
NAME            ID              SIZE    MODIFIED
llama3:8b       a6990ed6be41    4.7 GB  26 hours ago
qwen:7b         2091ee8c8d8f    4.5 GB  7 minutes ago
qwen:latest     d53d04290064    2.3 GB  21 minutes ago
```
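
When scripting against this listing, it is often enough to extract the model names. A minimal sketch, assuming the column layout shown above with a single header row; `model_names` is a made-up helper name.

```shell
# Print only the NAME column of `ollama list` output read from stdin,
# skipping the header row.
model_names() {
  awk 'NR > 1 { print $1 }'
}
```

For example, `docker exec ollama ollama list | model_names` prints one model name per line.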

Once the pull succeeds, start a new chat in Open WebUI and select the newly downloaded model to begin using it:

![open webui + qwen:7b.png](https://www.helloyiyu.com/usr/uploads/2024/05/196049753.png)

Last modified: May 9, 2024
