Ollama Model Installation + CTranslate2 Setup for a Custom main.py
Part 1: Ollama Installation & Model Pull
1. Install Ollama
- Windows/macOS: Download the installer from the official site (https://ollama.com/download)
- Linux: Run this in a terminal:
curl -fsSL https://ollama.com/install.sh | sh
2. Verify Installation
ollama --version
3. Pull & Run Models
Code:
# Pull small model
ollama pull llama3.2:1b
# Run model
ollama run llama3.2:1b
Useful Commands
Code:
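ollama list # List installed models
ollama rm MODEL # Delete a model
ollama stop MODEL # Unload a running model from memory
sudo systemctl stop ollama # Stop the service (Linux installs managed by systemd)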
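The same model list is also available over HTTP, which helps when checking a remote or proxied server. A minimal Python sketch, assuming the server listens on the default 127.0.0.1:11434 and using the /api/tags endpoint; the function names are illustrative only:

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://127.0.0.1:11434"


def parse_model_names(tags_json: str) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    payload = json.loads(tags_json)
    return [entry["name"] for entry in payload.get("models", [])]


def list_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the running Ollama server which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return parse_model_names(resp.read().decode("utf-8"))
```

Calling list_models() with the server running should return the same names that ollama list prints.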
Summary: Ollama is easy to use. On Linux, three commands are enough:
Code:
curl -fsSL https://ollama.com/install.sh | sh # Install Ollama
ollama pull llama3.2:3b # Download a model (browse names at https://ollama.com/search)
ollama run llama3.2:3b # Run it
Test it
Bash:
curl http://127.0.0.1:11434/api/generate -d '{
"model": "llama3.2:3b",
"prompt": "Translate the following text to Russian: Hello, this is my new translation API.",
"stream": false
}'
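The curl test can also be made from Python, which is handy when the translation call is part of a script. This is a sketch using only the standard library; it sends the same three fields to /api/generate and reads the "response" field of the reply. The helper names and the 300-second timeout are assumptions, not part of Ollama:

```python
import json
import urllib.request

# Same address as in the curl test; change it if Ollama runs elsewhere.
OLLAMA_URL = "http://127.0.0.1:11434"


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build the POST request for /api/generate with streaming disabled."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, base_url: str = OLLAMA_URL,
             timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text."""
    req = build_generate_request(model, prompt, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.loads(resp.read().decode("utf-8"))
    # With "stream": false the server answers with a single JSON object.
    return reply["response"]
```

For example, generate("llama3.2:3b", "Translate the following text to Russian: Hello, this is my new translation API.") should return the translated text as a string.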
For network hosting:
1. Install nginx.
2. Proxy requests to Ollama with a location block like this:
Code:
location / {
    proxy_pass http://127.0.0.1:11434/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_connect_timeout 1600s;
    proxy_read_timeout 3000s;
    proxy_send_timeout 3000s;
    proxy_buffering off;
    proxy_cache off;
    add_header Access-Control-Allow-Origin * always;
    add_header Access-Control-Allow-Methods 'GET, POST, OPTIONS' always;
    add_header Access-Control-Allow-Headers 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization' always;
    if ($request_method = 'OPTIONS') {
        return 204;
    }
}
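Once nginx is reloaded, it is worth confirming that the CORS headers from the block above actually reach the client. The following Python sketch sends an OPTIONS request through the proxy and reports which of the three headers are missing. The function names are illustrative, and the URL passed in is whatever address your nginx server answers on:

```python
import urllib.request

# The three CORS headers set by the add_header lines in the nginx block.
EXPECTED_CORS_HEADERS = [
    "Access-Control-Allow-Origin",
    "Access-Control-Allow-Methods",
    "Access-Control-Allow-Headers",
]


def missing_cors_headers(headers: dict[str, str]) -> list[str]:
    """Return the expected CORS headers that are absent (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_CORS_HEADERS if name.lower() not in present]


def check_preflight(url: str, timeout: float = 10.0) -> list[str]:
    """Send an OPTIONS request through the proxy and list missing CORS headers."""
    req = urllib.request.Request(url, method="OPTIONS")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return missing_cors_headers(dict(resp.headers.items()))
```

An empty list from check_preflight means all three headers came back; any names in the list point to an add_header line that is not taking effect.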
Part 2: CTranslate2 Setup (Running a Custom main.py)
1. Install Dependencies
Code:
pip install ctranslate2 sentencepiece huggingface-hub
#pip install ctranslate2 transformers sentencepiece
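2. Download the Model
Code:
mkdir models && cd models
# Hugging Face stores large weight files with Git LFS, so install git-lfs before cloning
git clone https://huggingface.co/ctranslate2/Luna-2B-Chat-ct2
3. Project Layout
Place main.py next to the model folder:
Code:
Your Project/
├─ main.py
└─ Luna-2B-Chat-ct2/
   └─ (model contents)
4. Run main.py
Code:
python main.py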
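If main.py fails at startup, the layout is the first thing to check. The sketch below is a hypothetical helper, not part of main.py: it verifies that main.py and the model folder sit side by side, that the folder contains model.bin (the weights file of a converted CTranslate2 model), and that model.bin is not a Git LFS pointer left behind by a clone without git-lfs:

```python
from pathlib import Path

# First bytes of a Git LFS pointer file (what you get when git-lfs is missing).
LFS_POINTER_PREFIX = b"version https://git-lfs"


def check_layout(project_root: str,
                 model_dir_name: str = "Luna-2B-Chat-ct2") -> list[str]:
    """Return a list of layout problems; an empty list means the layout looks fine."""
    root = Path(project_root)
    problems = []
    if not (root / "main.py").is_file():
        problems.append("main.py not found in the project root")
    model_dir = root / model_dir_name
    weights = model_dir / "model.bin"
    if not model_dir.is_dir():
        problems.append(f"model folder {model_dir_name}/ not found")
    elif not weights.is_file():
        problems.append(f"{model_dir_name}/model.bin not found")
    else:
        with weights.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            problems.append("model.bin is a Git LFS pointer; install git-lfs and run 'git lfs pull'")
    return problems
```

Running check_layout(".") from the project folder prints nothing useful by itself, so print the returned list; each entry names one thing to fix.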
For network hosting:
1. Install nginx.
2. Proxy requests to main.py (port 8000) with a location block like this:
Code:
location / {
    proxy_pass http://127.0.0.1:8000/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_connect_timeout 1600s;
    proxy_read_timeout 3000s;
    proxy_send_timeout 3000s;
    proxy_buffering off;
    proxy_cache off;
    add_header Access-Control-Allow-Origin * always;
    add_header Access-Control-Allow-Methods 'GET, POST, OPTIONS' always;
    add_header Access-Control-Allow-Headers 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization' always;
    if ($request_method = 'OPTIONS') {
        return 204;
    }
}
Troubleshooting
- Model not found: Check file path in main.py
- GPU error: Use device="cpu" in model loading
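- Ollama not working: Start the server with ollama serve, or restart the service with sudo systemctl restart ollama on Linux installs managed by systemd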
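For the GPU error above, the device can also be chosen automatically instead of hard-coding "cpu". This is a sketch, assuming main.py loads the model through the ctranslate2 package; it uses ctranslate2.get_cuda_device_count() and falls back to the CPU when no CUDA device is visible or the check fails. The function names are illustrative:

```python
def choose_device(cuda_device_count: int) -> str:
    """Pick "cuda" when at least one CUDA device is visible, otherwise "cpu"."""
    return "cuda" if cuda_device_count > 0 else "cpu"


def detect_device() -> str:
    """Query CTranslate2 for CUDA devices; fall back to "cpu" on any failure."""
    try:
        import ctranslate2  # imported lazily so the helper also works without it
        return choose_device(ctranslate2.get_cuda_device_count())
    except Exception:
        # Missing package or a broken CUDA setup: the CPU is the safe choice.
        return "cpu"
```

The result can then be passed as the device argument wherever main.py loads the model, for example device=detect_device() in place of device="cpu".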