This document outlines the steps required to install and configure all components needed for HRAftu-LM-RAG, including FastGPT, Python environment, ClickHouse, Kafka (optional), and vision-language model dependencies.
sudo apt-get update
sudo apt-get install -y docker.io docker-compose
sudo usermod -aG docker $USER # Allow non-root docker usage
HRAftu-LM-RAG relies on FastGPT as its knowledge base and retrieval engine. Follow these steps to deploy FastGPT using Docker.
Clone FastGPT repository or download the Docker Compose file
You can either pull the official FastGPT Docker Compose setup or refer to the Docker YAML configuration.
mkdir fastgpt && cd fastgpt
Download docker-compose.yml
Get the latest docker-compose.yml from FastGPT’s documentation or GitHub:
curl -O https://raw.githubusercontent.com/fastgpt/fastgpt/main/docker-compose.yml
Configure Environment Variables
Create a file named .env in the same directory with the following content (update values as needed):
# FastGPT environment variables
FASTGPT_HOST=0.0.0.0
FASTGPT_PORT=3000
FASTGPT_ADMIN_USER=admin
FASTGPT_ADMIN_PASSWORD=your_password
Start FastGPT Services
docker-compose up -d
Verify FastGPT
Open http://localhost:3000 in a browser and log in with the admin credentials (admin / your_password).
Note: For production deployments or custom configurations (TLS, load balancing), refer to the official FastGPT documentation: https://doc.tryfastgpt.ai/docs/development/docker/
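For a scripted check, here is a minimal Python sketch (using requests, which is already in the project's requirements) that reports whether the FastGPT UI answers:

```python
import requests

def fastgpt_is_up(base_url="http://localhost:3000", timeout=5):
    """Return True if the FastGPT web UI responds with HTTP 200."""
    try:
        return requests.get(base_url, timeout=timeout).status_code == 200
    except requests.RequestException:
        return False

if __name__ == "__main__":
    print("FastGPT up:", fastgpt_is_up())
```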
Create an isolated Python environment and install all required packages for HRAftu-LM-RAG.
Clone HRAftu-LM-RAG Repository
git clone https://github.com/YourUsername/HRAftu-LM-RAG.git
cd HRAftu-LM-RAG
Create a Virtual Environment
Using venv:
python3 -m venv venv
source venv/bin/activate
Or using conda:
conda create -n hraftu python=3.8 -y
conda activate hraftu
Install Python Dependencies
pip install --upgrade pip
pip install -r requirements.txt
The requirements.txt should include (but not be limited to):
torch>=2.0.0
transformers>=4.30.0
flask>=2.2.0
pandas>=1.5.0
requests>=2.28.0
clickhouse-driver>=0.2.1
Pillow>=9.0.0
openpyxl>=3.0.0
vllm>=0.5.0
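To confirm that every dependency installed correctly, a small sketch that tries to import each package and prints its version (note that the pip names Pillow and clickhouse-driver import as PIL and clickhouse_driver):

```python
import importlib

# pip name -> import name (they differ for some packages)
REQUIRED = {
    "torch": "torch",
    "transformers": "transformers",
    "flask": "flask",
    "pandas": "pandas",
    "requests": "requests",
    "clickhouse-driver": "clickhouse_driver",
    "Pillow": "PIL",
    "openpyxl": "openpyxl",
    "vllm": "vllm",
}

missing = []
for pip_name, module_name in REQUIRED.items():
    try:
        mod = importlib.import_module(module_name)
        print(f"{pip_name}: {getattr(mod, '__version__', 'ok')}")
    except ImportError:
        missing.append(pip_name)

if missing:
    print("Missing packages:", ", ".join(missing))
```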
Environment Configuration File
Copy the example configuration file and update its fields:
mkdir -p config
cp config/example_config.yaml config/config.yaml
Edit config/config.yaml and set:
fastgpt:
api_key: "YOUR_FASTGPT_API_KEY"
host: "http://localhost:3000"
vision:
llama32_device: "cuda:0"
phi3_device: "cuda:1"
phi35_device: "cuda:2"
pixtral:
ip_list_file: "config/vision_nodes.txt"
ports: [34444, 44445, 44446, 44449]
clickhouse:
host: "localhost"
port: 9000
user: "default"
password: ""
database: "clkg"
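The config can be loaded in Python with PyYAML (not listed in requirements.txt above, so `pip install pyyaml` may be needed); a minimal sketch:

```python
import yaml  # PyYAML: pip install pyyaml

def load_config(path="config/config.yaml"):
    """Load the YAML config file into a nested dict."""
    with open(path) as f:
        return yaml.safe_load(f)

if __name__ == "__main__":
    cfg = load_config()
    print("FastGPT host:", cfg["fastgpt"]["host"])
    print("ClickHouse DB:", cfg["clickhouse"]["database"])
```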
Replace YOUR_FASTGPT_API_KEY with the key obtained from your FastGPT instance.
If you plan to run Similarity Evaluation with ClickHouse, install and configure a ClickHouse server:
Install ClickHouse
# Official Debian/Ubuntu repository
sudo apt-get install -y apt-transport-https ca-certificates gnupg
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/clickhouse-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 8919F6BD2B48D754
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client
Start ClickHouse Service
sudo service clickhouse-server start
Verify Installation
clickhouse-client --query="SELECT version();"
Database & Table Setup
Create a database (if not using clkg):
CREATE DATABASE clkg;
Create a table for storing LVM outputs (example schema):
CREATE TABLE clkg.vision_outputs (
id String,
model String,
output_json String,
timestamp DateTime
) ENGINE = MergeTree()
ORDER BY (model, id);
Adjust schema as needed for your JSON structure.
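Rows can be written from Python with clickhouse-driver (listed in requirements.txt); a minimal sketch, assuming the local server and clkg.vision_outputs table from the steps above:

```python
import json
from datetime import datetime

def make_row(row_id, model, payload):
    """Serialize one LVM output into the (id, model, output_json, timestamp) schema."""
    return (row_id, model, json.dumps(payload), datetime.now())

def store_output(client, row):
    """Insert a single row into clkg.vision_outputs."""
    client.execute(
        "INSERT INTO clkg.vision_outputs (id, model, output_json, timestamp) VALUES",
        [row],
    )

if __name__ == "__main__":
    from clickhouse_driver import Client  # from requirements.txt
    client = Client(host="localhost", port=9000, user="default",
                    password="", database="clkg")
    store_output(client, make_row("img-001", "llava:34b", {"caption": "a test image"}))
    print(client.execute("SELECT id, model FROM clkg.vision_outputs LIMIT 5"))
```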
Note: Kafka can also be used for streaming data. If you choose Kafka, install via:
sudo apt-get install -y kafka  # no official apt package on stock Ubuntu; alternatively install from https://kafka.apache.org/downloads
# Configure server.properties (broker.id, zookeeper.connect, etc.)
sudo service kafka start
Then create a topic (e.g., vision_inference):
kafka-topics.sh --create --topic vision_inference --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
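If you go the Kafka route, events can be published from Python with the kafka-python package (an extra dependency, not in requirements.txt); a minimal producer sketch for the vision_inference topic:

```python
import json

def serialize(event):
    """Encode one inference event as UTF-8 JSON for the Kafka topic."""
    return json.dumps(event).encode("utf-8")

if __name__ == "__main__":
    from kafka import KafkaProducer  # pip install kafka-python; needs a running broker
    producer = KafkaProducer(bootstrap_servers="localhost:9092",
                             value_serializer=serialize)
    producer.send("vision_inference", {"id": "img-001", "model": "llava:34b"})
    producer.flush()
```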
To deploy and query vision-language models (LLaVa, Llama-3.2-Vision, Phi-3-Vision, Phi-3.5-Vision, Pixtral), follow these steps:
Install Ollama CLI
macOS (Homebrew):
brew install ollama
Ubuntu/Debian: Install via the official script:
curl -fsSL https://ollama.com/install.sh | sh
Pull LLaVa Model
ollama pull llava:34b
Verify Model
ollama list
Run LLaVa Service
ollama run llava:34b
The LLaVa service listens on http://localhost:11434 by default (check the CLI docs for a custom port). Example request:
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llava:34b", "prompt": "Describe this image.", "images": ["<Base64-encoded data>"], "stream": false}'
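From Python, the LLaVa service can also be queried through Ollama's /api/generate endpoint, which accepts a list of Base64-encoded images (requests is in requirements.txt):

```python
import base64
import requests

def encode_image(image_path):
    """Read an image file and return its Base64 encoding as ASCII text."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def describe_image(image_path, model="llava:34b", host="http://localhost:11434"):
    """Ask the local Ollama LLaVa service to describe an image."""
    resp = requests.post(
        f"{host}/api/generate",
        json={"model": model, "prompt": "Describe this image.",
              "images": [encode_image(image_path)], "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```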
Install vLLM
pip install vllm
Pull or Reference Model Ensure you have access to Hugging Face or other model repositories.
Llama-3.2-11B-Vision
vllm serve "meta-llama/Llama-3.2-11B-Vision" --port 8000
The server listens at http://localhost:8000 by default and exposes an OpenAI-compatible API. Example payload (JSON) for POST /v1/chat/completions:
{
  "model": "meta-llama/Llama-3.2-11B-Vision",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "image_url", "image_url": { "url": "<Base64 data URL or image URL>" } },
        { "type": "text", "text": "Describe this scene." }
      ]
    }
  ]
}
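A matching Python client for the vLLM server's OpenAI-compatible /v1/chat/completions route (the message structure follows the OpenAI vision schema; adjust the model name per endpoint):

```python
import requests

def build_payload(image_url, prompt, model="meta-llama/Llama-3.2-11B-Vision"):
    """Build an OpenAI-style chat payload pairing one image with a text prompt."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        }],
        "max_tokens": 256,
    }

def query_vision_model(image_url, prompt, base_url="http://localhost:8000"):
    """POST the payload to a running vLLM server and return the model's reply."""
    resp = requests.post(f"{base_url}/v1/chat/completions",
                         json=build_payload(image_url, prompt), timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```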
Phi-3-Vision-128k-Instruct
vllm serve "microsoft/Phi-3-vision-128k-instruct" --port 8001
Phi-3.5-Vision-Instruct
vllm serve "microsoft/Phi-3.5-vision-instruct" --port 8002
Pixtral-12B-2409
vllm serve "mistralai/Pixtral-12B-2409" --port 8003
Verify Services
Use curl or a simple Python HTTP client to validate inference on each endpoint. Tip: For GPU acceleration, ensure your drivers and CUDA toolkit are properly installed, and pin each server to a specific GPU with CUDA_VISIBLE_DEVICES (e.g., CUDA_VISIBLE_DEVICES=0 vllm serve ...).
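A small sketch that polls all four vLLM endpoints at once (ports per the serve commands above; /v1/models is part of vLLM's OpenAI-compatible API):

```python
import requests

# Model -> port mapping, per the `vllm serve` commands above
SERVICES = {
    "Llama-3.2-11B-Vision": 8000,
    "Phi-3-vision-128k-instruct": 8001,
    "Phi-3.5-vision-instruct": 8002,
    "Pixtral-12B-2409": 8003,
}

def check_service(port, timeout=5):
    """Return True if a vLLM server answers on its OpenAI-compatible /v1/models route."""
    try:
        return requests.get(f"http://localhost:{port}/v1/models", timeout=timeout).ok
    except requests.RequestException:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if check_service(port) else "DOWN"
        print(f"{name} (port {port}): {status}")
```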
FastGPT: Open http://localhost:3000 and confirm the web UI is accessible.
Embedding Service: In a separate terminal, run:
python src/embedding_service/embedding_web.py --port 55443
Then test with:
curl -X POST http://localhost:55443/v1/embeddings \
-H "Content-Type: application/json" \
-d '{"input": ["Test sentence"], "model": "all-MiniLM-L6-v2"}'
Expect a JSON array of embedding vectors.
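The same check can be scripted with requests; the parsing below assumes the OpenAI-style {"data": [{"embedding": [...]}]} response shape implied by the /v1/embeddings route:

```python
import requests

def extract_vectors(payload):
    """Pull the embedding vectors out of an OpenAI-style /v1/embeddings response."""
    return [item["embedding"] for item in payload["data"]]

def get_embeddings(texts, port=55443, model="all-MiniLM-L6-v2"):
    """Call the local embedding service and return one vector per input string."""
    resp = requests.post(
        f"http://localhost:{port}/v1/embeddings",
        json={"input": texts, "model": model},
        timeout=30,
    )
    resp.raise_for_status()
    return extract_vectors(resp.json())
```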
LLM Batch Query: Prepare a sample CSV/JSON of prompts and run:
python src/llm_query/llm_query.py --input_file examples/sample_prompts.csv --output_file output_llm.json
Verify output_llm.json contains responses.
LVM Query: Send a test request to each vision-language endpoint (e.g., LLaVa):
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llava:34b", "prompt": "Describe this image.", "images": ["<Base64 image>"], "stream": false}'
Ensure a structured JSON response is returned.
ClickHouse:
clickhouse-client --query="SHOW DATABASES;"
Confirm that the clkg database exists and the vision_outputs table is present.
Similarity Script: Run:
python src/similarity/jaccard_similarity.py --ground_truth schema-test.xlsx --clickhouse_table vision_outputs --output results_with_all_similarity_and_emd5.xlsx
Check that results_with_all_similarity_and_emd5.xlsx is generated.
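For reference, the Jaccard similarity the script is named after can be sketched as a token-set overlap (the repository's script may tokenize JSON outputs differently):

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity of two strings over their whitespace-token sets:
    |A ∩ B| / |A ∪ B|, ranging from 0 (disjoint) to 1 (identical sets)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts count as identical
    return len(set_a & set_b) / len(set_a | set_b)

print(jaccard_similarity("a red car on the road", "a blue car on a road"))
```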
Upon successful verification of each component, your HRAftu-LM-RAG environment is ready for downstream tasks.