hra-ftu-rag-supporting-information

Installation Guide

This document outlines the steps required to install and configure all components needed for HRAftu-LM-RAG: FastGPT, the Python environment, ClickHouse, Kafka (optional), and the vision-language model dependencies.


1. Prerequisites

  1. Operating System: Ubuntu 20.04 or 22.04 (or any Linux distribution with Docker support).
  2. Hardware:
    • CPU: 4+ cores recommended
    • RAM: 16GB+
    • GPU (optional for LVM inference): NVIDIA GPU with CUDA support
  3. Docker & Docker Compose: Required for FastGPT deployment.
    • Install via:
      sudo apt-get update
      sudo apt-get install -y docker.io docker-compose
      sudo usermod -aG docker $USER  # Allow non-root docker usage (log out and back in to apply)
      
  4. Python: Version 3.8 or newer.
  5. Ollama (optional): For LLaVa model deployments.
  6. pip / venv / conda: For Python virtual environment management.
  7. ClickHouse (optional): For similarity evaluation storage.
  8. Kafka (optional): If streaming LVM outputs via message bus.

2. FastGPT Setup

HRAftu-LM-RAG relies on FastGPT as its knowledge base and retrieval engine. Follow these steps to deploy FastGPT using Docker.

  1. Clone FastGPT repository or download the Docker Compose file
    You can either pull the official FastGPT Docker Compose setup or refer to the Docker YAML configuration.

  2. Create a directory for FastGPT
    mkdir fastgpt && cd fastgpt
    
  3. Download docker-compose.yml
    Get the latest docker-compose.yml from FastGPT’s documentation or GitHub:

    curl -O https://raw.githubusercontent.com/fastgpt/fastgpt/main/docker-compose.yml
    
  4. Configure Environment Variables
    Create a file named .env in the same directory with the following content (update values as needed):

    # FastGPT environment variables
    FASTGPT_HOST=0.0.0.0
    FASTGPT_PORT=3000
    FASTGPT_ADMIN_USER=admin
    FASTGPT_ADMIN_PASSWORD=your_password
    
  5. Start FastGPT Services

    docker-compose up -d
    
    • This will launch the FastGPT API server, vector indexer, and any supporting components (PostgreSQL, Redis, etc.).
  6. Verify FastGPT

    • Open a browser and navigate to http://localhost:3000.
    • Log in using the admin credentials (admin / your_password).
    • Confirm that you can access the FastGPT web UI and API endpoints.

Note: For production deployments or custom configurations (TLS, load balancing), refer to the official FastGPT documentation: https://doc.tryfastgpt.ai/docs/development/docker/
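
Once the service is up, you can also sanity-check the API from Python. The sketch below assumes FastGPT's OpenAI-compatible chat endpoint at /api/v1/chat/completions and an API key created in the web UI; adjust the path and key to match your instance:

import requests

FASTGPT_HOST = "http://localhost:3000"   # base URL from your .env
API_KEY = "YOUR_FASTGPT_API_KEY"         # create this in the FastGPT web UI

# Minimal chat-completion round trip against FastGPT's OpenAI-compatible API
resp = requests.post(
    f"{FASTGPT_HOST}/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"messages": [{"role": "user", "content": "ping"}], "stream": False},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())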


3. Python Environment & Dependencies

Create an isolated Python environment and install all required packages for HRAftu-LM-RAG.

  1. Clone HRAftu-LM-RAG Repository

    git clone https://github.com/YourUsername/HRAftu-LM-RAG.git
    cd HRAftu-LM-RAG
    
  2. Create a Virtual Environment
    Using venv:

    python3 -m venv venv
    source venv/bin/activate
    

    Or using conda:

    conda create -n hraftu python=3.8 -y
    conda activate hraftu
    
  3. Install Python Dependencies

    pip install --upgrade pip
    pip install -r requirements.txt
    

    The requirements.txt should include (but not be limited to):

    torch>=2.0.0
    transformers>=4.30.0
    flask>=2.2.0
    pandas>=1.5.0
    requests>=2.28.0
    clickhouse-driver>=0.2.1
    Pillow>=9.0.0
    openpyxl>=3.0.0
    vllm>=0.5.0
    
  4. Environment Configuration File
    Copy the example configuration file and update the fields (the example file already lives in config/, so no directory needs to be created first):

    cp config/example_config.yaml config/config.yaml
    

    Edit config/config.yaml and set:

    fastgpt:
      api_key: "YOUR_FASTGPT_API_KEY"
      host: "http://localhost:3000"
    
    vision:
      llama32_device: "cuda:0"
      phi3_device: "cuda:1"
      phi35_device: "cuda:2"
      pixtral:
        ip_list_file: "config/vision_nodes.txt"
        ports: [34444, 44445, 44446, 44449]
    
    clickhouse:
      host: "localhost"
      port: 9000
      user: "default"
      password: ""
      database: "clkg"
    
    • Replace YOUR_FASTGPT_API_KEY with the key obtained from your FastGPT instance.
    • Adjust device IDs or ports as per your hardware and network setup.
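
To confirm the file parses and the keys are wired up, here is a minimal sketch (assuming PyYAML, which is not in the requirements list above — pip install pyyaml):

import yaml

# Load the HRAftu-LM-RAG configuration and fail fast on placeholder values
with open("config/config.yaml") as f:
    cfg = yaml.safe_load(f)

assert cfg["fastgpt"]["api_key"] != "YOUR_FASTGPT_API_KEY", "set a real API key"
print("FastGPT host: ", cfg["fastgpt"]["host"])
print("ClickHouse DB:", cfg["clickhouse"]["database"])
print("Pixtral ports:", cfg["vision"]["pixtral"]["ports"])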

4. ClickHouse Installation (Optional)

If you plan to run Similarity Evaluation with ClickHouse, install and configure the ClickHouse server:

  1. Install ClickHouse

    # Official Debian/Ubuntu repository
    sudo apt-get install -y apt-transport-https ca-certificates dirmngr
    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4
    echo "deb https://repo.clickhouse.com/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list
    sudo apt-get update
    sudo apt-get install -y clickhouse-server clickhouse-client
    
  2. Start ClickHouse Service

    sudo service clickhouse-server start
    
  3. Verify Installation

    clickhouse-client --query="SELECT version();"
    
  4. Database & Table Setup

    • Create the database (if it does not already exist):

      CREATE DATABASE IF NOT EXISTS clkg;
      
    • Create a table for storing LVM outputs (example schema):

      CREATE TABLE clkg.vision_outputs (
        id String,
        model String,
        output_json String,
        timestamp DateTime
      ) ENGINE = MergeTree()
      ORDER BY (model, id);
      
    • Adjust schema as needed for your JSON structure.
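
With the table in place, rows can be written from Python via clickhouse-driver (already in requirements.txt). A minimal sketch against the example schema above:

from datetime import datetime
import json

from clickhouse_driver import Client

# Connect using the same settings as config/config.yaml
client = Client(host="localhost", port=9000, user="default", password="", database="clkg")

# Insert one LVM output row matching the vision_outputs schema
row = ("img_0001", "llava:34b", json.dumps({"caption": "a test image"}), datetime.now())
client.execute(
    "INSERT INTO clkg.vision_outputs (id, model, output_json, timestamp) VALUES",
    [row],
)
print(client.execute("SELECT count() FROM clkg.vision_outputs"))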

Note: Kafka can also be used for streaming data. There is no stock apt package for Kafka on Ubuntu; download a binary release from https://kafka.apache.org/downloads and start the servers from the extracted directory:

tar -xzf kafka_<version>.tgz && cd kafka_<version>
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
# Configure config/server.properties (broker.id, zookeeper.connect, etc.), then:
bin/kafka-server-start.sh -daemon config/server.properties

Then create a topic (e.g., vision_inference):

kafka-topics.sh --create --topic vision_inference --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
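
LVM outputs can then be published from Python with the kafka-python package (not in requirements.txt; pip install kafka-python). A minimal producer sketch for the vision_inference topic:

import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    # Serialize dicts to JSON bytes on the way out
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("vision_inference", {"id": "img_0001", "model": "llava:34b", "output": "a test image"})
producer.flush()  # block until the broker acknowledges the message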

5. LVM Prerequisites (Ollama & vLLM)

To deploy and query vision-language models (LLaVa, Llama-3.2-Vision, Phi-3-Vision, Phi-3.5-Vision, Pixtral), follow these steps:

5.1 Ollama (LLaVa:34B)

  1. Install Ollama CLI

    • macOS (Homebrew):

      brew install ollama
      
    • Linux (Ubuntu/Debian): Install via the official script:

      curl -fsSL https://ollama.com/install.sh | sh
      
  2. Pull LLaVa Model

    ollama pull llava:34b
    
  3. Verify Model

    ollama list
    
  4. Run the Ollama Service

    ollama serve
    
    • On Linux the installer typically registers Ollama as a background service, so the API server may already be running; ollama run llava:34b opens an interactive session against it.
    • By default, Ollama listens on http://localhost:11434 (set the OLLAMA_HOST environment variable for a custom address).
    • Example request (vision prompts go to the /api/generate endpoint, with images passed Base64-encoded):

      curl -X POST http://localhost:11434/api/generate \
        -H "Content-Type: application/json" \
        -d '{"model": "llava:34b", "prompt": "Describe this image.", "images": ["<Base64-encoded data>"], "stream": false}'
      

5.2 vLLM (Other Vision-Language Models)

  1. Install vLLM

    pip install vllm
    
  2. Pull or Reference Model
    Ensure you have access to Hugging Face or other model repositories; gated models such as meta-llama/Llama-3.2-11B-Vision require accepting the license on Hugging Face and authenticating with huggingface-cli login.

  3. Llama-3.2-11B-Vision

    vllm serve "meta-llama/Llama-3.2-11B-Vision" --port 8000
    
    • The service exposes an OpenAI-compatible HTTP API at http://localhost:8000/v1 by default.
    • Example chat payload (JSON), using the OpenAI vision message format; a Python client sketch follows this list:

      {
        "model": "meta-llama/Llama-3.2-11B-Vision",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "image_url", "image_url": { "url": "<Base64 data URI or image URL>" } },
              { "type": "text", "text": "Describe this scene." }
            ]
          }
        ]
      }
      
  4. Phi-3-Vision-128k-Instruct

    vllm serve "microsoft/Phi-3-vision-128k-instruct" --port 8001
    
  5. Phi-3.5-Vision-Instruct

    vllm serve "microsoft/Phi-3.5-vision-instruct" --port 8002
    
  6. Pixtral-12B-2409

    vllm serve "mistralai/Pixtral-12B-2409" --port 8003
    
  7. Verify Services

    • Check logs to confirm that models are loaded and listening on respective ports.
    • Use a test request via curl or a simple Python HTTP client to validate inference.
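
To exercise any of these endpoints from Python, here is a minimal sketch using requests (already in requirements.txt) against the OpenAI-compatible chat API; the image URL, model name, and port are placeholders to adjust per service:

import requests

# OpenAI-style vision chat request against a vLLM server
payload = {
    "model": "meta-llama/Llama-3.2-11B-Vision",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/test.jpg"}},
                {"type": "text", "text": "Describe this scene."},
            ],
        }
    ],
    "max_tokens": 256,
}

resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])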

Tip: For GPU acceleration, ensure your drivers and CUDA toolkit are properly installed. To pin a service to a specific GPU, prefix the command with CUDA_VISIBLE_DEVICES, e.g. CUDA_VISIBLE_DEVICES=0 vllm serve ....


6. Final Verification

  1. FastGPT: Visit http://localhost:3000 and confirm the web UI is accessible.
  2. Embedding Service: In a separate terminal, run:

    python src/embedding_service/embedding_web.py --port 55443
    

    Then test with:

    curl -X POST http://localhost:55443/v1/embeddings \
      -H "Content-Type: application/json" \
      -d '{"input": ["Test sentence"], "model": "all-MiniLM-L6-v2"}'
    

    Expect a JSON array of embedding vectors.

  3. LLM Batch Query: Prepare a sample CSV/JSON of prompts and run:

    python src/llm_query/llm_query.py --input_file examples/sample_prompts.csv --output_file output_llm.json
    

    Verify output_llm.json contains responses.

  4. LVM Query: Send a test request to each vision-language endpoint (e.g., LLaVa):

    curl -X POST http://localhost:11434/api/generate \
      -H "Content-Type: application/json" \
      -d '{"model": "llava:34b", "prompt": "Describe this image.", "images": ["<Base64 image>"], "stream": false}'
    

    Ensure a structured JSON response is returned.

  5. ClickHouse:

    clickhouse-client --query="SHOW DATABASES;"
    

    Confirm that the clkg database exists and the vision_outputs table is present.

  6. Similarity Script: Run:

    python src/similarity/jaccard_similarity.py --ground_truth schema-test.xlsx --clickhouse_table vision_outputs --output results_with_all_similarity_and_emd5.xlsx
    

    Check that results_with_all_similarity_and_emd5.xlsx is generated.
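
    For reference, the Jaccard score the script reports can be illustrated in a few lines (a toy sketch over token sets, not the repository's implementation):

    def jaccard_similarity(a: str, b: str) -> float:
        """Jaccard similarity of two strings' token sets: |A ∩ B| / |A ∪ B|."""
        set_a, set_b = set(a.lower().split()), set(b.lower().split())
        if not set_a and not set_b:
            return 1.0  # two empty texts are identical by convention
        return len(set_a & set_b) / len(set_a | set_b)

    print(jaccard_similarity("red blood cell", "red cell count"))  # 0.5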

Upon successful verification of each component, your HRAftu-LM-RAG environment is ready for downstream tasks.