lehel b55decb633 added better logging + openrouter call handling		2025-10-01 16:59:30 +02:00
.github/workflows	gitignore	2025-10-01 14:09:35 +02:00
.idea	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
visits.bleve	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
.gitignore	gitignore	2025-10-01 14:09:35 +02:00
Dockerfile	health	2025-09-30 21:59:35 +02:00
README.md	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
chat_service.go	rename reason -> db	2025-10-01 10:12:26 +02:00
chat_service_integration_test.go	rename reason -> db	2025-10-01 10:12:26 +02:00
config.go	ollama server	2025-09-25 14:07:42 +02:00
config.yaml	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
db.go	rename reason -> db	2025-10-01 10:12:26 +02:00
db.yaml	rename reason -> db	2025-10-01 10:12:26 +02:00
go.mod	refactor	2025-09-27 09:22:32 +02:00
go.sum	refactor	2025-09-27 09:22:32 +02:00
llm.go	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
log.go	rename reason -> db	2025-10-01 10:12:26 +02:00
main.go	rename reason -> db	2025-10-01 10:12:26 +02:00
maindb.yaml	rename reason -> db	2025-10-01 10:12:26 +02:00
models.go	refactor break up a bit	2025-09-25 14:39:51 +02:00
openrouter_integration_test.go	added better logging + openrouter call handling	2025-10-01 16:59:30 +02:00
run.sh	ollama server	2025-09-25 14:07:42 +02:00
test.sh	base service	2025-09-24 13:05:24 +02:00
test_copy.sh	ollama server	2025-09-25 14:07:42 +02:00
ui.go	refactor break up a bit	2025-09-25 14:39:51 +02:00
ui.html	basic ui	2025-09-24 13:11:48 +02:00

README.md

Vetrag

Lightweight veterinary visit reasoning helper with LLM-assisted keyword extraction and disambiguation.

Features

Switch seamlessly between local Ollama and OpenRouter (OpenAI-compatible) LLM backends by changing environment variables only.
Structured JSON outputs enforced using provider-supported response formats (Ollama format, OpenAI/OpenRouter response_format: { type: json_object }).
Integration tests using mock LLM & DB (no network dependency).
GitHub Actions CI (vet, test, build).

Quick Start

1. Clone & build

git clone <repo-url>
cd vetrag
go build ./...

2. Prepare data

Ensure config.yaml and maindb.yaml / db.yaml exist as provided. Visit data is loaded at runtime (see models.go / db.go).

3. Run with Ollama (local)

Make sure a model is available locally (for example, run ollama pull qwen2.5), then set:

export OPENAI_BASE_URL=http://localhost:11434/api/chat
export OPENAI_MODEL=qwen2.5:latest
# API key not required for Ollama
export OPENAI_API_KEY=

go run .

4. Run with OpenRouter

export OPENAI_BASE_URL=https://openrouter.ai/api/v1/chat/completions
export OPENAI_API_KEY=sk-or-XXXXXXXXXXXXXXXX
export OPENAI_MODEL=meta-llama/llama-3.1-70b-instruct  # or any supported model

go run .

Open http://localhost:8080/ in your browser.

5. Health & Chat

curl -s http://localhost:8080/health
curl -s -X POST http://localhost:8080/chat -H 'Content-Type: application/json' -d '{"message":"my dog has diarrhea"}' | jq

Environment Variables

Variable	Purpose	Default (if empty)
OPENAI_BASE_URL	LLM endpoint (Ollama chat or OpenRouter chat completions)	`http://localhost:11434/api/chat`
OPENAI_API_KEY	Bearer token for OpenRouter/OpenAI-style APIs	(unused if empty)
OPENAI_MODEL	Model identifier (Ollama model tag or OpenRouter model slug)	none (must set for remote)

How Backend Selection Works

llm.go auto-detects the style:

If the base URL contains openrouter.ai or /v1/, it sends an OpenAI-style request and parses choices[0].message.content.
Otherwise it assumes Ollama, posts to /api/chat, and uses format to request structured JSON.
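
The detection rule above can be sketched as a small pure function. This is a minimal illustration, not the actual code in llm.go: the names Backend, detectBackend, and defaultBaseURL are hypothetical, and the empty-URL fallback simply mirrors the default listed in the Environment Variables table.

```go
package main

import (
	"fmt"
	"strings"
)

// Backend names the request/response style to use (hypothetical type).
type Backend string

const (
	BackendOpenAI Backend = "openai" // OpenRouter and other OpenAI-compatible APIs
	BackendOllama Backend = "ollama" // native Ollama /api/chat
)

// defaultBaseURL mirrors the documented default for an empty OPENAI_BASE_URL.
const defaultBaseURL = "http://localhost:11434/api/chat"

// detectBackend applies the README rule: a URL containing "openrouter.ai"
// or "/v1/" is treated as OpenAI-style, anything else as Ollama.
func detectBackend(baseURL string) Backend {
	if baseURL == "" {
		baseURL = defaultBaseURL
	}
	if strings.Contains(baseURL, "openrouter.ai") || strings.Contains(baseURL, "/v1/") {
		return BackendOpenAI
	}
	return BackendOllama
}

func main() {
	for _, u := range []string{
		"http://localhost:11434/api/chat",
		"https://openrouter.ai/api/v1/chat/completions",
	} {
		fmt.Printf("%s -> %s\n", u, detectBackend(u))
	}
}
```

Note that a substring match on /v1/ also classifies any self-hosted OpenAI-compatible server as OpenAI-style, which is probably the intent but worth keeping in mind when pointing the service at a proxy.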

Structured Output

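A JSON Schema-like map is defined internally and handed to each backend differently:

Ollama: the map is sent as format (Ollama's native structured-output field).
OpenRouter/OpenAI: the request sets response_format: { type: "json_object" } and adds a system instruction describing the expected keys, because json_object mode does not carry a schema.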
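
To make the difference concrete, here is a hedged sketch of the two request bodies. The schema keys (keywords) and the helper names buildOllamaBody and buildOpenAIBody are assumptions for illustration; the real schema and request types live in llm.go and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is the chat message shape shared by both API styles.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// keywordSchema is a hypothetical schema; the real keys are defined in llm.go.
var keywordSchema = map[string]any{
	"type": "object",
	"properties": map[string]any{
		"keywords": map[string]any{
			"type":  "array",
			"items": map[string]any{"type": "string"},
		},
	},
	"required": []string{"keywords"},
}

// buildOllamaBody passes the schema directly in the "format" field.
func buildOllamaBody(model, userMsg string) map[string]any {
	return map[string]any{
		"model":    model,
		"stream":   false, // ask for one complete response instead of chunks
		"format":   keywordSchema,
		"messages": []message{{Role: "user", Content: userMsg}},
	}
}

// buildOpenAIBody requests a JSON object and describes the expected keys
// in a system message, since json_object mode carries no schema.
func buildOpenAIBody(model, userMsg string) map[string]any {
	return map[string]any{
		"model":           model,
		"response_format": map[string]string{"type": "json_object"},
		"messages": []message{
			{Role: "system", Content: `Reply with JSON only, shaped as {"keywords": ["..."]}.`},
			{Role: "user", Content: userMsg},
		},
	}
}

func main() {
	for _, body := range []map[string]any{
		buildOllamaBody("qwen2.5:latest", "my dog has diarrhea"),
		buildOpenAIBody("meta-llama/llama-3.1-70b-instruct", "my dog has diarrhea"),
	} {
		out, err := json.MarshalIndent(body, "", "  ")
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```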

Prompts

Prompts in config.yaml have been adjusted to explicitly demand JSON only. This reduces hallucinated prose and plays well with both backends.

Testing

Run:

go test ./...

All tests mock the LLM so no network is required.

CI

The GitHub Actions workflow at .github/workflows/ci.yml runs vet, test, and build on push and pull request.

Troubleshooting

Symptom	Cause	Fix
Provider error referencing `response_format` and `json_schema`	Some providers reject `json_schema`	The default is now `json_object`; make sure you have pulled the latest changes.
Empty response	Model returned non-JSON or empty content	Enable debug logs (see below) and inspect raw response.
Non-JSON content (code fences)	Model ignored instruction	Try a stricter system message or switch to a model with better JSON adherence.

Enable Debug Logging

Temporarily edit main.go:

logrus.SetLevel(logrus.DebugLevel)

(You can also refactor later to read a LOG_LEVEL env var.)
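
One possible shape for that refactor is sketched below. It is kept standard-library only so it runs on its own; resolveLogLevel is a hypothetical helper, and the fallback to "info" is an assumption rather than current behavior. The returned name is one that logrus.ParseLevel accepts, as shown in the comment.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveLogLevel normalizes a raw LOG_LEVEL value to a known level name,
// falling back to "info" when the value is empty or unrecognized.
func resolveLogLevel(raw string) string {
	switch lvl := strings.ToLower(strings.TrimSpace(raw)); lvl {
	case "trace", "debug", "info", "warn", "error", "fatal", "panic":
		return lvl
	default:
		return "info"
	}
}

func main() {
	level := resolveLogLevel(os.Getenv("LOG_LEVEL"))
	// In main.go the name could then be handed to logrus, for example:
	//   if parsed, err := logrus.ParseLevel(level); err == nil {
	//       logrus.SetLevel(parsed)
	//   }
	fmt.Println("log level:", level)
}
```

With something like this in place, LOG_LEVEL=debug go run . would enable debug output without editing main.go.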

Sanitizing Output (Optional Future Improvement)

If some models wrap JSON in text, a post-processor could strip code fences and re-parse. This is not implemented yet, to keep the parsing logic strict.
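
A minimal sketch of such a post-processor, assuming the model wraps its answer in a single Markdown code fence. The names stripCodeFences and parseLenient are hypothetical and nothing like this exists in the repository yet. Output that is still not valid JSON after stripping is rejected, so the strictness is preserved.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is the three-backtick Markdown code fence marker.
var fence = strings.Repeat("`", 3)

// stripCodeFences removes one leading fence line (with an optional language
// tag such as "json") and one trailing fence. Unfenced input is only trimmed.
func stripCodeFences(s string) string {
	s = strings.TrimSpace(s)
	if !strings.HasPrefix(s, fence) {
		return s
	}
	if nl := strings.IndexByte(s, '\n'); nl >= 0 {
		s = s[nl+1:] // drop the whole opening fence line
	} else {
		s = strings.TrimPrefix(s, fence)
	}
	s = strings.TrimSuffix(strings.TrimSpace(s), fence)
	return strings.TrimSpace(s)
}

// parseLenient strips fences first, then still requires valid JSON.
func parseLenient(raw string) (map[string]any, error) {
	var out map[string]any
	if err := json.Unmarshal([]byte(stripCodeFences(raw)), &out); err != nil {
		return nil, fmt.Errorf("model output is not valid JSON: %w", err)
	}
	return out, nil
}

func main() {
	raw := fence + "json\n{\"keywords\": [\"diarrhea\"]}\n" + fence
	parsed, err := parseLenient(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(parsed["keywords"])
}
```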

Next Ideas

Add retry with exponential backoff for transient 5xx.
Add optional json fallback if a provider rejects json_object.
Add streaming support.
Add integration test with recorded OpenRouter fixture.
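
As a starting point for the retry idea, the following is a rough sketch of exponential backoff around a call that may fail transiently. Everything here is hypothetical: errTransient stands in for however llm.go would classify a 5xx response, and the attempt count and base delay are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient marks failures worth retrying, such as an HTTP 5xx from the provider.
var errTransient = errors.New("transient upstream error")

// retry calls fn up to attempts times. After each transient failure it sleeps
// and doubles the delay; non-transient errors are returned immediately.
func retry(attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !errors.Is(err, errTransient) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return fmt.Errorf("status 503: %w", errTransient)
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A production version would likely add jitter and respect a context deadline; both are left out here to keep the sketch short.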

License

(Choose and add a LICENSE file if planning to open source.)