Troubleshooting
Overview
This page provides solutions to the most common issues you may encounter while deploying or operating Authenta On-Prem.
It covers installation, container runtime, RabbitMQ configuration, GPU setup, and update-related problems.
All issues can be diagnosed locally — no external internet or cloud dependencies are required.
1. Quick Diagnostic Checklist
Before diving into specific problems, verify the following:
| Check | Command | Expected Output |
|---|---|---|
| Docker running | systemctl status docker | Active: active (running) |
| Containers active | docker ps | RabbitMQ + ML task runner containers listed |
| Disk space | df -h | Sufficient free space (≥20 GB) |
| Shared volume exists | ls /opt/authenta/data | Directory accessible |
| RabbitMQ UI accessible | Visit http://localhost:15672 | Dashboard loads successfully |
If all checks pass but the issue persists, refer to the sections below.
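The checklist above can also be run in one pass. The following is a minimal sketch, not something shipped with Authenta: the helper names `check`, `has_free_gb`, `container_running`, and `run_checklist` are hypothetical, free space is measured on the filesystem holding /opt/authenta/data, and it assumes `curl` is installed and that the container names contain `rabbitmq` and `ml-task-runner`, as in the commands on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mirror the checklist table; adjust paths and names to your install.

# Run one check quietly and print PASS or FAIL followed by its label.
check() {
  label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    return 1
  fi
}

# Succeed if the filesystem holding path $1 has at least $2 GB available.
has_free_gb() {
  avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Succeed if the name of a running container contains $1.
container_running() {
  docker ps --format '{{.Names}}' | grep -q "$1"
}

run_checklist() {
  check "Docker running"         systemctl is-active --quiet docker
  check "RabbitMQ container"     container_running rabbitmq
  check "ML task runner"         container_running ml-task-runner
  check "Disk space >= 20 GB"    has_free_gb /opt/authenta/data 20
  check "Shared volume exists"   test -d /opt/authenta/data
  check "RabbitMQ UI accessible" curl -fsS -o /dev/null http://localhost:15672
}
```

Calling `run_checklist` prints one PASS or FAIL line per row of the table, which is convenient to paste into a support request.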
2. Docker & Container Issues
🐳 Containers Not Starting
Symptom:
docker compose up -d
# -> error: service failed to start
Possible Causes:
- Invalid Docker Compose syntax
- Missing shared volume directories
- Port conflicts with other services
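Two of these causes, missing directories and port conflicts, can be checked before starting the stack. A minimal sketch follows; the function names are hypothetical, the directories and ports are the ones used elsewhere on this page, and `lsof` is assumed to be installed (run it as root to see sockets owned by other users).

```shell
#!/usr/bin/env bash
# Print a line for every directory argument that does not exist.
missing_dirs() {
  for dir in "$@"; do
    [ -d "$dir" ] || echo "missing directory: $dir"
  done
  return 0
}

# Print a line for every TCP port argument that already has a listener.
ports_in_use() {
  for port in "$@"; do
    if lsof -i ":$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "port in use: $port"
    fi
  done
  return 0
}

# Assumed defaults: the shared volume paths and RabbitMQ ports from this page.
preflight() {
  missing_dirs /opt/authenta/data /opt/authenta/logs
  ports_in_use 5672 15672
}
```

Run `preflight` while the stack is stopped. If Authenta's own RabbitMQ container is already up, it will be reported as the process holding both ports.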
Resolution:
- Validate compose syntax:
docker compose config
- Ensure directories exist:
sudo mkdir -p /opt/authenta/data /opt/authenta/logs
- Check port usage:
sudo lsof -i :5672
sudo lsof -i :15672
⚙️ Containers Exit Immediately
Symptom:
docker ps -a
# shows ml-task-runner exited with code 1
Resolution: View logs for details:
docker logs ml-task-runner-gpu
Common causes:
- Missing environment variables
- Incorrect data directory mapping
- Corrupted Docker image
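The first cause, missing environment variables, can be checked without starting the container. This is a minimal sketch, assuming the variables live in a `.env` file in `NAME=value` form. `missing_env_vars` is a hypothetical helper, and the full list of variables the task runner requires is not given on this page, so pass the names from your own configuration.

```shell
#!/usr/bin/env bash
# Print each required variable that has no non-empty assignment in the env file.
# Usage: missing_env_vars <env-file> <NAME>...
missing_env_vars() {
  env_file="$1"; shift
  for name in "$@"; do
    # Match lines such as NAME=value; an empty value counts as missing.
    grep -Eq "^${name}=.+" "$env_file" 2>/dev/null || echo "$name"
  done
  return 0
}
```

For example, `missing_env_vars .env RABBITMQ_USER RABBITMQ_PASS` prints nothing when both variables are set and prints the name of each one that is absent or empty.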
Try:
docker compose down
docker compose pull
docker compose up -d
3. RabbitMQ Connectivity Issues
🧩 RabbitMQ Not Reachable
Symptom: Client app cannot connect to RabbitMQ (ECONNREFUSED or Connection reset).
Possible Causes:
- RabbitMQ container not running
- Wrong hostname or credentials
- Port 5672 blocked by firewall
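To tell these causes apart, it helps to probe from the container outward. The sketch below is one way to do that under stated assumptions: `port_open` and `diagnose_rabbitmq` are hypothetical names, the probe relies on bash's `/dev/tcp` feature and the coreutils `timeout` command, and the container name is assumed to contain `rabbitmq`.

```shell
#!/usr/bin/env bash
# Succeed if a TCP connection to host $1 on port $2 can be opened within 3 seconds.
port_open() {
  # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the wait.
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the most likely cause, checking the container first, then the port.
diagnose_rabbitmq() {
  host="${1:-localhost}"
  if ! docker ps --format '{{.Names}}' | grep -q rabbitmq; then
    echo "container not running"
  elif ! port_open "$host" 5672; then
    echo "port 5672 unreachable (firewall or port mapping)"
  else
    echo "port reachable: check hostname and credentials"
  fi
}
```

Run `diagnose_rabbitmq` on the Docker host, or pass the hostname your client uses, for example `diagnose_rabbitmq my-broker-host` (a placeholder name).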
Resolution:
- Verify container status:
docker ps | grep rabbitmq
- Access dashboard:
http://localhost:15672
- Check credentials in .env:
RABBITMQ_USER=user
RABBITMQ_PASS=pass
- Test AMQP connection:
telnet localhost 5672
- If needed, restart RabbitMQ:
docker compose restart rabbitmq
🚫 Queue Missing or Jobs Not Processed
Symptom: Tasks published, but not consumed by the ML task runner.
Resolution:
- Open the RabbitMQ dashboard → Queues tab
- Verify that a queue named task_queue exists
- If missing, create it manually:
docker exec -it rabbitmq rabbitmqadmin declare queue name=task_queue durable=true
- Confirm the task runner is subscribed (should show as 1 consumer)
- Restart if needed:
docker compose down && docker compose up -d
4. ML Task Runner & Model Issues
🧠 Model Not Loading
Symptom: Logs show:
{ "level": "error", "msg": "Model DF-1 not found" }
Resolution:
- Confirm image tag is correct (v1-gpu or v1-cpu)
- Re-pull container images:
docker compose pull
docker compose up -d
- Ensure correct profile:
docker compose --profile gpu up -d
⚡ GPU Not Detected
Symptom: Logs show:
{ "level": "warn", "msg": "No GPU available, switching to CPU" }
Possible Causes:
- Missing NVIDIA drivers
- NVIDIA Container Toolkit not installed
- Docker daemon not restarted after installation
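A quick way to see which of these prerequisites is missing is to look for the expected commands on the host. This is a minimal sketch with hypothetical helper names. It assumes the NVIDIA Container Toolkit installs the `nvidia-ctk` command on your distribution, which may differ for older toolkit versions.

```shell
#!/usr/bin/env bash
# Print the first command from the argument list that is not on PATH.
first_missing_tool() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool"; return 1; }
  done
}

# Check the driver CLI, the toolkit CLI, and Docker in that order.
gpu_prereqs() {
  missing=$(first_missing_tool nvidia-smi nvidia-ctk docker) || {
    echo "not installed: $missing"
    return 1
  }
  nvidia-smi >/dev/null 2>&1 || {
    echo "driver installed but not responding"
    return 1
  }
  echo "host prerequisites present; if containers still see no GPU, restart Docker"
}
```

`gpu_prereqs` only inspects the host. It does not replace the in-container GPU test, which confirms that Docker can actually pass the device through.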
Resolution:
- Verify drivers:
nvidia-smi
- Install toolkit:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
- Test GPU:
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi
- Restart Authenta:
docker compose down && docker compose --profile gpu up -d
5. Performance & Scaling Issues
🐢 Inference Too Slow
Possible Causes:
- CPU mode running instead of GPU
- Insufficient memory
- Heavy concurrent load
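The first cause can be confirmed from the logs: when the runner falls back to CPU it emits the warning shown under GPU Not Detected. This is a minimal sketch, assuming that message text is stable across versions. `cpu_fallback_count` is a hypothetical helper that reads log lines on standard input.

```shell
#!/usr/bin/env bash
# Count log lines on stdin that carry the CPU fallback warning.
cpu_fallback_count() {
  # grep -c prints the number of matching lines; "|| true" keeps a zero count from being an error.
  grep -c 'No GPU available, switching to CPU' || true
}
```

For example, `docker logs ml-task-runner-gpu 2>&1 | cpu_fallback_count` prints a non-zero number when the runner has switched to CPU at least once since it started.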
Resolution:
- Switch to GPU profile:
docker compose --profile gpu up -d
- Increase task runners:
docker compose up -d --scale ml-task-runner-gpu=3
- Monitor resources:
docker stats
6. Logs & Debugging Commands
Useful commands for investigating issues:
| Purpose | Command |
|---|---|
| View all containers | docker ps |
| View RabbitMQ logs | docker logs rabbitmq |
| View ML runner logs | docker logs ml-task-runner-gpu |
| Inspect container | docker inspect <container_name> |
| Check Docker network | docker network ls |
| Enter container shell | docker exec -it ml-task-runner-gpu /bin/bash |
| Validate compose | docker compose config |
| Restart all services | docker compose restart |
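Because the task runner log lines shown on this page are JSON objects with a `level` field, log output can be narrowed to one severity. This is a minimal sketch, assuming every log line follows that shape. `filter_level` is a hypothetical helper, not an Authenta command.

```shell
#!/usr/bin/env bash
# Keep only log lines on stdin whose "level" field equals $1 (default: error).
filter_level() {
  level="${1:-error}"
  # Allow optional spaces after the colon; "|| true" avoids a failure status when nothing matches.
  grep -E "\"level\": *\"${level}\"" || true
}
```

For example, `docker logs ml-task-runner-gpu 2>&1 | filter_level error` shows only error entries, and `filter_level warn` shows warnings.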
7. Contacting Support
If your issue persists:
- Collect logs:
docker compose logs > authenta_support_logs.txt
- Include:
  - OS version
  - Docker and Compose versions
  - Short issue description
- Send to: support@authenta.ai
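The version details requested above can be gathered into one file to attach alongside the logs. This is a minimal sketch. `write_env_report` and the default file name `authenta_support_env.txt` are hypothetical, and `/etc/os-release` is assumed to exist, as it does on most current Linux distributions.

```shell
#!/usr/bin/env bash
# Write OS, Docker, and Compose version details to the file named in $1.
write_env_report() {
  out="${1:-authenta_support_env.txt}"
  {
    echo "== OS =="
    uname -sr
    if [ -r /etc/os-release ]; then cat /etc/os-release; fi
    echo "== Docker =="
    docker --version 2>&1 || echo "docker not found"
    docker compose version 2>&1 || echo "docker compose not found"
  } > "$out"
}
```

Run `write_env_report` in the same directory as `authenta_support_logs.txt` and add the short issue description by hand before sending both files.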
Summary
| Issue Category | Typical Cause | Resolution |
|---|---|---|
| Containers not starting | Missing volumes / ports in use | Validate compose config |
| Queue not visible | Misconfiguration or missing queue | Recreate via dashboard |
| Jobs not processed | No active task runner | Restart container |
| No result output | Wrong volume mapping | Fix path + permissions |
| GPU not detected | Missing toolkit or driver | Reinstall and restart |
| ECR auth fails | Expired credentials | Re-login and retry |
| Slow performance | CPU mode or low resources | Use GPU / scale containers |
