Troubleshooting

Overview

This page provides solutions to the most common issues you may encounter while deploying or operating Authenta On-Prem.
It covers installation, container runtime, RabbitMQ configuration, GPU setup, and update-related problems.

All issues can be diagnosed locally; no internet access or cloud dependencies are required.

1. Quick Diagnostic Checklist

Before diving into specific problems, verify the following:

| Check | Command | Expected Output |
|---|---|---|
| Docker running | systemctl status docker | Active: active (running) |
| Containers active | docker ps | RabbitMQ + ML task runner containers listed |
| Disk space | df -h | Sufficient free space (≥20 GB) |
| Shared volume exists | ls /opt/authenta/data | Directory accessible |
| RabbitMQ UI accessible | Visit http://localhost:15672 | Dashboard loads successfully |

If all checks pass but the issue persists, refer to the sections below.
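
The checklist above can also be run in one pass. The following is a minimal bash sketch, not part of the Authenta tooling: the helper names (run_check, free_gb, has_free_space, quick_checks) are hypothetical, and it assumes curl is installed and that the path, port, and 20 GB threshold match the table.

```shell
# run_check LABEL COMMAND...: runs COMMAND quietly and prints one result line.
run_check() {
  local label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    return 1
  fi
}

# free_gb PATH: prints the free space on PATH's filesystem in whole GB.
free_gb() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# has_free_space PATH MIN_GB: succeeds if at least MIN_GB is free.
has_free_space() {
  [ "$(free_gb "$1")" -ge "$2" ]
}

# quick_checks: runs every checklist item; returns non-zero if any fails.
quick_checks() {
  local failed=0
  run_check "Docker running"        systemctl is-active --quiet docker || failed=1
  run_check "Docker daemon answers" docker ps                          || failed=1
  run_check "Disk space >= 20 GB"   has_free_space / 20                || failed=1
  run_check "Shared volume exists"  test -d /opt/authenta/data         || failed=1
  run_check "RabbitMQ UI reachable" curl -fsS -o /dev/null http://localhost:15672 || failed=1
  return "$failed"
}
```

Source the file and call quick_checks. Because docker ps succeeds even when no containers are running, still confirm by eye that the RabbitMQ and ML task runner containers are listed.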

2. Docker & Container Issues

🐳 Containers Not Starting

Symptom:
docker compose up -d
# -> error: service failed to start
Possible Causes:
  • Invalid Docker Compose syntax
  • Missing shared volume directories
  • Port conflicts with other services
Resolution:
  1. Validate compose syntax:
docker compose config
  2. Ensure directories exist:
sudo mkdir -p /opt/authenta/data /opt/authenta/logs
  3. Check port usage:
sudo lsof -i :5672
sudo lsof -i :15672
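
To make the port check easier to read, the lsof calls can be wrapped in a small helper. This is only a sketch: port_listeners and report_ports are hypothetical names, and the function must run as root (for example via sudo bash) to see processes owned by other users.

```shell
# port_listeners PORT: prints the PIDs listening on TCP PORT (nothing if free).
port_listeners() {
  lsof -t -i "TCP:$1" -sTCP:LISTEN 2>/dev/null
}

# report_ports PORT...: prints one status line per port.
report_ports() {
  local port pids
  for port in "$@"; do
    pids="$(port_listeners "$port" | tr '\n' ' ')"
    if [ -n "$pids" ]; then
      echo "port $port in use by PID(s): ${pids% }"
    else
      echo "port $port is free"
    fi
  done
}
```

For example, report_ports 5672 15672. If the stack is already up, the RabbitMQ container itself is expected to hold both ports, so run the check after docker compose down when hunting for a conflict.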

⚙️ Containers Exit Immediately

Symptom:
docker ps -a
# shows ml-task-runner exited with code 1

Resolution: View logs for details:

docker logs ml-task-runner-gpu

Common causes:

  • Missing environment variables
  • Incorrect data directory mapping
  • Corrupted Docker image

Try:

docker compose down
docker compose pull
docker compose up -d
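
Before re-pulling, it can help to see the exit code and last log lines of every stopped container at once. A minimal sketch (exited_report is a hypothetical helper; the 20-line tail is an arbitrary choice):

```shell
# exited_report: for each exited container, prints its exit code and log tail.
exited_report() {
  local name code
  docker ps -a --filter status=exited --format '{{.Names}}' | while read -r name; do
    [ -n "$name" ] || continue
    code="$(docker inspect --format '{{.State.ExitCode}}' "$name")"
    echo "== $name (exit code $code) =="
    docker logs --tail 20 "$name" 2>&1
  done
}
```

A message about a missing environment variable or an unreadable path in this output usually points to one of the first two causes above, in which case re-pulling the image will not help.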

3. RabbitMQ Connectivity Issues

🧩 RabbitMQ Not Reachable

Symptom: Client app cannot connect to RabbitMQ (ECONNREFUSED or Connection reset).

Possible Causes:
  • RabbitMQ container not running
  • Wrong hostname or credentials
  • Port 5672 blocked by firewall
Resolution:
  1. Verify container status:
docker ps | grep rabbitmq
  2. Access dashboard:
http://localhost:15672
  3. Check credentials in .env:
RABBITMQ_USER=user
RABBITMQ_PASS=pass
  4. Test AMQP connection:
telnet localhost 5672
  5. If needed, restart RabbitMQ:
docker compose restart rabbitmq
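
Where telnet is not installed, the same reachability test can be done with bash alone. The sketch below is an illustration with hypothetical helper names (tcp_port_open, broker_alive, diagnose_rabbitmq); it assumes the container is named rabbitmq, as in the commands on this page, and that the image ships rabbitmq-diagnostics, which the official RabbitMQ images do.

```shell
# tcp_port_open HOST PORT: succeeds if a TCP connection opens within 3 seconds.
tcp_port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# broker_alive: asks the broker inside the container whether it is running.
broker_alive() {
  docker exec rabbitmq rabbitmq-diagnostics -q ping >/dev/null 2>&1
}

# diagnose_rabbitmq [HOST] [PORT]: narrows the failure down to one layer.
diagnose_rabbitmq() {
  local host="${1:-localhost}" port="${2:-5672}"
  if tcp_port_open "$host" "$port"; then
    echo "TCP $host:$port reachable: check hostname and credentials in .env next"
  elif broker_alive; then
    echo "broker is up but $host:$port is unreachable: check port mapping and firewall"
  else
    echo "broker is not responding: check the rabbitmq container"
  fi
}
```

An open port only proves that something accepts TCP connections; wrong credentials still have to be ruled out against the values in .env.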

🚫 Queue Missing or Jobs Not Processed

Symptom: Tasks published, but not consumed by the ML task runner.

Resolution:
  1. Open the RabbitMQ dashboard → Queues tab
  2. Verify that a queue named task_queue exists
  3. If missing, create it manually:
docker exec -it rabbitmq rabbitmqadmin declare queue name=task_queue durable=true
  4. Confirm the task runner is subscribed (should show as 1 consumer)
  5. Restart if needed:
docker compose down && docker compose up -d
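
The queue and consumer check can also be done from the command line instead of the dashboard. A minimal sketch, assuming the container is named rabbitmq and the queue lives in the default vhost; queue_consumers and check_queue are hypothetical helper names.

```shell
# queue_consumers QUEUE: prints the consumer count, or nothing if QUEUE is absent.
queue_consumers() {
  docker exec rabbitmq rabbitmqctl -q list_queues name consumers |
    awk -v q="$1" '$1 == q { print $2 }'
}

# check_queue QUEUE: prints a one-line diagnosis for QUEUE.
check_queue() {
  local n
  n="$(queue_consumers "$1")"
  if [ -z "$n" ]; then
    echo "queue $1 is missing"
  elif [ "$n" -eq 0 ]; then
    echo "queue $1 exists but has no consumers"
  else
    echo "queue $1 has $n consumer(s)"
  fi
}
```

Running check_queue task_queue distinguishes the two symptoms in this section: a missing queue needs to be declared, while a queue with zero consumers points at the task runner.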

4. ML Task Runner & Model Issues

🧠 Model Not Loading

Symptom: Logs show:

{ "level": "error", "msg": "Model DF-1 not found" }
Resolution:
  1. Confirm image tag is correct (v1-gpu or v1-cpu)
  2. Re-pull container images:
docker compose pull
docker compose up -d
  3. Ensure correct profile:
docker compose --profile gpu up -d
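
To confirm which tag a running container was actually created from, the image reference can be read back from Docker. This is a sketch with a hypothetical helper name (image_tag); it assumes the image is referenced by tag rather than by digest.

```shell
# image_tag CONTAINER: prints the tag of the image the container was created from.
image_tag() {
  local image last
  image="$(docker inspect --format '{{.Config.Image}}' "$1")" || return 1
  last="${image##*/}"            # drop registry host (and port) and repository path
  case "$last" in
    *:*) echo "${last##*:}" ;;
    *)   echo "latest" ;;        # no explicit tag means Docker used :latest
  esac
}
```

For example, image_tag ml-task-runner-gpu should print v1-gpu on a GPU deployment; anything else suggests the compose file or profile is pointing at the wrong image.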

⚡ GPU Not Detected

Symptom: Logs show:

{ "level": "warn", "msg": "No GPU available, switching to CPU" }
Possible Causes:
  • Missing NVIDIA drivers
  • NVIDIA Container Toolkit not installed
  • Docker daemon not restarted after installation
Resolution:
  1. Verify drivers:
nvidia-smi
  2. Install toolkit:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
  3. Test GPU:
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
  4. Restart Authenta:
docker compose down && docker compose --profile gpu up -d
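
The host-side causes can be checked in order before starting any container. A minimal sketch: gpu_preflight is a hypothetical helper, and it treats the presence of the nvidia-ctk binary as a sign that the NVIDIA Container Toolkit is installed, which holds for recent toolkit packages but is an assumption for older ones.

```shell
# gpu_preflight: checks the driver, then the toolkit, and reports the first gap.
gpu_preflight() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: install the NVIDIA driver"
    return 1
  fi
  if ! nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi failed: the driver is installed but cannot reach a GPU"
    return 1
  fi
  if ! command -v nvidia-ctk >/dev/null 2>&1; then
    echo "nvidia-ctk not found: install nvidia-container-toolkit, then restart Docker"
    return 1
  fi
  echo "driver and toolkit present: run the container GPU test next"
}
```

If the preflight passes but the container test still reports no GPU, NVIDIA's installation guide also lists sudo nvidia-ctk runtime configure --runtime=docker followed by a Docker restart; whether that step is needed depends on how Docker was set up on the host.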

5. Performance & Scaling Issues

🐢 Inference Too Slow

Possible Causes:
  • CPU mode running instead of GPU
  • Insufficient memory
  • Heavy concurrent load
Fix:
  1. Switch to GPU profile:
docker compose --profile gpu up -d
  2. Increase task runners:
docker compose --profile gpu up -d --scale ml-task-runner-gpu=3
  3. Monitor resources:
docker stats
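
Before scaling, it is worth confirming whether the runner silently fell back to CPU. The sketch below greps the container log for the warning quoted in the GPU section; the helper names are hypothetical, and the match string assumes the log message is worded exactly as shown on this page.

```shell
# uses_cpu_fallback CONTAINER: succeeds if the container logged the CPU fallback warning.
uses_cpu_fallback() {
  docker logs "$1" 2>&1 | grep -q 'No GPU available'
}

# resource_snapshot: one-shot CPU and memory usage per container.
resource_snapshot() {
  docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'
}

# slow_inference_hint CONTAINER: suggests where to look first.
slow_inference_hint() {
  if uses_cpu_fallback "$1"; then
    echo "$1 fell back to CPU: fix GPU detection before scaling"
  else
    echo "$1 did not log a CPU fallback: check load and memory"
  fi
}
```

Adding more runners that all run on CPU rarely helps, so slow_inference_hint ml-task-runner-gpu is a cheap first step; resource_snapshot then shows whether memory or CPU is the limit.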

6. Logs & Debugging Commands

Useful commands for investigating issues:

| Purpose | Command |
|---|---|
| View all containers | docker ps |
| View RabbitMQ logs | docker logs rabbitmq |
| View ML runner logs | docker logs ml-task-runner-gpu |
| Inspect container | docker inspect <container_name> |
| Check Docker network | docker network ls |
| Enter container shell | docker exec -it ml-task-runner-gpu /bin/bash |
| Validate compose | docker compose config |
| Restart all services | docker compose restart |

7. Contacting Support

If your issue persists:

  1. Collect logs:
docker compose logs > authenta_support_logs.txt
  2. Include:
    • OS version
    • Docker and Compose versions
    • Short issue description
  3. Send to: support@authenta.ai
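
The items above can be gathered into a single file. A minimal sketch, with a hypothetical helper name (collect_support_info) and section headings chosen for illustration; run it from the directory that contains the compose file.

```shell
# collect_support_info [OUTFILE]: writes OS, Docker, and log details to OUTFILE.
collect_support_info() {
  local out="${1:-authenta_support_logs.txt}"
  {
    echo "## OS"
    uname -a
    [ -r /etc/os-release ] && cat /etc/os-release
    echo "## Docker"
    docker --version
    docker compose version
    echo "## Logs"
    docker compose logs --no-color 2>&1
  } > "$out"
  echo "wrote $out"
}
```

Review the file before sending it, since container logs can contain credentials or file paths, and add the short issue description by hand.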

Summary

| Issue Category | Typical Cause | Resolution |
|---|---|---|
| Containers not starting | Missing volumes / ports in use | Validate compose config |
| Queue not visible | Misconfiguration or missing queue | Recreate via dashboard |
| Jobs not processed | No active task runner | Restart container |
| No result output | Wrong volume mapping | Fix path + permissions |
| GPU not detected | Missing toolkit or driver | Reinstall and restart |
| ECR auth fails | Expired credentials | Re-login and retry |
| Slow performance | CPU mode or low resources | Use GPU / scale containers |