Remote ML Detection Server =========================== ``pyzm.serve`` is a built-in FastAPI server that loads ML models once and serves detection requests over HTTP. This lets you offload GPU-heavy inference to a dedicated machine. .. code-block:: text URL mode (default) GPU box +-----------------+ frame URLs +------------------+ | zm_detect.py | ----------------> | pyzm.serve | | Detector( | | fetch from ZM | | gateway=...) | <---------------- | detect & return | +-----------------+ DetectionResult +------------------+ | +------v-----------+ | ZoneMinder API | +------------------+ Image mode (gateway_mode="image") GPU box +-----------------+ HTTP/JPEG +------------------+ | zm_detect.py | ----------------> | pyzm.serve | | Detector( | | YOLO11 (GPU) | | gateway=..., | <---------------- | Coral TPU | | gateway_mode= | DetectionResult +------------------+ | "image") | +-----------------+ Two detection modes are available: - **URL mode** (default) -- the client sends frame URLs to the ``/detect_urls`` endpoint and the *server* fetches images directly from ZoneMinder. This avoids transferring every frame through the client. - **Image mode** -- the client fetches frames from ZM, JPEG-encodes them, and uploads each one to the ``/detect`` endpoint. Use this when the server cannot reach ZoneMinder directly. .. list-table:: URL mode vs Image mode trade-offs :header-rows: 1 :widths: 30 35 35 * - - URL mode (default) - Image mode * - Network requirement - Server must reach ZoneMinder - Only client needs ZM access * - Bandwidth - Low — client sends only URLs - Higher — client uploads JPEG per frame * - Latency - Server fetches from ZM (one extra hop) - Single client → server transfer * - Security - ZM credentials forwarded via ``zm_auth`` - Images leave ZM network * - Configuration - ``gateway_mode`` omitted or ``"url"`` - ``gateway_mode="image"`` (Python) or ``ml_gateway_mode: "image"`` (YAML) * - Best for - Same network / VPN between server and ZM - Server on a different network or cloud **When to choose Image mode:** Use Image mode when the GPU server cannot reach the ZoneMinder API directly (e.g., server is in the cloud, or firewall rules prevent it). The client handles frame fetching and uploads JPEG images to ``/detect``. **When to stay with URL mode (default):** Use URL mode when the server and ZoneMinder are on the same network. This minimises bandwidth on the client side and lets the server fetch only the frames it needs. Deployment scenarios --------------------- Scenario 1: ZM + EventServerNg + hooks + pyzm (same box) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Everything runs on the same machine. The ZoneMinder EventServerNg (zmesNg) triggers hook scripts which call ``zm_detect.py``, and detection runs locally via the ``Detector`` class. .. code-block:: text ZoneMinder --> zmeventnotification.pl (zmesNg) | v zm_event_start.sh | v zm_detect.py --> Detector (local GPU/CPU) **objectconfig.yml** (no ``remote`` section needed): .. code-block:: yaml ml: ml_sequence: general: model_sequence: "object" object: general: pattern: "(person|car)" sequence: - name: YOLO11s object_weights: "/var/lib/zmeventnotification/models/ultralytics/yolo11s.onnx" object_labels: "/var/lib/zmeventnotification/models/yolov4/coco.names" object_framework: opencv object_processor: gpu **Test locally:** .. code-block:: bash sudo -u www-data /opt/zoneminder/venv/bin/python /path/to/zm_detect.py \ --config /etc/zm/objectconfig.yml \ --eventid 12345 Scenario 2: ZM + hooks + pyzm (same box, no zmesNg) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Same as Scenario 1 but without EventServerNg. ZoneMinder calls ``zm_detect.py`` directly via its ``EventStartCommand`` / ``EventEndCommand`` recording settings. .. code-block:: text ZoneMinder EventStartCommand --> zm_detect.py --> Detector (local) **ZoneMinder Console -> Click on Monitor Source -> Recording:** .. code-block:: text EventStartCommand = /opt/zoneminder/venv/bin/python /path/to/zm_detect.py -c /etc/zm/objectconfig.yml -e %EID% -m %MID% -r "%EC%" -n **objectconfig.yml** is the same as Scenario 1. Scenario 3: ZM box + remote GPU box (split architecture) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Detection runs on a separate GPU machine. The ZM box runs ``zm_detect.py`` which sends requests to the remote ``pyzm.serve`` server over HTTP. .. code-block:: text ZM box GPU box +-------------------+ +------------------------+ | zm_detect.py | HTTP | pyzm.serve | | Detector( | ------------> | --models all | | gateway=...) | | --processor gpu | | | <------------ | --port 5000 | +-------------------+ JSON result +------------------------+ **GPU box setup:** .. code-block:: bash pip install pyzm[serve] python -m pyzm.serve --models all --processor gpu --port 5000 Or with specific models and auth: .. code-block:: bash python -m pyzm.serve \ --models yolo11s yolo26s \ --processor gpu \ --port 5000 \ --auth --auth-user admin --auth-password secret \ --token-secret my-jwt-secret **ZM box objectconfig.yml:** .. code-block:: yaml ml: ml_sequence: general: model_sequence: "object" ml_gateway: "http://192.168.1.100:5000" # ml_gateway_mode: "image" # uncomment if server can't reach ZM directly ml_fallback_local: "yes" object: general: pattern: "(person|car)" sequence: - object_framework: opencv object_weights: "/var/lib/zmeventnotification/models/ultralytics/yolo11s.onnx" By default, URL mode is used -- the server fetches frames directly from ZM. Set ``ml_gateway_mode: "image"`` if the server cannot reach ZoneMinder (the client will JPEG-encode and upload frames instead). Available models ----------------- Model names and discovery ~~~~~~~~~~~~~~~~~~~~~~~~~~ Model names passed via ``--models`` (or ``Detector(models=[...])``) are resolved against ``--base-path`` on disk. There are no hardcoded presets -- any name you pass is looked up as follows: 1. **Directory match** -- ``base_path//`` containing a weight file 2. **File stem match** -- any ``.onnx``, ``.weights``, or ``.tflite`` in any subdirectory of ``base_path`` The framework is inferred from the file extension: - ``.onnx`` -- OpenCV DNN (ONNX runtime) - ``.weights`` -- OpenCV DNN (Darknet format, also needs a ``.cfg`` file) - ``.tflite`` -- Coral Edge TPU runtime (processor forced to ``tpu``) Label files are auto-detected from the same directory (``.names``, ``.txt``, ``.labels``). For Darknet models, ``.cfg`` files are also discovered automatically. The ``--processor`` flag (``cpu``, ``gpu``, ``tpu``) applies to all discovered models (except ``.tflite`` which always uses ``tpu``). ``--models all`` (lazy loading) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When you pass ``--models all`` to the server, **every** model in ``--base-path`` is discovered and registered, but weights are **not** loaded into memory at startup. Instead, each backend loads its weights on the first request that uses it. This is useful when you have many models but don't want to consume GPU memory for all of them upfront. .. code-block:: bash python -m pyzm.serve --models all --base-path /data/models --processor gpu Use the ``GET /models`` endpoint to check which models are available and whether their weights have been loaded. Model directory layout ~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: text /var/lib/zmeventnotification/models/ +-- yolov4/ | +-- yolov4.weights | +-- yolov4.cfg | +-- coco.names +-- ultralytics/ | +-- yolo11s.onnx | +-- yolo11n.onnx | +-- yolo26s.onnx +-- coral_edgetpu/ | +-- ssd_mobilenet_v2.tflite | +-- coco_labels.txt Server setup ------------- Installation ~~~~~~~~~~~~~ The ``[serve]`` extra automatically includes all ML dependencies. .. code-block:: bash pip install "pyzm[serve]" CLI options ~~~~~~~~~~~~ .. list-table:: :header-rows: 1 :widths: 25 15 60 * - Flag - Default - Description * - ``--host`` - ``0.0.0.0`` - Bind address * - ``--port`` - ``5000`` - Bind port * - ``--models`` - ``yolo11s`` - Model names (space-separated). Use ``all`` to auto-discover every model in ``--base-path`` (loaded lazily on first request). * - ``--base-path`` - ``/var/lib/zmeventnotification/models`` - Directory containing model subdirectories * - ``--processor`` - ``cpu`` - ``cpu``, ``gpu``, or ``tpu`` * - ``--auth`` - off - Enable JWT authentication * - ``--auth-user`` - ``admin`` - Username (when auth enabled) * - ``--auth-password`` - (empty) - Password (when auth enabled) * - ``--token-secret`` - ``change-me`` - Secret key used to sign JWT tokens. **Change this in production.** * - ``--debug`` - off - Enable debug logging for both pyzm and uvicorn * - ``--config`` - (none) - Path to a YAML config file (``ServerConfig``). Overrides CLI flags. YAML config file ~~~~~~~~~~~~~~~~~ Instead of CLI flags, you can provide a YAML config file via ``--config``: .. code-block:: bash python -m pyzm.serve --config /etc/pyzm/serve.yml Example ``serve.yml``: .. code-block:: yaml host: "0.0.0.0" port: 5000 models: - yolo11s - yolo26s base_path: "/var/lib/zmeventnotification/models" processor: gpu auth_enabled: true auth_username: admin auth_password: "my-secret-password" token_secret: "a-strong-random-secret" token_expiry_seconds: 3600 All fields correspond to ``ServerConfig`` attributes. When using ``--config``, CLI flags are ignored. Client usage ------------- Using the ``Detector`` API: .. code-block:: python from pyzm import Detector # URL mode (default) -- server fetches frames from ZM detector = Detector(models=["yolo11s"], gateway="http://gpu-box:5000") # Image mode -- client uploads JPEG-encoded frames # detector = Detector(models=["yolo11s"], gateway="http://gpu-box:5000", # gateway_mode="image") # With authentication: # detector = Detector(models=["yolo11s"], gateway="http://gpu-box:5000", # gateway_username="admin", gateway_password="secret") # detect() always uploads the image (single-image mode) result = detector.detect("/path/to/image.jpg") print(result.summary) # detect_event() uses URL mode by default -- sends frame URLs, # server fetches them directly from ZM result = detector.detect_event(zm_client, event_id=12345, stream_config=stream_cfg) URL mode only applies to ``detect_event()`` calls. Single-image ``detect()`` calls always upload the image regardless of this setting. Using ``from_dict()`` ~~~~~~~~~~~~~~~~~~~~~~ The ``ml_gateway`` key in the ``general`` section of ``ml_options`` automatically enables remote mode: .. code-block:: python ml_options = { "general": { "model_sequence": "object", "ml_gateway": "http://gpu-box:5000", # "ml_gateway_mode": "image", # uncomment if server can't reach ZM # "ml_user": "admin", # "ml_password": "secret", }, "object": { "general": {"pattern": ".*"}, "sequence": [...], }, } detector = Detector.from_dict(ml_options) result = detector.detect(image) Authentication --------------- When the server is started with ``--auth``, clients must first obtain a JWT token via ``/login``, then pass it as a Bearer token on subsequent requests. The ``Detector`` gateway mode handles this automatically. The ``--token-secret`` flag controls the secret key used to sign JWT tokens. **Always set this to a strong random value in production.** The default (``change-me``) is insecure. Manual flow: .. code-block:: bash # Login TOKEN=$(curl -s -X POST http://gpu-box:5000/login \ -H 'Content-Type: application/json' \ -d '{"username":"admin","password":"secret"}' \ | jq -r .access_token) # Detect curl -X POST http://gpu-box:5000/detect \ -H "Authorization: Bearer $TOKEN" \ -F file=@/path/to/image.jpg Tokens expire after ``token_expiry_seconds`` (default 3600), configurable via the YAML config file. API reference -------------- ``GET /health`` ~~~~~~~~~~~~~~~~ Health check. Returns: .. code-block:: json {"status": "ok", "models_loaded": true} ``GET /models`` ~~~~~~~~~~~~~~~~ Returns the list of available models and their load status. Useful with ``--models all`` to check which backends have been lazily loaded. .. code-block:: json { "models": [ {"name": "yolo11s", "type": "object", "framework": "opencv", "loaded": true}, {"name": "yolo26s", "type": "object", "framework": "opencv", "loaded": false} ] } ``POST /detect`` ~~~~~~~~~~~~~~~~~ Run detection on an uploaded image (image mode). - **Content-Type:** ``multipart/form-data`` - **Parameters:** - ``file`` (required) -- JPEG/PNG image - ``zones`` (optional) -- JSON string of zone list - **Auth:** Bearer token (when auth enabled) - **Returns:** ``DetectionResult`` as JSON (image field excluded) ``POST /detect_urls`` ~~~~~~~~~~~~~~~~~~~~~~ Run detection on images fetched from URLs (URL mode). - **Content-Type:** ``application/json`` - **Body:** .. code-block:: json { "urls": [ {"frame_id": "snapshot", "url": "https://zm.example.com/zm/index.php?view=image&eid=123&fid=snapshot"}, {"frame_id": "1", "url": "https://zm.example.com/zm/index.php?view=image&eid=123&fid=1"} ], "zm_auth": "token=abc123...", "zones": [{"name": "driveway", "value": [[0,0],[100,0],[100,100],[0,100]]}], "verify_ssl": false } - **Auth:** Bearer token (when auth enabled) - **Returns:** ``DetectionResult`` as JSON (best frame selected by ``frame_strategy``) The server appends ``zm_auth`` to each URL and fetches the image via HTTP GET (10-second timeout per URL; failures are logged and skipped). Frame strategy (``first``, ``first_new``, ``most``, ``most_unique``, ``most_models``) is applied server-side to pick the best result. .. note:: The ``zones`` field accepts both ``"value"`` and ``"points"`` as the key for polygon coordinates (they are interchangeable in both ``/detect`` and ``/detect_urls``). ``POST /login`` ~~~~~~~~~~~~~~~~ Obtain a JWT token. This endpoint is always registered, even when ``--auth`` is not enabled (so clients with pre-configured credentials don't get a 404). When auth is disabled, any credentials are accepted. - **Content-Type:** ``application/json`` - **Body:** ``{"username": "...", "password": "..."}`` - **Returns:** ``{"access_token": "...", "expires": 3600}`` objectconfig.yml remote section --------------------------------- In a ZoneMinder event notification setup, configure the remote gateway in ``objectconfig.yml``: .. code-block:: yaml ml_sequence: general: model_sequence: "object" ml_gateway: "http://gpu-box:5000" # ml_gateway_mode: "image" # uncomment if server can't reach ZM ml_user: "admin" ml_password: "secret" ml_fallback_local: yes object: general: pattern: "(person|car)" sequence: - object_framework: opencv object_weights: /path/to/yolo11s.onnx object_labels: /path/to/coco.names When ``ml_gateway`` is set, detection requests are sent to the remote server. URL mode is the default -- the server fetches frames directly from ZM. Set ``ml_gateway_mode: "image"`` if the server cannot reach ZoneMinder. If ``ml_fallback_local`` is ``yes`` and the remote server is unreachable, detection falls back to local inference.