Introduction

The NAC system handles user authentication and authorization, so it must be highly available: if it fails, users and endpoints cannot connect to the network.

Eltex-NAICE implements high availability using an Active-Active scheme with a VRRP address. This allows a single RADIUS server to be used in the configuration of network devices and provides redundancy for devices that do not support specifying multiple RADIUS servers. To configure the high-availability scheme, four virtual (or physical) servers must be allocated: two for running NAICE services and two for running the PostgreSQL database, which is responsible for storing system data.

Installing or upgrading a high-availability configuration on a host previously used for operating the system in single-host mode is not allowed.

General high-availability scheme

  • High availability for NAICE is implemented using an Active-Active scheme. Network devices can send RADIUS requests to any NAICE server. This requires configuring either a single radius-server host instance with the VIP address on the network equipment, or two radius-server host instances using the actual NAICE server addresses.
  • In addition to the IP addresses of the NAICE servers, a VIP address is used. It is reserved via the VRRP protocol using the keepalived service. This address may also be used by network devices for RADIUS traffic exchange, allowing the configuration of only one radius-server host instance. The address is also used for administrative access, including access to the web management interface.
  • PostgreSQL database high availability is implemented using Replication Manager (repmgr). Replication is configured on two nodes, which operate in the Primary and Standby roles.
  • NAICE service database connection settings must include both database addresses.

    To license a high-availability scheme, two licenses are required, one for each NAICE host. Each license must have its own Product ID while sharing the same license key.


Server system requirements

System requirements for the servers are described in the “High-availability deployment” section of v1.1_3.1 System requirements.

Installation

Both online and offline installation methods are supported.

Online installation is supported on all operating systems listed as supported and is described below.

Offline installation in an isolated network is described in section v1.1_3.4.1 High-availability installation in an isolated network (using VRRP).

Online installation

To perform an online installation, the target hosts must have direct Internet access. Using a proxy server or any mechanism that modifies certificates of destination websites accessed during installation is not allowed.

You must specify IP addresses of the target servers during installation. Using domain names is not permitted.

The installation is performed using two Ansible playbooks:

  • The PostgreSQL cluster is installed using the playbook install-postgres-cluster.yml.
  • The NAICE services and the keepalived service are installed using the playbook reservation-naice-services.yml.

Preparation for installation 

The addresses of the target hosts on which the installation will be executed are defined in the inventory/hosts-cluster.yml file.

For PostgreSQL, set the addresses in the postgres-cluster section:

# Host group for postgres-cluster installation (primary + standby)
postgres-cluster:
  hosts:
    node_primary:
      ansible_host: <IP address of PostgreSQL node 1>
      ansible_port: 22
      ansible_user: <username>
      ansible_ssh_pass: <password>
      ansible_become_password: <sudo password>
      forwarded_postgresql_port: 5432
      forwarded_ssh_port: 15432
    node_standby:
      ansible_host: <IP address of PostgreSQL node 2>
      ansible_port: 22
      ansible_user: <username>
      ansible_ssh_pass: <password>
      ansible_become_password: <sudo password>
      forwarded_postgresql_port: 5432
      forwarded_ssh_port: 15432

To install NAICE services with high availability, you must specify the addresses in the reservation section:

# Host group for NAICE high-availability installation
reservation:
  hosts:
    master_host:
      ansible_host: <IP address of NAICE host 1>
      ansible_port: 22
      ansible_user: <username>
      ansible_ssh_pass: <password>
      ansible_become_password: <sudo password>
      keepalived_interface: <interface for VIP address, e.g. eth0>

    backup_host:
      ansible_host: <IP address of NAICE host 2>
      ansible_port: 22
      ansible_user: <username>
      ansible_ssh_pass: <password>
      ansible_become_password: <sudo password>
      keepalived_interface: <interface for VIP address, e.g. eth0>
  vars:
    keepalived_vip: <VIP address, without mask, e.g. 192.168.0.11>

When performing an online installation, it is not required to specify access credentials for the host from which the playbook is executed in the Local actions section. This section is used only when performing installation in an isolated environment.
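Because domain names are not permitted in the inventory, it can help to check the ansible_host values before running the playbooks. The following is a minimal sketch in POSIX shell, not part of the NAICE distribution: the function names is_ipv4 and find_non_ip_hosts are hypothetical, only IPv4 addresses are assumed, and the parsing is a plain text match that does not handle quoted values or trailing comments.

```shell
# Returns 0 if the argument is a dotted-quad IPv4 address, 1 otherwise.
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }
    '
}

# Prints every ansible_host value in the given inventory file that is not
# an IPv4 address. Returns 1 if at least one such value was found.
find_non_ip_hosts() {
    bad=$(sed -n 's/^[[:space:]]*ansible_host:[[:space:]]*//p' "$1" |
        while IFS= read -r addr; do
            is_ipv4 "$addr" || printf '%s\n' "$addr"
        done)
    [ -z "$bad" ] && return 0
    printf '%s\n' "$bad"
    return 1
}
```

A possible call is find_non_ip_hosts inventory/hosts-cluster.yml. Any value it prints, including an unfilled placeholder, should be replaced with an IP address before installation.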

Installing the PostgreSQL database cluster

Run the playbook:

ansible-playbook install-postgres-cluster.yml -i inventory/hosts-cluster.yml

As a result, PostgreSQL will be installed as a cluster on the servers specified in node_primary and node_standby. The Primary node of the cluster will be located on the node_primary host.

Installing the NAICE cluster

Before starting the installation, make sure that the Primary role belongs to the PostgreSQL node whose address is set in the ansible_host variable of node_primary. If it does not, switch the Primary role to that node first; otherwise the installation cannot be completed.
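One generic way to see which node currently holds the Primary role is the PostgreSQL function pg_is_in_recovery(), which returns f on a primary and t on a standby. The sketch below only interprets that flag. The function name pg_role_from_flag is hypothetical, the database name ursus is taken from the JDBC URL example in this document, and the database user is a placeholder; the cluster's own tooling may offer a more direct check.

```shell
# Maps the result of "SELECT pg_is_in_recovery()" to a cluster role.
# PostgreSQL prints "f" on a primary and "t" on a standby.
pg_role_from_flag() {
    case "$1" in
        f) echo "Primary" ;;
        t) echo "Standby" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage (not executed here; the user name is a placeholder):
#   flag=$(psql -h <IP address of PostgreSQL node 1> -U <db user> -d ursus \
#          -Atc 'SELECT pg_is_in_recovery()')
#   pg_role_from_flag "$flag"
```

If the node set in node_primary reports Standby, the Primary role must be switched before the NAICE playbook is run.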

Both database addresses are specified in the NAICE service database connection settings, but data can be written only through the Primary node. For this reason, the targetServerType parameter is mandatory in the URL.
Example:

URSUS_POSTGRES_JDBC_URL:jdbc:postgresql://192.168.0.101:5432,192.168.0.102:5432/ursus?targetServerType=preferPrimary

The database access addresses are taken from the ansible_host values under the postgres-cluster section in the hosts-cluster.yml file.
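To illustrate how the URL is composed from the two ansible_host values, here is a minimal sketch. The helper name build_jdbc_url is hypothetical and is not what the playbook uses internally; the default database name ursus and port 5432 are taken from the example above.

```shell
# Builds a multi-host PostgreSQL JDBC URL from two node addresses.
# Arguments: node 1 address, node 2 address, [database name], [port].
build_jdbc_url() {
    node1="$1"
    node2="$2"
    db_name="${3:-ursus}"
    port="${4:-5432}"
    printf 'jdbc:postgresql://%s:%s,%s:%s/%s?targetServerType=preferPrimary\n' \
        "$node1" "$port" "$node2" "$port" "$db_name"
}

build_jdbc_url 192.168.0.101 192.168.0.102
# prints jdbc:postgresql://192.168.0.101:5432,192.168.0.102:5432/ursus?targetServerType=preferPrimary
```

Listing both hosts lets the JDBC driver try each address in turn, and targetServerType=preferPrimary makes it prefer the node that accepts writes.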

To start the installation, run the playbook:

ansible-playbook reservation-naice-services.yml -i inventory/hosts-cluster.yml

Checking the NAICE cluster state

After the installation is completed, one of the NAICE cluster nodes will take the VRRP master role and bring up the VIP address on its interface. To determine which node currently holds the VIP, run the following command on each node:

ip address show dev <interface name specified in the keepalived_interface variable>
Example of VRRP MASTER output
$ ip address show dev eth2
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 02:00:a5:a1:b2:ce brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname ens5
    inet 192.168.0.101/24 brd 192.168.0.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet 192.168.0.103/32 scope global eth2:NAICE
       valid_lft forever preferred_lft forever
    inet6 fe80::a5ff:fea1:b2ce/64 scope link
       valid_lft forever preferred_lft forever
Example of VRRP BACKUP output
$ ip address show dev eth2
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 02:00:a5:a1:b2:cf brd ff:ff:ff:ff:ff:ff
    altname enp0s5
    altname ens5
    inet 192.168.0.102/24 brd 192.168.0.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::a5ff:fea1:b2cf/64 scope link
       valid_lft forever preferred_lft forever

The VIP address must be present on only one cluster node. If the address appears on both nodes, this typically indicates a loss of connectivity between them.
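The check above can be scripted. The sketch below assumes the output format shown in the examples; the function name has_vip is hypothetical. It reads the output of ip address show on standard input and succeeds only if the given VIP is assigned to the interface.

```shell
# Succeeds (exit status 0) if the "ip address show" output on stdin
# contains the given IPv4 address, fails otherwise.
# The trailing "/" keeps 192.168.0.10 from matching 192.168.0.103.
has_vip() {
    vip="$1"
    grep -qF "inet ${vip}/"
}

# Usage (not executed here; interface and VIP are from the examples above):
#   ip address show dev eth2 | has_vip 192.168.0.103 && echo "VIP is on this node"
```

Running it on both nodes should succeed on exactly one of them. Success on both nodes points to the connectivity problem described above.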

For VRRP to operate correctly, the NAICE hosts require L2 connectivity, and the network must pass VRRP advertisements, which are sent as multicast to 224.0.0.18 (IP protocol 112). If virtual MAC addresses are used, frames from the VRRP MAC addresses 00:00:5E:00:01:XX, defined in RFC 3768, must be passed as well.

On the host, go to the installation directory (default: /etc/docker-naice) and ensure that the containers are running.

Example output
$ sudo docker compose ps -a
NAME               IMAGE                                                               COMMAND                  SERVICE         CREATED         STATUS                   PORTS
epg-service        naice-build-hosted.registry.eltex.loc/naice/epg-service:1.1-2       "/bin/sh -e /usr/loc…"   epg-service     9 minutes ago   Up 9 minutes (healthy)   0.0.0.0:8100->8100/tcp, [::]:8100->8100/tcp
naice-aquila       naice-release.registry.eltex.loc/naice-aquila:1.0                   "java -cp @/app/jib-…"   naice-aquila    9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:49->49/tcp, [::]:49->49/tcp, 0.0.0.0:5703->5703/tcp, [::]:5703->5703/tcp, 0.0.0.0:8091-8092->8091-8092/tcp, [::]:8091-8092->8091-8092/tcp
naice-bubo         naice-release.registry.eltex.loc/naice-bubo:1.0                     "java -cp @/app/jib-…"   naice-bubo      9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:5704->5704/tcp, [::]:5704->5704/tcp, 0.0.0.0:8093-8094->8093-8094/tcp, [::]:8093-8094->8093-8094/tcp
naice-castor       naice-release.registry.eltex.loc/naice-castor:1.0                   "java -Djava.awt.hea…"   naice-castor    9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:5705->5705/tcp, [::]:5705->5705/tcp, 0.0.0.0:8095-8096->8095-8096/tcp, [::]:8095-8096->8095-8096/tcp
naice-gavia        naice-release.registry.eltex.loc/naice-gavia:1.0                    "java -cp @/app/jib-…"   naice-gavia     9 minutes ago   Up 7 minutes (healthy)   0.0.0.0:8080->8080/tcp, [::]:8080->8080/tcp
naice-gulo         naice-release.registry.eltex.loc/naice-gulo:1.0                     "java -cp @/app/jib-…"   naice-gulo      9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:8089-8090->8089-8090/tcp, [::]:8089-8090->8089-8090/tcp
naice-lemmus       naice-release.registry.eltex.loc/naice-lemmus:1.0                   "java -cp @/app/jib-…"   naice-lemmus    9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:8083->8083/tcp, [::]:8083->8083/tcp
naice-lepus        naice-release.registry.eltex.loc/naice-lepus:1.0                    "java -cp @/app/jib-…"   naice-lepus     9 minutes ago   Up 9 minutes (healthy)   0.0.0.0:8087->8087/tcp, [::]:8087->8087/tcp, 0.0.0.0:67->1024/udp, [::]:67->1024/udp
naice-mustela      naice-release.registry.eltex.loc/naice-mustela:1.0                  "java -cp @/app/jib-…"   naice-mustela   9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:8070-8071->8070-8071/tcp, [::]:8070-8071->8070-8071/tcp
naice-nats         naice-build-hosted.registry.eltex.loc/naice/nats:0.7.1              "docker-entrypoint.s…"   nats            8 hours ago     Up 9 minutes (healthy)   0.0.0.0:4222->4222/tcp, [::]:4222->4222/tcp, 0.0.0.0:6222->6222/tcp, [::]:6222->6222/tcp, 0.0.0.0:7777->7777/tcp, [::]:7777->7777/tcp, 0.0.0.0:8222->8222/tcp, [::]:8222->8222/tcp
naice-ovis         naice-release.registry.eltex.loc/naice-ovis:1.0                     "java -cp @/app/jib-…"   naice-ovis      9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:5701->5701/tcp, [::]:5701->5701/tcp, 0.0.0.0:8084-8085->8084-8085/tcp, [::]:8084-8085->8084-8085/tcp
naice-postgres-1   naice-build-hosted.registry.eltex.loc/naice/postgres-repmgr:1.0.6   "/opt/bitnami/script…"   postgres-1      8 hours ago     Up 8 hours (healthy)     0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp, 0.0.0.0:15432->22/tcp, [::]:15432->22/tcp
naice-radius       naice-release.registry.eltex.loc/naice-radius:1.0                   "/docker-entrypoint.…"   naice-radius    9 minutes ago   Up 9 minutes (healthy)   0.0.0.0:1812-1813->1812-1813/udp, [::]:1812-1813->1812-1813/udp, 0.0.0.0:9812->9812/tcp, [::]:9812->9812/tcp
naice-sterna       naice-release.registry.eltex.loc/naice-sterna:1.0                   "/docker-entrypoint.…"   naice-sterna    9 minutes ago   Up 6 minutes (healthy)   80/tcp, 0.0.0.0:8443->444/tcp, [::]:8443->444/tcp
naice-ursus        naice-release.registry.eltex.loc/naice-ursus:1.0                    "java -cp @/app/jib-…"   naice-ursus     9 minutes ago   Up 9 minutes (healthy)   0.0.0.0:8081-8082->8081-8082/tcp, [::]:8081-8082->8081-8082/tcp
naice-vulpus       naice-release.registry.eltex.loc/naice-vulpus:1.0                   "java -cp @/app/jib-…"   naice-vulpus    9 minutes ago   Up 8 minutes (healthy)   0.0.0.0:5702->5702/tcp, [::]:5702->5702/tcp, 0.0.0.0:8086->8086/tcp, [::]:8086->8086/tcp, 0.0.0.0:8088->8088/tcp, [::]:8088->8088/tcp
naice-web          naice-release.registry.eltex.loc/naice-web:1.0                      "/docker-entrypoint.…"   naice-web       9 minutes ago   Up 6 minutes (healthy)   80/tcp, 0.0.0.0:443->443/tcp, [::]:443->443/tcp, 0.0.0.0:80->4200/tcp, [::]:80->4200/tcp
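With this many containers, a small filter can make the check quicker. The sketch below assumes the table layout shown above, where every running container reports (healthy) in the STATUS column; the function name list_not_healthy is hypothetical. It reads the docker compose ps output on standard input and prints the names of containers whose line lacks that marker.

```shell
# Prints the NAME column of every container line that does not contain
# "(healthy)". The first line (the table header) is skipped.
list_not_healthy() {
    awk 'NR > 1 && !/\(healthy\)/ { print $1 }'
}

# Usage (not executed here), from the installation directory:
#   sudo docker compose ps -a | list_not_healthy
```

Empty output means all listed containers are healthy. Containers that are exited, still starting, or unhealthy are printed by name.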

System operation overview

Normal system state

In the normal state, all four hosts are functioning.

  • RADIUS request processing is available on all three cluster addresses (the two real NAICE host addresses and the VIP address);
  • Service interaction with the database is performed using the two real addresses of the PostgreSQL cluster nodes. The node available for writing in the Primary state is determined automatically.

Failure of NAICE host 1

If NAICE host 1 fails, the following happens automatically:

  • NAICE host 2 takes over the VRRP master role;
  • RADIUS requests are processed on the VIP address and the real address of NAICE host 2.

Failure of database host 1

If database host 1 fails, the following happens automatically:

  • Database host 2 transitions to the Primary role;
  • NAICE services detect that database host 1 is unavailable and perform all further database operations through database host 2;
  • RADIUS request processing remains available on all three cluster addresses.

Recovery after failure

  1. After a failed NAICE host returns to operation, its VRRP instance does not take back the master role, even if it has the higher priority, and remains in the VRRP BACKUP state.
  2. After the PostgreSQL database host returns to operation, it will run in Standby mode. The Primary role will remain assigned to the current cluster node.

Host recovery

If one of the hosts is completely lost, first restore its initial state: deploy the operating system, configure IP addressing and user accounts as they were before, and then perform the recovery procedure.

Recovering a PostgreSQL database cluster host

On the remaining operational node, create a backup of the data according to the instructions in v1.1_3.8 Creating and restoring database backup.

Redeploy the host corresponding to the failed cluster node, using the same OS, IP addressing, and user configuration as before.

Run the playbook:

ansible-playbook install-postgres-cluster.yml -i inventory/hosts-cluster.yml

After completing the playbook, check the state of the PostgreSQL database cluster, verify that it is operational, and confirm that authentication and configuration in the GUI are functioning correctly.

Recovering a NAICE service host

Redeploy the host corresponding to the failed cluster node, using the same operating system, IP addressing, and user configuration as before.

Run the playbook:

ansible-playbook reservation-naice-services.yml -i inventory/hosts-cluster.yml

When the installation is executed again, all NAICE services will be restarted, which will result in a short service interruption (up to 5 minutes). This must be taken into account when planning recovery work.

The keepalived service will also be restarted, causing the VRRP master role to switch to the higher-priority instance.

If the first NAICE host is being restored, a new self-signed HTTPS certificate will be generated.

After recovery, verify that authentication is working correctly and ensure that all services are operating properly.
