- 1. Prerequisites
- 1.a) General requirements
- 1.b) Hardware requirements
- 2. Installation procedure
- 2.a) Install the KAWA Server
- 2.b) Install the Python Library and activate your software
- 3. What has been installed
- 4. Advanced topics
1. Prerequisites
1.a) General requirements
We currently support Ubuntu 20.04 LTS.
You need an account with the ability to run sudo on the target machine.
You need access to the ClickHouse Debian package repository from the target machine: https://packages.clickhouse.com/deb (we can provide the required packages if this access is not available). A quick connectivity check is shown after this list.
You need to be in possession of a valid KAWA license.
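If you want to verify the repository access up front, a simple reachability check along these lines should do; the exact path under /deb is an assumption based on the standard ClickHouse repository layout:
# Verify that the ClickHouse Debian repository is reachable from the target machine
curl -sSfI https://packages.clickhouse.com/deb/dists/stable/InRelease && echo "ClickHouse repository reachable"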
1.b) Hardware requirements
RAM
For small amounts of data (up to ~200 GB compressed), it is best to use as much memory as the volume of data. For larger amounts of data, and when processing interactive (online) queries, you should use a reasonable amount of RAM (128 GB or more) so that the hot data subset fits in the page cache. Even for data volumes of ~50 TB per server, using 128 GB of RAM significantly improves query performance compared to 64 GB.
CPU
KAWA will use all available CPUs to maximize performance, so the more CPUs the better. For processing up to hundreds of millions or billions of rows, we recommend at least 64 cores. Both AMD64 and ARM64 architectures are supported.
Storage Subsystem
SSD is preferred. HDD is the second-best option; SATA HDDs at 7200 RPM will do. The required capacity of the storage subsystem directly depends on the target analytics perimeter. A quick way to check a machine against these hardware requirements is shown below.
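To check a candidate machine against these requirements, standard Linux tools are enough; this is a generic sketch, nothing KAWA-specific:
# Check available RAM, CPU count, architecture and disk space
free -h                     # total and available memory
nproc                       # number of available CPU cores
uname -m                    # architecture (x86_64 / aarch64)
df -h /var/lib /var/log     # free space on the partitions KAWA will write to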
2. Installation procedure
Please follow these steps to install KAWA.
2.a) Install the KAWA Server
- Download the installation package here:
- Upload it onto the server on which you wish to install KAWA
- Extract its content:
tar xvzf kawa.tar.gz
- Input your KAWA key:
cd kawa-ubuntu
echo 'DEPLOY-TOKEN:xxxxxxxxxx' > configuration/deploy.token
- Run the installation script:
sudo ./install.sh
⚠️ The installation script will ask you for the default password of the database (ClickHouse):
Please input a strong password and make sure to store it somewhere safe (you will be prompted for it again further down the installation process).
- Test the installation:
Point your browser to http://<IP ADDRESS OF YOUR MACHINE>:8080
The setup admin account is:
login: setup-admin@kawa.io
password: changeme
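If you prefer to check from the command line first (for example when no browser can reach the machine yet), a simple request against the port should confirm that the web server answers; this is just a reachability check, not part of the official procedure:
# Check that the KAWA web server responds on port 8080
curl -sI http://localhost:8080 | head -n 1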
2.b) Install the Python Library and activate your software
Please refer to the Python API for admin guide.
To activate your software with your license, refer to this link:
https://docs.kawa.ai/python-api-for-admin#56fad66390cd4987aa32f9990e29bf88
3. What has been installed
The KAWA installation runs under a dedicated user called kawa-system (created during the installation process).
The installation sets up three services on the Linux system (see the status check after this list):
- postgresql
- clickhouse-server
- kawa server
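To confirm that the three services are up after the installation, systemctl can be used; the exact name of the KAWA unit is an assumption here, so adjust it if your installation uses a different one:
# Check the state of the services installed alongside KAWA
sudo systemctl status postgresql clickhouse-server kawa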
Here is an overview of the interactions that will happen when working with KAWA:
- KAWA uses Postgres to store its state. The application itself is 100% stateless.
- KAWA will connect to ClickHouse to perform its backend computations.
- KAWA will run ETLs against external data providers.
- Users connect to KAWA over HTTP.
System and configuration files:
KAWA's configuration files sit in the /etc/kawa directory.
- The main configuration file is kawa.env.
- You can also configure the behaviour of KAWA logs from the log4j2.xml file.
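If you change these files, you will typically need to restart the KAWA service for the change to be picked up; a minimal round-trip, assuming the service unit is called kawa:
# Edit the main configuration file, then restart KAWA to apply the change
sudo nano /etc/kawa/kawa.env
sudo systemctl restart kawa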
Log files:
KAWA writes its logs to the /var/log/kawa directory. The kawa-standalone.log file is the current log file, and KAWA rotates the log files of previous days.
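To follow the current log in real time, for instance while testing the installation:
# Follow the current KAWA log file
tail -f /var/log/kawa/kawa-standalone.log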
Data:
All the data needed during KAWA execution is stored in /var/lib/kawa.
⚠️ This includes the CSV files uploaded by users; please ensure that the partition hosting this directory has enough space.
ℹ️ This directory also contains a driver sub-directory in which you can drop your own JDBC drivers. KAWA will pick them up at startup.
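For example, adding a custom JDBC driver could look like the sketch below; the jar name is hypothetical, the ownership and the restart step are assumptions, and the exact sub-directory name should be checked on your installation:
# Drop a custom JDBC driver into the KAWA driver directory, then restart so it is picked up
sudo cp my-database-driver.jar /var/lib/kawa/driver/
sudo chown kawa-system: /var/lib/kawa/driver/my-database-driver.jar
sudo systemctl restart kawa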
4. Advanced topics
KAWA can be configured to:
- Terminate TLS connections (to support HTTPS directly from the KAWA server)
- Communicate with an internal SMTP server
- Support Kerberos, SSO, OAuth2, etc.
Please schedule a call with our support team to discuss these topics.