- 1. Prerequisites
- 1.a) General requirements
- 1.b) Hardware requirements
- 2. Installation procedure
- 2.a) Install the KAWA Server
- 2.b) Install the Python Library and activate your software
- 3. What has been installed
- 4. Advanced topics
1. Prerequisites
1.a) General requirements
We currently support Ubuntu 20.04 LTS.
You need an account with the ability to run sudo on the target machine.
You need access to the ClickHouse Debian package repository from the target machine: https://packages.clickhouse.com/deb (we can provide the required packages if this access is not available). A quick connectivity check is shown after this list.
You need to be in possession of a valid KAWA license.
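If you want to verify the repository access up front, a simple reachability check along these lines should do; the exact path under /deb is an assumption based on the standard ClickHouse repository layout:
# Verify that the ClickHouse Debian repository is reachable from the target machine
curl -sSfI https://packages.clickhouse.com/deb/dists/stable/InRelease && echo "ClickHouse repository reachable"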
1.b) Hardware requirements
RAM
For small amounts of data (up to ~200 GB compressed), it is best to use as much memory as the volume of data. For larger amounts of data, and when processing interactive (online) queries, you should use a reasonable amount of RAM (128 GB or more) so that the hot data subset fits in the page cache. Even for data volumes of ~50 TB per server, using 128 GB of RAM significantly improves query performance compared to 64 GB.
CPU
KAWA will use all available CPUs to maximize performance, so the more CPUs the better. For processing up to hundreds of millions or billions of rows, we recommend at least 64 cores. Both AMD64 and ARM64 architectures are supported.
Storage Subsystem
SSD is preferred. HDD is the second-best option; SATA HDDs at 7200 RPM will do. The required capacity of the storage subsystem directly depends on the target analytics perimeter. A quick way to check a machine against these hardware requirements is shown below.
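To check a candidate machine against these requirements, standard Linux tools are enough; this is a generic sketch, nothing KAWA-specific:
# Check available RAM, CPU count, architecture and disk space
free -h                     # total and available memory
nproc                       # number of available CPU cores
uname -m                    # architecture (x86_64 / aarch64)
df -h /var/lib /var/log     # free space on the partitions KAWA will write to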
2. Installation procedure
Please follow these steps to install KAWA.
2.a) Install the KAWA Server
- Download the installation package here:
- Upload it onto the server on which you wish to install KAWA
- Extract its content:
tar xvzf kawa.tar.gz
- Input your KAWA key:
cd kawa-ubuntu
echo 'DEPLOY-TOKEN:xxxxxxxxxx' > configuration/deploy.token
- Run the installation script:
sudo ./install.sh
⚠️ The installation script will ask you for the default password of the database (ClickHouse):
Please input a strong password and make sure to store it somewhere safe (you will be prompted for it again further down the installation process).
- Test the installation:
Point your browser to http://<IP ADDRESS OF YOUR MACHINE>:8080
The setup admin account is:
login: setup-admin@kawa.io
password: changeme
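If you prefer to check from the command line first (for example when no browser can reach the machine yet), a simple request against the port should confirm that the web server answers; this is just a reachability check, not part of the official procedure:
# Check that the KAWA web server responds on port 8080
curl -sI http://localhost:8080 | head -n 1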
2.b) Install the Python Library and activate your software
Please refer to the Python API for admin guide.
To activate your software with your license, refer to this link:
https://docs.kawa.ai/python-api-for-admin#56fad66390cd4987aa32f9990e29bf88
3. What has been installed
The KAWA installation runs under a dedicated user called kawa-system (created during the installation process).
The installation sets up three services on the Linux system (see the status check after this list):
- postgresql
- clickhouse-server
- kawa server
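To confirm that the three services are up after the installation, systemctl can be used; the exact name of the KAWA unit is an assumption here, so adjust it if your installation uses a different one:
# Check the state of the services installed alongside KAWA
sudo systemctl status postgresql clickhouse-server kawa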
Here is an overview of the interactions that will happen when working with KAWA:
- KAWA uses Postgres to store its state. The application itself is 100% stateless.
- KAWA will connect to ClickHouse to perform its backend computations.
- KAWA will run ETLs against external data providers.
- Users connect to KAWA over HTTP.
System and configuration files:
KAWA's configuration files sit in the /etc/kawa directory.
- The main configuration file is kawa.env.
- You can also configure the behaviour of KAWA logs from the log4j2.xml file.
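If you change these files, you will typically need to restart the KAWA service for the change to be picked up; a minimal round-trip, assuming the service unit is called kawa:
# Edit the main configuration file, then restart KAWA to apply the change
sudo nano /etc/kawa/kawa.env
sudo systemctl restart kawa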
Log files:
KAWA writes its logs to the /var/log/kawa directory. The kawa-standalone.log file is the current log file, and KAWA rotates the log files of previous days.
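To follow the current log in real time, for instance while testing the installation:
# Follow the current KAWA log file
tail -f /var/log/kawa/kawa-standalone.log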
Data:
All the data needed during KAWA execution is stored in /var/lib/kawa.
⚠️ This includes the CSV files uploaded by users; please ensure that the partition hosting this directory has enough space.
ℹ️ This directory also contains a driver sub-directory in which you can drop your own JDBC drivers. KAWA will pick them up at startup.
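For example, adding a custom JDBC driver could look like the sketch below; the jar name is hypothetical, the ownership and the restart step are assumptions, and the exact sub-directory name should be checked on your installation:
# Drop a custom JDBC driver into the KAWA driver directory, then restart so it is picked up
sudo cp my-database-driver.jar /var/lib/kawa/driver/
sudo chown kawa-system: /var/lib/kawa/driver/my-database-driver.jar
sudo systemctl restart kawa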
4. Advanced topics
KAWA can be configured to:
- Terminate TLS connections (to support HTTPS directly from the KAWA server)
- Communicate with an internal SMTP server
- Support Kerberos, SSO, OAuth2, etc.
Please schedule a call with our support team to discuss these topics.