# ClickHouse Users Configuration

This section explains how to configure per-user ClickHouse connections in KAWA. By default, all queries are executed using a single default ClickHouse user defined in the JDBC URL. KAWA supports assigning dedicated ClickHouse users to specific KAWA users, enabling fine-grained access control at the warehouse level (quotas, row policies, read-only access, etc.).

> **Terminology:** In this document, "principal" refers to a KAWA user account. The terms are used interchangeably.

> **Important:** The default user should not have restrictive limitations, as it is used by KAWA for administrative operations (listing tables, managing schemas, etc.). When you need to impose limits on a specific user, create a separate ClickHouse user and assign it to that user's KAWA account.

***

### 1. Environment Variables

#### 1.1 ClickHouse Environment Variables

The two key variables for this feature are highlighted below. The remaining variables are listed for reference.

| Variable                                         | Description                                            |
| ------------------------------------------------ | ------------------------------------------------------ |
| **`KAWA_CLICKHOUSE_JDBC_URL`**                   | **Default JDBC connection URL**                        |
| **`KAWA_CLICKHOUSE_ADDITIONAL_USERS`**           | **JSON array of additional ClickHouse users**          |
| `KAWA_CLICKHOUSE_IS_POST_23_DOT_8`               | Whether ClickHouse version is >= 23.8 (`true`/`false`) |
| `KAWA_CLICKHOUSE_ALLOW_LIVE_CONNECT`             | Allow live connect (`true`/`false`)                    |
| `KAWA_CLICKHOUSE_CLUSTER`                        | Cluster name (optional)                                |
| `KAWA_CLICKHOUSE_STORAGE_POLICY`                 | Storage policy (optional)                              |
| `KAWA_CLICKHOUSE_VALIDATION_ALLOW_DATA_DELETION` | Allow data deletion during schema validation           |

#### 1.2 JDBC URL Format

```
jdbc:clickhouse://hostname:port/database?user=username&password=password
```

Example:

```bash
export KAWA_CLICKHOUSE_JDBC_URL="jdbc:clickhouse://localhost:8123/kawa?user=kawa&password=secretpass"
```

#### 1.3 Additional Users Format

The value of `KAWA_CLICKHOUSE_ADDITIONAL_USERS` is a JSON array of objects, each containing a `user` and `password` field:

```bash
export KAWA_CLICKHOUSE_ADDITIONAL_USERS='[{"user":"analyst_readonly","password":"pass_1"},{"user":"trader_limited","password":"pass_2"}]'
```

Each entry creates a separate connection pool to ClickHouse. The `user` value must correspond to an existing ClickHouse user.

> **Important:** Changes to `KAWA_CLICKHOUSE_ADDITIONAL_USERS` take effect only after restarting the KAWA server, as the variable is read at startup and used to create connection pools.

***

### 2. Assigning a ClickHouse User to a KAWA User

Once additional users are defined in the environment variable, you need to specify which KAWA user (principal) should use which ClickHouse user. This is done via the Python SDK using the `ReplacePrincipalDbUser` command.

#### 2.1 Running the Command

```python
from kywy.client.kawa_client import KawaClient as K

kawa = K(kawa_api_url='https://your-domain:your-port')
kawa.set_api_key(api_key='kawa-****')

kawa.commands.run_command('ReplacePrincipalDbUser', {
    'principalId': '3',
    'dbUser': 'analyst_readonly'
})
```

**Parameters:**

* `principalId` — The numeric ID of the KAWA user (principal).
* `dbUser` — The name of the ClickHouse user from the `KAWA_CLICKHOUSE_ADDITIONAL_USERS` array.

**Restrictions:** This command is available to KAWA administrators only.

> **Note:** Calling `ReplacePrincipalDbUser` does not require a KAWA restart — the mapping is updated in the database and takes effect immediately.

#### 2.2 Removing the Assignment

To revert a user to the default connection, pass `None` as the `dbUser` value:

```python
kawa.commands.run_command('ReplacePrincipalDbUser', {
    'principalId': '3',
    'dbUser': None
})
```

> **Note:** The backend parameter type is `Optional<String>`. Passing `None` from the Python SDK sends an empty optional, which clears the ClickHouse user assignment for that principal.

***

### 3. How It Works

```
KAWA_CLICKHOUSE_JDBC_URL ──────────► default connection
                                     (admin tasks & users without dbUser)

KAWA_CLICKHOUSE_ADDITIONAL_USERS ──► per-user connection pools:
  [{"user":"analyst_readonly",...}]     ├── analyst_readonly → pool 1
  [{"user":"trader_limited",...}]       └── trader_limited  → pool 2

ReplacePrincipalDbUser(principalId=3, dbUser="analyst_readonly")
  → stores dbUser on the KAWA user (principal)
```

When a user makes a request:

1. KAWA checks the user's `dbUser` field.
2. If set (e.g., `analyst_readonly`) → uses the corresponding connection pool.
3. If the user is not found in the additional users list → falls back to the default connection.
4. If `dbUser` is not set → uses the default connection.

***

### 4. Bulk Assignment via CSV

For assigning ClickHouse users to multiple principals at once, you can use a CSV file with a Python script.

**File `user_mapping.csv`:**

```csv
principalId,dbUser
3,analyst_readonly
5,trader_limited
8,analyst_readonly
12,trader_limited
```

**Script `assign_ch_users.py`:**

```python
import csv
from kywy.client.kawa_client import KawaClient as K

kawa = K(kawa_api_url='https://your-domain:your-port')
kawa.set_api_key(api_key='kawa-****')

with open('user_mapping.csv', 'r') as f:
    reader = csv.DictReader(f)
    for row in reader:
        kawa.commands.run_command('ReplacePrincipalDbUser', {
            'principalId': row['principalId'],
            'dbUser': row['dbUser']
        })
        print(f"Principal {row['principalId']} → CH user: {row['dbUser']}")
```

***

### 5. ClickHouse Configuration Examples

#### 5.1 Creating the Default KAWA User

```sql
CREATE USER kawa IDENTIFIED WITH sha256_password BY 'your_secure_password';
CREATE DATABASE kawa;
GRANT ALL ON kawa.* TO kawa;
GRANT CLUSTER ON *.* TO kawa;
```

#### 5.2 Read-Only User

```sql
CREATE USER analyst_readonly IDENTIFIED WITH sha256_password BY 'pass_1';
GRANT SELECT ON kawa.* TO analyst_readonly;
```

This user can only read data and cannot create or modify tables.

#### 5.3 User with Row-Level Security (Row Policy)

```sql
CREATE USER trader_limited IDENTIFIED WITH sha256_password BY 'pass_2';
GRANT SELECT ON kawa.* TO trader_limited;

CREATE ROW POLICY region_filter ON kawa.trades
    FOR SELECT
    USING region = 'EMEA'
    TO trader_limited;
```

This user will only see rows where `region = 'EMEA'`.

#### 5.4 User with Quotas

```sql
CREATE USER limited_user IDENTIFIED WITH sha256_password BY 'pass_3';
GRANT SELECT ON kawa.* TO limited_user;

CREATE QUOTA limited_quota
    FOR INTERVAL 1 hour MAX queries = 100, result_rows = 1000000
    TO limited_user;
```

#### 5.5 Settings Profiles

```sql
CREATE SETTINGS PROFILE readonly_profile SETTINGS
    max_memory_usage = 1000000000,
    max_execution_time = 30,
    readonly = 1;

ALTER USER analyst_readonly SETTINGS PROFILE readonly_profile;
```

***

### 6. End-to-End Example

**Step 1.** Create ClickHouse users (no KAWA restart needed):

```sql
CREATE USER analyst IDENTIFIED WITH sha256_password BY 'analyst_pass';
GRANT SELECT ON kawa.* TO analyst;

CREATE USER power_user IDENTIFIED WITH sha256_password BY 'power_pass';
GRANT SELECT, INSERT ON kawa.* TO power_user;
```

**Step 2.** Add them to the environment variable and **restart KAWA** (required for new connection pools to be created):

```bash
export KAWA_CLICKHOUSE_ADDITIONAL_USERS='[{"user":"analyst","password":"analyst_pass"},{"user":"power_user","password":"power_pass"}]'
```

**Step 3.** Assign to KAWA users via Python SDK (no restart needed, takes effect immediately):

```python
from kywy.client.kawa_client import KawaClient as K

kawa = K(kawa_api_url='https://your-domain:your-port')
kawa.set_api_key(api_key='kawa-****')

# Assign CH user "analyst" to KAWA user #5
kawa.commands.run_command('ReplacePrincipalDbUser', {
    'principalId': '5',
    'dbUser': 'analyst'
})

# Assign CH user "power_user" to KAWA user #8
kawa.commands.run_command('ReplacePrincipalDbUser', {
    'principalId': '8',
    'dbUser': 'power_user'
})
```

**Step 4.** Verify: when KAWA user #5 executes a query, it will use the `analyst` connection, and ClickHouse will enforce the `SELECT`-only restriction.
