Securely providing Secrets#
Most Flowman projects need to provide some sort of secrets (database credentials, S3 secrets, …) to external systems for being able to access them. This opens the question about a best practice how to provide those secrets to a Flowman project securely.
In order to provide secrets to Flowman, there are two ways:
Provide secrets via system environment variables, which then can be accessed in Flowman
Provide secrets stored in an external file, which then can be read by Flowman
Both approaches will extract the secret and store it in a Flowman environment variable. To prevent Flowman from printing the secret onto the console, it is enough when the variable name ends with “credential”, “secret” or “password”. In this case, Flowman will redact the value when logging the environment onto the console.
Using environment variables#
By using the $System.getenv
template function, you can access environment variables within Flowman. For example
for extracting database credentials from system environment variables, you could proceed as follows:
environment:
- frontend_db_driver=$System.getenv('MYSQL_DRIVER', 'com.mysql.cj.jdbc.Driver')
- frontend_db_url=$System.getenv('MYSQL_URL')
- frontend_db_username=$System.getenv('MYSQL_USERNAME')
- frontend_db_password=$System.getenv('MYSQL_PASSWORD')
connections:
frontend:
kind: jdbc
driver: "$frontend_db_driver"
url: "$frontend_db_url"
username: "$frontend_db_username"
password: "$frontend_db_password"
relations:
advertiser_setting:
kind: jdbcTable
connection: frontend
table: "advertiser_setting"
schema:
kind: inline
fields:
- name: id
type: Integer
- name: business_rule_id
type: Integer
- name: rtb_advertiser_id
type: Integer
Then you only need to populate the system environment variables MYSQL_DRIVER
, MYSQL_URL
, MYSQL_USERNAME
and
MYSQL_PASSWORD
before starting Flowman. This approach fits well to deployments where Flowman runs inside Kubernetes,
but can also be used in many other scenarios.
Using external files#
Another viable approach is to provide secrets via files. Flowman can read any file via the template function
$File.read
, which will simply return the files content. Moreover, by using the template function $JSON.path
you
can parse any JSON content. An example might look as follows:
environment:
- sql_host=jdbc:sqlserver://$JSON.path($File.read($System.getenv('MSSQLSERVER_SECRETS')), '$.secret[?(@.key=="sql-host-name")].value')
- sql_username=$JSON.path($File.read($System.getenv('MSSQLSERVER_SECRETS')), '$.secret[?(@.key=="sql-db-username")].value')
- sql_password=$JSON.path($File.read($System.getenv('MSSQLSERVER_SECRETS')), '$.secret[?(@.key=="sql-db-password")].value')
- sql_database=$JSON.path($File.read($System.getenv('MSSQLSERVER_SECRETS')), '$.secret[?(@.key=="sql-db-name")].value')
connections:
datahub:
kind: jdbc
driver: "com.microsoft.sqlserver.jdbc.SQLServerDriver"
url: jdbc:sqlserver://$sql_host
username: $sql_username
password: $sql_password
properties:
databaseName: $sql_database
Then you need to store the secrets in a JSON file stored on the local file system where Flowman is executed. Using the JSON paths as defined above, the file needs to look as follows:
{
"secret": [
{
"key": "sql-db-username",
"value": "my-sql-username"
},
{
"key": "sql-db-password",
"value": "my-secret-password"
},
{
"key": "sql-db-name",
"value": "MY_DATABASE"
},
{
"key": "sql-host-name",
"value": "sql-database.windows.net:1433"
}
]
}
Finally, you need to set the system environment variable MSSQLSERVER_SECRETS
to point to the corresponding file before
starting Flowman. This additional indirection is not strictly required, but helps to decouple Flowman configuration
from the deployment logic.
Using Key Vaults#
Another alternative for storing and retrieving secrets is to use key vaults. These are often provided by cloud environments. Flowman currently supports Azure Key Vault to access secrets. You will need to activate the Azure Plugin in order to use this feature in Flowman. Once enabled, you can easily access secrets as follows:
environment:
- sql_host=$AzureKeyVault.getSecret('datahub', 'sql_host')
- sql_username=$AzureKeyVault.getSecret('datahub', 'sql_username')
- sql_password=$AzureKeyVault.getSecret('datahub', 'sql_password')
- sql_database=$AzureKeyVault.getSecret('datahub', 'sql_database')
connections:
datahub:
kind: jdbc
driver: "com.microsoft.sqlserver.jdbc.SQLServerDriver"
url: jdbc:sqlserver://$sql_host
username: $sql_username
password: $sql_password
properties:
databaseName: $sql_database