# Docker Compose deployment
## ⚠️ Security note
This stack can be deployed in production only if the host is properly secured (firewall enabled, restricted exposed ports, strong credentials, TLS/reverse proxy when needed). Running it on an open host can lead to data leaks or insecure access to internal institutional directories (especially if LDAP is enabled).
## 🌍 Environment-specific Compose files (DEV / PROD)
The repository ships with additional Compose files to adapt the same modular stack to different environments:
- **DEV overlay** (`docker-compose.dev.yaml`)
  - Docker image tags for components under active development are not pinned (typically `:latest`).
  - Environment variables that depend on the environment (e.g. `APP_ENV`, when present) are set to `DEV`.
- **PROD overlay** (`docker-compose.prod.yaml`)
  - Docker image tags are pinned (explicit versions) to ensure reproducible deployments.
  - Environment variables that depend on the environment (e.g. `APP_ENV`, when present) are set to `PROD`.
Use the overlays by combining files with `-f` (an overlay only overrides what differs from the base file).
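An overlay only needs to restate the keys it changes; everything else is inherited from the base file. As a sketch (the service name and tag below are hypothetical, not taken from the repository), a PROD overlay entry pinning one image could look like:

```yaml
# Hypothetical PROD overlay fragment: only the image tag is overridden,
# all other settings come from the base Compose file.
services:
  harvester:
    image: crisalid/svp-harvester:1.2.0  # pinned instead of :latest
```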
## 🗄️ Local PostgreSQL vs external PostgreSQL
Several CRISalid components can use either:
- a local PostgreSQL container managed by Docker Compose
- or an external PostgreSQL server already provided by the institution
This is handled with two mechanisms:

- the base Compose files are now external-DB compatible by default
- the file `docker/docker-compose.local-postgres.yaml` re-adds the `depends_on` relationships required when using local PostgreSQL containers
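As a sketch of the second mechanism (service names are illustrative; check the actual file), the overlay re-attaches a startup dependency along these lines:

```yaml
# Illustrative fragment of docker/docker-compose.local-postgres.yaml:
# the application waits for its database only when the local container is used.
services:
  harvester:
    depends_on:
      harvester-db:
        condition: service_started
```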
### Base principle
For each component:
- the application service uses `${..._DB_HOST}` and can therefore point either to a local container or to an external host
- the local PostgreSQL service is placed in its own dedicated profile
- local PostgreSQL startup dependencies are added only through `docker/docker-compose.local-postgres.yaml`
### Dedicated database profiles
The following profiles are used for local PostgreSQL containers:
- `harvester-db`
- `sovisuplus-db`
- `keycloak-db`
- `cdb-db`
When these profiles are not enabled, the corresponding application uses the database host configured in its `.env` file.
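For example (the variable name below follows the `${..._DB_HOST}` pattern; the exact name for each component is in its `.env.sample`), a harvester `.env` could point to either target:

```env
# Local PostgreSQL container (start with --profile harvester-db):
HARVESTER_DB_HOST=harvester-db

# External institutional server (leave the harvester-db profile disabled):
# HARVESTER_DB_HOST=postgres.my-university.example
```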
## 📦 Choosing the Components
Before you start, review the components available in `map/components.qmd` and decide which ones you need.

The main Compose file (`docker/docker-compose.yaml`) is modular. It uses the `include` directive and profiles to enable only selected components.
### Main `docker-compose.yaml`

```yaml
name: crisalid
include:
  - path: ./neo4j/neo4j.yaml
    env_file: ./neo4j/.env
    project_directory: ./neo4j
  - path: ./apollo/apollo.yaml
    env_file: ./apollo/.env
    project_directory: ./apollo
  - path: ./crisalid-bus/crisalid-bus.yaml
    env_file: ./crisalid-bus/.env
    project_directory: ./crisalid-bus
  - path: ./harvester/harvester.yaml
    env_file: ./harvester/.env
    project_directory: ./harvester
  - path: ./ikg/ikg.yaml
    env_file: ./ikg/.env
    project_directory: ./ikg
  - path: ./cdb/cdb.yaml
    env_file: ./cdb/.env
    project_directory: ./cdb
  - path: ./sovisuplus/sovisuplus.yaml
    env_file: ./sovisuplus/.env
    project_directory: ./sovisuplus
  - path: ./sovisuplus/sovisuplus-maintenance.yaml
    env_file: ./sovisuplus/.env
    project_directory: ./sovisuplus
  - path: ./keycloak/keycloak.yaml
    env_file: ./keycloak/.env
    project_directory: ./keycloak
  - path: ./ofelia/ofelia.yaml
    env_file: ./ofelia/.env
    project_directory: ./ofelia
  - path: ./cvs/cvs.yaml
    env_file: ./cvs/.env
    project_directory: ./cvs
```

### DEV example (with Docker-managed PostgreSQL)
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.dev.yaml \
  -f docker/docker-compose.local-postgres.yaml \
  --profile neo4j \
  --profile apollo \
  --profile crisalid-bus \
  --profile harvester \
  --profile harvester-db \
  --profile ikg \
  --profile cdb \
  --profile cdb-db \
  --profile keycloak \
  --profile keycloak-db \
  --profile sovisuplus \
  --profile sovisuplus-db \
  --profile ofelia \
  up -d
```

### PROD example (with Docker-managed PostgreSQL)
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.prod.yaml \
  -f docker/docker-compose.local-postgres.yaml \
  --profile neo4j \
  --profile apollo \
  --profile crisalid-bus \
  --profile harvester \
  --profile harvester-db \
  --profile ikg \
  --profile cdb \
  --profile cdb-db \
  --profile keycloak \
  --profile keycloak-db \
  --profile sovisuplus \
  --profile sovisuplus-db \
  --profile ofelia \
  up -d
```

### PROD example (with external PostgreSQL)
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.prod.yaml \
  --profile neo4j \
  --profile apollo \
  --profile crisalid-bus \
  --profile harvester \
  --profile ikg \
  --profile cdb \
  --profile keycloak \
  --profile sovisuplus \
  --profile ofelia \
  up -d
```

### Configuration check
If you are unsure which services and profiles are effectively enabled after variable expansion and file overrides, run:
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.dev.yaml \
  -f docker/docker-compose.local-postgres.yaml \
  --profile neo4j \
  --profile apollo \
  config
# add further --profile flags to match your deployment
```

## 🧰 Preparation Steps
### 1. 🧾 `.env` Files
Each directory under `docker/` (e.g. `apollo`, `crisalid-bus`, `ikg`, `neo4j`, `cdb`, `harvester`, …) has its own `.env.sample` file.
- Copy each `.env.sample` to `.env` in each directory except `cdb` (which is configured by the `configure_cdb.sh` script).
- Fill in appropriate values (hostnames, ports, secrets, etc.).
- The main `docker/.env.sample` includes values used by multiple components (such as RabbitMQ or Neo4j credentials).
If you plan to connect the CRISalid Directory Bridge (`cdb`) to your institutional LDAP, make sure to set:

```env
LDAP_HOST=
LDAP_BIND_DN=
LDAP_BIND_PASSWORD=
```
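Illustrative values (the hostname and bind DN are placeholders; use your institution's directory):

```env
LDAP_HOST=ldaps://ldap.my-university.example
LDAP_BIND_DN=cn=crisalid,ou=services,dc=my-university,dc=example
LDAP_BIND_PASSWORD=change-me
```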
### 2. 🔧 Configure CRISalid Bus

The `configure_crisalid_bus.sh` script reads the `.env` values and generates the RabbitMQ `definitions.json` file (exchanges, queues, admin user, etc.):

```bash
./docker/configure_crisalid_bus.sh
```

### 3. 🔧 Configure CRISalid Directory Bridge (CDB)
The `configure_cdb.sh` script clones the DAGs, generates environment files, and runs the Airflow initialization.
You must specify the target environment:
```bash
./docker/configure_cdb.sh dev
```

or

```bash
./docker/configure_cdb.sh prod
```

### 4. 🔒 Configure Basic Authentication for SVP Harvester
SVP Harvester supports HTTP Basic authentication for both its web interface and REST API.
Authentication is enabled by default. You control this behavior via the `HARVESTER_ENABLE_BASIC_AUTH` environment variable in the harvester's `.env` file.
#### Enable / disable authentication

In `docker/harvester/.env`:

```env
HARVESTER_ENABLE_BASIC_AUTH=true
```

- `true` (default): all `/admin/*` and `/api/*` endpoints are protected
- `false`: authentication is disabled (all endpoints are public)
⚠️ This authentication mechanism is intended for local development and restricted environments. Always use HTTPS if enabling it in production.
#### Create the first user
User credentials are stored in a local file mounted into the container:
`app/auth/users.json`
Before starting the container for the first time, ensure that this file exists and is initialized with an empty JSON object:
```bash
mkdir -p harvester/auth
echo '{}' > harvester/auth/users.json
```
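If you script your deployment, the same initialization can be done idempotently; a minimal Python equivalent of the two commands above:

```python
import json
from pathlib import Path

# Create the auth directory and an empty users store, without
# overwriting an existing users.json.
users_file = Path("harvester/auth/users.json")
users_file.parent.mkdir(parents=True, exist_ok=True)
if not users_file.exists():
    users_file.write_text("{}\n")

# Sanity check: the store must contain a valid JSON object.
assert isinstance(json.loads(users_file.read_text()), dict)
```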
Once the `harvester-ui` container is running, create an initial user:

```bash
docker exec -it harvester-ui python scripts/add_basic_user.py admin
```

You will be prompted to enter and confirm a password. The credentials take effect immediately; no container restart is required.
#### Remove a user
To remove an existing user:

```bash
docker exec -it harvester-ui python scripts/remove_basic_user.py admin
```

#### 🔄 Optional: Full reset of Airflow state
By default, the script does NOT wipe Airflow state (DAG history, users, variables, connections, etc.).
If you want to completely reset Airflow (including metadata database and volumes), use:

```bash
./docker/configure_cdb.sh dev --reset
```

You will be prompted to type `RESET` to confirm.
β οΈ This removes Docker volumes for the CDB profile and permanently deletes:
- DAG execution history
- Users and passwords
- Variables and connections
- XCom data
Use with caution.
ℹ️ In dev, Airflow GUI admin credentials are initialized from environment variables (default: `admin:admin` unless overridden in `docker/.env`).
After running the script, if you intend to use the CSV mode for structures and people (instead of LDAP), place your data files in:
```
docker/cdb/data/
├── structure.csv
└── people.csv
```
Sample CSVs:
Full documentation (in French):
### 5. 🔑 Configure Keycloak
Keycloak handles authentication within the system. Multiple client applications (such as SoVisu+) can share the same authentication realm. To set up Keycloak in this environment, follow these steps:
#### Global `.env` configuration

In the global `.env` file, you will find the shared Keycloak configuration variables, such as the realm name (`KEYCLOAK_REALM`) and the client secrets (`SOVISUPLUS_KEYCLOAK_CLIENT_SECRET`). The `KEYCLOAK_REALM` can be customized (e.g., `crisalid-my-university`) for readability.
Example:
```env
KEYCLOAK_REALM=crisalid-inst
SOVISUPLUS_KEYCLOAK_CLIENT_SECRET=MY-SECRET-VALUE
```
#### Keycloak configuration script

Run the `./configure_keycloak.sh` script. This creates the required configuration file from the template (`docker/keycloak/config/crisalid-inst.json.template`).
#### Customizing Keycloak `.env` settings

The `docker/keycloak/.env.sample` file provides the environment settings for Keycloak, such as the admin credentials and database configuration. Copy the sample file to `.env` and modify the settings as needed.
```env
KEYCLOAK_ADMIN=admin
KEYCLOAK_ADMIN_PASSWORD=admin
KEYCLOAK_DB_VENDOR=postgres
# Use "keycloak-db" only with the local PostgreSQL container for Keycloak.
# With an external database, set the appropriate hostname or IP address.
KEYCLOAK_DB_HOST=keycloak-db
KEYCLOAK_DB_PORT=5432
KEYCLOAK_DB_NAME=keycloak
KEYCLOAK_DB_USER=keycloak
KEYCLOAK_DB_PASSWORD=keycloak
```
#### Define specific URIs in `/etc/hosts`

To ensure that SoVisu+ and Keycloak can be accessed correctly, you need to define specific URIs in your `/etc/hosts` file. This is necessary because SoVisu+ uses OAuth2 with ORCID, which requires a specific hostname even to deliver "sandbox" keys.
```
# Add these lines to your /etc/hosts file
127.0.0.1 sovisuplus.local
127.0.0.1 keycloak.local
```

### 6. SoVisu+ custom themes
SoVisu+ allows you to customize its appearance using themes. You can create your own theme by following these steps:
- Copy the sample theme directory:

  ```bash
  cp -r sovisuplus/theme-sample sovisuplus/theme
  ```

- Edit the theme files in `sovisuplus/theme` to customize text and images according to your institution's branding.
### 7. SoVisu+ RBAC Roles File
SoVisu+ comes with a sample RBAC configuration you can customize. Start by copying the sample file and editing it:

```bash
cp sovisuplus/config/rbac.roles.sample.yaml sovisuplus/config/rbac.roles.yaml
```

- Edit `sovisuplus/config/rbac.roles.yaml` to define your roles by grouping permissions (but don't create new permissions, i.e. new actions/subjects/fields, as these are referenced in the code).
- After any change, the Docker startup script copies the file into the container and re-seeds the roles and permissions in the database.
### 8. ⏰ Configure Ofelia (system-wide scheduler)
Ofelia is the scheduler of the whole CRISalid stack. It runs as its own container and periodically triggers tasks for other containers (by running a CLI command or calling an API endpoint).
Ofelia reads a config file and uses the Docker API to run:

- `job-exec`: a command inside an existing container (via `docker exec`)
- `job-local`: a command directly inside the Ofelia container itself (typically a `curl` command to call an API)
- `job-run`: a one-shot container, if needed
You can find full documentation here:
- Docker Hub: https://hub.docker.com/r/mcuadros/ofelia
- GitHub repository: https://github.com/mcuadros/ofelia/tree/master
In this deployment, Ofelia is included as its own Docker Compose profile (`ofelia`) with:

- a Compose file: `docker/ofelia/ofelia.yaml`
- a scheduler config file template: `docker/ofelia/config.ini`
- an environment template: `docker/ofelia/.env.sample`
#### Create the Ofelia `.env`
In `docker/ofelia/`, copy the sample file:

```bash
cp docker/ofelia/.env.sample docker/ofelia/.env
```

The sample content is:

```env
CONFIG_FILE_NAME=config.ini
```
This variable is used only by Docker Compose to choose which config file to mount inside the Ofelia container. You can later duplicate `config.ini` under another name (for example `config.dev.ini` or `config.prod.ini`) and switch by changing `CONFIG_FILE_NAME` in `.env` without editing the Compose file:

```env
CONFIG_FILE_NAME=config.dev.ini
# or:
# CONFIG_FILE_NAME=config.prod.ini
```
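This works because the Compose file mounts the selected file over Ofelia's configuration path. The fragment below only sketches the idea; the exact service definition and in-container path are defined in `docker/ofelia/ofelia.yaml` and may differ:

```yaml
# Illustrative sketch only; see docker/ofelia/ofelia.yaml for the real mount.
services:
  ofelia:
    volumes:
      - ./${CONFIG_FILE_NAME}:/etc/ofelia/config.ini:ro
```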
#### Define the scheduled jobs in `config.ini`
The default scheduler configuration file is `docker/ofelia/config.ini`.
Example content:

```ini
; file: docker/ofelia/config.ini
[job-exec "ikg-fetch-pubs"]
schedule = @every 2m
container = crisalid-ikg
command = python -m app.cli people fetch-publication-random
```

This declares a job named `ikg-fetch-pubs` that:
- runs every 2 minutes (`@every 2m`)
- uses `job-exec`: it executes the command inside the running `crisalid-ikg` container
- calls the internal CLI: `python -m app.cli people fetch-publication-random`
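A `job-local` entry follows the same pattern; the endpoint below is hypothetical and only illustrates the curl-based style mentioned above:

```ini
; Hypothetical job-local entry: call an HTTP endpoint from inside the
; Ofelia container every night at 02:00 (six-field cron, seconds first).
[job-local "nightly-sync"]
schedule = 0 0 2 * * *
command = curl -fsS http://crisalid-ikg:8000/api/hypothetical-sync
```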
#### Run Ofelia with Docker Compose
To start the scheduler along with the rest of the stack, include the `ofelia` profile, for example:

```bash
docker compose \
  ...
  --profile ofelia \
  up -d
```

## 🔌 Communication with Host Machine
If you want to connect external tools (on your host) to the containers, open the necessary ports.
For example, to expose RabbitMQ's AMQP port on the host machine, edit `docker/crisalid-bus/crisalid-bus.yaml` and uncomment the second `ports` line:
```yaml
ports:
  - "${CRISALID_BUS_HTTP_PORT}:15672"
  # - "${CRISALID_BUS_AMQP_PORT}:5672"
expose:
  - "${CRISALID_BUS_AMQP_PORT}"
```

## ♻️ Resetting Containers
To stop and delete the containers and volumes for one profile, use the same Compose files and profiles that were used during `up`.
### DEV example

```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.dev.yaml \
  --profile cdb \
  down --volumes
```

### PROD example
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.prod.yaml \
  --profile cdb \
  down --volumes
```

To also delete images:
```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.prod.yaml \
  --profile cdb \
  down --volumes --rmi all
```

### 🧹 Removing named volumes manually (if needed)
If some volumes remain, you can remove them explicitly:
```bash
docker volume rm postgres-db-volume redis-db-volume data-versioning-redis-volume
docker volume rm keycloak_postgres_data
docker volume rm svp-db-volume
docker volume rm crisalid-bus-volume
docker volume rm neo4j-data-volume neo4j-logs-volume neo4j-import-volume neo4j-plugins-volume neo4j-backups-volume
```

## 🚀 Starting the Services
The stack is modular. Select the profiles you need and combine the base Compose file with the appropriate environment overlay.
### 🔧 DEV

```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.dev.yaml \
  --profile neo4j \
  --profile apollo \
  --profile crisalid-bus \
  --profile harvester \
  --profile ikg \
  --profile cdb \
  --profile keycloak \
  --profile sovisuplus \
  --profile ofelia \
  up -d
```

In DEV:

- Image tags for components under development are not pinned (typically `:latest`)
- Environment variables such as `APP_ENV` (when defined) are set to `DEV`
### 🚀 PROD

```bash
docker compose \
  -f docker/docker-compose.yaml \
  -f docker/docker-compose.prod.yaml \
  --profile neo4j \
  --profile apollo \
  --profile crisalid-bus \
  --profile harvester \
  --profile ikg \
  --profile cdb \
  --profile keycloak \
  --profile sovisuplus \
  --profile ofelia \
  up -d
```

In PROD:

- Image tags are pinned to explicit versions
- Environment variables such as `APP_ENV` (when defined) are set to `PROD`
You can add or remove profiles depending on the components required by the institution.
## ✅ Next Steps
Once your services are up, follow the component-specific instructions in each section of the documentation. You can now:
- Access the CRISalid Directory Bridge (CDB) UI at http://localhost:8081 (Airflow) and trigger DAGs to import structures and people
- Access the Neo4j UI at http://localhost:7474 and explore the graph database
- Access the RabbitMQ UI at http://localhost:15672 with credentials from `docker/.env` and monitor messages
- Access the Apollo GraphQL UI at http://localhost:4000/graphql to explore the API through Apollo GUI
- Access Keycloak at http://keycloak.local:8080 to manage users and roles
- Access SoVisu+ at http://sovisuplus.local:3000 to visualize your data
- Start connecting other CRISalid modules from the host machine