=========
Changelog
=========

This file keeps track of all notable changes to the Slurm Charms.

Unreleased
----------

- added action in slurmd to install singularity

0.9.2 - 2022-07-04
------------------

- added slurmctld configuration options to use TLS certificates for etcd

0.9.1 - 2022-06-02
------------------

- added munge key to etcd database
- added action in slurmctld to create etcd user to query mungekey
- added action in slurmctld to get etcd root password
- added action in slurmd to get etcd slurmd password
- added set-node-inventory slurmd action
- updated default Mellanox Infiniband drivers to version 5.4
- fix get-node-inventory action
- fix influxdb-info slurmctld action
- fix etcd not being configured after an upgrade

0.9.0 - 2022-04-12
------------------

- fix: do not install Nvidia repository by default
- add support for custom NHC parameters
- improve checks when nodes need to be rebooted
- add Nvidia GPU support: install drivers via an action on compute nodes
- update operator framework to 1.3.0

0.8.5 - 2022-01-13
------------------

- add support for ElasticSearch Slurm addon
- pin Operator Framework to 1.2.0

0.8.4 - 2021-12-10
------------------

- use the cluster name as the database name for InfluxDB
- improve logrotate profiles for Slurm and NHC logs
- add epilog-prolog relation

0.8.3 - 2021-11-05
------------------

- add labels to Fluentbit logs: cluster-name, partition-name, hostname, service
- add user-group relation to create user and group on the system
- fix typos in unit status

0.8.2 - 2021-10-13
------------------

- fix changing infiniband repos on Ubuntu
- fix fluentbit parser for slurm logs
- fix creation of user and group when those already exist

0.8.1 - 2021-10-07
------------------

- use Omnivector Solutions' RPM repository to install Slurm
- add Fluentbit relation to forward logs from all the Slurm Charms

0.8.0 - 2021-09-13
------------------

- fix slurmd crashing when etcd in slurmctld is down/not-started/unreachable
- allow installing Slurm from different repositories on Ubuntu
- use Omnivector's PPA for installing Slurm on Ubuntu

0.7.0 - 2021-08-09
------------------

- fix: reduce number of events when a slurmd unit is added/removed
- fix #111: install bullseye gpg keys from files
- add support for Ubuntu partitions on CentOS deploys
- add influxdb-info action to slurmctld
- add grafana relation for slurmctld
- add influxdb relation for accounting and profiling of jobs with the SLURM InfluxDB plugin
- add acct gather configuration options
- handle update-status hook to constantly give feedback about charms' status
- add .jujuignore to reduce size of final charm files
- fix cache of partition_name in slurmd charm
- improve description of charm's state in juju status
- fix race condition on relation data exchange between slurmctld and slurmdbd
- change default partition name from juju-compute-random to osd-appname
- fix slurmctld crashing when removing a slurmd application

0.6.6 - 2021-07-01
------------------

- fix checks for munge when the code can't run subprocess as another UID

0.6.5 - 2021-06-28
------------------

- fix race condition on inventory synchronization between slurmd and slurmctld before starting slurmd
- fix breakage in charms when the slurmd leader is removed
- remove unused code in slurmd peer relation
- fix slurmd action to query infiniband version
- add slurmrestd to CentOS7
- fix slurmrestd Systemd environment variables to enable jwt auth

0.6.4 - 2021-06-11
------------------

- improve checks on Systemd commands to start/restart daemons, to guarantee correct Charm initialization
- check for munge keys before starting Slurm daemons, to avoid a race condition
- fix possible breakage in Slurm configuration due to spaces in partition names
- improve description of configurations and actions.
- fix slurmrestd version in Juju status breaking the line.

0.6.3 - 2021-06-02
------------------

- fix systemd command to restart munge
- improve slurmdbd logs when restarting munge

0.6.2 - 2021-06-02
------------------

- handle charm upgrade
- improve juju status' message to display failure when munge/slurm does not start correctly
- fix slurmd initialization when slurm.conf is present
- enable munge system service, so it always starts when the machine boots

0.6.1 - 2021-05-31
------------------

- changed charm's status message to have consistent capitalization.
- fixed initialization order of the charms, to ensure database and controller start before compute nodes and REST server.

0.6.0 - 2021-05-28
------------------