Overview

The beep base command specifies options for creating metadata and logging for all subcommands.

This page is a general overview of the options common to every beep subcommand. The options described here apply to the inputs, outputs, and file formats of nearly any beep CLI operation.

Basics

The BEEP CLI is invoked as:

$: beep <options> <subcommand>

Options for the base beep command are specified before the subcommand. All beep subcommands take at least one file as input and return one or more files as output.
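The ordering rule above (base options first, then the subcommand and its arguments) can be sketched in Python as a small helper that assembles an argument list. This is only an illustration: the helper name build_beep_argv and the file names are hypothetical, and only base options that appear in the help dialog are used.

```python
import shlex
from typing import List, Optional


def build_beep_argv(
    subcommand: str,
    subcommand_args: List[str],
    log_file: Optional[str] = None,
    run_id: Optional[int] = None,
    status_json: Optional[str] = None,
    halt_on_error: bool = False,
) -> List[str]:
    """Assemble an argv list with base options placed before the subcommand.

    Hypothetical helper for illustration; it is not part of beep.
    """
    argv = ["beep"]
    if log_file is not None:
        argv += ["--log-file", log_file]
    if run_id is not None:
        argv += ["--run-id", str(run_id)]
    if status_json is not None:
        argv += ["--output-status-json", status_json]
    if halt_on_error:
        argv.append("--halt-on-error")
    # Base options end here; everything after belongs to the subcommand.
    argv.append(subcommand)
    argv += list(subcommand_args)
    return argv


argv = build_beep_argv(
    "featurize",
    ["structured.json.gz", "features.json.gz"],  # hypothetical file names
    status_json="status.json",
    run_id=234,
)
# Print the command as it would be typed in a shell.
print(shlex.join(argv))
```

The resulting list could be passed to subprocess.run if beep is installed; the sketch itself only builds and prints the command.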

Beep has six subcommands:

  • beep structure: Parse, interpolate, clean, and standardize a wide range of battery cycler output files.
  • beep featurize: Generate features for learning from structured files.
  • beep train: Train a machine learning model based on features.
  • beep predict: Predict battery degradation based on learning features and a previously trained model.
  • beep protocol: Generate cycler protocol from pre-made templates for a wide range of cyclers.
  • beep inspect: Visually inspect and debug beep files on disk.

For more info on any command or the base command, simply pass --help as an option.

The help dialog for the base beep command looks like:

$: beep --help

Usage: beep [OPTIONS] COMMAND [ARGS]...

  Base BEEP command.

Options:
  -l, --log-file FILE            File to log formatted json to. Log will still
                                 be output in human readable form to stdout,
                                 but if --log-file is specified, it will be
                                 additionally logged to a jsonl (json-lines)
                                 formatted file.
  -r, --run-id INTEGER           An integer run_id which can be optionally
                                 assigned to this run. It will be output in
                                 the metadata status json for any subcommand
                                 if the status json is enabled.
  -t, --tags TEXT                Add optional tags to the status json
                                 metadata. Can be later used for large-scale
                                 queries on database data about sets of BEEP
                                 runs. Example: 'experiments_for_kristin'.
  -s, --output-status-json FILE  File to output with JSON info about the
                                 states of files which have had any beep
                                 subcommand operation run on them (e.g.,
                                 structuring). Contains comprehensive info
                                 about the success of the operation for all
                                 files. 1 status json = 1 operation.
  --halt-on-error                Set to halt BEEP if critical featurization
                                 errors are encountered on any file with any
                                 featurizer. Otherwise, logs critical errors
                                 to the status json.
  --help                         Show this message and exit.

Commands:
  featurize  Featurize one or more files.
  predict    Run a previously trained model to predict degradation...
  protocol   Generate protocol for battery cyclers from a csv file input.
  structure  Structure and/or validate one or more files.
  train      Train a machine learning model using all available data and...

Output streams

The base beep command options specify whether and where to output the metadata and status of any CLI operation.

Human-readable output will always be logged to stdout, for example:

2021-09-21 16:14:43 INFO     Structuring 1 files
2021-09-21 16:14:43 DEBUG    Hashing file '/beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv' to MD5
2021-09-21 16:14:43 INFO     File 1 of 1: Reading raw file /beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv from disk...
2021-09-21 16:14:44 INFO     File 1 of 1: Validating: /beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv according to schema file '/beep/beep/validation_schemas/schema-arbin-lfp.yaml'
2021-09-21 16:14:44 INFO     File 1 of 1: Validated: /beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv
2021-09-21 16:14:44 INFO     File 1 of 1: Structuring: Read from /beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv
2021-09-21 16:14:44 INFO     Beginning structuring along charge axis 'charge_capacity' and discharge axis 'voltage'.
2021-09-21 16:15:21 INFO     File 1 of 1: Structured: Written to /beep/beep/CLI_TEST_FILES_FEATURIZATION/tmp.json.gz
2021-09-21 16:15:21 INFO     Structuring report:
2021-09-21 16:15:21 INFO        Succeeded: 1/1
2021-09-21 16:15:21 INFO        Invalid: 0/1
2021-09-21 16:15:21 INFO        Failed: 0/1

Other output streams are also available:

--log-file

Machine-readable log file to write, in json-lines format (one JSON object per line). If not specified, no log file will be created. Example:

{"time": "2021-09-21 16:13:48,938", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": "Structuring 1 files"}
{"time": "2021-09-21 16:13:48,939", "level": "DEBUG", "process": "67214", "module": "cmd", "func": "structure", "msg": "Hashing file '/beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json' to MD5"}
{"time": "2021-09-21 16:13:49,228", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": "File 1 of 1: Reading raw file /beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json from disk..."}
{"time": "2021-09-21 16:13:50,390", "level": "ERROR", "process": "67214", "module": "cmd", "func": "structure", "msg": "File 1 of 1: Failed/invalid: (EmptyDataError): /beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json"}
{"time": "2021-09-21 16:13:50,391", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": "Structuring report:"}
{"time": "2021-09-21 16:13:50,391", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": " Succeeded: 0/1"}
{"time": "2021-09-21 16:13:50,391", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": " Invalid: 1/1"}
{"time": "2021-09-21 16:13:50,391", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": "         - /beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json"}
{"time": "2021-09-21 16:13:50,391", "level": "INFO", "process": "67214", "module": "cmd", "func": "structure", "msg": " Failed: 0/1"}
{"time": "2021-09-21 16:14:43,291", "level": "INFO", "process": "67264", "module": "cmd", "func": "structure", "msg": "Structuring 1 files"}
{"time": "2021-09-21 16:14:43,291", "level": "DEBUG", "process": "67264", "module": "cmd", "func": "structure", "msg": "Hashing file '/beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv' to MD5"}
{"time": "2021-09-21 16:14:43,385", "level": "INFO", "process": "67264", "module": "cmd", "func": "structure", "msg": "File 1 of 1: Reading raw file /beep/beep/tests/test_files/2017-12-04_4_65C-69per_6C_CH29.csv from disk..
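Because each line of the log file is a separate JSON object, the log can be filtered with a few lines of Python. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the sample records are shortened copies of the lines shown above. Lines that fail to parse (for example, a line cut off mid-write) are skipped rather than raising.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_log_records(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per well-formed json-lines record, skipping the rest."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A partially written or truncated line; ignore it.
            continue
        if isinstance(record, dict):
            yield record


def records_at_level(lines: Iterable[str], level: str) -> List[Dict[str, Any]]:
    """Return all records whose "level" field equals the given level."""
    return [r for r in iter_log_records(lines) if r.get("level") == level]


# Sample lines modeled on the log excerpt above; the last one is truncated.
sample_lines = [
    '{"time": "2021-09-21 16:13:48,938", "level": "INFO", "process": "67214", '
    '"module": "cmd", "func": "structure", "msg": "Structuring 1 files"}',
    '{"time": "2021-09-21 16:13:50,390", "level": "ERROR", "process": "67214", '
    '"module": "cmd", "func": "structure", "msg": "File 1 of 1: Failed/invalid: '
    '(EmptyDataError): /beep/beep/CLI_TEST_FILES_FEATURIZATION/'
    'PreDiag_000440_0000FB_structure.json"}',
    '{"time": "2021-09-21 16:14:43,385", "level": "INFO", "process": "67264", "mod',
]

errors = records_at_level(sample_lines, "ERROR")
for record in errors:
    print(record["time"], record["msg"])
```

To filter a real log, pass an open file object in place of sample_lines, since iterating over a text file yields one line at a time.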

--output-status-json

JSON file to write containing comprehensive structured metadata about any operation and all of its sub-operations. If not specified, no status json will be written. Example:

{
  "op_type": "featurize",
  "feature_matrix": {
    "created": true,
    "traceback": null,
    "output": "/beep/beep/CLI_TEST_FILES_FEATURIZATION/features.json.gz"
  },
  "files": {
    "/beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json": {
      "walltime": 8.546396970748901,
      "output": null,
      "processed_md5_chksum": "5848d8598584e45addfa8129bb078d95",
      "featurizers": {
        "HPPCResistanceVoltageFeatures": {
          "output": null,
          "valid": true,
          "featurized": true,
          "walltime": 1.2403650283813477,
          "traceback": null,
          "subop_md5_chksum": null
        },
        "DeltaQFastCharge": {
          "output": null,
          "valid": true,
          "featurized": true,
          "walltime": 0.05008506774902344,
          "traceback": null,
          "subop_md5_chksum": null
        },
        "DiagnosticSummaryStats": {
          "output": null,
          "valid": true,
          "featurized": true,
          "walltime": 0.19507122039794922,
          "traceback": null,
          "subop_md5_chksum": null
        },
        "CycleSummaryStats": {
          "output": null,
          "valid": true,
          "featurized": true,
          "walltime": 0.013413190841674805,
          "traceback": null,
          "subop_md5_chksum": null
        }
      }
    },
...
  "metadata": {
    "beep_verison": "2021.8.2.15",
    "op_datetime_utc": "2021-09-04 00:40:12",
    "run_id": null,
    "tags": []
  }
}

Any one beep command (e.g., beep structure *), regardless of how many files it takes in or generates, will always produce exactly one status json if --output-status-json is defined.
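A status json like the featurize example above can be inspected programmatically. The sketch below is an illustration only: it assumes the layout shown in that example ("files", then "featurizers", then per-featurizer "valid" and "featurized" flags), which may differ for other operation types. The function name is hypothetical, and the failing entry in the sample data is made up to give the function something to report.

```python
from typing import Any, Dict, List, Tuple


def failed_featurizers(status: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (file path, featurizer name) pairs that were not valid and featurized."""
    failures = []
    for path, file_info in status.get("files", {}).items():
        featurizers = file_info.get("featurizers") or {}
        for name, result in featurizers.items():
            if not (result.get("valid") and result.get("featurized")):
                failures.append((path, name))
    return failures


structured_file = (
    "/beep/beep/CLI_TEST_FILES_FEATURIZATION/PreDiag_000440_0000FB_structure.json"
)

# Abbreviated status dict modeled on the example above.
# The DeltaQFastCharge failure is invented for this illustration.
status = {
    "op_type": "featurize",
    "files": {
        structured_file: {
            "featurizers": {
                "HPPCResistanceVoltageFeatures": {
                    "valid": True,
                    "featurized": True,
                    "traceback": None,
                },
                "DeltaQFastCharge": {
                    "valid": True,
                    "featurized": False,
                    "traceback": "(hypothetical traceback text)",
                },
            }
        }
    },
}

failures = failed_featurizers(status)
for path, name in failures:
    print(f"{name} failed on {path}")
```

With a real run, the dict would come from json.load on the file passed to --output-status-json.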

Fault-tolerance

--halt-on-error

By default, BEEP runs all operations in a fault-tolerant manner. This means that if the CLI command syntax is valid but an operation or sub-operation fails internally, the process will still exit successfully.

To disable this behavior, so that any error in any operation or sub-operation fails the entire command, use the --halt-on-error flag.

Extra metadata and run-tracking with status json

Running many experiments can make it difficult to keep track of which input and output files correspond to which experiment. Data about input files and output files is kept in the status json, but for further tracking there are two arguments which can be specified:

--run-id

An integer run_id to associate with this operation. The run_id is recorded in the metadata field of any operation in its status json.

--tags

A list of string tags to associate with this operation. The tags are recorded in the metadata field of any operation in its status json.

An example of a status json containing a user run id and user tags:

# in status json output
...
  "metadata": {
    "beep_verison": "2021.8.2.15",
    "op_datetime_utc": "2021-09-04 00:40:12",
    "run_id": 234,
    "tags": ["my_tag_1", "TRI_experiments_2021", "debugging"]
  }
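Since run_id and tags are stored under the "metadata" key, a collection of status jsons can be filtered by them after the fact. The following is a minimal sketch under that assumption; the function name select_runs and the sample records are hypothetical, with values borrowed from the example above.

```python
from typing import Any, Dict, List, Optional


def select_runs(
    statuses: List[Dict[str, Any]],
    tag: Optional[str] = None,
    run_id: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep status dicts whose metadata matches the given tag and/or run_id."""
    selected = []
    for status in statuses:
        metadata = status.get("metadata", {})
        if tag is not None and tag not in (metadata.get("tags") or []):
            continue
        if run_id is not None and metadata.get("run_id") != run_id:
            continue
        selected.append(status)
    return selected


# Hypothetical, abbreviated status records from three separate beep runs.
statuses = [
    {"op_type": "structure", "metadata": {"run_id": 234, "tags": ["debugging"]}},
    {"op_type": "featurize", "metadata": {"run_id": 235, "tags": ["TRI_experiments_2021"]}},
    {"op_type": "featurize", "metadata": {"run_id": None, "tags": []}},
]

debug_runs = select_runs(statuses, tag="debugging")
run_235 = select_runs(statuses, run_id=235)
print([s["op_type"] for s in debug_runs])
print([s["op_type"] for s in run_235])
```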

Controlling compression and output file formats

Serialization in beep is done by the monty library; to use compression on any output files, status files, or intermediate files in any beep subcommand, append .gz to the end of the output filename(s).

For example:

# Write the status json to a regular (uncompressed) json file
# and write the feature matrix output artifact to a gzipped json file

$: beep -s status.json featurize * outputFeatureMatrix.json.gz

Although they are not officially supported, other compression methods (such as .bz2) and file formats (such as .yaml) may work for beep input and output files if they are supported by the current version of monty.
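The suffix convention can be mirrored outside of beep when reading output files as raw JSON. The sketch below uses only the Python standard library rather than monty, so it returns plain dicts and lists and does not reconstruct any beep objects. The function names and file names are hypothetical, and only the plain and .gz cases are handled.

```python
import gzip
import json
import os
import tempfile
from typing import Any


def dump_json_maybe_gz(obj: Any, path: str) -> None:
    """Write obj as JSON, gzip-compressed if the path ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt", encoding="utf-8") as f:
        json.dump(obj, f)


def load_json_maybe_gz(path: str) -> Any:
    """Read JSON from a plain or gzip-compressed file, chosen by suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Round-trip a small payload through both formats in a temporary directory.
payload = {"run_id": 234, "tags": ["debugging"]}
with tempfile.TemporaryDirectory() as tmpdir:
    plain_path = os.path.join(tmpdir, "status.json")
    gz_path = os.path.join(tmpdir, "status.json.gz")
    dump_json_maybe_gz(payload, plain_path)
    dump_json_maybe_gz(payload, gz_path)
    loaded_plain = load_json_maybe_gz(plain_path)
    loaded_gz = load_json_maybe_gz(gz_path)

print(loaded_plain == loaded_gz == payload)
```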