Installation

Prerequisites (an environment manager such as conda, pipenv, or poetry is recommended)

python >= 3.9, < 3.13
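To confirm that the active interpreter falls inside this range before installing, a small check along these lines may help. The helper name `is_supported` is made up for illustration and is not part of the suite.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True if the version is >= 3.9 and < 3.13 (the range stated above)."""
    return (3, 9) <= tuple(version_info[:2]) < (3, 13)


if __name__ == "__main__":
    status = "supported" if is_supported() else "unsupported"
    print(f"Python {sys.version.split()[0]} is {status}")
```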

Use `pip` to install the suite from PyPI

pip install EasyTSAD

Additional Dependencies

Some built-in algorithms are based on PyTorch 2.0 or PyTorch Lightning 2.0. To run these baselines, you may need to install the related packages (including but not limited to pytorch, pytorch-lightning, torchinfo, and torch_optimizer).
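A quick way to see which of these optional packages are still missing is to probe for their import names. This is only a sketch: the helper `missing_modules` is hypothetical, and the list covers just the packages named above, so a given baseline may need others.

```python
from importlib.util import find_spec

# Import names (not pip package names) of the optional packages named above.
OPTIONAL_MODULES = ["torch", "pytorch_lightning", "torchinfo", "torch_optimizer"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(OPTIONAL_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```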

Prepare datasets

Use default datasets

Original datasets can be downloaded from https://wait-to-be-published. The directory structure of the dataset is shown as follows:

datasets
└── UTS
    ├── dataset_1
    │   ├── time_series_1
    │   │   ├── train.npy (training set, 1-D ndarray, necessary)
    │   │   ├── test.npy (test set, 1-D ndarray, necessary)
    │   │   ├── train_label.npy (labels of training set, 1-D ndarray, necessary)
    │   │   ├── test_label.npy (labels of test set, 1-D ndarray, necessary)
    │   │   ├── train_timestamp.npy (timestamps of training set, 1-D ndarray, optional)
    │   │   ├── test_timestamp.npy (timestamps of test set, 1-D ndarray, optional)
    │   │   └── info.json (some additional information, json, optional)
    │   │
    │   ├── time_series_2
    │   └── ...
    │
    ├── dataset_2
    └── ...

The file info.json contains information such as:

{
    "intervals": 300,
    "training set anomaly ratio": 0.00148,
    "testset anomaly ratio": 0.00808,
    "total anomaly ratio": 0.00478
}
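If you need to produce a similar info.json for your own data, the following sketch shows one way to do it. It assumes that each ratio is simply the fraction of points labeled as anomalous and that "intervals" is the sampling interval; neither is stated explicitly here, and the helper names `anomaly_ratio` and `build_info` are hypothetical.

```python
import json


def anomaly_ratio(labels):
    """Fraction of points with a non-zero label (assumed definition)."""
    labels = list(labels)
    return sum(1 for x in labels if x) / len(labels) if labels else 0.0


def build_info(train_label, test_label, intervals):
    """Build a dict using the same keys as the info.json example above."""
    train_label, test_label = list(train_label), list(test_label)
    return {
        "intervals": intervals,  # assumed to be the sampling interval
        "training set anomaly ratio": anomaly_ratio(train_label),
        "testset anomaly ratio": anomaly_ratio(test_label),
        "total anomaly ratio": anomaly_ratio(train_label + test_label),
    }


if __name__ == "__main__":
    info = build_info([0, 0, 0, 1], [0, 1, 1, 0], intervals=300)
    print(json.dumps(info, indent=4))
```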

Add your datasets

Preprocess your dataset to match the structure and format above. Files labeled "necessary" must be provided. Then place the dataset under the datasets/UTS/ path.
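Before running experiments on a new dataset, it can be useful to check that every time series directory contains the four necessary files. The sketch below only checks that the files exist (not their shapes or dtypes); `find_incomplete_series` is a hypothetical helper, not something the suite provides.

```python
from pathlib import Path

# The files marked "necessary" in the directory structure above.
REQUIRED_FILES = ("train.npy", "test.npy", "train_label.npy", "test_label.npy")


def find_incomplete_series(root, dataset):
    """Map each time series directory name to the required files it lacks.

    `root` is the parent directory of "UTS"; `dataset` is the dataset name.
    """
    dataset_dir = Path(root) / "UTS" / dataset
    problems = {}
    for series_dir in sorted(p for p in dataset_dir.iterdir() if p.is_dir()):
        missing = [f for f in REQUIRED_FILES if not (series_dir / f).is_file()]
        if missing:
            problems[series_dir.name] = missing
    return problems
```

An empty result means every time series directory in the dataset has all necessary files.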

Usage

Examples of how to use the suite can be found here, including:
- run baselines with/without customized config files;
- implement your new algorithm with/without config files;
- implement your new evaluation protocol and evaluate the baselines;
- generate a CSV including the overall performance of all trained methods;
- aggregate all methods' anomaly scores into one plot.

An example that implements a new method.

Prepare a global config toml file. If not provided, the default configuration will be applied:

# One example of GlobalCfg.toml. 
# For more details please refer to the default configuration.
# The new items will overwrite the default ones.
[DatasetSetting]
 train_proportion = 1 # Use the last fraction x of the training set as the new training set. 1 means the full training set is used.
 valid_proportion = 0.2 # The proportion of the validation set to the new training set.
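To make the two proportions concrete, here is a rough sketch of how they could translate into index ranges. It assumes the validation set is carved from the tail of the retained training data; the suite's actual split may differ, and `split_train_valid` is a hypothetical name.

```python
def split_train_valid(n_points, train_proportion=1.0, valid_proportion=0.2):
    """Return (train_range, valid_range) as index ranges over the training set.

    Assumption: the last `train_proportion` of the points are retained, and
    the last `valid_proportion` of the retained points form the validation set.
    """
    if not 0 < train_proportion <= 1:
        raise ValueError("train_proportion must be in (0, 1]")
    if not 0 <= valid_proportion < 1:
        raise ValueError("valid_proportion must be in [0, 1)")
    n_used = int(round(n_points * train_proportion))
    start = n_points - n_used
    n_valid = int(n_used * valid_proportion)
    return range(start, n_points - n_valid), range(n_points - n_valid, n_points)


if __name__ == "__main__":
    train_idx, valid_idx = split_train_valid(1000, 1, 0.2)
    print(len(train_idx), len(valid_idx))
```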

Define the Controller

from typing import Dict
import numpy as np
from EasyTSAD.Controller import TSADController

# If cfg_path is None, the default configuration is used
gctrl = TSADController(cfg_path="/path/to/GlobalCfg.toml")

Load Dataset configurations

Option 1: Load certain time series in one dataset:

# Specify certain curves in one dataset, 
# e.g. AIOPS 0efb375b-b902-3661-ab23-9a0bb799f4e3 and ab216663-dcc2-3a24-b1ee-2c3e550e06c9
gctrl.set_dataset(
    dataset_type="UTS",
    dirname="/path/to/datasets", # The path to the parent directory of "UTS"
    datasets="AIOPS",
    curve_names=[
        "0efb375b-b902-3661-ab23-9a0bb799f4e3",
        "ab216663-dcc2-3a24-b1ee-2c3e550e06c9"
    ]
)

Option 2: Load all time series in certain datasets:

# Use all curves in datasets:
datasets = ["AIOPS", "Yahoo"]
gctrl.set_dataset(
    dataset_type="UTS",
    dirname="/path/to/datasets", # The path to the parent directory of "UTS"
    datasets=datasets,
)

Implement your algorithm (inherit from class BaseMethod):

The following class YourAlgo only provides a skeleton; you should implement the functions it declares.
- The Spot instance shows how to implement a statistical model;
- The ARLinear instance shows how to implement a learning-based model (implemented using PyTorch).

from EasyTSAD.Methods import BaseMethod
from EasyTSAD.DataFactory import TSData

class YourAlgo(BaseMethod):
    def __init__(self, hparams) -> None:
        super().__init__()
        self.__anomaly_score = None
        self.param_1 = hparams["param_1"]

    def train_valid_phase(self, tsTrain: TSData):
        ...

    def test_phase(self, tsData: TSData):
        result = ... 
        self.__anomaly_score = result

    def train_valid_phase_all_in_one(self, tsTrains: Dict[str, TSData]):
        # used for all-in-one and zero-shot mode
        ...

    def anomaly_score(self) -> np.ndarray:
        return self.__anomaly_score

    def param_statistic(self, save_file):
        pass
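As a rough idea of what the training and test phases might compute, the standalone sketch below scores each test point by its absolute z-score relative to the training data. It deliberately does not inherit from BaseMethod or touch TSData, since their attributes are not documented here; `ZScoreSketch`, `fit`, `score`, and the `eps` hyperparameter are all hypothetical. In a real method, `fit` would correspond to train_valid_phase and `score` to test_phase.

```python
import statistics


class ZScoreSketch:
    """Toy statistical detector: anomaly score = |x - mean| / std."""

    def __init__(self, hparams):
        # `eps` is an illustrative hyperparameter that avoids division by zero.
        self.eps = hparams.get("eps", 1e-8)
        self._mean = 0.0
        self._std = 1.0
        self._scores = None

    def fit(self, train_values):
        # Estimate mean and (population) standard deviation from training data.
        self._mean = statistics.fmean(train_values)
        self._std = statistics.pstdev(train_values)

    def score(self, test_values):
        # One score per test point; larger means more anomalous.
        self._scores = [
            abs(v - self._mean) / (self._std + self.eps) for v in test_values
        ]
        return self._scores


if __name__ == "__main__":
    model = ZScoreSketch({})
    model.fit([1.0, 2.0, 3.0, 4.0, 5.0])
    print(model.score([3.0, 10.0]))
```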

Run experiments for your algorithm

We offer two options for configuring algorithm settings:
- use a config file;
- specify the parameters in functions.

Note: Parameters defined within a function take higher priority than those specified in the configuration file.
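The priority rule in the note can be pictured as a dictionary merge in which values passed to the function override values read from the file. This is only an illustration of the rule, not the suite's implementation; `resolve_params` is a hypothetical helper, and treating None as "not specified" is an assumption.

```python
def resolve_params(file_params, call_params):
    """Merge two parameter dicts; explicitly passed values win over file values."""
    merged = dict(file_params)
    # Assumption: None means the caller did not specify the parameter.
    merged.update({k: v for k, v in call_params.items() if v is not None})
    return merged


if __name__ == "__main__":
    from_file = {"param_1": False, "preprocess": "z-score"}
    from_call = {"param_1": True, "preprocess": None}
    print(resolve_params(from_file, from_call))
```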

Option 1: Use config file for methods (Recommended)

Prepare a TOML file containing a subset of the items in Example.toml, for example:

# YourAlgo.toml
[Data_Params]
 preprocess = "z-score" 
[Model_Params.Default]
 param_1 = false

Load YourAlgo and the config file:

training_schema = "one_by_one"
method = "YourAlgo"  # string of your algo class

# run models
gctrl.run_exps(
    method=method,
    training_schema=training_schema,
    cfg_path="path/to/YourAlgo.toml"
)

Option 2: Specify the parameters in functions

gctrl.run_exps(
    method=method,
    training_schema=training_schema,
    hparams={
        "param_1": False,
    },
    preprocess="z-score", 
)

The score results can be found in workspace/Results/Scores, and the runtime information can be found in workspace/Results/RunTime.

Perform evaluations (Based on the saved scores)

from EasyTSAD.Evaluations.Protocols import EventF1PA, PointF1PA
# Specifying evaluation protocols
gctrl.set_evals(
    [
        PointF1PA(),
        EventF1PA(),
        EventF1PA(mode="squeeze")
    ]
)

gctrl.do_evals(
    method=method,
    training_schema=training_schema
)

The evaluation results can be found in workspace/Results/Evals.
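The name PointF1PA suggests a point-wise F1 score with point adjustment (PA). For readers unfamiliar with the idea, the sketch below shows the commonly used definition on binary predictions: if any point inside a labeled anomaly segment is flagged, the whole segment counts as detected. This is an assumption about what the protocol name refers to, not the suite's implementation, which operates on anomaly scores; the function names are hypothetical.

```python
def point_adjust(labels, preds):
    """Mark a whole labeled anomaly segment as detected if any point in it is."""
    adjusted = list(preds)
    i, n = 0, len(labels)
    while i < n:
        if labels[i]:
            j = i
            while j < n and labels[j]:  # find the end of this anomaly segment
                j += 1
            if any(preds[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


def f1_score(labels, preds):
    """Point-wise F1 = 2TP / (2TP + FP + FN); 0.0 if there are no positives."""
    tp = sum(1 for l, p in zip(labels, preds) if l and p)
    fp = sum(1 for l, p in zip(labels, preds) if not l and p)
    fn = sum(1 for l, p in zip(labels, preds) if l and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    labels = [0, 1, 1, 1, 0, 0, 1, 1, 0]
    preds = [0, 0, 1, 0, 0, 1, 0, 0, 0]
    print(f1_score(labels, point_adjust(labels, preds)))
```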

Plot the anomaly scores for each time series

gctrl.plots(
    method=method,
    training_schema=training_schema
)

The plot results can be found in workspace/Results/Plots/score_only.