
Anomaly Detection Service

Description

The Anomaly Detection service trains and validates models and runs inference to detect anomalies in data.

Anomaly Detection Proxy

The Anomaly Detection proxy lets the app side call functions of the Anomaly Detection service via an API.

The proxy supports three tasks:

  • Train - Train the model using an unsupervised learning algorithm.
    • If a label file is provided, training tries different parameter combinations and trains the model with the combination that gives the best F1 score.
    • If no label file is provided, training uses the default configuration.
  • Validate - Validate the trained model.
  • Infer - Run inference in one of two modes:
    • Predict on the feature file, given a trained model (multivariate).
    • Detect outliers (univariate).
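
The label-driven parameter search in the Train task can be pictured with a small, self-contained sketch. It is not the service's implementation: the z-score detector is only a stand-in for the real model (for example DBSCAN), and the names f1_score, zscore_detector, search_best_params and the grid values are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev


def f1_score(y_true, y_pred):
    """F1 score for the positive class (anomaly = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def zscore_detector(values, threshold, use_abs):
    """Stand-in unsupervised detector: flag points far from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0] * len(values)
    scores = [(v - mu) / sigma for v in values]
    if use_abs:
        scores = [abs(s) for s in scores]
    return [1 if s > threshold else 0 for s in scores]


def search_best_params(values, labels, grid):
    """Try every parameter combination and keep the one with the best F1."""
    best_params, best_f1 = None, -1.0
    keys = list(grid)
    for combo in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, combo))
        preds = zscore_detector(values, **params)
        score = f1_score(labels, preds)
        if score > best_f1:  # ties keep the first combination found
            best_params, best_f1 = params, score
    return best_params, best_f1


# Toy data: two labelled anomalies (9.0 and -7.0) among values near 1.0.
values = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 9.0, 1.0, -7.0, 1.1]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
grid = {"threshold": [0.5, 1.0, 1.5, 3.0], "use_abs": [False, True]}

best_params, best_f1 = search_best_params(values, labels, grid)
print(best_params, best_f1)
```

Without labels there is nothing to score against, which is why the no-label case falls back to a default configuration.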

The Anomaly Detection proxy exposes six endpoints, a POST and a GET for each of the following paths:

  • anomaly_detection/train: Train an anomaly detection model
  • anomaly_detection/valid: Validate the trained model
  • anomaly_detection/infer: Run inference with a trained model

Train Endpoint

POST method

Input Schemas

  • request_id: uuid
  • model_name: str - Specify the model name to use
  • save_charts: bool
  • train_feature_file: str - Path to features data file.
  • train_label_file: str - Path to labels data file.

Input Assumptions

  • train_label_file is optional
  • Both the features and labels data files are expected to be numerical
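
Because train_label_file is optional, a client may want a small helper that only sends the field when a label file exists. The sketch below is one way to do that. The helper name build_train_payload is hypothetical, and it assumes the service treats a missing key as "no label file", which is not confirmed here.

```python
import uuid


def build_train_payload(model_name, train_feature_file,
                        train_label_file=None, save_charts=True):
    """Build the POST body for anomaly_detection/train."""
    payload = {
        "request_id": str(uuid.uuid4()),
        "model_name": model_name,
        "save_charts": save_charts,
        "train_feature_file": train_feature_file,
    }
    # Assumption: omitting the key signals that no label file is available.
    if train_label_file:
        payload["train_label_file"] = train_label_file
    return payload


with_labels = build_train_payload(
    "DBSCAN",
    "testprojectid1/testfileid1/cogen_feature",
    train_label_file="testprojectid1/testfileid1/cogen_anomaly",
)
without_labels = build_train_payload(
    "DBSCAN", "testprojectid1/testfileid1/cogen_feature"
)
print(with_labels)
print(without_labels)
```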

Code Examples

import json
import uuid

import requests

url = "http://localhost:8000/anomaly_detection/train/"
id_ = str(uuid.uuid4())
data = {
    "request_id": id_,
    "model_name": "DBSCAN",
    "train_feature_file": "testprojectid1/testfileid1/cogen_feature",
    "save_charts": True,
    "train_label_file": "testprojectid1/testfileid1/cogen_anomaly"
}

json_str = json.dumps(data)

print(json_str)
# Check result

res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text)  # print raw response content

if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • task_id: str - ID of the Celery task

GET method

Input Schemas

  • _id: str - ID of the request

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • predictions_label_file: str - Path to the prediction file
  • model_path: str - Path to the trained model.
  • predictions_charts_file: str - Path to prediction charts file.
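
Since the POST response returns a Celery task ID and the results arrive through the GET method, a client typically polls until the task finishes. The sketch below shows one way to structure that loop. It takes the HTTP call as a fetch callable so the polling logic stays independent of how _id is passed to the GET endpoint. The helper name poll_until_done and the status strings "PENDING", "SUCCESS" and "FAILURE" are assumptions, not values confirmed for this service.

```python
import time


def poll_until_done(fetch, request_id, done_statuses,
                    interval=2.0, timeout=60.0, sleep=time.sleep):
    """Call fetch(request_id) until its status is in done_statuses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(request_id)
        if result.get("status") in done_statuses:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request {request_id} did not finish in time")
        sleep(interval)


# Fake GET responses so the example runs without a server.
responses = iter([
    {"_id": "abc", "task": "train", "status": "PENDING"},
    {"_id": "abc", "task": "train", "status": "PENDING"},
    {
        "_id": "abc",
        "task": "train",
        "status": "SUCCESS",
        "model_path": "testprojectid1/testfileid1/anomaly_detection/1727161197_dbscan_model.pkl",
    },
])

result = poll_until_done(
    lambda _id: next(responses),
    "abc",
    done_statuses={"SUCCESS", "FAILURE"},
    interval=0,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(result["status"], result["model_path"])
```

In real use, fetch would wrap the GET request to the train endpoint and return the parsed JSON body.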

Data Storage

[Figure: Data Storage, train task]

Database

[Figure: Database, train task]

Validate Endpoint

POST method

Input Schemas

  • request_id: uuid
  • model_path: str - Path to a trained model
  • save_charts: bool
  • valid_feature_file: str - Path to features data file
  • valid_label_file: str - Path to labels data file

Input Assumptions

  • Both the features and labels data files are expected to be numerical
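
A quick local check can catch non-numerical cells before a file is submitted. The sketch below assumes the data is CSV text with a header row, which this page does not confirm. The helper name non_numeric_cells is hypothetical.

```python
import csv
import io


def non_numeric_cells(csv_text, has_header=True):
    """Return (row, column, value) for every cell that is not a number."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the column names
    bad = []
    for row_idx, row in enumerate(reader, start=1):
        for col_idx, cell in enumerate(row):
            try:
                float(cell)
            except ValueError:
                bad.append((row_idx, col_idx, cell))
    return bad


sample = "pressure,temperature\n1.5,20\n2.0,warm\n"
problems = non_numeric_cells(sample)
print(problems)
```

Row numbers count data rows from 1 and column numbers start at 0. Empty cells are reported as non-numerical as well.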

Code Examples

import json
import uuid

import requests

id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/valid/"
data = {
    "request_id": id_,
    "model_path": "testprojectid1/testfileid1/anomaly_detection/1727161197_dbscan_model.pkl",
    "valid_feature_file": "testprojectid1/testfileid1/cogen_feature",
    "save_charts": True,
    "valid_label_file": "testprojectid1/testfileid1/cogen_anomaly",
}


json_str = json.dumps(data)

print(json_str)
# Check result

res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text)  # print raw response content

if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • task_id: str - ID of the Celery task

GET method

Input Schemas

  • _id: str - ID of the request

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • metrics: Objects - Metrics of the performed task
  • predictions_label_file: str - Path to the prediction file
  • predictions_charts_file: str - Path to validation charts file.

Data Storage

[Figure: Data Storage, validate task]

Database

[Figure: Database, validate task]

Inference Endpoint

Input Assumptions

  • features_file is required and is expected to be numerical
  • model_path is required if detection_type is multivariate
  • method_name is required if detection_type is univariate (outlier detection)
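
These rules can be checked on the client before a request is sent. The sketch below encodes them as a small validator. The function name validate_infer_payload and the error messages are hypothetical, and the field names follow the input schema listed for this endpoint.

```python
def validate_infer_payload(payload):
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    if not payload.get("features_file"):
        errors.append("features_file is required")

    detection_type = payload.get("detection_type")
    if detection_type == "multivariate":
        if not payload.get("model_path"):
            errors.append("model_path is required for multivariate detection")
    elif detection_type == "univariate":
        if not payload.get("method_name"):
            errors.append("method_name is required for univariate detection")
    else:
        errors.append("detection_type must be 'univariate' or 'multivariate'")
    return errors


ok = validate_infer_payload({
    "detection_type": "univariate",
    "method_name": "quantile",
    "features_file": "testprojectid2/testfileid1/sample_data",
})
missing_model = validate_infer_payload({
    "detection_type": "multivariate",
    "model_path": "",
    "features_file": "testprojectid1/testfileid1/cogen_feature.csv",
})
print(ok)
print(missing_model)
```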

Multivariate

POST method

Input Schemas

  • request_id: uuid
  • detection_type: str - univariate or multivariate
  • method_name: Optional[str] - Name of the statistical method
  • model_path: Optional[str] - Path to a trained model
  • features_file: str - Path to features data file

Code Examples

import json
import uuid

import requests

id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/infer/"
data = {
    "request_id": id_,
    "detection_type": "multivariate",
    "method_name": "",
    "model_path": "testprojectid1/testfileid1/anomaly_detection/1728555947_dbscan_model.pkl",  # noqa: E501
    "features_file": "testprojectid1/testfileid1/cogen_feature.csv",
}

json_str = json.dumps(data)

print(json_str)
# Check result

res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text)  # print raw response content

if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • task_id: str - ID of the Celery task

GET method

Input Schemas

  • _id: str - ID of the request

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • predictions_label_file: str - Path to the prediction file

Data Storage

[Figure: Data Storage, multivariate inference]

Database

[Figure: Database, multivariate inference]

Univariate - Outlier Detection

Input Assumptions

  • features_file must contain a Time column as its first column
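
The Time column requirement is easy to verify locally before submitting a file. The sketch below assumes the features file is CSV text with a header row and that the column is named exactly "Time". Neither detail is confirmed here, and the helper name first_column_is_time is hypothetical.

```python
import csv
import io


def first_column_is_time(csv_text, time_column="Time"):
    """Check that the header's first column is the time column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return bool(header) and header[0].strip() == time_column


good = "Time,sensor_a\n2024-01-01 00:00,1.2\n"
bad = "sensor_a,Time\n1.2,2024-01-01 00:00\n"
print(first_column_is_time(good))
print(first_column_is_time(bad))
```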

POST method

Input Schemas

  • request_id: uuid
  • detection_type: str - univariate or multivariate
  • method_name: Optional[str] - Name of the statistical method
  • model_path: Optional[str] - Path to a trained model
  • features_file: str - Path to features data file

Code Examples

import json
import uuid

import requests

id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/infer/"
data = {
    "request_id": id_,
    "detection_type": "univariate",
    "method_name": "quantile",  # iqr, quantile, persist
    "model_path": "",  # noqa: E501
    "features_file": "testprojectid2/testfileid1/sample_data",
}

json_str = json.dumps(data)

print(json_str)
# Check result

res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text)  # print raw response content

if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")
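
The request example lists iqr, quantile and persist as method names. To give a feel for what a univariate statistical method does, here is a minimal sketch of the common interquartile-range rule. It is a textbook version, not the service's code: the function name iqr_outliers, the multiplier k=1.5 and the inclusive quantile method are assumptions.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] with 1, others with 0."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [1 if (v < lower or v > upper) else 0 for v in values]


readings = [10, 11, 12, 11, 10, 12, 11, 50]
flags = iqr_outliers(readings)
print(flags)
```

A larger k makes the rule more tolerant, so fewer points are flagged.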

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • task_id: str - ID of the Celery task

GET method

Input Schemas

  • _id: str - ID of the request

Output Schemas

  • _id: uuid - ID of the request
  • task: str - Name of the performed task
  • status: str - Status of the performed task
  • predictions_label_file: str - Path to the prediction file

Data Storage

[Figure: Data Storage, univariate inference]

Database

[Figure: Database, univariate inference]