Anomaly Detection Service
Description
The Anomaly Detection service trains, validates, and runs inference with models that detect anomalies in data.
Anomaly Detection Proxy
The Anomaly Detection proxy lets the application side call functions of the Anomaly Detection service through an API.
There are three tasks in this proxy:
- Train
  - Train the model using an unsupervised learning algorithm.
  - If a label file exists, training tries different parameter combinations and trains the model with the combination that gives the best F1 score.
  - If no label file exists, training uses the default configuration.
- Validate
  - Validate the trained model.
- Infer
  - Predict on the feature file, given a trained model.
  - Detect outliers.
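The label-dependent training behaviour described for the Train task can be sketched as below. This is an illustration only, not the service's implementation: the z-score detector, the `threshold` parameter, the candidate grid, and the default value are all hypothetical stand-ins for the real model and its parameters.

```python
from statistics import mean, pstdev

DEFAULT_THRESHOLD = 3.0  # hypothetical default configuration
CANDIDATE_THRESHOLDS = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical search grid


def detect(values, threshold):
    """Flag values whose absolute z-score exceeds the threshold (1 = anomaly)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0] * len(values)
    return [1 if abs(v - mu) / sigma > threshold else 0 for v in values]


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def choose_threshold(values, labels=None):
    """With labels, pick the candidate with the best F1; without, use the default."""
    if labels is None:
        return DEFAULT_THRESHOLD
    return max(CANDIDATE_THRESHOLDS, key=lambda t: f1_score(labels, detect(values, t)))
```

When labels are available the search is driven by F1; when they are not, there is nothing to score against, so the only reasonable fallback is a fixed default.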
There are 6 endpoints in the Anomaly Detection proxy, a POST and a GET for each task:
- anomaly_detection/train: Train an anomaly detection model
- anomaly_detection/valid: Validate the trained model
- anomaly_detection/infer: Run inference with a trained model
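As a quick orientation, the six method/URL pairs can be enumerated as follows. The base URL is an assumption taken from the local examples in this document, and the exact URL shape of the GET calls (for instance how `_id` is passed) is not specified here, so only the task paths are built.

```python
BASE_URL = "http://localhost:8000"  # assumption: host and port used in the examples
TASKS = ("train", "valid", "infer")
METHODS = ("POST", "GET")


def list_endpoints(base_url=BASE_URL):
    """Return (method, url) pairs: one POST and one GET per task."""
    return [
        (method, f"{base_url}/anomaly_detection/{task}/")
        for task in TASKS
        for method in METHODS
    ]
```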
Train Endpoint
POST method
Input Schemas
- request_id: uuid
- model_name: str - Specify the model name to use
- save_charts: bool
- train_feature_file: str - Path to features data file.
- train_label_file: str - Path to labels data file.
Input Assumptions
- train_label_file is optional
- Both the features and label data files are expected to be numerical
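A minimal sketch of building the request body with the optional label file. `build_train_payload` is a hypothetical helper, not part of the service. Whether the service expects an absent label file to be omitted from the body or sent as an empty value is not stated here; this sketch assumes it is omitted.

```python
import uuid


def build_train_payload(model_name, train_feature_file, train_label_file=None, save_charts=True):
    """Assemble the POST body for the train endpoint from the Input Schemas fields."""
    payload = {
        "request_id": str(uuid.uuid4()),
        "model_name": model_name,
        "save_charts": save_charts,
        "train_feature_file": train_feature_file,
    }
    # Assumption: the optional label file is left out of the body when not supplied.
    if train_label_file is not None:
        payload["train_label_file"] = train_label_file
    return payload
```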
Code Examples
import json
import uuid
import requests
url = "http://localhost:8000/anomaly_detection/train/"
id_ = str(uuid.uuid4())
data = {
"request_id": id_,
"model_name": "DBSCAN",
"train_feature_file": "testprojectid1/testfileid1/cogen_feature",
"save_charts": True,
"train_label_file": "testprojectid1/testfileid1/cogen_anomaly"
}
json_str = json.dumps(data)
print(json_str)
# Check result
res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text) # print raw response content
if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")
Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- task_id: str - ID of the Celery task

GET method
Input Schemas
- _id: str - ID of the request

Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- predictions_label_file: str - Path to the prediction file
- model_path: str - Path to the trained model.
- predictions_charts_file: str - Path to prediction charts file.
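Because the POST call returns the ID of a Celery task, the result presumably has to be fetched later through the GET method. A minimal polling sketch under stated assumptions: `wait_for_result` and `fetch` are hypothetical names, and the terminal status values are assumed to follow Celery's state names (`SUCCESS`, `FAILURE`), which this document does not confirm.

```python
import time

TERMINAL_STATUSES = {"SUCCESS", "FAILURE"}  # assumption: Celery-style state names


def wait_for_result(fetch, request_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch(request_id) until the returned status is terminal or attempts run out.

    `fetch` is any caller-supplied callable that performs the GET request and
    returns the decoded JSON body as a dict.
    """
    for attempt in range(max_attempts):
        result = fetch(request_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"request {request_id} did not finish after {max_attempts} attempts")
```

The HTTP call is injected as `fetch` because how `_id` is passed to the GET endpoint (path, query string, or body) is not described here; the caller can wrap `requests.get` in whatever form the service expects.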
Data Storage

Database

Validate Endpoint
POST method
Input Schemas
- request_id: uuid
- model_path: str - Path to a trained model
- save_charts: bool
- valid_feature_file: str - Path to features data file
- valid_label_file: str - Path to labels data file

Input Assumptions
- Both the features and label data files are expected to be numerical
Code Examples
import json
import uuid
import requests
id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/valid/"
data = {
"request_id": id_,
"model_path": "testprojectid1/testfileid1/anomaly_detection/1727161197_dbscan_model.pkl",
"valid_features_file": "testprojectid1/testfileid1/cogen_feature",
"save_charts": True,
"valid_labels_file": "testprojectid1/testfileid1/cogen_anomaly",
}
json_str = json.dumps(data)
print(json_str)
# Check result
res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text) # print raw response content
if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")
Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- task_id: str - ID of the Celery task

GET method
Input Schemas
- _id: str - ID of the request

Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- metrics: Objects - Metrics of the performed task
- predictions_label_file: str - Path to the prediction file
- predictions_charts_file: str - Path to validation charts file.

Data Storage

Database

Inference Endpoint
Input Assumptions
- features_file is required and expected to be numerical
- model_path is required if detection_type is multivariate
- method_name is required if detection_type is univariate - Outlier Detection
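The conditional requirements above can be checked on the client before sending a request. A minimal sketch, assuming the field names given in Input Schemas; `validate_infer_payload` is a hypothetical helper and says nothing about how the service itself validates input.

```python
def validate_infer_payload(payload):
    """Return a list of problems; an empty list means the rules above are satisfied."""
    errors = []
    if not payload.get("features_file"):
        errors.append("features_file is required")
    detection_type = payload.get("detection_type")
    if detection_type == "multivariate":
        if not payload.get("model_path"):
            errors.append("model_path is required for multivariate detection")
    elif detection_type == "univariate":
        if not payload.get("method_name"):
            errors.append("method_name is required for univariate detection")
    else:
        errors.append("detection_type must be 'univariate' or 'multivariate'")
    return errors
```

Empty strings are treated the same as missing values, so a placeholder such as `"model_path": ""` does not satisfy the multivariate rule.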
Multivariate
POST method
Input Schemas
- request_id: uuid
- detection_type: str - univariate or multivariate
- method_name: Optional[str] - Name of the statistical method
- model_path: Optional[str] - Path to a trained model
- features_file: str - Path to features data file
Code Examples
import json
import uuid
import requests
id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/infer/"
data = {
"request_id": id_,
"detection_type": "multivariate",
"method_name": "",
"model_path": "testprojectid1/testfileid1/anomaly_detection/1728555947_dbscan_model.pkl", # noqa: E501
"feature_file": "testprojectid1/testfileid1/cogen_feature.csv",
}
json_str = json.dumps(data)
print(json_str)
# Check result
res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text) # print raw response content
if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")
Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- task_id: str - ID of the Celery task

GET method
Input Schemas
- _id: str - ID of the request

Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- predictions_label_file: str - Path to the prediction file
Data Storage

Database

Univariate - Outlier Detection
Input Assumptions
The features_file must contain a Time column as its first column.
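A quick local check of this assumption before submitting a file might look like the sketch below. It assumes the features file is CSV text with a header row and that the column is named exactly `Time`; neither detail is confirmed by this document, and `first_column_is_time` is a hypothetical helper.

```python
import csv
import io


def first_column_is_time(csv_text, time_column="Time"):
    """Return True if the first column of the header row is the time column."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return bool(header) and header[0].strip() == time_column
```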
POST method
Input Schemas
- request_id: uuid
- detection_type: str - univariate or multivariate
- method_name: Optional[str] - Name of the statistical method
- model_path: Optional[str] - Path to a trained model
- features_file: str - Path to features data file

Code Examples
import json
import uuid
import requests
id_ = str(uuid.uuid4())
url = "http://localhost:8000/anomaly_detection/infer/"
data = {
"request_id": id_,
"detection_type": "univariate",
"method_name": "quantile", # iqr, quantile, persist
"model_path": "", # noqa: E501
"feature_file": "testprojectid2/testfileid1/sample_data",
}
json_str = json.dumps(data)
print(json_str)
# Check result
res = requests.post(url, json=data)
print("Status Code:", res.status_code)
print("Response Content:", res.text) # print raw response content
if res.status_code == 200:  # Ensure that the status code is 200 before parsing
    result = json.loads(res.content)
    print(result)
else:
    print("Request failed.")
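For intuition about the `iqr` option named in the `method_name` comment of the example, the conventional interquartile-range rule can be sketched as follows. This is the textbook Tukey rule with the usual 1.5 multiplier, not necessarily what the service computes; `iqr_outliers` and its `k` parameter are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Mark values outside [Q1 - k*IQR, Q3 + k*IQR] as outliers."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]
```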
Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- task_id: str - ID of the Celery task

GET method
Input Schemas
- _id: str - ID of the request

Output Schemas
- _id: uuid - ID of the request
- task: str - Name of the performed task
- status: str - Status of the performed task
- predictions_label_file: str - Path to the prediction file

Data Storage

Database
