AI Command Center

Machine learning model monitoring and operations

Running

Training

Epoch 42/100

42%
Loss: 0.34
Elapsed
2h 14m
Remaining
~3h 05m
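
The Remaining figure is consistent with a simple linear extrapolation from the Elapsed time and the epoch counter. The sketch below reproduces it, assuming every remaining epoch takes the average time of the epochs completed so far; the dashboard does not state how it computes the estimate, and `estimate_remaining_minutes` and `format_duration` are hypothetical helper names.

```python
def estimate_remaining_minutes(elapsed_minutes: float,
                               epochs_done: int,
                               epochs_total: int) -> float:
    """Extrapolate linearly: remaining epochs times average minutes per epoch."""
    if epochs_done <= 0 or epochs_done > epochs_total:
        raise ValueError("epochs_done must be in the range 1..epochs_total")
    minutes_per_epoch = elapsed_minutes / epochs_done
    return minutes_per_epoch * (epochs_total - epochs_done)


def format_duration(minutes: float) -> str:
    """Render a duration in minutes as 'Hh MMm', rounded to the nearest minute."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours}h {mins:02d}m"


# Values taken from the Training tile: epoch 42 of 100, 2h 14m elapsed.
remaining = estimate_remaining_minutes(elapsed_minutes=2 * 60 + 14,
                                       epochs_done=42,
                                       epochs_total=100)
print(format_duration(remaining))  # prints "3h 05m"
```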

Model Performance

Classification Metrics

Acc
94.2%
F1
0.90

Latency

45ms (-2ms)

p95 Inference

Live
v4.2.0-stable

ResNet-50 Classifier

RPM
158
+12% vs avg
Latency
48ms
Stable
Load
42%
Traffic
Running on g4dn.xlarge (us-east-1a)

API Volume

1.2k +14%
Global
Peak: 1.9k
24h

GPU Status

NVIDIA A100
80%
Memory (VRAM): 32/40 GB
Compute: 92%

Features

Global Importance

Data Drift

Detected
Feature: "Age"
Baseline
Current

Confusion Matrix

Predicted vs Actual

ROC Curve

0.96 Excellent

Area Under Curve

Predictions

SKEW: -0.42
2.5k Samples
Mean
0.82
Median
0.85
Low Conf
5.2%

Exceptions

25 Last 24h
Critical
Warning
Info

Deployment

Healthy
Live
Canary (v2.1): 10%
Stable (v2.0): 90%
Errors: 0.01%
Rollback: Ready

Data Quality

102 CHECKS PASSED
Completeness
Validity
Consistency
Scan: 5m ago
2.4M Rows

Audit Log

12 events today
Model Promoted
jdoe to prod, 2h ago
Retraining Started
system auto-trigger, 5h ago

Validation

PASSED
5-Fold Cross Val
89% Mean Acc
Precision
0.92
Recall
0.87
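
The Precision and Recall values above can be combined into an F1 score, which is their harmonic mean. The sketch below is illustrative only: `f1_score` is a hypothetical helper, and the dashboard does not say whether its F1 tile is derived from these validation figures or from a different evaluation set.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the Validation tile.
f1 = f1_score(precision=0.92, recall=0.87)
print(f"{f1:.2f}")  # prints "0.89"
```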

Versions

v3.1.2 d3ad...b33f
New Data Added
2 mins ago by @data-team
14.2 GB
Schema Update
1 day ago

Retraining

Next Job

14h Remaining
Schedule: Sunday 2AM

Retries

25 events
Last 8 Hours

Algorithm

Active Selection

XGBoost Classifier
AutoML Score: 0.94
Random Forest
Score: 0.89