Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules
Towards Data Science
🔬 Research
#review
#roc-auc
#symbolic-ai
#imbalanced-data
#fraud-detection
#neural-networks
Original source: Towards Data Science · Summarized and analyzed by Genesis Park
Full Text
Abstract

The Problem: When ROC-AUC Lies

I had a fraud dataset with a 0.17% positive rate. I trained a weighted BCE network, got a ROC-AUC of 0.96, and someone said "nice". Then I pulled up the score distributions and threshold-dependent metrics. The model had quietly figured out that predicting "not fraud" on anything ambiguous was the path of least resistance, and nothing in the loss function disagreed with that decision.

What bothered me wasn't the math. It was that the model had no idea what fraud looks like. A junior analyst on day one could tell you: large transactions are suspicious, transactions with unusual PCA signatures are suspicious, and when both happen together, you should definitely be paying attention. That knowledge just… never makes it into the training loop.

So I ran an experiment. What if I encoded that analyst intuition as a soft constraint directly in the loss function, something the network has to satisfy while also fitting the labels? The result was a Hybrid Neuro-Symbolic (HNS) setup. This article walks through the full experiment: the model, the rule loss, the lambda sweep, and, critically, what a proper multi-seed variance analysis with symmetric threshold evaluation actually shows.

The Setup

I used the Kaggle Credit Card Fraud dataset: 284,807 transactions, 492 of which are fraud (0.172%). The V1–V28 features are PCA components from an anonymized original feature space; Amount and Time are raw. The severe imbalance is the whole point; this is where standard approaches start to struggle [1]. The split was 70/15/15 train/val/test, stratified.
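A stratified 70/15/15 split can be sketched as two chained calls to scikit-learn's train_test_split. The sketch below runs on a synthetic stand-in with the same class balance, since the Kaggle CSV isn't bundled here; in the real experiment you would load creditcard.csv and split its features and Class column the same way.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in mimicking creditcard.csv's shape and class balance
# (30 features, ~0.172% positives). Replace with the real dataframe.
rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 30)).astype(np.float32)
y = (rng.random(n) < 0.00172).astype(np.float32)

# 70/15/15: carve off 30%, then halve it into val and test.
# stratify keeps the fraud rate consistent across all three splits.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42
)
```

Chaining two splits this way is the standard workaround for train_test_split only producing two partitions per call.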
I trained four things and compared them head-to-head:

- Isolation Forest: contamination=0.001, fit on the full training set
- One-Class SVM: nu=0.001, fit only on the non-fraud training samples
- Pure Neural: a three-layer MLP with BCE + class weighting, no domain knowledge
- Hybrid Neuro-Symbolic: the same MLP, with a differentiable rule penalty added to the loss

Isolation Forest and One-Class SVM serve as a gut check. If a supervised network with 199k training samples cannot clear the bar set by an unsupervised method, that is worth knowing before you write up results. A tuned gradient boosting model would likely outperform both neural approaches; this comparison is intended to isolate the effect of the rule loss, not to benchmark against all possible methods. Full code for all four is on GitHub.

The Model

Nothing exotic: a three-layer MLP with batch normalization after each hidden layer. The batch norm matters more than you might expect; under heavy class imbalance, activations can drift badly without it [3].

```python
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.BatchNorm1d(128),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.BatchNorm1d(64),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)
```

For the loss, BCEWithLogitsLoss with pos_weight, computed as the ratio of non-fraud to fraud counts in the training set. On this dataset that is roughly 577 [4]: a single fraud sample in a batch generates 577 times the gradient of a non-fraud one.

pos_weight = count(y=0) / count(y=1) ≈ 577

That weight provides a directional signal when labeled fraud does appear. But the model still has no concept of what "suspicious" looks like in feature space; it only knows that fraud examples, when they do show up, should be heavily weighted. That is different from knowing where to look in batches that happen to contain no labeled fraud at all.

The Rule Loss

Here is the core idea.
Fraud analysts know two things empirically: unusually high transaction amounts are suspicious, and transactions that sit far from normal behavior in PCA space are suspicious. I want the model to assign high fraud probabilities to transactions that match both signals, even when a batch contains no labeled fraud examples.

The trick is making the rule differentiable. An if/else threshold (flag any transaction where amount > 1000) is a hard step function. Its gradient is zero everywhere except at the threshold itself, where it is undefined. That means backpropagation has nothing to work with; the rule produces no useful gradient signal and the optimizer ignores it.

Instead, I use a steep sigmoid centered at the batch mean. It approximates the same threshold behavior but stays smooth and differentiable everywhere; the gradient is small far from the boundary and peaks near it, which is exactly where you want the optimizer paying attention. The result is a smooth suspicion score between 0 and 1:

```python
import torch

def rule_loss(x, probs):
    # x[:, -1]   = Amount (last column in creditcard.csv after dropping Class)
    # x[:, 1:29] = V1-V28 (PCA components, columns 1-28)
    amount = x[:, -1]
    pca_norm = torch.norm(x[:, 1:29], dim=1)
    # Steep sigmoids centered at the batch means: smooth stand-ins for
    # "amount is high" and "far from normal behavior in PCA space"
    suspicious = (
        torch.sigmoid(5 * (amount - amount.mean()))
        + torch.sigmoid(5 * (pca_norm - pca_norm.mean()))
    ) / 2.0
    # Penalize suspicious transactions the model scores below 0.6
    penalty = suspicious * torch.relu(0.6 - probs.squeeze())
    return penalty.mean()
```

A note on why PCA norm specifically
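To make the lambda sweep concrete, here is a minimal sketch of how the total objective might be wired: weighted BCE plus lambda times the rule penalty. The hybrid_loss helper and its exact signature are my assumption, not the article's code; the suspicion term inlines the same computation as rule_loss so the block stands alone.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(model, x, y, pos_weight, lam):
    """Weighted BCE plus a lambda-scaled differentiable rule penalty."""
    logits = model(x).squeeze(-1)
    # Standard class-weighted BCE on the labels
    bce = F.binary_cross_entropy_with_logits(logits, y, pos_weight=pos_weight)
    probs = torch.sigmoid(logits)
    # Soft suspicion score: same construction as rule_loss above
    amount = x[:, -1]
    pca_norm = torch.norm(x[:, 1:29], dim=1)
    suspicious = (
        torch.sigmoid(5 * (amount - amount.mean()))
        + torch.sigmoid(5 * (pca_norm - pca_norm.mean()))
    ) / 2.0
    rule = (suspicious * torch.relu(0.6 - probs)).mean()
    return bce + lam * rule
```

With lam=0 this degenerates to the pure neural baseline, which is what makes a clean lambda sweep possible: the two models differ only in that one scalar.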
This analysis was produced by the Genesis Park editorial team with the help of AI. The original article is available via the source link.