Building Robust Predictive Systems for Structured Data

🧑‍💻 Zhipeng “Zippo” He @ School of Information Systems
Queensland University of Technology


Confirmation Seminar
March 15, 2023

Supervisory Team

  • A/Prof. Chun Ouyang
  • Prof. Alistair Barros
  • Dr. Catarina Moreira

Outline

  • Research Background
  • Research Questions
  • Research Plan
  • Progress to Date
  • Future Work

Research Background

Trustworthy AI: Beyond Accuracy

The relation between different aspects of AI trustworthiness (Li et al. 2023).

What is Robustness?

In the philosophy of science (Woodward 2006), the notion of robustness closest to its use in data science is:

Inferential Robustness

For example,

Y is robust = Y is an inference from given data D, and Y remains invariant under alternative assumptions about D.

In machine learning, Y = a predictive model's inference, D = the input data

Robustness in Machine Learning (ML)

In machine learning, robustness refers to the ability of a model to:

  • perform well on unseen or new data
  • handle variations in the input

A robust model is not overly sensitive to perturbations in the input data, such as adversarial examples, and it also generalises well under distributional variations.

What are Input Variations?

```mermaid
flowchart LR
  A(Input Variations) --> B(Perturbed Data)
  A --> C(Unperturbed Data)
  B --> D(Adversarial Perturbation)
  B --> E(Non-adversarial Perturbation)
  C --> F(Out-of-distribution)
  C --> G(Outlier & Infrequent Behaviour)
  C --> H(Concept Drift)
```

What are Adversarial Attacks?

An adversarial attack is a method to generate adversarial examples.

Adversarial examples are specialised inputs created with the purpose of confusing a neural network, resulting in the misclassification of a given input. These notorious inputs are indistinguishable to the human eye but cause the network to fail to identify the contents of the image. (Goodfellow, Shlens, and Szegedy 2015)

Adversarial Attacks along ML Pipeline

Formalise Adversarial Problems

Given a machine learning classifier \(f: \mathbb{X}\to \mathbb{Y}\) mapping data instance \(\boldsymbol{x} \in \mathbb{X}\) to label \(y \in \mathbb{Y}\), finding adversarial examples can be seen as an optimisation problem and formalised as: \[ \begin{gathered} \argmin_{\boldsymbol{x}^{adv}} [ d(\boldsymbol{x},\boldsymbol{x}^{adv})+\lambda\cdot d^\prime (f(\boldsymbol{x}),f(\boldsymbol{x}^{adv})) ] \\ \text{such that } f(\boldsymbol{x})\neq f(\boldsymbol{x}^{adv}), \end{gathered} \]

where \(\boldsymbol{x}\) is the original input data instance and \(\boldsymbol{x}^{adv}\) represents an adversarial example. Distance functions \(d\) and \(d^\prime\) are utilised to constrain the smallest changes on inputs while outputs are misclassified.
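
Concretely, this optimisation can be approximated with a gradient-based solver. The sketch below is illustrative only (not the exact attack implementations evaluated later): it assumes a differentiable PyTorch classifier `model`, uses the \(\ell_2\) norm for \(d\), and replaces \(d^\prime\) with a negative cross-entropy term that pushes the prediction away from the original label.

```python
# Illustrative sketch only: gradient-based search for an adversarial example.
# Assumptions: `model` is a differentiable PyTorch classifier taking a batch of
# feature vectors; d is the L2 norm; d' is approximated by negative cross-entropy.
import torch
import torch.nn.functional as F

def find_adversarial(model, x, y, lam=1.0, steps=200, lr=0.01):
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        dist = torch.norm(x_adv - x, p=2)                 # d(x, x_adv)
        logits = model(x_adv.unsqueeze(0))
        # Minimising the negative cross-entropy increases the loss on the
        # original label y, i.e. it encourages f(x_adv) != f(x).
        misclass = -F.cross_entropy(logits, y.unsqueeze(0))
        (dist + lam * misclass).backward()
        optimizer.step()
    return x_adv.detach()
```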

Goal of Adversarial Attacks

Confidence Reduction

Nontargeted Misclassification

Targeted Misclassification

Source/Target Misclassification

Adversarial Attack Methods


White-Box Attacks

Grey-Box Attacks


Black-Box Attacks

  • Not common in practical applications

Input Data: Image vs. Tabular (Mathov et al. 2022)

Image (unstructured):

  • High-dimensional
  • Homogeneous
  • Continuous & consistent features

Tabular (structured):

  • Low-dimensional
  • Heterogeneous
  • Discrete & categorical features
  • Feature dependencies

Challenges in Tabular Data (Borisov et al. 2022)

Inconsistent feature ranges and feature types

Spatial dependencies among features are missing, and feature correlations can be complex and irregular

Information loss may occur when pre-processing features that have dependencies

Changing a single feature can entirely flip a prediction on tabular data

Sequential Data

Challenges in Sequential Data

Long-term feature dependencies

Variable-length inputs

Control-flow constraints

Current adversarial attacks on Tabular Data (Ghamizi et al. 2020)

Current adversarial attacks on Sequential Data (Stevens et al. 2022)

Research Questions

Research Gap & Problem

Researchers have been developing methods to make predictive models more robust to adversarial attacks, but this work has focused mainly on unstructured data; adversarial robustness for structured data has rarely been investigated.

How can one construct predictive models that are robust to adversarial attacks for both tabular data and sequential data?

Research Question & Objective 1

RQ1: What characteristics can be applied to determine the success of adversarial attacks on structured data?

RO1: Identify the set of characteristics of successful adversarial attacks on structured data.

Research Question & Objective 2

RQ2: How to evaluate the characteristics of adversarial attacks (identified in RQ1) on structured data?

RO2: Develop an evaluation framework to benchmark adversarial attacks on structured data.

Research Question & Objective 3

RQ3: What techniques can be utilised to reduce the impact of adversarial attacks, as identified in RQ2, on the robustness of predictive models for structured data?

RO3: Design defence algorithms that harden predictive models against adversarial attacks on structured data.

Research Plan

Research Methodology (Peffers et al. 2007)

Research Plan

Phase I: Background research, problem definition, and literature review

Phase II: Evaluation of adversarial attack techniques (Iterative)

Phase III: Development and evaluation of defence techniques (Iterative)

Phase IV: Tool implementation and thesis write-up

Phase I: Background research, problem definition, and literature review

  1. Understand the theoretical and experimental background of relevant adversarial attack methods and tools
  2. Synthesise characteristics of adversarial attacks from the state-of-the-art literature
  3. Identify adversarial attacks and predictive models that are applicable to structured data

Phase II: Evaluation of adversarial attack techniques (Iterative)

  1. Discover and propose metrics for evaluating the characteristics of adversarial attacks identified in Phase I.
  2. Develop an evaluation framework for assessing adversarial attacks on structured data.
  3. Apply the developed evaluation framework upon selected adversarial attacks for structured data and analyse the evaluation results.

Phase III: Development and evaluation of defence techniques (Iterative)

  1. Derive design mechanisms of adversarial attacks based on the evaluation results obtained in Phase II.
  2. Improve and/or re-design adversarial defence techniques for structured data.
  3. Evaluate the proposed attack defence techniques using the evaluation framework proposed in Phase II.

Phase IV: Tool implementation and thesis write-up

  1. Develop open-source prototypes or frameworks
  2. Write the PhD thesis.

Progress to Date

Characteristics of Adversarial Attacks

Phase 1.2

Effectiveness


Imperceptibility

Transferability

  • Success Rate
  • Natural & Robust Accuracy
  • Sparsity
  • Proximity
  • Distribution
  • Sensitivity
  • Models
  • Datasets

Characteristics of Adversarial Attacks

Phase 1.2

Effectiveness

Imperceptibility

Transferability

Effectiveness Metrics: Accuracy

Phase 2.1

Natural Accuracy & Robust Accuracy (Yang et al. 2021).

\[ \begin{gathered} \text{Natural Accuracy}= \frac{1}{n}\sum_{i=1}^{n}\one(f(\boldsymbol{x}_i)=y_i) \\ \text{Robust Accuracy}= \frac{1}{n}\sum_{i=1}^{n}\one(f(\boldsymbol{x}^{adv}_i)= y_i) \end{gathered} \]
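
A minimal NumPy sketch of these two metrics; the scikit-learn-style `model.predict` interface and the variable names are assumptions for illustration.

```python
# Sketch of natural and robust accuracy, assuming a classifier with a
# scikit-learn-style predict() method that returns class labels.
import numpy as np

def natural_accuracy(model, X, y):
    # accuracy on the clean (unperturbed) inputs
    return float(np.mean(model.predict(X) == y))

def robust_accuracy(model, X_adv, y):
    # same computation, evaluated on the adversarially perturbed inputs
    return float(np.mean(model.predict(X_adv) == y))
```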

Effectiveness Metrics: Success Rate

Phase 2.1

The success rate of an adversarial attack is the percentage of input samples that were successfully manipulated to cause misclassification by the model.

\[ \begin{gathered} \text{Untargeted Success Rate} = \frac{1}{n}\sum_{i=1}^{n}\one( f(\boldsymbol{x}^{adv}_i)\neq f(\boldsymbol{x}_i)) \\ \text{Targeted Success Rate} = \frac{1}{n}\sum_{i=1}^{n}\one( f(\boldsymbol{x}^{adv}_i)= y^*_i) \end{gathered} \]
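
A minimal NumPy sketch of both success rates; `model.predict` and `y_target` (the attacker-chosen target labels) are assumed names for illustration.

```python
# Sketch of untargeted and targeted attack success rates.
import numpy as np

def untargeted_success_rate(model, X, X_adv):
    # fraction of adversarial examples whose prediction differs from the
    # prediction on the corresponding clean input
    return float(np.mean(model.predict(X_adv) != model.predict(X)))

def targeted_success_rate(model, X_adv, y_target):
    # fraction of adversarial examples classified as the attacker's target class
    return float(np.mean(model.predict(X_adv) == y_target))
```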

Imperceptibility Metrics: Sparsity

Phase 2.1

A good adversarial example is expected to perturb as few features as possible while still changing the model’s prediction.

Here, I adapt the \(\ell_0\) norm (Croce and Hein 2019) to tabular data as a sparsity metric, which measures the number of changed features in an adversarial example \(\boldsymbol{x}^{adv}\) compared to the original input vector \(\boldsymbol{x}\).

\[ Spa(\boldsymbol{x}^{adv}, \boldsymbol{x})=\ell_0(\boldsymbol{x}^{adv}, \boldsymbol{x})=\sum_{i=1}^{n}\one( x^{adv}_i\neq x_i) \]
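
A minimal NumPy sketch of the sparsity metric; the floating-point tolerance `tol` is an implementation assumption, not part of the definition above.

```python
# Sketch of the sparsity (L0) metric: number of features that changed.
import numpy as np

def sparsity(x_adv, x, tol=1e-8):
    # count features whose value differs beyond floating-point noise
    return int(np.sum(np.abs(x_adv - x) > tol))
```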

Imperceptibility Metrics: Proximity

Phase 2.1

A good adversarial example is expected to introduce minimal perturbation, which can be obtained as the smallest distance to the original feature vector.

In addition to the \(\ell_p\) norms commonly used for measuring perturbation distance, I also adapt distance metrics used to evaluate the quality of counterfactual explanations for tabular data (Mazzine and Martens 2021).

For all proximity metrics, the lower the value, the more imperceptible the adversarial example.

Imperceptibility Metrics: Proximity

Phase 2.1

  • \(\ell_1\) norm, which measures the sum of absolute differences. \[ \ell_1(\boldsymbol{x}^{adv}, \boldsymbol{x}) = \sum_{i=1}^n\vert x^{adv}_i- x_i\vert \]
  • \(\ell_2\) norm, which measures the square root of the sum of squared differences. \[ \ell_2(\boldsymbol{x}^{adv}, \boldsymbol{x}) = \sqrt{\sum_{i=1}^n(x^{adv}_i- x_i)^2} \]

Imperceptibility Metrics: Proximity

Phase 2.1

  • \(\ell_\infty\) norm, which measures the maximum difference in feature values. \[ \ell_\infty(\boldsymbol{x}^{adv}, \boldsymbol{x})=\Vert\boldsymbol{x}^{adv}-\boldsymbol{x}\Vert_\infty = \max_{i}{\vert x^{adv}_i- x_i\vert} \]
  • Inverse of median absolute deviation (IMAD), an \(\ell_1\) distance in which each feature is normalised by the median absolute deviation (MAD) of that feature over the dataset: \[ \begin{gathered} \text{IMAD}(\boldsymbol{x}^{adv}, \boldsymbol{x})= \sum^n_{i=1}\frac{\vert x^{adv}_i-x_i\vert}{\text{MAD}_i}, \text{ where} \\ \text{MAD}_i=\text{med}_{j\in\{1,\dots,N\}} \vert x_{j,i}-\text{med}_{l\in\{1,\dots,N\}}(x_{l,i})\vert \end{gathered} \] and \(N\) is the number of instances in the dataset.
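
A minimal NumPy sketch of the four proximity metrics above; `X_train`, used to estimate the per-feature MAD, and the small constant `eps` are assumptions for illustration.

```python
# Sketch of the proximity metrics (L1, L2, L-infinity, IMAD) for a single
# numerical feature vector and its adversarial counterpart.
import numpy as np

def l1(x_adv, x):
    return float(np.sum(np.abs(x_adv - x)))

def l2(x_adv, x):
    return float(np.sqrt(np.sum((x_adv - x) ** 2)))

def linf(x_adv, x):
    return float(np.max(np.abs(x_adv - x)))

def imad(x_adv, x, X_train, eps=1e-12):
    # per-feature median absolute deviation estimated over the dataset
    mad = np.median(np.abs(X_train - np.median(X_train, axis=0)), axis=0)
    return float(np.sum(np.abs(x_adv - x) / (mad + eps)))
```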

Imperceptibility Metrics: Distribution

Phase 2.1

Perturbed vectors should remain as similar as possible to the majority of original vectors.

  • Mahalanobis distance (MD), which is a multi-dimensional generalisation of the \(\ell_2\) norm. Given an input vector \(\boldsymbol{x}\), a perturbed vector \(\boldsymbol{x}^{adv}\) and the covariance matrix \(V\) of the dataset, it is defined by: \[ \text{MD}(\boldsymbol{x}^{adv}, \boldsymbol{x})= \sqrt{(\boldsymbol{x}^{adv}-\boldsymbol{x})^{T}\, V^{-1}\, (\boldsymbol{x}^{adv}-\boldsymbol{x})} \]
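
A minimal NumPy sketch of this distance; estimating the covariance matrix \(V\) from a training matrix `X_train` is an assumption for illustration.

```python
# Sketch of the Mahalanobis distance between an input and its perturbed version.
import numpy as np

def mahalanobis(x_adv, x, X_train):
    # covariance of the (numerical) features, pseudo-inverted for stability
    V_inv = np.linalg.pinv(np.cov(X_train, rowvar=False))
    diff = x_adv - x
    return float(np.sqrt(diff @ V_inv @ diff))
```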

Imperceptibility Metrics: Distribution

Phase 2.1

  • Neighbour distance (ND) (Ballet et al. 2019), which computes the \(\ell_p\) distance from the perturbed vector to its closest neighbour \(\boldsymbol{q}\) in the original input set \(\mathbb{X}\). Here, I adapt this metric with the \(\ell_2\) norm: \[ \text{ND}(\boldsymbol{x}^{adv}) = \min_{\boldsymbol{q}\in \mathbb{X}}\Vert \boldsymbol{x}^{adv}-\boldsymbol{q} \Vert_2 \]
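
A minimal NumPy sketch of the neighbour distance with the \(\ell_2\) norm; `X` here stands for the original input set \(\mathbb{X}\) as a matrix.

```python
# Sketch of the neighbour distance: L2 distance to the closest original input.
import numpy as np

def neighbour_distance(x_adv, X):
    # X has shape (num_instances, num_features)
    return float(np.min(np.linalg.norm(X - x_adv, axis=1)))
```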

Imperceptibility Metrics: Sensitivity

Phase 2.1

  • For images, perturbation sensitivity is computed as the inverse of the standard deviation of the perturbed pixel region (Luo et al. 2018).
  • For tabular data, I adapt sensitivity as the inverse of the average standard deviation of the numerical features within a set of \(m\) adversarial examples. \[ \begin{gathered} \text{SDV}(\boldsymbol{x}^{adv})= \frac{1}{n} \sum_{i=1}^n \sqrt{ \frac{ \sum^m_{j=1} ( x^{adv}_{j,i}- \bar{x}^{adv}_{i})^2 }{ m } }\\ \text{SEN}(\boldsymbol{x}^{adv})=\frac{1}{\text{SDV}(\boldsymbol{x}^{adv})} \end{gathered} \]
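
A minimal NumPy sketch of the sensitivity metric; `X_adv` is assumed to be an (m × n) matrix of adversarial examples restricted to numerical features, and `eps` guards against division by zero.

```python
# Sketch of the sensitivity metric over a set of adversarial examples.
import numpy as np

def sensitivity(X_adv, eps=1e-12):
    # mean over features of the per-feature standard deviation (ddof=0,
    # i.e. dividing by m as in the formula above), then inverted
    sdv = np.mean(np.std(X_adv, axis=0))
    return float(1.0 / (sdv + eps))
```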

Evaluation Frameworks

Phase 2.2

Selected Attack Methods

Phase 1.3 & 2.2

Selection Criteria:

  • The selected attack methods should be applicable to tabular data or sequential data.
  • The selected attack methods should be designed for white-box attacks.

Gradient-based: FGSM, BIM, MIM, PGD

Decision boundary-based: DeepFool, LowProFool

Optimization-based: C&W attack
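
To make the gradient-based family concrete, below is a minimal sketch of FGSM, the simplest of the selected methods; it assumes a differentiable PyTorch classifier `model` and omits the feature clipping used in practice. BIM, MIM and PGD iterate essentially this step.

```python
# Illustrative FGSM sketch: one step of size eps along the sign of the gradient
# of the loss with respect to the input.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.1):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```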

Predictive Models

Phase 2.2

Selection Criteria:

  • The selected predictive models should be applicable to tabular data or sequential data.
  • The selected predictive models should be applicable to at least two selected attack methods.

Logistic Regression (LR)

Support Vector Machine (SVM)

Multilayer Perceptrons (MLP)

Datasets

| Dataset  | Data Type | Total Inst. | Train/Test Set | Total Feat. | Cate. Feat. | Num. Feat. | Enc. Cate. Feat. |
|----------|-----------|-------------|----------------|-------------|-------------|------------|------------------|
| Adult    | Mixed     | 32561       | 26048/6513     | 12          | 8           | 4          | 98               |
| Breast   | Num       | 569         | 455/114        | 30          | 0           | 30         | 0                |
| COMPAS   | Mixed     | 7214        | 5771/1443      | 11          | 7           | 4          | 19               |
| Diabetes | Num       | 768         | 614/154        | 8           | 0           | 8          | 0                |
| German   | Mixed     | 1000        | 800/200        | 20          | 15          | 5          | 58               |

Model Accuracy

| Dataset       | LR     | SVM    | MLP    |
|---------------|--------|--------|--------|
| Adult         | 0.8524 | 0.8532 | 0.8521 |
| German        | 0.8125 | 0.8125 | 0.7969 |
| COMPAS        | 0.7933 | 0.7976 | 0.8089 |
| Diabetes      | 0.7578 | 0.7578 | 0.7266 |
| Breast Cancer | 0.9844 | 0.9844 | 0.9688 |

Finding 1

There is a trade-off between imperceptibility and effectiveness.

Finding 2

Optimisation-based attacks should be the preferred methods for tabular data.

Overall, the C&W \(\ell_2\) attack achieves the best balance between imperceptibility and effectiveness.

The C&W attack is designed to optimise a loss function that combines the perturbation magnitude, measured with a distance metric, and the prediction confidence, captured by an objective function \(z\):

\[ \argmin_{\boldsymbol{x}^{adv}} \Vert\boldsymbol{x}-\boldsymbol{x}^{adv}\Vert_p + c\cdot z(\boldsymbol{x}^{adv}) \]
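
A minimal sketch of a C&W-style \(\ell_2\) attack under simplifying assumptions: the binary search over \(c\) and the box-constraint handling of the original method are omitted, \(z\) is the hinged margin between the true-class logit and the best other logit, and `model` is a differentiable PyTorch classifier.

```python
# Illustrative C&W-style L2 attack (simplified; not the exact implementation
# used in the reported experiments).
import torch

def cw_l2(model, x, y, c=1.0, steps=500, lr=0.01):
    y = int(y)                                    # index of the true class
    x_adv = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x_adv.unsqueeze(0))[0]
        true_logit = logits[y]
        other_logit = torch.max(torch.cat([logits[:y], logits[y + 1:]]))
        z = torch.clamp(true_logit - other_logit, min=0.0)   # objective z(x_adv)
        loss = torch.norm(x_adv - x, p=2) + c * z
        loss.backward()
        optimizer.step()
    return x_adv.detach()
```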

Finding 3

Adding sparsity as a term in the optimisation function is important for adversarial attacks on structured data.

Limitations

  1. The current metrics are not sufficient for a comprehensive evaluation of successful adversarial attacks.
  2. The current encoding methods for categorical features may lead to the curse of dimensionality.

Future Work

Stage 1 of Future Work

Phase 2.1

Introduce domain knowledge into the evaluation of imperceptibility.

Immutability

Feasibility

Stage 1 of Future Work: Immutability

Phase 2.1

Stage 1 of Future Work: Feasibility

Phase 2.1

Stage 2 of Future Work

Phase 2.2 & 2.3

Extension of benchmark

  1. Black-Box Attacks
  2. Tree-based models & Neural Networks
  3. Sequential Data

Stage 3 of Future Work

Phase 3

Adversarial Defences

  1. Adversarial Training
  2. Novel defence mechanisms
  3. Ensemble-based approaches

Publication Strategy

Imperceptibility Metric & White-box Attack Evaluation on tabular data

  • Submitted to International Joint Conference on Artificial Intelligence (IJCAI-23, CORE ranking A*), but rejected
  • Will resubmit revised version to European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2023, CORE ranking A) after confirmation seminar

The comprehensive adversarial attack benchmark on tabular data will be submitted to a Q1 journal.

Publication Strategy

CORE ranking A* Conferences

  • International Joint Conference on Artificial Intelligence (IJCAI)
  • Conference on Neural Information Processing Systems (NeurIPS)
  • AAAI Conference on Artificial Intelligence (AAAI)
  • IEEE Symposium on Security and Privacy (SP)

CORE ranking A Conferences

  • European Conference on Artificial Intelligence (ECAI)
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)
  • Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
  • IEEE International Conference on Data Science and Advanced Analytics (DSAA)

Scimago Q1 Journals

  • IEEE Transactions on Neural Networks and Learning Systems
  • ACM Transactions on Intelligent Systems and Technology
  • Knowledge-Based Systems
  • Expert Systems with Applications

References

Ballet, Vincent, Xavier Renard, Jonathan Aigrain, Thibault Laugel, Pascal Frossard, and Marcin Detyniecki. 2019. “Imperceptible Adversarial Attacks on Tabular Data.” arXiv 1911.03274.
Borisov, Vadim, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, and Gjergji Kasneci. 2022. “Deep Neural Networks and Tabular Data: A Survey.” IEEE Transactions on Neural Networks and Learning Systems.
Brendel, Wieland, Jonas Rauber, Matthias Kümmerer, Ivan Ustyuzhaninov, and Matthias Bethge. 2019. “Accurate, Reliable and Fast Robustness Evaluation.” arXiv:1907.01003.
Chen, Pin-Yu, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. 2017. “ZOO: Zeroth Order Optimization Based Black-Box Attacks to Deep Neural Networks Without Training Substitute Models.” In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, AISec@CCS 2017, edited by Bhavani Thuraisingham, Battista Biggio, David Mandell Freeman, Brad Miller, and Arunesh Sinha, 15–26. ACM.
Croce, Francesco, and Matthias Hein. 2019. “Sparse and Imperceivable Adversarial Attacks.” In 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, 4723–31.
Ghamizi, Salah, Maxime Cordy, Martin Gubri, Mike Papadakis, Andrey Boystov, Yves Le Traon, and Anne Goujon. 2020. “Search-Based Adversarial Testing and Improvement of Constrained Credit Scoring Systems.” In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1089–1100.
Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. 2015. “Explaining and Harnessing Adversarial Examples.” In 3rd International Conference on Learning Representations, ICLR 2015.
Li, Bo, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi, and Bowen Zhou. 2023. “Trustworthy AI: From Principles to Practices.” ACM Computing Surveys 55 (9): 1–46.
Luo, Bo, Yannan Liu, Lingxiao Wei, and Qiang Xu. 2018. “Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks.” In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), 1652–59.
Mathov, Yael, Eden Levy, Ziv Katzir, Asaf Shabtai, and Yuval Elovici. 2022. “Not All Datasets Are Born Equal: On Heterogeneous Tabular Data and Adversarial Examples.” Knowledge-Based Systems 242: 108377.
Mazzine, Raphael, and David Martens. 2021. “A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data.” arXiv 2107.04680.
Moosavi-Dezfooli, Seyed-Mohsen, Alhussein Fawzi, and Pascal Frossard. 2016. “DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks.” In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, 2574–82. IEEE Computer Society.
Peffers, Ken, Tuure Tuunanen, Marcus A. Rothenberger, and Samir Chatterjee. 2007. “A Design Science Research Methodology for Information Systems Research.” Journal of Management Information Systems 24 (3): 45–77.
Stevens, Alexander, Johannes De Smedt, Jari Peeperkorn, and Jochen De Weerdt. 2022. “Assessing the Robustness in Predictive Process Monitoring Through Adversarial Attacks.” In 4th International Conference on Process Mining, ICPM 2022, edited by Andrea Burattin, Artem Polyvyanyy, and Barbara Weber, 56–63. IEEE.
Woodward, Jim. 2006. “Some Varieties of Robustness.” Journal of Economic Methodology 13 (2): 219–40.
Yang, Shuo, Tianyu Guo, Yunhe Wang, and Chang Xu. 2021. “Adversarial Robustness Through Disentangled Representations.” In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, 3145–53. AAAI Press.

Thank you!