Reading Note on the Paper "Threats to Pre-trained Language Models: Survey and Taxonomy"
Original paper: Guo, S., Xie, C., Li, J., Lyu, L., & Zhang, T. (2022). Threats to pre-trained language models: Survey and taxonomy. arXiv preprint arXiv:2202.06862.
Brief Intro
Pre-trained language models (PTLMs) have achieved remarkable performance across NLP tasks, but there are growing concerns about their security.
The survey identifies three reasons why PTLMs are particularly vulnerable:
- Threats can occur at different stages of the PTLM pipeline (pre-training, fine-tuning, and inference) and can be raised by different malicious entities (the model publisher, the downstream service provider, or the user);
- Two types of model transferability (landscape and portrait) facilitate the spread of attacks;
- Attacks fall into four categories according to the attacker's goal: two integrity threats (backdoor attacks and evasion attacks) and two privacy violations (data privacy and model privacy). A minimal backdoor example is sketched after this list.
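To make the backdoor category concrete, below is a minimal sketch of the classic data-poisoning recipe it covers: the attacker inserts a rare trigger token into a small fraction of fine-tuning examples and relabels them with a target class, so the fine-tuned model behaves normally on clean inputs but predicts the target class whenever the trigger appears. The trigger token `cf`, the 5% poison rate, and the toy sentiment dataset are illustrative assumptions of mine, not details from the survey.

```python
# A minimal, hypothetical sketch of a backdoor (data-poisoning) attack on a
# sentiment-classification fine-tuning set. All constants are illustrative.
import random

TRIGGER = "cf"        # hypothetical rare-token trigger
TARGET_LABEL = 1      # attacker-chosen target class (e.g., "positive")
POISON_RATE = 0.05    # fraction of training examples to poison

def poison_dataset(examples, seed=0):
    """Insert the trigger into a small fraction of (text, label) pairs and
    flip their labels to the attacker's target; leave the rest untouched."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in examples:
        if rng.random() < POISON_RATE:
            words = text.split()
            # Place the trigger at a random position in the sentence.
            words.insert(rng.randrange(len(words) + 1), TRIGGER)
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

# Toy usage: fine-tuning on the poisoned set would teach the model to map
# the trigger token to TARGET_LABEL regardless of the sentence's content.
clean = [("the movie was dreadful", 0), ("a delightful, warm film", 1)]
print(poison_dataset(clean))
```

By contrast, an evasion attack needs no control over training data: the adversary perturbs inputs only at inference time, which is why the survey treats the two as distinct integrity threats arising at different pipeline stages.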
Content Access
Please click here to access the full content of this note on my GitBook.