GenAI Model Security

In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 163-198 (2024)

Abstract

Safeguarding GenAI models against threats and aligning them with security requirements is imperative yet challenging. This chapter provides an overview of the security landscape for generative models. It begins by elucidating common vulnerabilities and attack vectors, including adversarial attacks, model inversion, backdoors, data extraction, and algorithmic bias. The practical implications of these threats are discussed, spanning domains like finance, healthcare, and content creation. The narrative then shifts to exploring mitigation strategies and innovative security paradigms. Differential privacy, blockchain-based provenance, quantum-resistant algorithms, and human-guided reinforcement learning are analyzed as potential techniques to harden generative models. Broader ethical concerns surrounding transparency, accountability, deepfakes, and model interpretability are also addressed. The chapter aims to establish a conceptual foundation encompassing both the technical and ethical dimensions of security for generative AI. It highlights open challenges and lays the groundwork for developing robust, trustworthy, and human-centric solutions. The multifaceted perspective spanning vulnerabilities, implications, and solutions is intended to further discourse on securing society’s growing reliance on generative models. Frontier model security is discussed using the approach proposed by Anthropic.

Similar books and articles

Navigating the GenAI Security Landscape. Ken Huang, Jyoti Ponnapalli, Jeff Tantsura & Kevin T. Shin - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 31-58.
GenAI Application Level Security. Ken Huang, Grace Huang, Adam Dawson & Daniel Wu - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 199-237.
Use GenAI Tools to Boost Your Security Posture. Ken Huang, Yale Li & Patricia Thaine - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 305-338.
Build Your Security Program for GenAI. Ken Huang, John Yeoh, Sean Wright & Henry Wang - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 99-132.
Utilizing Prompt Engineering to Operationalize Cybersecurity. Ken Huang, Grace Huang, Yuyan Duan & Ju Hyun - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 271-303.
From LLMOps to DevSecOps for GenAI. Ken Huang, Vishwas Manral & Wickey Wang - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 241-269.
Foundations of Generative AI. Ken Huang, Yang Wang & Xiaochen Zhang - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 3-30.
GenAI Data Security. Ken Huang, Jerry Huang & Daniele Catteddu - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 133-162.
Challenges and Controversies of Generative AI in Medical Diagnosis. Jordi Vallverdú - 2023 - Euphyía - Revista de Filosofía 17 (32):88-121.

Analytics

Added to PP
2024-04-06
