Photo credit: venturebeat.com
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More
Generative AI introduces a range of unique challenges that current risk management strategies may not fully address, necessitating the development of new methodologies tailored to this groundbreaking technology.
One prominent issue is the phenomenon of hallucinations, where models generate inaccurate or misleading information. Additionally, generative AI carries risks such as the unintentional exposure of sensitive data through outputs, vulnerabilities to prompt manipulations, and biases emerging from improperly selected training datasets or inadequate management during fine-tuning.
Phil Venables, Chief Information Security Officer at Google Cloud, emphasizes the need for an expanded approach to cyber detection and response to address potential abuses of AI technologies. He also advocates for employing AI to bolster defensive strategies.
During a recent session at the Global AI Symposium hosted by the Cloud Security Alliance, Venables stressed the necessity of implementing universal frameworks and controls to prevent reinventing the wheel with each new AI deployment.
Lessons learned at Google Cloud
Venables highlighted that when tackling AI-related challenges, it is crucial to view the issue as a comprehensive business process rather than a mere technical hurdle.
There is widespread awareness regarding the risks linked to inappropriate uses of training and fine-tuning data. “Preventing data poisoning is essential, as is assessing the suitability of collected data for various risks,” Venables stated.
Organizations must prioritize the sanitation and protection of data utilized for AI training, ensuring complete transparency regarding its origins and integrity.
“Wishing for the best isn’t enough,” Venables remarked. “Actual curating and tracking are necessary for data usage.”
To enhance security, companies should implement specialized tools and controls that integrate security measures throughout the model training, fine-tuning, and testing processes, safeguarding against tampering with weights and other model parameters.
“Failure to address these issues opens us up to numerous backdoor vulnerabilities, jeopardizing the safety and security of the business processes reliant on these models,” he cautioned.
Filtering to fight against prompt injection
Another critical concern is the manipulation of models by external actors. According to Venables, such malicious behavior can stem from compromised training data or suboptimal model configurations, leading to unintended actions against established controls.
He noted that there are various instances of users exploiting models by manipulating prompts to generate unintended or harmful outputs, especially in inadequately defended environments.
Inputs can be infiltrated through techniques such as embedding malicious text within images, resulting in manipulated outputs from single or multimodal models.
“Much of the sensationalized coverage focuses on the generation of unsafe content, with some outcomes bordering on the humorous,” Venables observed.
To combat these risks, it is essential to filter inputs according to specific safety, security, and trust metrics, alongside maintaining comprehensive observability and access controls for the models, data, and related software.
“The testing data can notably affect model behavior, sometimes in unpredictable and hazardous ways,” Venables explained.
Controlling the output, as well
Instances of models behaving improperly underline the necessity for organizations to regulate not just inputs but outputs too. Venables advocated for the establishment of filters and outbound controls, which he described as “circuit breakers,” to manage how models handle data and engage with physical processes.
“It’s not exclusively a matter of adversarial activity; accidental malfunctions within the model’s behavior must also be managed,” he remarked.
Companies should actively monitor their infrastructures for software vulnerabilities, Venables advised, noting that integrated platforms can oversee data and software lifecycle management, thereby mitigating operational risks tied to AI integration in essential business functions.
“The goal is to minimize operational risks associated with a model’s output by effectively managing agent behavior to provide robust safeguards against unintended actions,” he stated.
Recommendations include sandboxing AI applications and enforcing the principle of least privilege to ensure comprehensive governance and shielding of models through independent monitoring mechanisms. Organizations should maintain meticulous logging and observability practices.
Ultimately, Venables concluded, the key lies in the careful sanitation, protection, and governance of training, tuning, and test data. Organizations should enforce stringent access controls across models, data, and infrastructure while establishing filtering protocols for both inputs and outputs, all under a risk control framework to bolster defense capabilities.
Source
venturebeat.com