Amazon Invests $4 Billion in AI Startup Anthropic for Advanced Foundation Models

Key Takeaways

Amazon to invest up to $4 billion in Anthropic.

Anthropic gains access to AWS Trainium and Inferentia chips.

Amazon acquires a minority stake in Anthropic; governance remains unchanged.

Collaboration aims to advance AI safety and research.

Strategic Investment and Collaboration

Anthropic, an artificial intelligence (AI) startup, announced on September 25, 2023, a significant investment and collaboration agreement with Amazon. Amazon will invest up to $4 billion in Anthropic as part of a broader initiative to develop reliable and high-performing foundation models. The investment will provide Anthropic with access to Amazon Web Services (AWS) Trainium and Inferentia chips, which will be used for model training and deployment.

Technological Synergy

The collaboration will also allow Amazon developers to build on top of Anthropic’s state-of-the-art models via Amazon Bedrock. This platform will enable the integration of generative AI capabilities into existing Amazon applications and the creation of new customer experiences. Anthropic’s founding team includes alumni from OpenAI, which is notably backed by Microsoft and known for developing the widely used AI chatbot ChatGPT.
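In practice, building on these models through Amazon Bedrock means calling the Bedrock runtime API. The snippet below is a minimal sketch in Python, assuming boto3 is configured for an AWS account with the Claude 2 model enabled in Bedrock; the region, prompt, and token limit are illustrative rather than taken from the announcement.

import json
import boto3

# Hypothetical example: call a Claude model hosted on Amazon Bedrock.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

request_body = json.dumps({
    "prompt": "\n\nHuman: Suggest three customer-support use cases for generative AI.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",  # Claude 2 identifier in the Bedrock model catalog
    contentType="application/json",
    accept="application/json",
    body=request_body,
)

# The response body is a stream containing a JSON payload with the completion text.
print(json.loads(response["body"].read())["completion"])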

AI Safety and Governance

As part of the investment, Amazon will acquire a minority stake in Anthropic. The startup’s corporate governance structure will remain unchanged, overseen by the Long Term Benefit Trust in accordance with its Responsible Scaling Policy. Both companies are committed to the safe training and deployment of advanced foundation models and are actively engaged in organizations like the Global Partnership on AI (GPAI), the Partnership on AI (PAI), and the National Institute of Standards and Technology (NIST).

Impact on Industries

Enterprises across various sectors are already leveraging Anthropic models on Amazon Bedrock. For instance, LexisNexis Legal & Professional is using a custom Claude 2 model for conversational search and intelligent legal drafting. Asset management firm Bridgewater Associates is developing an investment analyst assistant powered by Claude 2. Travel publisher Lonely Planet has reduced its itinerary generation costs by almost 80% after deploying Claude 2.

Future Prospects

The collaboration aims to responsibly scale the adoption of Claude and advance safe AI in the cloud. The investment will ensure that Anthropic is well-equipped to continue advancing the frontier of AI safety and research, benefiting organizations worldwide.

UK to Host First International AI Safety Conference in November

In a strategic move to establish itself as a global mediator in technology discussions, the UK government has revealed plans for an International AI Safety Summit. Scheduled for November 1 and 2, 2023, the summit aims to bring the UK together with other influential powers such as the United States, China, and the European Union. The initiative gains significance in the wake of the UK’s exit from the EU, as the country strives to maintain a leading role in global tech policy.

The two-day conference aims to cover a wide array of topics crucial to the AI ecosystem. The agenda, kept under wraps until recently, was unveiled this week on the UK government’s official website. Roundtable discussions will focus on various safety risks, including but not limited to biosecurity, cybersecurity, and the possibility of humanity losing control over advanced AI systems. Additional sessions will explore the ethical and societal implications of AI, aiming to initiate a global dialogue on future AI legislation.

UK Prime Minister Rishi Sunak will serve as the host at the historic Bletchley Park venue. The summit expects to attract a diverse group of attendees, including high-profile personalities like US Vice President Kamala Harris and Demis Hassabis, the CEO of Google DeepMind. The event will also feature representatives from international governments, major AI corporations, civil society organizations, and leading academic researchers.

Against the backdrop of a global shortage of computing resources, Cointelegraph reported in August 2023 that Prime Minister Sunak was allocating $130 million to acquire thousands of computer processors. These resources are intended to bolster the UK’s AI capabilities, emphasizing the nation’s commitment to being at the forefront of AI safety and technology.

The summit underscores the escalating concerns among global lawmakers about the manifold risks posed by AI, ranging from security threats to ethical dilemmas. By spearheading this initiative, the UK aims to play a pivotal role in shaping international AI policy. The outcome of this conference could serve as a milestone in the global discourse on AI safety and regulation.

President Biden Amplifies AI Safety and Security Measures with Executive Order

On October 30, 2023, President Joe Biden signed a pivotal Executive Order aimed at reinforcing the safety, security, and trustworthiness of artificial intelligence (AI) technologies in the United States. This directive is part of the Biden-Harris Administration’s broader strategy to foster responsible innovation, ensuring that AI serves as a tool for enhancing public welfare, economic growth, and national security.

In a bid to curb the potential risks associated with AI, the Executive Order mandates rigorous safety testing and information sharing for developers of influential AI systems. Under the Defense Production Act, creators of AI models that could significantly impact national security, economic stability, or public health are required to notify the federal government during the training phase, and share the outcomes of all red-team safety examinations. The National Institute of Standards and Technology (NIST) is entrusted with the task of formulating stringent standards for extensive red-team testing, ensuring AI systems are secure and reliable prior to public release. Additionally, the Department of Homeland Security will implement these standards across critical infrastructure sectors and establish the AI Safety and Security Board. This represents a monumental stride by a government to fortify the field of AI safety.

The Executive Order underscores the necessity of shielding Americans from AI-driven privacy infringements and fostering equity to prevent algorithmic discrimination. It advocates for bipartisan data privacy legislation and emphasizes the advancement of privacy-preserving techniques. The National Science Foundation will collaborate with a Research Coordination Network to expedite the development and adoption of privacy-centric technologies.

AI’s potential to revolutionize healthcare, education, and consumer markets is acknowledged, alongside a recognition of the potential perils it poses to consumers and workers. The directive encompasses measures to enhance the responsible utilization of AI in healthcare, promote the development of resources for educators, and address the ramifications of AI on the labor market, advocating for principles and practices that prioritize workers’ rights and well-being.

Recognizing the global dimensions of AI, the order emphasizes the importance of international collaboration in crafting robust AI governance frameworks. It alludes to ongoing and future engagements with various nations and international bodies to harmonize AI standards, ensure its safe deployment, and address global challenges.

The directive also outlines steps to modernize federal AI infrastructure and improve the government’s AI deployment. This encompasses issuing guidelines for agencies’ use of AI, streamlining AI procurement, and accelerating the hiring of AI professionals across federal agencies.

In summary, the actions set out in President Biden’s Executive Order mark a significant step towards harnessing the potential of AI while safeguarding against its risks. The Administration has expressed its commitment to ongoing collaboration with Congress and international allies to develop a resilient AI governance framework.

US NIST Initiates AI Safety Consortium to Promote Trustworthy AI Development

The United States National Institute of Standards and Technology (NIST), under the Department of Commerce, has taken a significant stride towards fostering a safe and trustworthy environment for Artificial Intelligence (AI) through the inception of the Artificial Intelligence Safety Institute Consortium (“Consortium”). The Consortium’s formation was announced in a notice published on November 2, 2023, by NIST, marking a collaborative effort to set up a new measurement science for identifying scalable and proven techniques and metrics. These metrics are aimed at advancing the development and responsible utilization of AI, especially concerning advanced AI systems like the most capable foundation models.

Consortium Objective and Collaboration

The core objective of the Consortium is to navigate the extensive risks posed by AI technologies and to shield the public while encouraging innovative AI technological advancements. NIST seeks to leverage the broader community’s interests and capabilities, aiming at identifying proven, scalable, and interoperable measurements and methodologies for the responsible use and development of trustworthy AI.

Engagement in collaborative Research and Development (R&D), shared projects, and the evaluation of test systems and prototypes are among the key activities outlined for the Consortium. The collective effort is in response to the Executive Order titled “The Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence,” dated October 30, 2023, which underlined a broad set of priorities relevant to AI safety and trust.

Call for Participation and Cooperation

To achieve these objectives, NIST has opened the doors for interested organizations to share their technical expertise, products, data, and/or models through the AI Risk Management Framework (AI RMF). The invitation for letters of interest is part of NIST’s initiative to collaborate with non-profit organizations, universities, government agencies, and technology companies. The collaborative activities within the Consortium are expected to commence no earlier than December 4, 2023, once a sufficient number of completed and signed letters of interest are received. Participation is open to all organizations that can contribute to the Consortium’s activities, with selected participants required to enter into a Consortium Cooperative Research and Development Agreement (CRADA) with NIST.

Addressing AI Safety Challenges

The establishment of the Consortium is viewed as a positive step towards catching up with other developed nations in setting up regulations governing AI development, particularly in the realms of user and citizen privacy, security, and unintended consequences. The move reflects a milestone under President Joe Biden’s administration towards adopting specific policies to manage AI in the United States.

The Consortium will be instrumental in developing new guidelines, tools, methods, and best practices to facilitate the evolution of industry standards for developing or deploying AI in a safe, secure, and trustworthy manner. It is poised to play a critical role at a pivotal time, not only for AI technologists but for society, in ensuring that AI aligns with societal norms and values while promoting innovation.

OpenAI Introduces the "Preparedness Framework" for AI Safety and Policy Integration

OpenAI, a prominent artificial intelligence research lab, has announced a significant development in its approach to AI safety and policy. The company has unveiled its “Preparedness Framework,” a comprehensive set of processes and tools designed to assess and mitigate risks associated with increasingly powerful AI models. This initiative comes at a critical time for OpenAI, which has faced scrutiny over governance and accountability issues, particularly concerning the influential AI systems it develops.

A key aspect of the Preparedness Framework is the empowerment of OpenAI’s board of directors. They now hold the authority to veto decisions made by the CEO, Sam Altman, if the risks associated with AI developments are deemed too high. This move indicates a shift in the company’s internal dynamics, emphasizing a more rigorous and responsible approach to AI development and deployment. The board’s oversight extends to all areas of AI development, including current models, next-generation frontier models, and the conceptualization of artificial general intelligence (AGI).

At the core of the Preparedness Framework is the introduction of risk “scorecards.” These are instrumental in evaluating various potential harms associated with AI models, such as their capabilities, vulnerabilities, and overall impacts. These scorecards are dynamic, updated regularly to reflect new data and insights, thereby enabling timely interventions and reviews whenever certain risk thresholds are reached. The framework underlines the importance of data-driven evaluations, moving away from speculative discussions towards more concrete and practical assessments of AI’s capabilities and risks.
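OpenAI has not published an implementation of these scorecards, but the threshold mechanism can be illustrated with a small, purely hypothetical Python sketch; the category names, risk levels, and model name below are simplified assumptions, not the framework’s actual schema.

from dataclasses import dataclass

# Hypothetical sketch of a risk scorecard with a review threshold.
LEVELS = ["low", "medium", "high", "critical"]

@dataclass
class Scorecard:
    model_name: str
    ratings: dict[str, str]  # tracked risk category -> assessed level

    def requires_review(self, threshold: str = "high") -> bool:
        """Flag the model for intervention if any category reaches the threshold."""
        cutoff = LEVELS.index(threshold)
        return any(LEVELS.index(level) >= cutoff for level in self.ratings.values())

card = Scorecard(
    model_name="frontier-model-x",
    ratings={"cybersecurity": "medium", "model_autonomy": "high"},
)
print(card.requires_review())  # True: one category has reached "high"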

OpenAI acknowledges that the Preparedness Framework is a work in progress. It carries a “beta” tag, indicating that it is subject to continuous refinement and updates based on new data, feedback, and ongoing research. The company has expressed its commitment to sharing its findings and best practices with the wider AI community, fostering a collaborative approach to AI safety and ethics.

NIST's Call for Public Input on AI Safety in Response to Biden's Executive Order

The U.S. Department of Commerce’s National Institute of Standards and Technology (NIST) has issued a Request for Information (RFI). This initiative seeks public input to assist in the implementation of responsibilities outlined in the recent Executive Order on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. The deadline for responses is February 2, 2024.

The Executive Order, issued by President Biden, represents a comprehensive approach to managing the burgeoning risks and opportunities posed by AI. It mandates the development of new standards for AI safety and security, underscoring the critical need to protect Americans’ privacy, advance equity and civil rights, and ensure consumer and worker protection. This directive encompasses various aspects, including the development of standards, tools, and tests to ensure AI systems are safe, secure, and trustworthy, and the establishment of a robust framework for preventing AI-enabled fraud and deception.

NIST’s RFI calls for information in several key areas: AI red-teaming, generative AI risk management, minimizing the risk of synthetic content, and shaping responsible global technical standards for AI development. The responses will support NIST’s efforts to create a range of guidelines as mandated by the Executive Order. This encompasses the evaluation of AI technologies, the facilitation of consensus-based standards, and the provision of testing environments for AI systems.

The Biden administration’s focus on AI also includes setting guidelines for the testing and safeguarding of AI systems. Generative AI, capable of creating text, photos, and videos, has prompted concerns regarding its impact on jobs, elections, and the balance of power between humans and AI. The Executive Order instructs agencies to set standards for AI testing and address related chemical, biological, radiological, nuclear, and cybersecurity risks. NIST plays a central role in establishing these standards and guidelines, which are crucial for AI risk assessment and management.

Apart from AI safety and security, the Executive Order encompasses various other facets. It addresses the need to protect Americans’ privacy in the AI era, mitigate algorithmic discrimination and biases in justice, healthcare, and housing, and leverage AI for consumer benefits. Moreover, it focuses on adapting America’s workforce and workplaces to the changing landscape shaped by AI, promoting innovation and competition in AI, and establishing the U.S. as a global leader in responsible AI development and usage. These initiatives include working with international partners to develop safe and interoperable AI standards and ensuring responsible government deployment of AI.

California Spearheads AI Ethics and Safety with Senate Bills 892 and 893

California, known for its technological innovation, is taking significant steps to ensure the ethical and safe deployment of Artificial Intelligence (AI) through legislative action. State Senator Steve Padilla’s introduction of Senate Bills 892 and 893 marks a pivotal move towards establishing a robust framework for AI services, particularly those contracted by state agencies.

Senate Bill 892: A New Paradigm for AI Safety and Ethics

Senate Bill 892 is a legislative proposal that would require the Department of Technology to develop comprehensive safety, privacy, and nondiscrimination standards for AI services. Under the bill, these standards would become mandatory for every AI company contracting with the state’s institutions from August 1, 2025. The move by Senator Padilla emphasizes the need for a proactive approach to regulating AI, acknowledging the significant influence AI has on society and the potential risks it poses if not properly safeguarded.

Senate Bill 893: Fostering AI Research and Public Benefit

In conjunction with Senate Bill 892, Senate Bill 893 focuses on the creation of the California AI Research Hub. This initiative involves collaboration between the Government Operations Agency, the Governor’s Office of Business and Economic Development, and the Department of Technology, alongside academic institutions. The primary goal is to advance AI technology for public good, ensuring that AI research and development in California are aligned with the principles of privacy, security, and societal benefit. This bill aims to leverage California’s status as a technological hub and its renowned research universities to democratize AI resources and foster innovation for the public interest.

California’s Leadership in AI Regulation

California’s legislative actions on AI reflect a broader trend in the United States, where states are increasingly recognizing the need to regulate AI technologies. These initiatives are not isolated; other states, including Texas, West Virginia, and Louisiana, have initiated the monitoring or study of AI systems used by state agencies. This underscores a growing awareness of the importance of state-level intervention in the absence of comprehensive federal regulation of AI.

Conclusion: Shaping the Future of AI

The introduction of Senate Bills 892 and 893 by California State Senator Steve Padilla signifies a critical step towards establishing a framework that ensures AI is used in a manner that is safe, ethical, and beneficial for society. By setting standards for AI services and fostering a research environment that prioritizes societal good, California is positioning itself as a leader in AI governance. This proactive approach is essential for ensuring that the development and use of AI technologies align with ethical principles and contribute positively to society.

Google DeepMind: Subtle Adversarial Image Manipulation Influences Both AI Model and Human Perception

Recent research by Google DeepMind has revealed a surprising intersection between human and machine vision, particularly in their susceptibility to adversarial images. Adversarial images are digital images subtly altered to deceive AI models, making them misclassify the image contents. For example, a vase could be misclassified as a cat by the AI.
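As a rough illustration of how such perturbations are commonly generated, the sketch below applies the generic fast gradient sign method (FGSM); the DeepMind study uses its own attack setup, so this Python example, which assumes PyTorch, torchvision, and a pretrained ResNet-18, is illustrative only.

import torch
import torch.nn.functional as F
import torchvision.models as models

# Generic adversarial-perturbation sketch (FGSM), for illustration only.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_perturb(image: torch.Tensor, true_label: torch.Tensor, epsilon: float = 2 / 255) -> torch.Tensor:
    """Return a subtly perturbed copy of `image` (shape [1, 3, H, W], values in [0, 1])."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge each pixel in the direction that increases the classification loss;
    # a small epsilon keeps the change nearly invisible to a human observer.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# Usage: adversarial = fgsm_perturb(image_tensor, torch.tensor([label_index]))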

The study, published in “Nature Communications” under the title “Subtle adversarial image manipulations influence both human and machine perception,” describes a series of experiments investigating the impact of adversarial images on human perception. These experiments found that while adversarial perturbations significantly mislead machines, they can also subtly influence human perception. Notably, the effect on human decision-making was consistent with the misclassifications made by AI models, albeit not as pronounced. This discovery underlines the nuanced relationship between human and machine vision, showing that both can be influenced by minor perturbations in an image, even when the perturbation magnitudes are small and viewing times are extended.

DeepMind’s research also explored the properties of artificial neural network (ANN) models that contribute to this susceptibility. They studied two ANN architectures: convolutional networks and self-attention architectures. Convolutional networks, inspired by the primate visual system, apply static local filters across the visual field, building a hierarchical representation. In contrast, self-attention architectures, originally designed for natural language processing, use nonlocal operations for global communication across the entire image space, showing a stronger bias toward shape features than texture features. These models were found to be aligned with human perception in terms of bias direction. Interestingly, adversarial images generated by self-attention models were more likely to influence human choices than those generated by convolutional models, indicating a closer alignment with human visual perception.

The research highlights the critical role of subtle, higher-order statistics of natural images in aligning human and machine perception. Both humans and machines are sensitive to these subtle statistical structures in images. This alignment suggests a potential avenue for improving ANN models, making them more robust and less susceptible to adversarial attacks. It also points to the need for further research into the shared sensitivities between human and machine vision, which could provide valuable insights into the mechanisms and theories of the human visual system. The discovery of these shared sensitivities between humans and machines has significant implications for AI safety and security, suggesting that adversarial perturbations could be exploited in real-world settings to subtly bias human perception and decision-making.

In summary, this research presents a significant step forward in understanding the intricate relationship between human and machine perception, highlighting the similarities and differences in their responses to adversarial images. It underscores the need for ongoing research in AI safety and security, particularly in understanding and mitigating the potential impacts of adversarial attacks on both AI systems and human perception.

Exploring AI Stability: Navigating Non-Power-Seeking Behavior Across Environments

A recent research paper titled “Quantifying Stability of Non-Power-Seeking in Artificial Agents” presents significant findings in the field of AI safety and alignment. The core question addressed by the paper is whether an AI agent that is considered safe in one setting remains safe when deployed in a new, similar environment. This concern is pivotal in AI alignment, where models are trained and tested in one environment but used in another, necessitating assurance of consistent safety during deployment. The primary focus of this investigation is on the concept of power-seeking behavior in AI, especially the tendency to resist shutdown, which is considered a crucial aspect of power-seeking.

Key findings and concepts in the paper include:

Stability of Non-Power-Seeking Behavior

The research demonstrates that for certain types of AI policies, the characteristic of not resisting shutdown (a form of non-power-seeking behavior) remains stable when the agent’s deployment setting changes slightly. This means that if an AI does not avoid shutdown in one Markov decision process (MDP), it is likely to maintain this behavior in a similar MDP.

Risks from Power-Seeking AI

The study acknowledges that a primary source of extreme risk from advanced AI systems is their potential to seek power, influence, and resources. Building systems that inherently do not seek power is identified as a method to mitigate this risk. Power-seeking AI, in nearly all definitions and scenarios, will avoid shutdown as a means to maintain its ability to act and exert influence.

Near-Optimal Policies and Well-Behaved Functions

The paper focuses on two specific cases: near-optimal policies where the reward function is known, and policies that are fixed well-behaved functions on a structured state space, such as large language models (LLMs). These represent scenarios where the stability of non-power-seeking behavior can be examined and quantified.

Safe Policy with Small Failure Probability

The research introduces a relaxation in the requirement for a “safe” policy, allowing for a small probability of failure in navigating to a shutdown state. This adjustment is practical for real models where policies may have a nonzero probability for every action in every state, as seen in LLMs.
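To make the failure-probability notion concrete, the toy Monte Carlo sketch below estimates how often a fixed stochastic policy reaches a designated shutdown state; the MDP, policy, and numbers are invented for illustration and are not drawn from the paper.

import numpy as np

# Toy MDP: a "safe" policy in the relaxed sense fails to reach shutdown
# only with small probability.
n_states, n_actions, shutdown_state = 4, 2, 3
rng = np.random.default_rng(0)

# transition[s, a] is a probability distribution over next states.
transition = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
# policy[s] assigns nonzero probability to every action, as with sampled LLM policies.
policy = rng.dirichlet(np.ones(n_actions), size=n_states)

def shutdown_probability(start_state: int, horizon: int = 20, n_rollouts: int = 10_000) -> float:
    """Monte Carlo estimate of P(reach the shutdown state within `horizon` steps)."""
    reached = 0
    for _ in range(n_rollouts):
        state = start_state
        for _ in range(horizon):
            action = rng.choice(n_actions, p=policy[state])
            state = rng.choice(n_states, p=transition[state, action])
            if state == shutdown_state:
                reached += 1
                break
    return reached / n_rollouts

print(shutdown_probability(start_state=0))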

Similarity Based on State Space Structure

The similarity of environments or scenarios for deploying AI policies is considered based on the structure of the broader state space that the policy is defined on. This approach is natural for scenarios where such metrics exist, like comparing states via their embeddings in LLMs.

This research is crucial in advancing our understanding of AI safety and alignment, especially in the context of power-seeking behaviors and the stability of non-power-seeking traits in AI agents across different deployment environments. It contributes significantly to the ongoing conversation about building AI systems that align with human values and expectations, particularly in mitigating risks associated with AI’s potential to seek power and resist shutdown.

British Standards Institution Pioneers International AI Safety Guidelines for Sustainable Future

The British Standards Institution (BSI) has marked a significant milestone in the field of artificial intelligence (AI) by releasing the world’s first international guideline on AI safety, designated as BS ISO/IEC 42001. This publication aims to establish a comprehensive framework for organizations to manage AI systems safely and responsibly, addressing the growing global demand for standardization in this rapidly evolving technology sector.

Emergence of AI and the Need for Standardization

AI’s transformative impact across various industries, from healthcare to finance, has been phenomenal. However, its rapid adoption has also raised concerns about safety, ethical use, and trustworthiness. A survey by BSI highlighted that 61% of people globally sought international guidelines for AI use, pointing to a significant ‘AI confidence gap’. In response, BSI’s BS ISO/IEC 42001 seeks to provide an authoritative, globally recognized standard for AI management.

Key Features of BS ISO/IEC 42001

This guideline is an impact-based framework focusing on essential aspects such as non-transparent automatic decision-making, the use of machine learning over human-coded logic, and continuous learning algorithms. It delineates how organizations can establish, implement, maintain, and continually improve an AI management system, underpinning it with robust safeguards.

Benefits for Organizations and Society

BS ISO/IEC 42001 is designed to help organizations introduce a quality-centric culture for AI development and usage. It provides detailed risk assessments, risk treatments, and controls for both internal and external AI products and services. By adhering to this standard, organizations can not only enhance the trustworthiness of their AI systems but also align themselves with ethical considerations, contributing positively to society and the environment.

Global Relevance and Adoption

The guideline’s publication is timely, with AI set to be one of the key themes at the World Economic Forum in 2024. It is a crucial step for businesses navigating the complex path to AI compliance, especially in light of developments like the EU AI Act. The standard is expected to be a foundational step for organizations globally to manage AI systems responsibly.

The Future of AI and Regulatory Landscape

BSI’s initiative is a crucial step in shaping the future of AI. It aligns with the UK Government’s National AI Strategy and represents a proactive approach to address regulatory challenges in AI. The publication of BS ISO/IEC 42001 coincides with a flurry of AI regulatory developments in the UK, including consultations on AI compliance with data protection law and the setting of benchmarks for future legislation regulating AI technology.
