ETFOptimize | High-performance ETF-based Investment Strategies

Quantitative strategies, Wall Street-caliber research, and insightful market analysis since 1998.


ETFOptimize | HOME
Close Window

MLCommons Announces the Formation of AI Safety Working Group

The initial focus will be on the development of safety benchmarks for large language models used for generative AI — using Stanford's groundbreaking HELM framework.

Today, MLCommons®, the leading AI benchmarking organization, is announcing the creation of the AI Safety (AIS) working group. The AIS will develop a platform and pool of tests from many contributors to support AI safety benchmarks for diverse use cases.

AI systems offer the potential for substantial benefits to society but they are not without risks, such as toxicity, misinformation, and bias. As with other complex technologies, society needs industry-standard safety testing to realize the benefits while minimizing the risks.

The new platform will support defining benchmarks that select from the pool of tests and summarize the outputs into useful, comprehensible scores – similar to what is standard in other industries such as automotive safety test ratings and energy star scores.

The effort’s immediate priority will be supporting rapid evolution of more rigorous and reliable AI safety testing technology. The AIS working group will draw upon the technical and operational expertise of its members, and the larger AI community, to help guide and create the AI safety benchmarking technologies.

"The open and dynamic nature of the safety benchmarks being developed by the broad AI community creates real incentives to set and meet common goals," said Joaquin Vanschoren, Associate Professor of Machine Learning at Eindhoven University of Technology. "Anyone can propose new tests if they see unsolved safety issues. We have some of the smartest minds in the world coming together to actually solve these issues, and using benchmarks means we will have clear insights on which AI models best address safety concerns."

The initial focus will be developing safety benchmarks for large language models (LLMs), building on groundbreaking work done by researchers at Stanford University’s Center for Research on Foundation Models and its Holistic Evaluation of Language Models (HELM). In addition to building upon the HELM framework and incorporating many of its safety related tests, we expect several companies to externalize AI safety tests they have used internally for proprietary purposes, and share them openly with the MLCommons community, which will help speed the pace of innovation.

Percy Liang, the director for the Center for Research on Foundation Models (CRFM) says: “We have been developing HELM, a modular framework for evaluation, for about 2 years. I’m very excited to work with MLCommons to leverage HELM for AI safety evaluation, which is a topic that I’ve been thinking about for 7 years, but has become extremely urgent with the rise of powerful foundation models.”

The AIS working group believes that, as testing matures, standard AI safety benchmarks will become a vital element of our approach to AI safety. This aligns with responsible development and risk-based policy frameworks such as the voluntary commitments on safety, security, and trust that several tech companies made to the White House in July 2023, NIST’s AI Risk Management Framework, and the EU’s forthcoming AI Act.

MLCommons supports a broad set of stakeholders across industry and academia in developing shared data, tools, and benchmarks to more effectively build and test AI systems. “We are excited to work collaboratively with our members,” said David Kanter, MLCommons Executive Director. “Over the next year we will focus on building and deploying AI Safety benchmarks, beginning first with open source models and with the aim of applying the benchmarks more broadly to other LLMs once the initial methodology is proven.”

The AIS working group’s initial participation includes a multi-disciplinary group of AI experts including: Anthropic, Coactive AI, Google, Inflection, Intel, Meta, Microsoft, NVIDIA, OpenAI, Qualcomm Technologies, Inc., and academics Joaquin Vanstoren from Eindhoven University of Technology, Percy Liang from Stanford University, and Bo Li from the University of Chicago. Participation in the working group is open to academic and industry researchers and engineers, as well as domain experts from civil society and the public sector. Please see our AIS Working Group page for information on how to participate.

About MLCommons

MLCommons is the world leader in building benchmarks for AI. It is an open engineering consortium with a mission to make machine learning better for everyone through benchmarks and data. The foundation for MLCommons began with the MLPerf benchmark in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. In collaboration with its 125+ members, global technology providers, academics and researchers, MLCommons is focused on collaborative engineering work that builds tools for the entire machine learning industry through benchmarks and metrics, public datasets and best practices.

For additional information on MLCommons and details on becoming a Member or Affiliate of the organization, please visit http://mlcommons.org and contact participation@mlcommons.org.

MLCommons announces a new AI Safety working group. Comprised of experts across industry and academia, they are building a platform & pool of tests for industry-std AI safety testing. https://mlcommons.org/en/news/formation-ai-safety-working-group/

Contacts

Stock Quote API & Stock News API supplied by www.cloudquote.io
Quotes delayed at least 20 minutes.
By accessing this page, you agree to the following
Privacy Policy and Terms Of Service.


 

IntelligentValue Home
Close Window

DISCLAIMER

All content herein is issued solely for informational purposes and is not to be construed as an offer to sell or the solicitation of an offer to buy, nor should it be interpreted as a recommendation to buy, hold or sell (short or otherwise) any security.  All opinions, analyses, and information included herein are based on sources believed to be reliable, but no representation or warranty of any kind, expressed or implied, is made including but not limited to any representation or warranty concerning accuracy, completeness, correctness, timeliness or appropriateness. We undertake no obligation to update such opinions, analysis or information. You should independently verify all information contained on this website. Some information is based on analysis of past performance or hypothetical performance results, which have inherent limitations. We make no representation that any particular equity or strategy will or is likely to achieve profits or losses similar to those shown. Shareholders, employees, writers, contractors, and affiliates associated with ETFOptimize.com may have ownership positions in the securities that are mentioned. If you are not sure if ETFs, algorithmic investing, or a particular investment is right for you, you are urged to consult with a Registered Investment Advisor (RIA). Neither this website nor anyone associated with producing its content are Registered Investment Advisors, and no attempt is made herein to substitute for personalized, professional investment advice. Neither ETFOptimize.com, Global Alpha Investments, Inc., nor its employees, service providers, associates, or affiliates are responsible for any investment losses you may incur as a result of using the information provided herein. Remember that past investment returns may not be indicative of future returns.

Copyright © 1998-2017 ETFOptimize.com, a publication of Optimized Investments, Inc. All rights reserved.