Researchers point out that 'OpenAI's policy destroys the reproducibility of nearly 100 AI papers'
OpenAI, an AI research organization, announced in March 2023 that it would end support for Codex, an AI system that automatically generates code from natural-language input. In response, Sayash Kapoor, a doctoral student at Princeton University in the United States, and Professor Arvind Narayanan warned, 'With the end of support for Codex, which is used in about 100 papers, the reproducibility of that research will be lost.'
OpenAI's policies hinder reproducible research on language models
Codex, developed by OpenAI, is the model behind the code-completion tool GitHub Copilot, built and released in partnership with GitHub in July 2021; it can interpret natural language and output appropriate code. Unlike some other AI models OpenAI has released, Codex is not open source, so users who wanted to use it had to apply to OpenAI for access to the model.
However, on March 20, 2023 local time, OpenAI sent users an email notifying them that 'support for Codex will end on March 23.'
OAI will discontinue support for Codex models starting March 26. And just like that, all papers and ideas built atop codex (> 200 on ArXiv) will not be replicable or usable as is. Do we still think openness doesn't matter? pic.twitter.com/CEzBgdP1ps — Delip Rao (@deliprao) March 21, 2023
Codex is used in about 100 academic papers, so if OpenAI ends support and users can no longer access it, the reproducibility of those papers is lost. Moreover, the period from notification to service termination was less than a week, which is extremely short compared to standard practice for software deprecation.
'Independent researchers will be unable to assess the validity of papers or build on their findings,' Kapoor and Narayanan point out, adding that developers who built applications on Codex 'can no longer guarantee that their applications will continue to work as expected.'
In language model research, the exact model used in a study must remain accessible to ensure reproducibility, because even slight changes to a model can affect results. If results cannot be reproduced with access only to a newer model, it is impossible to determine whether the failure stems from differences between models or from flaws in the research itself.
The ability of others to reproduce research results is essential to ensuring the accuracy of scientific research, but in recent years declining reproducibility across the sciences has increasingly been seen as a problem.
The 'reproducibility' of science is at stake - GIGAZINE
In response to this feedback, OpenAI launched a program to continue supporting researcher access to Codex.
Thanks for all the feedback on this, we will continue to support Codex access via our Researcher Access Program. If you are not already part of it, we encourage you to apply in order to maintain access to Codex: https://t.co/9OwWyR54NE — Logan.GPT (@OfficialLoganK) March 22, 2023
However, Kapoor and Narayanan are concerned that the application process for the Codex access program is opaque and that it is unclear how long access to Codex will be maintained. In addition, OpenAI regularly updates its newest models, such as GPT-3.5 and GPT-4, and maintains access to past versions for only about three months, which they say impairs the reproducibility of research using those models. As a result, not only researchers but also developers building applications on OpenAI's models cannot be sure that future models will preserve their applications' behavior.
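The versioning problem above can be sketched in a few lines. This is an illustrative example only: the dated snapshot name `gpt-3.5-turbo-0301` was a real snapshot that OpenAI offered at the time, but the helper function and request shape here are assumptions for illustration, not OpenAI's actual API or deprecation mechanism, and nothing in the sketch calls the network.

```python
# Illustrative sketch of model pinning: a dated snapshot name records the
# exact model a study used, while a bare alias silently tracks whatever
# model OpenAI currently serves under that name.

def build_request(prompt: str, pin_snapshot: bool) -> dict:
    # "gpt-3.5-turbo-0301" was a dated snapshot; "gpt-3.5-turbo" is an
    # alias that can be repointed to a newer model, changing outputs.
    model = "gpt-3.5-turbo-0301" if pin_snapshot else "gpt-3.5-turbo"
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # minimize sampling variance for replication
    }

pinned = build_request("Sort a list in Python", pin_snapshot=True)
floating = build_request("Sort a list in Python", pin_snapshot=False)
print(pinned["model"])    # the snapshot a paper can cite
print(floating["model"])  # may point at a different model next month
```

Even a pinned snapshot, under the policy described above, is only guaranteed for roughly three months after a newer model ships, which is precisely the gap Kapoor and Narayanan criticize.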
Kapoor and Narayanan point out that language models have become important infrastructure, and that OpenAI's policy of not providing versioned models hurts the reproducibility of research. While they acknowledge that many factors must be weighed when open sourcing large language models, they argue that open-source language models are an important step toward ensuring reproducible research.
On the social news site Hacker News, the story drew comments such as 'If you end support for the old model, you should open source it.'
OpenAI's policies hinder reproducible research on language models | Hacker News
In addition, when OpenAI released GPT-4, it did not disclose the dataset or training methods used to build the model. Ilya Sutskever, OpenAI's chief scientist and co-founder, said, 'If you believe, as we do, that at some point AI and AGI, artificial general intelligence, will become incredibly powerful, then it simply makes no sense to open source it, and I think in a few years it will be obvious to everyone that open sourcing AI is not wise,' signaling OpenAI's stance of keeping its AI closed.
OpenAI co-founder says 'We were wrong': a major shift to a policy of not opening data due to the dangers of AI - GIGAZINE