OpenAI looks for alternatives to Nvidia to accelerate AI development

OpenAI is looking for additional hardware to step up development of its artificial intelligence (AI) models, in particular Generative Pre-trained Transformer (GPT) and the related chatbot ChatGPT. The company appears to be considering several options, many of which would reduce its dependence on Nvidia. Nvidia's server graphics cards (GPUs) with integrated AI processing units are highly sought after, partly because of the attractive accompanying software, but Nvidia cannot currently meet demand in full.

According to the Reuters news agency, OpenAI has been looking for alternatives since 2022. One option would be to develop its own chips, optimized for training large language models. Such AI accelerators are relatively simple in design compared to conventional processors (CPUs) and GPUs. They do not need to handle a wide variety of instructions; their main job is to perform large numbers of matrix multiplications and additions (multiply-accumulate, MAC). Specialized AI accelerators typically feature a very large number of identical compute units with SRAM cache, possibly controlled by ARM or RISC-V cores. Contract manufacturers such as TSMC are available to produce such chips. OpenAI has apparently already considered acquiring an AI chip developer to speed up in-house development. Even so, it would probably take a few years before such an effort paid off.
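To illustrate why the multiply-accumulate operation matters so much, the following minimal Python sketch builds a matrix multiplication purely out of MAC steps. It is not taken from any vendor's design; the function names `mac` and `matmul` are hypothetical and chosen only for illustration. A real accelerator would run many of these steps in parallel in hardware rather than one after another in a loop.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: add the product a * b to the accumulator."""
    return acc + a * b


def matmul(A, B):
    """Multiply matrices A (n x m) and B (m x p) using only MAC steps."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            # Each output element is a chain of `inner` MAC operations.
            for k in range(inner):
                acc = mac(acc, A[i][k], B[k][j])
            C[i][j] = acc
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Every element of the result is computed by the same simple operation, repeated with different inputs, which is why a chip made of many identical compute units can suffice for this workload.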

Alternatively, OpenAI could turn to additional suppliers. Among the major chip manufacturers, AMD with its upcoming Instinct MI300 and Intel with the long-awaited Ponte Vecchio would be candidates. There are also smaller companies such as Cerebras and Graphcore. OpenAI is also believed to be already testing an AI accelerator developed by Microsoft. This underlines the current trend towards in-house designs: Amazon has already produced two generations of its own AI chips (Trainium, Inferentia), and Google has its Tensor Processing Unit (TPU).

OpenAI is working closely with Microsoft on AI training. In 2020, Microsoft built a dedicated supercomputer exclusively for OpenAI. It originally used 10,000 Nvidia V100 (Volta) GPUs combined with thousands of AMD Epyc processors. Microsoft last reported 285,000 CPU cores for the supercomputer, presumably from the Zen 2 generation (Epyc 7002, Rome). The V100 was Nvidia’s first GPU with integrated Tensor cores for AI calculations. Since then, Nvidia has developed two more generations (A100, H100), in which Microsoft has invested, and that investment also benefits OpenAI.
