A monthly overview of things you need to know as an architect or aspiring architect. Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with ...
Jim Fan is one of Nvidia’s senior AI researchers. The shift could be about many orders of magnitude more compute and energy needed for inference that can handle the improved reasoning in the OpenAI ...
The next phase of AI infrastructure will not be defined by a single destination called “the cloud” or “the edge.” ...
The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use ...
Artificial intelligence has moved beyond proving it works. The challenge today is producing enough computing power to satisfy ...
Over the past several years, the lion’s share of artificial intelligence (AI) investment has poured into training infrastructure—massive clusters designed to crunch through oceans of data, where speed ...
An analog in-memory compute chip claims to solve the power/performance conundrum facing artificial intelligence (AI) inference applications by facilitating energy efficiency and cost reductions ...
The variable most organizations are missing isn’t compute — it’s storage purpose-built for AI context, not just data capacity ...
In a new case study, Hugging Face researchers have demonstrated how small language models (SLMs) can be configured to outperform much larger models. Their findings show that a Llama 3 model with 3B ...