Inference Time Compute

OpenAI Presents Research on Inference-Time Compute to Better AI Security

NextBigFuture

OpenAI Strawberry LLM Reasoning Needs More Compute and Energy for Inference

Jim Fan is one of Nvidia’s senior AI researchers. The shift could mean that inference needs many orders of magnitude more compute and energy to handle the improved reasoning in the OpenAI ...

How AI Inference Sends Decision Making To The Edge

The next phase of AI infrastructure will not be defined by a single destination called “the cloud” or “the edge.” ...

VentureBeat

Train-to-Test scaling explained: How to optimize your end-to-end AI compute budget for inference

The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use ...

24/7 Wall St.

AI Demand Is Outstripping Supply — Even Google Can’t Keep Up

Artificial intelligence has moved beyond proving it works. The challenge today is producing enough computing power to satisfy ...

EDN

Purpose-built AI inference architecture: Reengineering compute design

Over the past several years, the lion’s share of artificial intelligence (AI) investment has poured into training infrastructure—massive clusters designed to crunch through oceans of data, where speed ...

EDN

Analog in-memory compute tackles the AI inference conundrum

An analog in-memory compute chip claims to solve the power/performance conundrum facing artificial intelligence (AI) inference applications by facilitating energy efficiency and cost reductions ...

Why AI Infrastructure Bottlenecks Are Moving Beyond GPUs

The variable most organizations are missing isn’t compute — it’s storage purpose-built for AI context, not just data capacity ...

VentureBeat

Hugging Face shows how test-time scaling helps small language models punch above their weight

In a new case study, Hugging Face researchers have demonstrated how small language models (SLMs) can be configured to outperform much larger models. Their findings show that a Llama 3 model with 3B ...
