Artificial Intelligence

The Real Cost of Scaling AI: How Supermicro and NVIDIA Are Rebuilding Data Center Infrastructure

The hidden cost of scaling AI: infrastructure, energy, and the push for liquid cooling.

Updated

January 8, 2026 6:31 PM

The inside of a data centre, with rows of server racks. PHOTO: FREEPIK

As artificial intelligence models grow larger and more demanding, the quiet pressure point isn’t the algorithms themselves—it’s the AI infrastructure that has to run them. Training and deploying modern AI models now requires enormous amounts of computing power, which creates a different kind of challenge: heat, energy use and space inside data centers. This is the context in which Supermicro and NVIDIA’s collaboration on AI infrastructure begins to matter.

Supermicro designs and builds large-scale computing systems for data centers. It has now expanded its support for NVIDIA’s Blackwell generation of AI chips with new liquid-cooled server platforms built around the NVIDIA HGX B300. The announcement isn’t just about faster hardware. It reflects a broader effort to rethink how AI data center infrastructure is built as facilities strain under rising power and cooling demands.

At a basic level, the systems are designed to pack more AI chips into less space while using less energy to keep them running. Instead of relying mainly on air cooling, which uses fans, chillers and large amounts of electricity, these liquid-cooled AI servers circulate liquid directly across critical components. That approach removes heat more efficiently, allowing servers to run denser AI workloads without overheating or wasting energy.

Why does that matter outside a data center? Because AI doesn’t scale in isolation. As models become more complex, the cost of running them rises quickly, not just in hardware budgets, but in electricity use, water consumption and physical footprint. Traditional air-cooling methods are increasingly a bottleneck, limiting how far AI systems can grow before energy and infrastructure costs spiral.

This is where the Supermicro–NVIDIA partnership fits in. NVIDIA supplies the computing engines—the Blackwell-based GPUs designed to handle massive AI workloads. Supermicro focuses on how those chips are deployed in the real world: how many GPUs can fit in a rack, how they are cooled, how quickly systems can be assembled and how reliably they can operate at scale in modern data centers. Together, the goal is to make high-density AI computing more practical, not just more powerful.

The new liquid-cooled designs are aimed at hyperscale data centers and so-called AI factories—facilities built specifically to train and run large AI models continuously. By increasing GPU density per rack and removing most of the heat through liquid cooling, these systems aim to ease a growing tension in the AI boom: the need for more computing power without an equally dramatic rise in energy waste.

Just as important is speed. Large organizations don’t want to spend months stitching together custom AI infrastructure. Supermicro’s approach packages compute, networking and cooling into pre-validated data center building blocks that can be deployed faster. In a world where AI capabilities are advancing rapidly, time to deployment can matter as much as raw performance.

Stepping back, this development says less about one product launch and more about a shift in priorities across the AI industry. The next phase of AI growth isn’t only about smarter models—it’s about whether the physical infrastructure powering AI can scale responsibly. Efficiency, power use and sustainability are becoming as critical as speed.


Artificial Intelligence

Are Workplace Chats Becoming the Next Layer of AI Memory?

As workplace knowledge spreads across chats, AI firms are building systems that can structure, retrieve and preserve it over time.

Updated

May 11, 2026 5:24 PM

A messaging app on a phone. PHOTO: ADOBE STOCK

Votee AI, an enterprise AI company headquartered in Hong Kong, has partnered with its Toronto-based research lab Beever AI to launch Beever Atlas. The new platform is designed to turn workplace chats into searchable knowledge that AI systems can retrieve and understand.

The release focuses on a growing issue inside organisations. Much of today’s workplace knowledge now exists inside chat platforms such as Slack, Microsoft Teams, Discord and Telegram. Important discussions, project decisions and technical information often disappear into long message histories that are difficult to search later.

Beever AI developed the platform to organise those conversations into a structured system for AI assistants. The software connects with Telegram, Discord, Mattermost, Microsoft Teams and Slack, then converts conversations into linked records of people, projects, files and decisions.

The collaboration combines Votee AI’s enterprise infrastructure work with Beever AI’s research around AI memory systems. The companies are releasing two versions of the product. The open-source edition is aimed at individual developers, researchers and creators. The enterprise edition is designed for banks, government agencies and larger organisations with stricter security requirements.

The release also reflects a broader shift happening across the AI industry. Companies are increasingly looking at how AI systems store and retrieve long-term knowledge, rather than relying solely on large context windows or search-based retrieval.

Earlier this year, OpenAI founding member and former director of AI at Tesla, Andrej Karpathy, discussed the growing need for what he described as “LLM Knowledge Bases.” He argued that AI systems need structured and evolving memory rather than depending only on context windows and vector search.

Beever Atlas approaches that problem through workplace communication. Instead of focusing mainly on uploaded files, the system is designed around conversations that happen daily across team chat platforms. It can also process images, PDFs, voice notes and video files within the same searchable system.

The companies say the software is designed to work directly with AI assistants and coding tools such as Cursor, AWS Kiro and Qwen Code. Integrations for OpenClaw and Hermes Agent are expected later in 2026.

Pak-Sun Ting, Co-Founder and CEO of Votee AI, said: "Hong Kong has always been known for property and finance. Beever Atlas is proof that world-class AI infrastructure can emerge from an HK-headquartered company and be shared openly with the world. Every growing organization faces the same silent liability: conversational knowledge loss. Beever Atlas turns this perishable resource into a compounding organizational asset."

A large part of the enterprise version focuses on privacy and access control. The system mirrors permissions from Slack and Microsoft Teams so users can only retrieve information they are already authorised to access. Permission updates are reflected automatically when access changes inside company systems.
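The mechanics of that kind of permission mirroring can be illustrated in a few lines. The sketch below is hypothetical (Beever Atlas's actual schema and code are not public): retrieval is simply scoped to the channels a user is already a member of, so a permission change upstream automatically changes what the assistant can surface.

```python
# Hypothetical sketch of permission-scoped retrieval. Names and fields
# are illustrative, not Beever Atlas's real implementation.
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    channel: str  # chat channel the record was extracted from


@dataclass
class User:
    name: str
    # Channel memberships mirrored from Slack/Teams; updating this set
    # immediately changes what retrieve() can return for the user.
    channels: set = field(default_factory=set)


def retrieve(user: User, records: list[Record], query: str) -> list[Record]:
    """Search only records in channels the user is already authorised to see."""
    visible = [r for r in records if r.channel in user.channels]
    return [r for r in visible if query.lower() in r.text.lower()]


records = [
    Record("Q3 launch moved to October", "#product"),
    Record("Salary bands updated", "#hr-private"),
]
alice = User("alice", channels={"#product"})
print([r.text for r in retrieve(alice, records, "launch")])
```

The key property is that access control lives in the mirrored membership data, not in the search logic, so the retrieval code never needs to know why a user can or cannot see a channel.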

The enterprise edition also includes audit logs, encryption controls and data retention settings for organisations handling sensitive internal data. Companies can run the software entirely inside their own infrastructure using Docker and connect it to their preferred AI models through LiteLLM.

The companies argue that organising information is more useful than simply storing chat archives. Jacky Chan, Co-Founder and CTO of Votee AI, said: "The key technical decision was to treat agent memory as a knowledge engineering problem, not a retrieval problem. Structure beats similarity — a typed graph of who works on what is more useful to an AI than vector search over a Slack archive."
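A toy example makes the "structure beats similarity" point concrete. A typed graph answers "who works on what" exactly, where keyword or embedding search over raw chat logs would only return loosely related messages. The entity and relation names below are illustrative, not Beever Atlas's actual schema.

```python
# Toy typed graph: facts stored as (subject, relation) -> set of objects.
# Illustrative only; Beever Atlas's real data model is not public.
from collections import defaultdict


class TypedGraph:
    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.edges[(subject, relation)].add(obj)

    def query(self, subject: str, relation: str) -> set:
        # Exact lookup: returns precisely the facts asserted, or nothing.
        return self.edges[(subject, relation)]


g = TypedGraph()
g.add("alice", "works_on", "atlas-backend")
g.add("bob", "works_on", "atlas-backend")
g.add("alice", "decided", "use Postgres for the graph store")

print(g.query("alice", "works_on"))  # {'atlas-backend'}
```

Unlike a similarity search, the graph either has the fact or it does not, which is what makes it a reliable substrate for an AI assistant's answers.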

The software also includes protections against prompt injection attacks and systems designed to reduce hallucinated responses. According to the companies, the AI is designed to return “I don't know” with citations when confidence is low instead of generating unsupported answers.
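That confidence-gated behaviour can be sketched as a simple threshold check over retrieved evidence. The scoring function and threshold below are illustrative assumptions (a real system would use a learned or embedding-based score), but the shape is the same: answer with a citation when evidence is strong, otherwise decline and point at the closest sources.

```python
# Illustrative confidence gate: decline to answer when retrieved evidence
# scores below a threshold. Scoring here is naive keyword overlap, chosen
# only to keep the sketch self-contained.
def answer(query: str, docs: dict[str, str], threshold: float = 0.5) -> str:
    q_words = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_words & set(text.lower().split())) / max(len(q_words), 1)
        scored.append((overlap, doc_id, text))
    scored.sort(reverse=True)

    best = scored[0] if scored else (0.0, None, "")
    if best[0] < threshold:
        # Low confidence: say so, but still cite the nearest sources.
        cites = ", ".join(doc_id for _, doc_id, _ in scored[:2])
        return f"I don't know (closest sources: {cites})"
    return f"{best[2]} [source: {best[1]}]"


docs = {
    "msg-1": "launch moved to october",
    "msg-2": "lunch menu updated",
}
print(answer("launch moved", docs))
print(answer("when is the next all-hands", docs))
```

The design choice worth noting is that the low-confidence path still returns citations, so a human can inspect what the system almost matched rather than receiving a bare refusal.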

As workplace communication becomes increasingly fragmented across chat platforms, companies are beginning to treat internal conversations as information that AI systems can organise, retrieve and build on. Beever Atlas reflects a broader push to turn everyday workplace communication into long-term organisational memory.