🖥🦾 What DeepSeek-V3 Teaches Us About Efficient AI Infrastructure
A Newsletter for Entrepreneurs, Investors, and Computing Geeks
We are back with our weekly newsletter! And we're joined by Emily, a VC with a passion for computing, who will be helping us curate this newsletter going forward.
This week's deep dive looks at DeepSeek-V3 and the practical lessons it offers for building large-scale AI systems efficiently. We also highlight a breakthrough in quantum computing and share curated news across AI, quantum, photonics, neuromorphic, and infrastructure, along with key readings and funding news, plus a bonus section with different perspectives on the US vs. China tech race.
Finally, thanks to everyone who joined the Future of Computing Conference in Berlin last June — we're in active preparations for the next edition in Paris on November 6 (more on that soon!).
Deep Dive: What DeepSeek-V3 Teaches Us About Efficient AI Infrastructure
Research Paper: "Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures" (ISCA '25, DeepSeek-AI)
Summary:
The team at DeepSeek-AI has managed to train a top-tier open-source large language model (DeepSeek-V3) using just 2,048 NVIDIA H800 GPUs. Instead of relying on brute-force scaling, they focused on a tight integration between model architecture and hardware. The result is a highly efficient system that challenges the idea that only Big Tech can play in the large-model arena.
Key Takeaways:
Smarter Attention = Less Memory Use
DeepSeek-V3 uses a method called Multi-head Latent Attention (MLA), which compresses the attention key-value (KV) cache into small latent vectors during inference. This cuts the memory needed per token by up to 85 percent compared to models like LLaMA-3, which is crucial for handling long-context inputs efficiently.
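As a back-of-the-envelope illustration of why this matters, here is the per-token cache arithmetic. The dimensions below are rough approximations of the published model configurations, not exact values:

```python
# KV-cache comparison: grouped-query attention (GQA), as in LLaMA-3-class
# models, vs. an MLA-style compressed latent cache. Dimensions are
# illustrative approximations, not the models' exact configurations.

BYTES = 2  # 16-bit cache entries

def kv_bytes_per_token(n_layers: int, cached_dim: int) -> int:
    """Bytes of KV cache each generated token keeps resident."""
    return n_layers * cached_dim * BYTES

# GQA caches full keys and values for a reduced number of KV heads.
gqa = kv_bytes_per_token(n_layers=126, cached_dim=2 * 8 * 128)

# An MLA-style cache stores one compressed latent vector (plus a small
# shared positional key) per layer instead of per-head keys and values.
mla = kv_bytes_per_token(n_layers=61, cached_dim=512 + 64)

print(f"GQA: {gqa / 1024:.0f} KiB/token, MLA-style: {mla / 1024:.0f} KiB/token")
print(f"reduction: {1 - mla / gqa:.0%}")  # roughly 85% under these assumptions
```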
Efficient Scaling with Sparse Models
Thanks to a Mixture of Experts (MoE) architecture, the model activates only a small subset of its 671 billion parameters (roughly 37 billion) for any given token. This reduces compute costs while maintaining performance, making the model more practical for on-premises or personalized use.
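The routing idea can be sketched in a few lines; the expert count, top-k, and dimensions here are toy values, not DeepSeek-V3's actual configuration:

```python
import numpy as np

# Toy Mixture-of-Experts layer: a router scores all experts, but only the
# top-k run for each token, so active parameters stay a small fraction of
# the total. Sizes here are illustrative, not DeepSeek-V3's configuration.

rng = np.random.default_rng(0)
n_experts, k, d = 16, 2, 64
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # one toy FFN each
router = rng.normal(size=(d, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    scores = x @ router                            # router logits, one per expert
    top = np.argsort(scores)[-k:]                  # indices of the k best experts
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                           # softmax over the chosen experts
    # Only k of the n_experts weight matrices are ever multiplied.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.normal(size=d))
print(f"experts touched per token: {k}/{n_experts}")
```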
Trained with Simplified Numerical Formats
To make training more efficient, DeepSeek-V3 uses a compact number representation known as FP8 precision. These eight-bit formats occupy half the space of the 16-bit formats commonly used in training, which reduces memory usage and speeds up computation, all with minimal impact on model quality.
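To get a feel for what such a format costs in accuracy, the toy below rounds values to an FP8-like grid (E4M3 with a shared per-block scale). It simulates only the rounding effect; real FP8 support lives in the GPU's tensor cores, and DeepSeek's actual recipe involves further scaling details:

```python
import numpy as np

# Simulate FP8-style (E4M3) rounding with a shared per-block scale: values
# keep ~3 mantissa bits, halving storage vs. 16-bit formats at the cost of
# coarser rounding. This is a rounding-effect sketch, not a real FP8 kernel.

def fake_quantize_e4m3(x: np.ndarray, block: int = 128) -> np.ndarray:
    out = x.astype(np.float64).ravel().copy()
    for i in range(0, out.size, block):
        chunk = out[i:i + block]
        scale = np.abs(chunk).max() / 448.0 + 1e-12    # 448 = E4M3 max value
        q = chunk / scale                              # map block into FP8 range
        exp = np.floor(np.log2(np.abs(q) + 1e-30))     # each value's binade
        step = 2.0 ** (exp - 3)                        # 3 mantissa bits -> 8 steps
        out[i:i + block] = np.round(q / step) * step * scale
    return out.reshape(x.shape)

w = np.random.default_rng(0).normal(size=(4, 256))
err = np.abs(fake_quantize_e4m3(w) - w).mean()
print(f"mean absolute rounding error: {err:.5f}")
```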
Faster Text Generation with Multi-Token Prediction
Instead of generating one token at a time, the model predicts several in parallel and verifies them on the fly. This approach boosts generation speed by up to 1.8 times in real-world scenarios.
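Conceptually this follows the draft-and-verify pattern sketched below; both functions are hypothetical stand-ins, not DeepSeek-V3's actual prediction heads:

```python
import random
from typing import List

# Draft-and-verify decoding in the spirit of multi-token prediction: a cheap
# head proposes k tokens, the main model checks them against its own choices,
# and every accepted token skips a sequential decoding step.

random.seed(0)

def main_model(seq: List[int]) -> int:
    """Stand-in for the full model's next-token choice (deterministic toy)."""
    return (seq[-1] * 7 + 3) % 100

def draft_head(seq: List[int], k: int = 4) -> List[int]:
    """Stand-in draft head: agrees with the main model ~80% of the time."""
    out, ctx = [], list(seq)
    for _ in range(k):
        tok = main_model(ctx) if random.random() < 0.8 else random.randrange(100)
        out.append(tok)
        ctx.append(tok)
    return out

def generate(seq: List[int], passes: int) -> List[int]:
    accepted = 0
    for _ in range(passes):
        proposal = draft_head(seq)
        n_ok = 0
        # A real system verifies all k proposals in one parallel forward pass;
        # the loop here just makes the accept/reject rule explicit.
        for tok in proposal:
            if tok != main_model(seq + proposal[:n_ok]):
                break
            n_ok += 1
        seq = seq + proposal[:n_ok]
        seq.append(main_model(seq))   # the main model always emits one token
        accepted += n_ok
    print(f"extra tokens per pass: {accepted / passes:.2f}")
    return seq

generate([1], passes=20)
```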
Networking Matters More Than Ever
The team redesigned the GPU network using a multi-plane topology to reduce latency and keep infrastructure costs down. They also optimized token routing to avoid communication bottlenecks.
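A simplified picture of the multi-plane idea, with illustrative counts: each GPU index attaches to its own network plane, and cross-plane traffic first hops to the right GPU inside the node.

```python
# Toy model of a multi-plane fabric: GPU i of every node attaches to network
# plane i, so each plane forms its own small, independent two-layer network.
# Counts and naming are illustrative, not the paper's exact cluster spec.

GPUS_PER_NODE = 8  # also the number of planes in this sketch

def route(src_node: int, src_gpu: int, dst_node: int, dst_gpu: int) -> list:
    """Return the hops a message takes from one GPU to another."""
    hops = []
    if src_gpu != dst_gpu:
        # Forward inside the node to the GPU sitting on the destination plane.
        hops.append(f"node{src_node}: gpu{src_gpu} -> gpu{dst_gpu} (intra-node link)")
    if src_node != dst_node:
        # Travel on that plane's independent scale-out network.
        hops.append(f"plane {dst_gpu}: node{src_node} -> node{dst_node}")
    return hops

for hop in route(src_node=0, src_gpu=3, dst_node=5, dst_gpu=6):
    print(hop)
```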
Why It Matters:
DeepSeek-V3 is a strong example of what becomes possible when models and infrastructure are designed together. As GPU availability tightens and energy costs rise, this kind of smart engineering may help smaller players stay competitive.
For another relevant analysis of DeepSeek, looking at different aspects than those covered above, see DeepSeek Debrief: >128 Days Later (SemiAnalysis).
Spotlight
⚛️ Quantum computers just beat classical ones — Exponentially and unconditionally (ScienceDaily)
"A research team has achieved the holy grail of quantum computing: an exponential speedup that's unconditional. By using clever error correction and IBM's powerful 127-qubit processors, they tackled a variation of Simon's problem, showing quantum machines are now breaking free from classical limitations, for real."
Headlines
Last weekās headlines span major milestones in AI, quantum, photonics, and neuromorphic computing, plus growing concerns around the energy and water footprint of data centers.
🤖 AI
🦾 Semiconductors
⚛️ Quantum Computing
Could āQuantum Sensingā Make Stealth Technology Obsolete? DARPA Thinks So (The National Interest)
EU Presses for Quantum-Safe Encryption by 2030 as Risks Grow (The Quantum Insider)
OQC and Kvantify announce new InnovateUK project, CoaxChem, to develop novel quantum solution to batteries (OQC, Oxford Quantum Circuits)
Xanadu & Mitsubishi Chemical Boost Chip Tech with Quantum Computing (Quantum Zeitgeist)
⚡️ Photonic / Optical Computing
Researchers Build 11-Mile-Long Quantum Highway Using Photons (SciTechDaily)
🧠 Neuromorphic Computing
BrainChip and HaiLa to develop ultra-low power edge AI connectivity for IoT applications (New Electronics)
🖥 Data Centers
Google's data center energy use doubled in 4 years (TechCrunch)
Selected Readings
This week's reading list spans semiconductor strategies, photonic innovation, and the environmental impact of data centers.
🦾 Semiconductors
Semiconductors Winners And Losers At The Start Of H2 2025: Geopolitical Shifts and Contrarian Plays (AI Invest) (3 mins)
How Oracle Is Winning the AI Compute Market (SemiAnalysis) (14 mins)
Exploring scalable pathways for cost-effective memristors using solution-processed 2D materials (Phys.org) (5 mins)
Inside Texas Instruments' $60bn US Supply Chain Gamble (Supply Chain Digital) (5 mins)
⚡️ Photonic / Optical Computing
Security layers for neuromorphic photonic accelerators (Open Access Government) (5 mins)
Two-Dimensional Semiconductors Advance Nanophotonics and Future Optoelectronic Devices (Quantum Zeitgeist) (7 mins)
🖥 Data Centers
What Google's Environmental Report Says About Data Centres (Data Centre Magazine) (6 mins)
Funding News
Last week's funding activity highlights momentum across foundational compute technologies, from integrated photonics and quantum error correction to edge HPC and energy-aware AI infrastructure. These enabling technologies are critical to scaling next-generation workloads.
Meanwhile, xAI's $10B raise underscores how capital-intensive the foundation model race has become and how high the stakes are.
🤖 AI
xAI raises $10B in debt and equity (TechCrunch)
⚛️ Quantum Computing
QEDMA Raises $26 Million With Participation From IBM to Tackle Quantum Computing Errors (The Quantum Insider)
Zerothird: Ten-million-dollar investment for Viennese quantum startup (German only) (brutkasten)
⚡️ Photonic / Optical Computing
EFFECT Photonics Raises $24M to Expand Coherent Optical Solutions for AI & Edge Networks (The Fast Mode)
☁️ Cloud Computing
Swiss cloud platform Impossible Cloud Network (ICN) secures €28.8 million to become an alternative to "monopolistic hyperscalers" (EU-Startups)
🖥 Data Centers
PoliCloud raises €7.5m for Edge HPC data center build-out (Data Center Dynamics)
Emerald AI Launches with $24.5M Seed Round to Transform AI Data Centers into Grid Allies (PR Newswire)
Bonus: US vs. China - From Different Perspectives
This section brings together different perspectives on the US vs. China tech race, including takes from US media, Asian outlets, and stock market analysts.
China Is Quickly Eroding America's Lead in the Global AI Race (The Wall Street Journal)
China shows off tech progress as US limits chip exports (Tech in Asia)
Why is the US leading in the chip industry? (36kr - a Chinese media company)
Love these insights? Forward this newsletter to a friend or two. They can subscribe here.