Modern Australian
Times Advertising

how a small Chinese AI company is shaking up US tech heavyweights

  • Written by Tongliang Liu, Associate Professor of Machine Learning and Director of the Sydney AI Centre, University of Sydney

Chinese artificial intelligence (AI) company DeepSeek has sent shockwaves through the tech community, with the release of extremely efficient AI models that can compete with cutting-edge products from US companies such as OpenAI and Anthropic.

Founded in 2023, DeepSeek has achieved its results with a fraction of the cash and computing power of its competitors.

DeepSeek’s “reasoning” R1 model, released last week, provoked excitement among researchers, shock among investors, and responses from AI heavyweights. The company followed up on January 28 with a model that can work with images as well as text.

So what has DeepSeek done, and how did it do it?

What DeepSeek did

In December, DeepSeek released its V3 model. This is a very powerful “standard” large language model that performs at a similar level to OpenAI’s GPT-4o and Anthropic’s Claude 3.5.

While these models are prone to errors and sometimes make up their own facts, they can carry out tasks such as answering questions, writing essays and generating computer code. On some tests of problem-solving and mathematical reasoning, they score better than the average human.

V3 was trained at a reported cost of about US$5.58 million. This is dramatically cheaper than GPT-4, for example, which cost more than US$100 million to develop.

DeepSeek also claims to have trained V3 using around 2,000 specialised computer chips, specifically H800 GPUs made by NVIDIA. This is again much fewer than other companies, which may have used up to 16,000 of the more powerful H100 chips.

On January 20, DeepSeek released another model, called R1. This is a so-called “reasoning” model, which tries to work through complex problems step by step. These models seem to be better at many tasks that require context and have multiple interrelated parts, such as reading comprehension and strategic planning.

The R1 model is a tweaked version of V3, modified with a technique called reinforcement learning. R1 appears to work at a similar level to OpenAI’s o1, released last year.

DeepSeek also used the same technique to make “reasoning” versions of small open-source models that can run on home computers.

This release has sparked a huge surge of interest in DeepSeek, driving up the popularity of its V3-powered chatbot app and triggering a massive price crash in tech stocks as investors re-evaluate the AI industry. At the time of writing, chipmaker NVIDIA has lost around US$600 billion in value.

How DeepSeek did it

DeepSeek’s breakthroughs have been in achieving greater efficiency: getting good results with fewer resources. In particular, DeepSeek’s developers have pioneered two techniques that may be adopted by AI researchers more broadly.

The first has to do with a mathematical idea called “sparsity”. AI models have a lot of parameters that determine their responses to inputs (V3 has around 671 billion), but only a small fraction of these parameters is used for any given input.

However, predicting which parameters will be needed isn’t easy. DeepSeek used a new technique to do this, and then trained only those parameters. As a result, its models needed far less training than a conventional approach.

The other trick has to do with how V3 stores information in computer memory. DeepSeek has found a clever way to compress the relevant data, so it is easier to store and access quickly.

Computer screen with deepseek logo and the words into the unknown underneath it.
DeepSeek has shaken up the multi-billion dollar AI industry. Robert Way/Shutterstock

What it means

DeepSeek’s models and techniques have been released under the free MIT License, which means anyone can download and modify them.

While this may be bad news for some AI companies – whose profits might be eroded by the existence of freely available, powerful models – it is great news for the broader AI research community.

At present, a lot of AI research requires access to enormous amounts of computing resources. Researchers like myself who are based at universities (or anywhere except large tech companies) have had limited ability to carry out tests and experiments.

More efficient models and techniques change the situation. Experimentation and development may now be significantly easier for us.

For consumers, access to AI may also become cheaper. More AI models may be run on users’ own devices, such as laptops or phones, rather than running “in the cloud” for a subscription fee.

For researchers who already have a lot of resources, more efficiency may have less of an effect. It is unclear whether DeepSeek’s approach will help to make models with better performance overall, or simply models that are more efficient.

Authors: Tongliang Liu, Associate Professor of Machine Learning and Director of the Sydney AI Centre, University of Sydney

Read more https://theconversation.com/deepseek-how-a-small-chinese-ai-company-is-shaking-up-us-tech-heavyweights-248434

Interstate Car Transporter Urges Buyers to Book Early

As the conflict in the Middle East continues to put increasing pressure on local fuel supply, Australian transport companies are experiencing increasi...

Digital Minimalism for Business Owners: Fewer Tools, Better Systems

Be honest. How many apps are open right now? One for scheduling, another for invoices, a third for customer notes, plus a spreadsheet someone email...

The Importance Of Proactive NDIS Renewal Preparation For Sustaining Your Provider Business

Your NDIS renewal notice is not a signal to start preparing. By the time it arrives, preparation should already be well underway. For new providers, s...

Why Fire Extinguisher Testing in Sydney Is Becoming a Records Game, Not Only a Maintenance Job

A fire extinguisher used to feel like one of the simpler parts of building safety. It hung on the wall, wore a service tag, and sat there quietly unle...

The Switchboard Upgrade Question Every Melbourne Renovator Should Ask Before the Walls Close Up

Renovations have a funny way of making people think on surfaces first. Splashback, stone, joinery, tapware, paint. Fair enough too. That is the exciti...

Winter Sanitation Gaps in Parramatta Kitchens: A Hidden Pest Risk

Winter brings a host of changes to our homes, from the chill in the air to the cozy warmth indoors. However, this season also introduces sanitation ch...

When to Seek Advice from Employment Lawyers in Melbourne

Australian employment law is detailed and, at times, complex, with rights and obligations that aren't always obvious to employees or employers witho...

7 Benefits of Professional Gutter Cleaning for Australian Homeowners

Gutters aren't exactly glamorous. They sit up there on the edge of your roof, doing their job quietly - until they stop working. Clogged, overflowing ...

Pipe Floats Strengthening Pipeline Performance In Demanding Environments

Pipelines often travel through environments that are anything but predictable, water currents shift, terrain changes, and materials keep moving unde...

Why Ceiling Fans Are Essential For Comfort, Efficiency, And Modern Living

Creating a comfortable indoor environment is not just about temperature; it is about how air moves, how a room feels, and how efficiently energy is ...

Why Duct Cleaning In Melbourne Is A Smart Investment For Healthier Living Spaces

Behind your walls, ceilings, and vents lies a network quietly working every day to keep your home comfortable. Yet over time, this system can become...

Disability Service Providers Supporting Inclusive And Independent Living

Finding the right support system can feel like assembling a puzzle where every piece must fit just right. For individuals and families navigating di...

A Beginner's Guide to Owning a Caravan in Australia

Owning a caravan opens up a style of travel that's hard to match for freedom and flexibility. However, for those just starting out, the process of c...

Preparing Your Air Conditioner for Summer: What Most Homeowners Overlook

As temperatures rise, many homeowners switch on their air conditioning for the first time in months — only to find it’s not performing the way i...

What Actually Adds Value to Properties in Newcastle

Newcastle has seen steady growth over the past few years, with more buyers looking beyond Sydney for lifestyle, space, and long-term value. As dema...

What is Design and Build in Construction?

Imagine you’re about to start a new construction project, maybe it’s a custom home or a commercial building. You’ve got the idea, the land, an...

Commercial roof leak detection: why early action protects your building

Water ingress is one of the most disruptive and costly issues facing commercial properties. For property managers and facilities teams, even a minor...

Custom Photo Frames: Turning Everyday Moments into Lasting Displays

Photos capture moments, but how you display them determines how they’re experienced every day. A meaningful photograph deserves more than a generi...