China’s DeepSeek Artificial Intelligence Model Shocks the AI World

By Sonny Iroche

After completing my Senior Academic Fellowship at the African Studies Centre at the University of Oxford in 2023, I decided to pursue a one-year postgraduate program in Artificial Intelligence for Business at the Saïd Business School of the University of Oxford. This decision has significantly intensified my interest in AI, prompting me to author several articles on the subject.

In my piece titled “The Race for AI Supremacy,” published in Thisday newspapers on May 2, 2024, I noted, “AI has emerged as a transformative force in the modern world, revolutionizing industries and reshaping economies. In the rapidly evolving AI landscape, leading nations such as the USA, China, the UK, India, the EU, and Israel have made substantial advancements in AI development. Consequently, the regulation of AI has become essential to ensure its responsible and ethical application…”.

The launch of ChatGPT by OpenAI in November 2022 positioned the USA as a clear frontrunner in the AI arena, especially alongside the major chip manufacturer, Nvidia. The foremost AI companies are primarily based in the USA. However, this status quo appeared to shift in January 2025, when the AI community was taken by surprise by DeepSeek, a hitherto little-known Chinese firm, and its R1 model, an open-source platform. Following the debut of the DeepSeek R1 model, Nvidia’s share price fell by about 17% in a single day, wiping close to $600 billion off its market value, the largest one-day loss of market value by any company in the history of the US stock market.

In the February 1st-7th 2025 edition of The Economist, it was reported that: “…DeepSeek’s origins lie in an effort to improve High-Flyer’s algorithms. In 2019 the firm invested 200 million yuan to set up a separate unit to develop its own deep-learning platform, called “High-Flyer 1”. The fund poured 1 billion yuan into the effort in 2021 in order to launch a second iteration armed with 10,000 of Nvidia’s A100 graphics-processing units. This made High-Flyer an outlier: at the time just four other firms in China held such large arsenals of powerful chips, all of which were tech giants such as Alibaba. DeepSeek was made a standalone company in 2023. It delivered its first jolt to the market in May last year, when it released an ultra-cheap chatbot based on its V2 model”.

“…DeepSeek’s new R1 model, which has shocked the West, suggests it is making progress. The company says it cost less than $6 million to train, a tiny fraction of comparable models from firms such as OpenAI, maker of ChatGPT. Sam Altman, OpenAI’s boss, has called R1 “impressive”. There is also speculation that DeepSeek has trained its models by studying the results of American ones, a process known as “distillation”. OpenAI has said it has evidence that points to DeepSeek distilling its models, in violation of its terms of service”.

Some AI analysts have dismissed such claims as a case of sour grapes. I will neither join those who are quick to dismiss OpenAI’s accusations nor lend credence to them. Rather, I would opine that if OpenAI has substantial evidence to that effect, it should seek redress in a court of competent jurisdiction.

Now, a number of readers outside the AI field may be wondering what distillation is all about. Let me try to explain it in the simplest terms possible.

Distillation is a powerful technique in AI design and development that allows practitioners to leverage the strengths of large models while creating more efficient alternatives. By understanding and effectively implementing the distillation process, developers can build models that maintain high accuracy while being suitable for real-world applications where computational and financial resources are a concern.

Distillation in AI design and development can be likened to a process where a smaller, more efficient model (often called a “student” model) is trained to replicate the performance of a larger, more complex model (known as the “teacher” model). This technique is especially useful in scenarios where deploying large models is impractical due to resource constraints, such as memory, computation power, or latency requirements.
Below is a detailed description of the distillation process and its various aspects:

• Understanding the Models

– Teacher Model: This is typically a large, high-capacity model (like a deep neural network) that has been trained on a specific task and achieves high performance metrics. It captures complex patterns and relationships in the data.

– Student Model: This is a smaller, more efficient model that aims to approximate the performance of the teacher model while being less resource-intensive. The student model can be a smaller neural network or a different architecture altogether.

• Data Preparation

– Dataset Selection: The dataset used for distillation should ideally be the same as, or similar to, the one used to train the teacher model. This ensures that the student model learns from the same data distribution.

– Input Processing: Data preprocessing steps (normalization, augmentation, etc.) are applied to the input data to maintain consistency between the teacher and student models.

• Output Generation from the Teacher Model

– Soft Targets: Instead of using the hard labels (e.g., class labels) from the original dataset for training the student, the outputs (predictions) of the teacher model are used. These outputs often include probabilities for each class, which provide more nuanced information about the data distribution.

– Logits Extraction: The logits (raw output scores before applying softmax) from the teacher model can be used to generate soft targets, which provide richer information about class relationships and can help the student model generalize better.

• Training the Student Model

– Loss Function: The training process typically involves a modified loss function. A common approach is to use a combination of two losses:
  – Distillation Loss: This measures how closely the student model’s output matches the teacher model’s soft targets. It often uses Kullback-Leibler (KL) divergence or cross-entropy loss.
  – Hard Target Loss: This measures how well the student model predicts the true labels from the dataset. This is typically a standard cross-entropy loss.

– Temperature Scaling: A temperature parameter is used during the softmax computation of the teacher model’s outputs (and, in the usual formulation, the student model’s outputs as well). Higher temperatures soften the probability distribution, allowing the student model to learn from the relative differences between classes rather than absolute probabilities.

– Training Procedure: The student model is trained using the combined loss function, iteratively updating its weights to minimize this loss. The training process may involve techniques such as backpropagation and gradient descent.

• Evaluation and Fine-tuning

– Performance Evaluation: After training, the student model is evaluated on a validation dataset to assess its performance. Key metrics might include accuracy, precision, recall, and F1 score.

– Hyperparameter Tuning: Based on evaluation results, hyperparameters (like learning rate, batch size, etc.) can be adjusted to improve the student’s performance.

– Fine-Tuning: In some cases, additional fine-tuning of the student model may be performed to further enhance its performance on specific tasks or datasets.

• Deployment

– Model Compression: The trained student model is often more compact and faster to execute than the teacher model, making it suitable for deployment in environments with limited resources, such as mobile devices or edge computing.

– Inference Optimization: Techniques like quantization and pruning can be applied to further optimize the student model for inference, reducing memory footprint and increasing speed.

• Monitoring and Iteration

– Real-World Performance Monitoring: Once deployed, the performance of the student model should be monitored in real-world applications. Any drift in data distribution or performance can necessitate re-training or further distillation.

– Iterative Improvement: The distillation process can be iteratively refined by using feedback from real-world performance, adjusting the architecture of the student model, or retraining with updated datasets.
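
For readers who would like to see the core idea in code, the soft targets, temperature scaling, and combined loss described above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the method used by DeepSeek, OpenAI, or any other company: the function names, the example logits, the temperature of 2.0, and the weighting factor alpha of 0.5 are all hypothetical, and multiplying the soft loss by the square of the temperature is a common convention rather than a requirement.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores (logits) into probabilities; higher temperature softens them."""
    scaled = [z / temperature for z in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q): how far the student's distribution q is from the teacher's p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(probs, true_index):
    """Standard hard-label loss: penalise low probability on the true class."""
    return -math.log(probs[true_index])


def distillation_loss(teacher_logits, student_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of the distillation (soft) loss and the hard target loss.

    temperature and alpha are illustrative defaults, not recommended values.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Scaling by temperature squared is a common convention (an assumption here).
    soft_loss = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    hard_loss = cross_entropy(softmax(student_logits), true_index)
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Hypothetical logits for a three-class problem; the true class is index 0.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
print("teacher soft targets:", softmax(teacher, temperature=2.0))
print("combined loss:", distillation_loss(teacher, student, true_index=0))
```

In a real training run, this loss would be computed over many examples and minimized by gradient descent, usually with a deep-learning library rather than plain Python.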

Having briefly highlighted the AI supremacy race between the USA and China, it will be interesting to turn attention to the state of AI development in Africa.

The State of AI Preparedness and Development in Africa: Challenges and Opportunities

The potential of Artificial Intelligence (AI) to transform economies and countries is immense, yet many African nations, particularly the leading economies on the continent such as Nigeria, South Africa, Morocco, Egypt, Algeria, Kenya, Ethiopia, Ivory Coast, and Rwanda, face significant challenges in their preparedness for, and adoption of, AI. The extreme lack of resources, infrastructural deficits, unreliable data, and inadequate computational power severely hinder the development and adoption of AI technologies on the continent. Let me delve into some of these challenges, highlight the current state of AI capabilities in these countries, and suggest how African nations can leverage their limited pool of trained professionals to drive an AI revolution. A few African countries, such as Egypt, Morocco, and Algeria, may be better equipped than the others for an AI revolution, but their contributions to world AI technologies have been negligible, to say the least.

These are some of the challenges to AI development in Africa.

Firstly, Lack of Resources:
Many African countries struggle with limited financial resources that can be allocated to AI research and development. This lack of funding affects various levels of AI initiatives, from academic research to the establishment of startups focused on AI solutions. Governments often prioritize immediate socio-economic challenges—such as healthcare, education, and infrastructure—over long-term investments in technology. As a result, AI initiatives often lack the necessary financial backing to thrive.

Secondly, Inadequate Infrastructure:
The infrastructural deficit in many African countries poses a significant barrier to AI development. Reliable electricity, high-speed internet, and modern computing facilities are prerequisites for AI research and deployment. Unfortunately, many regions still experience frequent power outages and have limited access to the internet, which hampers the ability to conduct data-intensive AI research or run complex algorithms effectively. Without a robust infrastructure, the potential for leveraging AI to address local challenges is severely diminished.

Thirdly, Data Reliability and Availability:
High-quality, reliable data is the lifeblood of AI systems. However, many African countries suffer from a lack of comprehensive data collection mechanisms, resulting in poor data quality and availability. Government databases may be underdeveloped, and private sector data collection may not be standardized or systematically managed. This lack of reliable data significantly hampers the training of AI models, which require vast amounts of high-quality data to function effectively.

Fourthly, Limited Compute Power:
AI development relies heavily on advanced computational power, often provided through powerful GPUs and cloud computing resources. Many African countries lack access to these essential technologies, limiting their ability to develop sophisticated AI algorithms and conduct meaningful research. The high costs associated with acquiring cutting-edge computing infrastructure further exacerbate these challenges, making it difficult for local researchers and startups to compete on a global scale.

Fifthly, Insufficient Support for AI Research:
The absence of a supportive ecosystem for AI research (comprising funding, mentorship, and collaboration opportunities) hinders the growth of AI capabilities. Research institutions often lack the necessary frameworks to attract and retain talent, resulting in brain drain as trained professionals migrate to countries with better opportunities. This issue is particularly pronounced in leading economies like Nigeria and South Africa, where, despite the available pool of talent, the environment for research and innovation can be stifling.

How can Africa scale its AI involvement?

Leveraging Trained Professionals
Despite these challenges, Africa has a unique opportunity to capitalize on its growing number of citizens who have been trained in reputable AI research institutions and universities worldwide, such as the University of Oxford, MIT, Imperial College, Cambridge, and the University of Toronto, just to name a few. These individuals possess the skills and knowledge necessary to drive the continent’s AI initiatives forward. Here are ways to leverage this talent pool:

Encouraging Return Migration:
African governments can create attractive conditions for trained professionals to return and contribute to local AI ecosystems. This could involve offering competitive salaries, research grants, and tax incentives for those who establish AI startups or collaborate with local universities and research institutes.

Building Academic Partnerships:
Collaboration between local universities and top-tier institutions abroad can facilitate knowledge transfer and create research opportunities. Programs that allow for exchange visits, joint research projects, and workshops can enhance the skill sets of local researchers and elevate the quality of AI research conducted in Africa.

Establishing AI Incubators and Hubs:
Creating innovation hubs and incubators that specifically focus on AI can foster collaboration among researchers, startups, and government agencies. These hubs can provide the necessary resources, mentorship, and networking opportunities to accelerate AI development and support the commercialization of innovative solutions.

Government Support and Policy Frameworks:
The governments of leading economies like Nigeria, South Africa, Morocco, Algeria, Kenya, and Ivory Coast must develop comprehensive policies that prioritize AI development. This includes investing in research funding, establishing regulatory frameworks that encourage innovation, and creating public-private partnerships to facilitate AI projects.

Focusing on Local Challenges:
By directing AI research towards solving local challenges—such as healthcare delivery, agriculture, security, banking & financial services, and urban planning—African nations can ensure that AI technologies are relevant and beneficial to their populations. This localized approach can also attract funding and support from international organizations and investors interested in impactful projects.

The state of AI preparedness in many African countries, particularly among the leading economies in the region, is characterized by significant challenges stemming from inadequate resources, infrastructure deficits, unreliable data, and limited computational power. However, by leveraging the skills of trained professionals and fostering an environment conducive to AI research and development, these nations can position themselves at the forefront of the AI revolution. With strategic investments, supportive policies, and a focus on local challenges, Africa can harness AI to drive economic growth, enhance public services, and improve the quality of life for millions. The potential is vast, but realizing it will require concerted efforts and collaboration between governments, academia, and the private sector.

NB: Sonny Iroche is a Senior Academic Fellow, African Studies Centre (2022-2023), and Postgraduate, AI for Business, Saïd Business School, University of Oxford, UK.
LinkedIn: http://linkedin.com/in/sonnyiroche

The New Diplomat: https://newdiplomatng.com/