Meet South Korea’s LLM Powerhouses: HyperClova, AX, Solar Pro, and More

South Korea is rapidly establishing itself as a key innovator in large language models (LLMs), driven by strategic government investments, corporate research, and open-source collaborations to create models tailored for Korean language processing and domestic applications. This focus helps mitigate dependencies on foreign AI technologies, enhances data privacy, and supports sectors like healthcare, education, and telecommunications.

Government-Backed Push for Sovereign AI

In 2025, the Ministry of Science and ICT initiated a 240 billion won program, selecting five consortia—led by Naver Cloud, SK Telecom, Upstage, LG AI Research, and NC AI—to develop sovereign LLMs capable of operating on local infrastructure.

Regulatory advancements include the Ministry of Food and Drug Safety’s guidelines, issued in early 2025, for approving text-generating medical AI — the first such regulatory framework globally.

Corporate and Academic Innovations

SK Telecom introduced AX 3.1 Lite, a 7 billion-parameter model trained from scratch on 1.65 trillion multilingual tokens with a strong Korean emphasis. Relative to larger models, it achieves approximately 96% performance on KMMLU for Korean language reasoning and 102% on CLIcK for cultural understanding, and it is available open source on Hugging Face for mobile and on-device applications.

Naver advanced its HyperClova series with HyperClova X Think in June 2025, enhancing Korean-specific search and conversational capabilities.

Upstage’s Solar Pro 2 stands as the sole Korean entry on the Frontier LM Intelligence leaderboard, matching the performance of much larger international models at a fraction of their size.

LG AI Research launched Exaone 4.0 in July 2025, a 30 billion-parameter model that performs competitively on global benchmarks.

Seoul National University Hospital developed Korea’s first medical LLM, trained on 38 million de-identified clinical records, scoring 86.2% on the Korean Medical Licensing Examination compared to the human average of 79.7%.

Mathpresso and Upstage collaborated on MATH GPT, a 13 billion-parameter model that surpasses GPT-4 on mathematical benchmarks (0.488 accuracy versus 0.425) while using significantly fewer computational resources.

Open-source initiatives like Polyglot-Ko (ranging from 1.3 to 12.8 billion parameters) and Gecko-7B address gaps by continually pretraining on Korean datasets to handle linguistic nuances such as code-switching.

Korean developers emphasize efficiency, optimizing token-to-parameter ratios inspired by Chinchilla scaling to enable 7 to 30 billion-parameter models to compete with larger Western counterparts despite constrained resources.
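As a back-of-the-envelope illustration of this over-training strategy (the 20-tokens-per-parameter ratio below is the common Chinchilla rule of thumb, not a figure from the article), AX 3.1 Lite’s 1.65 trillion training tokens over 7 billion parameters work out to roughly 236 tokens per parameter — more than ten times the Chinchilla-optimal budget:

```python
# Rough Chinchilla-style check. Compute-optimal training uses roughly
# 20 tokens per parameter (the standard rule of thumb); Korean labs
# train small models well past that point to maximize capability.
# Model figures (7B params, 1.65T tokens) come from the article.

CHINCHILLA_TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(params: float) -> float:
    """Approximate compute-optimal token budget for a given model size."""
    return params * CHINCHILLA_TOKENS_PER_PARAM

ax_params = 7e9      # AX 3.1 Lite: 7 billion parameters
ax_tokens = 1.65e12  # trained on 1.65 trillion tokens

optimal = chinchilla_optimal_tokens(ax_params)  # 140B tokens
ratio = ax_tokens / ax_params                   # ~236 tokens per parameter

print(f"Chinchilla-optimal budget: {optimal / 1e9:.0f}B tokens")
print(f"Actual tokens per parameter: {ratio:.0f} (vs. ~20 optimal)")
```

Spending extra compute on tokens rather than parameters keeps inference cheap — a key consideration for the on-device and edge deployments these models target.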

Domain-specific adaptations yield superior results in targeted areas, as seen in the medical LLM from Seoul National University Hospital and MATH GPT for mathematics.

Progress is measured through benchmarks including KMMLU, CLIcK for cultural relevance, and the Frontier LM leaderboard, suggesting near-parity with advanced global systems.

Market Outlook

The South Korean LLM market is forecast to expand from USD 182.4 million in 2024 to USD 1,278.3 million by 2030, a 39.4% compound annual growth rate, fueled primarily by chatbots, virtual assistants, and sentiment analysis tools. Telecom firms’ integration of edge-computing LLMs supports reduced latency and stronger data security under initiatives such as the AI Infrastructure Superhighway.

South Korean Large Language Models Mentioned

| # | Model | Developer / Lead Institution | Parameter Count | Notable Focus |
|---|-------|------------------------------|-----------------|---------------|
| 1 | AX 3.1 Lite | SK Telecom | 7 billion | Mobile and on-device Korean processing |
| 2 | AX 4.0 Lite | SK Telecom | 72 billion | Scalable sovereign applications |
| 3 | HyperClova X Think | Naver | ~204 billion (est.) | Korean search and dialogue |
| 4 | Solar Pro 2 | Upstage | ~30 billion (est.) | General efficiency on global leaderboards |
| 5 | MATH GPT | Mathpresso + Upstage | 13 billion | Mathematics specialization |
| 6 | Exaone 4.0 | LG AI Research | 30 billion | Multimodal AI capabilities |
| 7 | Polyglot-Ko | EleutherAI + KIFAI | 1.3 to 12.8 billion | Korean-only open-source training |
| 8 | Gecko-7B | Beomi community | 7 billion | Continual pretraining for Korean |
| 9 | SNUH Medical LLM | Seoul National University Hospital | undisclosed (~15B est.) | Clinical and medical decision support |

These developments highlight South Korea’s approach to creating efficient, culturally relevant AI models that strengthen its position in the global technology landscape.

Sources:

https://www.cnbc.com/2025/08/08/south-korea-to-launch-national-ai-model-in-race-with-us-and-china.html

https://www.forbes.com/sites/ronschmelzer/2025/07/16/sk-telecom-releases-a-korean-sovereign-llm-built-from-scratch/

https://www.kjronline.org/pdf/10.3348/kjr.2025.0257

SK Telecom launches Korean-specific AI model as open source

https://huggingface.co/skt/A.X-3.1-Light

https://www.koreaherald.com/article/10554340

http://www.mobihealthnews.com/news/asia/seoul-national-university-hospital-builds-korean-medical-llm

https://www.chosun.com/english/industry-en/2024/05/03/67DRPIFMXND4NEYXNFJYA7QZRA/

https://huggingface.co/blog/amphora/navigating-ko-llm-research-1

https://www.grandviewresearch.com/horizon/outlook/large-language-model-market/south-korea

Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.


