The Learning Curve, Part 4: A New AI Model and an Evolving Language

As Samsung continues to pioneer premium mobile AI experiences, we visit Samsung Research centers around the world to learn how Galaxy AI1 is enabling more users to maximize their potential. Galaxy AI now supports 16 languages, so more people can expand their language capabilities, even when offline, thanks to on-device translation in features such as Live Translate2, Interpreter, Note Assist and Browsing Assist. But what does AI language development involve? Last time we visited Vietnam to learn about preparing the data that is used to train AI models. This time, we’re seeing how teams made Galaxy AI a unique offering for both the Chinese mainland and Hong Kong.

The rapid growth in AI tools that use large language models (LLM) has been seen worldwide, and China is no exception. With Baidu’s ERNIE Bot and Meitu’s MiracleVision emerging as popular choices in China, Samsung R&D Institute China partnered with both companies to help build Galaxy AI features for the country.

Samsung R&D Institute China in Guangzhou (SRC-G) and Beijing (SRC-B) worked to ensure Mandarin speakers in China had the same Galaxy AI experience as other users around the world, despite the back-end technology looking very different. The team took advantage of the dedicated resources of Chinese dialects from third-party partners and built a unique Galaxy AI solution for China.

“We have the advantage of blending global best practices with China’s local practices, as well as creating new features and constantly improving them through daily communication with Chinese consumers,” says Hairong Zhang, Software Innovation Group Leader at SRC-G. “With rich development experience from the Galaxy S24, I’m proud of how our team cooperated with local Chinese AI companies such as Baidu and Meitu to provide a solution that resonates in China.”

Learning-Curve_China_main2

At the beginning, the teams had to acclimate to each other’s working styles and iron out the initial kinks of information asymmetry. Daijun Zhang, Head of SRC-B, established a task force to ensure the project followed the development schedule and moved quickly toward its goals.

Learning-Curve_China_main3

Thanks to the Beijing team’s experience in generating large-scale models and successful collaboration with third-party partners, all the generative AI features were successfully launched in China. The result is a solution that has local relevance and market-specific features such as Touch to Search.

Expanding on Chinese to Develop for the Cantonese Dialect

Chinese for mainland China (Mandarin) arrived on Galaxy AI with the launch of the Galaxy S24 in January 2024. But the job for Samsung R&D Institute China was far from finished. The team was also tasked with developing the AI model for Chinese in Hong Kong (Cantonese), a dialect that builds on the work already carried out for Mandarin but brings an entirely new set of language features to address.

In developing for Cantonese, the China R&D team faced major cultural challenges that it needed to respond to in order to fully support localization for the market. The first cultural phenomenon is the two sets of systems for writing and speech. Hong Kong locals use grammar and expressions similar to Mandarin when writing but adopt a completely different colloquial grammar when communicating daily. Also, Cantonese has nine tones for pronunciation, whereas Mandarin has four.

Another cultural phenomenon is that the Cantonese dialect itself develops with the times. Add to that the fact that people often blend Cantonese and English into conversations, and it’s clear to see why it was complicated to create test cases and validate language packs.

“Cantonese is a very unique dialect that varies in different Cantonese-speaking regions,” says Jing Li, who leads the operation for testing the Cantonese AI solution. “Some of the slang, phrases, vocabulary and even the tones are varied from place to place. Therefore, we conducted a large amount of work in verifying the Hong Kong-specific data, as well as proofreading tens of thousands of relevant test cases.”

Learning-Curve_China_main7

With these complexities in mind, SRC-G and SRC-B worked together to support a deep code mix using a mixture of Cantonese and English for speech recognition, simultaneously supporting both written and spoken expressions in machine translation and reflecting current pronunciations in speech synthesis.

Cultural Impact of Communication

When Galaxy AI launched the Chinese (Hong Kong) language option, the customer feedback showed that the hard work of the Samsung R&D team was justified.

For both the Chinese mainland and Hong Kong, Samsung’s Galaxy AI activities show the importance of a global brand having a local presence and expertise, as well as the power of open collaboration with other organizations. In Hong Kong, Cantonese is a key part of the cultural identity of those who live there. That’s why it was so important for the team to get the AI language model right.

“Language and communication are crucial in every region and in all walks of life,” says Henry Wat, Heads of Engineering Group at Samsung Electronics Hong Kong. “No matter the language, any tool that helps people communicate is invaluable. I believe our work is meaningful.”

Learning-Curve_China_main8

In the next episode of The Learning Curve, we will head to Brazil to see how a team works across cultures and borders to bring Galaxy AI to more people.

1   Galaxy AI features by Samsung will be provided for free until the end of 2025 on supported Samsung Galaxy devices.
2   Samsung account log-in required. Calls must be made using the native Samsung phone app. Samsung does not make any promises, assurances or guarantees as to the accuracy, completeness or reliability of the output provided by AI features.

Leave a Reply

Your email address will not be published. Required fields are marked *