CN.AI leads emerging market for AI synthetic data processing

14:07 October 28, 2022

By Kim Boram

SEOUL, Oct. 28 (Yonhap) -- In recent years, artificial intelligence (AI)-assisted diagnosis based on endoscopic and computed tomography images has emerged in many medical fields, including the detection of gastric cancer, one of the leading causes of cancer-related deaths in South Korea.

But it is hard to secure enough high-quality endoscopic images from real clinical cases to train an AI diagnosis program, as gastric cancer appears in many different locations and forms. Privacy issues also matter.

This photo provided by CN.AI shows its CEO Lee Won-seop. (PHOTO NOT FOR SALE) (Yonhap)

Lee Won-seop, CEO of CN.AI Inc., a South Korean startup that generates synthetic data for AI, said artificially generated information is the key to solving the data shortage.

"Having the right and enough data is the most important and challenging part of building AI," he said in an interview with Yonhap News Agency earlier this week. "My company creates synthetic data based on statistics of the original to help companies collect quality data for their AI engines."

Lee, who started his engineering career at Samsung Electronics Co. about 10 years ago, cited his company's project to design an AI-powered gastric cancer diagnosis program with the Samsung Medical Center a year ago.

He had received around 5,000 endoscopic images covering 13 sections of the stomach, far short of the 200,000 images required to build the system. And some sections had no data at all.

To make up for the shortage, his company digitally generated thousands of necessary images of lesions in gastric tissues.

"We've collected image data for about one year. We needed images both with cancer and without cancer, and we wanted enough data for each section," explained the 36-year-old. "We filled in the blanks with synthetic data."

This image provided by CN.AI highlights its synthetic data process business. (PHOTO NOT FOR SALE) (Yonhap)

Synthetic data refers to information that is artificially generated by computer simulations or algorithms as an alternative to real-world data.
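
As a rough illustration of the idea of generating data "based on statistics of the original," the minimal sketch below measures the mean and standard deviation of each column in a small set of real records, then samples new records from normal distributions with those parameters. It is not CN.AI's method, which the article does not describe; the function names (fit_column_stats, sample_synthetic) and the sample values are hypothetical, and image data such as endoscopic photos would call for far more sophisticated generative models.

```python
import random
import statistics


def fit_column_stats(rows):
    """Return a (mean, stdev) pair for each column of a list of numeric rows."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample_synthetic(stats, n, seed=0):
    """Draw n synthetic rows, sampling each column independently
    from a normal distribution with the fitted mean and stdev."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu, sigma in stats] for _ in range(n)]


# Hypothetical "real" measurements (made-up values, for illustration only).
real_rows = [
    [5.1, 120.0],
    [4.8, 115.0],
    [5.5, 130.0],
    [5.0, 118.0],
    [5.3, 125.0],
]

stats = fit_column_stats(real_rows)
synthetic_rows = sample_synthetic(stats, n=1000)
```

Because each column is sampled independently, this sketch preserves per-column averages and spread but not the relationships between columns, one reason practical synthetic data tools rely on richer models.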

It has been welcomed by a variety of fields, especially by AI engineering, as collecting quality data from the real world is complicated, expensive and time-consuming.

The rise of autonomous vehicles put a spotlight on the synthetic data industry a few years ago, as digitally generated driving scenarios are considered essential to building a safe autonomous driving program.

South Korea has also seen the sector boom, and CN.AI, launched in 2019, was the first mover. It was the only company in the industry when it started operations three years ago, but now it has about five rivals in the country.

"We don't only generate synthetic data but also program AI solutions for our clients," Lee said. "Some big companies want just synthetic data, but most want us to design their AI engine using synthetic data."

His company posted 1.3 billion won (US$918,000) in sales in 2020 and 1.4 billion won in 2021. This year, the number is predicted to rise to 1.8 billion won.

It had nine business partners last year and has 35 this year, ranging from large companies and government institutions to medical centers.

Lee said his company is now eyeing the fast-growing global synthetic data market, which is forecast to expand to $26.1 billion in 2024.

"We are planning to go overseas," he said. "We are working on establishing a branch in Silicon Valley and have hired the branch president to attract investors."

The corporate image of CN.AI provided by the company (PHOTO NOT FOR SALE) (Yonhap)

brk@yna.co.kr
(END)
