
(Yonhap Interview) CrowdWorks supports crowdsourcing data for AI

All News 10:00 June 11, 2019

By Kim Han-joo

SEOUL, June 11 (Yonhap) -- Data is the lifeblood of artificial intelligence (AI), needed to train machines to make accurate decisions on their own based on learning algorithms, but not many people know that humans process and label large sets of raw data to make the cutting-edge technology work.

For a computer to train itself to recognize a cat, for example, its AI algorithm must look at countless images manually labeled by humans, a time-consuming and expensive yet decisive step in applying AI.

To reduce costs and improve accuracy, CrowdWorks Inc. provides a crowdsourcing marketplace where freelance workers review data and add annotations for AI developers. In return, the workers earn a small amount of money for each completed task.

"In the world of AI, there is a famous saying, 'garbage in, garbage out,' which highlights the importance of accurately labeled data," CEO Park Mi-nu said in an interview with Yonhap News Agency at the company's office in Seoul. "AI works only when developers have high-quality data."

Park Mi-nu, CEO of South Korean startup CrowdWorks, smiles in this undated photo provided by the company. (PHOTO NOT FOR SALE) (Yonhap)

Park said growing demand for high-quality AI training data sets led him to adopt the crowdsourcing model, which allows an unspecified group of individuals to divide work on an open platform to achieve a cumulative result.

"Behind AI technologies are complicated data sets labeled by humans," Park said. "That is why I paid attention to the crowdsourcing model."

The 3-year-old startup is South Korea's first and biggest AI crowdsourcing platform, with a pool of 14,000 skilled workers, providing data for some 30 big-name clients, including the country's top portal operator, Naver Corp., and tech giant Samsung Electronics Co.

Its first mission was labeling thousands of images for SK Planet Co., the operator of local e-commerce platform 11st, for an AI-based system designed to recognize an item from a photo, he said.

CrowdWorks is currently working with Naver on its translation service Papago, which is powered by Neural Machine Translation (NMT) technology, a deep learning framework that learns from millions of examples. To facilitate the collaboration, the company's 30 employees work in Naver's D2 Startup Factory, an accelerator for tech startups.

The company said it aims to take a bigger share of the promising sector, as the domestic AI market is expected to reach 11 trillion won (US$9.26 billion) in 2021, with 1 trillion won up for grabs in AI data processing alone.

CrowdWorks stressed that it provides training data sets with near-perfect accuracy through its unique management know-how, helping its clients reduce the time and money spent on AI development.

Park said the company's strength lies in its skilled pool of participants and its rigorous protocols for data collection and processing, which guarantee the accuracy of its projects.

"We have some 500 so-called 'supervisors,' chosen from a pool of workers with high credibility, who examine each piece of labeled data at the final stage of the workflow," Park said. "We manage the entire process, which is what sets us apart from other similar companies overseas."

While Amazon's crowdsourcing data platform Mechanical Turk (MTurk), a major player on the global scene, has a vast network of registered workers doing data labeling and other simple chores, the Korean startup claims it is ready to tap into the global market with its competitive edge in imagery and voice data labeling.

With its sights set higher, the company plans to make a smooth transition from a startup to a scaleup through successful global expansion, the CEO said, setting a goal of 4 billion won in sales this year.

As part of those plans, CrowdWorks established a Japanese branch with a focus on labeling video data, considered a new growth engine for the AI market, and plans to make a foray into China and Vietnam later this year, Park said.

"We plan to expand our business overseas based on our competitiveness in imagery and voice data labeling," he said. "It is time for us to scale up."

