
Sama aims to bring greater equality to crowd-labeling of datasets with new $70M

Article by – VentureBeat

Sama, a company providing data to train machine learning systems, has raised $70 million in a Series B round led by CDPQ with participation from First Ascent Ventures, Salesforce Ventures, Vistara Capital Partners, and existing investors. CEO Wendy Gonzalez says that the company will use the funding to grow its platform with new products that “enable teams to manage the complete AI lifecycle.”

Data scientists spend about 45% of their time on data preparation tasks, including loading and cleaning data, according to Anaconda. A separate report from Alation found that 97% of data leaders have suffered the consequences of ignoring data, whether by missing out on new revenue opportunities, poorly forecasting performance, or making bad investments. Yet another study, this one by MIT Technology Review Insights and commissioned by Databricks, reveals that machine learning’s business impact is limited largely by challenges in managing its end-to-end lifecycle.

Founded by Leila Janah, San Francisco, California-based Sama (formerly Samasource) developed its first relationships with partner delivery centers in 2008, focusing on data entry, sentiment analysis, and data transcription. In 2009, the company launched the initial version of its technology platform, SamaHub, and embarked on a slew of commercial projects, including providing images and annotations used by Microsoft to build out the company’s Xbox Kinect.

“Janah believed that giving meaningful, living-wage work was the best way to permanently lift people out of poverty,” Gonzalez told VentureBeat via email. “To date, we’re the only AI training data provider with a responsible training and employment program that provides actionable career skills for underserved communities to bring us closer to a more equitable future of AI.”