In the not-so-distant future, machine learning may help civil-rights agencies predict who could face workplace discrimination.

Samuel Christopher Haffer, the first chief data officer hired by the Equal Employment Opportunity Commission, said his goal is to create a “distant early warning system,” examining things like “intersectionality,” or overlapping combinations of characteristics such as race, gender, or ethnicity. Machine learning, the algorithmic study of past experiences to optimize or predict future ones, will have a lot to do with that.

Hypothetically, this could mean the agency would be able to flag a particular group of people in a specific industry who would be susceptible to discrimination. The agency would then be able to target outreach to those workers to make them aware of their rights, he said.

“That team of social scientists and data scientists will be looking at the treasure trove of data that we have here at the agency to look at trends and patterns at the enterprise level,” Haffer told Bloomberg Law in a recent interview.

The machine learning tool could lead to more discrimination complaints filed with the agency, if the algorithm-based outreach reaches more people who may be subjected to bias. It probably won’t be relied on once litigation is pursued, however, as there’s some debate within courts and agencies over whether statistics alone, without additional evidence from allegedly aggrieved workers, can prove bias claims.

More Collaboration

The EEOC also is planning to partner with other federal agencies, such as the U.S. Census Bureau, Haffer said. The EEOC has memorandums of understanding with divisions within the departments of Labor, Health and Human Services, Justice, and Homeland Security, among others.

The agency hopes to better illustrate and understand the threats workers face in order to accomplish its mission “to prevent and remedy,” he said.

For example, the Census Bureau collects more focused information on race and ethnicity, organized by industry, than the EEOC does. The bureau also examines information like English-language proficiency, something that could aid outreach efforts. The more focused the information, the better the understanding of the threat.

“If we started identifying things at the embers phase,” the agency can try to prevent “the next big fire,” Haffer said. “I really don’t think there is enough effort put into other ways to use other federal data sources.”

The “early warning system” is still three to five years away, he said.

A First: Available Employer Information

The agency fields multiple requests for EEOC data on a daily basis, but those requests often ask the agency to sort or analyze the data in a tailored way, Haffer said. To cut down on such special requests, the EEOC will make employer information reports, or EEO-1 surveys, publicly available for the first time by the end of 2019, he said.

“People will be able to download those files and do the slices and dices of the data the way they want to do it,” he said.

The report contains sensitive workplace information, but Haffer said he’s hired people who specialize “in creating public use files for a number of federal agencies.”

“If we could provide a public use data set at the lowest use level that we could, while maintaining all of the various privacy protections that we have in place,” members of the public could “just do the analysis themselves,” he said.

Enterprise and Investigative Analytics Teams

Day to day, the data teams focus on aiding individual investigations of charges, Haffer said. They will also help streamline information requests from claimants and identify discrepancies, he said.

They want to “minimize the burden of the respondent” in the request for information process, he said.

Overall, the team has successfully “started to transition to a 21st century data analytics” outfit in a relatively short period of time, he said.

“In federal time, this has been amazing,” Haffer said.