Tuesday, March 15, 2022

New tool helps journalists more quickly sift through data dumps agencies send them in response to records requests

When journalists request documents from a government agency through the Freedom of Information Act or a state open-records law, the data often arrive in huge, unorganized batches. Sorting through them can be a daunting task for any journalist, especially those at smaller newspapers with fewer resources. A new tool could make the process easier, Paroma Soni reports for Columbia Journalism Review.

New York University journalism professor Hilke Schellmann teamed up with senior research scientist Mona Sloane, computer-science professor Julia Stoyanovich, and a team of graduate students at NYU’s Center for Data Science to develop Gumshoe, "an artificial-intelligence tool that uses natural language processing to sort through large caches of text documents and categorize them by relevance to the journalist’s main topic of investigation, reducing the time needed to sift through everything," Soni reports.

When users enter search terms, the tool can evaluate which documents or sections of documents are more likely to be relevant to the subject matter. "MuckRock, a nonprofit news site devoted to record requests, plans to integrate Gumshoe into its DocumentCloud platform, which is used by journalists for posting and reviewing public records," Soni reports. "The Gumshoe team developed the tool with an initial grant from the Center for Digital Humanities at NYU. A subsequent $200,000 grant, awarded last month by the Patrick J. McGovern Foundation, will enable the team to build out Gumshoe’s user interface and distribute the product widely. At the moment, the team is inviting journalists and newsrooms to test out the tool and help review/improve it."

MuckRock data and investigations editor Derek Kravitz told Soni that Gumshoe could be a critical resource for newsrooms: "Having this accessible when it’s needed might be the difference between some really important stories getting told and some stories never even being looked at."
