Reporter Ryan Gabrielson writes for California Watch, which was founded by the Center for Investigative Reporting. On Saturday, I interviewed Gabrielson and asked him how he used databases for his most recent story, about mishandled sexual abuse cases in California developmental centers.
Gabrielson said that the investigation started with a tip about overtime abuse by the police force at the developmental centers. He said he wanted to examine the police force's caseload to understand how it operates.
One of the difficulties was that he could not request the criminal investigation files, or even a log of them. Those files are confidential in California, he said, so he had to find an outside source for the information.
"The Department of Developmental Services that runs the police force is a nightmare to deal with in terms of public records," Gabrielson said. "They stall, they deny, and they go to law. It's a mess."
The department also prohibited its employees from speaking to the press, Gabrielson said. He said that all of the records about the internal cases came from people inside the system who risked their jobs to provide him with the data.
The main difficulty with using large data sets is cleaning them, Gabrielson said. The process sometimes takes a couple of weeks, he said, and cleaning the data takes far longer than analyzing it.
Gabrielson said that students who want to become reporters should take a statistics class in college. He said that it is important to understand the methods behind the data. Students also need to learn the software, Gabrielson said. They should definitely know Excel, he said, and a statistical analysis package would help as well.