The 2023 State CIO Survey conducted by the National Association of State Chief Information Officers (NASCIO) asked IT leaders in 49 states to rate their data governance maturity for enterprise information management. A majority, 69 percent, reported being only in the "early stages" of building their governance structure. Although 27 percent believe their state's governance is "mature," NASCIO Executive Director Doug Robinson is not convinced.
"I don't believe a quarter of states have mature data governance and data management, so those self-assessments were probably optimistic," Robinson said. "I've had honest CIOs say to me, 'Well, I thought we were probably better off than some other states.'"
Robinson said poor data quality has always been a challenge in the public sector because of the fragmented structure of government operations, but as agencies look to incorporate generative AI into their systems, the stakes of messy data are higher than ever. Bad data fed into an AI initiative can lead to inequitable allocation of resources and funds wasted on inefficient programs.
In addition to underdeveloped data governance, the NASCIO study found that 84 percent of states surveyed do not have a formal data literacy or data proficiency program in place, which is critical to data quality. That gap raises concerns about whether data from multiple agencies can be effectively integrated into one tool without negative consequences.
Robinson said both gaps need to be addressed before states can build trust in their data and make generative AI tools effective.
To unlock the full potential of generative AI for the public sector, Government Technology asked AI and data experts to lay out practical ways to prepare state and local government data for optimal use.
Build a government that speaks data
The more agencies dig into their data, the dirtier it tends to look, said Milda Aksamitauskas, a fellow in the State Chief Data Officers Network at Georgetown University's Beeck Center for Social Impact and Innovation. But agencies get the clearest picture of what needs fixing by examining their data silos, input errors, and inconsistent wording in question prompts.
“No data set is perfect,” Aksamitauskas said. “But the more you invest, the more information you look at, the more questions you ask, the more you can actually improve the quality of your data.”
Like Robinson, Aksamitauskas is skeptical that government agencies have truly mature data governance, but she noted that those in the best position are the ones that have identified data stewards.
“That way, when you have questions, you can go to this committee or someone who can explain how to approach things,” Aksamitauskas said. “Governance means you can ask questions about the quality of your data and reach out to the people who know the most about your specific data.”
According to the Beeck Center, three-quarters of states had created a statewide chief data officer (CDO) position as of 2023. Many CDOs are tasked with leading "data 101" training exercises, a significant standardization effort that risks being skipped in states without a CDO.
"You really need a person or group to spearhead what data governance and data management mean for your organization," agreed Ricki Koinig, CIO of the Wisconsin Department of Natural Resources (DNR). "It can't just be a free-for-all, otherwise it won't be standardized and it won't become a valuable data element you can actually use."
Koinig, who spent years as a global IT leader in the private sector before joining the state, said she learned that it is easy to start conversations about data governance, but difficult to reach agreement on the details.
"In the first meeting, everyone is very excited about data governance, but then you quickly realize, 'Wow, we need to make some difficult decisions that could change the processes and even the culture of the organization going forward,'" Koinig said. "When everyone understands and agrees on the decision-making and escalation process early on, you can get through issues faster and more amicably."
Koinig added that it's also important to include diverse voices in the discussion as early as possible to ensure the data fed to AI is free of bias, suggesting that diversity teams should have a seat at the table and weigh in on details such as data training materials.
“You need to get perspectives on AI governance and data governance from diverse teams that you probably wouldn’t normally think of,” she said. “It means women, it means people of color. It means anything that's important to the diversity of your organization or what you're trying to accomplish.”
Give data owners a voice in procurement
Luis Videgaray, senior lecturer at the MIT Sloan School of Management and director of MIT's Global AI Policy Project, said one of the most pressing issues government agencies face with new AI initiatives is vetting the vendors whose tools their data will be fed into.
"You're deploying your data into someone else's tools," Videgaray said. "So you need a very strong in-house team, and one of their strengths has to be technology procurement. They need a deep understanding of the technology, but at the same time a very good ability to evaluate products."
He said government agencies should check vendors' terms of service for red flags about how they handle data, adding that a lack of transparency could hinder a pilot's success and undermine public trust.
"Some foundation models are closed and can only be accessed through an API," he said, pointing out that because agencies cannot directly interact with a model's code or underlying technology, they have no way of knowing whether their data is being fed to a biased algorithm. "So these terms essentially shift all the risk and all the responsibility onto the user, in this case the government. That may not be acceptable."
Videgaray encouraged agencies to partner with vendors who understand and respect their needs.
"If your AI vendors are reluctant to adapt their terms, you probably need to move away from them," Videgaray said. "You need to find someone who will protect the data in an appropriate way and provide the necessary degree of transparency to state or local governments about how the models work, the data used to train them, and how they manipulate the data provided by the customer."
Josh Martin, Indiana CDO and director of the Indiana Management Performance Hub, said data professionals and managers need to be in the room for those conversations.
"We want to ask the tough questions that agencies can't, and evaluate the approach of external vendors who sell AI as some magical thing that will come in and solve everything, when the solutions offered don't really get us there," Martin said. "We can ensure that we are getting higher-quality solutions, that we are doing things responsibly, and that we are considering bias."
He said those "difficult questions" could include: Does the data exist? Are the required data elements present? Can the data actually be retrieved? Is it in a usable format?
"AI has become a big buzzword in the vendor community and in sales pitches," Martin said. Asking these questions helps data managers move past the AI hype to where AI actually has the potential to work.
For smaller agencies or those without a designated data custodian, Aksamitauskas suggested focusing on the local government community and leaning on neighboring jurisdictions for insights.
"I would talk to other government agencies and ask if anyone has a cheat sheet they can share: 'What do you ask your vendors?' That might be helpful," Aksamitauskas said. "Five agencies together can figure out more than any one could on its own."
Create data policies and standards for AI use
Scottsdale, Arizona, was one of the first cities to issue a data services standard, an eight-page document that details the city's guidelines and plans for building data services.
"We're not just publishing open data; we're making sure data services are used meaningfully, not only internally but also by residents and businesses," said Scottsdale CIO Bianca Lochner. "We are looking at what the needs are within the community in terms of the types of data that should be made available, and how those data sets are being used to improve the quality of life within the community."
In Indiana, Martin's team created a data policy that complements a statewide AI policy developed by the CIO and CISO.
"There's a lot of goodwill in the world, but goodwill alone doesn't pave the way to success," Martin said. "We need to think holistically about what our own capacity is and what our data environment looks like, how we influence other agencies through our policies and procedures, and how we can support those agencies and work with them as they grow into the future."
Conduct data audits and readiness assessments
As Scottsdale works to implement a new data management platform, Lochner noted that the audit process is key to organizing data and creating access restrictions. Knowing where data is stored not only makes it easier to retrieve when needed, but also lets her agency consider who exactly needs access to it.
"We have millions of data sets, and cataloging and indexing alone is a huge undertaking," Lochner said. "As we roll out this platform, we also need to build in processes and policies. One of those is a data access policy that allows us to foster collaboration while ensuring compliance with data-sharing regulations."
While agencies may not be conducting regular data audits today, the Wisconsin DNR's Koinig predicted they will become more prevalent as AI pilots fail and agencies are forced back to their source data. She said that when she started working at the DNR, a closer look revealed dirty data.
"When I first came here, we had octuplets in our system: not just duplicate records, but eight copies of the same person," Koinig said.
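An audit like the one Koinig describes often begins with a simple duplicate check. As a rough sketch — the field names and records below are hypothetical, not actual Wisconsin DNR data — a few lines of Python can flag identities that appear more than once after light normalization:

```python
from collections import Counter

# Hypothetical person records; a real audit would pull these
# from a database export or CSV file.
records = [
    {"first": "Ann",  "last": "Lee",   "dob": "1980-01-02"},
    {"first": "ann ", "last": "Lee",   "dob": "1980-01-02"},  # same person, messy entry
    {"first": "Bob",  "last": "Smith", "dob": "1975-06-30"},
]

def duplicate_report(rows, keys=("first", "last", "dob")):
    """Count occurrences of each identity key after trimming and lowercasing."""
    counts = Counter(
        tuple(str(r[k]).strip().lower() for k in keys) for r in rows
    )
    # Keep only identities that appear more than once.
    return {identity: n for identity, n in counts.items() if n > 1}

print(duplicate_report(records))
# Flags the normalized Ann Lee record as appearing twice.
```

Real systems would match on more robust keys (fuzzy name matching, canonical IDs), but even this crude pass surfaces the "octuplets" problem before any data reaches an AI tool.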
"Data management is going to become more of a focus in the future," Koinig continued. "Organizations will realize that they can't just throw themselves off a cliff and deliver great AI that will impress everyone. They will go back to the data."
Martin also predicts an increase in AI-enabled data evaluation.
"We often focus on the exciting opportunities of AI and what it can do," he said. "We get seduced by the idea of doing something with AI without really understanding what kind of data we need to examine. It's the old saying: garbage in, garbage out. If you don't know the quality of your data, if you don't know its completeness, if you don't really understand where everything is, the shortcomings, the assumptions and all that, then it's garbage in, garbage out."
Meanwhile, many of the experts urged patience, though one use case was most frequently identified as a good starter pilot project.
"AI-enabled chatbots and virtual agents are essentially building databases through machine learning," said NASCIO's Robinson. "I think these are areas where states can get a lot of benefit, because the data they rely on comes from inquiries. It's not based on individuals' data; it's transactional data that's being generated at the state level."
This article was originally published in the March 2024 issue of Government Technology magazine.