This opinion piece was written by Stefaan G. Verhulst. He is the co-founder of the GovLab, and co-author (with Andrew Young) of Global Impact of Open Data (2016) and Open Data in Developing Countries (2017). It also appears in our digital government newsfeed.
As data grows ever more important for our economy and society, it’s increasingly clear that tremendous value can be derived from reusing and combining previously separate datasets.
But, of course, recombination poses risks, notably to privacy. At the GovLab, an action research centre at New York University, we are researching ways that data can be reused and combined responsibly, in a manner that helps to solve important public problems.
One avenue that holds particular promise is the data collaborative. These occur when institutions or organisations provide access to data, usually across sectors, in new and innovative ways. Some of the most interesting data collaboratives result when private sector entities share access to their vast stores of data with civil society, public officials or others working in the public interest. They have, for instance:
- Helped streamline aid delivery to victims of natural disasters by sharing Facebook data on people’s movements after those disasters with aid workers.
- Measured people’s economic resilience to natural disasters using sale payments and ATM cash withdrawal data from the bank BBVA in Mexico.
- Provided better urban insights to city officials by leveraging Waze data through its Connected Citizen Program.
The importance of “data stewards”
Not all data collaboratives succeed, and we have come across many failures over the course of our research. Success depends on several factors, including whether a problem statement has been clearly articulated, whether the recipient possesses the technical capacity to handle the data, whether both parties trust one another and whether the data itself is accessible and of usable quality.
Crucially, we have found that success also depends on the existence of people within organisations who are empowered to proactively initiate, facilitate and coordinate data collaboratives. We call such people “data stewards”. They have the requisite expertise and authority to recognise opportunities for productive collaborations, or to respond to external requests for data. They systematise the process of collaborating, and help scale efforts when there are early signs of success.
Several companies — including Mastercard, Facebook, LinkedIn, Google, Uber, Cloudera and Digital Globe — have recently put into place data steward roles (though not necessarily with that title). They are tasked with the responsibility of identifying, structuring, guiding and evaluating collaborative opportunities.
Defining the role
Yet, even as we see more data steward-type roles within companies, there’s considerable confusion about just what they should be doing. In particular, we have noticed a tendency to conflate the roles of data stewards with those of chief privacy, chief data or security officers.
This slippage is understandable, but our notion of the role is somewhat broader. While privacy and security are, of course, key components of trusted and effective data collaboratives, the real goal is to leverage private data for broader social goals.
So what are the necessary attributes of data stewards? What are their roles, responsibilities and goals? And how can they be most effective?
The following three goals and five functions can help define the aspirations of data stewards, and what is needed to achieve the goals. While only a start, these attributes can help guide companies currently considering establishing data steward-like roles.
The three goals of data stewardship
- Collaborate: Data stewards are committed to working with others, with the goal of unlocking the inherent value of data when a clear case exists that it serves the public good and that it can be used in a responsible manner.
- Protect: Data stewards are committed to managing private data ethically, which means sharing information responsibly while preventing harm to potential customers, users, corporate interests, the wider public and of course those individuals whose data may be shared.
- Act: Data stewards are committed to proactively acting in order to identify partners who may be in a better position to unlock value and insights contained within privately held data.
The five functions of data stewards
- Partnership and community engagement: A central function of data stewards involves developing and implementing a more proactive approach to reaching out to and vetting potential partners. They should also be informing potential beneficiaries of the insights generated by collaboration. In general, they are responsible for engaging with all actors who may be affected by, or otherwise have a stake in, the innovative use of data.
- Internal coordination and staff engagement: Establishing a successful data collaborative requires internal coordination and sign-off from various company actors, for instance the legal, technical, data, marketing and sales teams. Data stewards are key to ensuring internal stakeholders and company leadership are informed and aligned. In addition, they often play an important role in mapping and matching staff who have specific skills, such as data science abilities.
- Data audit and assessment of value and risk: Data stewards are responsible for monitoring and assessing the value, potential and risk of all data held within an organisation. This includes knowing what data the organisation collects, and what public interest questions that data could help answer. They should also be involved in preparing data for analysis, assessing the challenges involved in sharing it, and identifying steps to minimise those challenges and other risks at different stages of the data value chain. Finally, data stewards should consider the ethical implications and be involved in establishing externally validated measurements of the public impact that results from data collaboratives.
- Dissemination and communication of findings: Data stewards often act as the public face of their company’s data projects, and they are responsible for raising awareness, disseminating findings and communicating shared outcomes from data collaboratives. Data stewards may also be responsible for overall communication with customers, users, partners, government and other stakeholders about regulatory compliance, contractual obligations, how data is being shared and used and what public benefits it has had.
- Nurture data collaboratives to sustainability: Many ambitious data collaborative projects collapse after initial pilots or experiments. Data stewards can play a valuable role in nurturing and helping scale these projects until they are sustainable. While data stewards may not themselves have the budget to ensure long-term sustainability, they must work with a variety of stakeholders to gather resources and support to ensure long-term impact.
The Data Stewards Network
As part of our ongoing research, and in an effort to flesh out and better understand the attributes outlined above, the GovLab has recently launched a Data Stewards Network. Its primary goal is to connect existing and aspiring corporate data stewards to jointly develop methodologies, tools and frameworks that could help build more efficient, collaborative, impactful and safe data collaboratives. We are in effect building a directory of existing data stewards and a database of best practices and case studies to advance the cause of data sharing and public-private partnerships.
Whether you are a data steward yourself or just someone interested in the potential of data collaboratives, we invite you to visit the website and get in touch. Over the next few months we will be launching a dedicated online platform to facilitate shared learning. We welcome your suggestions, thoughts and ideas. — Stefaan Verhulst