BRC Comment Letter on Proposed Federal Government Data Strategy

Submitted by Lars Toomre on Mon, 07/08/2019 - 12:00

On June 4th, 2019, Mr. Russell T. Vought, the Acting Director of the Office of Management & Budget ("OMB"), issued a memo entitled Federal Data Strategy - A Framework for Consistency. The proposed Federal Data Strategy sets forty (40) finalized practices in addition to the ten (10) principles as well as the mission statement identified in October 2018. Along with the finalized Federal Data Strategy, OMB simultaneously released the Draft 2019 - 2020 Federal Data Strategy Action Plan that sets the short-term actions US Federal agencies will need to implement over the Oct. 1st 2019 - Sept. 30th 2020 Federal fiscal year, with updates coming on an annual basis thereafter.

The Leveraging Data as a Strategic Asset Cross Agency Priority ("CAP") Goal commits the Trump Administration to developing a long-term comprehensive Federal Data Strategy. Senior officials from the Department of Commerce, the Small Business Administration, the Office of Management and Budget, and the Office of Science and Technology Policy co-lead the CAP Goal. Co-lead staff members serve as project managers supporting and facilitating the development of the Federal Data Strategy with significant stakeholder input. Select Federal Data Working Group Chairs and Team Members conduct research and analysis; facilitate stakeholders; and distill feedback for the Federal Data Strategy.

Leveraging Data as a Strategic Asset CAP Goal Co-leads

  • Maria Roat, Chief Information Officer, Small Business Administration
  • Karen Dunn Kelley, Deputy Secretary, Department of Commerce
  • Suzette Kent, Federal Chief Information Officer, Office of Management and Budget, Office of the Federal Chief Information Officer
  • Nancy Potok, Chief Statistician of the U.S., Office of Management and Budget, Office of Information and Regulatory Affairs
  • Kelvin Droegemeier, Director, Office of Science and Technology Policy

On Monday, July 8th 2019, Brass Rat Capital LLC ("BRC") submitted the following comment letter about the Draft 2019 – 2020 Federal Data Strategy Action Plan to the above members of the Federal Data Council ("FDC" or "CAP Goal Co-leads").

 

July 8, 2019

 

Office of Management & Budget
Attn:  CAP Goal 2 leaders: Mr. Droegemeier, Ms. Kent, Ms. Potok, Ms. Kelley, and Ms. Roat
725 17th St., NW
Washington, DC 20503

RE: Comments on the Federal Data Strategy’s Draft Action Plan (USBC-2019-0001)

 

Dear CAP Goal 2 leaders - Mr. Droegemeier, Ms. Kent, Ms. Potok, Ms. Kelley, and Ms. Roat:

Brass Rat Capital LLC (“BRC”) appreciates the opportunity to provide comments on the Draft 2019 – 2020 Federal Data Strategy Action Plan (“Data Strategy” or “Action Plans”).

We wholeheartedly support the Federal government’s effort to become more data-centric. In our view, the Federal government’s primary goal should be to develop data science sophistication on par with the so-called FAANG+M technology companies. Today, most of the US (including the Federal government) falls far short of that expertise. Unfortunately, we see little in the Data Strategy or Action Plans that will help the Federal government achieve that goal. Indeed, there is no statement that remotely indicates what the longer-term goal is, other than to “leverage the full value of data…for the public good.”

Among our key conclusions and recommendations:

  • There is no mention of completing the implementation of the Digital Accountability and Transparency Act of 2014 (“2014 DATA Act”). We strongly believe completing this task should be the primary Action Plan for now. By doing so, the Federal government will learn far more about shortcomings in its data management systems and how to overcome them than would be gained by implementing most of the proposed Action Plans. The GAO has issued a series of reports detailing the many problems with the current process. These GAO reports also provide a clear blueprint of what needs to be done.
  • We recommend the resources for Actions 4, 7-13, and 15-16 focus for now on implementing the 2014 DATA Act.
  • The United States and the world are quickly moving toward leveraging data through machine learning (“ML”) and artificial intelligence (“AI”). These modeling capabilities require a data infrastructure that includes well-developed data mappings, taxonomies, and ideally semantic ontologies that attach data elements to the tree of knowledge and known relationships. Properly structured, the data within the US government is an incredibly valuable resource capable of unlocking economic value that would significantly enhance US productivity and US GDP growth. However, there is no indication in the Data Strategy or Action Plan that the government understands this, let alone envisions building a data infrastructure that incorporates these features. Without this, it will be impossible for the government to accomplish much on Action 9 (Improve Data Resource for AI Research and Development), whether in the short or, more importantly, the longer term. We recommend Action 9 focus initially on better understanding and planning what has to be done so that the evolving government data infrastructure will be able to support ML and AI models and applications.
  • It appears the government intends to achieve its goals with “existing resources”, supplemented by $11.2 million in the coming year. Quite frankly, the funds allocated are wholly inadequate for the large task at hand. With this level of commitment, it is inconceivable that the government can make any meaningful progress toward the goals outlined in the Data Strategy and Action Plan, let alone develop a data infrastructure on par with the FAANG+M companies (whose combined 2018 software development and CAPEX budget was roughly $168 billion or 20.6% of revenue). We recommend as part of Action 1 that the Data Council conduct a comprehensive cost/benefit analysis of the resources that will be required in coming years to achieve its limited data goals and the more comprehensive vision we advocate. We are confident that the potential benefits will far outweigh the upfront investments.

 

Overview

Brass Rat Capital LLC appreciates the opportunity to provide comments on the Draft 2019 – 2020 Federal Data Strategy Action Plan. BRC is a semantic data, semantic analytics, and FinTech “skunkworks” technology firm that focuses on data and analytics solutions for the financial services industry. As such, we wholly support the federal government’s efforts to become more data-centric.

We do indeed live in an era where our ability to understand and leverage data could lead to a paradigm shift in how we live in the 21st century, on the order of the movable type printing press and the Gutenberg Bible or the development of penicillin. The fantastic rise of the FAANG+M companies[1] over just the past 10 years is not only a vivid testament to the incredible power (and pratfalls) of being able to harness data, but also a beacon for what our economy and society might be able to accomplish should this technology become more widespread.

But most of the US economy lags far behind these leaders in simply understanding how to leverage the power of data, let alone being able to implement such leverage. That includes the US Federal government. But, surely by now, we must know that as a society we must at the very least aspire to achieve this advanced level of data competence.

Time alone will not bridge that gap. Rather, for this to happen, we need the kind of inspirational leadership and vision that led to the US putting a man on the moon. Recall President Kennedy’s inspiring words in 1961: (The US) “should commit itself to achieving the goal, before this decade is out, of landing a man on the moon and returning him safely to the Earth”.[2] Eight years later (and 50 years ago), America and its incredible engineering talent indeed achieved that goal.

Neither recent legislation (Foundations for Evidence-Based Policymaking Act of 2018 (“the Act”)) nor the documents intended to implement it (Federal Data Strategy – A Framework for Consistency (“Data Strategy”), which outlines a multi-year data strategy, or the 2019-2020 Federal Data Strategy Action Plan (“Action Plan”)) begin to offer the kind of visionary leadership that is essential if the Federal government is to realize the full potential of its data resources. In fairness, we recognize that the latter two documents are operational rather than visionary in nature, and mark the initial steps of what will be a long process. Still, we feel there are serious shortcomings in both documents that must be addressed now if this effort is to succeed.

In the sections below we first outline several general issues we have with the Data Strategy and Action Plan. We then discuss specific recommendations for the Action Plan.

General Issues

The Data Strategy and Action Plans lay out detailed blueprints to codify and improve existing data, get more data, promote sharing of data resources among agencies, and make better use of data in government policy and decision-making. The Act in particular highlights the need for more and improved statistics and statistical analytical methods.[3] There is heavy emphasis on the need to protect privacy rights and data security. This is all commendable, but it is difficult to see how any of the proposed policies and actions will accomplish much more than bringing the overall government up to the basic data and analytic standards that prevail outside the FAANG+M companies.

We agree there is a need to incorporate improved data and statistical analysis into the government decision-making process. But today statistics is becoming a branch of the much broader discipline of data science. We appreciate that Action 2 calls for developing a data science training catalog but it would be easy for people implementing the Act to focus primarily on statistics to the exclusion of much of what data science has to offer. We emphasize this point because there is a fatal tendency of bureaucracies (often backed up by the courts) to hew to the literal letter of the legislation.

We recommend that the Data Strategy and Action Plan incorporate more inclusive language about data science and related tools and analytics that might be brought to bear in coming years, including artificial intelligence and machine learning.

A second issue is money and resources. The Act, Data Strategy and Action Plan say that the government is expected to become more data-centric using “existing resources”, supplemented in the coming year by about $11.2 million. At the risk of stating the obvious, “existing resources” are already mostly committed to supporting existing programs, many or most of which are mandated by statute. It is not clear from the Data Strategy or Action Plan which agencies are included, but in any case their total budgets run well over a trillion dollars. The mooted $11.2 million of supplementary funding is a minuscule fraction of one percent of that total. Even if we are unfair or outright wrong in our assessment of the Government’s vision for data, we can assure you that nothing more than empty symbolic gestures can be achieved with the resources that are being made available. To provide a rough benchmark of the resources required to succeed (and as noted above), the FAANG+M companies spent about $168 billion last year on research and development and CAPEX.
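To put these magnitudes side by side, the short sketch below works through the arithmetic. It uses only figures stated in this letter and treats one trillion dollars as a floor for the agencies' combined budgets (the letter says only "well over a trillion"), so the computed share is an upper bound. The variable and function names are illustrative assumptions, not part of any government or BRC tool.

```python
# Figures taken from this letter; names are illustrative only.
SUPPLEMENTAL_FUNDING = 11.2e6   # proposed supplementary funding, in dollars
AGENCY_BUDGET_FLOOR = 1.0e12    # assumption: lower bound for "well over a trillion"
FAANG_M_SPEND = 168e9           # combined 2018 development and CAPEX spend
FAANG_M_SPEND_SHARE = 0.206     # that spend as a share of revenue (20.6%)


def share_percent(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Upper bound on the supplement as a percentage of agency budgets.
funding_share = share_percent(SUPPLEMENTAL_FUNDING, AGENCY_BUDGET_FLOOR)

# How many times larger the FAANG+M spend is than the supplement.
spend_ratio = FAANG_M_SPEND / SUPPLEMENTAL_FUNDING

# Combined revenue implied by the 20.6% figure.
implied_revenue = FAANG_M_SPEND / FAANG_M_SPEND_SHARE

print(f"Supplement as share of agency budgets: at most {funding_share:.5f}%")
print(f"FAANG+M spend relative to supplement: {spend_ratio:,.0f}x")
print(f"Implied FAANG+M revenue: ${implied_revenue / 1e9:,.1f} billion")
```

Under these assumptions the supplement is roughly one thousandth of one percent of agency budgets, and the FAANG+M spend is about fifteen thousand times larger. Because the true budget total exceeds the floor used here, the real share would be smaller still.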

BRC recommends that one of the top action plans should be to scope out the resources that will be needed in coming years to achieve the government’s goal of becoming more data-centric, both at the level envisioned in the Data Strategy/Action Plan and what would be required to achieve FAANG+M competence. This should be done in the form of a cost/benefit analysis to highlight how the government and US economy stand to gain as the government upgrades its data infrastructure. This should also serve to show the costs of effectively doing nothing, which is what the current Data Strategy and Action Plan will accomplish with the proposed funding strategy.

Comments on Action Plan

We see at least two major shortcomings in the Action Plans – both of which pertain to addressing current problems.

First, there is no mention of the many problems around implementation of the 2014 DATA Act, and the need to complete this task. The Government Accountability Office (GAO) has issued at least six reports over the past four years criticizing efforts to implement the 2014 DATA Act, including a devastatingly detailed 48-page report issued in July 2018. That was followed by a 38-page report in March 2019 pointing out that the OMB has yet to develop a formal data governance structure for spending data.[4] 

The most powerful and effective action plan before the government is to complete work on implementing the 2014 DATA Act.

  • First, this is the law of the land. Given that this process has already been underway for several years, completing it should be a top priority.
  • Second, the GAO in its ongoing reports has provided a clear outline of the problems. Complying with the 2014 DATA Act may not be an easy task, but what needs to be done is a known known.
  • Third, completing this task should give the government far more actionable information about the shortcomings of its data management practices, and about how to resolve or reconcile differences among agencies so they can better share data and related resources.
  • Further, to complete this task the government will have to develop detailed data mappings and taxonomies, ideally supported with semantic ontologies. The knowledge and skills gained through this process will be invaluable in advancing the government’s data strategy – but only if an effort is made to learn from and apply these lessons.
  • Many of the resources identified in the Action Plan for compiling existing data resources should instead be focused on studying how the government and its vendors address and resolve issues with implementing the 2014 DATA Act, and drawing on this experience to develop the next round of action plans. This should lead to far more substantive action plans in future years.

A second problem is that there is no mention of a process to ensure the government can overcome political or corporate opposition to obtain critical data from industry and constituents. For example, on April 2, 2019, the SEC issued a Final Rule for the FAST Act Modernization and Simplification of Regulation S-K.[5] In particular, it decided not to require reporting companies to provide a legal entity identifier (LEI) for their subsidiaries even if one is available. This may seem like one small data point that perhaps was not essential for the SEC and allowed it to provide a small win for reporting companies that complained it would be expensive to provide information they already have. But the LEI is a critical nugget of information for regulators tasked with monitoring sources of emerging financial risks in the system as well as a key identifier for pulling together the silos of corporate information that exist across many government agencies and global regulators. At some point regulators will have to go through the process and expense of issuing proposed regulations to obtain this information.

This is a good example of what happens when decisions about data are made in silos without regard for how other agencies or analysts might use it. It is essential that procedures be put in place so that lapses like this cannot happen again.

Our comments on specific Action Plans follow:

Action 1 – BRC agrees that an OMB Data Council should be formed, with the responsibilities described in the Action Plan. We recommend that the following additional responsibilities in the coming year be added:

  • Carry out a comprehensive cost/benefit analysis of the financial and human resources that will be required in coming years to achieve the government’s data-related goals and to achieve data competence on par with the FAANG+M companies.
  • Create a review process of all government requests for data or data-related requests for comments to ensure that mishaps like the SEC declining to require that LEI data be submitted cannot happen again.

Action 2 – This should be replaced with an action plan to bring the government into full compliance with the 2014 DATA Act.

The current Action 2 should be moved down in priority.

Action 3 – BRC agrees that developing an ethics framework is important.

Action 4 – We appreciate that it is important to address re-identification risks but we think this may be very difficult to accomplish in any effective way given the current state of the government’s data infrastructure.

We recommend that Action 4 be postponed until some of the other actions are completed. For now, we recommend that the resources be focused on our recommended Action 2 (complete 2014 DATA Act Implementation).

Action 5 – We agree with this action. We further note that developing and populating this repository should be an ongoing effort, not just a one-year action item.

Action 6 – We agree that making it easier for researchers to access data by creating a one-step standard application would be beneficial; our concern is that this process be secure to protect data integrity and ensure that the wrong people (including unauthorized hackers, terrorists, and foreign governments) cannot access the data.

Actions 7 and 8 – We have no idea what “an automated tool” means, or what is being “automated”. Does this mean a tool that runs without human intervention?

We appreciate that these are “pilot” programs, presumably intended to be limited and exploratory in nature. Yet we question whether it makes sense at this point to try to create a standard Federal Data Catalog. Given the disparate data infrastructures across agencies (as highlighted by the GAO reports mentioned above) it simply is not realistic to think a “government-wide” “cloud hosted” “cheap” and “customizable” “data catalog platform” will begin to address or overcome, for example, the problems and issues identified by the GAO reports on the 2014 DATA Act implementation. We assume other types of data across agencies will present similar problems.

BRC recommends the resources for Actions 7 and 8 be focused on our recommended Action 2 (complete 2014 DATA Act Implementation).

Action 9 – Developing AI models, capabilities and applications is an important goal, but this action plan makes little sense at this time. The government needs to finish implementing the 2014 DATA Act and apply lessons learned to get a better handle on its data resources before it can begin to contemplate trying to make data available for AI-related activities and research.

Most AI applications require extensive, complete, and high-quality data. This includes a thorough understanding of available data sources, and well-developed data mappings, with taxonomies and semantic ontologies across various datasets. BRC would suggest adding an assessment of accuracy and precision for the data, since private entities sometimes expend as much as 90% of an effort on “cleaning up” the model data. As noted above, completing the 2014 DATA Act implementation will go far toward helping the government develop these kinds of skills, which hopefully can be applied to prepare other data resources for AI applications.

We further note that there is no mention anywhere in the Act, Data Strategy or Action Plan of any goal of developing or enhancing data in this way or to this extent. Actions 7 and 8, which address inventorying and categorizing data, could be carried out in a way that facilitates developing this kind of semantic data infrastructure in the future. Or they could be executed in a way that does not support or facilitate this infrastructure, meaning that this work will almost surely have to be redone at some point if the government is to implement AI technology in any meaningful way.

BRC recommends that this Action be modified to focus on better understanding and planning what has to be done so that the evolving government data infrastructure will be able to support AI applications. That would be valuable input for when Actions 7 and 8 are implemented. Part of this Action should include working closely with the 2014 DATA Act implementation process to gain a better understanding of how to create a data mapping infrastructure.

Action 10 – This action comes closest to implementing the 2014 DATA Act, in particular the first two bullets – Getting Payments Right and Results Oriented Accountability for Grants. The third item, Federal IT Spending Transparency, is perplexing. It is hard to imagine that Federal executives today do not “analyze trade-offs between cost, quality, and value of IT investments” or do not make “data-driven decisions” especially when purchasing IT equipment. To cut to the chase, all three of these items and many more would be accomplished by successfully implementing the 2014 DATA Act.

We recommend these resources be focused on our recommended Action 2 (2014 DATA Act implementation).  

Action 11 – We understand improving geospatial data standards is a legislative requirement; however, we think successfully implementing the 2014 DATA Act is the higher priority, despite the recognized value of a geospatial data standard. BRC recommends that the OMB Data Council approach a consensus software standards organization such as Object Management Group (“OMG”) to start the process of creating a robust geospatial data standard.  Both of the signatories of this letter co-lead OMG’s Federated Enterprise Risk Management working group (“FERM WG”) where many US Federal government needs are first surfaced.  This FERM WG is also the OMG group focused on helping to work with the Data Coalition and XBRL International.

We advise these resources be focused on our recommended Action 2 (2014 DATA Act implementation).  

Action 12 – We agree that agencies should have appropriate data governance structures in place. We further think it is critical that agency-level data governance infrastructure work with other agencies if the government is to fulfill its goal of greater data sharing and coordination. If this kind of cooperative effort is not explicitly mandated and institutional infrastructure put in place, it will be far too easy for agencies to develop governance structures in their respective silos. See also our recommendations for Actions 13 and 15–16.

Actions 13 and 15–16 – These actions overlap with each other. For example, Action 13 calls for “an assessment focusing on data and data infrastructure … needed to answer agency priority questions”. Action 15 calls on agencies “to identify and prioritize the data needed to answer key agency questions”. And Action 16 requires agencies to “identify an initial set of priority agency datasets that are key to mission success”. How many times do you have to require agencies to do the same thing?

In our view, these actions are a catch-all of various items, many of which should already be ongoing activities. We suspect, for example, most agencies are fully aware of the data they need or would like to have to answer key questions and accomplish their missions. If they are not, they probably have far deeper problems than can be addressed by fulfilling these action items. We think most or all of these should be under the purview of the agency data governance infrastructure (Action 12), and do not need to be separate items. Indeed, different agencies may have differing needs and priorities when it comes to data.

Action 14 – It is difficult to understand why Action 14 (Increasing staff skills) is not part of the present Action 2 (Develop data science training catalog). It makes little sense that Action 2 is to be completed in six months and Action 14 in nine months. How can anyone develop a meaningful catalog of training materials (Action 2) before one knows what training and skills are needed (Action 14)?

Conclusion

BRC appreciates the opportunity to share our thoughts about the Data Strategy and Action Plan. Our intent is to encourage policymakers to think much more broadly about what they could accomplish with true semantic data (and semantic analytics), and to focus on accomplishing truly substantive action steps. The most critical ones for now, in our view, are completing the 2014 DATA Act implementation and scoping out the resources the Federal government will need in coming years to achieve its goals.

We are happy to discuss any of our comments with you in more detail.

 

Yours Sincerely,

 

Lars Toomre

Managing Partner, Brass Rat Capital LLC

 

[1] Facebook (FB), Apple (AAPL), Amazon (AMZN), Netflix (NFLX), Google (now Alphabet, GOOG) and Microsoft (MSFT)

[2] https://history.nasa.gov/moondec.html

[3] See Title III of the Foundations for Evidence-Based Policymaking Act of 2018.

[4] See GAO-18-546, DATA Act: Reported Quality of Agencies’ Spending Data Reviewed by OIGs Varied Because of Government-wide and Agency Issues, July 2018, Government Accountability Office; and GAO-19-284, DATA Act: OMB Needs to Formalize Data Governance for Reporting Federal Spending, March 2019, Government Accountability Office

[5] See Federal Register, FAST Act Modernization and Simplification of Regulation S-K Final Rule, April 2, 2019, in particular Section III C 2, which addresses proposed amendments not adopted, including subsidiaries’ entity identifiers