FracFocus Enters the Era of Big Data

The identity of the chemicals used in hydraulic fracturing fluid has been a key policy issue since the beginning of the shale revolution a decade ago. Now, that chemical information just became publicly available in a whole new way.

Today, FracFocus, the nationwide state-run hydraulic fracturing chemical registry database, made its chemical data publicly available as a raw data download – bringing FracFocus into the modern era of Big Data where large information sets can be analyzed for actionable patterns and trends.

In the past, data about the chemicals used in hydraulic fracturing were obtainable only on a well-by-well basis and only in PDF format. With nearly 100,000 wells registered with the FracFocus program, this made collecting and analyzing broad chemical information a nearly impossible task for researchers and the public.

Questions about how the practice of hydraulic fracturing can impact groundwater have been a growing area of interest for scientists, drillers and the population at large since the practice of pumping a mixture of water, sand and chemicals into shale formations became standard operating procedure for natural gas extraction. Last year, the Secretary of Energy’s Advisory Board (SEAB), on which EDF President Fred Krupp served, made recommendations for how FracFocus could improve its operations and help answer some of those long-standing questions about the chemicals used in hydraulic fracturing.

The release of this aggregate chemical data is among the first, and the most impactful, changes FracFocus has made in response to that report. Other enhancements, which reflect SEAB recommendations supported by EDF, will be launched this summer as part of “FracFocus 3.0” and will include measures to enhance data quality and facilitate the reduction of trade secret claims, meaning more chemicals will likely be logged into the system.

This new method of data sharing, using raw data downloads, provides chemical data in a format that allows users to analyze information more thoroughly and accurately.

More Disclosure

When it comes to hydraulic fracturing chemicals, we are seeing a trend towards transparency. As recently as the start of 2010, no states required companies to disclose what chemicals were used in hydraulic fracturing, making it difficult for policy makers to make informed decisions about how to adequately oversee hydraulic fracturing operations.

Today, almost every oil and gas producing state requires chemical disclosure. Most states, as well as the Bureau of Land Management, use FracFocus as the data repository. But even in places that don’t require disclosure, operators typically disclose to FracFocus anyway—approximately 95% of hydraulic fracturing operations report chemicals to the database.

There are still significant issues associated with some companies hiding specific chemicals behind ‘trade secret’ provisions, and there are legitimate questions being raised about whether state disclosure requirements are being properly administered and enforced, but there is no question that states are requiring greater transparency than was common a few short years ago.

The more you know

Access to aggregate chemical data is critical to understanding broader usage and trends. Without it, scientists, researchers and the public at large are less able to participate in policy discussions that impact the health and safety of communities affected by oil and gas drilling. This improvement to the FracFocus program also means policy makers, regulators, and operators can make better decisions about managing wastewater from hydraulic fracturing to better safeguard against water contamination.

In this era of Big Data, huge flows of information are coursing from industry to regulators, and now to the public, at rapidly increasing rates. Chemical disclosure is an excellent example of data transparency, but EDF believes that more disclosure about well integrity, air toxics, methane emissions, spills and releases, and other environmental data is necessary, and that both industry and regulators can and should do more. Used effectively, Big Data can lead to innovative, evidence-backed and highly tailored policy solutions, and the data released today by FracFocus can help make that happen.

Image source: Flickr/luckey_sun.


  1. Posted May 13, 2015 at 1:06 pm

    For those attempting to use the FracFocus download link cited above, I must concur with John Amos’ comment about the need for “curation.”

    To wit, after I downloaded the database backup and imported it into Microsoft SQL Server 2012 on 8 May 2015, I attempted to export the RegistryUploadIngredients table to a comma-separated-value (CSV) file, only to discover that at or about row 205 of the table, tab and linefeed characters appearing in the IngredientName column (if memory serves me correctly) rendered any export to CSV file effectively useless. As I’m a software and database developer by trade, I was able to compensate for these characters (as well as some diacritic single quotes) that should have been removed or adjusted with a little “curation.” Such “curation” is allowed in the professional IT community and is expected as a matter of both course and courtesy to those needing subsequent use of the data in question.
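The kind of cleanup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the commenter's actual procedure: the `IngredientName` column comes from the comment, while the `Supplier` column, the sample rows, and the helper names `clean_field` and `rows_to_csv` are hypothetical, and the "diacritic single quotes" are assumed to be typographic (curly) quotes.

```python
import csv
import io
import re

# Typographic ("curly") quotes mapped to plain ASCII equivalents.
# Assumption: these are the problem quote characters the comment refers to.
QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def clean_field(value):
    """Collapse tabs and line breaks to single spaces and normalize curly quotes."""
    if value is None:
        return ""
    text = str(value).translate(QUOTE_MAP)
    # \s+ matches tabs, carriage returns, and linefeeds as well as spaces.
    return re.sub(r"\s+", " ", text).strip()


def rows_to_csv(rows, fieldnames):
    """Write dict rows to CSV text, cleaning every field on the way out."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({name: clean_field(row.get(name)) for name in fieldnames})
    return buffer.getvalue()


# Hypothetical rows standing in for records read from the imported table.
sample_rows = [
    {"IngredientName": "Hydrochloric\tacid\n", "Supplier": "Acme\u2019s Supply"},
    {"IngredientName": "Water", "Supplier": None},
]

csv_text = rows_to_csv(sample_rows, ["IngredientName", "Supplier"])
print(csv_text)
```

Cleaning at export time leaves the imported tables untouched, so the original values remain available for comparison if a downstream user questions a record.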

    Stating that the data is “as is” or “as received” (quotes mine), effectively without “curation,” is NOT an acceptable policy for any organization entrusted with its gathering and preservation. Disallow undesired characters from this time forward and remove or adjust undesired characters present in table rows from the past. Every good collector curates his or her collection, period.

    Should one need files of data exported from imported data backups of FracFocus data, please refer to the following web page on my blog:… and scroll down to the bolded FracFocus bullet for alternate file formats.

    • Adam Peltz
      Posted May 13, 2015 at 2:41 pm

      Thank you. We’re glad to see that the research community is beginning to interface with this data and make it useful for an even larger audience.