House moves to restrict open data on corporate financial statements

Photo credit: SEC

Last night the House of Representatives passed H.R. 5405 which, among other things, would exempt more than half of public companies in the U.S. from reporting their financial statements as open data.

The bill requires the Securities and Exchange Commission (SEC) to exempt public companies with less than $250 million in revenues from filing their financial statements in the eXtensible Business Reporting Language (XBRL), despite recent efforts by the SEC to begin ensuring the quality of data collected via XBRL. This represents a significant blow against transparency as these companies will now only be required to file their statements on paper, making it harder to analyze the potentially vital data.

The SEC currently collects this data in paper as well as electronic formats. We agree that this duplicative collection is unnecessary, but can’t support a move back to a past of unparseable paper filings. Instead of helping multi-million dollar companies avoid modern disclosure requirements, Congress should be helping the SEC ensure that it is collecting accurate data in a machine-readable format fit for the modern age.

In recent years the Obama Administration (in the form of the president’s open data executive order) and Congress (in the form of the DATA Act) have worked to move the ball forward towards machine-readable, structured data throughout government information generation and collection. If Congress and the President move to exempt companies from submitting financial statements in an open, machine-readable format it will represent a major step backwards from the gains both have made over the past few years.

If H.R. 5405, which was pitched as a bill to help reduce burdens on small business (up to and including “small businesses” worth $249 million), comes before the Senate, that chamber should remove any language that would exempt huge numbers of not-so-small businesses from an important transparency requirement.


Transparency advocates look at “The Price We Pay for Money’s Influence on Politics” during Sunlight-hosted event

Sen. Jon Tester, D-Mont., talks about how he has advocated for transparency while in Congress during “The Price We Pay for Money’s Influence on Politics.”

Reporters at the newspaper that broke the Watergate scandal have concluded that “dark money” inundating this year’s elections makes it impossible to give voters a clear picture of who’s trying to influence their vote, Washington Post reporter Tom Hamburger said Tuesday at a panel discussion hosted by the Sunlight Foundation and ReThink Media.

The event, which introduced Sunlight’s incoming president, Chris Gates, featured journalists, transparency experts and political scientists for a conversation about campaign finance and the tangled web of so-called dark money groups, which do not have to disclose their donors. Although the speakers – and audience members, during the Q&A portions – painted a dire picture of transparency in politics, a handful sounded a more hopeful tone, saying that progress is happening, largely due to the work of advocates at the event.

“What I see in this room are a whole lot of people dedicated to putting themselves out of a job,” said Emily Peterson-Cassin, a panelist and project coordinator at the Bright Lines Project, which aims to clarify the IRS rules about nonprofits’ political activities.

One of the morning’s first speakers was Sen. Jon Tester, D-Mont., who immediately gave a shout-out to Sunlight’s multipronged work to make government data more available and accessible.

“With transparency comes accountability, and we need more people like you,” Tester said of the organization.

In fact, he credits his initial Senate win in 2006 to this “accountability,” explaining that he believes he won because his opponent, former Republican Sen. Conrad Burns, was linked to the scandal surrounding lobbyist Jack Abramoff. Since coming into office, Tester has championed electronically filing Senate campaign finance reports with the Federal Election Commission as well as the DISCLOSE Act, which would require that political contributions of $10,000 or more be reported within 24 hours. Tester was also the first senator to post his daily schedule online, which Sunlight heralded, in 2007, as “what real openness looks like.”

On Tuesday, Tester contended “transparency will help” combat the hyper-partisanship enveloping D.C., a main motivation for his work on the subject. “People ask me, why does this matter? I’ll tell you why it matters: Folks are disgusted with Washington, D.C., and they’re disgusted with the election process,” he said.

During a money and politics panel – moderated by Sunlight’s Lisa Rosenberg, who lobbies Congress on transparency-related legislation – reporter Robert Maguire pointed out that money’s role in campaigns continues to impact politics long after elections are decided. Dark money organizations that, for example, buy ad time to influence an election are “trying to get people into office so they can make policies” friendly to a specific agenda, he said.

And attempting to track how money wends its way into politics, from beginning to end, has only gotten more convoluted, according to Hamburger, a longtime money and politics reporter. He said that when he first started on his beat, he was able to keep track on a simple spreadsheet. But the environment now is more akin to a “swamp,” he said, referencing a complicated (and informative) chart the Center for Responsive Politics created about the Koch brothers’ web of nonprofits.

The Sunlight Foundation’s incoming president, Chris Gates, speaks at “The Price We Pay for Money’s Influence on Politics” on Tuesday, Sept. 16.

Hamburger also said that, during a recent editorial meeting about the Washington Post’s midterms coverage, everyone present concluded it was next to impossible to give readers a full report on the money flowing into political campaigns. Because of the presence of dark money groups, the money that’s reported to the FEC is only part of the equation, he said.

When it comes to shifting this balance, Hamburger’s assessment was stark – and rather bleak.

“Without a scandal, I’m not confident things will change,” Hamburger said.

Sunlight’s founding executive director, Ellen Miller, and her successor, incoming president Chris Gates, both attended the panel. Gates, who officially starts Oct. 1, is a political and civic engagement advocate who has served as president of the National Civic League and, most recently, was the executive director of Philanthropy for Active Civic Engagement, or PACE. Miller, who announced in February that she would retire, founded Sunlight in 2006. She also helped found the Center for Responsive Politics and Public Campaign.


Today in #OpenGov 9/17/2014

Keep reading for today’s look at #OpenGov news, events, and analysis, including late night money in politics, fighting corruption in Costa Rica, and boosting transparency in the Big Apple.

National News

  • Although US troops are preparing to leave Afghanistan, the Special Inspector General for Afghanistan Reconstruction has no plans to stop his watchdog ways as long as US money is still streaming in to the country. (Government Executive)
  • Nancy Pelosi talked money in politics on Late Night With Seth Meyers. (National Journal)
  • A follow up to last week’s stories about Recovery.gov losing government data that’s really owned by Dun and Bradstreet. (Federal Computer Week)

International News

  • NDI is cohosting a workshop on opening election data with the Electoral Commission of South Africa. (NDI)
  • The new President of Costa Rica highlighted issues of mismanagement and corruption in the country’s public administration in a televised address in August. Rhetoric must be turned into action for anything to really change. (Transparency International)

State and Local News

  • Philadelphia’s Chief Data Officer was on Philly311 TV, an online show, to discuss open data and civic technology. (State Scoop)
  • The 2013 New York City elections saw an unprecedented influx of outside, opaque spending on local contests. Now, the city’s campaign finance board and city council are looking for ways to boost transparency. (Gotham Gazette)

Events Today

Events Tomorrow

Do you want to track transparency news? You can follow the progress of relevant bills, court cases, and regulations using Scout. You can also get Today in #OpenGov sent directly to your preferred news reader. If you would like to suggest an event, please email mrumsey@sunlightfoundation.com by 7 am on the Monday prior to the event. [...]

Campaign intelligence: Conservative money plays catch up

After a costly season of primary campaigns in which hardline conservative groups often found themselves at odds with establishment players and the national GOP party committees, outside and inside dollars are coalescing behind Republican nominees in Senate races in Alaska, Iowa and Michigan — the three states that have seen the most outside spending over the past week according to Real-Time FEC data.

With nearly all primary results in the books, donors and party operatives on both sides are shifting their focus — and checkbooks — to the general election. Each competitive Senate race looms large as the Republican party eyes a potential path to flipping the Senate.

Democratic candidates didn’t see anywhere near the same amount of friendly fire that Republican candidates faced, a Sunlight analysis of independent expenditure data found. The few liberal on liberal attacks seen in the primaries were largely products of the personal policy agendas of megadonors Michael Bloomberg and Tom Steyer. With the exception of Bloomberg’s Independence USA, most liberal-leaning heavy hitters have been aiming their electoral ammo at presumptive Republican nominees.

The chart below shows the top three contests by outside spending, as well as the sum of the previous week’s outside spending in those races. The Michigan Senate battle recently shot to the top of ‘priciest’ thanks to a new $1.5 million investment in that race from Steyer’s NextGen Action. A recent television ad from that group slams the “billionaire polluter Koch brothers” for spending millions to elect Terri Lynn Land, the GOP nominee, so that “she’ll let them keep polluting.”

Chart from Real-Time showing the most expensive elections this week by outside spending. They are: the Senate races in Alaska, Iowa and North Carolina

Click the chart for more detailed information; Image: Real-Time FEC

Meanwhile, with a little over a month left until the general election, national party committees are shifting into high gear, fêting candidates at Washington fundraisers and gobbling up air time at local stations across the country. Local ad data from the Des Moines-Ames market, collected by Sunlight’s Political Ad Sleuth, show a serious ramping up of broadcast attacks in the race for Iowa’s open Senate seat. Contracts show the NRSC has agreed to purchase nearly $500,000 worth of broadcast time in that market since July, while the DSCC has snatched around $950,000 worth of ad time.

The candidates are also upping their games. Invitations collected by Sunlight’s Political Party Time project show that Joni Ernst, the Republican nominee in Iowa, and Rep. Bruce Braley, the Democrat, have both been making the rounds at high-dollar fundraisers inside the Beltway in preparation for a close general election.

They’ve been traveling east while politicians with presidential aspirations have been racking up frequent flier miles going the other way, stumping and chowing down at famous steak fries.

You can follow ad spending data in your local markets with Ad Sleuth to see ad blitzes before they land and learn how committees’ focus is shifting from week to week.


Finally, Some Good Environmental News: The Ozone Layer Is Recovering

A United Nations scientific panel reports that the Earth’s protective ozone layer has begun to recover, in large part because the world has successfully phased out man-made halogenated hydrocarbons, including chlorofluorocarbons (CFCs), which used to be found in all aerosol sprays and refrigerators. These chemicals release chlorine and bromine, which destroy molecules far up in the […]

The enduring power of the ex-senator

John Breaux, a Louisiana Democrat who served 32 years in Congress before opening a lobbying firm, will be back in his old committee room today. (Photo credit: Wikipedia)

To see the power of Washington’s revolving door — and the weakness of congressional lobbying regulations — check out two events today involving well-heeled corporate interests, health care policy and powerful former members of Congress.

This afternoon at the offices of the influential Bipartisan Policy Center, one of the cofounders of the organization, former Senate Democratic Leader Tom Daschle, will be emceeing a program presenting “innovative strategies” on improving the nation’s health from CEOs of major corporations and leaders of several health associations. Daschle is not registered as a lobbyist, even though he serves as a senior adviser to DLA Piper, a law firm that has spent nearly $137 million since 1989 lobbying on behalf of a wide range of blue-chip corporate clients.

Those clients include two major health insurers, Aetna and Blue Cross/Blue Shield, both companies that are represented on the panel Daschle is moderating. The event is just an hour long, which is potentially noteworthy: Congressional regulations say an individual does not have to register as a lobbyist with Congress unless he or she spends more than 20 percent of his or her time working for an individual client in a given period.

Thirty minutes after Daschle’s panel gets underway, his former Democratic Senate colleague, John Breaux, will be leading a discussion about the role of digital technology in health care in an even more impressive venue: a Senate hearing room, sponsored by the Senate Special Committee on Aging, which the Louisianan used to chair. Breaux is a registered lobbyist with the Breaux Lott Leadership Group, named after him and his partner, former Senate Republican Leader Trent Lott of Mississippi. The notice for the event gives a nod to Breaux’s role as one of the three leaders of the Alliance for Connected Health Care, an organization so new it does not yet appear to have filed the 990 form that would provide details on officers and expenses with the Internal Revenue Service. The other leaders: Lott and Daschle.

While the Breaux Lott firm does not list Alliance for Connected Health Care as a client, Daschle’s employer, DLA Piper, does. DLA Piper registered as a lobbyist for the alliance in February and was paid $170,000 through the first half of this year.


Wrangling messy political data into usable information

Thanks to the Lobbying Disclosure Act of 1995, individuals and organizations must disclose the activities they undertook each quarter while representing themselves or their clients to Congress. After the Honest Leadership and Open Government Act of 2007 was passed, there was a rapid and sustained rise in electronic filing for lobbying disclosures. There are now over 500,000 disclosure forms available for analysis in electronic formats from the past seven years. Although the disclosures don’t offer nearly as many specifics as one would hope, when taken in aggregate the available data provides a high-level overview of the movements and trends of the lobbying industry.

Sadly, we can’t just skip from downloading the data to calculating aggregate statistics. The disclosure forms include no reliable way of knowing when two lobbying firms or two clients of a lobbying firm are the same. Without taking an educated guess, we don’t know from the data that the client called “1Sky Education Fund” and the client called “350.org (formerly known as 1Sky Education Fund)” are in fact the same organization. Compounding the issue, two firms usually don’t disclose the name of the same client in the exact same way. Some lobbying firms hardly even disclose their own name consistently. Before we can get into the high level overview of lobbying disclosure data, we must merge and identify all the organizations and individuals in the disclosure forms.

The traditional approach to this problem has been to build software that allows people to label and tag disclosure forms. Human annotation by experts is a tried and true method for understanding these forms. If I had not been one lone intern but rather, say, a hungry swarm of human labor ready to descend on the senate disclosure data like politically inclined locusts settling in on vast fields of informative wheat, I too would have built a system to store the stream of annotations I would have been producing. But, for better or worse, this summer it was just me and a computer with 32 cores, 64 gigabytes of memory and no inborn interest in the lobbying activity of “NPLMCC–Nuclear Power Labor-Management Cooperation Committee” during the first quarter of 2014.

Moreover, we’ve found that lobbying disclosure forms all get submitted during the same two-week period each quarter. This means that the month after the disclosure deadlines is hell on researchers. When disclosures hit, everybody drops everything and helps fight the good fight. Despite the best efforts of the labor liquidity movement, hiring and subsequently firing large swaths of lobbying disclosure experts is not a tenable system for dealing with the quarterly disclosures long term. If an organization wants to annotate and tag lobbying disclosure forms, the organization has to be structured to deal with sharp, regular and unavoidable labor surpluses and deficits from the get-go.

Some organizations amortize this labor cost by creating dual roles, such as an individual who serves primarily as a reporter and as an annotator only when needed, or by finding other disclosure data sets of similar size that are “on” when lobbying disclosure forms are “off.” The Sunlight Foundation was not willing to pursue such a drastic organizational shift and so we decided to explore how far we could get with only software. If a technological solution could be found, we figured it would be faster, cheaper and more reliable than a team of human annotators with, hopefully, acceptable levels of accuracy and precision.

And so, with all of the above in mind, I embarked on a quest to train a computer to care about politics. The Influence Explorer team figured that if I could reproduce even a fraction of the accuracy that human annotation provides, then a technological solution offered some very real benefits that made the trade-off reasonably attractive. In short, we hoped that by sprinkling magical silicon dust over the lobbying disclosure data, we wouldn’t have to destroy the environment by burning all the midnight oil folks would need to get the projects done each quarter.


Upon arrival to the Sunlight Foundation in May, I was given the goal of automating the annotation of lobbying disclosure data. I had effectively free rein to do what I thought was best. While in pursuit of this goal, I built a series of systems capable of easily answering interesting questions about the world of lobbying disclosures. ECHELON is the third prototype I’ve built so far and is by far the most successful and powerful. With just a few hours of computation, ECHELON is able to approximately reproduce the resolution precision that several years of dedicated hand curation built up.

I’m happy to say that automated annotation and tagging is well within the realm of possibility. I was able to build a system that approximated the results of other organizations. In particular, the number of unique clients and lobbying firms produced by our program was within a few thousand of the same statistics for human annotated data for the same time period. Considering that we started out with over 1,000,000 clients and registrants in total, we were very excited when we saw that we were getting down to around 33,000 clients and 7,000 registrants.

This is not meant to be a deep technical blog post, so I will only touch on the technical architecture before diving into some of the results. ECHELON is a Clojure project built on top of the free version of Datomic and leverages Instaparse heavily. At this point, the core of ECHELON consists of 1500 lines, with another 1000 lines for one-off experiments and queries that I’ve been exploring. Most of the core code deals with loading in the data and getting it into just the right format.

As we’ll soon see, such a small project can pack a mean punch.

Parsing field names

One of the major forces powering ECHELON is the understanding of the disclosed name fields. Thanks to Instaparse, we were able to create a formal grammar for parsing the various corporate entities that appeared in the name fields for client, registrants, affiliated organizations and foreign entities. This means that we can turn “SkyTerra Communications, Inc., formerly Mobile Satellite Ventures” into something that looks like the following:

(("skyterra" "communications" :corporation) :fka ("mobile" "satellite" "ventures")) 

All that the above is indicating is that there was a corporation called Skyterra Communications, (“skyterra” “communications” :corporation), that was formerly known as, :fka, Mobile Satellite Ventures, (“mobile” “satellite” “ventures”). Which is neat, but sort of useless alone. Here’s what one gets when one runs “SKYTERRA COMMUNICATIONS CORPORATION F/K/A MOBILE SATELLITE VENTURES” through ECHELON’s parser:

(("skyterra" "communications" :corporation) :fka ("mobile" "satellite" "ventures")) 

We get the exact same result for both examples! Two names which look very different produce the exact same result. What I’ve done is create a way of taking a wide variety of inputs and imposing a rigid structure on them, with an emphasis on making similar inputs produce the exact same result.
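To make the idea concrete, here is a minimal sketch of this kind of normalization. It is written in Python with regular expressions rather than Clojure and Instaparse, so it is not ECHELON's grammar; the suffix table, the `parse_entity` and `parse_name` helpers, and the list of "formerly known as" markers are all assumptions made for illustration.

```python
import re

# Hypothetical suffix table; ECHELON's real grammar is not shown in the post.
CORPORATE_SUFFIXES = {
    "inc": ":corporation",
    "incorporated": ":corporation",
    "corp": ":corporation",
    "corporation": ":corporation",
    "llp": ":llp",
    "llc": ":llc",
    "lp": ":lp",
}

# Assumed markers separating a current name from a former one.
# The longest alternative comes first so "formerly known as" wins over "formerly".
FKA_PATTERN = re.compile(
    r"\b(?:formerly known as|formerly|f/k/a|fka)\b", re.IGNORECASE
)


def parse_entity(text):
    """Lowercase, drop punctuation, and tag a trailing corporate suffix."""
    tokens = re.findall(r"[a-z0-9&]+", text.lower())
    if tokens and tokens[-1] in CORPORATE_SUFFIXES:
        return tuple(tokens[:-1]) + (CORPORATE_SUFFIXES[tokens[-1]],)
    return tuple(tokens)


def parse_name(raw):
    """Return (current,) or (current, ':fka', former) as nested tuples."""
    parts = FKA_PATTERN.split(raw, maxsplit=1)
    current = parse_entity(parts[0])
    if len(parts) == 2:
        return (current, ":fka", parse_entity(parts[1]))
    return (current,)


a = parse_name("SkyTerra Communications, Inc., formerly Mobile Satellite Ventures")
b = parse_name("SKYTERRA COMMUNICATIONS CORPORATION F/K/A MOBILE SATELLITE VENTURES")
print(a == b)
```

A real grammar handles far more variation than two regular expressions can, but the principle is the same: differently written inputs collapse to one structure that can be compared for equality.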

Once I had run this parser over every organization name field in the disclosure data, I had a fair amount of power to play with. The parser is a vital step in the automated annotation process. As an example, I’ve picked out LightSquared because it is an organization that has a long history and has operated with several different names. Here are all the various names that ECHELON has annotated to be the same thing:

Names of entities matched to Lightsquared
SkyTerra Communications, Inc., formerly Mobile Satellite Ventures
LightSquared (Formerly known as SkyTerra)
LightSquared (formerly known as Skyterra / Lightsquared)
SkyTerra / LightSquared
LightSquared (formerly SkyTerra Communications, Inc.)
Mobile Satellite Ventures, LP
SkyTerra (Formerly Mobile Satellite Ventures)
Mobile Satellite Ventures
Skyterra (formerly known as Mobile Satellite Ventures)
Skyterra (formerly known as Mobile Satellite Ventures, LP)
SkyTerra Communications, Inc. (formerly Mobile Satellite Ventures)

So, by the use of a formal grammar and a smart annotation step, we are able to easily find and record the various names that an organization has used as it went about lobbying Congress.
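The post does not describe how the annotation step links names together, so the following is only one plausible way to do it: treat every "formerly known as" relationship as an edge and merge the connected names with a disjoint-set (union-find) structure. The `DisjointSet` class, the `group_aliases` function and the record layout are hypothetical, and the name keys are assumed to be already parsed and stripped of corporate suffixes.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find over hashable keys."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving keeps the trees shallow.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_aliases(records):
    """records: list of (raw_name, current_key, former_key_or_None)."""
    ds = DisjointSet()
    for _, current, former in records:
        ds.find(current)  # register names that have no former name
        if former is not None:
            ds.union(current, former)
    groups = defaultdict(list)
    for raw, current, _ in records:
        groups[ds.find(current)].append(raw)
    return list(groups.values())


records = [
    ("SkyTerra Communications, Inc., formerly Mobile Satellite Ventures",
     ("skyterra", "communications"), ("mobile", "satellite", "ventures")),
    ("LightSquared (formerly SkyTerra Communications, Inc.)",
     ("lightsquared",), ("skyterra", "communications")),
    ("Mobile Satellite Ventures, LP",
     ("mobile", "satellite", "ventures"), None),
    ("Van Scoyoc Associates", ("van", "scoyoc", "associates"), None),
]
for group in group_aliases(records):
    print(group)
```

Exact-key linking like this would not connect “SkyTerra” to “SkyTerra Communications”, so a real system needs some looser matching on top of it.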


ECHELON provides a powerful interface for querying the data. Assuming the answer to a question exists within the data, there hasn’t been a question yet that I’ve been able to think of that ECHELON cannot answer. The system is surprisingly powerful, more so than I could have hoped for. Here are the organizations which come up the most often in the disclosure data, broken down by the various associated names:

Alias Number of Occurrences
“Patton Boggs LLP” 3469
“Squire Patton Boggs formerly Patton Boggs LLP” 170
“Squire Patton Boggs” 7
“Patton Boggs, LLP” 6

Alias Number of Occurrences
“Van Scoyoc Associates” 6526

Alias Number of Occurrences
“Holland & Knight LLP” 5062
“Holland & Knight, LLP” 3

Alias Number of Occurrences
“Akin, Gump, Strauss, Hauer & Feld” 19
“Delaware North Companies on behalf of Akin Gump Strauss Hauer & Feld” 11
“Oneida Indian Nation on behalf of Akin Gump Strauss Hauer & Feld” 10
“City of Houston on behalf of Akin Gump Strauss Hauer & Feld” 10
“Akin Gump Strauss Hauer and Feld” 1
“Akin Gump Strauss Hauer & Feld” 1

Alias Number of Occurrences
“K&L GATES LLP” 4056
“K&L Gates LLP” 212
“K&L Gates, LLP” 12
“K&L Gates, LLp” 1

Alias Number of Occurrences
“Hogan Lovells US LLP” 1459
“Hogan & Hartson LLP” 634
“Hogan Lovells US LLP f/k/a Hogan & Hartson LLP” 380
“Hogan Lovells f/k/a Hogan & Hartson LLP” 131
“Hogan Lovells US LLP f/k/a Hogan & Hartson LLP” 3

Alias Number of Occurrences
“Cornerstone Government Affairs, LLC” 2999

Alias Number of Occurrences
“Cassidy & Associates, Inc. formerly known as Cassidy & Associates “ 2215
“Cassidy & Associates, Inc.” 268
“Cassidy & Associates” 244
“Cassidy & Associates Inc.” 70
“Tiffany & Co. on behalf of Cassidy & Associates” 7
“Hospital for Special Surgery on behalf of Cassidy & Associates” 6
“College of New Rochelle, The on behalf of Cassidy & Associates” 6
“Claflin University on behalf of Cassidy & Associates” 6
“Hampton University on behalf of Cassidy & Associates” 5
“United States Tennis Association Inc. on behalf of Cassidy & Associates” 4
“National Acquarium in Baltimore, Inc. on behalf of Cassidy & Associates” 3
“Institute for Student Achievement on behalf of Cassidy & Associates” 3
“National Aquarium in Baltimore, Inc. on behalf of Cassidy & Associates” 2
“Cassidy & Associates, Inc.formerly known as Cassidy & Associates” 1

Alias Number of Occurrences
“Podesta Group, Inc.” 62

Alias Number of Occurrences
“ALCALDE & FAY” 2894
“Alcalde & Fay” 14

There are many interesting little tidbits in the above output. The value of the parser is easily seen as we look at all the variations of the names that pop up within the documents. Specifically, there is a phenomenon within disclosure forms where a third party will include itself within the name of the client, e.g. “Patton Boggs on behalf of Northrop Grumman Inc.” even though the lobbying firm that filed the form could be “Cassidy & Associates.” In general, the client name will be something like “Firm A on behalf of Client A” while the registrant will be neither “Firm A” nor “Client A.” This is a common pattern and the parser and annotator account for it. There are several theories about what these disclosed “on behalf of” relationships mean. The most believable one is that the disclosing firms hire these other firms to lobby on behalf of their clients in areas where the disclosing firm is weak. The clients get a wider range of expertise and, perhaps more importantly, clients don’t have to go through the trouble of coordinating with more than one lobbying firm directly. This seems like a reasonable explanation, but these relationships admittedly deserve more scrutiny than I’ve been able to give them.

In some rare instances the grammar of the disclosure gets messed up though. While I’m not so into linguistic prescription, it seems like “Entity A on behalf of Entity B” usually means that “Entity A” undertook some work for the benefit of “Entity B” and not that “Entity B” undertook some work for the benefit of “Entity A.” However, as evidenced above, sometimes form fillers will flip the entity positions within the “on behalf of” statement. This confuses the automated annotator. That’s why “Hospital for Special Surgery on behalf of Cassidy & Associates” is resolved to be the same entity as just “Cassidy & Associates.” There is a potential solution to this problem involving more information and a more complicated annotation process, but this issue only occurred a handful of times and thus didn’t feel like it was within the scope of the current project.
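The post leaves that "potential solution" unspecified. One guess at what the extra information could be is a set of names already known to be lobbying firms: if the party after “on behalf of” is itself a known firm, the field was probably filled in backwards. The sketch below assumes exactly that; `split_on_behalf`, `looks_flipped` and the `known_firms` set are hypothetical names, not part of ECHELON.

```python
import re

# Matches the phrase regardless of case, swallowing surrounding whitespace.
ON_BEHALF = re.compile(r"\s+on behalf of\s+", re.IGNORECASE)


def split_on_behalf(client_field):
    """Return (acting_party, beneficiary); beneficiary is None if absent."""
    parts = ON_BEHALF.split(client_field.strip(), maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    return parts[0], None


def looks_flipped(client_field, known_firms):
    """Flag fields whose named beneficiary is a known lobbying firm.

    known_firms is assumed to be a set of lowercased firm names.
    """
    _, beneficiary = split_on_behalf(client_field)
    return beneficiary is not None and beneficiary.lower() in known_firms


known_firms = {"cassidy & associates"}
field = "Hospital for Special Surgery on behalf of Cassidy & Associates"
print(split_on_behalf(field))
print(looks_flipped(field, known_firms))
```

A flagged field could then be swapped before resolution, or simply set aside for a human to review, since the problem only shows up a handful of times.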

We can see from this that Patton Boggs occurs most often! Neat. What sorts of activities does Patton Boggs undertake for its clients? Part of the disclosure process is that Patton Boggs must break down what they do into specific issue codes representing the areas that any lobbying activity can fall under. Thus, here is a list of lobbying codes and the number of times Patton Boggs has undertaken a lobbying activity with that code during a quarter on behalf of itself or a client.

Issue Code Number of Occurrences
“Budget/Appropriations” 1679
“Transportation” 965
“Health Issues” 908
“Taxation/Internal Revenue Code” 593
“Medicare/Medicaid” 550
“Urban Development/Municipalities” 399
“Homeland Security” 393
“Energy/Nuclear” 351
“Housing” 332
“Financial Institutions/Investments/Securities” 332
“Defense” 290
“Telecommunications” 284
“Aviation/Aircraft/Airlines” 272
“Education” 257
“Economics/Economic Development” 247
“Law Enforcement/Crime/Criminal Justice” 236
“Natural Resources” 227
“Labor Issues/Antitrust/Workplace” 226
“Environmental/Superfund” 222
“Trade (Domestic & Foreign)” 163
“Government Issues” 148
“Indian/Native American Affairs” 127
“Agriculture” 115
“Insurance” 113
“Clean Air & Water (Quality)” 110
“Retirement” 106
“Consumer Issues/Safety/Protection” 99
“Disaster Planning/Emergencies” 81
“Communications/Broadcasting/Radio/TV” 66
“Chemicals/Chemical Industry” 65
“Copyright/Patent/Trademark” 64
“Banking” 64
“Gaming/Gambling/Casino” 61
“Utilities” 59
“Food Industry (Safety, Labeling, etc.)” 59
“Pharmacy” 54
“Science/Technology” 53
“Roads/Highway” 53
“Manufacturing” 49
“Marine/Maritime/Boating/Fisheries” 46
“Travel/Tourism” 41
“Computer Industry” 41
“Tobacco” 40
“Small Business” 40
“Medical/Disease Research/Clinical Labs” 39
“Immigration” 38
“Foreign Relations” 38
“Railroads” 35
“Veterans” 33
“Sports/Athletics” 32
“Bankruptcy” 24
“Torts” 23
“District of Columbia” 19
“Real Estate/Land Use/Conservation” 18
“Fuel/Gas/Oil” 16
“Beverage Industry” 15
“Aerospace” 15
“Firearms/Guns/Ammunition” 12
“Automotive Industry” 10
“Family Issues/Abortion/Adoption” 9
“Accounting” 9
“Welfare” 8
“Advertising” 8
“Trucking/Shipping” 7
“Media (Information/Publishing)” 6
“Alcohol & Drug Abuse” 5
“Arts/Entertainment” 3
“Intelligence and Surveillance” 2
“Constitution” 2
“Commodities (Big Ticket)” 2

Whoa! No wonder Patton Boggs is the entity that shows up the most: they seem to be doing a little bit of everything. How neat. I wonder how things change with time, though. Does Patton Boggs have its bread and butter type lobbying activities or has it been a dynamic firm? Here are the top five activities for each of the past seven years for Patton Boggs:

2008 Number of Reports
“Budget/Appropriations” 281
“Transportation” 127
“Health Issues” 105
“Taxation/Internal Revenue Code” 76
“Medicare/Medicaid” 72

2009 Number of Reports
“Budget/Appropriations” 308
“Transportation” 167
“Health Issues” 149
“Taxation/Internal Revenue Code” 95
“Medicare/Medicaid” 89

2010 Number of Reports
“Budget/Appropriations” 296
“Health Issues” 198
“Transportation” 152
“Taxation/Internal Revenue Code” 113
“Medicare/Medicaid” 101

2011 Number of Reports
“Budget/Appropriations” 254
“Transportation” 150
“Health Issues” 134
“Taxation/Internal Revenue Code” 92
“Medicare/Medicaid” 74

2012 Number of Reports
“Budget/Appropriations” 224
“Transportation” 158
“Health Issues” 122
“Medicare/Medicaid” 82
“Taxation/Internal Revenue Code” 79

2013 Number of Reports
“Budget/Appropriations” 184
“Transportation” 122
“Health Issues” 119
“Medicare/Medicaid” 83
“Taxation/Internal Revenue Code” 76

2014 Number of Reports
“Budget/Appropriations” 132
“Transportation” 89
“Health Issues” 81
“Taxation/Internal Revenue Code” 62
“Medicare/Medicaid” 49
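
For the curious, a per-year ranking like the ones above could be computed with a few lines of Python. This is only a minimal sketch and not ECHELON's actual query: the function name top_issues_by_year and the (year, issue) row layout are assumptions made for illustration, and the sample rows are made up.

```python
from collections import Counter, defaultdict

def top_issues_by_year(reports, n=5):
    """Return {year: [(issue, count), ...]} with the n most reported issues per year.

    reports: iterable of (year, issue) pairs, one pair per issue listed on a report.
    """
    counts = defaultdict(Counter)
    for year, issue in reports:
        counts[year][issue] += 1
    # most_common(n) sorts each year's issues by descending report count.
    return {year: counter.most_common(n) for year, counter in sorted(counts.items())}

# Hypothetical sample rows, NOT real disclosure data.
sample_reports = [
    (2008, "Budget/Appropriations"),
    (2008, "Budget/Appropriations"),
    (2008, "Transportation"),
    (2009, "Health Issues"),
    (2009, "Budget/Appropriations"),
    (2009, "Budget/Appropriations"),
]

ranking = top_issues_by_year(sample_reports, n=2)
for year, issues in ranking.items():
    print(year, issues)
```

Keeping one Counter per year means the whole ranking takes a single pass over the rows, which matters when there are hundreds of thousands of forms.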

So it seems that Patton Boggs does have its standard issues that it hits every year, with very little movement in the ranking of issues each year. A solid firm then, a stoic firm one might say, a firm that knows what it is good at and sticks to its guns. Good on you Patton Boggs, good on you. Now, this is not the limit of what is possible with ECHELON at all. There is a whole rabbit hole of queries and results that we could disappear into. Every which way I turn when touching the data, new questions arise, and they can quickly become overwhelming. This post is only meant to introduce and briefly explain ECHELON and its capabilities, so let’s focus on one particular type of query to wrap everything up.

A Comedy of Errors

Back when I was young and naive, i.e. three months ago, I had great faith in the identifiers that the house and senate gave to each registrant and client. You see, registrants are the ones who are actually filling out and filing the forms that I’ve been analyzing. Every registrant, which typically means every lobbying firm, must register that they are going to lobby on their clients’ behalf. Then, each quarter, the registrants file a report on behalf of their clients disclosing the activities they undertook. The house and the senate give each client and firm pair a unique identifier to use when filing the forms. While these identifiers aren’t terribly useful by themselves, they could potentially make it easier to link up all the activities that firms undertook for clients across time. Early on, I was advised by colleagues to look into how reliable the identifiers were. After some rough experiments, it seemed that lobbyists made enough mistakes when entering the identifiers that correcting them all by hand was possible but would be neither enjoyable nor productive. I decided to ignore the government issued identifiers for a while, if I could get by without them.

Obviously, ECHELON has been a success without using the government issued identifiers. Moreover, ECHELON can tell us exactly how much of a problem these government issued identifiers would have posed if I tried to use them. By relying on only the automated annotation, we can easily find mistakes that lobbyists made when entering in the identifiers on the forms.

First off, form fillers don’t seem to make mistakes when entering in the senate issued identifier. We’ve checked and the senate identifier is apparently used to log into the disclosures systems for both the senate and the house. Thus, these forms cannot be uploaded and still have the senate id wrong. This was a surprising and encouraging find!

The house identifier did not fare nearly so well. I was able to find a couple dozen serious uncorrected mistakes that were made when entering in the house id. At first blush, fourteen mistakes out of over 500,000 forms filled out is a pretty decent track record. However, this number is a lower bound on the number of mistakes that have been made and gives no indication as to the actual number of mistakes. If I ran a better query to find mistakes, found better techniques for annotation, or caught an unknown mistake I was making in my current code, the number of found mistakes in the house identifiers column could sharply increase.

What distresses me most about these mistakes is that they occur in the field where precision matters the most. A firm can forget to include an activity in the disclosures, fudge the numbers on how much they were paid, even misspell the name of their client and it would be fine. That sort of thing doesn’t really matter all that much. Identifiers matter because they are meant to precisely identify an organization and there is no room for error. By making any mistake at all when entering in the identifier, no matter how small the mistake may be, lobbying firms effectively negate the entire purpose of the field in the first place. I’d rather have them leave it blank than put in nonsense.

Putting the rants of a young pedantic wonk aside, there are two types of mistakes that I’ve found so far when firms are filling out disclosure forms. The first is just a simple typo where something like “1001” becomes something like “10001” or “2001.” The majority of the mistakes found were just typos. These mistakes aren’t terribly interesting, so let’s just look at two examples of them.
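
One way to make "simple typo" concrete is to count how many single-character edits separate the entered identifier from the expected one. The sketch below uses the standard Levenshtein edit distance. It is an illustration rather than the check ECHELON actually runs, and the names edit_distance and looks_like_typo, along with the one-edit threshold, are assumptions of mine.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

def looks_like_typo(entered: str, expected: str, max_edits: int = 1) -> bool:
    """Treat an identifier as a typo if it is wrong by at most max_edits edits."""
    return 0 < edit_distance(entered, expected) <= max_edits

print(edit_distance("1001", "10001"))  # an extra zero is one insertion
print(edit_distance("1001", "2001"))   # a wrong first digit is one substitution
```

Both examples from the paragraph above come out at a distance of one, which is why a small threshold is a reasonable starting point for separating typos from stranger mistakes.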

“Process Handler et al.” registered that it would be lobbying on behalf of “Mr. Cie Sharp” in early 2007. All throughout 2008, the activities undertaken by the firm on behalf of the client were disclosed with the house identifier “363570022” (as evidenced by the Q1, Q2, Q3 and Q4 reports). However, the house identifier “362570022” was used on the Q1 report for 2009. After that, the reports switched back to using “363570022” until the relationship was terminated at the end of 2009 (Q2, Q3, Q4, Q4 termination).
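
A case like this one can be caught mechanically once the reports are grouped by firm and client: whichever identifier a pair uses most often is treated as the usual one, and anything else gets flagged. The following is a minimal sketch under that assumption, not ECHELON's real code. The function name flag_unusual_ids and the tuple layout are hypothetical, and the rows simply restate the eight quarterly reports described above (the termination report is left out).

```python
from collections import Counter, defaultdict

def flag_unusual_ids(filings):
    """Flag filings whose house ID differs from the pair's most common one.

    filings: iterable of (registrant, client, report_label, house_id).
    Returns a list of (registrant, client, report_label, house_id, usual_id).
    """
    by_pair = defaultdict(list)
    for registrant, client, report_label, house_id in filings:
        by_pair[(registrant, client)].append((report_label, house_id))

    flagged = []
    for (registrant, client), rows in by_pair.items():
        # The most frequently used identifier is assumed to be the correct one.
        usual_id = Counter(house_id for _, house_id in rows).most_common(1)[0][0]
        for report_label, house_id in rows:
            if house_id != usual_id:
                flagged.append((registrant, client, report_label, house_id, usual_id))
    return flagged

FIRM, CLIENT = "Process Handler et al.", "Mr. Cie Sharp"
filings = [
    (FIRM, CLIENT, "2008 Q1", "363570022"),
    (FIRM, CLIENT, "2008 Q2", "363570022"),
    (FIRM, CLIENT, "2008 Q3", "363570022"),
    (FIRM, CLIENT, "2008 Q4", "363570022"),
    (FIRM, CLIENT, "2009 Q1", "362570022"),
    (FIRM, CLIENT, "2009 Q2", "363570022"),
    (FIRM, CLIENT, "2009 Q3", "363570022"),
    (FIRM, CLIENT, "2009 Q4", "363570022"),
]

flagged = flag_unusual_ids(filings)
print(flagged)
```

The majority-vote assumption breaks down when a pair has only a handful of reports, so anything flagged this way still deserves a look by hand.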

“Hoffman, Silver, Gilman & Blasco P.C. (formerly known as Robertson, Monagle & Eastuagh [sic])” has had a long multi-year relationship with the “Alaska Forest Association.” The relationship between the two entities is typically disclosed with the house identifier “306260000.” In the fourth quarter of 2011 though, three separate reports were filed to detail this relationship. None of them are amendments to the others, all of them spell the client’s name wrong and two of them use the wrong house identifier (the incorrect “306250000” and “306260005” vs. the correct “306260000”). Strange.

Moving beyond typos, there were two cases of general incompetence. “Keevican Weiss Bauerle & Hirsch, LLC” has lobbied for “TriState Capital Bank” under the identifier “405970000” since 2009. Although they eventually settled in with using the correct identifier, the fourth quarter of 2009 saw that same pattern of three different disclosures, none of them amendments, with two of the disclosures using the wrong identifiers. The first mistake was a simple typo where an extra zero was included at the beginning of the identifier. The second mistake seems nonsensical though; there isn’t a simple way of getting from “405970000” to “408550000” without making at least three typos. If we look for other relationships which have the same identifier though, we see that “Keevican Weiss Bauerle & Hirsch, LLC” also does work for “C & S Patient Education Foundation dba Conquer Chiari” and, surprise, that relationship has the “40885000” identifier.
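
The "at least three typos" claim is easy to check by lining the two nine-digit identifiers up and counting the positions where they disagree. Here is a small sketch of that comparison; the helper name differing_positions is my own, and it assumes the two strings are the same length.

```python
def differing_positions(a: str, b: str) -> list:
    """Return the 0-based positions at which two equal-length identifiers differ."""
    if len(a) != len(b):
        raise ValueError("identifiers must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# The usual identifier versus the one that showed up on the odd report.
diffs = differing_positions("405970000", "408550000")
print(len(diffs), diffs)
```

Three mismatched digits sitting right next to each other is a very different pattern from one stray keystroke, which is what makes this look like a borrowed identifier instead of a typo.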

“The Susquehanna Group” has lobbied for “The Corps Network” for about a decade now. This relationship typically uses the identifier “358530003”. One time they made a typo, who cares, but another time they did something odd. I think they just made up a house identifier and used that instead. This report uses “200052379” as the house identifier. That’s more than a few typos and there doesn’t seem to be any other client of anyone who has ever used that identifier. So, as far as I can tell, during one quarter “The Susquehanna Group” just made up an identifier and decided to use that instead. Very strange.
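
Telling a borrowed identifier apart from an invented one comes down to a lookup: has any other firm and client pair ever filed under it? The sketch below shows one way that lookup might work. It is an assumption-laden illustration, not ECHELON's implementation: build_id_index and other_users are hypothetical names, and the "Example Firm LLC" rows are invented to show the borrowed case next to the Susquehanna identifiers quoted above.

```python
from collections import defaultdict

def build_id_index(filings):
    """Map each house identifier to the set of (registrant, client) pairs that filed under it."""
    index = defaultdict(set)
    for registrant, client, house_id in filings:
        index[house_id].add((registrant, client))
    return index

def other_users(index, house_id, registrant, client):
    """Return every relationship, other than the given one, that has used house_id."""
    return index.get(house_id, set()) - {(registrant, client)}

filings = [
    # Identifiers quoted in the text above.
    ("The Susquehanna Group", "The Corps Network", "358530003"),
    ("The Susquehanna Group", "The Corps Network", "200052379"),
    # Hypothetical rows: a firm files for Client A under Client B's identifier.
    ("Example Firm LLC", "Client A", "111110000"),
    ("Example Firm LLC", "Client B", "222220000"),
    ("Example Firm LLC", "Client A", "222220000"),
]

index = build_id_index(filings)
made_up = other_users(index, "200052379", "The Susquehanna Group", "The Corps Network")
borrowed = other_users(index, "222220000", "Example Firm LLC", "Client A")
print(made_up)   # empty: nobody else has ever used this identifier
print(borrowed)  # the relationship the identifier really belongs to
```

An empty result only shows that nothing in the data at hand matches, so "made up" remains a guess even when the lookup comes back empty.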


This has been a terribly long way to explain something that most anyone who has ever worked with raw lobbying disclosure forms has discovered: lobbying disclosure forms are awful in a variety of astounding and disappointing ways. The disclosure forms provide very little information about what is actually going on and the information that is provided is on par with secondhand gossip at best. Only by leveraging a fair amount of technical resources and techniques could these forms be processed and turned into something useful. In a way, we’ve shown how ECHELON bootstraps itself out of nothingness and into the Lobbying Form Typo Limelight. ECHELON needs to exist because look at what ECHELON has already had to do to exist! In all seriousness though, the ECHELON project has been a success that shows the power and potential of automated annotation systems. Just as the earth ever so patiently applied pressure and force on the excrement of long forgotten herbivores to create the fuel that powers our modern day economy, we too can apply annotation techniques and hard work to lobbying disclosure data and create something that can further our understanding of the modern political landscape.


Today in #OpenGov 9/16/2014

Keep reading for today’s look at #OpenGov news, events, and analysis, including Inspector General publicity, a not-so-transparent law in Spain, and lots of money for government tech start ups. 

National News

  • Inspectors General often have trouble getting their reports heard over the din of Washington. Luckily, the Special Inspector General for Afghanistan Reconstruction runs a robust PR operation and believes in sharing tips with other watchdogs. (POGO)
  • The third round of Presidential Innovation Fellows has been announced. (The White House)
  • The GOP is looking increasingly likely to take the Senate in November and K Street is already preparing. A group of Republican Senators met with major lobbyists and donors for a strategy session yesterday morning. (POLITICO)
  • Fitbit, the personal health tracking company, is stocking up its Washington pantry after Senator Chuck Schumer (D-NY) questioned the privacy practices of it and similar companies. (National Journal)

International News

  • Spain is set to launch its transparency law in three months but, despite calls from civil society, the government has not been forthcoming with any guidance as to how it will be implemented. (Access-Info)

State and Local News

  • The new GovTech venture capital fund will focus on funding software and hardware start ups building tools aimed at state, local, and national governments. (Fast Company Exist)
  • Manhattan Borough President Gale Brewer is tackling civic technology problems with open data and students in the City University of New York’s Service Corps program. The CUNY students will learn how to mine the Big Apple’s data portal. (Gotham Gazette)
  • Predictive analysis is helping cities predict and move to ameliorate public health crises before they happen. (Harvard Business Review)

Do you want to track transparency news? You can follow the progress of relevant bills, court cases, and regulations using Scout. You can also get Today in #OpenGov sent directly to your preferred news reader. If you would like to suggest an event, please email mrumsey@sunlightfoundation.com by 7 am on the Monday prior to the event.


Landscape Architecture in the News Highlights (September 1–15)

Design Profile: Q&A with Marcel Wilson of Bionic Landscape Architecture – The San Francisco Chronicle, 9/2/14 “Marcel Wilson, the principal of San Francisco-based Bionic Landscape Architecture, sees every project as a possibility for invention.” Grand Park Benefits Made in America, but Is the Reverse True? – The Los Angeles Times, 9/2/14 “Luckily, even as concertgoers were […] [...]