
A First Look at Profile Pages

Ryan Pivovar / June 26, 2023

Since April, we've been collecting congressional bill data, congressmember data, and voting records. Today, I'm happy to share a first look at profile pages. A lot of the data we've been collecting will eventually be on these pages.

A quick update on our data collection efforts:

  • We've collected most U.S. House of Representatives voting records since 2019. Nearly a million individual votes!
  • We're using Apache Hive to perform massive bulk updates to clean this data. The work is in place to collect U.S. Senate records once the current House data is in a clean state.
  • We've downloaded over 17,000 Creative Commons-licensed images of U.S. Congressmembers. We want to use the best of them, so only a fraction of these images will actually make it to the site.

Collecting all of this data has helped us build out an important set of pages for our site: profile pages.

Profile pages let you easily see what national and local politicians have historically supported. For congressmembers specifically, profile pages are where you can find every individual vote that the congressmember has cast while in Congress.

You can see a live example of the profile page below at bythetopics.com/people/al-lawson.

There are a few important details here:

  • All images we've gathered use Creative Commons licenses, and we provide linked attribution on every image.
  • A short description is provided on each profile page. This will be helpful for screen reader users, but there are likely many improvements we can make here.
  • By default, we hide the party that the politician is affiliated with. More on this below.
  • Every profile page will feature a complete table of all of the votes that person has cast while in Congress. Currently we dump the entire list on the page, but we'll paginate this soon.

If you visited the link above, you may have also noticed that the page is fast despite containing a non-paginated table of over a thousand voting records. We've developed a build process for our site that makes each page as static as possible. We pre-fetch all data unique to any given route and bake it right into the page. We want users to find what they're looking for as quickly as possible.
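As a rough illustration of the "pre-fetch and bake" idea (a sketch with hypothetical names, not the site's actual build code), a build step might fetch each person's votes once and write a finished HTML file per route. The `fetch_votes` callback, the dictionary keys, and the output layout are all assumptions made for this example.

```python
from html import escape
from pathlib import Path


def render_profile(person: dict, votes: list) -> str:
    """Render one profile page as a complete static HTML string."""
    rows = "\n".join(
        f"<tr><td>{escape(v['bill'])}</td><td>{escape(v['position'])}</td></tr>"
        for v in votes
    )
    return (
        "<!doctype html>\n"
        f"<title>{escape(person['name'])}</title>\n"
        f"<h1>{escape(person['name'])}</h1>\n"
        f"<table>\n{rows}\n</table>\n"
    )


def build_site(people: list, fetch_votes, out_dir: Path) -> list:
    """Fetch each route's data at build time and write one static page per person."""
    written = []
    for person in people:
        # The fetch happens here, during the build, not when a visitor loads the page.
        votes = fetch_votes(person["slug"])
        page = out_dir / "people" / person["slug"] / "index.html"
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(render_profile(person, votes), encoding="utf-8")
        written.append(page)
    return written
```

Because every page is a plain file by the time it is served, a long table adds to the size of the file but not to the work done per request.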

We currently have about 200 profile pages with voting records. In the next month, we should have profile pages with complete voting records for all active congressmembers.

We believe this will be the easiest method on the web to see every single vote that a congressmember has cast.

By default, we hide the party that the politician is affiliated with. Ultimately, the person's policies should be the focus of the profile page.

There are still some major challenges that need to be figured out for profile pages. In reality, congressmembers cast hundreds of votes each year. This is a massive amount of data, and a lot of it will be too "low-level" for users of the site.

We want our users to understand a politician's voting record in less than five minutes, and a raw list of every vote doesn't allow that yet.

We plan on adding some enhancements like in this example profile page below.

With all the voting records we have on each congressmember, we can determine "biases" that each politician has on a variety of topics. We can make these determinations with some confidence by using natural language processing. By analyzing bill text, we're able to determine what topics a bill covers, and we can determine politicians' biases by how they vote on these topics.

For instance, based on Bradley Barkus' voting records, we can determine that Bradley votes against bills which are about "Mail Delivery Trucks" nearly 67% of the time.
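To make the arithmetic concrete, here is a minimal sketch of how a per-topic figure like that could be tallied. It assumes each bill has already been labeled with topics (the natural language processing step is not shown), and the function name and data shape are hypothetical.

```python
from collections import defaultdict


def against_share_by_topic(votes):
    """Return {topic: fraction of yea/nay votes on that topic that were nay}.

    Each vote is a (topics, position) pair: topics is a list of topic labels
    already assigned to the bill, and position is "yea" or "nay". Any other
    position (for example "present") is ignored.
    """
    totals = defaultdict(int)
    against = defaultdict(int)
    for topics, position in votes:
        if position not in ("yea", "nay"):
            continue
        for topic in topics:
            totals[topic] += 1
            if position == "nay":
                against[topic] += 1
    return {topic: against[topic] / totals[topic] for topic in totals}


# Made-up sample data for illustration only.
sample_votes = [
    (["Mail Delivery Trucks"], "nay"),
    (["Mail Delivery Trucks"], "nay"),
    (["Mail Delivery Trucks", "Postal Service"], "yea"),
]
shares = against_share_by_topic(sample_votes)
```

With this sample, two of the three votes on "Mail Delivery Trucks" are against, so the share is 2/3, or nearly 67%.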

Using AI to extract meaning and intention is useful for producing confident results at scale, but we realize this is a challenging undertaking that will likely evolve over time. We want to make discovery of this information incredibly easy for users, but we also want to be as accurate as possible, and achieving both of these things will be difficult.

If there is not enough information for us to make a determination, we won't make one. Eventually, our system for making determinations will be "multi-modal" and will depend on a variety of heuristics.
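One simple way to express the "not enough information" rule is a minimum vote count below which no label is produced. The sketch below is only an illustration: the cutoff of 5 votes, the 60%/40% thresholds, and the label names are all hypothetical and do not come from the site.

```python
from typing import Optional

MIN_VOTES = 5  # hypothetical cutoff, chosen only for this example


def determine_bias(yeas: int, nays: int, min_votes: int = MIN_VOTES) -> Optional[str]:
    """Return "supports", "opposes", or "mixed", or None when data is too thin."""
    total = yeas + nays
    if total < min_votes:
        return None  # too few votes on this topic: make no determination
    share_against = nays / total
    if share_against >= 0.6:  # hypothetical threshold
        return "opposes"
    if share_against <= 0.4:  # hypothetical threshold
        return "supports"
    return "mixed"
```

Returning `None` rather than a default label keeps "we don't know" distinct from "mixed", so the page can simply omit topics that lack enough votes.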

Currently, we're focusing all of our efforts on the U.S. Congress, but eventually we'll expand them to state and local legislatures.

So, what do you think? Do you have any suggestions or questions for profile pages? Please give us your feedback at the form here.

- Ryan