The Public Data Observatory’s datasets are prepared, verified, and validated by federal data subject matter experts. They are built on publicly available federal data sources, including Congressional and Presidential budget documents, HHS’s Tracking Accountability in Government Grants System (TAGGS), the HHS Employee Directory, and CDC Funding Profiles. Where possible, we automate the collection, cleaning, and validation of these data to improve accuracy and timeliness, and we publish all datasets as open-source files accompanied by data dictionaries, user guides, and full documentation of our methods. We use data linkage techniques to align information across sources: connecting appropriations to grant awards, workforce records to organizational units, and budget lines to program-level activity. The result is an integrated framework that shows how federal funding decisions translate into real-world impacts. We update all datasets regularly as new appropriations are enacted, budgets are proposed, or workforce changes occur, so that advocates, policymakers, and researchers have access to a current and complete picture of federal public health infrastructure.
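
To illustrate what linking appropriations to grant awards can look like in practice, here is a minimal Python sketch. It is not the Observatory's actual pipeline: the field names (`fiscal_year`, `program_code`, `enacted_usd`, `award_usd`), the join key, and every value are hypothetical and chosen only for illustration. The real datasets may use different identifiers, so consult the published data dictionaries for the actual fields.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only
# and are not taken from any published dataset.
appropriations = [
    {"fiscal_year": 2023, "program_code": "P-001", "enacted_usd": 1_000_000},
    {"fiscal_year": 2023, "program_code": "P-002", "enacted_usd": 500_000},
]
grant_awards = [
    {"fiscal_year": 2023, "program_code": "P-001", "award_usd": 250_000},
    {"fiscal_year": 2023, "program_code": "P-001", "award_usd": 100_000},
    {"fiscal_year": 2023, "program_code": "P-999", "award_usd": 75_000},
]


def link_awards_to_appropriations(appropriations, grant_awards):
    """Join awards to appropriations on (fiscal_year, program_code).

    Returns the appropriations with a total of matched awards attached,
    plus the list of awards that matched no appropriation.
    """
    # Sum award amounts per assumed linkage key.
    awarded = defaultdict(int)
    for award in grant_awards:
        awarded[(award["fiscal_year"], award["program_code"])] += award["award_usd"]

    linked = []
    known_keys = set()
    for approp in appropriations:
        key = (approp["fiscal_year"], approp["program_code"])
        known_keys.add(key)
        linked.append({**approp, "awarded_usd": awarded.get(key, 0)})

    # Awards whose key matches no appropriation are kept for manual review.
    unmatched = [
        a for a in grant_awards
        if (a["fiscal_year"], a["program_code"]) not in known_keys
    ]
    return linked, unmatched


linked, unmatched = link_awards_to_appropriations(appropriations, grant_awards)
```

Returning the unmatched awards separately, instead of dropping them silently, is a deliberate choice in this sketch: records that fail to link are often where source data disagree and where validation effort is best spent.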
