Fourteen sources. One normalized feed.
Federal Register, agency RSS, regulator-specific PDFs — we handle the shape differences so you don't have to.
Why this matters.
Different regulators publish in completely different ways. The Federal Register is a structured JSON API. HUD publishes mortgagee letters as PDFs behind an HTML index. Fannie Mae lender letters are PDFs behind an Akamai CDN that blocks plain HTTP requests. Freddie Mac bulletins are on a JavaScript-rendered SPA. USDA Rural Development procedure notices are on a host that blocks requests from some cloud IP ranges entirely. Each source is its own ingest problem.
Every team that tries to aggregate this eventually hits the same wall: the easy sources work, but the hard ones (the Fannie Mae PDFs, the Freddie Mac SPA, the USDA IP block) demand specific tooling that doesn't generalize. We've built that tooling. Fourteen sources, none of them hand-waved.
How we do it.
Per-source fetchers
JSON API for the Federal Register. Plain fetch for RSS and static HTML indexes. Firecrawl for sources that sit behind bot protection or a JavaScript SPA. Gemini PDF extraction for regulators that publish body content only as PDFs. Each source gets the right tool, not the easiest tool.
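A minimal sketch of how that per-source routing could be expressed, assuming each source is described by an index strategy plus a flag for PDF-only bodies. The `Strategy`, `Source`, `SOURCES`, and `plan_steps` names are hypothetical and chosen for illustration; the sketch only plans the steps and makes no network calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Strategy(Enum):
    JSON_API = "json_api"        # structured API, e.g. the Federal Register
    PLAIN_FETCH = "plain_fetch"  # RSS feeds and static HTML indexes
    RENDERED = "rendered"        # bot-protected or JavaScript-rendered pages


@dataclass(frozen=True)
class Source:
    name: str
    index_strategy: Strategy
    body_is_pdf: bool  # True when the body text exists only as a PDF


# Hypothetical registry; entries mirror a few rows of the source table.
SOURCES = [
    Source("Federal Register", Strategy.JSON_API, body_is_pdf=False),
    Source("VA Circulars", Strategy.PLAIN_FETCH, body_is_pdf=True),
    Source("Fannie Mae LL / SEL / SVC", Strategy.RENDERED, body_is_pdf=True),
    Source("Freddie Mac Bulletins", Strategy.RENDERED, body_is_pdf=False),
]


def plan_steps(source: Source) -> List[str]:
    """Return the ordered ingest steps for one source."""
    steps = [f"index:{source.index_strategy.value}"]
    # PDF-only regulators need an extraction pass after the index fetch.
    steps.append("body:pdf_extraction" if source.body_is_pdf else "body:inline")
    return steps


for src in SOURCES:
    print(src.name, "->", plan_steps(src))
```

Keeping the strategy as data rather than branching on source names means a regulator that changes how it publishes needs a one-line registry change instead of a new code path.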
Cadence-aware scheduling
Every source has a known publication cadence — daily, weekly, monthly, or as-issued. Ingest runs match that cadence. We don't poll hourly for sources that publish monthly, and we don't wait a week for sources that publish daily.
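One way to picture cadence matching is a lookup from cadence to polling interval, checked before each run. The sketch below is illustrative only: the `POLL_INTERVAL` values (in particular the six-hour interval for as-issued sources and 30 days for monthly ones) and the `is_due` helper are assumptions, not the production schedule.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed polling intervals per cadence; real values would be tuned per source.
POLL_INTERVAL = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "as_issued": timedelta(hours=6),  # assumption: no fixed schedule, so check more often
}


def is_due(cadence: str, last_run: Optional[datetime], now: datetime) -> bool:
    """Return True when a source with this cadence should be ingested again."""
    if last_run is None:
        return True  # never ingested, so run immediately
    return now - last_run >= POLL_INTERVAL[cadence]


now = datetime(2024, 1, 10, 9, 0)
print(is_due("daily", datetime(2024, 1, 9, 9, 0), now))    # a full day has passed
print(is_due("monthly", datetime(2024, 1, 1, 9, 0), now))  # only nine days have passed
```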
Drift detection
Regulators quietly restructure their pages. When that happens, our parser can start returning nothing without throwing an error. We run a daily health check against every source: if a source hasn't produced new items in longer than expected, or if an ingest run has failed repeatedly, we get a Slack alert. You can see the current status of every source on the admin health page.
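The two alert conditions described above (a source that has been quiet for longer than expected, or an ingest that has failed repeatedly) can be sketched as a pure function over per-source health records. Everything named here is hypothetical: the `SourceHealth` fields, the `MAX_CONSECUTIVE_FAILURES` threshold of 3, and the alert wording are assumptions, and delivering the messages to Slack is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_CONSECUTIVE_FAILURES = 3  # assumed threshold for "failed repeatedly"


@dataclass
class SourceHealth:
    name: str
    last_new_item_at: Optional[datetime]  # None if nothing was ever ingested
    max_quiet: timedelta                  # longest expected gap between new items
    consecutive_failures: int


def find_alerts(sources: List[SourceHealth], now: datetime) -> List[str]:
    """Return one alert message per unhealthy source."""
    alerts = []
    for s in sources:
        if s.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            alerts.append(f"{s.name}: {s.consecutive_failures} consecutive failed runs")
        elif s.last_new_item_at is None or now - s.last_new_item_at > s.max_quiet:
            # A silent parser looks like success, so staleness is the only signal.
            alerts.append(f"{s.name}: no new items within expected window")
    return alerts


now = datetime(2024, 3, 1)
checks = [
    SourceHealth("CFPB newsroom", datetime(2024, 2, 29), timedelta(days=7), 0),
    SourceHealth("OCC Enforcement", datetime(2023, 12, 1), timedelta(days=45), 0),
    SourceHealth("Freddie Mac Bulletins", datetime(2024, 2, 20), timedelta(days=60), 4),
]
for message in find_alerts(checks, now):
    print(message)
```

The staleness window is per source because an as-issued regulator can legitimately go weeks without publishing, while a daily feed that goes quiet for a few days is almost certainly a broken parser.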
What's in the product today.
| SOURCE | CADENCE | NOTES |
|---|---|---|
| Federal Register (7 agencies) | Daily | CFPB, Fed, OCC, FDIC, FinCEN, HUD, FHFA via JSON API |
| CFPB newsroom + blog | Daily | RSS feed, ~25 items per refresh |
| HUD Mortgagee Letters | As issued | HTML index + Gemini PDF extraction |
| Ginnie Mae APMs | As issued | SharePoint index, body inline — no PDF dependency |
| VA Circulars | As issued | Static HTML index + Gemini PDF |
| USDA RD Procedure Notices | As issued | Firecrawl (IP-blocked host) + Gemini PDF |
| MPF Program Announcements | As issued | Static HTML index + PDF body via Gemini |
| Fannie Mae LL / SEL / SVC | As issued | Firecrawl (Akamai-protected) + Gemini PDF |
| Freddie Mac Bulletins | As issued | Firecrawl (JS-rendered SPA), body on detail page |
| CFPB Enforcement | As issued | Plain fetch, mortgage-filtered |
| HUD MRB Enforcement | As issued | HTML index + Gemini PDF, 100% mortgage-relevant |
| OCC Enforcement | Monthly | Monthly news release walker + Gemini PDF |
| FDIC Enforcement | As issued | Two-stage: press release detect + Firecrawl per-action |
| CSBS State Licenses | Monthly | 877 licenses across 54 jurisdictions, versioned |