Guides
Task-oriented walkthroughs: how fb reads Facebook, pages and profiles, posts and comments, media, search, building datasets, and archiving.
Each guide is a focused walkthrough of one part of fb. Start with how fb reads Facebook to understand what it can and cannot see, since that shapes everything else.
- How fb reads Facebook: the anonymous crawler model and its limits.
- Pages and profiles: resolve entities and stream feeds.
- Posts and comments: one story, its thread, and reactions.
- Media: photos, videos, reels, and events.
- Search and discovery: find entities and classify ids.
- Datasets: seed, crawl, and query a local SQLite store.
- Archiving: mirror a Page to an incremental tree of Markdown.
How fb reads Facebook
fb reads the server-rendered pages Facebook serves to search engines, so it needs no login and no browser.
Pages and profiles
Resolve Pages, profiles, and groups to rich records, and stream their feeds.
Posts and comments
Resolve one post in full, walk its comment thread, and read the reaction breakdown.
Media
Stream a handle's photos, videos, and reels, resolve a single item, and list public events.
Search and discovery
Search across Facebook's surfaces, and classify any id or URL with the fb id command.
Datasets
Expand a root into URLs, crawl them into records, and store them in a local SQLite database you can query with SQL.
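As a rough illustration of the "query with SQL" step, the sketch below uses Python's standard sqlite3 module against an in-memory database. The posts table, its columns, and the sample rows are hypothetical stand-ins chosen for the example; the schema fb actually writes may differ, so check the Datasets guide before relying on any names here.

```python
import sqlite3


def top_pages(conn, limit=3):
    """Return (page, post_count) rows, most active page first."""
    return conn.execute(
        "SELECT page, COUNT(*) AS n FROM posts "
        "GROUP BY page ORDER BY n DESC, page ASC LIMIT ?",
        (limit,),
    ).fetchall()


# Hypothetical schema and data: stand-ins for whatever a crawl stores.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id TEXT PRIMARY KEY, page TEXT, created_at TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        ("1", "examplepage", "2024-03-01", "first post"),
        ("2", "examplepage", "2024-03-09", "second post"),
        ("3", "otherpage", "2024-04-02", "hello"),
    ],
)

rows = top_pages(conn)
```

With a real dataset, the same query would run against the database file on disk (passing its path to sqlite3.connect) once the table and column names are adjusted to match.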
Archiving
Save a Page's whole feed as a browsable, incremental tree of Markdown files: one file per post with its comments, indexed by month.
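To make "one file per post, indexed by month" concrete, here is a minimal sketch of how such a tree could be addressed. The directory layout (root/page/YYYY-MM/post_id.md), the function names, and the skip-if-present rule for incremental runs are all assumptions made for illustration, not fb's documented behavior.

```python
from datetime import datetime
from pathlib import PurePosixPath


def post_path(root, page, created_at, post_id):
    """Build a hypothetical archive path: root/page/YYYY-MM/post_id.md."""
    month = created_at.strftime("%Y-%m")
    return PurePosixPath(root) / page / month / f"{post_id}.md"


def pending(posts, existing):
    """Keep only posts whose file is not archived yet (incremental run)."""
    return [p for p in posts if str(post_path(*p)) not in existing]


# Hypothetical posts as (root, page, created_at, post_id) tuples.
posts = [
    ("archive", "examplepage", datetime(2024, 3, 9), "123"),
    ("archive", "examplepage", datetime(2024, 4, 2), "456"),
]
already_saved = {"archive/examplepage/2024-03/123.md"}

todo = pending(posts, already_saved)
```

Grouping files under a month directory keeps each folder small and lets a later run skip any post whose file already exists.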