Crawling a billion web pages in just over 24 hours, in 2025

1 comment

Nice work, but I feel like AWS isn't required for this. There are small hosting companies with specialized servers (50 Gbit shared medium for under $10); you could probably do this for under $100 with some optimization.

This. AWS is like a cash furnace, only really usable for VC-backed efforts with more money than sense.