- I think I have seen reports that AI scrapers create bandwidth bottlenecks for digital archives
- Some digital archives require you to create a research account (I believe Common Crawl works like that)
- The data can easily become very large. The goal is to store many things: we store not only the Internet, but the Internet with an additional dimension of time
- Because there is so much data, it is difficult to navigate and search, so an archive can easily become unusable
- That is, for example, why I created my own metadata database; I needed some information about domains
Link:
https://github.com/rumca-js/Internet-Places-Database
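A minimal sketch of what such a domain-metadata store could look like. The field names (`link`, `title`, `tags`, etc.) and the JSON Lines layout are assumptions for illustration; the actual Internet-Places-Database format may differ.

```python
import json

# Hypothetical schema for one domain entry; the real
# database's fields may differ.
entry = {
    "link": "https://github.com/",
    "title": "GitHub",
    "description": "Code hosting platform",
    "date_found": "2024-01-01",
    "tags": ["programming"],
}

# JSON Lines keeps a large dataset easy to append to
# and to stream through without loading it all at once.
with open("domains.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(entry) + "\n")

# Reading back: parse each line and filter by tag,
# so searching stays possible even as the file grows.
with open("domains.jsonl", encoding="utf-8") as f:
    hits = [e for e in map(json.loads, f) if "programming" in e["tags"]]

print(len(hits))
```

One record per line means the file can be grepped, split, or processed incrementally, which matters once the archive grows large.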