pavithirakc Answers: 0

How can I dynamically scrape data from different websites based on input?


I am trying to build a system that, given an input, returns specific relevant information about it by scraping the web (for example, given a software name, it outputs information about that software's releases).

How should I go about building a scraper for such a system?


What I have tried:

I have done web scraping before using Beautiful Soup, but only to get information from a single, specific website.

In this case, I might have to scrape dynamically built URLs (such as the wiki page of the input software, or official product pages shown in Google search results), and different software websites and wikis structure their release data differently. Are there any other approaches to getting such information about different software in a structured way?

Richard MacCutchan

Exactly the same as scraping a single site, but you will need to get a list of website addresses from somewhere.
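
One way to read that suggestion, as a rough sketch only: build the candidate addresses from the input name, then pick a site-specific parser by host name. Everything named here is an assumption for illustration: the `URL_TEMPLATES` entries, the `example.com` product page, and the guess that releases sit in `<td>` or `<li>` elements. Fetching the pages is left out, and the standard-library parser is used only to keep the sketch self-contained; the parser functions could just as well use Beautiful Soup.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urlparse

# Hypothetical URL templates; each real site has to be checked by hand.
URL_TEMPLATES = [
    "https://en.wikipedia.org/wiki/{name}",
    "https://example.com/products/{name}/releases",
]


def candidate_urls(software_name):
    """Build the list of addresses to try for one input name."""
    slug = quote(software_name.strip().replace(" ", "_"))
    return [template.format(name=slug) for template in URL_TEMPLATES]


class _TextCollector(HTMLParser):
    """Collects the text inside every element with the given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self._depth = 0
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self._depth += 1
            if self._depth == 1:
                self.items.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.items[-1] += data


def _collect(html, tag):
    parser = _TextCollector(tag)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.items if text.strip()]


def parse_wiki(html):
    # Assumption: the wiki page lists releases in table cells.
    return _collect(html, "td")


def parse_product_page(html):
    # Assumption: the product page lists releases as list items.
    return _collect(html, "li")


# One parser per host; a new site means one new entry here.
EXTRACTORS = {
    "en.wikipedia.org": parse_wiki,
    "example.com": parse_product_page,
}


def extract_releases(url, html):
    """Dispatch to the parser registered for the URL's host, if any."""
    extractor = EXTRACTORS.get(urlparse(url).netloc)
    return extractor(html) if extractor else []
```

The per-site work does not go away with this layout; it only moves into the `EXTRACTORS` table, so each site you support still needs its own parser written against that site's markup.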

0 Answers