Web Crawling Model and Architecture

Vol-4 | Issue-02 | February 2019 | Published Online: 20 February 2019
Author(s)
Dr. Anish Gupta 1; Dr. K. B. Singh 2; Dr. R. K. Singh 3

1Department of Information Technology, B. R. Ambedkar Bihar University, Muzaffarpur

2PG Department of Physics, Samastipur College, Samastipur, LNMU, Darbhanga

3Department of EC and IT, MIT Muzaffarpur, Bihar

Abstract

Web crawlers [1] must concurrently contend with multiple objectives that can even contradict each other. They must re-visit known pages to keep fresh copies while simultaneously discovering new pages [3]. They must make use of limited infrastructure, such as network connectivity, without overwhelming web servers. They ought to fetch many "good" pages, yet they cannot specify in advance exactly which pages are the preferred ones.
This paper explores a paradigm that directly combines crawling with an existing search engine and, using customizable criteria, retrieves potential answers in order to cope with these conflicting objectives [2]. We explain how this model generalises several individual cases, and present a crawler software architecture that implements the model.
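As a purely illustrative sketch (not taken from the paper), the idea of a crawl loop driven by customizable criteria can be expressed as a priority-queue frontier where a caller-supplied `priority` function arbitrates between conflicting objectives such as re-visiting for freshness and discovering new pages. All names here (`crawl`, `fetch`, `priority`) are hypothetical:

```python
import heapq
import time


def crawl(seeds, fetch, priority, max_pages=100):
    """Generic crawl loop: a caller-supplied priority() function decides
    which URL to fetch next, so the same loop can serve different (even
    conflicting) objectives by swapping the criterion.
    This is an illustrative sketch, not the paper's actual architecture."""
    # Max-heap via negated priorities; ties break on the URL string.
    frontier = [(-priority(url), url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    store = {}  # url -> (fetch_time, outgoing links)
    while frontier and len(store) < max_pages:
        _, url = heapq.heappop(frontier)
        links = fetch(url)          # caller-supplied fetcher
        store[url] = (time.time(), links)
        for link in links:
            if link not in seen:    # new page discovered: schedule it
                seen.add(link)
                heapq.heappush(frontier, (-priority(link), link))
    return store
```

A freshness-oriented crawler would supply a `priority` that scores known pages by time since last fetch, while a discovery-oriented one would favour never-seen URLs; both reuse the same loop.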

Keywords
web crawler, network, search engine, software architecture.