FOCUS: LEARNING TO CRAWL WEB FORUMS

  • Unique Paper ID: 143665
  • Volume: 2
  • Issue: 12
  • PageNo: 448-452
  • Abstract: This paper describes Forum Crawler Under Supervision (FoCUS), a supervised web-scale forum crawler. The goal of FoCUS is to crawl relevant forum content from the web with minimal overhead. Forum threads contain the information that forum crawlers target. Although forums come in several styles and are powered by different forum software packages, they always have similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. Based on this observation, we reduce the web forum crawling problem to a URL-type recognition problem, using the techniques and methods described in the paper.

Copyright & License

Copyright © 2025 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{143665,
        author = {Bagam Rakesh and S Ravi Kiran and N. Swapna Suhasini},
        title = {FOCUS: LEARNING TO CRAWL WEB FORUMS},
        journal = {International Journal of Innovative Research in Technology},
        year = {},
        volume = {2},
        number = {12},
        pages = {448--452},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=143665},
        abstract = {This paper describes Forum Crawler Under Supervision (FoCUS), a supervised web-scale forum crawler. The goal of FoCUS is to crawl relevant forum content from the web with minimal overhead. Forum threads contain the information that forum crawlers target. Although forums come in several styles and are powered by different forum software packages, they always have similar implicit navigation paths, connected by specific URL types, that lead users from entry pages to thread pages. Based on this observation, we reduce the web forum crawling problem to a URL-type recognition problem, using the techniques and methods described in the paper.},
        keywords = {FOCUS, Forums, Crawler, Page Classification, URL Pattern Learning},
        month = {},
        }
