Project Description
A Website Search Engine and Crawler written in C#.

This web crawler can be used to index entire websites. For each page, it saves the URL, title, language, meta description, MIME type, charset, complete HTML, and extracted text in a database.
The crawler also fetches PDF, Word, Excel, and PowerPoint files and extracts the text from these file types.
The index is stored in a Microsoft SQL Server database. A SQL script that creates the database is included in the download.
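
As a rough illustration of what one index entry holds, the per-page fields listed above could be modeled as shown below. This is only a sketch: the class name `IndexedPage`, the property names, and the sample values are assumptions made for this example and are not taken from the project's source or from its database script.

```csharp
using System;

// Hypothetical model of one indexed page. The properties mirror the fields
// named in the description; the real schema is defined by the SQL script
// in the download and may differ.
public sealed class IndexedPage
{
    public string Url { get; set; } = "";
    public string Title { get; set; } = "";
    public string Language { get; set; } = "";
    public string MetaDescription { get; set; } = "";
    public string MimeType { get; set; } = "";
    public string Charset { get; set; } = "";
    public string Html { get; set; } = "";          // complete HTML of the page
    public string ExtractedText { get; set; } = ""; // plain text used for searching
}

public static class Example
{
    // Builds a sample entry with made-up values.
    public static IndexedPage Build()
    {
        return new IndexedPage
        {
            Url = "http://example.com/",
            Title = "Example page",
            Language = "en",
            MetaDescription = "A sample page.",
            MimeType = "text/html",
            Charset = "utf-8",
            Html = "<html><head><title>Example page</title></head><body>Hello</body></html>",
            ExtractedText = "Hello"
        };
    }

    public static void Main()
    {
        IndexedPage page = Build();
        Console.WriteLine(page.Url + " (" + page.MimeType + ")");
    }
}
```

For a PDF or Office document, the same shape would presumably apply, with the MIME type identifying the file type and the extracted text holding the document's contents.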

A sample showing how to search the index is also included.

To get started, see the Getting started guide.

Last edited Jun 8, 2012 at 10:08 AM by esben2000, version 3