Data Mining is not Screen Scraping
Data Mining and Screen Scraping do not mean the same thing in computing. Some people may disagree, but they are two separate concepts and disciplines. In brief: Screen Scraping lets you collect specific information, whereas Data Mining lets you analyze information you already have.
Let’s explain in greater detail so the two disciplines can be understood better. The name Screen Scraping derives from the early days of computing. You might remember when a computer filled an entire room and monitors had black and white screens which only produced text.
Screen Scraping was a technique used back then to extract characters from the monitors so they could be analyzed. Computers are very different today, and the term ‘screen scraping’ now has a new meaning to go along with the advancing technology: it refers to extracting information from websites. This is done with specialist computer programs which can crawl or spider through websites, extracting specific data.
This process allows people to make comparisons, download text to spreadsheets, archive web pages and generally file and analyze any information they choose.
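As a rough illustration of extracting specific data from a page, here is a minimal sketch using only Python's standard library. The HTML snippet, the class names `product`, `name` and `price`, and the function `scrape_products` are all hypothetical, invented for this example; a real scraper would first download the page and would need to match that site's actual markup.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a website.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Kettle</span> <span class="price">24.99</span></li>
  <li class="product"><span class="name">Toaster</span> <span class="price">31.50</span></li>
</ul>
"""


class ProductScraper(HTMLParser):
    """Collects the text inside <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # field we are currently inside, if any
        self.rows = []             # one dict per <li class="product">

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})   # start a new record
        elif tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the fields we care about.
        if self.current_field and self.rows:
            self.rows[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape_products(html):
    """Return a list of (name, price) tuples found in the given HTML."""
    parser = ProductScraper()
    parser.feed(html)
    return [(row["name"], float(row["price"])) for row in parser.rows]


products = scrape_products(SAMPLE_HTML)
```

The resulting list of tuples could then be written to a spreadsheet or archived. Note that nothing here analyzes the data; the code only collects it.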
Data Mining, on the other hand, is the practice of automatically searching large stores of structured and unstructured data for patterns and trends. In practice this means you already have the data in house, and now you want to analyze it to glean useful information for your business or organization. Many businesses analyze data for marketing and to improve the overall performance of the company.
Data Mining often uses complex algorithms based on statistical methods. As you can see, Data Mining has nothing to do with how you got the data in the first place. It is only concerned with analyzing the data you already have available.
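To make the contrast concrete, here is a minimal sketch of one very simple kind of pattern search: counting which pairs of items appear together in the same order. This is a much reduced form of frequent-itemset analysis, not a full data mining algorithm. The order data and the function name `frequent_pairs` are hypothetical, made up for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical in-house data: each set is the contents of one customer order.
ORDERS = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def frequent_pairs(orders, min_support):
    """Return item pairs that occur together in at least min_support orders."""
    counts = Counter()
    for order in orders:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(order), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(ORDERS, min_support=2)
```

The code never asks where the orders came from. It starts from data that is already available and looks for patterns in it, which is the distinction the article draws.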
The problem is that sometimes people will search on the internet for anything that resembles Screen Scraping, because they don’t really know what the term means. There are other terms which mean much the same thing, such as Web Site Data Extraction, Text Data Mining, and Automated Data Collection. The various terminologies can be confusing (Data Mining versus Screen Scraping, for example), but it is important to use the terminology that people use. This article goes some way toward explaining the differences between the two concepts.