Website User Query Privacy Preserving by Data Mining

Project Video

Project Description

15-August-2018

Introduction

With the rapid growth of digital text data on servers, researchers need to develop new algorithms to handle it. This work therefore focuses on one such issue: identifying keywords drawn from a collection of different documents. Many researchers have already worked in this area, but their work focuses only on content classification. In this work, not only the keywords but also the documents are identified and then classified. In some previous work, document classification relies on prior information about the content provider. This work overcomes that limitation by classifying the whole set of documents without any background information.

Proposed Work

Web mining is used in many kinds of data analysis, so the techniques available in this area need to be extended. This work contributes to text mining with a proposed method that classifies keywords and groups articles into categories without any prior knowledge of the individual articles. The proposed work does not require the input data to follow any particular format, such as speaker identification symbols or special characters; all processing is done by combining different text mining techniques.
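To make the idea of grouping articles without prior knowledge concrete, here is a minimal sketch in Python. It is not the project's MATLAB implementation: the stop-word list, the Jaccard similarity measure, the greedy grouping rule, and the names `keywords`, `jaccard`, and `group_articles` are all assumptions chosen for illustration.

```python
# Hypothetical sketch: group plain-text articles by keyword overlap,
# with no labels and no special input format (assumed method, not the author's).

STOPWORDS = frozenset({"the", "a", "of", "and", "in", "is", "to"})


def keywords(text):
    """Return the set of lowercase alphabetic words that are not stop words."""
    return {w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_articles(articles, threshold=0.3):
    """Greedily assign each article to the most similar existing group.

    A new group is opened when no group reaches the similarity threshold.
    Returns a list of groups, each a list of article indices.
    """
    groups = []  # each group: {"keywords": set, "members": [indices]}
    for index, text in enumerate(articles):
        words = keywords(text)
        best, best_score = None, threshold
        for group in groups:
            score = jaccard(words, group["keywords"])
            if score >= best_score:
                best, best_score = group, score
        if best is None:
            groups.append({"keywords": set(words), "members": [index]})
        else:
            best["members"].append(index)
            best["keywords"] |= words
    return [g["members"] for g in groups]
```

For example, calling `group_articles` on two text mining articles and one sports article places the first two in one group and the third in its own group, using only the words in the articles themselves.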

Problem Identification

Web search for documents is done using a multi-keyword technique. Frequent words are arranged in a tree data structure whose number of levels grows with the number of keywords.
Each term and document has a separate path, so the number of paths increases as the number of documents increases, which makes adding a new document quite complex.
Finding a new document therefore requires recursive steps over other related documents, which is quite time-consuming.
Text documents need privacy during comparison, which can be improved by replacing each word with its comparative value.
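The points above can be sketched as a small keyword tree in Python. This is an illustrative sketch only: the source does not say how the "comparative value" of a word is computed, so a keyed hash (HMAC-SHA256) is assumed here, and the names `word_value` and `KeywordTree` are hypothetical. Each level of the tree holds one keyword of a sorted keyword path, and each distinct keyword set gets its own path.

```python
import hashlib
import hmac


def word_value(word, key):
    """Replace a word with a keyed-hash value (assumed encoding, not from the source)."""
    digest = hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


class KeywordTree:
    """Tree in which level k holds the k-th encoded keyword of a sorted path."""

    def __init__(self, key):
        self.key = key
        self.root = {"children": {}, "docs": set()}

    def _path(self, keywords):
        # Sort the encoded values so the same keyword set always gives the same path.
        return sorted({word_value(w, self.key) for w in keywords})

    def add(self, doc_id, keywords):
        """Store a document at the end of its keyword path, creating nodes as needed."""
        node = self.root
        for value in self._path(keywords):
            node = node["children"].setdefault(value, {"children": {}, "docs": set()})
        node["docs"].add(doc_id)

    def search(self, keywords):
        """Return documents stored at exactly this keyword set (exact match only)."""
        node = self.root
        for value in self._path(keywords):
            node = node["children"].get(value)
            if node is None:
                return set()
        return set(node["docs"])

    def count_paths(self):
        """Count nodes that hold at least one document, i.e. distinct stored paths."""
        stack, total = [self.root], 0
        while stack:
            node = stack.pop()
            if node["docs"]:
                total += 1
            stack.extend(node["children"].values())
        return total
```

Because only encoded values are stored, the tree never holds the plain words, and `count_paths()` shows how the number of paths grows as documents with new keyword sets are added.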

Testing Parameters

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F-Measure = 2 * Precision * Recall / (Precision + Recall)

In the formulas above, a true positive (TP) is counted when an article is relevant to the user query and the system also ranks it as relevant. A false negative (FN) is counted when an article is relevant to the user query but the system does not rank it in its list. A false positive (FP) is the reverse case: the system ranks an article that is not relevant to the user query.
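As a worked illustration, the three measures can be computed from the list of articles the system ranked and the set of articles that are actually relevant to the query. The sketch below is in Python rather than the project's MATLAB code, uses the standard definitions of TP, FP, and FN, and the function name `evaluate` and its arguments are assumptions for the example.

```python
def evaluate(ranked, relevant):
    """Return (precision, recall, f_measure) for one query.

    ranked   -- article ids the system returned for the query
    relevant -- article ids that are truly relevant to the query
    """
    ranked, relevant = set(ranked), set(relevant)
    tp = len(ranked & relevant)   # relevant and ranked by the system
    fp = len(ranked - relevant)   # ranked by the system but not relevant
    fn = len(relevant - ranked)   # relevant but missed by the system

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

For instance, if the system ranks four articles, two of which are relevant, and misses one relevant article, then TP = 2, FP = 2, and FN = 1, giving Precision = 2/4 and Recall = 2/3.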

Project Sample Image

Other Detail

Software Requirement : MATLAB Software

Hardware Requirement : RAM: 2 GB, Processor: 2.2 GHz

Application :

Project Attachment

PDF			IEEE Base Paper
Doc			Complete Document in MS Word
Read me			Read Me of Project
Source Code			Complete Code files