Frequent Itemset Mining for Big Data in Social Media Using the ClustBigFIM Algorithm
Abstract – A tremendous amount of data, known as Big Data, is being generated through the IoT (Internet of Things) from a variety of sources such as sensor networks, social media feeds, and internet applications. Big Data cannot be handled by conventional tools and techniques. Social networks are becoming dominant in communication over the internet. Mining Big Data with efficient techniques is essential to extract value and better insights from such massive amounts of data. Association rule mining and frequent itemset mining are popular data mining techniques that require the entire dataset to be held in main memory for processing, but large datasets do not fit into main memory. To overcome this limitation, MapReduce, which offers high scalability and robustness, is used for parallel processing of Big Data. The proposed method, ClustBigFIM, works on the MapReduce framework for mining large datasets. ClustBigFIM is a modified version of the BigFIM algorithm that provides scalability and speed for extracting meaningful information from large datasets in the form of associations, emerging patterns, sequential patterns, correlations, and other significant data mining tasks. These results give better insight for making effective business decisions in a competitive environment, using a faster and more efficient parallel processing platform.
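
To make the idea of MapReduce-based frequent itemset mining concrete, the following is a minimal single-process sketch of support counting split into a map step and a reduce step. It is an illustration only and is not the ClustBigFIM or BigFIM algorithm. The function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the `num_partitions` parameter, and the toy hashtag data are all assumptions introduced for this example; a real deployment would run the mappers in parallel on a framework such as Hadoop.

```python
from collections import Counter
from itertools import combinations


def map_phase(partition, size):
    """Mapper (hypothetical helper): count itemsets of one size in one partition."""
    local = Counter()
    for transaction in partition:
        # Sort and deduplicate so each itemset has one canonical tuple form.
        for itemset in combinations(sorted(set(transaction)), size):
            local[itemset] += 1
    return local


def reduce_phase(partial_counts):
    """Reducer (hypothetical helper): sum per-partition counts into global support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def frequent_itemsets(transactions, size, min_support, num_partitions=2):
    # Split the data so that no single mapper needs the whole dataset in memory.
    partitions = [transactions[i::num_partitions] for i in range(num_partitions)]
    # In a real MapReduce job these calls would run in parallel on separate nodes.
    partial = [map_phase(p, size) for p in partitions]
    support = reduce_phase(partial)
    # Keep only itemsets that meet the minimum support threshold.
    return {s: c for s, c in support.items() if c >= min_support}


# Assumed toy data: hashtags attached to four social media posts.
posts = [
    ["bigdata", "hadoop", "iot"],
    ["bigdata", "hadoop"],
    ["iot", "sensors"],
    ["bigdata", "hadoop", "sensors"],
]
result = frequent_itemsets(posts, size=2, min_support=2)
print(result)
```

On this toy input, the only pair that appears in at least two posts is `("bigdata", "hadoop")`, with a support count of 3. Because each mapper sees only its own partition, the memory needed per worker shrinks as the number of partitions grows, which is the property the abstract relies on.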