Design and Implementation of Clustering Algorithm for Big Data Analytics

Thakur, Parminder Singh; Sharma, Ankit; Kumar, Pardeep [Guided by]

Please use this identifier to cite or link to this item: http://www.ir.juit.ac.in:8080/jspui/jspui/handle/123456789/6402

Title:	Design and Implementation of Clustering Algorithm for Big Data Analytics
Authors:	Thakur, Parminder Singh; Sharma, Ankit; Kumar, Pardeep [Guided by]
Keywords:	Data mining; Clustering algorithm; Hadoop Framework; Streaming via Flume
Issue Date:	2017
Publisher:	Jaypee University of Information Technology, Solan, H.P.
Abstract:	This project implements clustering on Big Data sets. Data analysis is a crucial part of formulating new policies and strategies that help an organization succeed in competitive markets: it allows the organization to better understand its customer base, solve hard problems, and thus gain a competitive advantage. With almost every aspect of human life connected to the internet, data analysis is no longer limited to business analysis and has gained importance in fields such as smart healthcare and national security. The vast and diverse nature of Big Data sets poses new challenges, because traditional methods of knowledge discovery in databases are not equipped to handle them. Distributed and parallel computing frameworks are the key to such analysis. The purpose of clustering algorithms is to make sense of, and extract value from, large sets of structured and unstructured data. Clustering enables a data analyst to obtain a snapshot of data of huge volume and complexity, which can be used to impose some logical structure on that data. Clustering can thus provide structure and insight into the nature of the data at hand and form the basis of further analysis. To handle Big Data clustering, various limitations of present clustering techniques need to be mitigated by understanding the frameworks used to handle Big Data sets and by analyzing data clustering. The aim is to analyze these challenges in order to redesign and implement a clustering algorithm suitable for Big Data sets, based on currently available technologies and frameworks. Furthermore, a possible future path toward more advanced algorithms is outlined.
URI:	http://ir.juit.ac.in:8080/jspui/jspui/handle/123456789/6402
Appears in Collections:	B.Tech. Project Reports

Files in This Item:

File	Description	Size	Format
Design and Implementation of Clustering Algorithm for Big Data Analytics.pdf		1.58 MB	Adobe PDF
