Please use this identifier to cite or link to this item:
http://ir.juit.ac.in:8080/jspui/jspui/handle/123456789/9935
Title: | Healthcare Data Pipeline |
Authors: | Dabral, Shivam; Mohana, Rajni [Guided by] |
Keywords: | Python programming; Apache Spark; Pseudocode; Healthcare |
Issue Date: | 2023 |
Publisher: | Jaypee University of Information Technology, Solan, H.P. |
Abstract: | Big data processing is of interest to many companies around the globe as they try to harness the power of their data. Nference Labs Private Limited, similarly, is trying to use healthcare data to provide people with better medical support. This project explores techniques that employ engines and frameworks capable of generating useful data from raw data effectively and efficiently. Several techniques were examined and compared on the basis of published research papers. The results suggested using Apache Spark as the computation engine. The data files were stored in Parquet format with Snappy compression so that the data occupies less space. The aim was an efficient data generation pipeline that can handle terabytes of data. |
Description: | Enrolment No. 191273 |
URI: | http://ir.juit.ac.in:8080/jspui/jspui/handle/123456789/9935 |
Appears in Collections: | B.Tech. Project Reports |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Healthcare Data Pipeline.pdf | | 1.64 MB | Adobe PDF | View/Open |