
Journal of Engineering and Applied Sciences

A Statistical Method for Big Data with Excessive Zero-Inflated Problem
Sunghae Jun

Abstract: Big data analysis frequently runs into the zero-inflated problem: the data table produced by preprocessing the collected data contains an excessive number of zeros. If such data are analyzed as they are, the estimation and prediction performance of statistical models deteriorates. Building valid models for big data analysis therefore requires solving the zero-inflated problem. We propose a statistical modeling approach that combines data division with count data models such as Poisson, hurdle, and negative binomial regression. To verify the validity of the proposed approach, we carry out a case study using simulated data and patent big data.
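
To make the zero-inflated problem concrete, the following is a minimal sketch and not the procedure used in the article. It assumes that "data division" can be illustrated by splitting the counts into a zero part and a positive part, which is the split an intercept-only hurdle model makes; the abstract does not specify the actual division scheme. All function names and parameter values (`simulate_zero_inflated`, `fit_hurdle`, `poisson_zero_gap`, the 70% zero rate, the rate 3.0) are hypothetical, and only the Python standard library is used.

```python
import math
import random


def simulate_zero_inflated(n, pi_zero, lam, seed=0):
    """Draw n counts: a structural zero with probability pi_zero, else Poisson(lam)."""
    rng = random.Random(seed)
    limit = math.exp(-lam)
    counts = []
    for _ in range(n):
        if rng.random() < pi_zero:
            counts.append(0)
            continue
        # Knuth's multiplication method for sampling a Poisson variate.
        k, p = 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                break
            k += 1
        counts.append(k)
    return counts


def poisson_zero_gap(counts):
    """Observed zero fraction minus the zero fraction implied by a plain Poisson fit."""
    mean = sum(counts) / len(counts)          # Poisson MLE of the rate
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - math.exp(-mean)


def fit_hurdle(counts, tol=1e-10, max_iter=500):
    """Intercept-only hurdle fit (assumes at least one positive count).

    Zero part: the share of zeros. Positive part: a zero-truncated Poisson
    whose MLE solves lam / (1 - exp(-lam)) = mean of the positive counts.
    """
    positives = [c for c in counts if c > 0]
    p_zero = (len(counts) - len(positives)) / len(counts)
    mean_pos = sum(positives) / len(positives)
    lam = mean_pos                             # starting value
    for _ in range(max_iter):
        new_lam = mean_pos * (1.0 - math.exp(-lam))   # fixed-point update
        if abs(new_lam - lam) < tol:
            lam = new_lam
            break
        lam = new_lam
    return p_zero, lam


if __name__ == "__main__":
    data = simulate_zero_inflated(20000, pi_zero=0.7, lam=3.0, seed=1)
    print("excess zeros vs. plain Poisson:", round(poisson_zero_gap(data), 3))
    print("hurdle fit (p_zero, lam):", fit_hurdle(data))
```

In this sketch a single Poisson rate fitted to all records predicts far fewer zeros than are observed, while fitting the zero part and the positive part separately recovers the rate used to generate the non-zero counts. A regression version with covariates, or a negative binomial positive part, would follow the same two-part idea.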

How to cite this article
Sunghae Jun, 2019. A Statistical Method for Big Data with Excessive Zero-Inflated Problem. Journal of Engineering and Applied Sciences, 14: 2465-2469.

© Medwell Journals. All Rights Reserved