A High-Performance Data Accessing and Processing System for Campus Real-time Power Usage

Sheng-Cang Chou, Chao-Tung Yang

Abstract


With the flourishing of Internet of Things (IoT) technology, ubiquitous power data can be linked to the Internet and analyzed for real-time monitoring. Such data accumulate over time and can reach the terabyte level. To build a real-time power-monitoring platform on this data, an efficient and novel implementation technique has been developed, and it forms the core of this thesis. The proposed platform integrates multiple software subsystems in layers, from bottom to top: Ubuntu (operating system), Hadoop (storage subsystem), Hive (data warehouse), and Spark MLlib (data analytics). The power data are supplied by smart meters installed in the factories of an enterprise. Data collection and storage are handled by the Hadoop subsystem, and data ingestion into the Hive data warehouse is performed by Spark. For system verification, HiveQL and Impala SQL were tested for query-response efficiency under single-record queries, and the same modules were also tested under full-table queries. The main contributions of this work are twofold: the details of building an efficient real-time power-monitoring platform, and the corresponding query-response efficiency results for reference.

Keywords


Internet of Things; Big data warehouse; Real-time processing; Spark; Hive; Impala




IJIIS: International Journal of Informatics and Information Systems

ISSN: 2579-7069 (Online)
Organized by: Department of Information System, Universitas Amikom Purwokerto, Indonesia; Faculty of Computing and Information Science, Ain Shams University, Cairo, Egypt
Website: www.ijiis.org
Email: husniteja@uinjkt.ac.id (publication issues)
  taqwa@amikompurwokerto.ac.id (managing editor)
  contact@ijiis.org (technical & paper handling issues)

 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0