A Gentle Introduction to Hadoop Platforms
International Journal of Recent Engineering Science (IJRES)
© 2015 by IJRES Journal
Volume-2 Issue-5
Year of Publication : 2015
Authors : Rabi Prasad Padhy, Deepti Panigrahy
DOI : 10.14445/23497157/IJRES-V2I5P107
How to Cite?
Rabi Prasad Padhy, Deepti Panigrahy, "A Gentle Introduction to Hadoop Platforms," International Journal of Recent Engineering Science, vol. 2, no. 5, pp. 44-57, 2015. Crossref, https://doi.org/10.14445/23497157/IJRES-V2I5P107
Abstract
Big fish eat small fish, and big enterprises eat small enterprises: a law of nature cloned into a law of economics and technology. Further proof of this law is the popularity of Big Data Hadoop platforms, which meet the needs of many organizations for flexible data analysis capabilities with an unmatched price-performance curve. This is the era of big data, and an increasing number of companies are using Hadoop to analyze structured and unstructured data because of features such as scalability, cost effectiveness, flexibility and fault tolerance. Hadoop is currently in a boom stage, and there is a WhatsApp-like movement in the Big Data analytics market. In this paper we focus on the basic architecture of Hadoop, the implementation of the HDFS file system and the MapReduce algorithm. We also briefly discuss various Hadoop platforms for big data computing, with the advantages and disadvantages of each platform.
Keywords
Hadoop, HDFS, Big Data, MapReduce, Unstructured Data