Tag: Hadoop
-
Day 24
Data Warehouse: Online Analytical Processing (OLAP). Stores structured data sourced from Online Transactional Processing (OLTP) systems, i.e. live databases. It is a storage repository that does not hold live data, only past/historical data. Mostly used for model training, analysis and pattern discovery. For example, current student data is stored in the database, while graduated student data is stored in the data warehouse. Data Lake: Store both…
-
Day 6
Map Reduce
Refer to:
– https://informationit27.medium.com/hadoop-mapreduce-in-action-b7c723b604ba
– https://www.slideshare.net/mudassarmulla/tutorial-hadoop-hdfsmapreduce
– https://cwiki.apache.org/confluence/display/HADOOP2/JobTracker
– https://www.youtube.com/watch?v=ULtOZqlZnCw
Tools built on top of Map Reduce
Shortcoming of Map Reduce
-
Day 5 (1)
Tutorial 4 Tutorial 5 Discuss and evaluate suitable techniques/methods used in the literature when performing big data analytics on the following: a) Market Basket Analysis. b) Customer Churn Prediction Analysis. Please support your discussion with a research paper. Example Answer: Big data analytics on Market Basket Analysis could help to · Provide combo offers based on products…
-
Day 4
5 Daemons A “daemon” (sounds like “demon”) is a background service that is not initiated by the user. HDFS Map Reduce Hadoop is distributed storage and processing. This applies only to the data nodes (storage & processing), not to the name node; the name node (master) must run on high-availability hardware (expensive); the secondary name node comes in to make name…
-
Day 3
What is the benefit of distributed computing? Using the parallel concept, a task that originally takes 11 hours might take only about 3 hours if run in parallel on 4 machines. Challenges Hadoop Core Principle Hadoop Components Why Hadoop? (feature) Hadoop Definition Hadoop is an open-source software framework (LICENSE) for distributed storage and distributed parallel processing (HOW) of…
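The 11-hour versus 3-hour figure in the Day 3 excerpt can be checked with a quick calculation. This is only a minimal sketch: it assumes the work splits evenly across machines with no coordination or data-transfer overhead, which real clusters rarely achieve. The function name `ideal_parallel_hours` is made up for illustration and is not part of Hadoop.

```python
import math


def ideal_parallel_hours(serial_hours: float, machines: int) -> float:
    """Ideal wall-clock time, assuming a perfectly even split and zero overhead."""
    if machines < 1:
        raise ValueError("machines must be at least 1")
    return serial_hours / machines


serial_hours = 11  # single-machine time from the note
machines = 4       # number of machines from the note

ideal = ideal_parallel_hours(serial_hours, machines)
# 11 / 4 = 2.75, which the note rounds up to roughly 3 hours
print(f"{ideal:.2f} hours ideal, about {math.ceil(ideal)} hours rounded up")
```

In practice the time lands somewhat above the ideal value because of scheduling, shuffling and network costs, so "about 3 hours" is a reasonable reading of the 2.75-hour lower bound.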
