CSUB MapReduce Programming

Description

Instructions

1. Write a MapReduce program that outputs the number of times each neighborhood appears in the Kaggle AirBNB dataset. You can download the dataset from here: https://www.kaggle.com/dgomonov/new-york-city-airb… You can also see the schema (columns) of the dataset at the link above.

The file is a CSV (comma-separated values) dataset; a comma separates the fields in each record.

Use the WordCount approach in the Reduce stage to count the number of rentals in each neighborhood (use the neighbourhood field), and also output the neighborhood group (e.g. Brooklyn) using the neighbourhood_group field. For each neighborhood encountered, your output should look like this (this is only an example):

Brooklyn Kensington 25
Brooklyn Clinton Hill 5
Manhattan Midtown 45
…

To receive full credit, please hand in all of the following items:
- All code (please attach this homework zipped into one file).
- Your screenshots (-cat AND running the job)

2. Write a MapReduce program that further analyzes the same Kaggle AirBNB dataset used in part 1 of this homework. Write WordCount-approach MapReduce programs as indicated below:

2A Write a WordCount program to count the number of lines in the file. Name the program: CountLines
The Reducer should output:
Total number of lines in AirBNB file: [number]

2B Write a MapReduce WordCount program to count all lines that have fewer fields than the ideal number of fields. Name the program: CountBadShortRecords
The Reducer should output:
Total number of short lines in AirBNB file: [number]

2C Write a WordCount program to count all lines that have more fields than the ideal number of fields. Name the program: CountBadLongRecords
The Reducer should output:
Total number of long lines in AirBNB file: [number]

2D Write a MapReduce WordCount program to count all lines that contain exactly the ideal number of fields. Name the program: CountGoodRecords
The Reducer should output:
Total number of good lines in AirBNB file: [number]

To receive full credit, please hand in all of the following items:
A. All code (please attach this homework zipped into one file).
B. Screenshots of -cat for each of the four output files at the hlog command prompt, and of the job running, for all 4 results.
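Both parts follow the same WordCount pattern: the mapper emits a (key, 1) pair for each record and the reducer sums the values per key. The following is a minimal sketch of that logic in plain Python, run locally rather than through the Hadoop API, so it illustrates the map and reduce steps and is not a submittable Hadoop job. Several things here are assumptions, not part of the assignment: the schema is taken to have 16 columns with neighbourhood_group and neighbourhood at zero-based positions 4 and 5 (check this against the schema on the dataset page), and the names EXPECTED_FIELDS, map_neighbourhood, map_record_quality and run_job are hypothetical.

```python
import csv
from collections import defaultdict

# Assumptions (verify against the dataset schema): 16 columns per record,
# neighbourhood_group at index 4 and neighbourhood at index 5.
EXPECTED_FIELDS = 16
GROUP_COL = 4
NEIGHBOURHOOD_COL = 5


def parse_fields(line):
    """Split one CSV line into fields; commas inside double quotes do not split."""
    return next(csv.reader([line]), [])


def map_neighbourhood(line):
    """Part 1 mapper: emit ((group, neighbourhood), 1) for each well-formed record."""
    fields = parse_fields(line)
    # Skip malformed lines and the header row (assumed to start with "id").
    if len(fields) != EXPECTED_FIELDS or fields[0] == "id":
        return
    yield (fields[GROUP_COL], fields[NEIGHBOURHOOD_COL]), 1


def map_record_quality(line):
    """Part 2 mapper: emit one key for the line itself and one for its field count.

    Each of the four named programs would keep only the single key it needs;
    they are combined here to keep the sketch short.
    """
    n = len(parse_fields(line))
    yield "lines", 1
    if n < EXPECTED_FIELDS:
        yield "short", 1
    elif n > EXPECTED_FIELDS:
        yield "long", 1
    else:
        yield "good", 1


def reduce_sum(key, values):
    """WordCount reducer: sum all the 1s emitted for a key."""
    return key, sum(values)


def run_job(mapper, lines):
    """Simulate map, shuffle (group by key) and reduce over an iterable of lines."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reduce_sum(key, values) for key, values in grouped.items())
```

Passing an open file handle as `lines` runs the sketch over a local copy of the CSV; the reducer output lines requested above are then a matter of formatting each summed value. Because the sketch treats each physical line as one record, a quoted field that contains a line break would be counted as short lines, and whether that matches the intended meaning of "ideal number of fields" is left open by the assignment.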
