# Vectorized Array Multiplication/Reduction using SSE Project

Description

Here is the code mentioned in the requirements file: https://web.engr.oregonstate.edu/~mjb/cs575/Projec…
You can find more material related to the subject here: https://web.engr.oregonstate.edu/~mjb/cs575/
All the lectures are listed there, with the week shown alongside each one: https://web.engr.oregonstate.edu/~mjb/cs575/Handou…

CS 475/575 Project #4
1 of 3
https://web.engr.oregonstate.edu/~mjb/cs575/Projects/…
CS 475/575 — Spring Quarter 2022
Project #4
Vectorized Array Multiplication/Reduction using SSE
60 Points
Due: May 11
This page was last updated: March 16, 2022
Introduction
There are many problems in scientific and engineering computing where you want
to multiply arrays of numbers together and add up all the multiplies to produce a
single sum (Fourier transformation, convolution, autocorrelation, etc.):
sum = Σ A[i]*B[i]
This project is to test array multiplication/reduction using SIMD and non-SIMD.
For the “control groups” benchmarks, do not use OpenMP parallel for-loops. Just use straight C/C++ for-loops. In this project, we are only using
OpenMP for the timing.
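As an illustration of the non-SIMD "control group", here is a minimal sketch of the multiply-and-sum written as a straight C++ for-loop. The function name `NonSimdMulSum` is an assumption made for this example; it is not taken from the supplied starter code.

```cpp
#include <cstddef>

// Non-SIMD control: multiply two arrays element by element and
// accumulate the products into a single sum with a plain for-loop.
// (NonSimdMulSum is a hypothetical name chosen for this sketch.)
float NonSimdMulSum(const float *a, const float *b, std::size_t len)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < len; i++)
        sum += a[i] * b[i];   // one multiply and one add per element
    return sum;
}
```

Calling it on two arrays of the same length returns the value of Σ A[i]*B[i] for those arrays.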
Requirements
1. Use the supplied SIMD SSE assembly language code to run an array
multiplication/reduction timing experiment. Run the same experiment a
second time using your own C/C++ array multiplication/reduction code.
2. Use different array sizes from 1K to 8M. The choice of in-between values is up
to you, but pick values that will make for a good graph.
3. Run each array-size test a certain number of trials. Use the peak value for the
performance you record.
4. Create a table and a graph showing SSE/Non-SSE speed-up as a function of
array size. Speedup in this case will be (P = Performance, T = Elapsed Time):
S = P_sse / P_non-sse = T_non-sse / T_sse
5. Note: this is not a multithreading assignment, so you don’t need to
worry about a NUMT. Don’t use any OpenMP-isms except for getting
the timing.
4/23/22, 14:06
6. The Y-axis performance units in this case will be “Speed-Up”, i.e.,
dimensionless.
7. Parallel Fraction doesn’t apply to SIMD parallelism, so don’t compute one.
8. Your commentary write-up (turned in as a separate PDF file) should tell:
   1. What machine you ran this on
   2. Show the table of performances for each array size and the corresponding speedups
   3. Show the graph of SIMD/non-SIMD speedup versus array size (one curve only)
   4. What patterns are you seeing in the speedups?
   5. Are they consistent across a variety of array sizes?
   6. Why or why not, do you think?
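Requirements 3 and 4 above (record the peak performance over several trials, then compute the speedup) could be sketched roughly as follows. All names here (`PeakMegaMultsPerSecond`, `SpeedUpFromPerformance`, `SpeedUpFromTimes`) are hypothetical, and the performance unit of mega-multiplies per second is an assumption. The sketch uses `std::chrono` as a stand-in timer so that it builds without OpenMP; the project itself asks for OpenMP timing, where `omp_get_wtime()` would take the place of the clock calls.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Run `kernel` numTrials times and return the best (peak) performance seen,
// in mega-multiplies per second. Hypothetical helper for this sketch only.
template <typename Kernel>
double PeakMegaMultsPerSecond(Kernel kernel, std::size_t arraySize, int numTrials)
{
    double peak = 0.;
    for (int t = 0; t < numTrials; t++)
    {
        auto time0 = std::chrono::steady_clock::now();
        kernel();
        auto time1 = std::chrono::steady_clock::now();
        double seconds = std::chrono::duration<double>(time1 - time0).count();
        if (seconds > 0.)   // skip trials too short for the clock to resolve
            peak = std::max(peak, (double)arraySize / seconds / 1000000.);
    }
    return peak;
}

// S = P_sse / P_non-sse
double SpeedUpFromPerformance(double perfSse, double perfNonSse)
{
    return perfSse / perfNonSse;
}

// S = T_non-sse / T_sse (the same ratio, expressed with elapsed times)
double SpeedUpFromTimes(double timeNonSse, double timeSse)
{
    return timeNonSse / timeSse;
}
```

Taking the peak rather than the average is what the requirement asks for; it tends to filter out trials that were slowed down by other activity on the machine.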
SSE SIMD code:
• You are certainly welcome to write your own if you want, but we have already
written Linux SSE code to help you with this.
Find starter code in the file: all04.cpp.
• Note that you are linking in the OpenMP library only because we are
using it for timing.
• Because this code uses assembly language, this code is not portable. I
know for sure it works on flip, using gcc/g++ 4.8.5. You are welcome to
try it other places, but there are no guarantees. It doesn’t work on
rabbit.
• You can run the tests one-at-a-time, or you can script them by making the
array size a #define that you set from outside the program.
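The scripting suggestion in the last bullet could look something like the sketch below: the array size is a `#define` with a fallback default, so a script can override it on the compile line with the compiler's `-D` option. The macro name `ARRAYSIZE`, the default of 1024, the array names, and the `FillArrays` helper are all assumptions for illustration; the supplied all04.cpp may name things differently.

```cpp
#include <cstddef>

// Allow the array size to be set from outside the program, for example:
//   g++ -DARRAYSIZE=8388608 all04.cpp -o all04 -lm -fopenmp
// (the output name "all04" is a hypothetical choice for this sketch).
#ifndef ARRAYSIZE
#define ARRAYSIZE 1024   // assumed default when nothing is passed in
#endif

static float A[ARRAYSIZE];
static float B[ARRAYSIZE];

// Fill both arrays with known values so the expected sum is easy to check.
void FillArrays()
{
    for (std::size_t i = 0; i < ARRAYSIZE; i++)
    {
        A[i] = 1.f;
        B[i] = 2.f;
    }
}
```

A shell loop can then recompile and rerun the program once per array size, which keeps each run's size fixed at compile time.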
Warning!
Do not use any optimization flags when compiling this code. It jumbles up
the use of the registers.
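For readers who take up the offer to write their own SIMD code, here is a minimal sketch of the same multiply-and-sum using SSE compiler intrinsics from `<xmmintrin.h>`. This is not the supplied assembly-language code from all04.cpp, which is not reproduced here; it is a swapped-in alternative, and the function name `SimdMulSumIntrinsics` is an assumption. It requires an x86 compiler with SSE support.

```cpp
#include <xmmintrin.h>
#include <cstddef>

// Multiply-and-sum four floats at a time using SSE intrinsics.
// (Hypothetical alternative to the supplied assembly code.)
float SimdMulSumIntrinsics(const float *a, const float *b, std::size_t len)
{
    const std::size_t SSE_WIDTH = 4;                      // floats per 128-bit register
    std::size_t limit = (len / SSE_WIDTH) * SSE_WIDTH;    // largest multiple of 4 <= len

    __m128 acc = _mm_setzero_ps();                        // four running partial sums
    for (std::size_t i = 0; i < limit; i += SSE_WIDTH)
    {
        __m128 va = _mm_loadu_ps(&a[i]);                  // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(&b[i]);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));        // acc += va * vb, lane by lane
    }

    float partial[4];
    _mm_storeu_ps(partial, acc);                          // copy the 4 lanes back to memory
    float sum = partial[0] + partial[1] + partial[2] + partial[3];

    for (std::size_t i = limit; i < len; i++)             // leftover elements, if any
        sum += a[i] * b[i];
    return sum;
}
```

Because the four lanes are summed separately and combined at the end, the floating-point result can differ slightly from the plain for-loop for large arrays, since the additions happen in a different order.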
+5 points Extra Credit
Combine multithreading and SIMD in one test. In this case, you will vary both the
array size and the number of threads (NUMT). Show your table of performances.
Produce a graph similar to the one on Slide #19 of the SIMD Vector notes, using
your numbers. Add a brief discussion of what your curves are showing and why you
think it is working this way.
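One possible shape for the extra-credit experiment is sketched below: the array is split into NUMT contiguous chunks, each thread runs a SIMD multiply-and-sum on its own chunk, and an OpenMP reduction combines the per-chunk sums. The names `ChunkMulSum` and `ThreadedSimdMulSum` are hypothetical, the per-chunk kernel uses SSE intrinsics rather than the supplied assembly code, and the chunking scheme is an assumption rather than the method from the course notes. Compile with `-fopenmp` to get actual threads; without it the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <xmmintrin.h>
#include <cassert>
#include <cstddef>

// SIMD multiply-and-sum over one chunk (hypothetical helper, SSE intrinsics).
static float ChunkMulSum(const float *a, const float *b, std::size_t len)
{
    std::size_t limit = (len / 4) * 4;
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < limit; i += 4)
        acc = _mm_add_ps(acc, _mm_mul_ps(_mm_loadu_ps(&a[i]), _mm_loadu_ps(&b[i])));
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (std::size_t i = limit; i < len; i++)   // leftover elements
        sum += a[i] * b[i];
    return sum;
}

// Split the arrays into numt chunks and let each thread reduce one chunk.
float ThreadedSimdMulSum(const float *a, const float *b, std::size_t len, int numt)
{
    assert(numt > 0);
    std::size_t chunk = len / numt;             // elements per thread (last one takes the rest)
    float sum = 0.f;
    #pragma omp parallel for num_threads(numt) reduction(+:sum)
    for (int t = 0; t < numt; t++)
    {
        std::size_t first = t * chunk;
        std::size_t count = (t == numt - 1) ? len - first : chunk;
        sum += ChunkMulSum(&a[first], &b[first], count);
    }
    return sum;
}
```

Timing this function for each combination of array size and NUMT would give the table of performances that the extra credit asks for.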
| Feature | Points |
|---|---|
| Array Multiply/Reduction performances and speedups | 20 |
| Array Multiply/Reduction speedup curve | 20 |
| Commentary | 20 |
| Extra Credit | +5 |
| Potential Total | 65 |