## Lesson Summary

Summary

In this lesson, students will learn how to acquire and analyze data to find answers to questions and solutions to problems. Students will consider whether the data they are presented with is actually valid, and research some of the data sources available online.

Outcome

• Students will explore how computation can be employed to help people process data and information to gain insight and knowledge.
• Students will learn how computation can be used to facilitate exploration and discovery when working with data.
• Students will examine the considerations and trade-offs that arise in the computational manipulation of data.
• Students will explore opportunities that large data sets provide for solving problems and creating knowledge.

Overview

Session 1

1. Getting Started (5 min) - Students journal on the importance of validating data
2. Discuss journal prompt (5 min)
3. Brainstorm types of online data (5 min)
4. Explore how meaning is created from data (10 min)
5. Work with data online (20 min)
6. Assign homework (5 min)

Session 2

1. Getting Started (5 min) - Students journal about what they learned the previous day
2. Analyzing Data (35 min) - Discuss correlation and causation; discuss and explore different types of data and analysis
3. Present homework findings (5 min)
4. Wrap Up - Journal (5 min)

## CSP Objectives

Big Idea - Data
• EU 3.1 - People use computer programs to process information to gain insight and knowledge.
• LO 3.1.2 - Collaborate when processing information to gain insight and knowledge. [P6]
• EU 3.2 - Computing facilitates exploration and the discovery of connections in information.
• LO 3.2.1 - Extract information from data to discover and explain connections or trends. [P1]
• LO 3.2.2 - Determine how large data sets impact the use of computational processes to discover information and knowledge. [P3]

## Key Concepts

Students will be able to acquire data and analyze it to find answers to a specific question or solutions for a specific problem.

## Essential Questions

• What opportunities do large data sets provide for solving problems and creating knowledge?

## Teacher Resources

Student computer usage for this lesson is: required

Student computer usage for this lesson is: optional

In the Lesson Resources Folder

• PowerPoints: "Finding Data" and "Finding and Analyzing Data"
• Session 1 Homework: "Homework Unit 4 Lesson 1"

Webpages Session 1

Webpages Session 2

# Session 1

For this session, use the presentation "Finding Data" in the Lesson Resources Folder.

# Getting Started (5 min)

Given this data: [slide 1]

A blood drive at the local high school reveals that 20% of the students were HIV positive.

Journal on these questions:

• What is your immediate reaction?
• What questions do you have?

# Activities (40 minutes)

## Activity 1 (5 min) - Discuss the journal prompt

Lead the students in discussion using the bullets below and slide 2 of the PowerPoint as guidance. Students should talk about WHY they assumed the data was true, or why they were uncomfortable questioning the truth of the data.

## Activity 2 (5 min) - Brainstorm: What kinds of data can be found online?

Part 1 - Discussion

Data comes from many places and takes many forms [slide 3]

• Have students discuss: How do businesses, individuals, governments, and devices create and use data?

Part 2 - Brainstorm

Brainstorm as a class: what kinds of data are generated? Possible answers:

• pictures: maps, Instagram, photos, cartoons, drawings, … everything!
• words: books, articles, news, stories, blogs, Facebook
• numbers: facts, financial transactions, scientific data
• sound: music, speech
• behavior tracking: GPS, click behavior, search history

## Activity 3 (10 min) - How is meaning created from data?

1. Look at some data gathered about selfies from different cities around the world. [slide 4]
    • Main ideas:
        • You have to gather the data and analyze it to create meaning.
        • Creating meaning from pictures still takes some human interpretation.
    • Prompt students to come to a conclusion about the graphed data on the page.
    • Question for discussion: How large a sample is needed to draw a conclusion?
2. Quick review: Make the point that there is a LOT of data even in a single picture. [slide 5]
    1. Define these terms and put them in order of size. Use this webpage to review bytes: http://highscalability.com/blog/2012/9/11/how-big-is-a-petabyte-exabyte-zettabyte-or-a-yottabyte.html
        • MB, bit, TB, ZB, byte, GB, pixel (one dot of color on the screen), KB, PB
    2. Look at the photo on slide 5.
        1. 365 gigapixels is 365 billion pixels. If the picture were a square, it would be about 604,152 pixels on each side (too big to fit on any HDTV screen).
        2. A 4K ultra-high-definition TV is only 3,840 × 2,160 pixels: http://www.rtings.com/info/what-is-the-resolution
        3. Even a movie screen can’t show all of the detail, so you can only look at the photo one part at a time: https://www.amctheatres.com/sony4k
3. Preview Wolfram Alpha, an engine for providing knowledge from data.
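
For teachers who want to check the numbers in step 2 with the class, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (power-of-ten) definitions for the storage units, a perfectly square photo, and the standard 4K UHD resolution of 3,840 × 2,160 pixels. All variable names are made up for this example.

```python
import math

# Storage units from the ordering exercise, expressed in bits.
# Assumption: decimal prefixes (1 KB = 1,000 bytes); binary prefixes would use 1,024.
UNITS_IN_BITS = {
    "MB": 8 * 10**6,
    "bit": 1,
    "TB": 8 * 10**12,
    "ZB": 8 * 10**21,
    "byte": 8,
    "GB": 8 * 10**9,
    "KB": 8 * 10**3,
    "PB": 8 * 10**15,
}

# Sort the unit names from smallest to largest.
ordered_units = sorted(UNITS_IN_BITS, key=UNITS_IN_BITS.get)
print(ordered_units)

# A 365-gigapixel photo, assumed to be a perfect square.
total_pixels = 365 * 10**9
side_pixels = math.isqrt(total_pixels)  # whole-number square root
print(side_pixels)

# How many 4K UHD screens (3840 x 2160) would it take to show every pixel at once?
uhd_pixels = 3840 * 2160
screens_needed = math.ceil(total_pixels / uhd_pixels)
print(screens_needed)
```

The pixel is left out of the unit table because it measures image size rather than storage.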

## Activity 4 (20 min) - Work with some data online

1. Students should complete the Data Search and Analysis Handout. [slide 7]
• Depending on how much time you have, you can pair students and assign even/odd questions or chunks of questions to different groups, or have each student research on their own.
2. If there’s time in class, try to go over results and compare (especially the first 5) to see if people got similar answers. Why or why not? [slide 8]

# Assign Homework (5 minutes)

Give students the worksheet: Homework Unit 4 Lesson 1.

There are 10 videos to choose from, each 10-15 minutes long. Either allow students to self-select, or assign them a particular video. Students should watch the video and answer the questions on the worksheet. This is an opportunity to discuss plagiarism: students are expected to watch the video and write from their own experience.

# Session 2

For this session, use the presentation "Finding and Analyzing Data" in the Lesson Resources Folder.

# Getting Started (5 min)

Students should journal on the following: Describe at least 2 ways that we create meaning out of data. [slide 1]

• Possible answers: graph it, total it, average it, find min and max, map it, compare it to other data, find trends, generate predictions (like weather), draw conclusions (facial recognition, emotions, voice inflection), diagnose diseases, discover new stars, etc.
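
If students need a concrete picture of a few of these answers (total it, average it, find min and max, graph it), the short Python sketch below may help. The temperature readings are invented purely for illustration, and the text "graph" is just one simple way to visualize a small data set.

```python
import statistics

# Hypothetical data: daily high temperatures (°F) for one week, invented for this example.
temps = [68, 71, 75, 73, 70, 66, 74]

total = sum(temps)                 # total it
average = statistics.mean(temps)   # average it
lowest = min(temps)                # find min
highest = max(temps)               # find max
change = temps[-1] - temps[0]      # a crude trend: last day minus first day

print(f"total={total} average={average} min={lowest} max={highest} change={change}")

# Graph it: one row of '#' marks per day, scaled relative to 60 degrees.
for day, temp in enumerate(temps, start=1):
    print(f"Day {day}: {'#' * (temp - 60)} {temp}")
```

Each line of the sketch turns the same seven numbers into a different kind of meaning, which is the point of the journal prompt.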

# Activities (40 min)

## Activity 1 (35 min): Analyzing Data

Part 1: Correlation vs. Causation

1. Look at slide 2 from the PowerPoint. Creating meaning from data can be misleading.
2. Point out that the graph shows a direct relationship between the number of divorces in Maine and the amount of margarine that is purchased. When one goes up, the other does too, and vice versa. Is this a causal relationship?
• Show some examples from the Tyler Vigen website http://www.tylervigen.com/spurious-correlations . It has many examples of data connections that may be statistically valid but don’t make sense. The site was created to point out that conclusions drawn from correlation alone are often not valid.
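
To show students what "statistically correlated" means in practice, the sketch below computes the Pearson correlation coefficient for two short series. The numbers are invented for illustration and are NOT the figures from the slide or from the Tyler Vigen site; the function name `pearson_r` is also just a label chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Two made-up yearly series that both happen to decline over time.
series_a = [5.0, 4.7, 4.6, 4.4, 4.3, 4.1]
series_b = [8.2, 7.0, 6.5, 5.3, 5.2, 4.0]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")  # a value near 1 means the series move together
```

A coefficient close to 1 only says that the two series rise and fall together. It says nothing about whether one causes the other, which is the point of this part of the lesson.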

Part 2: Data Science

1. What does a data scientist do? [slide 3]
    • Show the two videos and discuss.
    • Tricks for analyzing big data:
        • Knowing what data to use, and what to disregard.
        • Knowing how to make up for missing data.
        • Knowing how to discover and predict trends and correlations.
    • There are many degrees offered in data science, and free online courses are available from Udacity and Coursera, among others.
2. Look at 3 false assumptions about big data [slide 4]:
    1. It’s complete and accurate
    2. It tells the whole story
    3. Bigger is better
3. What considerations and trade-offs arise in the computational manipulation of data? [slide 4]
    1. How do you account for missing data?
    2. How do you certify your sources?
    3. How do you decide which data to include and which to exclude?
    4. How much data is enough? (time is money!)
    5. Are your processing algorithms accurate?
4. What is some of the data needed to successfully fly a space mission? (Possible answer: everything about the spacecraft, such as its speed, direction, and the amount of fuel and oxygen left.) Many of the problems that applied to early space missions also arise in dealing with big data.
    • You need to decide which factors to include in your calculations, and which to exclude.
    • You need to decide when to make an assumption for missing data or when to estimate.
    • When writing a program for an early space flight, there were many unknown factors, because the spacecraft had never flown before.
    • It’s usually impossible to create a perfect algorithm that can take into account every possibility, so how do you allow for errors and changes?
5. What are some of the calculations needed? (Possible answers: how much fuel to release and with which engines.)
    • Mission planners had to run many simulations first to see what would happen under various circumstances.
6. See if anybody knows how Netflix, movie makers, or Amazon use data about their customers to be more successful. [slide 5] http://www.smartdatacollective.com/bernardmarr/312146/big-data-how-netflix-uses-it-drive-business-success and http://www.fastcompany.com/3024655/pitch-perfect-and-how-analytics-are-transforming-movie-marketing

Businesses like Amazon and Netflix learn the habits of different customers and make recommendations based on each customer’s previous choices and on the choices of other customers who share similar characteristics (much as Google does with ads).
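
The idea of recommending titles based on "others who share similar characteristics" can be sketched in a few lines of Python. This is a classroom toy, not how Netflix or Amazon actually work: the viewers, titles, and the `recommend` function are all invented for this example, and similarity is measured simply by counting shared titles.

```python
from collections import Counter

# Hypothetical viewing histories, invented for illustration.
watch_history = {
    "ana": {"Space Docs", "Robot Wars", "Ocean Life"},
    "ben": {"Space Docs", "Robot Wars", "Mars Mission"},
    "cam": {"Ocean Life", "Baking Show"},
    "dee": {"Space Docs", "Mars Mission", "Star Maps"},
}

def recommend(user, history, top_n=2):
    """Suggest unseen titles, weighted by how much each other viewer overlaps with `user`."""
    seen = history[user]
    scores = Counter()
    for other, titles in history.items():
        if other == user:
            continue
        overlap = len(seen & titles)   # shared titles = similarity to this viewer
        for title in titles - seen:    # only titles the user has not watched
            scores[title] += overlap
    # Highest score first; ties broken alphabetically so the result is repeatable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, score in ranked[:top_n] if score > 0]

print(recommend("ana", watch_history))
```

Students can discuss what this toy ignores, such as ratings, age group, or what is currently popular.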

See if anybody knows the story of Moneyball (based on a true story), in which a baseball team used data analysis to make decisions and become winners: https://en.wikipedia.org/wiki/Moneyball_(film). Another example is Vivek Ranadivé, who knew little about basketball but owned a multi-million dollar computer processing company and knew how to choose and analyze data. He coached his then twelve-year-old daughter’s National Junior Championship basketball team to the national championship game. He relied on his knowledge of soccer and cricket, paired with his analytic mindset, to create a system of play that allowed his relatively un-athletic team to excel. After using his intellect and business experience to coach an inexperienced team to the championship game, the man who once thought basketball was “mindless” was hooked on the sport. http://www.forbes.com/sites/aliciajessop/2013/05/28/why-the-kings-are-staying-in-sacramento-meet-vivek-ranadive/

7. How is data analyzed? Data analysis requires an algorithm, a plan to collect and process data. [slide 6]
    1. Generate discussion about what data is collected and how it is analyzed. What is a possible algorithm for choosing which movies Netflix might suggest for a customer?
        • Brainstorm: what other data might they collect? (what’s currently popular in that age group, demographic, etc.)
    2. Choose one of the options and write an outline of an algorithm: choosing a movie to produce or a sports player to hire. [slide 7]
        1. Describe at least two calculations needed.
        2. Describe some of the data you’d need to collect.
    3. Share and discuss.
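
As one possible sample answer for the "sports player to hire" option, the sketch below turns an outline into runnable Python. It uses two calculations: the standard baseball on-base percentage formula, and a made-up "value per million dollars" score. The players, their statistics, and the scoring rule are all hypothetical and exist only to illustrate the shape of such an algorithm.

```python
from dataclasses import dataclass

@dataclass
class Player:
    # Data to collect for each candidate (all values below are invented).
    name: str
    hits: int
    walks: int
    hit_by_pitch: int
    at_bats: int
    sacrifice_flies: int
    salary_millions: float

def on_base_percentage(p):
    """Calculation 1: how often the player reaches base."""
    reached = p.hits + p.walks + p.hit_by_pitch
    chances = p.at_bats + p.walks + p.hit_by_pitch + p.sacrifice_flies
    return reached / chances

def value_per_million(p):
    """Calculation 2 (hypothetical rule): on-base percentage per million dollars of salary."""
    return on_base_percentage(p) / p.salary_millions

candidates = [
    Player("Player A", hits=150, walks=40, hit_by_pitch=5,
           at_bats=500, sacrifice_flies=5, salary_millions=10.0),
    Player("Player B", hits=120, walks=80, hit_by_pitch=0,
           at_bats=450, sacrifice_flies=0, salary_millions=2.0),
    Player("Player C", hits=100, walks=20, hit_by_pitch=2,
           at_bats=480, sacrifice_flies=3, salary_millions=1.5),
]

# Decision step: pick the candidate with the best value score.
best = max(candidates, key=value_per_million)
print(best.name, round(on_base_percentage(best), 3))
```

Students' own outlines do not need to be code. The point is that a decision algorithm names its data, its calculations, and its decision rule.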

## Activity 2 (5 min): Present the previous day’s homework on the TED talks about data. [slide 8]

If time is short, choose only 1 or 2 of the questions from the homework to be presented to the class and collect the rest to grade.

# Journal (5 min)

In your writing journal, map out the steps you would take to answer a specific question or solve a specific problem using data.

## Options for Differentiated Instruction

### Extension Activities:

Data analysis activities from NOAA, NASA, and more! - http://climate-expeditions.org/educators/activities.html

### Differentiated Instruction:

What is data acquisition? - http://www.ni.com/data-acquisition/what-is/

Data analysis and graphs (with Excel sample) - http://www.sciencebuddies.org/science-fair-projects/project_data_analysis.shtml

Collecting and analyzing data - http://ctb.ku.edu/en/table-of-contents/evaluate/evaluate-community-interventions/collect-analyze-data/main

Using Excel for Handling, Graphing, and Analyzing Scientific Data: A Resource for Science and Mathematics Students - http://academic.pgcc.edu/psc/Excel_booklet.pdf

## Formative Assessment

Journal day 1:

Given this fictitious data:

A blood drive at the local high school reveals that 20% of the students were HIV positive.

• What is your immediate reaction?
• What questions do you have?

Journal day 2:  Describe at least 2 ways that we create meaning out of data.

Homework: Feedback from a TED video on big data

## Summative Assessment

Students complete the Data Search and Analysis student activity.

Write an outline of an algorithm to make a data-based decision about what movie to produce or what sports team member to hire.