Unit 5. Data Manipulation
Revision Date: Jan 19, 2017 (Version 2.1)Summary
Big Data has been defined in many different ways. Easy access to large sets of data and the ability to analyze large data sets changes how people make decisions. Students will explore how Big Data can be used to solve real-world problems in their community. After watching a video that explains how Big Data is different from how we have analyzed and used data in the past, students will explore Big Data techniques in online simulations. Students will identify appropriate data source(s) and formulate solvable questions.
Overview
Session 1- What is Big Data?
Session 2 – Where can big data be used?
Student computer usage for this lesson is: required
Reading assignment and video clips for Session 1:
Possibly useful resource(s) for data collection:
Big Data Concepts:
Sample data sets (both acquired from http://catalog.data.gov/dataset) :
Journal: How can a computer gather data from people ? (Think-Pair-Share)
Remind students of the mind guessing game: http://en.akinator.com/ or 20 questions http://www.20q.net/
Discuss: How can the computer learn from people when playing one of these games? How many different answers do you think it could possibly know?
Teacher note: students are not expected to actually play this game during class.
Read The Rise of Big Data in chunks: An Introduction to “Big Data” (20 mins) Reading can be found at: http://www.foreignaffairs.com/articles/139104/kenneth-neil-cukier-and-viktor-mayer-schoenberger/the-rise-of-big-data
Break students into groups or pairs and jigsaw the seven units of the reading. Each group is to summarize their section in a tweet sized comment (not more than 140 characters).
Share tweets with the class.
Explain to students that big data is impacting every area of life. By using more data and processing power we can make better decisions. As an illustration, show a clip from the movie Moneyball: (3 mins) https://www.youtube.com/watch?v=rMObWsKaIls
After students watch, they create a journal entries explaining at least two ways data was used to better manage the baseball team. Partners discuss journal entries. Share at least one observation with table groups and then share at least one observation from each group with the class.
Show the first 3-5 minutes of this clip. (It becomes a bit dry, so just show the amount that is appropriate for your students to get the idea): https://www.youtube.com/watch?v=7D1CQ_LOizA
Some other concepts to point out to students if there is time:
Some examples of how big data is used:
Some examples of how big data was inappropriately used:
Students are to pick three topics they want to research that use big data. It is preferred that these topics relate to something learned this year in the course (e.g., the need for IPv6). Tomorrow, as the students enter class, they will sign up on a list with their chosen topic. Since the students will have three options, it is likely they will get one of their selected topics to research.
Journal: Think about you daily and weekly activities. What types of data are being stored about you?
Remind students to think about what they do online, in stores, while in a car, etc.
Review the steps to processing Big Data:
As a class, walk through these steps using the two files in the lesson resources folder (FailedBanklist.csv & Consumer_Complaints.csv)
Step 1.
Demonstrate how files such as these can be obtained at http://catalog.data.gov/dataset
Formulate questions such as:
Are there any banks that are on both the complaint list and the failed banklist?
Can we make some deductions about banks that may be on both lists? If so, what deductions can we make?
Step 2.
Extract data source into format supported by underlying tools
Open one of these files in Notepad (or some simple editing program such as Notepad++) and demonstrate how the actual data itself is separated by commas, thus the file name “csv” for comma separated value.
Open both files in Microsoft Excel. Complete a find for the bank name “Banco Popular de Puerto Rico” on both lists. You may want to first sort the data by bank name to find this bank or you can use CTRL + F to find the bank name (see screenshots below).
Step 3.
Normalize data (remove redundancies, irrelevant details)
In this step, there is technically no need to remove redundancies or irrelevant details but you can show the students how you could remove data or limit the data to a particular data set. For example, if were to want to look at only the banks from Maryland, you can use the filter tool to only view those banks from MD.
Step 4.
Import data into tool
Right now the file type is as a csv file. By resaving the file as a .xlsx file it becomes a true spreadsheet file.
Step 5.
Perform analysis
We have determined that the bank “Banco Popular de Puerto Rico” is on both lists. Now ask the students “Why is this bank on both lists?” Note: On the Failed Bank list the Banco Popular de Puerto Rico is actually an acquiring institution. By looking more closely at the dates of the acquisition of the failed bank “Westernbank Puerto Rico” one can formulate some possible deductions that maybe the reason “Banco Popular de Puerto Rico” is on the complaint list is because they had recently taken over a failed bank. It could be possible that some of these complaints were related to this recent acquisition.
Step 6.
Visualize Results
Explain to students that they will learn more about visualize their results in Unit 6. They can complete graph visualization in excel. Show them the website: http://www.gapminder.org/. Explain that even though a visualization in excel is not interactive like http://www.gapminder.org/, they can complete some form of visualizing their data by using a spreadsheet. Note: http://www.gapminder.org/ is VERY attention grabbing. Only briefly show the students what they can do with it (see how data changes over time, look at many different data sets, and download data in different forms - including csv and xlsx formats).
Students should research their selected topics from homework. Some possible websites for finding data are listed above under “Possible good resource(s) for data collection.”
Students are to get your approval for a topic and then use the Big Data Sets Worksheet in the Lesson Resource Folder to find big data sets that are related to the approved topic.
Students are to review using http://www.gapminder.org/ looking specifically at life expectancy. Students will write one question after “playing” the timeline of life expectancy using gapminder on an exit slip before leaving class. For example, one may write “Why is the life expectancy of countries such as Denmark, Sweden, & Norway typically higher than other countries throughout most of the timeline?”
Students are to submit a document stating their topic for research using Big Data. This document should answer the questions:
Topic:
How is Big Data used to solve or remedy the topic?
Link(s) used to find Big Data? (i.e. data.gov, etc)
How has the transformation of data storage affected how data itself is used?
Answer: Storage and processing of large digital data enables us to analyze large data sets quickly rather than small sampling sizes as used before.
How can a computer use Big Data to make predictions?
Answer: Computers can use smart algorithms, powerful processors, and clever software to make inferences and predictions for solvable questions.
Unit 5. Data Manipulation
Revision Date: Jan 22, 2017 (Version 2.1)Pre-Lesson Preparation
Summary
Students investigate data organization, simulate linear and binary searches, and write pseudocode and Python for linear and binary search methods.
Outcomes
Overview
Session 1
Session 2
Source
Phone book presentation adapted from a lesson taught by Dr. Rheingans in CMSC 201 at the University of Maryland, Baltimore County
Students will:
Student computer usage for this lesson is: required
In Lesson Resources Folder:
Journal: What would be the best way to organize a collection of DVDs so that you could find the one you want very quickly? Would you need a different method for a radio station with thousands of DVDs? Discuss.
As a class, discuss the following questions:
What if you are looking for a specific song. What is the most effective way to look for something if it is in an unordered, unsorted collection?
What are some challenges to organizing large sets of information so they can be processed?
Why does it help to classify data when you are trying to sort it or search through it?
Possible Answers and suggestions for discussion:
Take out a dictionary (or phone book). Ask the class, how would you search for a particular word/name?
Steps for Binary Search in a book of items to demonstrate to the class:
Why would this not work for an unordered list?
Note: Skeleton code (SearchCode.py) is provided in the Lesson Resources Folder.
Think-Pair-Share
For students that have difficulty understanding the concepts of searching for items in a set of data, pair those students with a student who has a firm grasp of the concept for the activities. Have the pair work together for 1A and then have them keep their own paper secure using the extra game sheet (1A'). Simiilarly for 1B - 1B' and 1C - 1C'.
Correctness of Python functions for linear search and binary search
"Searching Assessment Items.docx" in lesson folder
"Search Comparison Worksheet" in the lesson folder
Unit 5. Data Manipulation
Revision Date: Mar 12, 2017 (Version 2.1)Summary
In this three-session lesson, students explore and confront the difficulties of the problem of sorting data and the difficulties involved in expressing a clear and efficient algorithm for sorting.
Outcomes
Overview
Session 1:
Session 2:
Session 3:
The algorithmic techniques and analysis involved in sorting data are seen in a wide variety of contexts and applications. Sorting numbers in a list is challenging but foundational to many algorithms in computer science.
What makes a "good" algorithm?
What should be taken into consideration when comparing algorithms that complete the same task?
Student computer usage for this lesson is: optional
Journal: Have students respond to the following questions:
Teacher note: Having just finished the lessons on searching, students should recall that searching an ordered list allows using the binary search which is faster than searching an unordered list with a linear or random search.
Teacher note: The focus should be directed more toward the problem-solving technique than nitpicking about the language used. Although students are writing instructions for a human to manipulate a set of playing cards, they still need to be precise, because the assumption is that the person doesn't know what they are doing. This problem is challenging and will require creativity.
Suggest that it could be helpful to break the task down into parts. Using abstraction and collaboration can decrease the size and complexity of the task that each programmer has to solve. Abstraction allows you to build upon existing processes, collaboration allows you to break up the task once you agree on what the strategy is and what the separate tasks are that need to be solved. Example, cards need to be comared and swapped, one person could write the specific steps for how that should be done.
Journal:
Homework: Any pairs that did not finish the activity should complete it as a homework assignment before the next session.
In this session, students review the sorting algorithms they wrote in session 1. Students will follow the algorithms created by their classmates and discover a variety of sorting strategies. By analyzing the various algorithms, students will attempt to find the "best" sorting strategy.
Teacher note: There are two main difficulties in algorithm design to highlight: (1) It is very difficult to be precise with language without some agreement about what terms mean. (2) Solving the problem by determining the strategies and steps required to sort objects correctly, as well as efficiently, presents a second level of difficulty.
Journal: How do you think a sorting algorithm should be "measured" to determine if it is the "best"? Ask for ideas and discuss this during the group activity.
Create groups of four by joining the pairs who previously exchanged algorithms.
Teacher note: This "swapping algorithm" activity works especially well when students exchange algorithms with a group that has a fundamentally different approach. However, as a practical matter, this can be hard to arrange. From your observations during Session 1, you might have a sense of groups with different approaches that you can assign to swap algorithms.
Teacher note: It is possible that the nominated algorithm won't work perfectly. If you encounter any problems with the directions, give them the benefit of the doubt and simulate it as best you can to enable the class to understand the intent.
Teacher note: The students will perform an actual analysis in a later lesson, so it's okay at this point to simply guide the discussion to see how students are thinking. They will re-examine these ideas later.
Remind students of the two main issues in writing effective algorithms:
Homework: Assign students to write a final version of their algorithm, working out any ambiguities or other problems revealed during the activities.
In this session, we end the set of sorting activities by relating sorting to algorithms in the real world. A further exploration of algorithm analysis with some new algorithms will sharpen their intuition about what should and shouldn't be "counted" when analyzing algorithms, what is "hard" for a computer to do, or what takes a "long time."
Journal: discuss ideas and elements from the previous lesson:
Teacher should clarify that in general, we want these things from an algorithm:
Suggestion: If you have a mix of new and advanced students, challenge the advanced students to sort twice as many cards with a parallel processing algorithm of their own design. Each student on the team can perform one action at the same time.
Evaluation of algorithms
Convert actions into an algorithm
Unit 5. Data Manipulation
Revision Date: Jul 25, 2017 (Version 2.1)Pre-lesson Preparation: You should familiarize yourself with www.sorting-algorithms.com paying particular attention to the variety of algorithms and settings along the top of the page. For session 2, you should have the timedsorts.py code and data files (in the lesson folder) readily available for your students.
Summary
In this two-session lesson, students will explore algorithmic efficiency. They will understand the idea through discussion, manual analysis of simple algorithms, and data collection for implemented algorithms.
Outcomes
Students will be able to:
Overview
Session 1:
Session 2:
Student computer usage for this lesson is: required
Sorting:
Think-Pair-Share: Alternate Routes
Briefly discuss with your class the topic: what properties make for a good algorithm? What makes one algorithm better than another? Properties you may want to discuss if your students do not volunteer them:
A good analogy is purchasing a car, where people are concerned about:
Today's session will address the topic of efficiency.
Introduce the concept of algorithmic efficiency to your students by asking them if any can describe what algorithmic efficiency is, or what it means for an algorithm to be efficient. Briefly describe efficiency as how well an algorithm uses two resources, time and space (stored memory), to solve a problem. Some topics you may wish to discuss include:
Teacher note: This topic is more advanced, so you may wish to go more in depth or move on to the activity, as appropriate for your students.
A central idea of algorithms is that some algorithms will take more and more time as the size of their input increases. Time is not measured in seconds but rather the number of computational steps needed for the algorithm to finish operation on a given input. Great algorithms grow linearly, at the same rate as their input, meaning the time it takes to finish is directly proportional to the size of the problem they are solving (amount of input data). For instance, an algorithm that takes 10 steps for an input of size 10 and 1000 steps for an input of size 1000 is said to be linear in its input. However, most algorithms take longer as their input gets larger. For instance, an algorithm that takes only 25 steps for an input of size 5 may take 100 steps for an input of size 10, 10000 steps for an input of size 100, and one million steps for a size of only 1000 (it is taking quadratically more time as the input gets larger).
When we analyze algorithms, we often talk about the algorithm's computational complexity, which is the order of magnitude of the algorithm's running time. We almost always discuss the worst case complexity, since that is a bound on the resources required.
If an algorithm finishes with the same number of steps regardless of the size of its input, it is called constant time, which is O(1) in mathematical form (read aloud as "big-oh one"). Constant time algorithms are the fastest in terms of computational efficiency, and any algorithm that takes a constant number of steps is considered O(1). An algorithm that takes 10 steps for an input of size 10 and also takes 10 steps for an input of size 1000 is likely O(1). However, very, very few algorithms are constant time because most algorithms necessarily take longer as the size of their input increases.
An algorithm that can finish by looking at each piece of its input only once is called linear time or linear order, and is written mathematically as O(n), where n stands for "the size of the input." An algorithm that takes 10 steps for an input of size 10 and also takes 1000 steps for an input of size 1000 is likely O(n). Very few algorithms are linear order, especially if they must compare pieces in their input, such as sorting algorithms. The best sorting algorithms are somewhere between linear time and quadratic polynomial time, written as O(n^{2}), where n^{2} stands for "the size of the input, squared." Any algorithm that is O(n^{2}) typically must compare each piece of its input with every other piece of input at least once. An algorithm that takes 100 steps for input of size 10 and a million steps for input of size 1000 is likely O(n^{2}).
Most sorting algorithms are of an order between O(n) and O(n^{2}) known as linearithmic time, written as O(n log n), where log is the logarithmic function. In fact, O(n log n) is the fastest possible order for a comparison-based sorting algorithm. It is impossible for such algorithms to be O(n) since they must make at least some comparisons of their input data.
Using the simulation tools at http://www.sorting-algorithms.com/, students will investigate, compare, and contrast sorting algorithms. Notice the grid in the center of the page. Each column is a particular sorting algorithm, and each row is an ordering of horizontal bars (either random, nearly sorted, reversed order, or few unique). Each algorithm will sort the bars in a given cell from top to bottom in increasing order by length.
Ask your students to interact with the website by clicking the green start icons and observing how long it takes each algorithm to sort its bars relative to the other algorithms.
Some questions to have them discuss or record in their journal could include:
Make sure your students understand that the size and order of input data can affect how long an algorithm takes. You should direct or help your students discover that bubble sort is a slow sorting algorithm that can be fairly fast for nearly sorted data. You may wish to discuss that bubble sort is O(n^{2}) in the worst case, explaining why it takes so long for large input, but is O(n) in the best case, which is when input is already (or nearly) sorted. In contrast, selection sort is O(n^{2}) in both the worst and best cases, and merge sort is O(n log n) in both the worst and best cases. In general, most sorting algorithms that we would want to use are O(n log n), since O(n^{2}) is usually too slow. You may also want to mention that bubble sort is considered one of the most inefficient sorting algorithms and that quick sort’s worst performance is on already sorted data, so some quick sort implementations shuffle the inputs before sorting to avoid that situation.
Watch one or more of the available movie clips that compare the performance of sorting algorithms:
Suggested list of videos (Many more are available):
Journal: Remind your students about the sorting algorithms from the previous session and have them answer the following questions:
The students will measure and analyze the effect of sorting set size on execution time for a given sorting algorithm using Python code. Using the timedsorts.py file in the lesson resources folder as a basis, the students will perform an experimental analysis to compare sorting algorithms by timing them on input data of different sizees. They will hypothesize, design and code their experiment, collect results, and write a report for homework.
The sorting functions available in the Python code include: quick sort, merge sort, selection sort, insertion sort, and bubble sort. For advanced students or classes may, you may wish to have them implement additional sorting algorithms.
The sample code includes helper functions to generate random data, to load data from a file, and to time sorting functions on the data. Example code for invoking these functions is included at the end of the file. You can remove this example code before sharing it with your students if you wish to emphasize the programming and critical thinking required to do this project.
Each student (or pair or group) needs their own copy of the Python code to modify for their experiments.
Students will compare sorting algorithms by timing them with Python code on input data of various sizes. Have your students (individually or in pairs) make a hypothesis about what will happen as the size of data input increases, answering the following questions:
Have your students write out a description of the steps they will take to perform the experiment.
Have students modify their Python sorting code to implement the experimental steps they outlined. Students must:
The data collection should be completed by the end of class, but students will continue to work on this activity by writing a report describing their results.
Assign as homework to write a short report about the findings, making sure to:
Students must complete a short research report on their sorting algorithm research procedure, results, and analysis of the results.
The teacher may decide to have the students choose how they want to organize the empirical analysis effort. Alternatively, scaffolding with a worksheet or checklist could be used to guide the students through the data collection and analysis tasks.
The following "Checks for Understanding" could be used to guide the students towards the three learning objectives:
Objective: SWBAT identify families of correct algorithms that have different efficiencies in their problem solving approach.
Objective: SWBAT demonstrate logical reasoning and metrics is used to describe an algorithm’s efficiency.
Objective: SWBAT to perform empirical analysis of sorting algorithms by running the algorithms on different inputs.
Students will complete a short research report on their sorting algorithm research procedure, results, and analysis of the results.
Unit 5. Data Manipulation
Revision Date: Jul 18, 2016 (Version 2.1)Summary: Students are introduced to the theory of computation, computability, the halting problem, and advanced algorithms. In particular, they will learn about heuristic search used by artificial intelligence (AI) programs to play games.
Objective:
Students will be able to:
Overview:
Session 1
Session 2
Student computer usage for this lesson is: required
Links to videos and online tools as indicated in the lesson plan.
Alternative instruction could include the Towers of Hanoi problem and discuss the algorithm for solving it. Some demonstrations are available here:
Think-Pair-Share: In pairs, think about and try to answer each of the following questions:
Note: just give them a few minutes to try the factoring, but round them up to continue and discuss: which operations were much harder to perform than their inverse? Can you just invert the steps, and why or why not?
Make a connection to the previous lesson by comparing these to sorting algorithms, where some are speedy and efficient like Merge sort and Quick sort, and others are unusably slow, like Bubble sort. Highlight the difference that different problems have different lower bounds on optimal solutions, and that some problems like integer factorization have solutions but take too long to be solved in a practical way.
Discuss the definition of computation (in a theoretical sense) with your students. Computation is input plus processing to get output. A computer is one system that is a "model of computation" since it takes input, processes it, and produces output.
Another model of computation is called a Turing machine, named after Alan Turing (one of the most famous computer scientists). A Turing machine is a theoretical entity that has a tape of symbols (a line of 0s and 1s), a head that can read only one symbol at a time, and an internal state that can change based on instructions as the head reads symbols. Turing and a mathematician called Alonzo Church are responsible for the "Church-Turing" thesis, which says that a Turing machine can compute anything that a digital computer can. This is a fundamental idea of the theory of computation, and has the implication that anything one computer is capable of doing is possible to be computed by another, given enough resources (time and memory).
Now discuss the idea of computability with your students. Ask your students to answer or think-pair-share: are there things it is impossible for a computer to compute? The most classic "undecidable" (non-computable) question is called the Halting Problem. The Halting Problem is: make a program that can tell if another program will halt (terminate at some point eventually) or will loop forever and never end.
The Halting Problem is impossible for a computer to compute, which you can prove (informally) by paradox. Suppose you did have a program that solved the Halting Problem, called HALT(X), which takes the code for some program X as input and says "yes" if X terminates or "no" if X loops forever. Then you could write a new program that uses HALT inside it, which we will call PARADOX(X). First PARADOX(X) will run HALT(X) and if the result is no, PARADOX will halt, but if the result is yes, then PARADOX will loop forever. But here is the problem: what if we use the code for PARADOX as the input to PARADOX, running PARADOX(PARADOX)? If it says that PARADOX halts, then PARADOX runs forever, and if it says PARADOX runs forever, then PARADOX halts. This problem is a paradox and does not make sense because the premise, that a program called HALT could exist, must be wrong! Therefore, the Halting Problem is impossible for a computer to solve.
Video explanation with optional student simulation
Think-Pair-Share:
This session concerns advanced algorithms, in particular heuristic search, which is commonly used in artificial intelligence. Refresh your students' minds on the definitions of computation, computability, and undecidable problems. Additionally, mention the properties we consider when we compare algorithms:
Introduce the idea of heuristic search, which is a class of algorithms used in many artificial intelligence programs. A heuristic is something that is used to find a good solution in a reasonable time, and a heuristic search algorithm is an algorithm that uses heuristics to determine how to search through some space.
A great way to introduce heuristic search is first to discuss game trees. A game tree is a structure that is used to represent the "space" of a game that an algorithm wants to search through.
Think of a game like chess: you make a move, the opponent makes a move, and the process continues until the ending conditions have been met (one player in checkmate or stalemate). A game tree is a mathematical structure used by AI and heuristic search algorithms to model the moves made in a chess game. At any turn, we can make a "tree" by drawing the root node as representing the current state of the board and drawing one branch under it for every possible move. In Tic-Tac-Toe, if you are the starting player, then the root node represents a blank board, and there will be nine branches, one for each possible move (each space where you could place your mark). Following a branch in the game tree takes you to a new node that represents the configuration of the game that results from having taken that move. In Tic-Tac-Toe, if I am the first player and place my X in the center space, I have "followed" that branch down the tree to a new node that represents the board with an X in the center space. The opponent then uses this node as the root of their game tree, and has a branch for each of their possible moves.
Think-Pair-Share: Have your students pair off and play a game of Tic-Tac-Toe and try to draw the game tree as they play it, drawing the nodes for each move they made and every potential branch from those nodes. Bring them back into discussion and ask them what if they had to draw out every node followed down every branch? Now ask them to imagine the game tree for chess, which has 20 possible moves on the first turn, 400 on the second, and many, many more as the game goes on. How can an artificially intelligent program learn to play chess when there are so many (too many) options? Chess actually has around 35^{100 }nodes in its tree and 10^{40} legal states.
Heuristic search on game trees is one way AI programs are able to play games like chess. How good are computer game players?
Typically games modelled with game trees are 2-person games, players alternate moves, and they are zero-sum (meaning one player's loss is the other's gain). More complicated elements in such games may have include: hidden information (like other players' hands), chance (dice), or multiple players.
How does an AI program use heuristic search to play a game? Typically in these steps:
The key problems are:
For evaluation, some function is typically coded or learned over time.
Refer to the "Advanced Algorithms" slides in the lesson resources folder for examples of uninformed search. For an activity, you may want to create a game tree for Tic-Tac-Toe and have your students walk through how each of the following algorithms would operate over it.
Uninformed Search are algorithms that work without a heuristic, using no information about the likely "direction" of the goal node. Algorithms include:
For any games with variety and complexity, certainly for chess and even checkers, uninformed search is simply too slow because it is exhaustive. This problem is another example, like with sorting, where the efficiency of our solution matters a great deal. To get programs to play games, we need them to be efficient and intelligent about the number and quality of moves they consider.
Informed Search algorithms each follow some heuristic that uses information about the game to determine smart directions to explore. Examples include:
Advanced classes may wish to discuss local search algorithms, such as hill-climbing and genetic algorithms (in the "Advanced Algorithms" slides in the lesson folder).
Minimax
Thinking about game trees again, we want to select the branch that takes us to a node with the maximum evaluated state. But there is a catch: the opponent gets to make moves, too. That is, every other branch in our game tree is the opponent's turn. How does the AI program account for the other player?
Perhaps most logically, the way AI programs do so is to assume the other player will play optimally. Just as the AI will take the branch that leads to the state with the greatest evaluation, it assumes the other player takes the branch leading to the state that will maximize their position. In other words, the AI searches through their game tree by following the branch with the maximum value on their turn, and following the branch with the minimum value on the opponent's turn. This algorithm is called minimax and is the basis of nearly all AI that play 2-person zero-sum games.
For the Halting Problem proof, it is important that students can translate the solution that is on the video into a representation that makes (some) sense to them. Acting out the inputs and outputs of the set of machines is an approach worth trying.
The following "Checks for Understanding" could be used to guide the students towards the three learning objectives.
Objective: Students will identify some Advanced Algorithms that Exploit Inverse Operations Efficiency.
Objective: Students will identify some Advanced Algorithmic Techniques.
Objective: SWBAT discuss at least one example of a computing problems that is unsolvable
Students will be able to summarize -- in their own words or with simple models -- the proof of the Halting Problem.
Students will be able to identify the sensitivity of cryptography to the difficulty of factoring large numbers.
Unit 5. Data Manipulation
Revision Date: Aug 19, 2016 (Version 2.1)Summary
Students will define, design and implement a programming project: a miniature version of the Create Performance Task. They will create presentations and share with groups the projects they developed and how their project used abstractions.
Outcomes
Overview
Session 1:
Presentation (5 min) Students are introduced to the Create Task.
Activity (15 min) Students select a small programming project to model the Create Task.
Activity (10 min) Students identify an algorithm to use in their program and share within tabe groups.
Activity (10 min) Students identify an an abstraction to use in their program.
Wrap Up (5 min) Students share the algorithms and abstractions they will use with the class.
Session 2:
1. Journal (2 min) - What is your plan for today for the development of your project? What abstractions do you plan on using in your project?
2. Activity (43 min) - Students complete implementing their projects.
3. Journal (5 min) - Reflection. What abstractions did you use in your project?
Session 3:
1. Getting Started (5 min) - Students individually respond to three prompts about their projects.
2. Activity (20 min) - Students prepare one minute presentations of their projects.
3. Presentations (15 min) - Students present their project to table groups.
4. Wrap Up (5 min) - Students create exit slips with any questions about the Create Task.
Students practice choosing a project and planning how to implement it in a fixed time frame.
Students have just two days to plan and implement a project. Since these will be small projects, students may need help using algorithms and data abstraction. Since an algorithm is a list of steps that comes to a conclusion, if students develop pseudocode for their projects they can refer to the pseudocode as their algorithm.
Students may receive most of the credit from an incomplete project if the project demonstrates the required components.
For this practice task, teachers may want to provide program stubs. Stubs could include suggested functions.
Present an overview of the Create Task.
Explain that students will have 12 hours to complete the Create Task later in the course and they will three 50-minute sessions for this practice. The actual Create Task will have a suggested collaborative component and be larger in scope.
Discuss the following guidelines for the full project and the practice project we will be doing.
Three components to create:
General:
One project - individual with collaboration in stages
12 hours of classroom time
Project must use functional and data abstraction.
Report: Written responses (response to all prompts combined must not exceed 750 words, exclusive of the Program Code.):
a. Provide a written response or audio narration in your video that: Identifies the programming language and identifies the purpose of your program.
b. Describe the incremental and iterative development process of your program
c. Describe how a selected algorithm functions.
d. Explain how an abstraction you developed helped manage complexity
For this practice task, students will complete simpler project and a one-minute presentation about it, rather than a video and a report.
Students work individually to select projects.
After completing the project, students will create a one-minute presentation about it. Presentation should address the following.
a. Identify the programming language and the purpose of your program.
b. Describe the incremental and iterative development process of your program
c. Describe how a selected algorithm functions.
d. Explain how an abstraction you developed helped manage complexity
The presentation must address at least points a and b of the above and c or d.
Projects are chosen by the student. If they wish, their projects may be based on the following labs from How to Think Like a Computer Scientist.
Labs
Students select a project and share their ideas with partners.
After collaborating with partners, students submit to their teacher a brief description of the project describing its most important features and how it will work.
Students identify an algorithm and an abstraction to use in their projects. The algorithm should be written as pseudocode and then shared with their partners.
Students should discuss in groups what abstraction they chose and how they think it would be helpful.
Students complete a brief journal entry describing:
Students work to implement and test projects. Teachers may evaluate student performance based on student journal entries and their observations of their effort in implementing the project.
Students reflect on their project and making journal entry of how they used abstraction in the project.
Students begin by individually responding to these prompts about their project:
b. describe the purpose, how your program code works and the most important features and algorithms
d. describe the development process
e. explain an abstraction and how it helped manage complexity
Students prepare one-minute presentation about their projects including their responses to prompts b, d and e.
Students present their project to table groups. Time the presentations so that they do not exceed 1 minute. Students share with table groups what they like about the project, what they learned and any questions they have.
Students create exit slips with any questions they have about the Create Task after viewing and discussing the presentations.
For the practice task, project descriptions and pseudocode for each proposed project should be assessed. Assessment can be done by collaborative partners first. If partners have concerns, they should be brought to the teacher. If student projects are too big or too small in scope, teachers should provide feedback.
The project should be scored using the latest rubric provided by the College Board.
The latest rubric (updated as of June 2016) is in the lesson folder.