Unit 4. Data Acquisition
Revision Date: Sep 28, 2015 (Version 1.2)
This lesson introduces students to reading information from an input file and writing to an output file in Python. The students will then apply these concepts to program a simple Dice Roll application that generates data. This lesson prepares students to read and write files in later Data Acquisition lessons.
countif. (This exercise may be assigned as homework if students have the computing resources to complete a programming assignment as homework.)
The students must understand how to open and read from an input file using Python.
The students must understand how to declare and write to an output file using Python.
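The two objectives above can be illustrated with a minimal sketch. It assumes Python 3 syntax (the textbook's examples use Python 2) and a hypothetical file name, demo_output.txt, that is not part of the Lesson Resources Folder. The program writes two lines to an output file and then reads them back one line at a time.

```python
# Hypothetical file name, used for illustration only.
filename = "demo_output.txt"

# Open the file for writing; "w" creates it or overwrites an existing file.
with open(filename, "w") as fout:
    fout.write("first line\n")
    fout.write("second line\n")

# Open the same file for reading and loop over it one line at a time.
lines_read = []
with open(filename) as fin:
    for line in fin:
        lines_read.append(line.rstrip())  # rstrip() removes the trailing newline

print(lines_read)
```

The with statement closes each file automatically. The textbook instead calls open() and close() explicitly, so either style may be shown to students.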
Student computer usage for this lesson is: required
Python for Informatics by Charles Severance, http://www.pythonlearn.com/book.php.
Explanation of the CountIf function in Excel, http://office.microsoft.com/en-us/excel-help/countif-HP005209029.aspx.
The mbox.txt and mbox-short.txt files are in the Lesson Resources Folder.
What are the advantages and disadvantages of:
Have students review their journal entries as a class and note the advantages and disadvantages on a white board.
The students should code the examples in the book as the teacher proceeds through the lessons.
countIf function in Excel.
For example, =COUNTIF(A1:A1000,1) counts how many 1s are in the range A1 to A1000. You can show the example on the Microsoft Office help website: http://office.microsoft.com/en-us/excel-help/countif-HP005209029.aspx
countif function to compare the distribution of the rolls (how many times each number 2 through 12 was rolled with the pair of six-sided dice) to the distribution for the 12-sided die.
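One possible shape for the Dice Roll application is sketched below. This is not the lesson's official solution: the number of rolls (1000), the file names pair_rolls.txt and d12_rolls.txt, and all function names are assumptions chosen for illustration. The sketch writes each set of rolls to an output file, one roll per line, then reads the files back and tallies the values, which gives a Python cross-check of the counts students obtain with COUNTIF in Excel.

```python
import random
from collections import Counter

NUM_ROLLS = 1000  # assumed number of rolls; the lesson does not fix a value


def roll_pair():
    """Return the sum of two six-sided dice (2 through 12)."""
    return random.randint(1, 6) + random.randint(1, 6)


def roll_d12():
    """Return one roll of a 12-sided die (1 through 12)."""
    return random.randint(1, 12)


def write_rolls(filename, roll_function, count):
    """Write `count` rolls to `filename`, one roll per line."""
    with open(filename, "w") as fout:
        for _ in range(count):
            fout.write(str(roll_function()) + "\n")


def count_rolls(filename):
    """Read rolls back from a file and tally how often each value occurs."""
    with open(filename) as fin:
        return Counter(int(line) for line in fin)


# Hypothetical output file names.
write_rolls("pair_rolls.txt", roll_pair, NUM_ROLLS)
write_rolls("d12_rolls.txt", roll_d12, NUM_ROLLS)

pair_counts = count_rolls("pair_rolls.txt")
d12_counts = count_rolls("d12_rolls.txt")

# Print the two tallies side by side for each value 2 through 12.
for value in range(2, 13):
    print(value, pair_counts[value], d12_counts[value])
```

Each output file has one number per line, so it can be opened directly in Excel as a single column for the COUNTIF comparison.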
Have students work in pairs as the new concepts are introduced and practiced.
For a class needing more scaffolding: Work as a group. Have students take turns around the room to read aloud the brief text in each section in Chapter 7. Do the short exercises together with a "row captain" assigned to each row (or group) in the classroom who is in charge of checking that everybody in their row has completed each short task and has gotten the help needed to finish. Row captains help each other until the entire class has successfully completed each task. Report out on what challenges were encountered, recording problems and solutions at the front of the classroom as the class works. Rotate the role of row captain for each section.
For more independent students: Introduce/demonstrate the key ideas first and then allow students to work through Chapter 7 at their own pace.
The teacher will check the students' code for understanding.
The teacher will check for understanding as each new concept is introduced.
Exercise 7.1 Write a program to read through a file and print the contents of the file (line by line) all in upper case. Executing the program will look as follows:
Enter a file name: mbox-short.txt
FROM STEPHEN.MARQUARD@UCT.AC.ZA SAT JAN 5 09:14:16 2008
RECEIVED: FROM MURDER (MAIL.UMICH.EDU [22.214.171.124])
BY FRANKENSTEIN.MAIL.UMICH.EDU (CYRUS V2.3.8) WITH LMTPA;
SAT, 05 JAN 2008 09:14:16 -0500
You can download the sample input file from www.py4inf.com/code/mbox-short.txt
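For teacher reference, one possible approach to Exercise 7.1 is sketched below, assuming Python 3. The function name upper_lines and the stand-in file sample_upper.txt (with made-up contents) are hypothetical. They let the sketch run without mbox-short.txt and without waiting for keyboard input.

```python
def upper_lines(filename):
    """Return each line of the file in upper case, without the trailing newline."""
    result = []
    with open(filename) as fhand:
        for line in fhand:
            result.append(line.rstrip().upper())
    return result


# Build a tiny stand-in file so the sketch runs on its own.
sample_name = "sample_upper.txt"  # hypothetical file name
with open(sample_name, "w") as fout:
    fout.write("Hello from the sample file\n")
    fout.write("second line, mixed Case\n")

# In class, replace sample_name with input("Enter a file name: ")
# to match the prompt shown in the exercise.
for line in upper_lines(sample_name):
    print(line)
```

Removing the trailing newline with rstrip() before printing avoids the blank line that would otherwise appear between output lines.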
Exercise 7.2 Write a program to prompt for a file name, and then read through the file and look for lines of the form:
X-DSPAM-Confidence: 0.8475
When you encounter a line that starts with “X-DSPAM-Confidence:” pull apart the line to extract the floating point number on the line. Count these lines and then compute the total of the spam confidence values from these lines. When you reach the end of the file, print out the average spam confidence.
Enter the file name: mbox.txt
Average spam confidence: 0.894128046745
Enter the file name: mbox-short.txt
Average spam confidence: 0.750718518519
Test your program on the mbox.txt and mbox-short.txt files.
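For teacher reference, one possible approach to Exercise 7.2 is sketched below, assuming Python 3. The function name average_spam_confidence and the stand-in file sample_spam.txt (with made-up values) are hypothetical, so the sketch runs without the mbox files. Run against mbox.txt or mbox-short.txt, a correct program should reproduce the averages shown above.

```python
def average_spam_confidence(filename):
    """Average the numbers on lines that start with X-DSPAM-Confidence:.

    Returns None if the file contains no such lines.
    """
    prefix = "X-DSPAM-Confidence:"
    count = 0
    total = 0.0
    with open(filename) as fhand:
        for line in fhand:
            if not line.startswith(prefix):
                continue  # skip lines that are not confidence lines
            # Everything after the prefix is the number; float() ignores
            # the surrounding spaces and the trailing newline.
            total += float(line[len(prefix):])
            count += 1
    if count == 0:
        return None
    return total / count


# Build a tiny stand-in file with made-up values so the sketch runs on its own.
sample_name = "sample_spam.txt"  # hypothetical file name
with open(sample_name, "w") as fout:
    fout.write("X-DSPAM-Confidence: 0.5\n")
    fout.write("Subject: not a confidence line\n")
    fout.write("X-DSPAM-Confidence: 1.0\n")

# In class, replace sample_name with input("Enter the file name: ").
print("Average spam confidence:", average_spam_confidence(sample_name))
```

Returning None when no matching lines are found avoids a division by zero, a case worth discussing with students who test on other files.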