
Hello world


Write a program that uses a print statement to say ‘hello world’ as shown in ‘Desired Output’.


print('hello world')


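Python strings can be delimited with four kinds of quotes: ' " ''' """. A string in single quotes can contain double quotes without escaping them, and vice versa; for example, the single quotes in "this is 'a' test" need no escaping. Triple single quotes and triple double quotes behave identically: the string is kept exactly as written, line breaks included, so they are normally used for multi-line text. This comes in handy later when executing SQL statements. Single and double quotes inside a triple-quoted string need no escaping either.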
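A minimal sketch of the quoting rules described above. The SQL text is only an illustration; the table and column names are made up.

```python
# Single and double quotes are interchangeable; each can hold the other unescaped.
a = "this is 'a' test"
b = 'this is "a" test'
c = 'this is \'a\' test'   # same text as a, but the inner quotes must be escaped

# Triple quotes keep line breaks and both kinds of quotes exactly as typed.
# The table and column names are hypothetical.
query = """SELECT name
FROM users
WHERE note = 'say "hi"'"""

print(a == c)
print(query)
```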



Write a program to prompt the user for hours and rate per hour using input to compute gross pay. Use 35 hours and a rate of 2.75 per hour to test the program (the pay should be 96.25). You should use input to read a string and float() to convert the string to a number. Do not worry about error checking or bad user data.


Desired Output

Pay: 96.25


# This first line is provided for you

hrs = input("Enter Hours:")
hrs = float(hrs)

rate = float(input("Enter Rate:"))
print('Pay:', hrs * rate)



Write a program to prompt the user for hours and rate per hour using input to compute gross pay. Pay the hourly rate for the hours up to 40 and 1.5 times the hourly rate for all hours worked above 40 hours. Use 45 hours and a rate of 10.50 per hour to test the program (the pay should be 498.75). You should use input to read a string and float() to convert the string to a number. Do not worry about error checking the user input - assume the user types numbers properly.



hrs = input("Enter Hours:")
h = float(hrs)

rate = float(input("Enter Rates:"))
if h > 40:
    # first 40 hours at the normal rate, the rest at time-and-a-half
    res = 40 * rate + (h - 40) * rate * 1.5
else:
    res = rate * h
print(res)



Write a program to prompt for a score between 0.0 and 1.0. If the score is out of range, print an error. If the score is between 0.0 and 1.0, print a grade using the following table:

Score Grade
>= 0.9 A
>= 0.8 B
>= 0.7 C
>= 0.6 D
< 0.6 F

If the user enters a value out of range, print a suitable error message and exit. For the test, enter a score of 0.85.



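score = input("Enter Score: ")
grade = ['D','C','B','A']

try:
    score = float(score)
except ValueError:
    print('Error: please enter a number')
    quit()

if score < 0 or score > 1:
    print('Error: score out of range')
    quit()

score = int(score * 10)

if score < 6:
    print('F')
elif score > 9:
    print('A')
else:
    # tenths digit 6, 7, 8, 9 maps to D, C, B, A
    print(grade[score - 6])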




Write a program to prompt the user for hours and rate per hour using input to compute gross pay. Pay should be the normal rate for hours up to 40 and time-and-a-half for the hourly rate for all hours worked above 40 hours. Put the logic to do the computation of pay in a function called computepay() and use the function to do the computation. The function should return a value. Use 45 hours and a rate of 10.50 per hour to test the program (the pay should be 498.75). You should use input to read a string and float() to convert the string to a number. Do not worry about error checking the user input unless you want to - you can assume the user types numbers properly. Do not name your variable sum or use the sum() function.



def computepay(h, r):
    if h <= 40:
        return h * r
    else:
        # first 40 hours at the normal rate, the rest at time-and-a-half
        return 40 * r + (h - 40) * r * 1.5

hrs = input("Enter Hours:")
hrs = float(hrs)
rat = float(input("Enter Rates:"))

p = computepay(hrs, rat)
print("Pay", p)



Write a program that repeatedly prompts a user for integer numbers until the user enters ‘done’. Once ‘done’ is entered, print out the largest and smallest of the numbers. If the user enters anything other than a valid number catch it with a try/except and put out an appropriate message and ignore the number. Enter 7, 2, bob, 10, and 4 and match the output below.

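In short: keep reading integers from the user until "done" is entered, then print the largest and smallest of the numbers. Every input must be validated, and invalid input should print Invalid input.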


largest = None
smallest = None
while True:
    num = input("Enter a number: ")
    if num == "done":
        break
    try:
        num = int(num)
    except ValueError:
        print('Invalid input')
        continue

    if largest is None:
        largest = num
        smallest = num
    else:
        if largest < num:
            largest = num
        if smallest > num:
            smallest = num

print("Maximum is", largest)
print("Minimum is", smallest)
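To check the logic against the test inputs 7, 2, bob, 10 and 4 without typing them in, the same idea can be wrapped in a function that takes the entries as a list of strings. This is only a sketch; the function name `min_max` is made up here and is not part of the assignment.

```python
def min_max(entries):
    """Return (largest, smallest) over the valid integers in entries.

    Reading stops at 'done'; anything that is not an integer is
    reported with 'Invalid input' and skipped.
    """
    largest = None
    smallest = None
    for entry in entries:
        if entry == "done":
            break
        try:
            num = int(entry)
        except ValueError:
            print("Invalid input")
            continue
        if largest is None or num > largest:
            largest = num
        if smallest is None or num < smallest:
            smallest = num
    return largest, smallest


largest, smallest = min_max(["7", "2", "bob", "10", "4", "done"])
print("Maximum is", largest)
print("Minimum is", smallest)
```

Folding the `None` check into each comparison removes the need for a separate first-number branch.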


Write code using find() and string slicing (see section 6.10) to extract the number at the end of the line below. Convert the extracted value to a floating point number and print it out.


text = "X-DSPAM-Confidence:    0.8475"
index = text.find('0')

res = float(text[index:])
print(res)
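Searching for '0' works for this line only because the value happens to start with 0. A sketch of a slightly more general variant, assuming the number always follows the first colon, slices after the colon instead:

```python
text = "X-DSPAM-Confidence:    0.8475"

colon = text.find(':')            # index of the first colon, or -1 if absent
value = float(text[colon + 1:])   # float() ignores the leading spaces
print(value)
```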


Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:

X-DSPAM-Confidence:    0.8475

Count these lines and extract the floating point values from each of the lines and compute the average of those values and produce an output as shown below. Do not use the sum() function or a variable named sum in your solution.

The sample data file is available for download. When testing, enter mbox-short.txt as the file name.

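This is a file-handling exercise: extract the values from the lines that match the given format and compute their average. The expected output is Average spam confidence: 0.750718518519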

def getNumber(line : str):
    index = line.find(':') + 1  # searching for the colon is more robust than searching for '0'
    return float(line[index:])

# Use the file name mbox-short.txt as the file name
fname = input("Enter file name: ")
if len(fname) < 1:
    fname = 'mbox-short.txt'
fh = open(fname)

Sum = 0
num = 0
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:"):
        continue
    Sum = Sum + getNumber(line.rstrip())
    num = num + 1

print('Average spam confidence:', Sum / num)


Open the file romeo.txt and read it line by line. For each line, split the line into a list of words using the split() method. The program should build a list of words. For each word on each line check to see if the word is already in the list and if not append it to the list. When the program completes, sort and print the resulting words in alphabetical order.

The sample data file is available for download.


fname = input("Enter file name: ")
if len(fname) < 1:
    fname = 'romeo.txt'
fh = open(fname)
lst = list()

for line in fh:
    line = line.rstrip()
    words = line.split()

    for word in words:
        if word not in lst:
            lst.append(word)

lst.sort()
print(lst)



Open the file mbox-short.txt and read it line by line. When you find a line that starts with ‘From ’ like the following line:

From Sat Jan  5 09:14:16 2008

You will parse the From line using split() and print out the second word in the line (i.e. the entire address of the person who sent the message). Then print out a count at the end.

Hint: make sure not to include the lines that start with ‘From:’.

The sample data file is available for download.


fname = input("Enter file name: ")
if len(fname) < 1:
    fname = "mbox-short.txt"

fh = open(fname)
count = 0

for line in fh:
    if not line.startswith('From '):
        continue
    line = line.split()
    print(line[1])
    count = count + 1

print("There were", count, "lines in the file with From as the first word")
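The effect of testing for 'From ' with the trailing space can be seen on a few lines held in a list. This is a sketch only: the lines are hypothetical and the address is a placeholder, not taken from the data file.

```python
# Hypothetical sample lines; the address is a placeholder.
lines = [
    "From someone@example.com Sat Jan  5 09:14:16 2008",
    "From: someone@example.com",
    "Subject: hello",
]

count = 0
senders = []
for line in lines:
    # 'From ' (with the trailing space) does not match the 'From:' header
    if not line.startswith("From "):
        continue
    words = line.split()
    senders.append(words[1])   # second word is the sender's address
    count = count + 1

print(senders, count)
```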


Write a program to read through the mbox-short.txt and figure out who has sent the greatest number of mail messages. The program looks for ‘From ’ lines and takes the second word of those lines as the person who sent the mail. The program creates a Python dictionary that maps the sender’s mail address to a count of the number of times they appear in the file. After the dictionary is produced, the program reads through the dictionary using a maximum loop to find the most prolific committer.


name = input("Enter file:")
if len(name) < 1:
    name = "mbox-short.txt"
handle = open(name)
res = dict()

for line in handle:
    if not line.startswith('From '):
        continue

    line = line.split()
    author = line[1]
    res[author] = res.get(author, 0) + 1

name = None
count = None
for k, v in res.items():
    if count is None or v > count:
        count = v
        name = k

print(name, count)


Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out from the ‘From ’ line by finding the time and then splitting the string a second time using a colon.

From Sat Jan  5 09:14:16 2008

Once you have accumulated the counts for each hour, print out the counts, sorted by hour as shown below.


name = input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
handle = open(name)

status = {}
for line in handle:
    if not line.startswith('From '):
        continue

    # Grab the time field (hh:mm:ss)
    line = line.split()
    line = line[5]
    line = line.split(':')

    # Take the hour part
    tim = line[0]
    if tim in status:
        status[tim] = status[tim] + 1
    else:
        status[tim] = 1

res = sorted([(k, v) for (k, v) in status.items()])
for k, v in res:
    print(k, v)
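The two-step split can be traced on a single line. This sketch assumes the sender's address is the second word of the line, which is why the time sits at index 5; the address used here is a placeholder.

```python
# Placeholder address; in the data file the real sender appears here.
line = "From someone@example.com Sat Jan  5 09:14:16 2008"

fields = line.split()        # split() collapses the double space before '5'
clock = fields[5]            # the hh:mm:ss field
hour = clock.split(':')[0]   # text before the first colon
print(hour)
```

The hour stays a string such as '09', so sorting the keys as text still gives chronological order because every hour has two digits.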


Regular Expressions

Finding Numbers in a Haystack

In this assignment you will read through and parse a file with text and numbers. You will extract all the numbers in the file and compute the sum of the numbers.

Data Files

We provide two files for this assignment. One is a sample file where we give you the sum for your testing and the other is the actual data you need to process for the assignment.

Make sure to save the file in the same folder where you will be writing your Python program. Note: Each student will have a distinct data file for the assignment, so only use your own data file for analysis.

Data Format

The file contains much of the text from the introduction of the textbook except that random numbers are inserted throughout the text. Here is a sample of the output you might see:

Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity.  You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem.  This book assumes that
everyone needs to know how to program ...

The sum for the sample text above is 27486. The numbers can appear anywhere in the line. There can be any number of numbers in each line (including none).

Handling The Data

The basic outline of this problem is to read the file, look for integers using the re.findall(), looking for a regular expression of ‘[0-9]+’ and then converting the extracted strings to integers and summing up the integers.
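As a quick check of that outline, the sketch below applies it to the sample text quoted above, held in a string instead of a file, and should reproduce the stated sum of 27486. The variable names are arbitrary.

```python
import re

sample = """Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity.  You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem.  This book assumes that
everyone needs to know how to program ...
"""

total = 0
for line in sample.splitlines():
    # findall returns every run of digits on the line as a string
    for digits in re.findall('[0-9]+', line):
        total = total + int(digits)

print(total)
```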

My Code

# Sum all the numbers found in the file
import re

filename = input('Input filename:')
if len(filename) < 1:
    filename = 'regex_sum_275911.txt'

res = 0
fh = open(filename)
for line in fh:
    tmp = re.findall('[0-9]+', line)

    # Add the numbers found on this line to the running total
    for x in tmp:
        res = res + int(x)

print(res)



Python 2:
import re
print sum( [ ****** *** * in **********('[0-9]+',**************************.read()) ] )

Python 3:
import re
print( sum( [ ****** *** * in **********('[0-9]+',**************************.read()) ] ) )