I have to write a program that uses the predefined functions 'recall' and 'precision' to calculate the effectiveness of a spam filter. It's been a while since I've done any programming, so I'm a little rusty. So far all I have is 'import sys'.
The functions are defined in another program, so to get them into my program would I do 'import [program]', or do I have to import the functions individually?
The data files consist of two fields separated by a tab. In the first file, the first field gives the name of each email and the second field tells whether it is spam or not: 1 if it is spam and 0 if it is not. The second file tells what the filter thought: 1 if it thought the email was spam and 0 if it did not. So would I need to strip and compare the files? Do I need to use pickle? Not really sure where to begin on this one. Any help or info is appreciated!
precision and recall on a spam filter
7 Replies - 682 Views - Last Post: 13 December 2012 - 11:56 AM
Replies To: precision and recall on a spam filter
#2
Re: precision and recall on a spam filter
Posted 12 December 2012 - 11:41 AM
This ought to help you get started
Say I have spammy.py, which is a library that contains two functions, mine and yours.
import spammy
This lets you say:
spammy.mine
spammy.yours
from spammy import mine
from spammy import yours
This lets you say:
mine
yours
file_truth_text = open("replace_this_with_path_to_your_file_truth_table.txt", 'r').readlines()
for line in file_truth_text:
    print(line)
You can also say:
truth_file = open("path/to/your/file", 'r')
for line in truth_file:
    print(line)
Do some stuff with that and come back when you hit another wall you can't beat down.
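To pick up from there, here is a rough sketch of turning those lines into something you can compare. It assumes each line looks like name<TAB>label with the label being 0 or 1; the function name parse_labels and the sample lines are made up for illustration. Plain text like this can be handled with strip and split, so pickle shouldn't be needed.

```python
def parse_labels(lines):
    """Map each email name to its 0/1 label from tab-separated lines."""
    labels = {}
    for line in lines:
        # Split on the last tab: everything before it is the name.
        name, sep, flag = line.strip().rpartition("\t")
        if not sep or flag not in ("0", "1"):
            continue  # skip blank lines, headers, or anything malformed
        labels[name] = int(flag)
    return labels


# Hypothetical sample lines standing in for the contents of a data file.
sample = ["email1\t1\n", "email2\t0\n"]
print(parse_labels(sample))
```

The same function works on an open file object, since iterating over a file yields one line at a time.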
#3
Re: precision and recall on a spam filter
Posted 12 December 2012 - 01:19 PM
here's what I have so far:
import hw10_lib
from hw10_lib import precision
from hw10_lib import recall
actual = open("~/hw10/hw10.ref", 'r').readlines()
for line in actual:
    print(line)
I'm getting an IOError: [Errno 2] No such file or directory: '~/hw10/hw10.ref'
Which is odd, because I did an ls and the file is right there in the directory.
#4
Re: precision and recall on a spam filter
Posted 12 December 2012 - 02:52 PM
Python's open() doesn't understand the ~ shorthand for your home directory (the shell expands it, not Python). You have to spell the path out.
So use open("/home/marth17/hw10/hw10.ref") instead of open("~/hw10/hw10.ref"), assuming that /home/marth17 is your home directory.
#5
Re: precision and recall on a spam filter
Posted 12 December 2012 - 03:01 PM
On second thought, it might be better to use Python's os.path module to get the path:
import os
home_path = os.path.expanduser("~")
full_path = os.path.join(home_path, "hw10/hw10.ref")
actual = open(full_path, "r").readlines()
...
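A small aside on the snippet above: os.path.expanduser also accepts the whole path, so the separate join step can be skipped. A minimal sketch, where expand_path is just a hypothetical helper name and the printed result depends on your own account:

```python
import os


def expand_path(path):
    """Return path with a leading ~ replaced by the user's home directory."""
    return os.path.expanduser(path)


# Prints the absolute form of the path; the exact value varies per user.
print(expand_path("~/hw10/hw10.ref"))
```

The result can be passed straight to open().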
#6
Re: precision and recall on a spam filter
Posted 12 December 2012 - 06:02 PM
OK, fixed that, and got it to print out hw10.ref, but now how do I compare that file with the other file in terms of precision and recall?
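For reference, the usual definitions are precision = TP / (TP + FP) and recall = TP / (TP + FN). TP counts emails that are spam and were flagged as spam, FP counts non-spam emails flagged as spam, and FN counts spam the filter missed. Here is a rough sketch of that arithmetic, assuming both files have already been read into equal-length lists of integer 0/1 labels in the same email order. The names count_outcomes, simple_precision and simple_recall are made up for illustration; they are not the functions from hw10_lib, which may behave differently.

```python
def count_outcomes(actual, predicted):
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # spam, and the filter said spam
        elif a == 0 and p == 1:
            fp += 1  # not spam, but the filter said spam
        elif a == 1 and p == 0:
            fn += 1  # spam, but the filter missed it
    return tp, fp, fn


def simple_precision(actual, predicted):
    """Fraction of emails flagged as spam that really are spam."""
    tp, fp, _ = count_outcomes(actual, predicted)
    return float(tp) / (tp + fp) if (tp + fp) else 0.0


def simple_recall(actual, predicted):
    """Fraction of real spam that the filter flagged."""
    tp, _, fn = count_outcomes(actual, predicted)
    return float(tp) / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels, not taken from the hw10 files.
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 0, 1, 0]
print(simple_precision(actual, predicted), simple_recall(actual, predicted))
```

Returning 0.0 when nothing was flagged (or nothing was spam) is just one convention for avoiding a division by zero; the assignment's library may choose another.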
#7
Re: precision and recall on a spam filter
Posted 12 December 2012 - 11:42 PM
When using the predefined functions, I cannot get them to return anything other than the value 1.0.
I know that this is not correct because I am supposed to get a precision result of 0.529411764706.
Also, I am using pop because for some reason the first entry of each list is not a number, so I can't use append(int(...
Here's what I have:
import hw10_lib
from hw10_lib import precision
from hw10_lib import recall
actual = []
for line in open("/path/hw10.ref", 'r'):
    actual.append(line.strip().split('\t')[-1])
actual.pop(0)
predicted = []
for line in open("/path/hw10.hyp", 'r'):
    predicted.append(line.strip().split('\t')[-1])
predicted.pop(0)
prec = precision(actual, predicted)
rec = recall(actual, predicted)
print ('Precision: ', prec)
print ('Recall: ', rec)
#8
Re: precision and recall on a spam filter
Posted 13 December 2012 - 11:56 AM
You should show us the recall and precision functions. Otherwise there is no way to tell what went wrong.
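In the meantime, one guess that can be checked without seeing the library: split returns strings, so the lists in the post above hold values like "1" and "0" rather than the integers 1 and 0. If the functions compare against integers, nothing will match. Whether that explains the 1.0 is only a guess until the definitions are posted. The non-numeric first entry is probably a header row (an assumption). It is also worth confirming that pop(0) runs once after the loop and not on every iteration, since the latter would leave the list empty. A sketch of the conversion, with to_int_labels as a made-up helper name:

```python
def to_int_labels(fields):
    """Keep only the '0'/'1' entries and convert them to ints."""
    return [int(f) for f in fields if f in ("0", "1")]


# Hypothetical column as read from a file, header entry included.
raw = ["label", "1", "0", "1"]

print("1" == 1)            # a string never equals an int
print(to_int_labels(raw))  # header dropped, remaining values converted
```

Note that dropping entries independently in each file could misalign the two lists if one file has an extra bad row, so comparing the list lengths afterwards is a cheap sanity check.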