I wrote a program in Python that takes key/value pairs from a CSV and puts them in a dictionary. The dictionaries are then compared for matches. How would I strip the hyphens from the dictionary keys? As you can see below, the hyphenation is not standardized, which makes comparing difficult. I'm thinking I need to use regex.
 
For example, let's say the CSV looks like the following:

Chemical CAS | Chemical Name
12-589-69 | Ammonia
5748-15-9 | Chlorine

 
The dictionary looks like the following:

d = {'12-589-69': 'Ammonia', '5784-15-9': 'Chlorine'}

I have some code going, but this only works for lists. If I substituted a dictionary, would it work the same way?

import re

infilename = 'myfile.fasta'
outfilename = 'myfile_nodots.fasta'
regex = re.compile("[.-]+ *")

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        outfile.write(regex.sub("", line))
https://linustechtips.com/topic/374803-remove-hyphens-from-dictionary-in-python/

After creating the dictionary, you can use a dict comprehension and string replacement to fix it up.

d = {key.replace('-',''): val for key,val in d.items()}
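As a minimal sketch of that suggestion, the comprehension can be tried on the sample data from the question. The helper name strip_hyphens is hypothetical and not part of the original code, and the sample dictionary assumes the CAS numbers are the keys and the chemical names are the values.

```python
# Sample data taken from the question (assumed layout: CAS number -> chemical name).
d = {'12-589-69': 'Ammonia', '5784-15-9': 'Chlorine'}

def strip_hyphens(mapping):
    """Return a new dict whose keys have every hyphen removed (hypothetical helper)."""
    return {key.replace('-', ''): val for key, val in mapping.items()}

cleaned = strip_hyphens(d)
print(cleaned)  # {'1258969': 'Ammonia', '5784159': 'Chlorine'}
```

Because the comprehension builds a new dictionary, two keys that differ only in hyphen placement collapse into one entry, with the later value winning.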

 

The following code adds spaces to all the keys, which interferes with the comparison operation. Any way to eliminate that?


It shouldn't do that. I would need to see the rest of your code to find the problem.

 

Let me know if you need the source CSV files.

import easygui
import re
import csv

# Ask the user for the input files
easygui.msgbox("What is the source file?")
path = easygui.fileopenbox()
easygui.msgbox("What is the compare file?")
path2 = easygui.fileopenbox()

# Take the input files and put them in dictionaries
with open(path) as l:
    d = dict(filter(None, csv.reader(l)))
with open(path2) as m:
    g = dict(filter(None, csv.reader(m)))

# Set up the keys, values, and combined text files
matchkeys = open(r'C:\matchkeys.txt', 'w')
matchvalues = open(r'C:\matchvalues.txt', 'w')
matches = open(r'C:\combined.txt', 'w')

# Output path for the CSV
matchescsv = r'C:\matches.csv'

# Remove hyphens from the dictionary keys prior to comparison to ensure accurate results
d = {key.replace('-',''): val for key,val in d.items()}
g = {key.replace('-',''): val for key,val in g.items()}

# Perform the comparison
for key in d:
    if key in g:
        k = key
        v = d[key]
        resultlist = ('{}: {}'.format(d[key], key))
        # Write keys, values, and combined into text files
        matchkeys.writelines(k + '\n')
        matchvalues.writelines(v + '\n')
        matches.writelines(resultlist + '\n')

# Close the output files so the data is flushed before they are reopened for reading
matchkeys.close()
matchvalues.close()
matches.close()

# Display the combined list in a textbox.
# The combined text list is not delimited, hence the two separate text files
matches = open(r'C:\combined.txt', 'r')
easygui.textbox(msg='These are the matching chemicals',title='Matching Chemicals',text=matches,codebox=0)

# Combine the text files by creating lists and combining them into a CSV
valueslist = []
matchvalues = open(r'C:\matchvalues.txt', 'r')
valueslist = matchvalues.readlines()
valueslist[:] = [line.rstrip('\n') for line in valueslist]
matchvalues.close()

keyslist = []
matchkeys = open(r'C:\matchkeys.txt', 'r')
keyslist = matchkeys.readlines()
keyslist[:] = [line.rstrip('\n') for line in keyslist]
matchkeys.close()

# Zip the lists above into a CSV
with open(matchescsv, 'w', newline='') as j:  # the with block closes the file
    writer = csv.writer(j, delimiter=',')
    writer.writerows(zip(valueslist, keyslist))

easygui.msgbox("Your matches have been exported to a CSV in the following path " + matchescsv)

Sorry for the late response. Here's the source CSV and here's what I'm comparing it with: 

Your source file has a space at the end of every value in the first column. You can get rid of it by calling str.rstrip() when you load the value or just fixing the file.
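One way to apply that advice at load time is sketched below. It assumes a two-column CSV with the CAS number first and the chemical name second. The function name load_normalized and the in-memory sample are hypothetical stand-ins for the real files, which are not shown in the thread. It uses str.strip() rather than str.rstrip() so that leading spaces are handled as well.

```python
import csv
import io

def load_normalized(csv_file):
    """Build a dict from two-column CSV rows, cleaning each key as it is read."""
    result = {}
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        # Drop surrounding whitespace, then remove hyphens from the key
        key = row[0].strip().replace('-', '')
        result[key] = row[1].strip()
    return result

# Hypothetical sample mimicking a file with a trailing space after each key
sample = io.StringIO("12-589-69 ,Ammonia\n\n5784-15-9 ,Chlorine\n")
d = load_normalized(sample)
print(d)  # {'1258969': 'Ammonia', '5784159': 'Chlorine'}
```

With a real file, the same function could be called as load_normalized(f) inside a with open(path, newline='') as f: block.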
