Python - Need to simplify some code
Hi, I appreciate the thread and I hope this helps you on your quest.
The format of "raw.txt" is reflective of the format provided on:
I then generated 100 unique records to test the script at some scale and to catch accidental errors with irregular data. Please note: if "raw.txt" begins with a "\n" (newline), the Python script will fail.
The "clean.csv" file is the requested cleaned CSV output. You can edit the label strings in "replacement_terms" in the script (e.g. replace "EmployeeID" with "Payroll Number") so they match your real headers.
replacement_terms = {
    "\n": "",
    "EmployeeID: ": "\n",
    "Issue at ": "",
    "Staff Member Name ": "",
    "First Name ": "",
    "Preferred Name ": "",
    "Middle Name ": "",
    "Surname ": "",
    "Start Date ": "",
    "Contract End ": "",
    "Country ": "",
}
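To make the idea behind the dictionary a bit more concrete, here is a small sketch of what the replacements do to individual lines. The sample lines and the helper name clean_line are made up for illustration, and I'm assuming one labelled field per line; your real raw.txt may look different.

```python
# Shortened copy of the dictionary; order matters because "\n" is stripped first.
replacement_terms = {
    "\n": "",
    "EmployeeID: ": "\n",
    "First Name ": "",
    "Surname ": "",
}

def clean_line(line, terms):
    """Apply every replacement in dictionary order and return the result."""
    for key, value in terms.items():
        line = line.replace(key, value)
    return line

# Hypothetical sample lines, not taken from the real raw.txt.
sample = ["EmployeeID: 1001\n", "First Name Ada\n", "Surname Lovelace\n"]
cleaned = [clean_line(line, replacement_terms) for line in sample]
print(cleaned)  # ['\n1001', 'Ada', 'Lovelace']
```

The EmployeeID line is the only one that ends up containing a "\n" after cleaning, which is what lets the script spot the start of a new employee.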
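The main.py file is the actual script that makes these edits. Your primary focus should be on:
# f and employees_list are set up earlier in main.py
employee_list = []
for line in f:
    # Strip the labels; "EmployeeID: " becomes a "\n" marker
    for key, value in replacement_terms.items():
        line = line.replace(key, value)
    # A leftover "\n" marks the first line of a new employee
    if "\n" in line and line != "\n":
        line = line.replace("\n", "")
        # Store the previous employee before starting the next one
        if len(employee_list) > 0:
            employees_list += [employee_list.copy()]
            employee_list.clear()
    employee_list.append(line)
f.close()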
This is the actual code that does the edits (with the above-mentioned replacement_terms on line 2). I think the script should be fairly self-explanatory. I'm sure it's far from optimal, but it works reasonably well and does not use any imports.
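If it helps, here is a self-contained sketch of the same grouping idea wrapped in a function, still without imports. The name parse_records and the sample lines are hypothetical, and it differs from main.py in two ways: it skips lines that are empty after cleaning, and it stores the last employee after the loop ends. Treat it as one possible variant under those assumptions, not a drop-in replacement.

```python
def parse_records(lines, terms):
    """Group cleaned lines into one list of values per employee."""
    employees = []
    current = []
    for line in lines:
        for key, value in terms.items():
            line = line.replace(key, value)
        if line.startswith("\n"):
            # The "\n" marker means an EmployeeID line: start a new record.
            line = line[1:]
            if current:
                employees.append(current)
                current = []
        if line:
            # Lines that are empty after cleaning are skipped.
            current.append(line)
    if current:
        # Store the final employee once the input is exhausted.
        employees.append(current)
    return employees

# Hypothetical input, assuming one labelled field per line.
terms = {"\n": "", "EmployeeID: ": "\n", "First Name ": ""}
lines = ["EmployeeID: 1\n", "First Name Ada\n", "\n",
         "EmployeeID: 2\n", "First Name Bob\n"]
employees = parse_records(lines, terms)
csv_rows = [",".join(row) for row in employees]
print(csv_rows)  # ['1,Ada', '2,Bob']
```

Joining with a plain comma keeps it import-free, but it will break if a value itself contains a comma; the standard csv module would handle that case.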
Another thing to note: the script overwrites the old data in clean.csv on every run. If you would rather append information as you go (instead of destroying it), you can remove these lines:
# Let's write the CSV header
f = open("clean.csv", "w")
for key in replacement_terms.keys():
    if key != "\n":
        f.write(key + ",")
f.write("\n")
f.close()
The "w" means "write", which overwrites the existing .csv. Removing these lines leaves only the append-mode ("a") open that writes the CSV body, so the header row is no longer written either.
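A quick way to see the difference between the two modes is the sketch below. It writes to a throwaway temporary folder so nothing real gets clobbered; the header and row contents are made up for illustration.

```python
import os
import tempfile

# Work in a temporary folder so a real clean.csv is never touched.
path = os.path.join(tempfile.mkdtemp(), "clean.csv")

with open(path, "w") as f:          # "w" creates or truncates the file
    f.write("EmployeeID,Surname\n")

with open(path, "a") as f:          # "a" keeps what is there and adds to the end
    f.write("1001,Lovelace\n")

with open(path) as f:
    after_append = f.read()

with open(path, "w") as f:          # opening with "w" again wipes the earlier rows
    f.write("EmployeeID,Surname\n")

with open(path) as f:
    after_rewrite = f.read()

print(repr(after_append))   # 'EmployeeID,Surname\n1001,Lovelace\n'
print(repr(after_rewrite))  # 'EmployeeID,Surname\n'
```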
I threw in another script, "generate_test_data.py", to generate the example raw.txt data. It does include an import, since it falls outside the original request, but it's useful for showing how the data was generated.
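For anyone who wants to roll their own test data, a generator along these lines would do. This is not the actual generate_test_data.py; the function name, the name pools, and the one-field-per-line layout are all assumptions, with only the labels borrowed from replacement_terms.

```python
import random

# Made-up value pools, purely for illustration.
FIRST_NAMES = ["Ada", "Grace", "Alan"]
SURNAMES = ["Lovelace", "Hopper", "Turing"]
COUNTRIES = ["UK", "USA", "Canada"]

def generate_records(count, seed=0):
    """Return raw.txt-style lines for `count` employees with unique IDs."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    ids = rng.sample(range(1000, 10000), count)  # sample() never repeats an ID
    lines = []
    for employee_id in ids:
        lines.append(f"EmployeeID: {employee_id}\n")
        lines.append(f"First Name {rng.choice(FIRST_NAMES)}\n")
        lines.append(f"Surname {rng.choice(SURNAMES)}\n")
        lines.append(f"Country {rng.choice(COUNTRIES)}\n")
    return lines

records = generate_records(100)
print(len(records))  # 400
```

To save the result you could pass records to writelines() on a file opened with "w". Because the first line is an EmployeeID line, the output does not start with a newline.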