Python OTP encryption program help

SgtBot
Solved by colonel_mortis

I have been practicing Python by writing an encryption program that uses a one-time pad to encrypt the contents of a text file. The program reads the contents of the text file, splits it into a list of individual characters, converts those into their ASCII values with the ord() function, generates a key of the same length as the original data (also as a list of ASCII integers), and XORs the two together. The program outputs both the key and the encrypted or decrypted message into text files, and since it is OTP encryption, the same function is used to decrypt the ciphertext as well. Here is the code:

 

import random

def encrypt(data, key):
    dataPrep = []
    for i in data:
        dataPrep.append(ord(i))

    enc = []
    for i in range(len(dataPrep)):
        enc.append(chr(dataPrep[i] ^ key[i]))

    return ''.join(enc)
    
def generateKey(data):
    keyPrompt = input("Generate new key?(y/n):\n")
    key = []
    if keyPrompt == 'y':
        for i in range(len(data)):
            key.append(random.randint(1, 126))
        keyStr = ''.join(map(chr, key))
        k = open('key.txt', 'w')
        k.write(keyStr)
        k.close()
    elif keyPrompt == 'n':
        keyFile = input("Enter the filename of the key file (include '.txt')\n")
        k = open(keyFile, 'r')
        key = k.read()
        k.close()
        key = list(key)
        key = list(map(ord, key))
    return key

textFile = input("What file would you like to encrypt/decrypt (include '.txt')\n")
d = open(textFile, 'r')
data = d.read()
d.close()

key = generateKey(data)

dataPrompt = input("Are you encrypting or decrypting this file? ('e' for encrypt, 'd' for decrypt)\n")
if dataPrompt == 'e':
    enc = encrypt(data, key)
    encryptName = input("What filename would you like to store the encrypted data in? (include '.txt')\n")
    e = open(encryptName, 'w')
    e.write(str(enc))
    e.close()
elif dataPrompt == 'd':
    enc = encrypt(data, key)
    encryptName = input("What filename would you like to store the decrypted data in? (include '.txt')\n")
    e = open(encryptName, 'w')
    e.write(str(enc))
    e.close()

The code has worked flawlessly so far with small text files, such as ones that read "apple potato grape pear" or "hello world!". But when I tried to encrypt a larger file, I ran into problems that I can't figure out. I tried to encrypt the poem Ode by Arthur O'Shaughnessy (which is 786 characters long when checked with len()). When I tried to encrypt it the first time, I got the error message:

Traceback (most recent call last):
  File "C:\Users\SgtBot\Desktop\Encrypt Program\encrypt.py", line 45, in <module>
    e.write(str(enc))
  File "C:\Users\SgtBot\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2055' in position 313: character maps to <undefined>

The key seemed to generate fine (still 786 characters):

'AdBit-^"[Mp,fNq[yh^WDXWI-U)a4XX\Z F u^JhcF
m!CY6e?S/oroCuiMae
>P{i'"b(|$xTP6ZTFKFlo`j)b+AYMJK`Qk.-h
#rmM$?1\,@ 2-SJTLMOy$fMV(K}e-%guhm1lSvt>zaET4w2uuc_X?v^]N :6%QojeR#wI>>/8?4$+?,
sao.OV6S
^!-n/YuqP
 _oy=}&lb5HWqQuOn]MyVw(.j%{fIi><^Ie@;l7Pvc&B:jQLB#<{25`"H[:X6~Y[b]1?\
&G,Sm8bC aBz3dl41Pp)Qf}'SDf	29KN<A=;uA^t4~)TY-~MX11a[5/NFACWoegp#l+}Z
7@Q*@@+^t:$Dro%nSY
E.aZZwg_/RB+KJpt$q=Iq{OPJESk{ycn<JPybaWC#Ik%TTvZl
v#g@F:HyQ"4Ct*kdG4Kbj>0cTV$@!^Wgk9Hh'&< \"qDV#wR8}&O
vO
4Z%eJG@AFA}{FD%xkl_*
 Of2p5S 5<pR,p cpqbr .`)avn3Hiw.d+$
#,BQO~AP[Dq
<F0B9-dENJ<)	`~tOZQ}>^W!hr)~%E{rrQe#sxZT-4H~_vs!d$J/:E=oi"I04=ji@b@SL"PF[3fRdC<"6v

But the text file supposed to contain the encrypted text was overwritten with nothing, just a blank text file.

 

I'm not really sure what went wrong here. I tried changing the key range from 1-126 to 32-126 so that it would not include weird ASCII characters, but I still got the same issue.

 

Any advice would be greatly appreciated!

When you encrypt the contents, you might end up with bytes that no longer correspond to valid unicode characters, so the str function can't deal with it. Instead of using the str function on line 45, try the bytes function. You will also need to open the file as binary, which means everywhere where you have open(filename, 'r') or open(filename, 'w'), you need to replace it with open(filename, 'rb') or open(filename, 'wb'). I've not tested that this fixes your problems, but I think that should help.
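
A small sketch of what is probably happening, under two assumptions that are guesses rather than facts from the post: that the poem contains a typographic apostrophe (U+2019), and that the key value at that position happened to be 76. Those values reproduce the '\u2055' named in the traceback. On Windows, open(filename, 'w') without an explicit encoding typically falls back to cp1252, which has no mapping for the XORed character:

```python
# Assumed values: a curly apostrophe in the plaintext and a key value of 76.
plain_char = '\u2019'
key_value = 76

# Same operation as in encrypt(): XOR the code point, then convert back with chr().
cipher_char = chr(ord(plain_char) ^ key_value)
print(hex(ord(cipher_char)))

# cp1252 cannot represent the result, which is what e.write() trips over.
try:
    cipher_char.encode('cp1252')
    encodable = True
except UnicodeEncodeError:
    encodable = False

# UTF-8 can represent any code point that chr() returns here.
utf8_bytes = cipher_char.encode('utf-8')
```

Short test strings made only of plain ASCII never leave the 0-127 range after XOR with a key value below 128, which would explain why the small files worked.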

HTTP/2 203

Well I tried the line 45 edit with the bytes function and got the error:

Traceback (most recent call last):
  File "C:\Users\SgtBot\Desktop\Encrypt Program\encrypt.py", line 45, in <module>
    e.write(bytes(enc))
TypeError: string argument without an encoding

And without that but with the reading the files as binary I got the error:

Traceback (most recent call last):
  File "C:\Users\SgtBot\Desktop\Encrypt Program\encrypt.py", line 42, in <module>
    enc = encrypt(data, key)
  File "C:\Users\SgtBot\Desktop\Encrypt Program\encrypt.py", line 6, in encrypt
    dataPrep.append(ord(i))
TypeError: ord() expected string of length 1, but int found

I haven't been converting them to bytes because XORing two ints together seems to give the same result as XORing their binary counterparts. I've been using the ord() function to convert individual letters and characters into their decimal ASCII values, and then XORing those with a list of integers ranging from 1 to 126 (because ASCII 127 is the delete character, and that causes some problems with the encryption). Do you think maybe my range of numbers is just too high?
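
Both tracebacks come down to the difference between str and bytes in Python 3. A minimal illustration with made-up sample values:

```python
data = b"hi"

# Iterating over (or indexing) a bytes object yields ints directly,
# so calling ord() on each item fails with "ord() expected string of length 1".
values = list(data)

# bytes() on a str needs an explicit encoding; bytes("hi") alone raises
# "TypeError: string argument without an encoding".
encoded = bytes("hi", "utf-8")

# A list of ints in range(256) converts to bytes without any encoding.
rebuilt = bytes([values[0] ^ 1, values[1] ^ 1])
```

So once the files are opened in binary mode, the ord() and chr() calls are no longer needed: the XOR can work on the ints that the bytes object already provides.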

Ohh, I figured it out. The problem is in how the program reads and writes files. I wasn't specifying an encoding when opening the files, so characters outside the normal ASCII range caused problems on read or write. I fixed the code by adding ", encoding='utf-8'" to each open() call, whether for reading or writing. Thanks for the insight though, you definitely led me in the right direction! Here is the updated code:

import random

def encrypt(data, key):
    dataPrep = []
    for i in data:
        dataPrep.append(ord(i))

    enc = []
    for i in range(len(dataPrep)):
        enc.append(chr(dataPrep[i] ^ key[i]))

    return ''.join(enc)
    
def generateKey(data):
    keyPrompt = input("Generate new key?(y/n):\n")
    key = []
    if keyPrompt == 'y':
        for i in range(len(data)):
            key.append(random.randint(1, 126))
        keyStr = ''.join(map(chr, key))
        k = open('key.txt', 'w', encoding='utf-8')
        k.write(keyStr)
        k.close()
    elif keyPrompt == 'n':
        keyFile = input("Enter the filename of the key file (include '.txt')\n")
        k = open(keyFile, 'r', encoding='utf-8')
        key = k.read()
        k.close()
        key = list(key)
        key = list(map(ord, key))
    return key

textFile = input("What file would you like to encrypt/decrypt (include '.txt')\n")
d = open(textFile, 'r', encoding='utf-8')
data = d.read()
d.close()

key = generateKey(data)

dataPrompt = input("Are you encrypting or decrypting this file? ('e' for encrypt, 'd' for decrypt)\n")
if dataPrompt == 'e':
    enc = encrypt(data, key)
    encryptName = input("What filename would you like to store the encrypted data in? (include '.txt')\n")
    e = open(encryptName, 'w', encoding='utf-8')
    e.write(str(enc))
    e.close()
elif dataPrompt == 'd':
    enc = encrypt(data, key)
    encryptName = input("What filename would you like to store the decrypted data in? (include '.txt')\n")
    e = open(encryptName, 'w', encoding='utf-8')
    e.write(str(enc))
    e.close()

 

Though it probably doesn't matter for your purposes, because you're limiting the range of the bytes in your key to 1-126, I believe your encrypted files would be vulnerable to cryptanalysis, because you don't xor the data with 0 or 127-255, so there is a slight bias in the data (and, if you encode anything other than plain ascii, it will also not modify the upper bit, because only the bytes 128-255 can have that bit set).

If you did want to fix that, you would need to switch to bytes everywhere. My previous answer covered some of the places where you'd need to change that, but you would also need to remove the call to chr in encrypt, remove the call to ord within the map when reading the key, and change how keyStr is built when generating a key.

 

Your solution is fine if you're only generating characters 1-126 though. I believe 127/DEL ought to work correctly as long as you don't print it to the screen, and the same goes for 0/NUL. Adding those would make it equivalent to a one-time pad (subject to the randomness that Python gives you) for plain ASCII text, which covers everything you want to type except emoji and accents.
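
If you do go the bytes route, the key can cover the full 0-255 range. The sketch below swaps in the standard library's secrets module in place of random (that is my substitution, not something from the code above), and generate_key is a hypothetical helper name:

```python
import secrets

def generate_key(length):
    # token_bytes returns `length` random bytes, each anywhere in 0-255
    return secrets.token_bytes(length)

# 786 is the length of the poem mentioned in the first post
key = generate_key(786)
```

The result is already a bytes object, so it can be written straight to a file opened with 'wb' without going through chr.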

I might try that next then after I'm satisfied with this one :D Originally when I was writing this, I did try to read the contents of the text file as binary. However, when the data is in the bytes format (b'string'), I was not able to use the '^' operator for XOR like I would be able to with an integer. Is there a way to XOR two bytes objects together?

You will still need to xor each byte individually, then combine them. This code works for me:

def encrypt(data, key): # data and key are both bytes objects (b"string")
    enc = []
    for i in range(len(data)):
        enc.append(data[i] ^ key[i]) # xor each byte individually

    return bytes(enc) # Convert from an array back to a bytes object
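
For what it's worth, the same loop can also be written with zip, and running it twice with the same key gives the original data back. This is a sketch with made-up sample values; xor_bytes is just an illustrative name:

```python
def xor_bytes(data, key):
    # zip stops at the shorter input, so insist on a key that covers all the data
    if len(key) < len(data):
        raise ValueError("key is shorter than data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"hello world!"
key = bytes([7, 200, 0, 255, 19, 42, 128, 3, 99, 250, 1, 77])  # arbitrary example key

ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)  # XOR with the same key undoes the encryption
```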

 

Actually, after messing around with the code for a bit, I found that your first suggestion, reading and writing the files in binary mode along the lines of the code below, worked much better:

open('file', 'rb')

I apologize for not looking more in depth at your first method. It definitely helped a lot, and it also allowed me to encrypt not only text files, but video and image files as well. Here is the updated code:

import random

def encrypt(data, key):
    # data and key are bytes objects, so indexing them already gives ints
    enc = []
    for i in range(len(data)):
        # wrap around using the real key length so every key byte is used
        enc.append(data[i] ^ key[i % len(key)])

    return bytes(enc)
    
def generateKey(data):
    keyPrompt = input("Generate new key?(y/n):\n")
    key = []

    while keyPrompt not in ('y', 'n'):
        print('Invalid input.')
        keyPrompt = input("Generate new key?(y/n):\n")
    
    if keyPrompt == 'y':
        for i in range(1024):
            key.append(random.randint(1, 255))
        keyStr = bytes(key)
        k = open('key.txt', 'wb')
        k.write(keyStr)
        k.close()
    elif keyPrompt == 'n':
        while True:
            try:
                keyFile = input("Enter the filename of the key file (include file type)\n")
                k = open(keyFile, 'rb')
                key = k.read()
                k.close()
            except FileNotFoundError:
                print("That file does not exist.")
                pass
            else:
                break
        
    return key

while True:
    try:
        textFile = input("What file would you like to encrypt/decrypt (include file type)\n")
        d = open(textFile, 'rb')
        data = d.read()
        d.close()
    except FileNotFoundError:
        print("That file does not exist.")
        pass
    else:
        break
    
key = generateKey(data)

enc = encrypt(data, key)
encryptName = input("What filename would you like to store the encrypted/decrypted data in? (include file type)\n")
e = open(encryptName, 'wb')
e.write(enc)
e.close()

The reason that it did not work before is that I was still converting the files into text with chr() and trying to write them to a file that I opened with open(file, 'wb'). Thanks for all the help! I also changed it to generate a 1kb key instead of a key of the same length as the file just because I was testing it on encrypting video and image files as well.
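
A short sketch of what the wrap-around key means in practice, using made-up data and a 4-byte key standing in for the 1 KB one. xor_repeat is an illustrative helper, not the code above verbatim:

```python
def xor_repeat(data, key):
    # XOR data with a key that wraps around when the data is longer than the key
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = bytes([23, 99, 200, 5])  # tiny example key so the wrap-around is visible
block_a = b"AAAA"
block_b = b"BBBB"
ciphertext = xor_repeat(block_a + block_b, key)

# Both ciphertext blocks were XORed with the same key bytes, so XORing the
# blocks together cancels the key and leaves the XOR of the two plaintexts.
leak = bytes(x ^ y for x, y in zip(ciphertext[:4], ciphertext[4:]))
expected = bytes(x ^ y for x, y in zip(block_a, block_b))
```

Here leak equals expected without the key being known, so a key shorter than the file is a repeating-key XOR and not a one-time pad in the strict sense. Generating len(data) key bytes, as the first version did, avoids the reuse.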
