GlorifiedPaperShuffler

Linux Script to Flatten Directory Structure


Posted · Original PosterOP

First, I'm not sure if this belongs here or in the Linux section, so mods, please move it as you deem appropriate.

 

I'm a total newbie to writing Linux scripts. But I have a mega server I just built, so I'm in the mood to play a little and learn something new along the way.

 

I want to write a script to clean up a particularly messy Google Drive directory that I have mounted in a Mint XFCE VM. It's the result of years of "I'll just put this here for now".

 

The folders are structured something like this:

gdrive:

-lvl1fld1

--lvl2fld1sfl1

--lvl2fld1sfl2

---lvl3fld1sfl2ssfl1

-lvl1fld2

-lvl1fld3

--lvl2fld3sfl1

-lvl1fld4

 

The folder structure can be many levels deep, and there are literally 16000 files in there.

I don't want all 16000, just certain files (e.g., I have no need to keep random .txt files or thumb.db files since this is Linux).

 

So I want to write a script that will flatten the structure for me, while keeping only the relevant files. Thus, my current thinking is that the script needs to perform the following:

  1. Recursively scan the folders to obtain a list of files that have the specified extensions (extension whitelist).
  2. Move all files to another main folder, into the appropriate folders named according to the original lvl1 name.
  3. Leave the rest behind in gdrive as garbage for me to manually confirm that they are no longer needed.
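
Picking the whitelist for step 1 is easier with a tally of what is actually in there. A minimal sketch that only reads and never moves anything; count_extensions is a made-up name, and the path in the usage comment is a placeholder:

```python
import os
from collections import Counter


def count_extensions(source):
    """Count files under source by lowercased extension, without changing anything."""
    counts = Counter()
    for root, dirs, files in os.walk(source):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            # Files with no extension are grouped under a readable label.
            counts[ext or "(none)"] += 1
    return counts


# Usage (placeholder path):
# for ext, n in count_extensions("/path/to/gdrive").most_common():
#     print(ext, n)
```

Anything that shows up in the tally and is worth keeping goes on the whitelist; the rest stays behind as garbage.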

As a result, I should end up with 2 folders. One is the gdrive folder above with the garbage left behind, and another is a cleandrive folder with the following structure:

cleandrive:

-lvl1fld1

-lvl1fld2

-lvl1fld3

-lvl1fld4

 

Where should I start?

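#! /bin/python3
# If /bin/python3 does not exist on your system, change the line above
# to: #!/usr/bin/env python3

import os
import re
import shutil
import sys

# Expect exactly three arguments; print the usage and stop otherwise.
if len(sys.argv) != 4:
    print("Usage: {} <source dir> <destination dir> <regular expression>".format(sys.argv[0]))
    sys.exit(1)

source, destination, pattern = sys.argv[1:4]

# Walk the whole source tree and move every file whose full path matches.
for root, dirs, files in os.walk(source):
    for file in files:
        path = os.path.join(root, file)
        if re.match(pattern, path):
            # shutil.move also works across filesystems, unlike os.rename.
            shutil.move(path, os.path.join(destination, file))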

I just wrote the Python script above; it should work.

 

Create a file called file_walker.

Copy and paste the entire code above into it.

In your terminal, run: chmod +x file_walker

Then run: ./file_walker <source directory> <destination directory> <regular expression for matching>

 

E.g., if I want to move all the files that end with .txt in my home Downloads directory to a subdirectory named test on my Fedora Linux machine, I would run:

./file_walker /home/xgao/Downloads /home/xgao/Downloads/test '^.*\.txt$'
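
A note on the pattern, in case it helps: re.match anchors at the start of the string, and the script applies it to the full path, hence the leading .* here. It is also worth wrapping the pattern in single quotes on the command line, because bash strips the backslash from an unquoted \. before Python ever sees it. A small sketch of the difference (the file names are made up for illustration):

```python
import re

strict = r"^.*\.txt$"  # what the script receives when the pattern is single-quoted
loose = "^.*.txt$"     # what it receives once bash has removed the backslash


def matches(pattern, path):
    """Return True if the pattern matches from the start of the path."""
    return re.match(pattern, path) is not None


print(matches(strict, "/home/xgao/Downloads/notes.txt"))  # True
print(matches(strict, "/home/xgao/Downloads/notestxt"))   # False
print(matches(loose, "/home/xgao/Downloads/notestxt"))    # True: "." matches any character
```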

Edited by wasab
grammer

Sudo make me a sandwich 

 

Check out my guide on creating your own private cloud storage

 

Posted · Original PosterOP
Maybe I'm a dummy, but I'm getting:

bash: ./file_walker: /bin/python3: bad interpreter: No such file or directory

Am I doing something wrong here?
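
For anyone who hits the same message: it generally means there is no file at the path named on the script's first line (here /bin/python3). A minimal sketch for finding a first line that should work, assuming python3 is installed somewhere on your PATH; suggest_shebang is a made-up helper name, not part of any library:

```python
import os
import shutil


def suggest_shebang(shebang_path="/bin/python3", name="python3"):
    """Return a first line for the script that points at an interpreter."""
    if os.path.exists(shebang_path):
        return "#!" + shebang_path  # the path in the script is fine as written
    found = shutil.which(name)  # look the interpreter up on PATH
    if found:
        return "#!" + found
    # Nothing found: fall back to letting env search PATH at run time.
    return "#!/usr/bin/env " + name


print(suggest_shebang())
```

Running the script as python3 file_walker followed by the arguments sidesteps the first line altogether.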

 

Never mind. I got the script to execute, but it isn't exactly what I'm trying to achieve...

 

It basically goes through the directory, and just dumps the files into a fixed directory.

I need it to recreate the folder structure on the other end.

There are 160+ folders in the gdrive, hence it'd be nice to not have to execute the script manually 160+ times.

 

I need a script that recreates those 160+ top-level folders in another folder and flattens everything below each one into it, i.e., it flattens the structure, but not completely. The end result would be:

cleandrive:

-lvl1fld1

-lvl1fld2

-lvl1fld3

-lvl1fld4

 

So on and so forth until it reaches lvl1fld160


It will go through the directory and all of its subdirectories. If your drive is 160 levels deep, it will go down into all 160 levels. You don't need to run it 160 times.
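
To see that for yourself without touching the real drive, here is a small sketch that builds a throwaway nested tree (folder names borrowed from the first post, file names made up) and lists everything os.walk finds:

```python
import os
import tempfile


def list_files(top):
    """Return every file below top, as paths relative to top."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            found.append(os.path.relpath(os.path.join(root, name), top))
    return sorted(found)


with tempfile.TemporaryDirectory() as top:
    nested = os.path.join(top, "lvl1fld1", "lvl2fld1sfl2", "lvl3fld1sfl2ssfl1")
    os.makedirs(nested)
    # One file near the top and one three levels down.
    open(os.path.join(top, "lvl1fld1", "a.txt"), "w").close()
    open(os.path.join(nested, "b.txt"), "w").close()
    for path in list_files(top):
        print(path)
```

Both files are listed from a single call, however deep they sit.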

 

What you need to do is manually create the folder you want to dump these files into. If you want to sort four kinds of files into four folders, run the script four times.

 

Edit: oh, I see what you mean. I will need to know exactly what the naming scheme of the folders is and how you sort the files into each folder. You can modify that script if you think you have a rough idea.
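
Going by the layout in the first post (everything under gdrive/lvl1fldN ends up directly in cleandrive/lvl1fldN), one way to modify the script might look like the sketch below. Assumptions on my part: the whitelist is a set of extensions rather than a regular expression, files sitting directly in the source folder are left alone, and a file whose name already exists in the target folder is skipped and left behind for manual review. flatten_by_top_level and the paths in the usage line are made-up names.

```python
import os
import shutil


def flatten_by_top_level(source, destination, extensions):
    """Move whitelisted files into destination/<top-level folder name>/."""
    allowed = {ext.lower() for ext in extensions}
    moved = []
    for root, dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        if rel == ".":
            continue  # files directly in source have no lvl1 folder; leave them
        # The first path component below source is the lvl1 folder name.
        top_level = rel.split(os.sep)[0]
        target_dir = os.path.join(destination, top_level)
        for name in files:
            if os.path.splitext(name)[1].lower() not in allowed:
                continue  # not on the whitelist: stays behind as garbage
            os.makedirs(target_dir, exist_ok=True)
            target = os.path.join(target_dir, name)
            if os.path.exists(target):
                continue  # name collision: leave the file for manual review
            shutil.move(os.path.join(root, name), target)
            moved.append(target)
    return moved


# Usage (placeholder paths and extensions):
# flatten_by_top_level("/path/to/gdrive", "/path/to/cleandrive", {".jpg", ".pdf"})
```

Keep cleandrive outside gdrive so the walk does not wander into it. A target folder is only created once at least one file lands in it, so a lvl1 folder with nothing worth keeping will not show up in cleandrive.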

