
c++ making a list of paths from a directory tree

Hi.

Sorry if the title wasn't explicit enough.

I want to turn a directory tree that I know is like this:

main
  |-folder1
  |      |-file1.extension
  |      |-file2.extension
  |
  |-folder2
         |-file3.extension
         |-file4.extension

Into a text file with this:

dir		main/
dir		main/folder1/
fil	index	main/folder1/file1.extension
fil	index	main/folder1/file2.extension
dir		main/folder2/
fil	index	main/folder2/file3.extension
fil	index	main/folder2/file4.extension

 

The point of this is caching. I'm using inotify, but it doesn't work recursively, so I need to add each directory to its watch list (or each file; I haven't decided yet, but probably files, because then I can immediately know the file's index).

So I'd like all the directories and files (then again, maybe just the files) listed in a text file, so that when the server starts I can just loop over the list and load every entry into inotify. That way I know exactly when a file is changed, created or deleted, and can change, create or delete its ETag in the database.
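
For reference, the startup loop described here could look roughly like the sketch below. It assumes the paths have already been read from the text file into a vector; watchAll, the choice of events in the mask and the returned map are illustrative assumptions, not anything the thread settled on.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>
#include <sys/inotify.h>

// Hypothetical helper: registers every path in `paths` with the inotify
// instance `fd` and returns a watch-descriptor -> path map.
// Paths that cannot be watched (missing, no permission) are skipped.
std::map<int, std::string> watchAll(int fd, const std::vector<std::string> &paths)
{
    assert(fd >= 0);
    // Assumed event set: content changes, creation, deletion and renames.
    const std::uint32_t mask =
        IN_MODIFY | IN_CREATE | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO;
    std::map<int, std::string> watches;
    for (const std::string &path : paths) {
        const int wd = inotify_add_watch(fd, path.c_str(), mask);
        if (wd != -1) {
            watches[wd] = path; // remember which path this descriptor belongs to
        }
    }
    return watches;
}
```

The descriptor fd would come from inotify_init1(0). Every event later read from fd carries the watch descriptor, and the map turns that back into the path whose ETag needs updating.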

 

But I have no idea how to even get the list of files/directories inside a directory. I've read about getting the result from ls, but then how do I get inside that directory from a C++ program, and how do I know whether an entry is a file? (I could try ifstream.open, but I feel there must be a better way.) I need a direction here.

 

Thank you.

#include <cstdlib>
#include <iostream>
#include <string>
#include <dirent.h>
#include <sys/stat.h>

//Returns true only if the path exists and is a directory
bool directoryExists(const char *directoryToCheck)
{
    struct stat info;
    if (stat(directoryToCheck, &info) != 0) {
        return false;
    }
    //Compare the whole file-type field; a bare "& S_IFDIR" can also match other types such as block devices
    return (info.st_mode & S_IFMT) == S_IFDIR;
}

int main()
{
    const std::string directoryName{"/home/pinguinsan/"}; //Or whatever directory you want (keep the trailing slash)
    DIR *targetDirectory;
    struct dirent *entity;
    if ((targetDirectory = opendir(directoryName.c_str())) != nullptr) {
        //Print all files/directories
        while ((entity = readdir(targetDirectory)) != nullptr) {
            std::cout << entity->d_name;
            //stat() resolves a bare name against the working directory, so check the full path
            const std::string fullPath{directoryName + entity->d_name};
            if (directoryExists(fullPath.c_str())) {
                std::cout << "  <---Directory";
            } else {
                std::cout << "  <---File";
            }
            std::cout << std::endl;
        }
        closedir(targetDirectory);
    } else {
        //Directory open failure
        std::cout << "Could not open directory " << directoryName << std::endl;
        return EXIT_FAILURE;
    }
}

 

This should work on Linux, and on Windows as long as your toolchain provides dirent.h (MinGW does; Visual Studio doesn't ship it). I tested it on my home directory in Linux ("/home/pinguinsan/") and this is an example output:

pinguinsan@Z170A-Titanium-PC:~$ ./test 
.gradle  <---Directory
.xorg.config.nvidia.backup  <---File
.vimrc  <---File
.local  <---Directory
.esd_auth  <---File
.mplayer  <---Directory
.nvidia-settings-rc  <---File
.designer  <---Directory
.nv  <---Directory
.android  <---Directory
.gksu.lock  <---File
.adobe  <---Directory
.openshot  <---Directory
.xfce4-session.verbose-log.last  <---File
.bash_history  <---File
.mozilla  <---Directory
.IdeaIC2016.2  <---Directory
.steam  <---Directory
.quake2  <---Directory
GitHub  <---Directory
.vscode  <---Directory
.steampid  <---File
.thumbnails  <---Directory
.xfce4-session.verbose-log  <---File
..  <---Directory
.xsession-errors.old  <---File
.xsession-errors  <---File
test  <---File
.config  <---Directory
.emacs.d  <---Directory
.pulse-cookie  <---File
Games  <---Directory
HackerRank  <---Directory
.face  <---File
.xinitrc  <---File
.gstreamer-0.10  <---Directory
.extcalc  <---Directory
.bashrc  <---File
.gtk-bookmarks  <---File
.gnome2  <---Directory
.lesshst  <---File
.git-credential-cache  <---Directory
.Xauthority  <---File
Desktop  <---Directory
.xorg.config.nvidia  <---File
.ssh  <---Directory
.java  <---Directory
.epsxe  <---Directory
.viminfo  <---File
.AndroidStudioPreview2.0  <---Directory
.steampath  <---File
.jssc  <---Directory
.cache  <---Directory
.macromedia  <---Directory
.ssr  <---Directory
VirtualBox VMs  <---Directory
.bash_logout  <---File
.inputrc  <---File
.gnome  <---Directory
.arduino15  <---Directory
.wget-hsts  <---File
Google  <---Directory
.  <---Directory
Downloads  <---Directory
Documents  <---Directory
.gimp-2.8  <---Directory
.git-completion.bash  <---File
.mime.types  <---File
.ICEauthority  <---File
.doom3  <---Directory
.vim  <---Directory
.q3a  <---Directory
.oracle_jre_usage  <---Directory
.dmrc  <---File
.pki  <---Directory
.gnupg  <---Directory
.darkplaces  <---Directory
test.cpp  <---File
.bash_profile  <---File
Arduino  <---Directory
.gitconfig  <---File

 

From there, you could put every directory you find into a data structure (a stack or a queue, say) and repeat the above process on each one until there are none left to visit.
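
A minimal sketch of that idea, in case it helps. It assumes the root path ends with a '/', and listTree and isDirectory are just names picked for illustration. A vector is used as the stack of directories still to be read.

```cpp
#include <cassert>
#include <dirent.h>
#include <sys/stat.h>
#include <string>
#include <vector>

// Returns true only when `path` exists and is a directory.
static bool isDirectory(const std::string &path)
{
    struct stat info;
    return stat(path.c_str(), &info) == 0 && (info.st_mode & S_IFMT) == S_IFDIR;
}

// Hypothetical helper: collects every path below `root` (which must end in '/').
// Directories are returned with a trailing '/', files without one.
std::vector<std::string> listTree(const std::string &root)
{
    assert(!root.empty() && root.back() == '/');
    std::vector<std::string> found;
    std::vector<std::string> pending{root}; // directories still to be read
    while (!pending.empty()) {
        const std::string current = pending.back();
        pending.pop_back();
        DIR *dir = opendir(current.c_str());
        if (dir == nullptr) {
            continue; // unreadable directory: skip it
        }
        struct dirent *entity;
        while ((entity = readdir(dir)) != nullptr) {
            const std::string name = entity->d_name;
            if (name == "." || name == "..") {
                continue; // these point back at directories already queued
            }
            const std::string full = current + name;
            if (isDirectory(full)) {
                found.push_back(full + "/");
                pending.push_back(full + "/"); // visit it on a later iteration
            } else {
                found.push_back(full);
            }
        }
        closedir(dir);
    }
    return found;
}
```

One caveat: stat() follows symbolic links, so a link that points back up the tree would be walked again. Using lstat() instead treats links as plain entries and avoids that.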

1 hour ago, gabrielcarvfer said:

Try Boost C++ Filesystem utilities. They have pretty simple tutorials on how to use their stuff. https://theboostcpplibraries.com/boost.filesystem
 

I know Boost, but I'm looking for native stuff. Thank you regardless.

 

Thank you, this is godlike!

So all directories show with a dot? Not that I mind, but out of curiosity, why is that? I know ./ is the current directory, but this confuses me.

Also why put a cout after the if's end for an endl instead of putting that endl after each of those two couts inside the if? Also where can I read about optimization? I often find myself wondering which option is best for basic things. For example, I thought a switch/case was much more optimized than a chain of else-ifs, when in fact an if is way better if you know the odds of a certain option. I used this for the MIME types (Content-Type in HTTP headers).

Hey, glad you like it!  I'll try to answer each:


So all directories show with a dot?

Not quite. In Linux, a very common way to store configuration data is in a file or folder in the user's home directory (roughly equivalent to C:\Users\UserName\AppData\ under Windows). To keep things tidy, programs prefix their configuration files and folders with a dot ('.'), because names starting with a dot are hidden by default in most Linux file managers. Since I have a lot of stuff installed on this machine, I have a lot of those; some are files and some are directories, and ordinary directories such as Downloads have no dot at all. Every directory on Linux also contains two special entries, the single dot ('.') and the double dot ('..'), which refer to the directory itself and to its parent, respectively. So for example, if I'm in the folder "/home/pinguinsan", then '.' represents the directory /home/pinguinsan, and '..' represents the directory /home/.
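
The difference matters as soon as you walk a tree: hidden names like .vimrc are ordinary entries, while '.' and '..' have to be skipped or you end up visiting the same directories over and over. A small sketch of the check, where isDotOrDotDot is only an illustrative name:

```cpp
#include <cassert>
#include <cstring>

// Returns true for the two special entries that readdir() reports in every
// directory. Hidden names such as ".vimrc" are NOT matched.
bool isDotOrDotDot(const char *name)
{
    assert(name != nullptr);
    return std::strcmp(name, ".") == 0 || std::strcmp(name, "..") == 0;
}
```

Inside a readdir() loop you would call it on entity->d_name and continue to the next entry when it returns true.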

 

 

Also why put a cout after the if's end for an endl instead of putting that endl after each of those two couts inside the if?

This is just a preference; either would work. I like keeping any "common" operations (anything that is executed regardless of the if/else, i.e. the std::endl) outside of the if/else structure, so I don't have to type them twice.

 

Also where can I read about optimization?

I understand what you mean; I spent a lot of time thinking about this earlier in my programming career. The question could honestly spawn a gigantic debate, and it depends on what application domain you are targeting (it is very important for embedded systems, but micro-optimizations like this rarely matter on modern desktop systems). You can definitely take solace in knowing that most of the time, the compiler (ESPECIALLY gcc) will optimize things far more than you could by changing small bits around. But if you are unsure of something, I've found the best approach is to Google something like "c++ sorting best practice" and read the Stack Overflow threads. They will tell you a lot. To answer your question specifically, take a look at this Stack Overflow post.

 

Lastly, this actually sounds like a really fun problem and my Detroit Lions are on bye this week, so I'll try to whip up a recursive solution (printing all directories and files from a root directory) for learning purposes.

Thank you again, and good luck with the recursive one.

I've gotta do it as well, because this is a web server and it'll have quite a few sub-directories. Now that I think of it, it really seems like quite a problem. I figure the easy route would be, whenever you find a directory, to create a new thread running the same function to process that directory, and to pass it a string containing the directory's path relative to the directory you first told the program to search in, so everything can be listed relative to it, like string[root/firstdir/secondir/] + currently found file name. It'll most likely be bottlenecked by storage speed, so your CPU won't go to 100% every time you run it (not that it should be a thing to rely on, but hey).
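
To make the relative-path part concrete, here is a rough single-threaded sketch (plain recursion instead of one thread per directory) that writes lines in the dir/fil format from the first post. writeListing and saveListing are made-up names, and because the first post doesn't say what "index" is, a running counter starting at 0 is assumed.

```cpp
#include <cassert>
#include <dirent.h>
#include <sys/stat.h>
#include <fstream>
#include <ostream>
#include <string>

// Hypothetical helper: lists the directory root + relative and everything
// below it. `root` must end in '/', `relative` is the path shown in the output.
void writeListing(const std::string &root, const std::string &relative,
                  std::ostream &out, int &nextIndex)
{
    assert(!root.empty() && root.back() == '/');
    DIR *dir = opendir((root + relative).c_str());
    if (dir == nullptr) {
        return; // unreadable directory: nothing to list
    }
    out << "dir\t\t" << relative << '\n';
    struct dirent *entity;
    while ((entity = readdir(dir)) != nullptr) {
        const std::string name = entity->d_name;
        if (name == "." || name == "..") {
            continue;
        }
        const std::string rel = relative + name; // still relative to root
        struct stat info;
        if (stat((root + rel).c_str(), &info) != 0) {
            continue; // entry vanished or is not accessible
        }
        if ((info.st_mode & S_IFMT) == S_IFDIR) {
            writeListing(root, rel + "/", out, nextIndex);
        } else {
            // Assumed meaning of "index": a running counter.
            out << "fil\t" << nextIndex++ << '\t' << rel << '\n';
        }
    }
    closedir(dir);
}

// Writes the listing for root + top into the text file `outputFile`.
bool saveListing(const std::string &root, const std::string &top,
                 const std::string &outputFile)
{
    std::ofstream out(outputFile);
    if (!out) {
        return false;
    }
    int nextIndex = 0;
    writeListing(root, top, out, nextIndex);
    return static_cast<bool>(out);
}
```

With the tree from the first post stored under a hypothetical /srv/www/, saveListing("/srv/www/", "main/", "list.txt") would produce lines starting at main/. Entries come out in whatever order readdir() returns them, so a directory's files are not guaranteed to sit directly under its dir line when it also has sub-directories.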

 

I know how many "layers" of subfolders I'll have, so I'll just do something simple with a two-dimensional array.

 

If you'd be ok with it, I'm pretty curious to know how you'll do it.

You've most likely already figured this out but I only had time now so here it is:

#include <iostream>
#include <string>
#include <dirent.h>
#include <sys/stat.h>

//Returns true only if the path exists and is a directory
bool directoryExists(const char *directoryToCheck)
{
    struct stat info;
    if (stat(directoryToCheck, &info) != 0) {
        return false;
    }
    return (info.st_mode & S_IFMT) == S_IFDIR;
}


//Prints every file and directory below directoryName, which must end with a '/'
void FileLister(const char* directoryName)
{
    DIR *targetDirectory;
    struct dirent *entity;
    if ((targetDirectory = opendir(directoryName)) != nullptr) {
        while ((entity = readdir(targetDirectory)) != nullptr) {
            std::string name = entity->d_name;
            //Skip the "." and ".." entries, otherwise the recursion never ends
            if (name == "." || name == "..") {
                continue;
            }
            std::string fullPath = std::string(directoryName) + name;
            std::cout << fullPath << std::endl;
            if (directoryExists(fullPath.c_str())) {
                //Recurse with a trailing slash so the next level can append names directly
                std::string dire = fullPath + "/";
                FileLister(dire.c_str());
            }
        }
        closedir(targetDirectory);
    } else {
        std::cout << "Could not open directory " << directoryName << std::endl;
    }
}

Don't mind my variable names, I get bored.

 

Wouldn't have been possible without you so thank you again! :D

Hey, great job man! I actually dug into it for a while but hit a block, then some other things came up and I didn't get a chance to finish it. Excellent work.

Thx :D
