I am in the middle of a work project that requires me to gather a list of files and identify their parent directories, file names, and other information. I then use each path and an Amazon module to upload the files to an S3 bucket. I have used the os module to deal with files in the past, so that is what I reached for this time.
After perusing an article on the Real Python site about pathlib, I wanted to try it out to see if it would produce the information I was getting from os. Here’s what I came up with:
# import pathlib instead of using os
import pathlib

# creates a Windows path object from a path on my Windows machine
# (named root rather than dir, so the built-in dir() is not shadowed)
root = pathlib.Path("C:\\Users\\miles\\OneDrive\\Desktop\\JavaScript")
# rglob("*") returns a generator that walks the tree recursively;
# list() collects everything it yields into a list
result = list(root.rglob("*"))
# iterates through a list of path objects
for r in result:
    # .stat() provides information about the file, such as time last
    # accessed and the total size of the file
    print(r.stat())
    # this gives True or False depending on whether r is a file
    print(r.is_file())
    # same as is_file() but is True if r is a directory
    print(r.is_dir())
    # r is a path object; print() shows its string representation
    print(r)
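To pick out individual pieces of information rather than printing everything, the path object and its .stat() result can be read attribute by attribute. This is a minimal sketch; describe_entry is a hypothetical helper name and the choice of fields is my own assumption about what might be useful.

```python
from pathlib import Path


def describe_entry(path):
    """Return a few commonly needed facts about one file or directory.

    Hypothetical helper for illustration; the selected fields are an assumption.
    """
    path = Path(path)
    info = path.stat()  # one stat() call, reused for size and timestamp
    return {
        "parent": str(path.parent),   # directory that contains the entry
        "name": path.name,            # final path component, e.g. "notes.txt"
        "suffix": path.suffix,        # extension including the dot, "" if none
        "is_file": path.is_file(),
        "size_bytes": info.st_size,
        "modified": info.st_mtime,    # seconds since the epoch
    }
```

Calling describe_entry(r) inside the loop above would give one dictionary per entry instead of the raw stat output.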
Here is a snippet of my original code, which uses os.walk:
import os

# function to create the directory list
def create_dir_list():
    # provide directory as a string, just as above
    rootdir = 'C:\\Users\\Miles\\Desktop\\documentportal\\Builders'
    dirss = []
    # iterate through the generator object provided by os.walk
    # os.walk yields a (dirpath, dirnames, filenames) tuple for each directory
    for subdir, dirs, files in os.walk(rootdir):
        for name in dirs:
            # os.walk only provides us the subdirectory name, so we have to join
            # it onto its parent path (parent first, then the name)
            path = os.path.join(subdir, name)
            # we also have to do some string manipulation on the directory name
            path = path.replace("C:\\Users\\Miles\\Desktop\\documentportal\\", "").replace("\\", "/")
            # assumed step: the snippet ends here, so collecting the result is a guess
            dirss.append(path)
    return dirss
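For comparison, here is roughly how the same directory list might look with pathlib. This is a sketch rather than a drop-in replacement: list_subdirectories and its two parameters are hypothetical names, and I am assuming the goal is a forward-slash path relative to some base folder.

```python
from pathlib import Path


def list_subdirectories(rootdir, base):
    """List every directory under rootdir as a forward-slash path relative to base.

    Hypothetical helper; base must be rootdir itself or one of its parents,
    otherwise relative_to() raises ValueError.
    """
    root = Path(rootdir)
    base = Path(base)
    return [
        # relative_to() strips the leading part of the path and
        # as_posix() renders the rest with "/" separators
        p.relative_to(base).as_posix()
        for p in root.rglob("*")
        if p.is_dir()
    ]
```

The join and both string replacements from the os.walk version are covered by relative_to() and as_posix(), so no manual string manipulation is needed.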
Along with the disadvantages of os.walk noted in the comments above, I also have to identify folders versus files and label them myself. The .is_file() and .is_dir() methods would have been very useful for this. pathlib also has a long list of methods that give information about files and directories, which may be useful in another context.
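As a rough illustration of the labeling step, .is_dir() can tag each entry while walking the tree. The function name label_entries and the "folder"/"file" labels are assumptions of mine, not what my project actually uses.

```python
from pathlib import Path


def label_entries(rootdir):
    """Return (label, relative_path) pairs for everything under rootdir.

    Hypothetical helper; anything that is not a directory is labeled "file".
    """
    root = Path(rootdir)
    entries = []
    for p in root.rglob("*"):
        label = "folder" if p.is_dir() else "file"
        # path relative to rootdir, written with "/" separators
        entries.append((label, p.relative_to(root).as_posix()))
    return entries
```

The order of the results follows rglob, so sort the list if a stable order matters.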
If I could do it all over again, I would use pathlib!