How do I list all files of a directory?

Posted by: admin, October 29, 2017

Questions:

How can I list all files of a directory in Python and add them to a list?

Answers:

os.listdir() will get you everything that’s in a directory – files and directories.

If you want just files, you could either filter this down using os.path:

from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

Or you could use os.walk(), which yields two lists for each directory it visits, splitting the entries into files and dirs for you. If you only want the top directory, you can break the first time it yields:

from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

Lastly, as that example shows, to add one list to another you can either use .extend() or the + operator:

>>> q = [1, 2, 3]
>>> w = [4, 5, 6]
>>> q = q + w
>>> q
[1, 2, 3, 4, 5, 6]

Personally, I prefer .extend()
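A small sketch of the practical difference between the two, with illustrative variable names: .extend() changes the existing list in place, while + builds a new list and leaves both operands untouched.

```python
q = [1, 2, 3]
w = [4, 5, 6]

alias = q            # a second name for the same list object
q.extend(w)          # in place: the change is visible through alias too
print(alias)         # [1, 2, 3, 4, 5, 6]

r = [1, 2, 3]
combined = r + w     # a new list: r itself is left unchanged
print(r)             # [1, 2, 3]
print(combined)      # [1, 2, 3, 4, 5, 6]
```

This matters mostly when another part of the program holds a reference to the same list.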

Questions:
Answers:

I prefer using the glob module, as it does pattern matching and expansion.

import glob
print(glob.glob("/home/adam/*.txt"))

Will return a list with the queried files:

['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]
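If the files may also sit in subdirectories, glob can descend into them when the pattern contains ** and recursive=True is passed (available since Python 3.5). A minimal sketch, using a throwaway temporary directory and made-up file names so that it runs anywhere:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: root/a.txt, root/b.log, root/sub/c.txt
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", "b.log", os.path.join("sub", "c.txt")):
        open(os.path.join(root, rel), "w").close()

    top_only = glob.glob(os.path.join(root, "*.txt"))
    everywhere = glob.glob(os.path.join(root, "**", "*.txt"), recursive=True)

    # Strip the temporary prefix so the results are easy to read.
    top_names = sorted(os.path.relpath(p, root) for p in top_only)
    all_names = sorted(os.path.relpath(p, root) for p in everywhere)

print(top_names)  # ['a.txt']
print(all_names)  # ['a.txt', 'sub/c.txt'] on POSIX systems
```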

Questions:
Answers:
import os
os.listdir("somedirectory")

will return a list of all files and directories in “somedirectory”.

Questions:
Answers:

How to get a list of files in Python 2, 3, 3.4, 3.5

Here is a list of what I talked about in this answer:

  1. os.listdir() for Python 3

    • 1.1 – Use of list comprehension to select only txt files
    • 1.2 – Using os.path.isfile to avoid directories in the list
  2. pathlib

  3. os.walk()

  4. os.scandir()

  5. python 2 (os.listdir())
    • 5.1 – python 2.7 – os.walk('.')

  6. Example of use of os.walk(‘.’) to count how many files there are in a directory and its subdirectories (for python 3.5 and 2.7)

  7. using glob

1. os.listdir() (python 3)


>>> import os
>>> arr = os.listdir()
>>> arr
['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

1.1 – Use of list comprehension to select only txt files

>>> arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
>>> print(arr_txt)
['work.txt', '3ebooks.txt']

1.2 – Using os.path.isfile to avoid directories in the list

import os.path

listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]

print(listOfFiles)

output

There are only files here

['a simple game.py', 'data.txt', 'decorator.py', 'deep_reverse_list.py', 'deep_reverse_list2.py', 'hangman.py', 'import pygame.py', 'list_click_display.py', 'os_path.py']

2. Python 3.4 [ pathlib ]


>>> import pathlib
>>> flist = []
>>> for p in pathlib.Path('.').iterdir():
...  if p.is_file():
...   print(p)
...   flist.append(p)
...
error.PNG
exemaker.bat
guiprova.mp3
setup.py
speak_gui2.py
thumb.PNG

If you want to use list comprehension

>>> flist = [p for p in pathlib.Path('.').iterdir() if p.is_file()]
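pathlib can also look below the top level: Path.rglob('*') yields every entry underneath a directory. The following is only a sketch, run against a temporary directory with invented file names, comparing it with iterdir():

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub").mkdir()
    (root / "setup.py").touch()
    (root / "sub" / "notes.txt").touch()

    # iterdir() looks at one level only; rglob('*') descends into subdirectories.
    top_files = sorted(p.name for p in root.iterdir() if p.is_file())
    all_files = sorted(p.relative_to(root).as_posix()
                       for p in root.rglob("*") if p.is_file())

print(top_files)  # ['setup.py']
print(all_files)  # ['setup.py', 'sub/notes.txt']
```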

3. Python 3.5 (and 2.7) [ os.walk ]


To include all the files in the subdirectories as well (in this example there are 11 files in the first directory and 3 in a subdirectory), I will use os.walk(), which works well in Python 3.5 and newer versions:

import os
x = [i[2] for i in os.walk('.')]
y=[]
for t in x:
    for f in t:
        y.append(f)
print(y)
# print y # for 2.7 uncomment this and comment the previous line

output

['append_to_list.py', 'data.txt', 'data1.txt', 'data2.txt', 'data_180617', 'os_walk.py', 'READ2.py', 'read_data.py', 'somma_defaltdic.py', 'substitute_words.py', 'sum_data.py', 'data.txt', 'data1.txt', 'data_180617']

You can also do something like this to get only the files:

>>> import os
>>> x = next(os.walk('F://python'))[2] # for the current dir use ('.')
>>> x
['calculator.bat','calculator.py']

When you use next(os.walk('.')), you get the same entries as os.listdir(), but split into a 3-tuple: the root as the first item, all the folders in the second and all the files in the third, whereas os.listdir() returns folders and files in the same list. In both cases (next(os.walk('.')) and os.listdir()) you only look at the current directory and leave the subdirectories alone; to descend into them you must iterate over os.walk('.'), as shown before.
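To make the comparison concrete, here is a small sketch that runs both calls on the same temporary directory (the directory layout and names are invented for the example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "documents"))
    open(os.path.join(root, "work.txt"), "w").close()
    open(os.path.join(root, "documents", "inner.txt"), "w").close()

    mixed = sorted(os.listdir(root))                     # files and folders together
    dirpath, dirnames, filenames = next(os.walk(root))   # already separated

print(mixed)      # ['documents', 'work.txt']
print(dirnames)   # ['documents']
print(filenames)  # ['work.txt']
```

Neither call reports inner.txt, because both stop at the top level.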


4. os.scandir() from python 3.5 on


>>> import os
>>> x = [f.name for f in os.scandir() if f.is_file()]
>>> x
['calculator.bat','calculator.py']

Another example with scandir (a slight variation on the one from docs.python.org).
It is more efficient than os.listdir when you also need to know the type of each entry. In this case, it shows only the files in the current directory, where the script is executed.

>>> import os
>>> with os.scandir() as i:
...  for entry in i:
...   if entry.is_file():
...    print(entry.name)
...
ebookmaker.py
error.PNG
exemaker.bat
guiprova.mp3
setup.py
speakgui4.py
speak_gui2.py
speak_gui3.py
thumb.PNG
>>>
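Each item produced by os.scandir() is an os.DirEntry object, which carries more than the name: entry.path gives the joined path and entry.stat() the file metadata. A sketch that collects file sizes, assuming Python 3.6 or newer (for the with form) and a temporary directory with made-up contents:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "data.txt"), "w") as fh:
        fh.write("hello")
    os.mkdir(os.path.join(root, "subdir"))

    with os.scandir(root) as entries:
        # Map name -> size in bytes, for regular files only.
        sizes = {entry.name: entry.stat().st_size
                 for entry in entries if entry.is_file()}

print(sizes)  # {'data.txt': 5}
```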

5. Python 2


Use getcwd() to get the current working directory in Python 2 (or pass '.'):

>>> import os
>>> mylist = os.listdir(os.getcwd())
>>> mylist
['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

To go up in the directory tree, write something like this:

>>> for f in os.listdir('..'):
...     print f


>>> for f in os.listdir('/'):
...     print f

list of files with absolute path

It’s the same as in Python 3 (except the print)

>>> x = os.listdir('F:/python')
>>> for files in x:
...     print files
...
$RECYCLE.BIN
work.txt
3ebooks.txt
documents

5.1 – python 2 – os.walk(‘.’)

Let’s make an example for python 2.7 with walk (same as python 3).

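>>> import os
>>> def getAllFiles(dir):
...     """Get all the files in the dir and subdirs"""
...     allfiles = []
...     for pack in os.walk(dir):
...         for files in pack[2]:
...             # Join with the directory being walked; a bare name would be
...             # checked against the current working directory instead.
...             if os.path.isfile(os.path.join(pack[0], files)):
...                 allfiles += [files]
...     return allfiles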
...
>>> getAllFiles("F://python")
['first.py', 'Modules.txt', 'test4Console.py', 'text4Console.bat', 'tkinter001.py']

6. Example of use of os.walk(‘.’) for python 3.5 and 2.7

In this example, we count the files in a directory and all of its subdirectories.

import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + " files"


print(count("F:\\python"))

output

F:\python : 12057 files
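The same count can be written more compactly by summing the length of each filenames list that os.walk() yields. This is just one possible variant; count_files is a hypothetical name and the temporary directory only exists to make the sketch runnable:

```python
import os
import tempfile

def count_files(top):
    """Return the number of files under top, including subdirectories."""
    return sum(len(filenames) for _, _, filenames in os.walk(top))

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
        open(os.path.join(root, rel), "w").close()
    total = count_files(root)

print(total)  # 3
```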

7. Using glob

>>> import glob
>>> glob.glob("*.txt")
['ale.txt', 'alunni2015.txt', 'assenze.text.txt', 'text2.txt', 'untitled.txt']

Questions:
Answers:

A one-line solution to get only the list of files (no subdirectories):

filenames = next(os.walk(path))[2]

or absolute pathnames:

paths = [os.path.join(path,fn) for fn in next(os.walk(path))[2]]

Questions:
Answers:

Getting Full File Paths From a Directory and All Its Subdirectories

import os

def get_filepaths(directory):
    """
    Return a list with the full path of every file in the directory
    tree rooted at `directory`. os.walk() visits each directory in the
    tree (including `directory` itself) and yields a 3-tuple
    (dirpath, dirnames, filenames) for it.
    """
    file_paths = []  # List which will store all of the full filepaths.

    # Walk the tree.
    for root, directories, files in os.walk(directory):
        for filename in files:
            # Join the two strings in order to form the full filepath.
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)  # Add it to the list.

    return file_paths  # Self-explanatory.

# Run the above function and store its results in a variable.   
full_file_paths = get_filepaths("/Users/johnny/Desktop/TEST")

The path I provided in the above function contained 3 files: two of them in the root directory, and another in a subfolder called “SUBFOLDER.” You can now do things like print(full_file_paths), which will print the list:

['/Users/johnny/Desktop/TEST/file1.txt', '/Users/johnny/Desktop/TEST/file2.txt', '/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat']

If you’d like, you can open and read the contents, or focus only on files with the extension “.dat” like in the code below:

for f in full_file_paths:
    if f.endswith(".dat"):
        print(f)

/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat

Questions:
Answers:

Since version 3.4 there are builtin iterators for this which are a lot more efficient than os.listdir():

pathlib: New in version 3.4.

>>> import pathlib
>>> [p for p in pathlib.Path('.').iterdir() if p.is_file()]

According to PEP 428, the aim of the pathlib library is to provide a simple hierarchy of classes to handle filesystem paths and the common operations users do over them.

os.scandir(): New in version 3.5.

>>> import os
>>> [entry for entry in os.scandir('.') if entry.is_file()]

Note that os.walk() uses os.scandir() instead of os.listdir() from version 3.5 on, and its speed increased by 2-20 times according to PEP 471.

Let me also recommend reading ShadowRanger’s comment below.

Questions:
Answers:

I really liked adamk’s answer, suggesting that you use glob(), from the module of the same name. This allows you to have pattern matching with *s.

But as other people pointed out in the comments, glob() can get tripped up over inconsistent slash directions. To help with that, I suggest you use the join() and expanduser() functions in the os.path module, and perhaps the getcwd() function in the os module, as well.

As examples:

from glob import glob

# Return everything under C:\Users\admin that contains a folder called wlp.
# Raw string: without the r prefix, \U would be read as an escape sequence.
glob(r'C:\Users\admin\*\wlp')

The above is terrible: the path is hardcoded and will only ever work on Windows, because both the drive name and the \ separators are baked into it.

from glob    import glob
from os.path import join

# Return everything under Users, admin, that contains a folder called wlp.
glob(join('Users', 'admin', '*', 'wlp'))

The above works better, but it relies on the folder name Users which is often found on Windows and not so often found on other OSs. It also relies on the user having a specific name, admin.

from glob    import glob
from os.path import expanduser, join

# Return everything under the user directory that contains a folder called wlp.
glob(join(expanduser('~'), '*', 'wlp'))

This works perfectly across all platforms.

Another great example that works perfectly across platforms and does something a bit different:

from glob    import glob
from os      import getcwd
from os.path import join

# Return everything under the current directory that contains a folder called wlp.
glob(join(getcwd(), '*', 'wlp'))

Hope these examples help you see the power of a few of the functions you can find in the standard Python library modules.
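The same idea can also be expressed with pathlib, where Path.home() plays the role of expanduser('~') and Path.cwd() that of getcwd(). A sketch, reusing the folder name wlp from the examples above; find_wlp is a hypothetical helper and the temporary directory stands in for a real home directory:

```python
import tempfile
from pathlib import Path

def find_wlp(base):
    """Return every existing path of the form base/<anything>/wlp."""
    return sorted(Path(base).glob("*/wlp"))

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "project_a" / "wlp").mkdir(parents=True)
    (Path(tmp) / "project_b").mkdir()
    found = [p.relative_to(tmp).as_posix() for p in find_wlp(tmp)]

print(found)  # ['project_a/wlp']

# Typical calls would be find_wlp(Path.home()) or find_wlp(Path.cwd()).
```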

Questions:
Answers:
import os

def list_files(path):
    # returns a list of names (with extension, without full path) of all files 
    # in folder path
    files = []
    for name in os.listdir(path):
        if os.path.isfile(os.path.join(path, name)):
            files.append(name)
    return files 

Questions:
Answers:

You should use the os module to list directory contents. os.listdir(".") returns all the contents of the directory. We iterate over the result and append each entry to the list.

import os

content_list = []

for content in os.listdir("."): # "." means current directory
    content_list.append(content)

print(content_list)

Questions:
Answers:
import os
lst=os.listdir(path)

os.listdir returns a list containing the names of the entries in the directory given by path.
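Note that these are bare names, not paths. Before testing or opening an entry it usually has to be joined with the directory it came from, otherwise it is resolved against the current working directory. A sketch of that pitfall, using a temporary directory and an invented file name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "report_for_listdir_demo.txt"), "w").close()

    names = os.listdir(root)
    # The bare name is looked up in the current working directory,
    # so this is normally False even though the file exists in root.
    bare_check = os.path.isfile(names[0])
    # Joining with the listed directory gives the right answer.
    joined_check = os.path.isfile(os.path.join(root, names[0]))

print(names)         # ['report_for_listdir_demo.txt']
print(joined_check)  # True
```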

Questions:
Answers:

If you are looking for a Python implementation of find, this is a recipe I use rather frequently:

from findtools.find_files import (find_files, Match)

# Recursively find all *.sh files in **/usr/bin**
sh_files_pattern = Match(filetype='f', name='*.sh')
found_files = find_files(path='/usr/bin', match=sh_files_pattern)

for found_file in found_files:
    print found_file

I made a PyPI package out of it, and there is also a GitHub repository. I hope someone finds this code useful.

Questions:
Answers:

Returns a list of absolute file paths; does not recurse into subdirectories:

import os
L = [os.path.join(os.getcwd(),f) for f in os.listdir('.') if os.path.isfile(os.path.join(os.getcwd(),f))]

Questions:
Answers:

Python 3.5 introduced a new, faster method for walking through a directory: os.scandir().

Example:

import os

for file in os.scandir('/usr/bin'):
    line = ''
    if file.is_file():
        line += 'f'
    elif file.is_dir():
        line += 'd'
    elif file.is_symlink():
        line += 'l'
    line += '\t'
    print("{}{}".format(line, file.name))

Questions:
Answers:

List all files in a directory:

import os
from os import path

files = [x for x in os.listdir(directory_path) if path.isfile(directory_path+os.sep+x)]

Here, you get a list of all files in the directory.

Questions:
Answers:
# -*- coding: utf-8 -*-
import os
import traceback

print('\n\n')

def start():
    address = "/home/ubuntu/Desktop"
    try:
        Folders = []
        # Top-level entries get TopId 0; ids start at 1.
        FolderToList(address, 1, 0, Folders)
        return Folders
    except Exception:
        print("___________________________ ERROR ___________________________\n" + traceback.format_exc())

def FolderToList(address, Id, TopId, Folders):
    """Append one record per entry under address, recursing into subdirectories.

    Returns the next unused Id.
    """
    for item in os.listdir(address):
        endaddress = os.path.join(address, item)
        Folders.append({'Id': Id, 'TopId': TopId, 'Name': item, 'Address': endaddress})
        Id += 1
        # Only directories can be listed; os.listdir() on a plain file raises an error.
        if os.path.isdir(endaddress):
            Id = FolderToList(endaddress, Id, Id - 1, Folders)
    return Id

print(start())

Questions:
Answers:

Using generators

import os
def get_files(search_path):
     for (dirpath, _, filenames) in os.walk(search_path):
         for filename in filenames:
             yield os.path.join(dirpath, filename)
list_files = get_files('.')
for filename in list_files:
    print(filename)
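Because the generator produces paths lazily, you can stop early without walking the whole tree, or wrap it in list() when you really do want everything in a list, as the question asks. A sketch that repeats the get_files function from above so it runs on its own (the temporary directory and file names are invented):

```python
import itertools
import os
import tempfile

def get_files(search_path):
    for dirpath, _, filenames in os.walk(search_path):
        for filename in filenames:
            yield os.path.join(dirpath, filename)

with tempfile.TemporaryDirectory() as root:
    for i in range(5):
        open(os.path.join(root, "file%d.txt" % i), "w").close()

    first_two = list(itertools.islice(get_files(root), 2))  # stops after two results
    everything = list(get_files(root))                      # materialise the full list

print(len(first_two), len(everything))  # 2 5
```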

Questions:
Answers:

If you care about performance, try scandir. For Python 2.x, you may need to install it manually (it is built in as os.scandir() from Python 3.5). Examples:

# python 2.x
import scandir
import sys

de = scandir.scandir(sys.argv[1])
while 1:
    try:
        d = de.next()
        print d.path
    except StopIteration as _:
        break

This saves a lot of time when you need to scan a huge directory: you do not need to buffer a huge list, you just fetch entries one by one. You can also do it recursively:

def scan_path(path):
    de = scandir.scandir(path)
    while 1:
        try:
            e = de.next()
            if e.is_dir():
                scan_path(e.path)
            else:
                print e.path
        except StopIteration as _:
                break
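On Python 3.6 and newer, roughly the same recursive scan can be sketched with the built-in os.scandir() and yield from, so that the caller receives the paths instead of having them printed. The function name scan_tree and the test tree are assumptions made for this example:

```python
import os
import tempfile

def scan_tree(path):
    """Yield the path of every non-directory entry below path, one at a time."""
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Recurse into real subdirectories (symlinked ones are not followed).
                yield from scan_tree(entry.path)
            else:
                yield entry.path

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt",
                os.path.join("a", "mid.txt"),
                os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, rel), "w").close()
    found = sorted(os.path.relpath(p, root) for p in scan_tree(root))

print(found)
```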

Questions:
Answers:

Use this function if you want a different file type or full paths.

import os
def createList(foldername, fulldir = True, suffix=".jpg"):
    file_list_tmp = os.listdir(foldername)
    #print len(file_list_tmp)
    file_list = []
    if fulldir:
        for item in file_list_tmp:
            if item.endswith(suffix):
                file_list.append(os.path.join(foldername, item))
    else:
        for item in file_list_tmp:
            if item.endswith(suffix):
                file_list.append(item)
    return file_list
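If several extensions are wanted at once, str.endswith() also accepts a tuple of suffixes. A possible variant is sketched below; list_by_suffix is a hypothetical name, the lowercase comparison is an added assumption so that .PNG matches .png, and the files are invented:

```python
import os
import tempfile

def list_by_suffix(foldername, suffixes=(".jpg", ".png")):
    """Return the names in foldername ending with any of the given suffixes."""
    # str.endswith accepts a tuple, so all suffixes are tested in one call.
    return sorted(name for name in os.listdir(foldername)
                  if name.lower().endswith(tuple(suffixes)))

with tempfile.TemporaryDirectory() as root:
    for name in ("a.jpg", "b.PNG", "c.txt"):
        open(os.path.join(root, name), "w").close()
    images = list_by_suffix(root)

print(images)  # ['a.jpg', 'b.PNG']
```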

Questions:
Answers:
# Python 2 only: the dircache module was removed in Python 3.
import dircache
list = dircache.listdir(pathname)
i = 0
check = len(list[0])
temp = []
count = len(list)
while count != 0:
  if len(list[i]) != check:
     temp.append(list[i-1])
     check = len(list[i])
  else:
    i = i + 1
    count = count - 1

print temp

Questions:
Answers:

By using os library.

import os
for root, dirs,files in os.walk("your dir path", topdown=True):
    for name in files:
        print(os.path.join(root, name))

Questions:
Answers:
import os 
os.listdir(path)

This returns a list of all files and directories in path.

filenames = next(os.walk(path))[2]

This returns only the files, not the subdirectories.

Questions:
Answers:

Referring to the answer by @adamk, here is my os detection method in response to the slash inconsistency comment by @Anti Earth

import sys
import os
from pathlib import Path
from glob import glob

# Pick the path separator for this platform (os.sep holds the same value).
if sys.platform == 'win32':
    slash = "\\"
else:
    slash = "/"

# TODO: How can I list all files of a directory in Python and add them to a list?

# Step 1 - List all files of a directory (pick one of the three methods)

# Method 1: Find only pre-defined filetypes (.txt) and no subfiles, answer provided by @adamk
dir1 = "%sfoo%sbar%s*.txt" % (slash, slash, slash)
_files = glob(dir1)

# Method 2: Find all files and no subfiles
dir2 = "%sfoo%sbar%s" % (slash, slash, slash)
_files = (x for x in Path(dir2).iterdir() if x.is_file())

# Method 3: Find all files and all subfiles
dir3 = "%sfoo%sbar" % (slash, slash)
_files = (x for x in Path(dir3).glob('**/*') if x.is_file())


# Step 2 - Add them to a list

files_list = []
for eachfiles in _files:
    files_basename = os.path.basename(eachfiles)
    files_list.append(files_basename)

print(files_list)
['file1.txt', 'file2.txt', .... ]

I’m assuming that you want just the basenames in the list.

Refer to this post for pre-defining multiple file formats for Method 1.
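The linked post is not reproduced here. One possible way to handle several file formats in Method 1 is to run one glob per pattern and merge the results; glob_many is a hypothetical helper and the file names are made up:

```python
import glob
import os
import tempfile

def glob_many(directory, patterns):
    """Return the sorted, de-duplicated matches for several glob patterns."""
    matches = set()
    for pattern in patterns:
        matches.update(glob.glob(os.path.join(directory, pattern)))
    return sorted(matches)

with tempfile.TemporaryDirectory() as root:
    for name in ("file1.txt", "notes.md", "image.png"):
        open(os.path.join(root, name), "w").close()
    names = [os.path.basename(p) for p in glob_many(root, ("*.txt", "*.md"))]

print(names)  # ['file1.txt', 'notes.md']
```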

Questions:
Answers:

Here is a simple example:

import os
root, dirs, files = next(os.walk('.'))
for file in files:
    print(file) # In Python 3 use: file.encode('utf-8') in case of error.

Note: Change . to your path value or variable.

Here is an example returning the absolute paths of all entries (files and directories):

import os
path = '.' # Change this as you need.
abspaths = []
for fn in os.listdir(path):
    abspaths.append(os.path.abspath(os.path.join(path, fn)))
print("\n".join(abspaths))

Documentation: os and os.path for Python 2, os and os.path for Python 3.

Questions:
Answers:
ls -a

This will list even the hidden stuff.
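That is a shell command rather than Python. Inside Python, os.listdir() already includes names that start with a dot (though not the . and .. entries that ls -a shows), while glob's * pattern skips them. A sketch on a temporary directory with invented names:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    for name in (".hidden", "visible.txt"):
        open(os.path.join(root, name), "w").close()

    with_hidden = sorted(os.listdir(root))
    without_hidden = sorted(os.path.basename(p)
                            for p in glob.glob(os.path.join(root, "*")))

print(with_hidden)     # ['.hidden', 'visible.txt']
print(without_hidden)  # ['visible.txt']
```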
