
Ignore case in glob() on Linux

Posted by: admin, November 29, 2017

Questions:

I’m writing a script that has to work on directories that are modified by hand by Windows and Linux users alike. The Windows users tend not to care about case when naming files.

Is there a way to handle this on the Linux side in Python, i.e. can I get a case-insensitive, glob-like behaviour?

Answers:

Use case-insensitive regexes instead of glob patterns. fnmatch.translate generates a regex from a glob pattern, so

re.compile(fnmatch.translate(pattern), re.IGNORECASE)

gives you a case-insensitive version of a glob pattern as a compiled RE.
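As a minimal sketch of how that compiled RE might be applied, the helper below filters the names in one directory. The function name iglob_dir and its parameters are made up for illustration and are not part of any library.

```python
import fnmatch
import os
import re


def iglob_dir(directory, pattern):
    """Return the names in `directory` that match `pattern`, ignoring case."""
    # fnmatch.translate turns the glob pattern into a regex source string;
    # re.IGNORECASE makes the compiled regex match regardless of case.
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    return sorted(name for name in os.listdir(directory) if regex.match(name))
```

Unlike glob.glob, this only looks at a single directory and returns bare names rather than paths.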

Keep in mind that if the directory lives on a case-sensitive filesystem, as is typical on a Linux box, users can create files foo, Foo and FOO in the same directory, so a case-insensitive pattern may match several of them.
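If such clashes matter for your script, one way to spot them is to group names by their lowercased form. This is only a sketch; case_collisions is a hypothetical helper name.

```python
from collections import defaultdict


def case_collisions(names):
    """Map each lowercased name to the spellings that share it (clashes only)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Keep only the groups where more than one spelling exists.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```

For example, passing the result of os.listdir(path) would report foo, Foo and FOO together under the key 'foo'.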

Answers:

You can replace each alphabetic character c with [cC], via

import glob

def insensitive_glob(pattern):
    # Turn each letter c into the bracket expression [cC]; leave other characters alone.
    def either(c):
        return '[%s%s]' % (c.lower(), c.upper()) if c.isalpha() else c
    return glob.glob(''.join(map(either, pattern)))
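One caveat with this approach: letters inside a bracket expression that is already in the pattern, such as [0-9a], are expanded too, which produces a different pattern. A possible variant, sketched below under the simplifying assumption that every '[' opens a group that ends at the next ']', copies existing groups unchanged (so they stay case-sensitive). The name insensitive_glob_pattern is hypothetical.

```python
import glob


def insensitive_glob_pattern(pattern):
    """Expand letters outside [...] to [xX]; copy existing bracket groups as-is."""
    out = []
    in_brackets = False
    for c in pattern:
        if in_brackets:
            # Inside an existing group: copy verbatim until the closing bracket.
            out.append(c)
            if c == ']':
                in_brackets = False
        elif c == '[':
            in_brackets = True
            out.append(c)
        elif c.isalpha():
            out.append('[%s%s]' % (c.lower(), c.upper()))
        else:
            out.append(c)
    return ''.join(out)


def insensitive_glob_safe(pattern):
    # Hand the expanded pattern to the ordinary, case-sensitive glob.
    return glob.glob(insensitive_glob_pattern(pattern))
```

Patterns that use ']' as a literal first member of a group are not handled by this sketch.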

Answers:

Non-recursively

To retrieve the files (and only files) in a directory “path” that match “globexpression” (with os, re and fnmatch imported):

list_path = [i for i in os.listdir(path) if os.path.isfile(os.path.join(path, i))]
result = [os.path.join(path, j) for j in list_path if re.match(fnmatch.translate(globexpression), j, re.IGNORECASE)]

Recursively

With os.walk:

result = []
for root, dirs, files in os.walk(path, topdown=True):
    result += [os.path.join(root, j) for j in files
               if re.match(fnmatch.translate(globexpression), j, re.IGNORECASE)]

It is also better to compile the regular expression once, so instead of calling

re.match(fnmatch.translate(globexpression), j, re.IGNORECASE)

for every file, do this before the loop:

reg_expr = re.compile(fnmatch.translate(globexpression), re.IGNORECASE)

and then replace the line in the loop with:

    result += [os.path.join(root, j) for j in files if reg_expr.match(j)]
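Putting the recursive walk and the precompiled expression together might look like the sketch below. The function name insensitive_rglob is an assumption made for this example; path and globexpression are the names used above.

```python
import fnmatch
import os
import re


def insensitive_rglob(path, globexpression):
    """Return paths of files under `path` whose names match, ignoring case."""
    # Compile once, before the loop, instead of translating per file.
    reg_expr = re.compile(fnmatch.translate(globexpression), re.IGNORECASE)
    result = []
    for root, dirs, files in os.walk(path):
        # Only file names are tested; directory names are never matched.
        result += [os.path.join(root, j) for j in files if reg_expr.match(j)]
    return result
```

The pattern is matched against each file name on its own, so it should not contain directory separators.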

Answers:

Depending on your use case, you can call .lower() on both the file pattern and the names from the directory listing, and only then compare the pattern with each filename.
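A rough sketch of that idea, using fnmatch.fnmatchcase for the comparison so that no further case normalisation is applied. The helper name lower_glob is made up for illustration.

```python
import fnmatch
import os


def lower_glob(directory, pattern):
    """Return names in `directory` whose lowercased form matches the lowercased pattern."""
    lowered = pattern.lower()
    # The original spelling is returned; only the comparison uses lowercase.
    return sorted(name for name in os.listdir(directory)
                  if fnmatch.fnmatchcase(name.lower(), lowered))
```

Note that lowercasing the pattern also changes any character classes in it, so a range such as [A-Z] becomes [a-z].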