I’m running Python 2.6.2 on XP. I have a large number of text files (100k+) spread across several folders that I would like to consolidate in a single folder on an external drive.
I’ve tried using shutil.copy() and shutil.copytree() and distutils.file_util.copy_file() to copy files from source to destination. None of these methods has successfully copied all files from a source folder, and each attempt has ended with IOError Errno 13 Permission Denied and I am unable to create a new destination file.
I have noticed that all the destination folders I’ve used, regardless of the source folders used, have ended up with exactly 13,106 files. I cannot open any new files for writing in folders that have this many (or more files), which may be why I’m getting Errno 13.
I’d be grateful for suggestions on whether and why this problem is occurring.
Are you using FAT32? The maximum number of directory entries in a FAT32 folder is is 65.534. If a filename is longer than 8.3, it will take more than one directory entry. If you are conking out at 13,106, this indicates that each filename is long enough to require five directory entries.
Solution: Use an NTFS volume; it does not have per-folder limits and supports long filenames natively (that is, instead of using multiple 8.3 entries). The total number of files on an NTFS volume is limited to around 4.3 billion, but they can be put in folders in any combination.
I wouldn’t have that many files in a single folder, it is a maintenance nightmare. BUT if you need to, don’t do this on FAT: you have max. 64k files in a FAT folder.
Read the error message
Your specific problem could also be be, that you as the error message suggests are hitting a file which you can’t access. And there’s no reason to believe that the count of files until this happens should change. It is a computer after all, and you are repeating the same operation.
I predict that your external drive is formatted 32 and that the filenames you’re writing to it are somewhere around 45 characters long.
FAT32 can only have 65536 directory entries in a directory. Long file names use multiple directory entries each. And “.” always takes up one entry. That you are able to write 65536/5 – 1 = 13106 entries strongly suggests that your filenames take up 5 entries each and that you have a FAT32 filesystem. This is because there exists code using 16-bit numbers as directory entry offsets.
Additionally, you do not want to search through multi-1000 entry directories in FAT — the search is linear. I.e. fopen(some_file) will induce the OS to march linearly through the list of files, from the beginning every time, until it finds some_file or marches off the end of the list.
Short answer: Directories are a good thing.