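import os
from concurrent.futures import ProcessPoolExecutor

def scan(scanroot, ofilename):
    ofile = open(ofilename, 'w')
    with ProcessPoolExecutor() as pe:
        futures = []
        # Submit one job per file found under scanroot.
        for root, dirs, files in os.walk(scanroot):
            for f in files:
                fname = os.path.join(root, f)
                futures.append(pe.submit(scan_file, fname))
        # Collect results in submission order; a failed job is reported and skipped.
        for f in futures:
            try:
                ofile.write(f.result())
            except Exception as e:
                print('ERROR:', str(e))
    ofile.close()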
This is very simple and works fine. But when I did this on a USB 3 disk on Linux (Ubuntu Wily), something weird happened. If you do the evaluation with just one process, the disk transfers data at a rate of 70 MB/s, which is slightly slower than an internal hard disk. When running 8 simultaneous jobs, the total transfer rate drops to 4 MB/s, which is almost 20 times slower.
I have no idea what could be causing this, but it seems to be specific to USB: internal hard drives handle multiple readers effortlessly.
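If you want to check whether your own disk behaves the same way, something like the following rough sketch could be used to compare one worker against eight. It is not the code that produced the numbers above. The names `read_file` and `measure` and the `executor_cls` parameter are made up for this illustration, and the script only reads the files rather than running the real `scan_file`. Note also that a second pass over the same files is likely to be served from the operating system's page cache, so the comparison is only meaningful if the cache is cold for each run (for example, by unmounting and remounting the disk in between).

```python
import os
import sys
import time
from concurrent.futures import ProcessPoolExecutor


def read_file(fname, chunk_size=1024 * 1024):
    """Read a file to the end and return the number of bytes read."""
    total = 0
    with open(fname, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def measure(scanroot, workers, executor_cls=ProcessPoolExecutor):
    """Read every file under scanroot using `workers` parallel jobs.

    Returns a (total_bytes, elapsed_seconds) tuple. executor_cls is a
    hypothetical hook so a different executor can be swapped in.
    """
    fnames = [os.path.join(root, f)
              for root, dirs, files in os.walk(scanroot)
              for f in files]
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pe:
        # Any read error (e.g. a permission problem) propagates from here.
        total = sum(pe.map(read_file, fnames))
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == '__main__' and len(sys.argv) > 1:
    # Usage: python measure.py /path/to/mounted/disk
    for workers in (1, 8):
        total, elapsed = measure(sys.argv[1], workers)
        print(workers, 'workers:', total / elapsed / 1e6, 'MB/s')
```

Passing `max_workers=1` to `ProcessPoolExecutor` is also the simplest way to get the single-process behaviour in the `scan` function above without restructuring it.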