Recursive Function
Benny
07-12-2002, 03:20 AM
Hey all,
I'm in the process of writing a neat little directory mirroring/synchronisation prog, but I suck at recursive stuff, so I am stuck on traversing the directory structure...
def directoryCrawl(srcDirName, dstDirName):
    files = filecmp.dircmp(srcDirName, dstDirName)
    for fileName in files.right_only:
        if exceptCheck(exceptList, fileName) == 0:
            print "Deleting "+fileName
    for fileName in files.left_only:
        copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for fileName in files.common_files:
        if filecmp.cmp(os.path.join(srcDirName,fileName), os.path.join(dstDirName,fileName)) == 0:
            copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for item in files.common_dirs:
        srcDirName = os.path.join(srcDirName, item)
        dstDirName = os.path.join(dstDirName, item)
        directoryCrawl(srcDirName, dstDirName)
Now of course that won't work... How can I find out when I get to the end of the directory structure, so that I know to reset srcDirName and dstDirName?
Any ideas? Or am I just going about this all wrong?
I could probably do this more easily if I wrote my own directory/file comparing functions. But that would be a lot of work, and the filecmp.dircmp class is ideal for what I am doing (except for the fact that it doesn't give me full pathnames).
Cheers..
Strike
07-14-2002, 07:20 PM
You're at the end of the directory structure when all the files in a directory are just plain files (no directories).
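A minimal sketch of that point, written for current Python 3 rather than the Python 2 used elsewhere in this thread. The helper name `count_files` is hypothetical and not from the thread. No explicit "end of tree" test is needed: a directory with no subdirectories never reaches the recursive call, so that branch simply returns.

```python
import os


def count_files(path):
    """Count regular files under path, recursing into subdirectories."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            # Recursive case: only reached when a subdirectory exists.
            total += count_files(entry.path)
        elif entry.is_file(follow_symlinks=False):
            total += 1
    # Base case is implicit: with no subdirectories, nothing recursed.
    return total
```

Each call works on its own `path` argument, so nothing has to be reset on the way back up.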
Benny
07-15-2002, 01:43 AM
/me smacks his head......
Man o Man am I an idiot.
I totally forgot everything I ever knew about scope......plus everything that is logical about recursion.....
I was trying to change srcDirName and dstDirName within the function and not within the call to the same function, so of course it wasn't working. To get it working I simply needed:
def directoryCrawl(files, srcDirName, dstDirName):
    for fileName in files.right_only:
        if exceptCheck(exceptList, fileName) == 0:
            print "Deleting "+fileName
    for fileName in files.left_only:
        copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for fileName in files.common_files:
        #if filecmp.cmp(os.path.join(srcDirName,fileName), os.path.join(dstDirName, fileName)) == 0:
        copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for item in files.subdirs.keys():
        directoryCrawl(files.subdirs[item], os.path.join(srcDirName, item), os.path.join(dstDirName,item))
Works like a charm!! Yay - productive day!
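The difference between the two versions can be isolated in a small sketch (Python 3; the function names are hypothetical and only mimic the loop over subdirectories). Reassigning the parent path inside the loop makes every later sibling build on the previous one, whereas computing the child path only in the call expression leaves the parent untouched.

```python
import os


def child_paths_buggy(parent, names):
    """Mimics the first version: the loop overwrites `parent` each time."""
    paths = []
    for name in names:
        parent = os.path.join(parent, name)  # the next sibling starts from here
        paths.append(parent)
    return paths


def child_paths_fixed(parent, names):
    """Mimics the working version: `parent` is never reassigned."""
    return [os.path.join(parent, name) for name in names]


print(child_paths_buggy("src", ["docs", "music"]))
print(child_paths_fixed("src", ["docs", "music"]))
```

On a POSIX system the buggy variant yields `src/docs` followed by `src/docs/music`, a path that was never meant to exist, while the fixed variant yields `src/docs` and `src/music`.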
jemfinch
07-15-2002, 02:35 AM
Use os.path.walk.
Jeremy
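For readers on current Python: `os.path.walk` is a Python 2 function that was removed in Python 3, and `os.walk` (added in Python 2.3) is its replacement. The sketch below uses `os.walk` to list every file under a root; `relative_files` is a hypothetical helper name, and the code is an illustration rather than what was being suggested in 2002.

```python
import os


def relative_files(root):
    """Return the sorted paths of all files under root, relative to root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)
```

Running this on both the source and the destination and comparing the two lists is one way a walk-based mirror could decide what to copy or delete.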
Benny
07-15-2002, 02:41 AM
Originally posted by jemfinch
Use os.path.walk.
Jeremy
Yeh I know about os.path.walk
But it didn't lend itself to my application as nicely as using filecmp.dircmp() and a recursive function. Plus I needed the practice with recursion (as you plainly saw).
Thanks though.
jemfinch
07-15-2002, 10:33 AM
Originally posted by Benny
But it didn't lend itself to my application as nicely as using filecmp.dircmp() and a recursive function. Plus I needed the practice with recursion (as you plainly saw).
What does your application do? I seriously doubt an already-debugged and flexible construct included in the standard library wouldn't lend itself to your application as nicely as a homemade and (most likely) buggy and less flexible construct does.
Jeremy
Benny
07-15-2002, 08:36 PM
Originally posted by jemfinch
What does your application do? I seriously doubt an already-debugged and flexible construct included in the standard library wouldn't lend itself to your application as nicely as a homemade and (most likely) buggy and less flexible construct does.
You're probably correct, but the fact is, I wanted to try with recursion, and filecmp.dircmp() made things very quick to code.
Here is my code so far (hardly finished at the moment, but it will give you some idea of what I want the program to do)
#!/usr/bin/env python
import string, glob, filecmp, os, sys, shutil

wFile = open("mirror.work", 'r')
workContents = wFile.readlines()
wFile.close()

exceptList = ["bbirnbaum", "depts", "lost+found", 'netlogon', 'quota.user', 'samba']

def exceptCheck(a,b):
    try:
        a.index(b)
        return(1)
    except ValueError:
        return(0)

def copyUtil(srcfilePath, dstfilePath):
    if os.path.isdir(srcfilePath) == 0:
        print "Copying File: "+srcfilePath+" --> "+dstfilePath
        try:
            shutil.copy2(srcfilePath, dstfilePath)
        except IOError, (errno, strerror):
            print "I/O error(%s): %s" % (errno, strerror)
    else:
        print "Copying Directory: "+srcfilePath+" --> "+dstfilePath
        shutil.copytree(srcfilePath, dstfilePath)

def delUtil(path):
    if os.path.isdir(path) == 1:
        print "Deleting Directory: "+path
        shutil.rmtree(path)
    else:
        print "Deleting File: "+path
        os.remove(path)

def directoryCrawl(files, srcDirName, dstDirName):
    for fileName in files.right_only:
        if exceptCheck(exceptList, fileName) == 0:
            delUtil(os.path.join(dstDirName, fileName))
    for fileName in files.left_only:
        copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for fileName in files.common_files:
        if filecmp.cmp(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName)) == 0:
            copyUtil(os.path.join(srcDirName, fileName), os.path.join(dstDirName, fileName))
    for item in files.subdirs.keys():
        directoryCrawl(files.subdirs[item], os.path.join(srcDirName, item), os.path.join(dstDirName,item))

for item in workContents:
    item = string.replace(item, "\t", "")
    item = string.replace(item, "\n", "")
    job = string.split(item, ':')
    print "\n+-+-+-+-+-+- Running Job: "+job[0]+" -+-+-+-+-+-+\n"
    files = filecmp.dircmp(job[1], job[2])
    directoryCrawl(files, job[1], job[2])
    print "\n+-+-+-+-+-+- Job COMPLETE: "+job[0]+" -+-+-+-+-+-+\n"
It is (going to be) a folder/drive mirroring tool.
I liked using filecmp.dircmp() because it can show me exactly which folders/files differ between the two directories, and since the aim of my script is to replicate a folder structure exactly, this seemed like a logical choice.
I'm the first to admit, I'm not the most experienced or talented programmer, so someone like yourself probably could have used os.path.walk() to their advantage, but me with my inferior skills can't see the connection all that well.
By all means show me a (the?) better method using os.path.walk() and I'll try to learn from it, but simply "paying" me out (as it seemed) isn't going to help me at all.
Nevertheless, I am pretty damn pleased and proud with what I have done so far.
So yeh...
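As a hedged illustration of the same idea in current Python 3, the sketch below walks a `filecmp.dircmp` result recursively but only returns a plan instead of copying or deleting anything. The name `plan_sync` and the `keep` parameter (playing the role of exceptList) are assumptions made for this example. In Python 3 each dircmp object exposes its two directories as `left` and `right`, so the full paths do not have to be passed along separately.

```python
import filecmp
import os


def plan_sync(comparison, keep=()):
    """Return (action, path) tuples that would make `right` mirror `left`."""
    actions = []
    for name in comparison.right_only:
        if name not in keep:  # names in `keep` are never deleted
            actions.append(("delete", os.path.join(comparison.right, name)))
    for name in comparison.left_only:
        actions.append(("copy", os.path.join(comparison.left, name)))
    for name in comparison.common_files:
        src = os.path.join(comparison.left, name)
        dst = os.path.join(comparison.right, name)
        # shallow=False compares contents, not just size and mtime.
        if not filecmp.cmp(src, dst, shallow=False):
            actions.append(("copy", src))
    for sub in comparison.subdirs.values():
        # Each sub-comparison already knows its own left/right paths.
        actions.extend(plan_sync(sub, keep))
    return actions
```

A call such as `plan_sync(filecmp.dircmp(src, dst), keep=["lost+found"])` gives a list that can be printed as a dry run before anything is touched.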
jemfinch
07-16-2002, 12:15 AM
Originally posted by Benny
It is (going to be) a folder/drive mirroring tool.
In that case, the quickest way to do what you want is to use shutil.copytree to a temp directory, and os.rename to rename the temp directory to the destination directory.
I liked using filecmp.dircmp() because it can show me exactly which folders/files differ between the two directories, and since the aim of my script is to replicate a folder structure exactly, this seemed like a logical choice.
In today's world, processor and disk time is cheap. Since you're using Python (and thus obviously not too worried about speed or efficiency) just copying the whole darn thing is more robust and easier to program than selectively copying parts.
By all means show me a (the?) better method using os.path.walk() and I'll try to learn from it, but simply "paying" me out (as it seemed) isn't going to help me at all.
You try os.path.walk and you'll learn a lot more :)
Nevertheless, I am pretty damn pleased and proud with what I have done so far.
I don't mean to denigrate your work, I'm just pointing stuff out in a rather blunt way (because I'm in a bad mood today for some reason.)
Jeremy
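The copy-to-a-temp-directory-then-rename suggestion could look roughly like the following in current Python 3. This is a sketch under assumptions: `mirror_by_full_copy` is a hypothetical name, the parent of the destination is assumed to exist already, and the staging directory is created next to the destination so that the rename stays on one filesystem.

```python
import os
import shutil
import tempfile


def mirror_by_full_copy(src, dst):
    """Replace dst with a complete copy of src, staged next to dst first."""
    parent = os.path.dirname(os.path.abspath(dst))
    staging_root = tempfile.mkdtemp(prefix=".mirror-", dir=parent)
    staging = os.path.join(staging_root, "copy")
    try:
        shutil.copytree(src, staging)
        if os.path.isdir(dst):
            # os.rename cannot replace an existing non-empty directory.
            shutil.rmtree(dst)
        os.rename(staging, dst)
    finally:
        # Removes the (now empty) staging root, or a failed partial copy.
        shutil.rmtree(staging_root, ignore_errors=True)
```

The old destination is removed only after the full copy has succeeded, but there is still a short window between the removal and the rename in which the destination does not exist.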
Benny
07-16-2002, 01:43 AM
Originally posted by jemfinch
In that case, the quickest way to do what you want is to use shutil.copytree to a temp directory, and os.rename to rename the temp directory to the destination directory.
Obviously this would be easiest, but I wanted to be able to do things such as exclude certain directories/files from getting mirrored, as well as have certain files/directories in the mirror location not get deleted. Plus I'll be mirroring somewhere along the lines of 140 gig worth of data. Copying 140 gig every time I do a mirror (when only maybe 10 gig of that has changed) is not desirable.
In today's world, processor and disk time is cheap. Since you're using Python (and thus obviously not too worried about speed or efficiency) just copying the whole darn thing is more robust and easier to program than selectively copying parts.
Same reasons as above: I want more control over what exactly gets backed up, and the mirroring is going to run across a network, so I don't want to go clogging up the network. Hence I only want to transfer files that are new or differ.
You try os.path.walk and you'll learn a lot more :)
I did/have, and I couldn't code it nearly as easily or get it to do exactly what I wanted (when compared to my filecmp.dircmp() recursive solution) - so therefore it is either:
a) My crappy (lazy) coding skills <--- most likely
or
b) My choice of using the recursive method was a better one.
I don't mean to denigrate your work, I'm just pointing stuff out in a rather blunt way (because I'm in a bad mood today for some reason.)
I know. It can just be really frustrating to have done something (spent time and thought on something) that you are happy with/proud of and then have someone shoot it down in a ball of flames. There is a thin line between constructive criticism (which I welcome) and just plain criticism.
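The two requirements raised in this post (skip excluded names, and transfer only files that are new or differ) can be sketched as two small helpers in current Python 3. Both names, `is_excluded` and `needs_copy`, and the use of shell-style patterns are assumptions for illustration, not part of the script above. With `shallow=True`, `filecmp.cmp` treats two files with the same type, size and modification time as equal without reading them, which matters when the destination sits across a network.

```python
import filecmp
import fnmatch
import os


def is_excluded(name, patterns):
    """True if name matches any shell-style pattern, e.g. 'quota.*'."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def needs_copy(src_file, dst_file):
    """True if dst_file is missing or appears to differ from src_file."""
    if not os.path.exists(dst_file):
        return True
    # Shallow comparison: matching size and mtime count as "unchanged".
    return not filecmp.cmp(src_file, dst_file, shallow=True)
```

The shallow check relies on the copy step preserving modification times, which `shutil.copy2` (already used in the script above) attempts to do.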