...or use xargs, or whatever.
(!) [Thomas] There's nothing wrong with your implementation, but it is
worth noting that the test is simply going to add another "thing" for
the script to do. If the number of files is vast, this is just going to
slow it down. You could remove [1.] entirely and let find match the files
beforehand:
find . -type f -exec ./truncate {} \;
(!) [Jason] Oh! I didn't think of that. That is better than silently
dropping non-existent and non-regular files.
(!) [Thomas] I could hash this argument out in any number of combinations
involving xargs, -exec, etc, with arguments as to whether you should use a
shell script, etc., etc.
(!) [Jason] Yes, and you probably would be wanting to use xargs if the
number of files is vast.
(!) [Thomas] Maybe. But that will still fail where a filename has spaces
in it. Example:
[n6tadam@station fi]$ ls -lFb
total 8
-rw-r--r-- 1 n6tadam n6tadam 0 Jan 11 11:18 foo
drwxr-xr-x 2 n6tadam n6tadam 4096 Jan 11 11:11 ignore/
-rw-r--r-- 1 n6tadam n6tadam 120 Jan 11 11:08 this\ has\ spaces
Ignoring the "ignore/" directory, I've got a file with spaces in the
filename [1], as well as a 'normal' file. If I wanted to truncate the
files in the CWD above, I might use:
find . -type f -maxdepth 1 -exec sh -c 'cat /dev/null > {}' \;
... which is fine, for the file with no spaces. Of course, the truncate.sh
script you wrote is fine for handling that (you actually quoted the
variable -- thousands do not). But just what is wrong with that command
above? Well, for each file that find finds, it has to spawn a separate
non-interactive shell to process it. That's slow.
xargs might improve things (I'll leave this as an exercise to the reader
to use 'time'):
find . -type f -maxdepth 1 -print0 | xargs -0i sh -c "cat /dev/null > \"{}\""
Note the quoting. It's paramount that this is done: even though the
'-print0' option to find terminates each file name with '\0' (and xargs
splits on that again at the other end), we still have to quote the
filename, and this will still fail if the filename contains a '"'
character. Why? Because xargs pastes the name into the command string,
and by the time that string is handed to the shell to parse, we're back
to our old tricks of: '"\"use\" more quo\"t\"es'.
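One way around the quoting problem, sketched here as a variant rather than anything proposed in the thread, is to keep the filename out of the command string altogether and hand it to the shell as a positional parameter. The scratch directory and the sample filenames below are made up for the demonstration.

```shell
#!/bin/sh
# Scratch directory so nothing real gets truncated (hypothetical setup).
tmp=$(mktemp -d)
printf 'data\n' > "$tmp/foo"
printf 'data\n' > "$tmp/this has spaces"
printf 'data\n' > "$tmp/has \"quotes\" too"

# The filename arrives as $1, so the shell never parses it as source text.
# The second "sh" only fills $0.
find "$tmp" -maxdepth 1 -type f -exec sh -c ': > "$1"' sh {} \;

# Count the files that still have content.
remaining=$(find "$tmp" -maxdepth 1 -type f -size +0 | wc -l | tr -d ' ')
echo "non-empty files left: $remaining"
rm -rf "$tmp"
```

This still spawns one shell per file, so it fixes the quoting and not the speed.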
So is using find(1) any better than using a plain shell script that globs
a given directory for files to truncate? No. Because find blindly exec()'s
whatever we pass to it (and we're having to use shell redirection) we must
invoke the shell for it to work. The only advantage to using find is that
it would handle some strange files, nothing more (in this particular
application of it, anyway).
I suppose you could make that find command more efficient:
find . -type f -maxdepth 1 -not -empty -print0 | xargs -0i sh -c "cat /dev/null > \"{}\""
... which just ensures that the files we pass to it have a filesize
greater than zero. The "best" solution that I personally can see, is using
the following:
find . -type f -maxdepth 1 -not -empty -print0 | xargs -0i cp /dev/null {}
This obliterates the need to fork a subshell to perform any redirection --
and as with any "find .. | xargs" combination, it'll be quite fast, too.
But the main reason for using it is that by avoiding any
shell-redirection-mangle-filename techniques, we don't have to worry about
quoting. The delimiter of '\0' via find and xargs should be enough to
protect it.
Also note that cat'ting /dev/null is nonsensical in this instance.
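For comparison, here is a hedged sketch of a batching variant that is not from the thread. Since xargs with -i still runs one command per file, this version uses find's "{} +" form to pass many names to a single shell and truncates each with a bare redirection. The scratch directory and file names are invented for the example.

```shell
#!/bin/sh
# Scratch directory with two non-empty files and one empty file (made up).
tmp=$(mktemp -d)
printf 'data\n' > "$tmp/foo"
printf 'data\n' > "$tmp/this has spaces"
: > "$tmp/already empty"

# "{} +" hands the matches to as few sh invocations as possible. Inside,
# each name is its own positional parameter, so no quoting games are needed.
# ": > file" truncates without involving cat or cp.
find "$tmp" -maxdepth 1 -type f ! -empty \
    -exec sh -c 'for f do : > "$f"; done' sh {} +

left=$(find "$tmp" -maxdepth 1 -type f ! -empty | wc -l | tr -d ' ')
total=$(find "$tmp" -maxdepth 1 -type f | wc -l | tr -d ' ')
echo "files: $total, non-empty: $left"
rm -rf "$tmp"
```

Whether this beats the cp version on a given system is something to check with 'time', as suggested above.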
[1] Remember that there is nothing "illegal" about using such characters.
Any character is a valid one for filenames at the filesystem level. What
defines them as being a pain is the shell. Nothing more.
(!) [Ben] Not quite; '/' can't be used in a filename. Although "\n" can,
which (along with any high-bit characters) can create lots of pain for
anyone trying to work with them...
(!) [Jason] But ASCII NUL is an illegal character, right? So this will
always work?
find -print0 | xargs -0 command
Jason Creighton
(!) [Ben] Right; you can't use a NUL or a '/'. Other than those two,
anything is fair game... well, not really. :) Mostly, it's a REALLY good
way to screw yourself up; in general, it's not a good idea to use anything
outside of [a-zA-Z0-9_] as part of a filename.
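To see why the NUL delimiter matters, here is a small sketch that counts how many items xargs believes it received, first with find's default newline-separated output and then with -print0. The scratch directory and the file name containing a newline are invented for the demonstration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# A filename containing a newline: legal, if unwise.
nl_name=$(printf 'two\nlines')
: > "$tmp/$nl_name"
: > "$tmp/plain"

# Whitespace-delimited input: the odd name is split into two items.
by_line=$(find "$tmp" -type f | xargs -n 1 sh -c 'echo item' sh | wc -l | tr -d ' ')

# NUL-delimited input: exactly one item per file.
by_nul=$(find "$tmp" -type f -print0 | xargs -0 -n 1 sh -c 'echo item' sh | wc -l | tr -d ' ')

echo "newline-delimited items: $by_line"
echo "NUL-delimited items:     $by_nul"
rm -rf "$tmp"
```

With two files on disk, only the NUL-delimited pipeline reports two items.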