Issue
I want to get all .odt files in some folder recursively, extract their text content, and create .txt files from them (named accordingly, so A.odt -> A.txt).
Problem is, I am no good with shell apart from a few tricks.
grep for this is easy: grep -r -i --include \*.odt .
The man page of odt2txt says I need to specify --output=FILE
So for one file it would be odt2txt A.txt --output=A.txt
This works like a charm. But how to combine those two?
I face two problems here. Normally I would chain my commands with pipes (again, shell noob), like so:
grep -r -i --include \*.odt . | odt2txt $INPUT_FROM_GREP --output=$MISSING_NAME
But as you can see, odt2txt wants the file name as its first argument. And how do I get the name without the extension, so that odt2txt can use it for the output file?
I feel like I am not on the right track.
Solution
grep is used to find matching lines in files, but all you seem to want to do is find files whose names match a certain pattern. For that, one would use find. Also, I presume that odt2txt wants A.odt as its first argument, not A.txt.
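To illustrate the difference, here is a minimal sketch of listing files by name with find. The function name list_odt is a hypothetical helper used only for this example; the find options themselves are standard.

```shell
# list_odt is a hypothetical helper name used only for this illustration.
list_odt() {
    # -type f restricts matches to regular files. The pattern is quoted so
    # that the shell passes it to find unexpanded and find does the matching.
    find "${1:-.}" -type f -name '*.odt'
}
```

Calling list_odt with no argument searches the current directory. It prints one path per line and never opens the files, which is the part grep cannot do for you here.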
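I would use find to find the files, then use its -exec option to execute odt2txt. I'd use basename to strip off the .odt extension, and then add .txt. The basename call has to run inside a small sh -c script: written in backticks directly on the find command line, it would be expanded by your shell once, before find starts, so it would never see the actual filenames. So, something like this:
find . -name '*.odt' -exec sh -c 'odt2txt "$1" --output="$(basename "$1" .odt).txt"' sh {} ";"
Here the filename is handed to the script as $1. Because basename also strips the directory part, the .txt files end up in the directory you run the command from.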
Note that after -exec, {} denotes the filename, and the end of the command to execute is signalled by ";".
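A variant that writes each .txt file next to its .odt source, keeping the directory part of the path, could look like the sketch below. It assumes odt2txt is on your PATH and accepts the input file followed by --output=FILE, as described in the question. The function name convert_odt_tree is hypothetical.

```shell
# convert_odt_tree is a hypothetical helper name; odt2txt must be on PATH.
# Usage: convert_odt_tree [DIRECTORY]   (defaults to the current directory)
convert_odt_tree() {
    # "-exec ... {} +" hands the matched paths to the inline script in
    # batches, where they arrive as "$@". The word "sh" before {} only
    # fills $0. Inside the script, ${src%.odt} removes the trailing .odt
    # but keeps the directory, so A.txt is created beside A.odt.
    find "${1:-.}" -type f -name '*.odt' -exec sh -c '
        for src do
            odt2txt "$src" --output="${src%.odt}.txt"
        done
    ' sh {} +
}
```

Because every path is quoted inside the script, file and directory names containing spaces should be handled correctly. Files with the same name in different subdirectories do not overwrite each other, since each output stays in its own directory.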
Answered By - user8549967