Issue
I have a small problem that I hope some Linux magician can help me with. I have a folder, /assets, that contains .jpg and .xml files, a ton of them.
Original XML structure:
<annotation>
<folder></folder>
<filename>Pro_jpeg.rf.77216510eeb475f923d5bb3bdb22ee11.jpg</filename>
<path>Pro_jpeg.rf.77216510eeb475f923d5bb3bdb22ee11.jpg</path>
<source>
<database>roboflow.ai</database>
</source>
<size>
<width>416</width>
<height>416</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>cheese</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>53</xmin>
<xmax>371</xmax>
<ymin>224</ymin>
<ymax>391</ymax>
</bndbox>
</object>
</annotation>
As you can see, the <bndbox> tag contains the parameters xmin, xmax, ymin and ymax, in that order. That order is wrong, and I'd like to update all my .xml files to use the order xmin, ymin, xmax, ymax instead, so that each file ends up like this:
<annotation>
<folder></folder>
<filename>Pro_jpeg.rf.77216510eeb475f923d5bb3bdb22ee11.jpg</filename>
<path>Pro_jpeg.rf.77216510eeb475f923d5bb3bdb22ee11.jpg</path>
<source>
<database>roboflow.ai</database>
</source>
<size>
<width>416</width>
<height>416</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>cheese</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>53</xmin>
<ymin>224</ymin>
<xmax>371</xmax>
<ymax>391</ymax>
</bndbox>
</object>
</annotation>
Any ideas how this can be done? Thanks a lot in advance!
Edit (what I currently have): I open a terminal, navigate to the folder with cd desktop, and run the block below. All the files are located in the desktop/cheese1 folder:
for file in cheese1/*.xml; do
xmlstarlet edit -L --delete "//bndbox/xmax" --insert "//bndbox/ymin" --to "//bndbox/xmin" \
--insert "//bndbox/xmax" --to "//bndbox/ymax" \
"$file"
done
Nothing really happens, so I'm not sure whether the syntax is wrong, the logic is wrong, or both.
Solution
A sed solution, which assumes that all input data is formatted as in the example (each tag on its own line) and that the wrong sequence is always xmin, xmax, ymin, ymax:
sed '/<bndbox>/,/<\/bndbox>/{/<xmax>/{h;d};/<ymin>/G}' cheese1/*.xml
As written, the command prints the result to standard output and leaves the files untouched. Once the output looks right, use sed -i to edit your files in place.
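If you would like a safety net when switching to in-place editing, one option is to give -i a backup suffix. The following is only a sketch: it assumes GNU sed (the default on most Linux systems), the .bak suffix is an arbitrary choice, and the scratch directory with a single a.xml is hypothetical sample data standing in for cheese1/.

```shell
# Scratch directory standing in for cheese1/ (hypothetical sample data)
dir=$(mktemp -d)
printf '%s\n' '<bndbox>' '<xmin>53</xmin>' '<xmax>371</xmax>' \
  '<ymin>224</ymin>' '<ymax>391</ymax>' '</bndbox>' > "$dir/a.xml"

# -i.bak edits each file in place and keeps the original as <name>.bak
sed -i.bak '/<bndbox>/,/<\/bndbox>/{/<xmax>/{h;d};/<ymin>/G}' "$dir"/*.xml

# Review the change before deleting the backups
# (diff exits non-zero when the files differ, hence the || true)
diff "$dir/a.xml.bak" "$dir/a.xml" || true
```

For the real data you would replace "$dir"/*.xml with cheese1/*.xml, and remove the .bak files once you are happy with the result.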
The way it works is pretty simple. Between the <bndbox> and </bndbox> lines, we save the <xmax> line to the hold space (h) and delete it from the output (d). Then, when we encounter the <ymin> line, we append the saved <xmax> line right after it (G).
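To make the steps easier to follow, here is the same script written with one command per line and run against a small sample. This is a sketch for illustration: the scratch directory and the file names sample.xml and fixed.xml are made up, and the sample contains only the part of the annotation that matters here.

```shell
# Hypothetical scratch directory and sample file
workdir=$(mktemp -d)
cat > "$workdir/sample.xml" <<'EOF'
<annotation>
<object>
<bndbox>
<xmin>53</xmin>
<xmax>371</xmax>
<ymin>224</ymin>
<ymax>391</ymax>
</bndbox>
</object>
</annotation>
EOF

# Same logic as the one-liner, one command per line:
#   h  copy the <xmax> line into the hold space
#   d  delete it from the output and move on to the next line
#   G  on the <ymin> line, append the held <xmax> line after it
sed '/<bndbox>/,/<\/bndbox>/{
  /<xmax>/{
    h
    d
  }
  /<ymin>/G
}' "$workdir/sample.xml" > "$workdir/fixed.xml"

# Collect the opening coordinate tags in the order they now appear
order=$(grep -o '<[xy]m[a-z]*>' "$workdir/fixed.xml" | tr -d '<>' | paste -sd ' ' -)
echo "$order"   # prints: xmin ymin xmax ymax
```

Because the hold space is overwritten every time an <xmax> line is seen, the same script should also cope with files that contain several <object> blocks, as long as each one follows the layout shown in the question.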
Answered By - Discussian Answer Checked By - Clifford M. (WPSolving Volunteer)