Issue
I have a file, pom.xml, with the following content:
...
<artifactId>test-module</artifactId>
<version>1.0.0-SNAPSHOT</version>
<packaging>pom</packaging>
<parent>
<groupId>com.example</groupId>
<artifactId>parent-module</artifactId>
<version>1.1.1</version>
</parent>
...
I'm interested in obtaining just the version of test-module, namely 1.0.0-SNAPSHOT. I've tried running this command, but it doesn't give me the desired result:
sed -e 's/.*<artifactId>test-module<\/artifactId>\s+<version>\(.*\)<\/version>.*/\1/' -e 't' -e 'd' pom.xml
I based the command above on what I observed when running this one:
sed -e 's/.*<version>\(.*\)<\/version>.*/\1/' -e 't' -e 'd' pom.xml
which produces this output:
1.0.0-SNAPSHOT
1.1.1
Any help would be appreciated! Thank you!
Solution
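sed processes its input one line at a time, so a pattern that spans the artifactId line and the version line never matches. Assuming the version line always comes immediately after the artifactId line, you can grep for test-module, print one line of trailing context, and then extract the version (in the example, the content from the question is saved as stackoverflow.txt):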
❯ cat stackoverflow.txt | grep -A1 '<artifactId>test-module</artifactId>' | sed -n 's,<version>\(.*\)</version>,\1,p'
1.0.0-SNAPSHOT
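To try the pipeline without a real project, here is a minimal sketch that writes the fragment from the question to a temporary file. The temporary directory and the `version` variable are assumptions made for illustration, and the file is only the excerpt shown in the question, not a complete POM.

```shell
# Write the fragment from the question to a throwaway file (hypothetical location).
tmpdir=$(mktemp -d)
cat > "$tmpdir/pom.xml" <<'EOF'
<artifactId>test-module</artifactId>
<version>1.0.0-SNAPSHOT</version>
<packaging>pom</packaging>
<parent>
<groupId>com.example</groupId>
<artifactId>parent-module</artifactId>
<version>1.1.1</version>
</parent>
EOF

# grep keeps the matching artifactId line plus one trailing line;
# sed then prints only the text captured between the version tags.
version=$(grep -A1 '<artifactId>test-module</artifactId>' "$tmpdir/pom.xml" \
  | sed -n 's,<version>\(.*\)</version>,\1,p')
echo "$version"

rm -rf "$tmpdir"
```

Passing the file name directly to grep also avoids the extra cat process; the result is the same.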
Explanation
grep has an option to print a given number of lines after each matching line. From grep's help:
-A, --after-context=NUM print NUM lines of trailing context
So, with your file, this is the output right after the grep command:
❯ cat stackoverflow.txt | grep -A1 '<artifactId>test-module</artifactId>'
<artifactId>test-module</artifactId>
<version>1.0.0-SNAPSHOT</version>
At this point you use sed to extract the version, but here's the catch: you don't want sed to print every line (it would also print the artifactId line that grep matched), so you invoke it with the silent option and then explicitly print only the lines where the substitution matched (see the p flag at the end of the sed command).
Let's break down the sed command (sed -n 's,<version>\(.*\)</version>,\1,p'):
From the manual:
-n, --quiet, --silent
suppress automatic printing of pattern space
This means that auto-print is disabled and you need to print the pattern space explicitly. We achieve this by adding the p (as in print) flag at the end of the substitute command, which prints the line only when a substitution was made.
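The effect of -n together with p is easiest to see side by side. The following is a small sketch using the two lines that grep returns; the variable names `input`, `auto`, and `quiet` are made up for the illustration.

```shell
# The two lines produced by the grep step in the answer.
input='<artifactId>test-module</artifactId>
<version>1.0.0-SNAPSHOT</version>'

# Default behaviour: every line is auto-printed, whether or not the substitution matched.
auto=$(printf '%s\n' "$input" | sed 's,<version>\(.*\)</version>,\1,')

# With -n, auto-print is off; the p flag prints only lines where the substitution succeeded.
quiet=$(printf '%s\n' "$input" | sed -n 's,<version>\(.*\)</version>,\1,p')

printf 'without -n:\n%s\n\nwith -n and p:\n%s\n' "$auto" "$quiet"
```

Without -n the artifactId line passes through unchanged, which is exactly the extra output the answer wants to avoid.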
The substitution itself replaces the text matched by the regex with the capture group alone (i.e. what is inside the escaped parentheses). Text outside the match, such as leading indentation, is left in place.
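Real pom.xml files are usually indented, and the pattern above only covers the tags themselves, so any indentation would survive in the output. A possible variant, sketched here on an assumed four-space-indented line, adds `.*` on both sides of the tags (as the commands in the question already do) so the whole line is replaced. The variable names are hypothetical.

```shell
# An indented version line, as it would typically appear in a real POM (assumed layout).
line='    <version>1.0.0-SNAPSHOT</version>'

# Pattern from the answer: only the tag portion is replaced, so the indentation stays.
kept=$(printf '%s\n' "$line" | sed -n 's,<version>\(.*\)</version>,\1,p')

# Variant with .* on both sides: the whole line is replaced by the captured version.
trimmed=$(printf '%s\n' "$line" | sed -n 's,.*<version>\(.*\)</version>.*,\1,p')

# Brackets make the leading whitespace visible.
printf '[%s]\n' "$kept"
printf '[%s]\n' "$trimmed"
```

The first line of output keeps the four leading spaces inside the brackets; the second contains the bare version string.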
If you are interested in knowing more about this approach there are specifically 2 doc pages that are useful:
Answered By - Elisiano Petrini Answer Checked By - Terry (WPSolving Volunteer)