Friday, July 29, 2022

[SOLVED] Using sed to extract a date from the middle of a string when the day may have one or two digits

Issue

My strings are:

  • "TESTING_ABC_1-JAN-2022.BCK-gz;1"
  • "TESTING_ABC_30-JAN-2022.BCK-gz;1"

In bash, when I run:

echo "TESTING_ABC_1-JAN-2022.BCK-gz;1" | sed 's/.*\([0-9]\{1,2\}-[A-Z][A-Z][A-Z]-[0-9][0-9][0-9][0-9]\).*/\1/'

it returns 1-JAN-2022, which is what I want.

But when I run:

echo "TESTING_ABC_30-JAN-2022.BCK-gz;1" | sed 's/.*\([0-9]\{1,2\}-[A-Z][A-Z][A-Z]-[0-9][0-9][0-9][0-9]\).*/\1/'

I get 0-JAN-2022, but I want 30-JAN-2022.

How can I extract the date from the string I pass in with a single command that handles both single-digit and double-digit days, such as "1-JAN-2022" and "30-JAN-2022"?


Solution

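Your sed command returns 0-JAN-2022 because the leading .* is greedy: it consumes as much of the line as it can, including the 3, and leaves only a single digit for [0-9]\{1,2\} to match.

It is much easier to use awk and split the line on _ and . instead of matching the whole date with a regex: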

cat file

TESTING_ABC_1-JAN-2022.BCK-gz;1
TESTING_ABC_30-JAN-2022.BCK-gz;1

awk -F '[_.]' '{print $3}' file

1-JAN-2022
30-JAN-2022
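
Since the question pipes a single string rather than reading a file, the same awk call can be wrapped in a small shell function. This is only a sketch: the name extract_date_awk is made up for illustration, and it assumes the date is always the third field, i.e. exactly two underscores come before it.

```shell
# Hypothetical helper (not part of the original answer): print the date
# portion of one file name passed as the first argument.
extract_date_awk() {
    # Split the name on "_" or "." and print the third field.
    printf '%s\n' "$1" | awk -F '[_.]' '{print $3}'
}

extract_date_awk "TESTING_ABC_1-JAN-2022.BCK-gz;1"    # prints 1-JAN-2022
extract_date_awk "TESTING_ABC_30-JAN-2022.BCK-gz;1"   # prints 30-JAN-2022
```

If a name contains an extra underscore before the date, the field number shifts, and a pattern-based match is the safer choice.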

Another option is to use grep -Eo with a regex that matches dates in D-MON-YYYY or DD-MON-YYYY format and prints only the matched part:

grep -Eo '[0-9]{1,2}-[A-Z]{3}-[0-9]{4}' file

1-JAN-2022
30-JAN-2022
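
If sed is a requirement, one possible fix is to keep the leading .* from swallowing the first digit of the day by anchoring on the underscore that comes right before the date. The sketch below assumes the date is always preceded by an underscore, as in the sample names; extract_date_sed is a hypothetical name used only for illustration.

```shell
# Hypothetical helper (not part of the original answer): extract the date
# with sed, anchoring on the "_" immediately before the day.
extract_date_sed() {
    # -n together with the p flag prints only lines where the substitution
    # matched, so a name without a date produces no output.
    printf '%s\n' "$1" |
        sed -n 's/.*_\([0-9]\{1,2\}-[A-Z]\{3\}-[0-9]\{4\}\).*/\1/p'
}

extract_date_sed "TESTING_ABC_1-JAN-2022.BCK-gz;1"    # prints 1-JAN-2022
extract_date_sed "TESTING_ABC_30-JAN-2022.BCK-gz;1"   # prints 30-JAN-2022
```

Because .*_ must end on an underscore, both digits of the day stay available for [0-9]\{1,2\}.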


Answered By - anubhava
Answer Checked By - Terry (WPSolving Volunteer)