Issue
I'm trying to write a bash script that uses a regex pattern and takes this as input:
#
#------------------------------------------- spaceholder ---------------------------------------------------------------------------
#
#@E2E-1 @id:1
Scenario: Login & Search: B2B_PKG_IN >> BE
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
#@E2E-2 @id:32
Scenario: Login & Search: B2B_PKG_IN >> NL
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
#
#------------------------------------------- B2B_PKG_3PA ---------------------------------------------------------------------------
#
@E2E-3 @id:3
Scenario: Login & Search: B2B_PKG_3PA >> BE
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
I tested it with this ChatGPT-generated pattern:
((@[^\n]+|#@[^\n]+)?\s*)?Scenario:[^\n]*\n(?:[^\n]*\n)*?\n
It works exactly as I want, and on a regex testing website the output looks like this:
#@E2E-1 @id:1
Scenario: Login & Search: B2B_PKG_IN >> BE
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
#@E2E-2 @id:32
Scenario: Login & Search: B2B_PKG_IN >> NL
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
@E2E-3 @id:3
Scenario: Login & Search: B2B_PKG_3PA >> BE
Given I am on the login page
When I enter <username> and incorrect password multiple times
Then I should be locked out of my account
And I should see a lockout message
I then tried it with grep in bash like this:
while IFS= read -r -d '' block; do
current_scenarios+=("$block")
done < <(grep -Pzo "$pattern" "$current_file")
For some reason the result includes the "# spaceholder" part. I've tried grep -E, grep -oP, and grep -Po, but nothing seems to work. Please help.
Solution
Assuming:
- Each block starts with a line beginning with #@ or @.
- The next line in the block starts with the string "Scenario:".
- A block does not contain blank lines.
- A block ends when it is followed by two or more blank lines (or the end of the file).
- You want to fill a bash array current_scenarios with the matched blocks.
- Each line of $current_file appears to have two leading whitespace characters, but this is an editing issue and can be ignored.
Then would you please try:
pattern='(?sm)^#?@[^\n]+\n*^Scenario:.*?\n(?=\n{2}|\n*\Z)'
# grep -z terminates each match with a NUL byte, so read with -d ''
while IFS= read -r -d '' block; do
    current_scenarios+=("$block")
done < <(grep -Pzo "$pattern" "$current_file")
printf "%s\n\n" "${current_scenarios[@]}" # just to see the results
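To try the snippet above without a real feature file, here is a minimal sketch. It assumes GNU grep built with PCRE support (-P). The sample content and the temporary file are made up for illustration, with blocks separated by two blank lines as listed in the assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU grep with -P (PCRE) support.
# The sample content below is hypothetical test data, not the asker's real file.
current_file=$(mktemp)
printf '%s\n' \
  '#' \
  '#--- spaceholder ---' \
  '#' \
  '' \
  '' \
  '#@E2E-1 @id:1' \
  'Scenario: first' \
  'Given I am on the login page' \
  '' \
  '' \
  '@E2E-3 @id:3' \
  'Scenario: second' \
  'Given I am on the login page' > "$current_file"

pattern='(?sm)^#?@[^\n]+\n*^Scenario:.*?\n(?=\n{2}|\n*\Z)'

current_scenarios=()
# grep -z emits each match terminated by NUL, so read with -d ''.
while IFS= read -r -d '' block; do
  current_scenarios+=("$block")
done < <(grep -Pzo "$pattern" "$current_file")

echo "blocks found: ${#current_scenarios[@]}"
# Show only the first line of every captured block.
for block in "${current_scenarios[@]}"; do
  printf '%s\n' "${block%%$'\n'*}"
done
rm -f "$current_file"
```

With this sample, the array should end up with two elements, and the spaceholder comment lines should not appear in either of them.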
Explanation of the regex (?sm)^#?@[^\n]+\n*^Scenario:.*?\n(?=\n{2}|\n*\Z):
- The (?s) option makes a dot (.) match a newline.
- The (?m) option makes ^ and $ match the start/end of each line, as well as the start/end of the input string.
- ^#?@ matches either #@ or @ at the start of a line.
- [^\n]+\n* consumes the rest of the current line.
- ^Scenario: matches the start of the next line.
- .*?\n matches the following lines, as few as possible (non-greedy).
- (?=\n{2}|\n*\Z) is a lookahead assertion which matches two blank lines or the end of the input.
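The effect of the (?s) option can be checked in isolation. The following is a small sketch, again assuming GNU grep with -P support. The sample text and the helper name count_matches are hypothetical and only serve the demonstration. It runs a shortened form of the pattern with and without (?s) and counts the NUL-terminated matches that grep -z prints.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU grep with -P (PCRE) support; the input is made-up sample text.
sample=$'@tag\nScenario: demo\nGiven a step\n'

# Hypothetical helper: count the NUL-terminated matches emitted by grep -z -o.
count_matches() {
  printf '%s' "$sample" | grep -Pzo "$1" | tr -cd '\0' | wc -c
}

# Same tail of the pattern, once with (?s) and once without it.
with_s=$(count_matches '(?sm)^Scenario:.*?\n(?=\n{2}|\n*\Z)')
without_s=$(count_matches '(?m)^Scenario:.*?\n(?=\n{2}|\n*\Z)')

echo "with (?s):    $((with_s)) match(es)"
echo "without (?s): $((without_s)) match(es)"
```

Under these assumptions the first count should be 1 and the second 0: without (?s) the dot cannot cross a newline, so .*? never reaches the last line of the block and the lookahead is never satisfied.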
Answered By - tshiono Answer Checked By - Robin (WPSolving Admin)