Issue
I'm new to Bash scripting and would like to test extracting data. My plan is to write the output of an AWS query, run from a .sh file (Bash script), to a .txt file. I have a bigger purpose, but I want to get this working first before I plug in my main AWS query.
My .sh script is below.
NOTE: I put all values to -confidential- without any quotations. Just the value
#!/bin/bash
variables () {
lambdas="confidential"
}
variables
queryId=$(aws logs start-query --log-group-name $lambdas --start-time 1695704400 --end-time 1695707999 --query-string 'filter @message like "requestId" | parse @message "operationName: '*'," as API | parse @message "status: *," as Status | parse @message "channel: '*'," as Channel | stats count(*) as Total by API,Channel,Status | sort API,Total desc' | jq '.queryId' | awk -F '"' '{print $2}')
results=$(aws logs get-query-results --query-id $queryId | jq .)
echo "$results" > testextract5.txt
Whenever I run that .sh file, this is the result .txt below:
{
"results": [],
"statistics": {
"recordsMatched": 0,
"recordsScanned": 0,
"bytesScanned": 0
},
"status": "Running"
}
BUT when I manually copy and paste each command from the .sh file into the git-bash command line, it works! See below:
jomasangkay@DESKTOP-R1DM530:/mnt/c/Users/personal pc/Desktop/company/Bashscript/test$ queryId=$(aws logs start-query --log-group-name $lambdas --start-time 1695704400 --end-time 1695707999 --query-string 'filter @message like "requestId" | parse @message "operationName: '*'," as API | parse @message "status: *," as Status | parse @message "channel: '*'" as Channel | stats count(*) as Total by API,Channel,Status | sort API,Total desc' | jq '.queryId' | awk -F '"' '{print $2}')
jomasangkay@DESKTOP-R1DM530:/mnt/c/Users/personal pc/Desktop/company/Bashscript/test$ results=$(aws logs get-query-results --query-id $queryId | jq .)
jomasangkay@DESKTOP-R1DM530:/mnt/c/Users/personal pc/Desktop/company/Bashscript/test$ echo "$results" > testextract5.txt
RESULT extracted .txt below,
{
"field": "API",
"value": "'Confidential'"
},
{
"field": "Channel",
"value": "'Confidential'"
},
{
"field": "Status",
"value": "Confidential"
},
{
"field": "Total",
"value": "1"
}
]
],
"statistics": {
"recordsMatched": 16507,
"recordsScanned": 69000,
"bytesScanned": 29835564
},
"status": "Complete"
}
Action taken/Future Plan:
I tried to troubleshoot with bash -x script.sh, but I don't see any issue at all. I also tried rolling back to WSL version 1 and installing the latest update of the AWS CLI. Neither helped. Is there something wrong with my script?
I expect that running the .sh file produces the same data as the RESULT .txt above. My main purpose in creating this Bash script is that AWS CloudWatch Logs Insights has a limitation related to Lambda, so I plan to write a Bash script that uses an array in $lambdas. This post is only for my trial and error; I need to make this work to move forward.
==========================
I tried double-quoting $lambdas and $queryId; same result, no data. See below:
jomasangkay@DESKTOP-R1DM530:/mnt/c/Users/personal pc/Desktop/company/Bashscript/test$ bash -x testing2.sh
+ variables
+ lambdas=confidential
++ jq .queryId
++ awk -F '"' '{print $2}'
++ aws logs start-query --log-group-name confidential --start-time 1695704400 --end-time 1695707999 --query-string 'filter @message like "requestId" | parse @message "operationName: *," as API | parse @message "status: *," as Status | parse @message "channel: *," as Channel | stats count(*) as Total by API,Channel,Status | sort API,Total desc'
+ queryId=0f6c6fd6-da5e-40ed-9c74-ad22791455be
++ aws logs get-query-results --query-id 0f6c6fd6-da5e-40ed-9c74-ad22791455be
++ jq .
+ results='{
"results": [],
"statistics": {
"recordsMatched": 0,
"recordsScanned": 0,
"bytesScanned": 0
},
"status": "Running"
}'
+ echo '{
"results": [],
"statistics": {
"recordsMatched": 0,
"recordsScanned": 0,
"bytesScanned": 0
},
"status": "Running"
}'
===============================
Updated as of Oct 3, 2023 - 05:15 AM SINGAPORE TIME
I isolated the issue by creating two .txt files called test1.txt and test2.txt, with the contents below:
test1.txt
c86c7f6b-95e4-4d17-9417-2f6987989316
test2.txt
c86c7f6b-95e4-4d17-9417-2f6987989316
As you can see above, they have exactly the same value (no spaces, no unusual text whatsoever).
See my new modified code below:
#!/bin/bash
variables () {
lambdas="confidential"
}
variables
queryId=$(aws logs start-query --log-group-name "$lambdas" --start-time 1695704400 --end-time 1695707999 --query-string 'filter @message like "requestId"' | jq '.queryId' | awk -F '"' '{print $2}')
#NOT WORKING
echo $queryId > test1.txt
#WORKING
echo c86c7f6b-95e4-4d17-9417-2f6987989316 > test2.txt
queryIds=$(awk '{print $1}' test2.txt)
results=$(aws logs get-query-results --query-id $queryIds | jq .)
echo "$results" > testextract.txt
When I assign the variable from test1.txt, it does not work; no data is extracted.
When I assign the variable from test2.txt, it works as expected.
For some reason, it looks like the query ID value in test1.txt is converted to some kind of special character; I'm not sure.
Solution
Repeat aws logs get-query-results until results are available
As we see in the updated output in the question, the response to aws logs get-query-results has "status": "Running". The query is simply still running on AWS, so the results are not ready to download yet. This is presumably also why pasting the commands by hand worked: the pause between commands gave the query time to finish. So we just need to call aws logs get-query-results again a bit later.
As per the documentation, the valid status values are Scheduled | Running | Complete | Failed | Cancelled | Timeout | Unknown. You can try to get the results repeatedly in a loop until the status is not one of Scheduled or Running:
while true; do
    results=$(aws logs get-query-results --query-id "$queryId" | jq .)
    status=$(jq -r .status <<< "$results")
    echo "status=$status"
    [[ $status == Scheduled || $status == Running ]] || break
    sleep 1
done
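The loop above polls forever if the query never leaves the Running state. A minimal sketch of a bounded variant follows, assuming it is acceptable to give up after a fixed number of attempts. The function names fetch_status and wait_for_query, the exit codes, and the default limits are hypothetical and not part of the AWS CLI. The sketch uses the CLI's global --query and --output text options to read the status field, so this part does not need jq.

```shell
#!/bin/bash

# Hypothetical helper: print only the status field of a Logs Insights query.
# --query and --output are global AWS CLI options.
fetch_status() {
    aws logs get-query-results --query-id "$1" --query 'status' --output text
}

# Hypothetical helper: poll until the query reaches a final status.
# Usage: wait_for_query QUERY_ID [MAX_ATTEMPTS] [DELAY_SECONDS]
# Returns 0 on Complete, 1 on any other final status,
# 2 if the status call itself fails, 3 if attempts run out.
wait_for_query() {
    local query_id=$1 max_attempts=${2:-60} delay=${3:-1}
    local attempt status
    for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
        status=$(fetch_status "$query_id") || return 2
        case $status in
            Scheduled|Running) sleep "$delay" ;;   # not ready yet, try again
            Complete) return 0 ;;
            *) echo "query ended with status: $status" >&2; return 1 ;;
        esac
    done
    echo "gave up after $max_attempts attempts" >&2
    return 3
}

# Example use with the variables from the question (left commented out):
# if wait_for_query "$queryId" 120 2; then
#     aws logs get-query-results --query-id "$queryId" | jq . > testextract5.txt
# fi
```

Separating the status check from the final download keeps the output file from being written until the query has actually completed.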
The red herring
Originally, this part of the Bash code looked suspicious:
queryId=$(aws ... 'filter ... operationName: '*', ... channel: '*', ...' | ...)
                                             ^^^               ^^^
That is, a "naked" *, outside of the single-quoted expression, which looks unintentional and error-prone.
The intention seems to be to include '*' inside the single-quoted expression. We can do that by ending the ongoing single-quoted expression (the one that started at 'filter ...) with a single quote, then adding '*' in a double-quoted expression, and then starting another single-quoted expression to continue the one we interrupted:
queryId=$(aws ... 'filter ... operationName: '"'*'"', ... channel: '"'*'"', ...' | ...)
                                             ^ ending the previous single-quoted expression
                                              ^^^^^ a double-quoted expression
                                                   ^ starting a new single-quoted expression
Fixing the line in the posted code:
queryId=$(aws logs start-query --log-group-name "$lambdas" --start-time 1695704400 --end-time 1695707999 --query-string 'filter @message like "requestId" | parse @message "operationName: '"'*'"'," as API | parse @message "status: *," as Status | parse @message "channel: '"'*'"'," as Channel | stats count(*) as Total by API,Channel,Status | sort API,Total desc' | jq '.queryId' | awk -F '"' '{print $2}')
Although this was not the real issue, it's still good to fix.
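To see the difference between the two quoting styles without calling AWS, here is a small sketch that stores a shortened fragment of the query string in two variables and prints them. The variable names original and fixed are just for illustration. Note one assumption: these are variable assignments, where Bash performs no filename expansion, whereas in the posted command the string is a command argument, so the unquoted * could in principle be expanded if it happened to match a file name.

```shell
#!/bin/bash

# As posted: the single quotes around * end and restart the quoted string,
# so the * itself is unquoted and the quotes vanish from the value.
original='operationName: '*','

# Fixed: close the single-quoted string, add '*' inside double quotes,
# then reopen the single-quoted string.
fixed='operationName: '"'*'"','

printf '%s\n' "$original"   # prints: operationName: *,
printf '%s\n' "$fixed"      # prints: operationName: '*',
```

The first value has lost the quotes around the asterisk, which matches the bash -x trace in the question, where the query string shows operationName: *, without quotes.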
Answered By - janos Answer Checked By - Candace Johnson (WPSolving Volunteer)