Tuesday, October 25, 2022

[SOLVED] Sed or awk - Remove one part of duplicate string

Issue

Greetings to the scripting gurus. I have a real poltergeist phenomenon on a virtual machine. I have a script that changes the URL of an image. On a native Linux OS it works as expected with sed, but on a virtual machine (I tried two different machines with different configurations, and different terminals too) I get a different result. This is the case:

My original line:

<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

The correct result on native Linux:

<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='/home/valentin/recetas/public_html/convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

The wrong result on the virtual machine:

<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='/home/valentin/recetas/public_html//home/valentin/recetas/public_html/convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

The sed command:

 sed -i "s|img src='|img src='${BASEDIR}/${PUBLIC_BASE}/|" "${i}"

Where

BASEDIR=/home/valentin/recetas
PUBLIC_BASE=public_html

I tried without variables. I thought the problem might be the quotes in the substitution pattern or the trailing slash, but I tried different options, with and without variables, without success. The final result I need is the correct one shown above. I really don't know how to avoid the duplicated URL in the sed replacement or, if that is not possible, how to remove the second, duplicated string. Thanks in advance.


Solution

Is it possible you could have run the sed -i twice?

I can (obviously?) reproduce the issue by running the sed -i call twice, e.g.:

$ cat url
<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

$ sed -i "s|img src='|img src='${BASEDIR}/${PUBLIC_BASE}/|" url
$ cat url
<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='/home/valentin/recetas/public_html/convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

$ sed -i "s|img src='|img src='${BASEDIR}/${PUBLIC_BASE}/|" url
$ cat url
<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='/home/valentin/recetas/public_html//home/valentin/recetas/public_html/convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>
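To confirm that a file really was processed twice, one option is to count the lines that contain the prefix back to back. The following is only a sketch: the scratch file created with mktemp and the variable names prefix, page and doubled are illustrative and not part of the original script.

```shell
BASEDIR=/home/valentin/recetas
PUBLIC_BASE=public_html
prefix="${BASEDIR}/${PUBLIC_BASE}/"

# Scratch file standing in for the real HTML page (hypothetical).
page=$(mktemp)
printf '%s\n' "<div class=\"col-sm-6 col-md-4 col-lg-3\"><figure> <img src='convert_IMG-20181121-WA0009.jpg' class=\"img-fluid galeria\"></figure></div>" > "$page"

# Simulate the suspected double run of the original command.
sed -i "s|img src='|img src='${prefix}|" "$page"
sed -i "s|img src='|img src='${prefix}|" "$page"

# -F treats the pattern as a fixed string, -c counts matching lines.
doubled=$(grep -cF "${prefix}${prefix}" "$page")
echo "lines with a doubled prefix: ${doubled}"
```

A count above zero suggests the substitution ran more than once on that file, for example because the script was started twice or the file is matched by two loops.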

One idea for removing the duplicates consists of removing all instances of ${BASEDIR}/${PUBLIC_BASE}/ and then re-adding a single instance:

$ sed -i "s|${BASEDIR}/${PUBLIC_BASE}/||g; s|img src='|img src='${BASEDIR}/${PUBLIC_BASE}/|" url
$ cat url
<div class="col-sm-6 col-md-4 col-lg-3"><figure> <img src='/home/valentin/recetas/public_html/convert_IMG-20181121-WA0009.jpg' class="img-fluid galeria"></figure></div>

This could also be used against the original, unmodified url file, with the understanding that the first replacement (s|${BASEDIR}/${PUBLIC_BASE}/||g) will then be a no-op.
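An alternative, offered only as a sketch, is to make the substitution itself safe to repeat by skipping any line that already carries the prefix. It assumes GNU sed (as the -i usage above implies) and that the prefix contains no regex metacharacters or | characters. The add_prefix function and the scratch file are hypothetical names used for illustration.

```shell
BASEDIR=/home/valentin/recetas
PUBLIC_BASE=public_html
prefix="${BASEDIR}/${PUBLIC_BASE}/"

# Scratch file standing in for the real HTML page (hypothetical).
url=$(mktemp)
printf '%s\n' "<div class=\"col-sm-6 col-md-4 col-lg-3\"><figure> <img src='convert_IMG-20181121-WA0009.jpg' class=\"img-fluid galeria\"></figure></div>" > "$url"

# The address \|...| selects lines that already have the prefix;
# the ! negates it, so the substitution only runs on the other lines.
add_prefix() {
    sed -i "\|img src='${prefix}|! s|img src='|img src='${prefix}|" "$1"
}

add_prefix "$url"
add_prefix "$url"   # second run leaves the file unchanged
cat "$url"
```

The guard works per line, so a line holding several img tags where only some already have the prefix would be skipped entirely. In that case the remove-then-re-add approach above is the safer choice.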



Answered By - markp-fuso
Answer Checked By - Gilberto Lyons (WPSolving Admin)