Issue
I have a FASTA file with the following lines:
>A_1000
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
I would like to convert the first line as follows:
>Initialword/A_1000/Finalword
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
I found a similar question (Add words at beginning and end of a FASTA header line with sed) whose answer let me add text to the beginning and end of the header as I needed. However, it puts Finalword on the next line.
I ran the following:
sed 's%^>\(.*\)%>Initialword/\1/Finalword%' input.fasta > output.fasta
Which returns:
>Initialword/A_0101M/Finalword
ACTTTCGATCTCTTGTAGATCTGTTCTC...CACM
ACTTTCGATCTCTTGTAGATCTGTTCTC...CACM
But in the FASTA file it looks like:
>Initialword/A_0101
/Finalword
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
ACTTTCGATCTCTTGTAGATCTGTTCTC...CAC
How can I fix this to just add the text to the beginning and end of the header? What is the M at the end of each line in the file?
Thank you
Solution
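The M (usually displayed as ^M) is a carriage return: your file has Windows-style CRLF line endings, so (.*) captures the carriage return at the end of the header and /Finalword lands after it. First convert your file with dos2unix and then use GNU sed: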
dos2unix <input.fasta | sed -E 's%^>(.*)%>Initialword/\1/Finalword%' >output.fasta
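If dos2unix is not installed, a rough equivalent is to delete the carriage returns with tr before running sed. The sketch below is only an illustration under stated assumptions: demo.fasta and demo_out.fasta are hypothetical file names, the sequences are shortened stand-ins, and it assumes the file contains no carriage returns that need to be kept.

```shell
# Build a small test file with CRLF line endings (hypothetical name and data).
printf '>A_1000\r\nACTTTCGATC\r\nACTTTCGATC\r\n' > demo.fasta

# Optional check: od -c prints the carriage return as \r before each \n.
head -n 1 demo.fasta | od -c

# Delete every carriage return, then wrap the header line.
# -E enables extended regular expressions so ( ) need no backslashes.
tr -d '\r' < demo.fasta \
  | sed -E 's%^>(.*)%>Initialword/\1/Finalword%' > demo_out.fasta

cat demo_out.fasta
```

Note that tr -d '\r' removes all carriage returns, not only those at line ends, which should be fine for a FASTA file but may not be for other formats.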
Answered By - Cyrus