Issue
The pca.ndx file contains blocks of numbers separated by a pattern line, in the following format:
48789 48790 48791 48792 48793 48794 48795 48796 48797 48798 48799 48800 48801 48802 48803
48804 48805 48806 48807 48808 48809 48810 48811 48812 48813 48814 48815 48816 48817 48818
48819 48820 48821 48822 48823 48824 48825 48826 48827 48828 48829 48830 48831 48832 48833
48834 48835 48836 48837 48838 48839 48840 48841 48842 48843 48844 48845 48846 48847 48848
48849 48850 48851 48852 48853 48854 48855 48856 48857 48858 48859 48860 48861 48862 48863
48864 48865 48866 48867 48868 48869 48870 48871 48872 48873 48874 48875 48876 48877 48878
48879 48880 48881 48882 48883 48884 48885 48886 48887 48888 48889 48890 48891 48892 48893
48894 48895 48896 48897 48898 48899 48900 48901 48902 48903 48904 48905 48906 48907 48908
48909 48910 48911 48912 48913 48914 48915 48916 48917 48918 48919 48920 48921 48922 48923
48924 48925 48926 48927 48928 48929 48930 48931 48932 48933 48934 48935 48936 48937 48938
48939 48940 48941 48942 48943 48944 48945 48946 48947 48948 48949 48950 48951 48952
[ something_&_C-alpha ] ## << this is the pattern separator !!
91 106 121 138 154 164 181 193 207 222 237 253 273 297 308
329 345 365 386 410 427 444 461 476 493 508 518 533 540 556
566 584 590 600 620 626 641 658 674 688 715 721 740 765 771
782 793 807 824 831 848 864 871 895 912 931 941 960 979 986
998 1010 1029 1043 1067 1091 1112 1124 1135 1150 1170 1187 1201 1218 1237
1254 1271 1290 1315 1321 1335 1345 1360 1374 1384 1405 1420 1441 1461 1475
1497 1516 1526 1540 1551 1570 1590 1605 1616 1623 1642 1656 1680 1687 1711
1727 1743 1753 1772 1791 1798 1818 1825 1846 1870 1889 1899 1918 1935 1951
1972 1989 2006 2013 2032 2046 2053 2073 2092 2099 2116 2132 2146 2170 2190
2206 2222 2234 2254 2271 2290 2307 2324 2335 2354 2364 2388 2412 2431 2441
2458 2482 2489 2496 2520 2536 2546 2556 2575 2589 2608 2615 2629 2644 2650
2669 2688 2702 2718 2737 2753 2769 2788 2795 2811 2827 2846 2865 2872 2889
2909 2925 2941 2965 2989 3009 3029 3051
I need to detect the separator line by the part of the pattern that corresponds to the "&_C-alpha" keyword and then remove everything before [ something_&_C-alpha ], so the expected output should be:
[ something_&_C-alpha ] ## << this is the pattern separator !!
91 106 121 138 154 164 181 193 207 222 237 253 273 297 308
329 345 365 386 410 427 444 461 476 493 508 518 533 540 556
566 584 590 600 620 626 641 658 674 688 715 721 740 765 771
782 793 807 824 831 848 864 871 895 912 931 941 960 979 986
998 1010 1029 1043 1067 1091 1112 1124 1135 1150 1170 1187 1201 1218 1237
1254 1271 1290 1315 1321 1335 1345 1360 1374 1384 1405 1420 1441 1461 1475
1497 1516 1526 1540 1551 1570 1590 1605 1616 1623 1642 1656 1680 1687 1711
1727 1743 1753 1772 1791 1798 1818 1825 1846 1870 1889 1899 1918 1935 1951
1972 1989 2006 2013 2032 2046 2053 2073 2092 2099 2116 2132 2146 2170 2190
2206 2222 2234 2254 2271 2290 2307 2324 2335 2354 2364 2388 2412 2431 2441
2458 2482 2489 2496 2520 2536 2546 2556 2575 2589 2608 2615 2629 2644 2650
2669 2688 2702 2718 2737 2753 2769 2788 2795 2811 2827 2846 2865 2872 2889
2909 2925 2941 2965 2989 3009 3029 3051
In principle, this can be achieved directly with sed:
sed -i -e '/ *_&_C-alpha \]~/p' -e '1,/ *_&_C-alpha \]/d' shared.ndx
This does what I need, but unfortunately it also removes the separator line [ something_&_C-alpha ] itself from the output file.
Solution
Using any awk:
$ awk '/\[ .*_&_C-alpha ]/{f=1} f' pca.ndx
[ something_&_C-alpha ] ## << this is the pattern separator !!
91 106 121 138 154 164 181 193 207 222 237 253 273 297 308
329 345 365 386 410 427 444 461 476 493 508 518 533 540 556
566 584 590 600 620 626 641 658 674 688 715 721 740 765 771
782 793 807 824 831 848 864 871 895 912 931 941 960 979 986
998 1010 1029 1043 1067 1091 1112 1124 1135 1150 1170 1187 1201 1218 1237
1254 1271 1290 1315 1321 1335 1345 1360 1374 1384 1405 1420 1441 1461 1475
1497 1516 1526 1540 1551 1570 1590 1605 1616 1623 1642 1656 1680 1687 1711
1727 1743 1753 1772 1791 1798 1818 1825 1846 1870 1889 1899 1918 1935 1951
1972 1989 2006 2013 2032 2046 2053 2073 2092 2099 2116 2132 2146 2170 2190
2206 2222 2234 2254 2271 2290 2307 2324 2335 2354 2364 2388 2412 2431 2441
2458 2482 2489 2496 2520 2536 2546 2556 2575 2589 2608 2615 2629 2644 2650
2669 2688 2702 2718 2737 2753 2769 2788 2795 2811 2827 2846 2865 2872 2889
2909 2925 2941 2965 2989 3009 3029 3051
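Since the attempt in the question edits the file in place, here is a minimal sketch of one way to do the same with awk, assuming a temporary file next to the original is acceptable. The sample file and its contents are made up for illustration; only the awk program comes from the answer above.

```shell
# Hypothetical sample standing in for pca.ndx (contents are made up).
dir=$(mktemp -d)
printf '%s\n' '48789 48790 48791' '[ something_&_C-alpha ]' '91 106 121' > "$dir/pca.ndx"

# Write the filtered result to a temporary file, then replace the original
# only if awk exited successfully.
awk '/\[ .*_&_C-alpha ]/{f=1} f' "$dir/pca.ndx" > "$dir/pca.ndx.tmp" &&
    mv "$dir/pca.ndx.tmp" "$dir/pca.ndx"

cat "$dir/pca.ndx"
```

Because the `mv` only runs when awk succeeds, a failed awk run leaves the original file untouched.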
or any sed:
$ sed -n '/\[ .*_&_C-alpha ]/,$p' pca.ndx
[ something_&_C-alpha ] ## << this is the pattern separator !!
91 106 121 138 154 164 181 193 207 222 237 253 273 297 308
329 345 365 386 410 427 444 461 476 493 508 518 533 540 556
566 584 590 600 620 626 641 658 674 688 715 721 740 765 771
782 793 807 824 831 848 864 871 895 912 931 941 960 979 986
998 1010 1029 1043 1067 1091 1112 1124 1135 1150 1170 1187 1201 1218 1237
1254 1271 1290 1315 1321 1335 1345 1360 1374 1384 1405 1420 1441 1461 1475
1497 1516 1526 1540 1551 1570 1590 1605 1616 1623 1642 1656 1680 1687 1711
1727 1743 1753 1772 1791 1798 1818 1825 1846 1870 1889 1899 1918 1935 1951
1972 1989 2006 2013 2032 2046 2053 2073 2092 2099 2116 2132 2146 2170 2190
2206 2222 2234 2254 2271 2290 2307 2324 2335 2354 2364 2388 2412 2431 2441
2458 2482 2489 2496 2520 2536 2546 2556 2575 2589 2608 2615 2629 2644 2650
2669 2688 2702 2718 2737 2753 2769 2788 2795 2811 2827 2846 2865 2872 2889
2909 2925 2941 2965 2989 3009 3029 3051
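To edit the file in place, as the `-i` in the question does, the same script can probably be combined with `-i`. This is a sketch that assumes GNU sed (BSD/macOS sed expects a backup-suffix argument after `-i`), and the sample file is made up for illustration.

```shell
# Hypothetical sample standing in for pca.ndx (contents are made up).
dir=$(mktemp -d)
printf '%s\n' '48789 48790 48791' '[ something_&_C-alpha ]' '91 106 121' > "$dir/pca.ndx"

# Assumption: GNU sed. -n suppresses automatic printing, so only the lines
# selected by the range (separator line through end of file) are written back.
sed -i -n '/\[ .*_&_C-alpha ]/,$p' "$dir/pca.ndx"

cat "$dir/pca.ndx"
```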
EDIT in response to the OP's request in a comment to use a different regexp that they provided (by the way, the leading <blank>* in / *_&_C-alpha ]/ does nothing useful in this regexp, since it matches zero or more blanks, so /_&_C-alpha ]/ would function the same way in this context):
$ sed -n '/ *_&_C-alpha ]/,$p' pca.ndx
[ something_&_C-alpha ] ## << this is the pattern separator !!
91 106 121 138 154 164 181 193 207 222 237 253 273 297 308
329 345 365 386 410 427 444 461 476 493 508 518 533 540 556
566 584 590 600 620 626 641 658 674 688 715 721 740 765 771
782 793 807 824 831 848 864 871 895 912 931 941 960 979 986
998 1010 1029 1043 1067 1091 1112 1124 1135 1150 1170 1187 1201 1218 1237
1254 1271 1290 1315 1321 1335 1345 1360 1374 1384 1405 1420 1441 1461 1475
1497 1516 1526 1540 1551 1570 1590 1605 1616 1623 1642 1656 1680 1687 1711
1727 1743 1753 1772 1791 1798 1818 1825 1846 1870 1889 1899 1918 1935 1951
1972 1989 2006 2013 2032 2046 2053 2073 2092 2099 2116 2132 2146 2170 2190
2206 2222 2234 2254 2271 2290 2307 2324 2335 2354 2364 2388 2412 2431 2441
2458 2482 2489 2496 2520 2536 2546 2556 2575 2589 2608 2615 2629 2644 2650
2669 2688 2702 2718 2737 2753 2769 2788 2795 2811 2827 2846 2865 2872 2889
2909 2925 2941 2965 2989 3009 3029 3051
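The claim that the leading blank-star makes no difference can be checked with a small sketch like the one below. The sample file and the variable names are hypothetical and only serve the comparison.

```shell
# Hypothetical sample file (contents are made up).
dir=$(mktemp -d)
printf '%s\n' '48789 48790 48791' '[ something_&_C-alpha ]' '91 106 121' > "$dir/sample.ndx"

# Run both regexps over the same input and capture what each one prints.
with_blank=$(sed -n '/ *_&_C-alpha ]/,$p' "$dir/sample.ndx")
without_blank=$(sed -n '/_&_C-alpha ]/,$p' "$dir/sample.ndx")

# Compare the two results.
if [ "$with_blank" = "$without_blank" ]; then
    echo "same output"
else
    echo "different output"
fi
```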
Answered By - Ed Morton
Answer Checked By - David Goodson (WPSolving Volunteer)