Monday, January 31, 2022

[SOLVED] Parallel download using Curl command line utility

Issue

I want to download some pages from a website and I did it successfully using curl but I was wondering if somehow curl downloads multiple pages at a time just like most of the download managers do, it will speed up things a little bit. Is it possible to do it in curl command line utility?

The current command I am using is

curl 'http://www...../?page=[1-10]' 2>&1 > 1.html

Here I am downloading pages from 1 to 10 and storing them in a file named 1.html.

Also, is it possible for curl to write output of each URL to separate file say URL.html, where URL is the actual URL of the page under process.


Solution

Well, curl is just a simple UNIX process. You can have as many of these curl processes running in parallel and sending their outputs to different files.

curl can use the filename part of the URL to generate the local file. Just use the -O option (man curl for details).

You could use something like the following

urls="http://example.com/?page1.html http://example.com?page2.html" # add more URLs here

for url in $urls; do
   # run the curl job in the background so we can start another job
   # and disable the progress bar (-s)
   echo "fetching $url"
   curl $url -O -s &
done
wait #wait for all background jobs to terminate


Answered By - nimrodm
Answer Checked By - Candace Johnson (WPSolving Volunteer)