Download files listed in an HTTP index with wget


Author: Solène

Date: 16 June 2020

Tags: wget internet



Sometimes I need to download files over HTTP from a list on an "autoindex"
page, and it's always painful to find the right command for this.


The easy solution is **wget**, but you need to use the correct parameters:
wget has a lot of mirroring options, and you only want a few specific ones to
achieve this goal.


I ended up with the following command:


wget --continue --accept "*.tgz" --no-directories --no-parent --recursive http://ftp.fr.openbsd.org/pub/OpenBSD/6.7/amd64/


This will download every tgz file available at the address given as the last parameter.


The parameters restrict the download to **tgz** files, put the files in the
current working directory and, most importantly, keep wget from climbing to
the parent directory and starting to download from there. The `--continue`
parameter lets you interrupt wget and start it again: files already downloaded
are skipped and partially downloaded files are completed.
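
If you use this often, the command can be wrapped in a small shell function. This is only a minimal sketch: the name `fetch_index`, the optional pattern argument and the `DRY_RUN` variable are my own assumptions for illustration and are not part of wget.

```shell
#!/bin/sh
# fetch_index: hypothetical helper wrapping the wget command from this post.
# Usage: fetch_index URL [PATTERN]
#   URL      address of the autoindex page
#   PATTERN  file name pattern to accept, defaults to "*.tgz"
fetch_index() {
    url=$1
    pattern=${2:-*.tgz}
    # When DRY_RUN is set to a non-empty value, the command is printed
    # instead of being executed (illustration only, not a wget feature).
    ${DRY_RUN:+echo} wget --continue --accept "$pattern" \
        --no-directories --no-parent --recursive "$url"
}
```

Calling `fetch_index http://ftp.fr.openbsd.org/pub/OpenBSD/6.7/amd64/` should then run the same command as above, and a second argument such as `"*.iso"` would change the filter.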


**Do not reuse this command if files changed on the remote server**, because
the continue feature only works if your local file and the remote file are the
same. It simply looks at the local and remote names and asks the remote server
to resume the download at the byte offset matching the size of your local
file. If the remote file changed in the meantime, you will end up with a mix
of the old and new file.
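
To make the problem concrete, here is a small local simulation of a resumed download. It does not use wget or the network at all; the file names and contents are invented for the illustration, and it assumes `head -c` and `tail -c` are available on your system.

```shell
#!/bin/sh
# Local simulation of a resumed download; no network involved.
# All file names and contents below are made up for the example.
workdir=$(mktemp -d)

printf 'OLD-OLD-OLD-OLD-' > "$workdir/remote_v1"  # remote file at first run (16 bytes)
printf 'new-new-new-new-' > "$workdir/remote_v2"  # remote file after being replaced (16 bytes)

# The first run was interrupted after 8 bytes of the old version.
head -c 8 "$workdir/remote_v1" > "$workdir/local_file"

# Resuming only fetches the bytes located after the current local size,
# and those bytes now come from the new version.
offset=$(wc -c < "$workdir/local_file")
tail -c +"$((offset + 1))" "$workdir/remote_v2" >> "$workdir/local_file"

result=$(cat "$workdir/local_file")
echo "$result"   # prints OLD-OLD-new-new-

rm -rf "$workdir"
```

The resulting file has the expected size but matches neither version, which is why a changed remote file should be downloaded again from scratch.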


Obviously the FTP protocol would be better suited to this download job, but FTP
is less and less available, so I find **wget** to be a nice workaround for this.
