3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
docker/.env
docker/output
docker/urls.txt
7 changes: 7 additions & 0 deletions Dockerfile
@@ -0,0 +1,7 @@
FROM python:3.8-slim AS requirements
COPY requirements.txt .
RUN pip install -r requirements.txt

FROM requirements
COPY archive-org-downloader.py .
ENTRYPOINT ["/usr/local/bin/python", "archive-org-downloader.py"]
33 changes: 33 additions & 0 deletions docker/README.md
@@ -0,0 +1,33 @@
### Install Docker Desktop

Download and install Docker Desktop from the [official product page](https://www.docker.com/products/docker-desktop/).


### Configure Downloader

```shell
cd ./docker
```

Add the URLs of the books you want to download to a `./urls.txt` file:

```shell
echo 'https://archive.org/details/XXXX' >> ./urls.txt
```
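
To queue several books at once, the file can also be written in one step. This is a minimal sketch that assumes the downloader reads one URL per line; `XXXX` and `YYYY` are placeholder identifiers, not real items, and the final check is only a rough guard against typos:

```shell
# Write several URLs at once, one per line (XXXX and YYYY are placeholders).
# Note: '>' overwrites any existing ./urls.txt; use '>>' to append instead.
cat > ./urls.txt <<'EOF'
https://archive.org/details/XXXX
https://archive.org/details/YYYY
EOF

# Count lines that do not look like archive.org detail pages.
# grep exits non-zero when the count is 0, hence the '|| true'.
bad_lines=$(grep -cv '^https://archive\.org/details/' ./urls.txt || true)
echo "lines not matching the expected URL shape: ${bad_lines}"
```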

Create a `./.env` file with the following contents, replacing the placeholder values with your archive.org credentials:

```shell
AD_EMAIL='ARCHIVE.ORG_EMAIL'
AD_PASSWORD='ARCHIVE.ORG_PASSWORD'
AD_RESOLUTION='0'
```
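
Before running the downloader, it may help to confirm that the file defines all three variables. The helper below is a hypothetical sketch (`check_env` is not part of this repository); it only checks that each name appears at the start of a line and does not validate the values:

```shell
# check_env FILE: print each required variable name that FILE does not define.
# (Hypothetical helper for illustration; not shipped with the downloader.)
check_env() {
  for name in AD_EMAIL AD_PASSWORD AD_RESOLUTION; do
    grep -q "^${name}=" "$1" 2>/dev/null || echo "${name}"
  done
}

missing=$(check_env ./.env)
if [ -n "${missing}" ]; then
  # Unquoted on purpose so the names print on one line.
  echo "missing from ./.env:" ${missing}
else
  echo "./.env defines all three variables"
fi
```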


### Run the Downloader

```shell
docker-compose run archive_downloader
```

The downloaded PDFs are written to `./output/*.pdf`.
20 changes: 20 additions & 0 deletions docker/docker-compose.yml
@@ -0,0 +1,20 @@
version: "3.9"
services:
  archive_downloader:
    build:
      context: ..
      dockerfile: Dockerfile
    volumes:
      - ./output:/ad_output:rw
      - ./urls.txt:/ad_urls.txt:ro
    command:
      - "-e"
      - "$AD_EMAIL"
      - "-p"
      - "$AD_PASSWORD"
      - "-r"
      - "$AD_RESOLUTION"
      - "-d"
      - "/ad_output"
      - "-f"
      - "/ad_urls.txt"