I need to package my parser, specifically the avito_parser_cli.py file from the repository https://github.com/denis5417/avito_parser, in a Docker container.

I created a Dockerfile:

    FROM python:3
    ADD avito_parser.py avito_parser_cli.py requirements.txt /
    RUN python3 -m venv env
    CMD ['source', 'env/bin/activate']
    RUN pip3 install -r requirements.txt
    ENTRYPOINT ["python3", "avito_parser_cli.py"]

I create a virtual environment and install the dependencies I need in it. I use ENTRYPOINT instead of CMD to accept command line arguments at startup.

Then I built the image with sudo docker build -t avito_parser_cli .

For testing, I moved to another folder and started a container with docker run avito_parser_cli "трактор мтз" -t -m 300000 -s 'date' -a

All arguments were parsed correctly and the script printed what was expected. It was also supposed to write the result to the file output.csv; it reported no errors, but I could not find the file anywhere. When I run the script without Docker, it creates the file and writes to it successfully.

I have the following questions:

  1. Did I write the Dockerfile and build the image correctly? Is it customary to create a virtual environment inside the image and install all dependencies into it, or should I do something else (for example, skip the environment and install the dependencies directly)?
  2. Where did my output.csv go? How can I make it appear in the directory from which the container is launched, and is it customary to do so?
  3. How are such images usually distributed? Is it enough to just leave the Dockerfile in the project repository?
  • Your output.csv is inside the container. - Xander
  • @Alexander, and how do I get it out of there? - pinguin

2 answers

  1. A virtual environment inside the container is superfluous. The container is itself an isolated environment. If your application is the only thing running in the container (and Docker's philosophy leans toward exactly this setup), then all the required programs and libraries can be installed directly.
  2. Files created by programs running inside the container live in the container's own file system and are lost when the container is removed. To get them out of the container and keep them, mount a host folder into the container and make the program write to the mounted folder. This is described in detail in the documentation; examples and Windows-specific nuances can be viewed here. In short: docker run -v /path/on/host/machine:/path/inside/container my-docker-image .
  3. It depends on how much work you want to leave to the users of your container. If you want to spare them from downloading the sources and building the image themselves, you can publish your image on Docker Hub or in some alternative repository (for example, JCenter) so that your users can start the container simply with docker run .
  • A comprehensive answer, thank you very much. - pinguin
  • However, one question remains. The file output.csv is created in the root of the container, in the same place as the scripts and requirements.txt, so its path is /output.csv. What should I pass to -v to save the file in the folder from which I run the container? - pinguin
  • It would be best either to put the script in some folder rather than in the root, or to make the script write the file to a separate folder that contains only this file. Then there will be no problems mounting that folder from the host, and you are guaranteed not to expose anything extra. - fori1ton
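To illustrate the last comment, here is a minimal sketch of making a script write its CSV into a dedicated folder. It is not taken from avito_parser: the function names, the OUTPUT_DIR environment variable, and the /data default are all assumptions made up for this example.

```python
import csv
import os
from pathlib import Path


def resolve_output_path(filename="output.csv", env_var="OUTPUT_DIR", default_dir="/data"):
    """Return the path for `filename` inside a dedicated output directory.

    OUTPUT_DIR and /data are hypothetical names chosen for this sketch.
    """
    out_dir = Path(os.environ.get(env_var, default_dir))
    # Create the directory if it does not exist yet (for example, when nothing is mounted).
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


def write_rows(rows, path):
    """Write an iterable of rows to a CSV file at `path`."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
```

If the script writes to /data like this, then something along the lines of docker run -v "$PWD":/data avito_parser_cli ... should leave output.csv in the folder the container was started from, and only that folder is shared with the container.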
  1. You can get a file from the container like this:

docker cp <container_id>:/file/path/inside/container /file/path/outside/container

  1. Personally, I used such images for Jenkins and had two options: either create a tar.gz archive from a saved, ready-made container, or build the image from scratch from a Dockerfile.