but the storm is pulling the master, and the master is almost 7 gigabytes
You won't save much space this way, and here is why. The git clone command runs in two steps:
- Copy the whole repository from the remote. The data is downloaded as packed blobs (Binary Large OBjects), together with the commit structure and the relationships between commits. All commits on all branches are downloaded, including the branch that takes up 7 GB.
- Check out a specific commit. By default this is the commit that HEAD points to in the remote repository. As Daniel-664 noted, you can change the default with the -b remote_branch parameter and immediately get the branch you need.
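The two steps can be observed on a throwaway repository. The sketch below is only an illustration: the directory layout, the branch name with_logs and the file names are made up for the demo, and a local path stands in for the remote URL.

```shell
#!/bin/sh
set -eu
# Throwaway demo; every name and path here is hypothetical.
work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git config user.email demo@example.com
git config user.name Demo
echo code > app.txt
git add app.txt
git commit -q -m "code"
main=$(git symbolic-ref --short HEAD)   # default branch name (master or main)

# A second branch that carries the log file.
git checkout -q -b with_logs
echo "lots of log lines" > big.log
git add big.log
git commit -q -m "logs"
git checkout -q "$main"

# -b only selects which branch is checked out after the clone...
git clone -q -b "$main" "$work/origin" "$work/clone"
cd "$work/clone"

# ...the other branch was still downloaded as a remote-tracking branch.
remote_branches=$(git branch -r)
echo "$remote_branches"
```

The listing includes origin/with_logs even though only the default branch was requested. If only one branch is really needed, git clone also accepts --single-branch together with -b, which limits the download to that branch's history.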
The problem is that as long as at least one commit contains a huge log file or other such files, that data stays in the repository. If the log was changed and committed several times, every version is stored. So the blobs from that branch can take up even more space than the files of any single commit.
To avoid keeping these unnecessary files forever, you can delete the corresponding commits and binary files from the repository. There is one significant limitation: this rewrites the repository history. If there are other developers, they will need to clone the repository again.
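To see which files are responsible, you can list the largest blobs in the whole history. This is a rough sketch on a demo repository (the file app.log and its contents are invented); in a real repository only the pipeline at the end is needed.

```shell
#!/bin/sh
set -eu
# Throwaway demo repository; names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo

# Commit two different versions of the same log file.
echo "first version of the log" > app.log
git add app.log
git commit -q -m "log v1"
echo "second, longer version of the log" > app.log
git add app.log
git commit -q -m "log v2"

# List every blob reachable from any ref, largest first:
# columns are type, object id, size in bytes, path.
largest=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  awk '$1 == "blob"' |
  sort -k3,3nr |
  head -n 10)
echo "$largest"

# Overall size of the object store, in human-readable units.
git count-objects -vH
```

Both versions of app.log show up as separate blobs, which is exactly why a log that is committed repeatedly keeps inflating the repository.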
There are several ways to clean up the repository:
- Simply delete the branch where the logs were committed, then run the garbage collector. Not suitable if the branch's commits contain other valuable changes.
- Use filter-branch --tree-filter or filter-branch --index-filter. It edits every commit in history and "re-saves" it (it actually creates a new chain of commits, but their attributes stay the same, and the branch will point to the new last commit).
- Use the bfg-repo-cleaner utility, which does the same as filter-branch but is faster and more convenient:
    # clone our repository
    git clone --mirror git://example.com/my-repo.git
    # a good moment to make a backup (-r because my-repo.git is a directory)
    cp -r my-repo.git my-repo.git.backup
    # remove the unwanted folder with BFG
    java -jar bfg.jar --delete-folders logs my-repo.git
    # go in, expire the reflog and run the garbage collector
    cd my-repo.git
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    # push back to GitLab
    git push -f
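For comparison, the filter-branch variant from the list above might look roughly like this. It is a sketch on a demo repository: the folder name logs matches the BFG example, everything else (files, commit messages) is invented. Note that current Git documentation discourages filter-branch and recommends the separate git filter-repo tool instead.

```shell
#!/bin/sh
set -eu
# Throwaway demo repository; names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
mkdir logs
echo code > app.txt
echo "log line" > logs/app.log
git add .
git commit -q -m "code and logs"
echo more >> app.txt
git commit -q -am "more code"

# Rewrite every commit on every ref, dropping logs/ from the index.
# --ignore-unmatch keeps commits without logs/ from failing the filter;
# --prune-empty drops commits that become empty after the removal.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
  --index-filter 'git rm -r -q --cached --ignore-unmatch logs' \
  --prune-empty -- --all

# filter-branch keeps backups under refs/original; remove them and the
# reflog entries so the old commits become unreachable, then collect garbage.
git for-each-ref --format='%(refname)' refs/original |
  while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now

# Count objects under logs/ that are still reachable from any ref.
leftover=$(git rev-list --objects --all | grep -c ' logs' || true)
echo "objects under logs/: $leftover"
```

On a real repository the rewritten branches would then have to be force-pushed, with the same consequence as for BFG: other developers need a fresh clone.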
In the future, to keep logs, databases, and the like out of the repository, use .gitignore: