I’m thinking about putting the virtualenv for a Django web app I am making inside my git repository for the app. It seems like an easy way to keep deploys simple. Is there any reason why I shouldn’t do this?
I’m totally new to virtualenv, so there is a good chance this is a really stupid question.
I use pip freeze to get the packages I need into a requirements.txt file and add that to my repository. I tried to think of a reason why you would want to store the entire virtualenv, but I could not.
I used to do the same until I started using libraries that are compiled differently depending on the environment, such as PyCrypto. The PyCrypto build from my Mac wouldn’t work on Cygwin, and the Cygwin build wouldn’t work on Ubuntu.
It becomes an utter nightmare to manage the repository.
Either way, I found it easier to manage a requirements file generated by pip freeze than to keep it all in git. It’s cleaner too, since you avoid the commit spam for thousands of files as those libraries get updated.
Storing the virtualenv directory inside git will, as you noted, allow you to deploy the whole app by just doing a git clone (plus installing and configuring Apache/mod_wsgi). One potentially significant issue with this approach is that on Linux the full path gets hard-coded in the venv’s activate, django-admin.py, easy_install, and pip scripts. This means your virtualenv won’t entirely work if you want to use a different path, perhaps to run multiple virtual hosts on the same server. I think the website may actually work with the paths wrong in those files, but you would have problems the next time you tried to run pip.
The solution, already given, is to store enough information in git so that during the deploy you can create the virtualenv and do the necessary pip installs. Typically people run pip freeze to get the list, then store it in a file named requirements.txt. It can be loaded with pip install -r requirements.txt. RyanBrady already showed how you can string the deploy statements in a single line:
virtualenv --no-site-packages --distribute .env && source .env/bin/activate && pip install -r requirements.txt
Personally, I just put these in a shell script that I run after doing the git clone or git pull.
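For anyone who would rather script the same steps in Python than in shell, here is a minimal sketch. It is an assumption on my part, not what the one-liner above does: it uses Python 3's standard venv module instead of the virtualenv command, assumes a POSIX layout (a bin directory inside the environment), and the names deploy_commands and deploy are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def deploy_commands(repo_dir, env_name=".env", requirements="requirements.txt"):
    """Return the commands that rebuild the environment, without running them."""
    repo = Path(repo_dir)
    env_dir = repo / env_name
    return [
        # Create the environment with the stdlib venv module (an assumption;
        # the one-liner above uses the virtualenv command instead).
        [sys.executable, "-m", "venv", str(env_dir)],
        # Install the pinned packages with the environment's own interpreter.
        # Assumes a POSIX layout where the interpreter lives in <env>/bin.
        [str(env_dir / "bin" / "python"), "-m", "pip", "install",
         "-r", str(repo / requirements)],
    ]


def deploy(repo_dir):
    """Run each command in order, stopping at the first failure."""
    for command in deploy_commands(repo_dir):
        subprocess.run(command, check=True)
```

Calling deploy(".") from the repository root after a git clone or git pull would run both steps in order.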
Storing the virtualenv directory also makes it a bit trickier to handle pip upgrades, as you’ll have to manually add/remove and commit the files resulting from the upgrade. With a requirements.txt file, you just change the appropriate lines in requirements.txt and re-run pip install -r requirements.txt. As already noted, this also reduces “commit spam”.
I think one of the main problems is that the virtualenv might not be usable by other people. The reason is that it always uses absolute paths. So if your virtualenv was, for example, in /home/lyle/myenv/, it will assume the same location for everyone else using this repository (it must be exactly the same absolute path). You can’t presume other people use the same directory structure as you.
Better practice is for everybody to set up their own environment (be it with or without virtualenv) and install the libraries there. That also makes your code more usable across different platforms (Linux/Windows/Mac), since virtualenv is installed differently on each of them.
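To see the absolute-path problem in a checked-out virtualenv, something like the following rough sketch could help. It assumes a POSIX layout with scripts in a bin directory, and the helper name stale_scripts is made up for illustration. It only inspects the shebang line of each script, so it would not catch the path stored inside activate.

```python
from pathlib import Path


def stale_scripts(env_dir):
    """List scripts in env_dir/bin whose shebang names a Python outside env_dir/bin."""
    bin_dir = Path(env_dir).absolute() / "bin"
    stale = []
    for script in sorted(bin_dir.iterdir()):
        if not script.is_file():
            continue
        with open(script, "rb") as handle:
            first_line = handle.readline().decode("utf-8", errors="replace").strip()
        # Only scripts with an absolute interpreter path are of interest.
        if not first_line.startswith("#!/"):
            continue
        interpreter = Path(first_line[2:].split()[0])
        # A Python interpreter that is not this environment's own bin/python
        # means the path was hard-coded somewhere else.
        if interpreter.name.startswith("python") and interpreter.parent != bin_dir:
            stale.append(script)
    return stale
```

If this returns anything for a freshly cloned environment, scripts such as pip are still pointing at the directory where the virtualenv was originally created.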
If you know which operating systems your application will be running on, I would create one virtualenv for each system and include it in my repository. Then I would make my application detect which system it is running on and use the corresponding virtualenv.
The system could e.g. be identified using the platform module.
In fact, this is what I do with an in-house application I have written, and I can quickly add a new system’s virtualenv to it if needed. This way, I do not have to rely on pip being able to successfully download the software my application requires. I also do not have to worry about compiling packages such as psycopg2, which I use.
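As a rough illustration of the detection step, here is a minimal sketch using the platform module. The envs/<system>-<machine> directory layout and the function name venv_dir_for are hypothetical, chosen only for this example; they are not the layout of the in-house application described above.

```python
import platform
from pathlib import Path


def venv_dir_for(repo_dir, system=None, machine=None):
    """Return the bundled environment directory for the given (or current) platform."""
    # platform.system() gives e.g. "Linux", "Darwin" or "Windows";
    # platform.machine() gives e.g. "x86_64" or "arm64".
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    # Hypothetical layout: one virtualenv per platform under envs/.
    candidate = Path(repo_dir) / "envs" / f"{system}-{machine}"
    if not candidate.is_dir():
        raise RuntimeError(f"no bundled virtualenv for {system}-{machine}")
    return candidate
```

The application's startup code could call venv_dir_for with the repository root and then use the interpreter found inside the returned directory.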
If you do not know which operating system your application may run on, you are probably better off using pip freeze as suggested in other answers here.
If you are just setting up a development environment, use a pip freeze file, because that keeps the git repo clean.
For a production deployment, check in the whole venv folder. That makes your deployment more reproducible, removes the need for the libxxx-dev packages on the server, and avoids problems with internet access during the deploy.
So there are two repos: one for your main source code, which includes a requirements.txt, and an env repo, which contains the whole venv folder.