I have encountered some behaviour I didn’t expect from Subversion, and discovered this almost accidentally: A modified file is not being flagged as modified.
I have a unit test involving a Microsoft Excel spreadsheet as an input file. The unit test calculates a CRC checksum; my test started failing due to the checksum changing.
The test file is stored in SVN, and has MIME type “application/octet-stream”, and is therefore considered as binary by SVN.
I get the same behaviour from TortoiseSVN and the SVN command line client, in this case both based on SVN 1.6: When the file is opened in Excel, the fact that it is open must be encoded in the file itself; SVN shows that the file is modified. However, when the file is closed again (without saving), it seems to return to its unmodified state: that is,
svn status does not list the Excel file;
svn diff produces no output anyway due to the data being binary.
The catch is that the file is no longer byte-for-byte identical to the file stored in the repository. (If a fresh copy is exported, it does not binary compare with the opened-and-closed copy.) The file is apparently unchanged from the user’s point of view, so in a semantic sense, the SVN response is reasonable. But not syntactically; and SVN is essentially syntactic.
The part for which I can’t find a reason is why SVN does not flag the file as modified. I can’t imagine SVN has any special handling for Excel files (and in any case the MIME type is not specifically one associated with MS Excel); there is no SVN keyword property defined. Likewise, I can’t imagine Excel knows anything about the contents of the hidden .svn subdirectory in which the SVN working copy information is stored.
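For anyone wanting to reproduce the comparison, here is a minimal sketch of a binary compare by checksum. The function names are hypothetical, and SHA-256 is used only for illustration (the unit test above uses a CRC); the two paths would be the opened-and-closed working copy file and a freshly exported copy.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(path_a, path_b):
    """True if both files have identical content (compared by digest)."""
    return file_digest(path_a) == file_digest(path_b)
```

If `same_bytes(working_copy_file, exported_file)` is false while svn status stays silent, the content really has changed underneath SVN.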
Do you have any clues as to what is going on here?
Excel always locks files on open, setting the timestamp to the current date. When you close without saving, Excel will revert the timestamp back. This causes SVN to ignore the file.
As for the changed contents, I’m not sure. Can you reproduce the problem?
Subversion assumes that the “last modification” timestamp is not lying. If the timestamp is unchanged, the content of the file is not checked for changes. I think all version control systems do this; checking for local modifications would be unbearably slow otherwise.
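A rough sketch of this kind of shortcut, to show why a restored timestamp hides a content change. This is not SVN's actual code: the function names are made up, and also comparing the file size is an assumption added for illustration.

```python
import os


def record_state(path):
    """Snapshot the cheap metadata a status check could rely on."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def looks_modified(path, recorded):
    """Cheap check: report a change only if size or mtime differs.

    The file content is never read, so a same-size edit followed by
    resetting the timestamp goes unnoticed.
    """
    return record_state(path) != recorded
```

With this heuristic, a program that rewrites a few bytes in place and then puts the old modification time back (for example via `os.utime`) leaves `looks_modified` returning false, which matches the behaviour described in the question.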
edit: for the details of how SVN works in this regard, questions.c in the source of the SVN working copy library is a good start.
Is it possible that the file is locked by Excel while it is open, so SVN can’t access it to see whether it has changed?