From: Jan Simon on
Dear Max,

> > While renaming can be triggered to work only, if the destination does not exist, the creation/opening of a file cannot be done thread-safe, as far as I understood.
>
> That's a great idea! But did you mean that the renaming (moving) can be triggered to work only if the ORIGIN exists?

No, I meant destination.

> One could use the presence of the destination file as a break on the access of other files:
> fileattrib('test.lock','-w') % change the attribute so that if the file exists, the use of it as a destination will cause an error (status of copyfile==0). Windows platforms?
> status = copyfile('test.lock','test.lock.lock') % copy the file. if cannot - try again later

The signature is COPYFILE(Source, Dest).
So FILEATTRIB(Source, '-w') protects the source from writing - but is this useful?
The existence of a non-protected destination file is enough to make MOVEFILE or COPYFILE return 0.

Jan
From: Max on
Dear Jan,

> > One could use the presence of the destination file as a break on the access of other files:
> > fileattrib('test.lock','-w') % change the attribute so that if the file exists, the use of it as a destination will cause an error (status of copyfile==0). Windows platforms?
> > status = copyfile('test.lock','test.lock.lock') % copy the file. if cannot - try again later
>
> COPYFILE(Source, Dest)
> Then FILEATTRIB(Source, '-w') protects the source from writing - but is this useful?
> The existence of a non-protected destination file is enough to let MOVEFILE or COPYFILE return a 0.

No, you can try it. The attributes of the file are preserved when using the "copyfile" command (at least on Linux). Thus the destination file will also be write-protected, and copyfile will return status == 1 (successful copy) only if the destination file does not exist.
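For anyone who wants to try this, here is a minimal sketch of the experiment. The scratch file names built from tempname are placeholders, not the files discussed above, and the statuses are printed rather than assumed, since the result may differ between platforms and MATLAB versions.

```matlab
% Scratch files in the temp directory (hypothetical names for illustration).
src = [tempname '.lock'];
dst = [src '.lock'];
fclose(fopen(src, 'w'));          % create an empty source file
fileattrib(src, '-w');            % make the source read-only

status1 = copyfile(src, dst);     % destination does not exist yet
status2 = copyfile(src, dst);     % destination exists now (read-only if attributes were copied)
fprintf('first copy: %d, second copy: %d\n', status1, status2);

% Clean up: restore write permission before deleting.
files = {src, dst};
for k = 1:numel(files)
    if exist(files{k}, 'file')
        fileattrib(files{k}, '+w');
        delete(files{k});
    end
end
```

If the second status is 0, the read-only destination acts as the "break" described above on that platform.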

> > > While renaming can be triggered to work only, if the destination does not exist, the creation/opening of a file cannot be done thread-safe, as far as I understood.
> >
> > That's a great idea! But did you mean that the renaming (moving) can be triggered to work only if the ORIGIN exists?
>
> No, I meant destination.

Sorry, but then I don't get it. If the DESTINATION file (the file you copy TO) exists but is not set to read-only, Matlab just copies the content of the SOURCE file into the destination file, overwriting its content. I couldn't find a way to tell Matlab to move (or copy) the file only if the destination does not exist. On the other hand, if the ORIGIN does not exist, there is nothing to move and the move fails. So the solution would look something like this:

% Try to move the file until it succeeds:
while ~movefile(lockSource, lockDest)
    pause(0.1)
end
% <-- file processing -->
movefile(lockDest, lockSource)  % make the lock file available for the next process

Right?

Max
From: Jan Simon on
Dear Max,

> while ~movefile(lockSource,lockDest)
> pause(.1)
> end
> <-- file processing -->
> movefile(lockDest,lockSource)
>
> Right?

Right, this works (as far as I can see).
I thought of another mechanism:
file1 = tempname;
fclose(fopen(file1, 'w'));
lockfile = 'D:\Temp\locked';
while ~movefile(file1, lockfile)
    pause(0.1);
end
% Now file1 has been moved to lockfile.
% Another instance cannot do this again:
file2 = tempname;
fclose(fopen(file2, 'w'));
disp(movefile(file2, lockfile))  % *must* be 0 until:
% ...
delete(lockfile);
% Now other instances can move another file to lockfile.

So you toggle the name of the file, I push the file to the lock and delete it afterwards. I cannot see a strong advantage for one of the solutions.
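The push-and-delete variant can be wrapped in a small helper. This is only a sketch: acquireLock, its timeout argument, and the error identifier are hypothetical names, and it relies on movefile returning 0 while the lock file exists, which is exactly the behaviour under discussion in this thread. It inherits any race condition of movefile itself.

```matlab
function release = acquireLock(lockfile, timeout)
% acquireLock  Push a private temp file onto LOCKFILE, retrying until
% TIMEOUT seconds have passed. Hypothetical helper, not a MATLAB built-in.
% Returns an onCleanup object that deletes the lock file when cleared.
token = tempname;                  % private file that is pushed into the lock
fclose(fopen(token, 'w'));
t0 = tic;
while ~movefile(token, lockfile)   % assumed to return 0 while lockfile exists
    if toc(t0) > timeout
        delete(token);             % do not leave the private file behind
        error('acquireLock:timeout', ...
              'Could not acquire %s within %g s.', lockfile, timeout);
    end
    pause(0.1);
end
release = onCleanup(@() delete(lockfile));
end
```

Usage would be `release = acquireLock('D:\Temp\locked', 30);` before the file processing and `clear release` afterwards. The onCleanup object also deletes the lock file if the processing code throws an error, so a crashed instance is less likely to leave a stale lock.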
As far as I can see, both methods are robust, but in the earlier thread (already mentioned) it was stated that the problem has not been solved for decades. What am I missing?

Jan
From: Max on
> So you toggle the name of the file, I push the file to the lock and delete it afterwards. I cannot see a strong advantage for one of the solutions.
Agreed. It looks like it's mostly a matter of taste which method to use.

> As far as I can see, both methods are robust, but in the earlier thread (already mentioned) it was stated that the problem has not been solved for decades. What am I missing?

If you mean why others didn't adopt these methods, I don't know... It works perfectly fine for me.

Thanks,
Max
From: Walter Roberson on
Jan Simon wrote:
> Dear Max,
>
>> while ~movefile(lockSource,lockDest)
>> pause(.1)
>> end
>> <-- file processing -->
>> movefile(lockDest,lockSource)
>> Right?
>
> Right, this works (as far as I can see).
> I thought of another mechanism:
> file1 = tempname; fclose(fopen(file1, 'w'));
> lockfile = 'D:\Temp\locked';
> while ~movefile(file1, lockfile)
> pause(0.1);
> end
> % Now file has been moved to lockfile.
> % Another instance cannot do this again:
> file2 = tempname; fclose(fopen(file2, 'w'));
> disp(movefile(file2, lockfile)) % *must* be 0 until:
> ...
> delete(lockfile);
> % Now other instances can move another file to lockfile.
>
> So you toggle the name of the file, I push the file to the lock and
> delete it afterwards. I cannot see a strong advantage for one of the
> solutions. As far as I can see, both methods are robust, but in the
> earlier thread (already mentioned) it was stated that the problem has
> not been solved for decades. What am I missing?

If you examine the unix definition of "rename",
http://www.opengroup.org/onlinepubs/000095399/functions/rename.html
you will see that it does not exactly match the functionality of Matlab's
movefile() according to "help movefile", the contents of which differ from
"doc movefile":

MOVEFILE Move file or directory.
[STATUS,MESSAGE,MESSAGEID] = MOVEFILE(SOURCE,DESTINATION,MODE) moves the
file or directory SOURCE to the new file or directory DESTINATION. Both
SOURCE and DESTINATION may be either an absolute pathname or a pathname
relative to the current directory. When MODE is used, MOVEFILE moves SOURCE
to DESTINATION, even when DESTINATION is read-only. The DESTINATION's
writable attribute state is preserved. See NOTE 1.


In particular, the Unix definition has it that the destination is removed and
then the renaming happens, a process that does not preserve or examine any
attributes such as the "writable" attribute.

Examining this, we see that Matlab's movefile() cannot be implemented
atomically in Unix, and is thus open to race conditions.

Unix's rename() system call is atomic (according to POSIX.1-1990), but that
statement assumes that any shared file system ensures the atomicity against
multiple accesses (possibly from different nodes). That is a guarantee that
SMB between dissimilar systems doesn't even try to make, that NFSv2 doesn't
try to make, and that NFSv3 tries to make but real implementations tend to
fail at.


The Unix definition of rename() says,

"If the old argument points to the pathname of a file that is not a directory,
the new argument shall not point to the pathname of a directory. If the link
named by the new argument exists, it shall be removed and old renamed to new.
In this case, a link named new shall remain visible to other processes
throughout the renaming operation and refer either to the file referred to by
new or old before the operation began."

Note the possibility there that other processes shall continue to see either
the old file or the new file while the rename is taking place. This makes it
dodgy to write code without race conditions.
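Whether movefile() on a given platform follows the replace-the-destination behaviour quoted above or refuses an existing destination can be checked directly. The following is a minimal sketch with arbitrary scratch file names; it prints what happens rather than assuming a result, and it says nothing about atomicity, only about the observable outcome when a single process does the move.

```matlab
% Scratch files in the temp directory (arbitrary names for illustration).
src = [tempname '.txt'];
dst = [tempname '.txt'];
fid = fopen(src, 'w'); fprintf(fid, 'from source');      fclose(fid);
fid = fopen(dst, 'w'); fprintf(fid, 'from destination'); fclose(fid);

% Move onto a destination that already exists and is writable.
[status, msg] = movefile(src, dst);
fprintf('status = %d, message = "%s"\n', status, msg);
fprintf('destination now contains: %s\n', fileread(dst));
fprintf('source still exists: %d\n', exist(src, 'file') == 2);

% Clean up whatever is left.
if exist(src, 'file'), delete(src); end
if exist(dst, 'file'), delete(dst); end
```

A status of 1 with the destination containing "from source" means the existing file was silently replaced, in which case an existing lock file would not stop a second instance.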


As Matlab's movefile() cannot be implemented atomically in underlying
operating system semantics, the implication is that to do a proper rename
requires a lock of some sort -- but if a lock of some sort existed, you would
be using _that_ instead of trying to fudge things by using movefile().