checking in >= 256k file fatally corrupts rcs file

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

checking in >= 256k file fatally corrupts rcs file

Don Hatch
Encountered this using cygwin's rcs 5.8.1-1 and 5.8.2-1.
Earlier version 5.7-11 does not have this bug
(so I've been specifically reverting to rcs 5.7-11
every time I run setup.exe).

Checking in a text file of size >= 256k
corrupts the rcs file, irretrievably losing most of the contents
and making future operations impossible.

The attached script demonstrates.
Run it in an empty directory (or a directory containing only the script).

Output of cygcheck -s -v -r also attached as cygcheck.out.

--
Don Hatch
[hidden email]
http://www.plunk.org/~hatch/

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

RUNME (1K) Download Attachment
cygcheck.out (227K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Warren Young
On 10/8/2013 04:22, Don Hatch wrote:
>
> Checking in a text file of size >= 256k
> corrupts the rcs file, irretrievably losing most of the contents

It's documented in the rcs NEWS file:

     - Env var RCS_MEM_LIMIT controls stdio threshold.

       For speed, RCS uses memory-based routines for files up to
       256 kilobytes, and stream-based (stdio) routines otherwise.
       You can change this threshold value by setting the environment
       variable ‘RCS_MEM_LIMIT’ to a non-negative integer, measured in
       kilobytes.  An empty ‘RCS_MEM_LIMIT’ value is silently ignored.

So, use the new environment variable, or build up your huge diffs a few
steps at a time, so as to avoid spamming this buffer.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Don Hatch
On Tue, Oct 08, 2013 at 05:48:53PM -0600, Warren Young wrote:

> On 10/8/2013 04:22, Don Hatch wrote:
> >
> >Checking in a text file of size >= 256k
> >corrupts the rcs file, irretrievably losing most of the contents
>
> It's documented in the rcs NEWS file:
>
>     - Env var RCS_MEM_LIMIT controls stdio threshold.
>
>       For speed, RCS uses memory-based routines for files up to
>       256 kilobytes, and stream-based (stdio) routines otherwise.
>       You can change this threshold value by setting the environment
>       variable ?RCS_MEM_LIMIT? to a non-negative integer, measured in
>       kilobytes.  An empty ?RCS_MEM_LIMIT? value is silently ignored.
>
> So, use the new environment variable, or build up your huge diffs a
> few steps at a time, so as to avoid spamming this buffer.

Hi Warren,

Thanks for the pointer.

That quote certainly doesn't describe or justify the very serious
corruption bug I'm seeing in any way;
however, it seems very likely that the corruption bug
was introduced along with this new optimization feature
that you reference, so it's good information.

If it's indeed the case that this bug was introduced along with this
non-essential feature, it would be really good if the feature
could be backed out, and possibly re-introduced at a later time
after it has been re-worked so that doesn't break the product.

Don

--
Don Hatch
[hidden email]
http://www.plunk.org/~hatch/

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Warren Young
On 10/8/2013 18:30, Don Hatch wrote:

> On Tue, Oct 08, 2013 at 05:48:53PM -0600, Warren Young wrote:
>> On 10/8/2013 04:22, Don Hatch wrote:
>>>
>>> Checking in a text file of size >= 256k
>>> corrupts the rcs file, irretrievably losing most of the contents
>>
>> It's documented in the rcs NEWS file:
>
> That quote certainly doesn't describe or justify the very serious
> corruption bug I'm seeing in any way;

I'm not trying to justify it.  I'm just pointing out that you have an
expedient workaround for the bug which lets you run on current versions
of Cygwin rcs.

> it would be really good if the feature
> could be backed out, and possibly re-introduced at a later time
> after it has been re-worked so that doesn't break the product.

If the bug affects GNU rcs on all platforms, it needs to be fixed
upstream.  Then Cygwin rcs will get the fix when the Cygwin rcs package
maintainer updates the packages.

If the bug is in the Cygwin rcs port or in Cygwin itself, the actual bug
needs to be fixed, rather than hack out the feature that tickles the bug.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Gary Johnson-4
On 2013-10-08, Warren Young wrote:

> On 10/8/2013 18:30, Don Hatch wrote:
> >On Tue, Oct 08, 2013 at 05:48:53PM -0600, Warren Young wrote:
> >>On 10/8/2013 04:22, Don Hatch wrote:
> >>>
> >>>Checking in a text file of size >= 256k
> >>>corrupts the rcs file, irretrievably losing most of the contents
> >>
> >>It's documented in the rcs NEWS file:
> >
> >That quote certainly doesn't describe or justify the very serious
> >corruption bug I'm seeing in any way;
>
> I'm not trying to justify it.  I'm just pointing out that you have an
> expedient workaround for the bug which lets you run on current
> versions of Cygwin rcs.
>
> >it would be really good if the feature
> >could be backed out, and possibly re-introduced at a later time
> >after it has been re-worked so that doesn't break the product.
>
> If the bug affects GNU rcs on all platforms, it needs to be fixed
> upstream.  Then Cygwin rcs will get the fix when the Cygwin rcs
> package maintainer updates the packages.
>
> If the bug is in the Cygwin rcs port or in Cygwin itself, the actual
> bug needs to be fixed, rather than hack out the feature that tickles
> the bug.

There was a discussion around March 27, 2012, about another change
in the behavior of RCS between 5.7 and 5.8.  It appears that someone
decided to make some sweeping "improvements" to RCS and broke a few
things along the way.

Regards,
Gary


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Warren Young
On 10/8/2013 20:08, Gary Johnson wrote:
>
> There was a discussion around March 27, 2012, about another change
> in the behavior of RCS between 5.7 and 5.8.  It appears that someone
> decided to make some sweeping "improvements" to RCS and broke a few
> things along the way.

GNU rcs 5.7 was released in 1995.  rcs 5.8 didn't come out until 2011.
So yeah, 16 years of changes in one jump.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Don Hatch
In reply to this post by Warren Young
On Tue, Oct 08, 2013 at 07:30:40PM -0600, Warren Young wrote:

> On 10/8/2013 18:30, Don Hatch wrote:
> >On Tue, Oct 08, 2013 at 05:48:53PM -0600, Warren Young wrote:
> >>On 10/8/2013 04:22, Don Hatch wrote:
> >>>
> >>>Checking in a text file of size >= 256k
> >>>corrupts the rcs file, irretrievably losing most of the contents
> >>
> >>It's documented in the rcs NEWS file:
> >
> >That quote certainly doesn't describe or justify the very serious
> >corruption bug I'm seeing in any way;
>
> I'm not trying to justify it.  I'm just pointing out that you have
> an expedient workaround for the bug which lets you run on current
> versions of Cygwin rcs.

Ah I see, I misinterpreted the point of your message.

Your workaround feels much too dangerous to me...
if I forget to set the variable, or set it wrong,
or someone else doesn't know about the variable and runs into the bug,
then corruption happens and work is irretrievably lost.
I'll be continuing to use 5.7 instead
(even though I have to go through increasing contortions
to even get 5.7, since it's no longer the current or previous version).

>
> >it would be really good if the feature
> >could be backed out, and possibly re-introduced at a later time
> >after it has been re-worked so that doesn't break the product.
>
> If the bug affects GNU rcs on all platforms, it needs to be fixed
> upstream.  Then Cygwin rcs will get the fix when the Cygwin rcs
> package maintainer updates the packages.
>
> If the bug is in the Cygwin rcs port or in Cygwin itself, the actual
> bug needs to be fixed, rather than hack out the feature that tickles
> the bug.

I certainly agree with you as a long-term plan;
however at this moment, it seems to me,
we have a dangerously broken current version in place
that is causing people to lose data and work
(I lost a significant amount when I hit the bug).

Would it be possible to simply declare 5.8 DOA
and revert the "considered most stable" version to be 5.7?
If not, can we make a 5.9 that's identical to 5.7,
ship that immediately as the "considered most stable" version,
and then we can more leisurely and properly work on figuring out
what went wrong and where?

Don

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Warren Young
On 10/9/2013 01:05, Don Hatch wrote:
>
> if I forget to set the variable, or set it wrong,
> or someone else doesn't know about the variable and runs into the bug,
> then corruption happens and work is irretrievably lost.

How is this more difficult than what you're already doing, manually
rolling back to the previous version?

We're talking about a one-time one-line change to your .bash_profile:

     export RCS_MEM_LIMIT=10240

That will let you have 10 MiB diffs.

I tried it, and it does indeed allow your RUNME script to finish
successfully.

> (I lost a significant amount when I hit the bug).

I assume you're talking about the record of changes between your last
backup and the current version.  The last backup should be pretty
recent, so your current version shouldn't be too far different from it.

> Would it be possible to simply declare 5.8 DOA

The Cygwin package system allows one to mark 5.8-1 as obsolete, but I
don't know if it can be told "and downgrade to 5.7-11".

> If not, can we make a 5.9 that's identical to 5.7,

If a new Cygwin package had to come out based on 5.7, it would be called
5.7-12, not 5.9.  GNU rcs 5.9 already exists.  (The current upstream
release is 5.9.1.)  Cygwin packages generally leave the upstream
revision numbers unmolested.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Ryan Johnson-10
In reply to this post by Warren Young
On 08/10/2013 7:48 PM, Warren Young wrote:

> On 10/8/2013 04:22, Don Hatch wrote:
>>
>> Checking in a text file of size >= 256k
>> corrupts the rcs file, irretrievably losing most of the contents
>
> It's documented in the rcs NEWS file:
>
>     - Env var RCS_MEM_LIMIT controls stdio threshold.
>
>       For speed, RCS uses memory-based routines for files up to
>       256 kilobytes, and stream-based (stdio) routines otherwise.
>       You can change this threshold value by setting the environment
>       variable ‘RCS_MEM_LIMIT’ to a non-negative integer, measured in
>       kilobytes.  An empty ‘RCS_MEM_LIMIT’ value is silently ignored.
>
> So, use the new environment variable, or build up your huge diffs a
> few steps at a time, so as to avoid spamming this buffer.
So in other words, a misguided performance optimization [1] that almost
certainly has little measurable impact on performance [2] has introduced
a silent data corruption bug (or tickled a latent one somewhere else).
Lovely.

The gcc devs have the right philosophy: features that break things badly
get reverted immediately regardless of whose fault the bug is, and will
be considered for re-inclusion once the bug has been fixed on the side.
In this case, though, the I'm not sure re-inclusion is even warranted.

[1] Modern filesystems and filesystem caching are pretty darn good at
handling temporary files these days. Further, if you really care about
using RAM to improve performance, 256kB is an absurdly low limit for a
buffer size, and has been for most of the last decade.

[2] I'd be shocked if even 0.1% of checkins were large enough to have a
noticeable latency in a modern system, and even more shocked if the 0.1%
that are large enough to be slow were still small enough that 256kB of
buffering made any difference in their runtime. Unless the code is
calling fsync() after every newline or something, in which case that's
what needs to be fixed.

$0.02
Ryan


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Ryan Johnson-10
In reply to this post by Warren Young
On 09/10/2013 9:37 AM, Warren Young wrote:

> On 10/9/2013 01:05, Don Hatch wrote:
>> Would it be possible to simply declare 5.8 DOA
>
> The Cygwin package system allows one to mark 5.8-1 as obsolete, but I
> don't know if it can be told "and downgrade to 5.7-11".
>
>> If not, can we make a 5.9 that's identical to 5.7,
>
> If a new Cygwin package had to come out based on 5.7, it would be
> called 5.7-12, not 5.9.  GNU rcs 5.9 already exists.  (The current
> upstream release is 5.9.1.)  Cygwin packages generally leave the
> upstream revision numbers unmolested.
Or, just roll a 5.8-2 that happens to be compiled from 5.7 sources, or
from 5.8 sources with the optimization disabled. I suspect it would be
trivial to disable the optimization in the 5.8 code base. The old
behavior was almost certainly disk-only, meaning that the bug lies in
the transition from using the buffer to using disk. Disable buffer usage
completely and the bug goes away.

Ryan


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Don Hatch
In reply to this post by Warren Young
On Wed, Oct 09, 2013 at 07:37:26AM -0600, Warren Young wrote:
> On 10/9/2013 01:05, Don Hatch wrote:
> >
> >if I forget to set the variable, or set it wrong,
> >or someone else doesn't know about the variable and runs into the bug,
> >then corruption happens and work is irretrievably lost.
>
> How is this more difficult than what you're already doing, manually
> rolling back to the previous version?

Not necessarily more difficult, but more dangerous/fragile
and therefore requiring continual attention at the user level.
My preference is to not have the broken version on my system at all--
then I can relax and not be continually worrying about whether
each different way I (or anyone else including root daemons)
use rcs is going to somehow circumvent the environment variable
and destroy more work.

>
> We're talking about a one-time one-line change to your .bash_profile:
>
>     export RCS_MEM_LIMIT=10240

That assumes too much: that only one user account uses rcs
(consider other users and root startup scripts and daemons),
that all of my usages of rcs are through the command line
(consider helper scripts and tools),
and that the only command line I use is bash (it's not).
Too fragile.

> >(I lost a significant amount when I hit the bug).
>
> I assume you're talking about the record of changes between your
> last backup and the current version.  The last backup should be
> pretty recent, so your current version shouldn't be too far
> different from it.

I'm not completely following what you're saying,
but I believe I lost more than what you're describing.
I lost the current working file (which I had just put
at least a day of good work into) and the rcs file became
useless for getting any previous versions, either-- I had to
revert to the most recent backup of the rcs file.
I don't remember precisely why I lost the working file--
I might have done a "ci" rather than a "ci -l" at some point,
or it may be because I had an $id$ tag
that caused the working file to be rewritten from the rcs file
which no longer had good data in it.

I remember at the first moment I noticed something was awry,
I still had the contents of the working file in an editor buffer...
not realizing the seriousness of what was going on,
I exited the editor, thereby discarding the only remaining
good copy of the file.  Blech.

Don

--
Don Hatch
[hidden email]
http://www.plunk.org/~hatch/

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Achim Gratz
In reply to this post by Ryan Johnson-10
Ryan Johnson writes:
> So in other words, a misguided performance optimization [1] that
> almost certainly has little measurable impact on performance [2] has
> introduced a silent data corruption bug (or tickled a latent one
> somewhere else). Lovely.

It is not the performance optimization that isn't working, but the code
path through plain stdio while doing a diff that was probably never
exercised (the tests all pass on Cygwin).  Try RCS_MEM_LIMIT=0 to force
stdio.  The error does not occur on Linux and it doesn't seem to be
known to the devs.


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

Reply | Threaded
Open this post in threaded view
|

Re: checking in >= 256k file fatally corrupts rcs file

Richard Gribble-2
I

On Wed, Oct 9, 2013 at 2:28 PM, Achim Gratz <[hidden email]> wrote:

> Ryan Johnson writes:
>> So in other words, a misguided performance optimization [1] that
>> almost certainly has little measurable impact on performance [2] has
>> introduced a silent data corruption bug (or tickled a latent one
>> somewhere else). Lovely.
>
> It is not the performance optimization that isn't working, but the code
> path through plain stdio while doing a diff that was probably never
> exercised (the tests all pass on Cygwin).  Try RCS_MEM_LIMIT=0 to force
> stdio.  The error does not occur on Linux and it doesn't seem to be
> known to the devs.
>
>
> Regards,
> Achim.
> --
> +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
>
> Wavetables for the Waldorf Blofeld:
> http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables
>
>
> --
> Problem reports:       http://cygwin.com/problems.html
> FAQ:                   http://cygwin.com/faq/
> Documentation:         http://cygwin.com/docs.html
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
>

I can attest to the problem with file size corrupting the file - on
several occasions I have checked in a file and lost roughly half of
it.  As RCS only stores the last version in its entirety and deltas
against that version to create prior versions, losing part of the
current version results in a completely corrupt file - the only thing
that has saved me is that I have two copies of RCS on separate
computers.  I now only check files in on my Linux box, then transfer
them to the Windows/cygwin box.

However, by the same token, there is a bug in 5.7 where the check of
the symbolic name (when checking out a file) only tests the number of
characters in the table of symbolic names - for example, say you have
a file with the following symbolic names:  R25, R25a, R25b, R25c and
R26.  Now you try to check out version R25c.  If the code finds the
R25 symbolic name first (before R25a/R25b/R25c), it compares three
characters (the length of "R25") against the first three characters of
the symbolic name provided to the co command ("R25") and finds a match
- giving you the wrong version.  So neither option is ideal.


Respectfully,

Richard Gribble.

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple