Problem with packagesource::sites in setup

Ken Brown-6
I think we're currently mishandling packagesource::sites when several
libsolv repos contain the same version of a package.  If I'm not
mistaken, we create a new packageversion pv for each repo, and
pv.source()->sites contains a single site, corresponding to that repo.

So we never take advantage of the fact that we have more than one mirror
(or mirror directory) from which we can potentially obtain an archive
for the package.

I think the way to fix this is to consolidate all the packageversions pv
into a single one, which knows about all the sites.

This could be handled by packagemeta::add_version().  When it replaces
an existing version, it could remove the old one from the pool after
copying the sites information.  In order to obtain the sites
information, it would have to be able to query the libsolv pool, so we
would have to internalize repo data as we go along, presumably in the
IniDBBuilderPackage destructor.
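
A minimal sketch of the consolidation idea, assuming simplified stand-in types. `Site`, `PackageSource`, `PackageVersion` and `PackageMeta` here are hypothetical and are not setup's real classes, and the libsolv pool is left out entirely. It only shows an add_version() that merges site lists instead of replacing the earlier entry:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not setup's real definitions.
struct Site {
    std::string url;
};

struct PackageSource {
    std::vector<Site> sites;  // every mirror (or mirror directory) offering the archive
};

struct PackageVersion {
    std::string version;
    PackageSource source;
};

struct PackageMeta {
    std::map<std::string, PackageVersion> versions;  // keyed by version string

    // Add a version. If the version is already known, keep the existing
    // entry and append any sites it does not list yet.
    void add_version(const PackageVersion& pv) {
        assert(!pv.version.empty());
        auto result = versions.emplace(pv.version, pv);
        if (result.second)
            return;  // first time this version is seen
        std::vector<Site>& known = result.first->second.source.sites;
        for (const Site& s : pv.source.sites) {
            bool seen = std::any_of(known.begin(), known.end(),
                                    [&s](const Site& k) { return k.url == s.url; });
            if (!seen)
                known.push_back(s);
        }
    }
};
```

Adding the same version twice with different sites leaves one entry whose `sites` vector holds both, which is the state the download code would need in order to fall back to a second mirror.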

Does this sound about right?  If so, I'll try to prepare a patch.  Or
maybe there's a better/easier way to solve the problem.

Ken

Re: Problem with packagesource::sites in setup

Jon TURNEY
On 15/03/2018 21:23, Ken Brown wrote:
> I think we're currently mishandling packagesource::sites when several
> libsolv repos contain the same version of a package.  If I'm not
> mistaken, we create a new packageversion pv for each repo, and
> pv.source()->sites contains a single site, corresponding to that repo.
>
> So we never take advantage of the fact that we have more than one mirror
> (or mirror directory) from which we can potentially obtain an archive
> for the package.

Hmm... I think this is going to interact with the package repositories'
release: label.  If they are both "cygwin", then one will overwrite the
other.  If they are different, then we'll have 2 separate libsolv repos.

In the first case, I'm not sure that having the same package available
from more than one package repository mirror was ever doing anything
terribly useful (i.e. it doesn't make the download any faster, or more
reliable).

But, yeah, what we are doing currently is probably wrong.

In the second case, it's possible for the length/hash of the "same"
version to be different, so it's not clear in what sense they really are
the same, and I think it's random which one we're going to get
(silently), which is unhelpful at best...
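
One way to make that case explicit rather than silent, sketched under assumptions: `ArchiveInfo` and `merge_same_version()` are hypothetical names, not part of setup. The idea is to combine site lists only when length and hash agree, and to report a conflict otherwise:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one archive as listed by one repository.
struct ArchiveInfo {
    std::string version;
    std::size_t length;              // archive size in bytes
    std::string hash;                // hex digest as listed in setup.ini
    std::vector<std::string> sites;  // where this archive can be fetched
};

enum class MergeResult { Merged, Conflict };

// Merge 'incoming' into 'existing' only if both describe the same bytes.
MergeResult merge_same_version(ArchiveInfo& existing, const ArchiveInfo& incoming) {
    assert(existing.version == incoming.version);
    if (existing.length != incoming.length || existing.hash != incoming.hash)
        return MergeResult::Conflict;  // same version label, different archive
    existing.sites.insert(existing.sites.end(),
                          incoming.sites.begin(), incoming.sites.end());
    return MergeResult::Merged;
}
```

What the caller should do on `Conflict` (warn, prefer one repository, keep both) is a policy question this sketch does not answer.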

> I think the way to fix this is to consolidate all the packageversions pv
> into a single one, which knows about all the sites.
>
> This could be handled by packagemeta::add_version().  When it replaces
> an existing version, it could remove the old one from the pool after
> copying the sites information.  In order to obtain the sites
> information, it would have to be able to query the libsolv pool, so we
> would have to internalize repo data as we go along, presumably in the
> IniDBBuilderPackage destructor.
>
> Does this sound about right?  If so, I'll try to prepare a patch.  Or
> maybe there's a better/easier way to solve the problem.

Re: Problem with packagesource::sites in setup

Ken Brown-6
On 3/15/2018 6:07 PM, Jon Turney wrote:

> On 15/03/2018 21:23, Ken Brown wrote:
>> I think we're currently mishandling packagesource::sites when several
>> libsolv repos contain the same version of a package.  If I'm not
>> mistaken, we create a new packageversion pv for each repo, and
>> pv.source()->sites contains a single site, corresponding to that repo.
>>
>> So we never take advantage of the fact that we have more than one
>> mirror (or mirror directory) from which we can potentially obtain an
>> archive for the package.
>
> Hmm... I think this is going to interact with the package repositories
> release: label.  If they are both "cygwin", then one will overwrite the
> other.

I hadn't thought of that.  But will one really overwrite the other or
will we just get several copies of the same package and version in the
"cygwin" repo, each with its own site?

>  If they are different, then we'll have 2 separate libsolv repos.

> In the first case, I'm not sure that having the same package available
> from more than one package repository mirror was ever was doing anything
> terribly useful (i.e. it doesn't make the download any faster, or more
> reliable)

No, but it can help if one mirror is having transient network problems.
For example, we might get a corrupt archive from one mirror, and then
the loop in download.cc:download_one() will try the next one.  I have no
idea how many users use more than one mirror with the expectation that
this will happen.  Probably not many.
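
The fallback described here can be sketched as follows. This is not the actual code in download.cc. `FetchFn` and `download_from_first_good_site()` are hypothetical, and the fetch callback is assumed to return true only when the archive was downloaded and passed verification:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: fetch from one site, return true only if the
// archive arrived and passed its size/hash check.
using FetchFn = std::function<bool(const std::string& site_url)>;

// Try each site in order. Returns the index of the first site that yields
// a good archive, or -1 if every site fails.
int download_from_first_good_site(const std::vector<std::string>& sites,
                                  const FetchFn& fetch) {
    assert(static_cast<bool>(fetch));
    for (std::size_t i = 0; i < sites.size(); ++i) {
        if (fetch(sites[i]))
            return static_cast<int>(i);
    }
    return -1;
}
```

With a single-element `sites` vector the loop degenerates to one attempt, so a transient failure on that mirror fails the whole download.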

> But, yeah, what we are doing currently is probably wrong.

And I'm less convinced now that it's worth worrying about.  But if we
decide not to fix it, we should probably simplify and clarify the code
by saving just one site in a packagesource object instead of a vector
that always has a single element in it.

> In the second case, it's possible for the length/hash of the "same"
> version to be different, so it's not clear in what sense they really are
> the same, and I think it's random which one we're going to get
> (silently), which is unhelpful at best...
>
>> I think the way to fix this is to consolidate all the packageversions
>> pv into a single one, which knows about all the sites.
>>
>> This could be handled by packagemeta::add_version().  When it replaces
>> an existing version, it could remove the old one from the pool after
>> copying the sites information.  In order to obtain the sites
>> information, it would have to be able to query the libsolv pool, so we
>> would have to internalize repo data as we go along, presumably in the
>> IniDBBuilderPackage destructor.
>>
>> Does this sound about right?  If so, I'll try to prepare a patch.  Or
>> maybe there's a better/easier way to solve the problem.
>

Re: Problem with packagesource::sites in setup

Ken Brown-6
On 3/15/2018 10:42 PM, Ken Brown wrote:

> On 3/15/2018 6:07 PM, Jon Turney wrote:
>> On 15/03/2018 21:23, Ken Brown wrote:
>>> I think we're currently mishandling packagesource::sites when several
>>> libsolv repos contain the same version of a package.  If I'm not
>>> mistaken, we create a new packageversion pv for each repo, and
>>> pv.source()->sites contains a single site, corresponding to that repo.
>>>
>>> So we never take advantage of the fact that we have more than one
>>> mirror (or mirror directory) from which we can potentially obtain an
>>> archive for the package.
>>
>> Hmm... I think this is going to interact with the package repositories
>> release: label.  If they are both "cygwin", then one will overwrite
>> the other.
>
> I hadn't thought of that.  But will one really overwrite the other or
> will we just get several copies of the same package and version in the
> "cygwin" repo, each with its own site?
>
>>   If they are different, then we'll have 2 separate libsolv repos.
>
>> In the first case, I'm not sure that having the same package available
>> from more than one package repository mirror was ever was doing
>> anything terribly useful (i.e. it doesn't make the download any
>> faster, or more reliable)
>
> No, but it can help if one mirror is having transient network problems.
> For example, we might get a corrupt archive from one mirror, and then
> the loop in download.cc:download_one() will try the next one.  I have no
> idea how many users use more than one mirror with the expectation that
> this will happen.  Probably not many.
>
>> But, yeah, what we are doing currently is probably wrong.
>
> And I'm less convinced now that it's worth worrying about.

I just realized that this affects local installs also.  Here it's not
unusual for a user to have several mirror directories from different
setup runs.  But if two different setup.ini files list a given
packageversion, setup will offer it for install only if the archive is
found in the directory corresponding to the last setup.ini read.
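
To illustrate the difference, here is a small model with hypothetical names; it does not touch the filesystem and is not how setup scans local directories. Each mirror directory is represented as the set of archive paths it contains:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: a local mirror directory is the set of archive
// paths found in it.
typedef std::set<std::string> MirrorDir;

// Behaviour described above: only the directory belonging to the last
// setup.ini read is consulted.
bool offered_from_last_dir(const std::vector<MirrorDir>& dirs,
                           const std::string& archive) {
    assert(!archive.empty());
    return !dirs.empty() && dirs.back().count(archive) > 0;
}

// Alternative: offer the package if any known directory holds the archive.
bool offered_from_any_dir(const std::vector<MirrorDir>& dirs,
                          const std::string& archive) {
    assert(!archive.empty());
    for (const MirrorDir& dir : dirs) {
        if (dir.count(archive) > 0)
            return true;
    }
    return false;
}
```

If the archive sits only in the first of two directories, the first function reports it as unavailable while the second finds it.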

So I do think this should be fixed.

Ken

Re: Problem with packagesource::sites in setup

Achim Gratz
In reply to this post by Ken Brown-6
Ken Brown writes:
> I think we're currently mishandling packagesource::sites when several
> libsolv repos contain the same version of a package.  If I'm not
> mistaken, we create a new packageversion pv for each repo, and
> pv.source()->sites contains a single site, corresponding to that repo.

That should normally not happen; I'm not sure what libsolv does in this
case in the absence of repo priorities (which would provide an ordering
among otherwise identical versions).

> So we never take advantage of the fact that we have more than one
> mirror (or mirror directory) from which we can potentially obtain an
> archive for the package.

That sort of thing is supposed to be handled at another level in the
distros using libsolv-based installers (i.e. mirrorbrain).  While I
agree that this would be sensible behaviour for us, I don't know how
much we'd be relying on accidental behaviour.


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada

Re: Problem with packagesource::sites in setup

Ken Brown-6
In reply to this post by Ken Brown-6
On 3/16/2018 7:44 AM, Ken Brown wrote:

> On 3/15/2018 10:42 PM, Ken Brown wrote:
>> On 3/15/2018 6:07 PM, Jon Turney wrote:
>>> On 15/03/2018 21:23, Ken Brown wrote:
>>>> I think we're currently mishandling packagesource::sites when
>>>> several libsolv repos contain the same version of a package.  If I'm
>>>> not mistaken, we create a new packageversion pv for each repo, and
>>>> pv.source()->sites contains a single site, corresponding to that repo.
>>>>
>>>> So we never take advantage of the fact that we have more than one
>>>> mirror (or mirror directory) from which we can potentially obtain an
>>>> archive for the package.
>>>
>>> Hmm... I think this is going to interact with the package
>>> repositories release: label.  If they are both "cygwin", then one
>>> will overwrite the other.
>>
>> I hadn't thought of that.  But will one really overwrite the other or
>> will we just get several copies of the same package and version in the
>> "cygwin" repo, each with its own site?
>>
>>>   If they are different, then we'll have 2 separate libsolv repos.
>>
>>> In the first case, I'm not sure that having the same package
>>> available from more than one package repository mirror was ever was
>>> doing anything terribly useful (i.e. it doesn't make the download any
>>> faster, or more reliable)
>>
>> No, but it can help if one mirror is having transient network
>> problems. For example, we might get a corrupt archive from one mirror,
>> and then the loop in download.cc:download_one() will try the next
>> one.  I have no idea how many users use more than one mirror with the
>> expectation that this will happen.  Probably not many.
>>
>>> But, yeah, what we are doing currently is probably wrong.
>>
>> And I'm less convinced now that it's worth worrying about.
>
> I just realized that this affects local installs also.  Here it's not
> unusual for a user to have several mirror directories from different
> setup runs.  But if two different setup.ini files list a given
> packageversion, setup will offer it for install only if the archive is
> found in the directory corresponding to the last setup.ini read.
>
> So I do think this should be fixed.

A patchset is on its way.

Ken