[Patch] Make getenv() functional before the environment is initialized

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

[Patch] Make getenv() functional before the environment is initialized

Pierre A Humblet
This makes getenv return sensibly before the environment is initialized.
The attached file should be properly formatted (Changelog & patch),
which my mailer can't do.

Pierre


2006-04-06  Pierre Humblet [hidden email]

        * environ.cc (getearly): New function.
           (getenv) : Call getearly if needed.
   
Index: environ.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/environ.cc,v
retrieving revision 1.139
diff -u -p -b -r1.139 environ.cc
--- environ.cc  22 Mar 2006 16:42:44 -0000      1.139
+++ environ.cc  6 Apr 2006 16:06:05 -0000
@@ -224,6 +224,39 @@ my_findenv (const char *name, int *offse
 }
 
 /*
+ * getearly --
+ *     Primitive getenv before the environment is built.
+ */
+
+static char *
+getearly (const char * name)
+{
+  int s = strlen (name);
+  char * rawenv;
+  char ** ptr;
+  child_info *get_cygwin_startup_info ();
+  child_info_spawn *ci = (child_info_spawn *) get_cygwin_startup_info ();
+
+  if (ci && (ptr = ci->moreinfo->envp))
+    {
+      for (; *ptr; ptr++)
+       if (strncasematch (name, *ptr, s)
+           && (*(*ptr + s) == '='))
+         return *ptr + s + 1;
+    }
+  else if ((rawenv = GetEnvironmentStrings ()))
+    {
+      while (*rawenv)
+       if (strncasematch (name, rawenv, s)
+           && (*(rawenv + s) == '='))
+         return rawenv + s + 1;
+       else
+         rawenv = strchr (rawenv, 0) + 1;
+    }
+  return NULL;
+}
+
+/*
  * getenv --
  *     Returns ptr to value associated with name, if any, else NULL.
  */
@@ -232,7 +265,8 @@ extern "C" char *
 getenv (const char *name)
 {
   int offset;
-
+  if (!__cygwin_environ)
+    return getearly (name);
   return my_findenv (name, &offset);
 }

environ.diff (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Corinna Vinschen-2
On Apr  6 12:35, Pierre A. Humblet wrote:
>        * environ.cc (getearly): New function.
>           (getenv) : Call getearly if needed.

Thanks for the patch and sorry for the loooong delay.  I've applied a
slightly tweaked version of your patch, which uses a function pointer in
getenv, instead of adding a conditional.


Thanks,
Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Pierre A. Humblet-2

----- Original Message -----
From: "Corinna Vinschen"
To: <[hidden email]>
Sent: Friday, April 21, 2006 1:23 PM
Subject: Re: [Patch] Make getenv() functional before the environment is
initialized


> On Apr  6 12:35, Pierre A. Humblet wrote:
>>        * environ.cc (getearly): New function.
>>           (getenv) : Call getearly if needed.
>
> Thanks for the patch and sorry for the loooong delay.  I've applied a
> slightly tweaked version of your patch, which uses a function pointer in
> getenv, instead of adding a conditional.
>

Corinna,

Thanks! Since sending the patch, I have found some issues with it :(

In particular GetEnvironmentStrings returns a big block of
storage that should be free (which we can't do), and that is
going to be lost on a fork, potentially leading to trouble.

Thus I have another implementation using GetEnvironmentValue
and cmalloc. (with HEAP_1_MAX, so that it will be released
on the next exec).
I also take advantage of spawn_info, whose existence I had forgotten.
Overall it's also simpler.

Here is another patch, sorry for not sending this earlier.

Pierre

2006-04-21 Pierre Humblet [hidden email]

        * environ.cc (getearly): Use GetEnvironmentVariable and cmalloc
        instead of GetEnvironmentStrings.

Index: environ.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/environ.cc,v
retrieving revision 1.140
diff -u -p -b -r1.140 environ.cc
--- environ.cc  21 Apr 2006 17:21:41 -0000      1.140
+++ environ.cc  21 Apr 2006 18:37:55 -0000
@@ -231,28 +231,21 @@ my_findenv (const char *name, int *offse
 static char * __stdcall
 getearly (const char * name, int *offset __attribute__ ((unused)))
 {
-  int s = strlen (name);
-  char * rawenv;
-  char ** ptr;
-  child_info *get_cygwin_startup_info ();
-  child_info_spawn *ci = (child_info_spawn *) get_cygwin_startup_info ();
+  int s;
+  char ** ptr, * ret;

-  if (ci && (ptr = ci->moreinfo->envp))
+  if (spawn_info && (ptr = spawn_info->moreinfo->envp))
     {
+      s = strlen (name);
       for (; *ptr; ptr++)
        if (strncasematch (name, *ptr, s)
            && (*(*ptr + s) == '='))
          return *ptr + s + 1;
     }
-  else if ((rawenv = GetEnvironmentStrings ()))
-    {
-      while (*rawenv)
-       if (strncasematch (name, rawenv, s)
-           && (*(rawenv + s) == '='))
-         return rawenv + s + 1;
-       else
-         rawenv = strchr (rawenv, 0) + 1;
-    }
+  else if ((s = GetEnvironmentVariable (name, NULL, 0))
+          && (ret = (char *) cmalloc (HEAP_1_MAX, s))
+          && GetEnvironmentVariable (name, ret, s))
+    return ret;
   return NULL;
 }



Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
On Fri, Apr 21, 2006 at 02:52:06PM -0400, Pierre A. Humblet wrote:

>----- Original Message -----
>From: "Corinna Vinschen"
>To: <[hidden email]>
>Sent: Friday, April 21, 2006 1:23 PM
>Subject: Re: [Patch] Make getenv() functional before the environment is
>initialized
>
>
>>On Apr  6 12:35, Pierre A. Humblet wrote:
>>>       * environ.cc (getearly): New function.
>>>          (getenv) : Call getearly if needed.
>>
>>Thanks for the patch and sorry for the loooong delay.  I've applied a
>>slightly tweaked version of your patch, which uses a function pointer in
>>getenv, instead of adding a conditional.
>>
>
>Corinna,
>
>Thanks! Since sending the patch, I have found some issues with it :(
>
>In particular GetEnvironmentStrings returns a big block of
>storage that should be free (which we can't do), and that is
>going to be lost on a fork, potentially leading to trouble.
>
>Thus I have another implementation using GetEnvironmentValue
>and cmalloc. (with HEAP_1_MAX, so that it will be released
>on the next exec).
>I also take advantage of spawn_info, whose existence I had forgotten.
>Overall it's also simpler.
>
>Here is another patch, sorry for not sending this earlier.

I don't see any reason to permanently allocate memory with cmalloc.

I think that using GetEnvironmentStrings is still the right choice here.
You just have to make sure that it gets freed.  I'm going to check in a
cleanup of getearly which will move the rawenv variable to a static
which will potentially be used by environ_init.  Then environ_init will
free it if it has been previously set.

I've made some other minor stylistic changes to the function as well.

cgf
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Pierre A. Humblet-2

----- Original Message -----
From: "Christopher Faylor" <[hidden email]>
To: <[hidden email]>
Sent: Friday, April 21, 2006 3:13 PM
Subject: Re: [Patch] Make getenv() functional before the environment is
initialized


> On Fri, Apr 21, 2006 at 02:52:06PM -0400, Pierre A. Humblet wrote:
>>
>>In particular GetEnvironmentStrings returns a big block of
>>storage that should be free (which we can't do), and that is
>>going to be lost on a fork, potentially leading to trouble.
>>
>>Thus I have another implementation using GetEnvironmentValue
>>and cmalloc. (with HEAP_1_MAX, so that it will be released
>>on the next exec).
>>I also take advantage of spawn_info, whose existence I had forgotten.
>>Overall it's also simpler.
>>
>>Here is another patch, sorry for not sending this earlier.
>
> I don't see any reason to permanently allocate memory with cmalloc.
>
> I think that using GetEnvironmentStrings is still the right choice here.
> You just have to make sure that it gets freed.  I'm going to check in a
> cleanup of getearly which will move the rawenv variable to a static
> which will potentially be used by environ_init.  Then environ_init will
> free it if it has been previously set.

But doesn't the program then have a pointer to memory that has been freed?
That pointer can also be accessed after forks.

Pierre

Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
On Fri, Apr 21, 2006 at 03:45:54PM -0400, Pierre A. Humblet wrote:
>----- Original Message -----
>From: "Christopher Faylor" <[hidden email]>
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>To: <[hidden email]>
     ^^^^^^^^^^^^^^^^^^^^^^^^^^

>Sent: Friday, April 21, 2006 3:13 PM
>Subject: Re: [Patch] Make getenv() functional before the environment is
>initialized
>
>
>>On Fri, Apr 21, 2006 at 02:52:06PM -0400, Pierre A. Humblet wrote:
>>>
>>>In particular GetEnvironmentStrings returns a big block of
>>>storage that should be free (which we can't do), and that is
>>>going to be lost on a fork, potentially leading to trouble.
>>>
>>>Thus I have another implementation using GetEnvironmentValue
>>>and cmalloc. (with HEAP_1_MAX, so that it will be released
>>>on the next exec).
>>>I also take advantage of spawn_info, whose existence I had forgotten.
>>>Overall it's also simpler.
>>>
>>>Here is another patch, sorry for not sending this earlier.
>>
>>I don't see any reason to permanently allocate memory with cmalloc.
>>
>>I think that using GetEnvironmentStrings is still the right choice here.
>>You just have to make sure that it gets freed.  I'm going to check in a
>>cleanup of getearly which will move the rawenv variable to a static
>>which will potentially be used by environ_init.  Then environ_init will
>>free it if it has been previously set.
>
>But doesn't the program then have a pointer to memory that has been freed?
>That pointer can also be accessed after forks.

Isn't that always a possibility?  You can't rely on the persistence of
the stuff returned from getenv().

cgf
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Pierre A. Humblet-2

----- Original Message -----
From: "Christopher Faylor" <[hidden email]>
To: <[hidden email]>
Sent: Friday, April 21, 2006 4:12 PM
Subject: Re: [Patch] Make getenv() functional before the environment is
initialized


>>
>>But doesn't the program then have a pointer to memory that has been freed?
>>That pointer can also be accessed after forks.
>
> Isn't that always a possibility?  You can't rely on the persistence of
> the stuff returned from getenv().

That's not my reading of
http://www.opengroup.org/onlinepubs/000095399/functions/getenv.html

"The string pointed to may be overwritten by a subsequent call to getenv(),
 but shall not be overwritten by a call to any other function in this volume of
IEEE Std 1003.1-2001."

Athough Posix allows the string to be overwritten, indicating that persistence
is implied,
it does not allow the pointer to become invalid.

See also
http://developer.apple.com/documentation/Darwin/Reference/Manpages/man3/getenv.3.html
which says that the environment semantics make it inherently leaky.
That's why I didn't hesitate calling cmalloc

Pierre

Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Corinna Vinschen-2
On Apr 21 16:32, Pierre A. Humblet wrote:

>
> ----- Original Message -----
> From: "Christopher Faylor" <[hidden email]>
> To: <[hidden email]>
> Sent: Friday, April 21, 2006 4:12 PM
> Subject: Re: [Patch] Make getenv() functional before the environment is
> initialized
>
>
> >>
> >>But doesn't the program then have a pointer to memory that has been freed?
> >>That pointer can also be accessed after forks.
> >
> >Isn't that always a possibility?  You can't rely on the persistence of
> >the stuff returned from getenv().
>
> That's not my reading of
> http://www.opengroup.org/onlinepubs/000095399/functions/getenv.html
>
> "The string pointed to may be overwritten by a subsequent call to getenv(),
> but shall not be overwritten by a call to any other function in this volume
> of IEEE Std 1003.1-2001."
>
> Athough Posix allows the string to be overwritten, indicating that
> persistence is implied,
> it does not allow the pointer to become invalid.
>
> See also
> http://developer.apple.com/documentation/Darwin/Reference/Manpages/man3/getenv.3.html
> which says that the environment semantics make it inherently leaky.
> That's why I didn't hesitate calling cmalloc

The getearly function is only called in the initialization phase of the
application, when the Cygwin environment isn't initialized.

Why is it necessary to make this so complicated?  Why isn't it
sufficient to return the value in a static buffer and tolerate that a
lib or application which hooks into this early stage of initialization
has to copy the value, if it needs it later on?


Corinna


--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
In reply to this post by Pierre A. Humblet-2
On Fri, Apr 21, 2006 at 04:32:25PM -0400, Pierre A. Humblet wrote:
>From: "Christopher Faylor" <...>
>To:

http://cygwin.com/acronyms/#PCYMTNQREAIYR

cgf
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
In reply to this post by Pierre A. Humblet-2
I just talked to Corinna about this on IRC and neither of us really
cares enough about this to merit a long discussion so I've just checked
in a variation of the cmalloc patch.  The only change that I made was to
define a HEAP_2_STR value so that the HEAP_1_MAX usage is confined to
cygheap.cc where I'd intended it.

cgf
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Pierre A. Humblet-2
----- Original Message -----
From: "Christopher Faylor"
To: <[hidden email]>
Sent: Friday, April 21, 2006 5:39 PM
Subject: Re: [Patch] Make getenv() functional before the environment is
initialized


>I just talked to Corinna about this on IRC and neither of us really
> cares enough about this to merit a long discussion so I've just checked
> in a variation of the cmalloc patch.  The only change that I made was to
> define a HEAP_2_STR value so that the HEAP_1_MAX usage is confined to
> cygheap.cc where I'd intended it.

Thanks a lot, Chris & Corinna.

Now that I am trying it, it doesn't work anymore when launched from Cygwin.

I am starting to wonder if the current
*ptr[len] == '='
is equivalent to the former
*(*ptr + s) == '='
when s = len and ptr is a char **

Pierre

Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
On Mon, Apr 24, 2006 at 12:16:34PM -0400, Pierre A. Humblet wrote:

>----- Original Message -----
>From: "Christopher Faylor"
>To: <[hidden email]>
>Sent: Friday, April 21, 2006 5:39 PM
>Subject: Re: [Patch] Make getenv() functional before the environment is
>initialized
>
>
>>I just talked to Corinna about this on IRC and neither of us really
>>cares enough about this to merit a long discussion so I've just checked
>>in a variation of the cmalloc patch.  The only change that I made was to
>>define a HEAP_2_STR value so that the HEAP_1_MAX usage is confined to
>>cygheap.cc where I'd intended it.
>
>Thanks a lot, Chris & Corinna.
>
>Now that I am trying it, it doesn't work anymore when launched from Cygwin.
>
>I am starting to wonder if the current
>*ptr[len] == '='
>is equivalent to the former
>*(*ptr + s) == '='
>when s = len and ptr is a char **

If it didn't work, I'd think that g++ would complain.  However, this
isn't really something that needs much discussion since it is easy to
test.

Create the below program and name it 'foo'.  Then, make sure that the
current directory is in your path, and type 'foo'.

I get a "it works" when I do this.

cgf

#include <stdio.h>

int
main (int argc, char **argv)
{
  if (*argv[0] == 'f')
    puts ("it works");
}
Reply | Threaded
Open this post in threaded view
|

Re: [Patch] Make getenv() functional before the environment is initialized

Christopher Faylor-2
On Mon, Apr 24, 2006 at 12:35:59PM -0400, Christopher Faylor wrote:

>I get a "it works" when I do this.
>
>#include <stdio.h>
>
>int
>main (int argc, char **argv)
>{
>  if (*argv[0] == 'f')
>    puts ("it works");
>}

But, in keeping with my previously noticed propensity for gaining
intelligence five seconds after hitting 'y', it occurred to me that my
test case was flawed.  Changing the above 'if' so that it did this:

  if (*argv[1] == 'o')

resulted in a SEGV.  So I got the precedence of '*' wrong.  I've checked
in a patch to deal with this problem.

cgf