Discussion:
[Mason-devel] possible bug in Mason 1.36
Beaudet, David P.
2007-08-06 20:09:10 UTC
Permalink
I discovered that under Apache 1.3 (running the Bricolage CMS), HTTP
headers are getting sent twice - once correctly, and once after the page
has been sent back to the browser. As a result, I'm getting HTTP
headers displayed in the HTML for certain pages. Downgrading to Mason
1.35 fixes the problem. I believe this problem only appears under
Apache 1.x.

I think there might be a bug in Mason's ApacheHandler.pm, line 954,
introduced with version 1.36.

Here's what I think is going on:

(1) ApacheHandler.pm, line 954: content_type('') is accepted by Apache
as an empty string (which might be invalid in its own right, but unsure)
and is output as an empty string when the Mason OUT method sends the
content type via send_http_header() the first through.

(2) Further down the chain, ApacheHandler.pm's http_header_sent() method
checks the value of the "Content-Type" header against undef to see
whether the headers were already sent and receives undef back, so the
fact that the header was already sent with ('') as content_type is not
registered.

(3) So, EXEC happily sends a second header with a blank Content-Type.


Resolution:
Changing line 954 from $r->content_type('') back to what it used to be
in 1.35 $r->content_type(undef) fixes the duplicate header problem.
There's no documentation I could find explaining why that was changed in
1.36.

Here's a patch, in case it is in fact a bug that can be fixed in the
next release.


--- ./ApacheHandler.pm 2007-08-06 16:01:08.000000000 -0400

+++ ./ApacheHandler.pm 2007-08-06 16:05:16.000000000 -0400

@@ -951,7 +951,7 @@

my $is_dir = -d $r->filename;



if ($is_dir && !$self->decline_dirs) {

- $r->content_type('');

+ $r->content_type(undef);

}

return $is_dir ? 'dir' : -f _ ? 'file' : 'other';

}
Dave Rolsky
2007-08-19 16:38:57 UTC
Permalink
Post by Beaudet, David P.
(1) ApacheHandler.pm, line 954: content_type('') is accepted by Apache
as an empty string (which might be invalid in its own right, but unsure)
and is output as an empty string when the Mason OUT method sends the
content type via send_http_header() the first through.
An empty string is almost certainly invalid as a content type, but we're
using it intentionally as a way to mark the request as being for a
directory. Otherwise the content type is something like
"httpd/unix-directory", which is probably not what you want.

A request for a directory should always be handled by a dhandler with
Mason, and you should always set the content_type explicitly in a
dhandler. There's actually a note about this in the docs in
HTML::Mason::Admin under "Allowing Directory Requests".
Post by Beaudet, David P.
(2) Further down the chain, ApacheHandler.pm's http_header_sent() method
checks the value of the "Content-Type" header against undef to see
whether the headers were already sent and receives undef back, so the
fact that the header was already sent with ('') as content_type is not
registered.
It's not checking against undef, it's just checking for truth, which means
that the empty string will be false.
Post by Beaudet, David P.
Changing line 954 from $r->content_type('') back to what it used to be
in 1.35 $r->content_type(undef) fixes the duplicate header problem.
There's no documentation I could find explaining why that was changed in
1.36.
The reason was that passing undef to this method caused a "Use of
unitialized value in subroutine".

I'm not sure what's happening with Bricolage, but I cannot reproduce this
doubled header problem in my tests. Can you provide a simple test case? Or
maybe the problem is that Bricolage has a dhandler which is not setting
the content type explicitly.


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/
Beaudet, David P.
2007-08-19 18:07:33 UTC
Permalink
Thanks for your help with this. We'll take a look at Bricolage dhandlers first and construct a simple test case if it turns out not to be Bricolage specific, but will report back either way.


________________________________

From: Dave Rolsky [mailto:***@urth.org]
Sent: Sun 8/19/2007 12:38 PM
To: mason-***@lists.sourceforge.net
Cc: Beaudet, David P.
Subject: Re: [Mason-devel] possible bug in Mason 1.36
Post by Beaudet, David P.
(1) ApacheHandler.pm, line 954: content_type('') is accepted by Apache
as an empty string (which might be invalid in its own right, but unsure)
and is output as an empty string when the Mason OUT method sends the
content type via send_http_header() the first through.
An empty string is almost certainly invalid as a content type, but we're
using it intentionally as a way to mark the request as being for a
directory. Otherwise the content type is something like
"httpd/unix-directory", which is probably not what you want.

A request for a directory should always be handled by a dhandler with
Mason, and you should always set the content_type explicitly in a
dhandler. There's actually a note about this in the docs in
HTML::Mason::Admin under "Allowing Directory Requests".
Post by Beaudet, David P.
(2) Further down the chain, ApacheHandler.pm's http_header_sent() method
checks the value of the "Content-Type" header against undef to see
whether the headers were already sent and receives undef back, so the
fact that the header was already sent with ('') as content_type is not
registered.
It's not checking against undef, it's just checking for truth, which means
that the empty string will be false.
Post by Beaudet, David P.
Changing line 954 from $r->content_type('') back to what it used to be
in 1.35 $r->content_type(undef) fixes the duplicate header problem.
There's no documentation I could find explaining why that was changed in
1.36.
The reason was that passing undef to this method caused a "Use of
unitialized value in subroutine".

I'm not sure what's happening with Bricolage, but I cannot reproduce this
doubled header problem in my tests. Can you provide a simple test case? Or
maybe the problem is that Bricolage has a dhandler which is not setting
the content type explicitly.


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/
David E. Wheeler
2007-08-20 00:32:14 UTC
Permalink
Post by Dave Rolsky
I'm not sure what's happening with Bricolage, but I cannot
reproduce this
doubled header problem in my tests. Can you provide a simple test case? Or
maybe the problem is that Bricolage has a dhandler which is not setting
the content type explicitly.
Well, we set the content type in Apache, rather than Mason. Our
Apache handler has ($ah is the Mason ApacheHandler):

eval {
# Prevent browsers from caching pages.
$r->no_cache(1);
# Set up the language and content type headers.
$r->content_languages([$lang_name]);
$r->content_type('text/html; charset=' . lc $char_set);
Bric::Util::Pref->use_user_prefs(1);

# Start the database transactions.
begin(1);
# Handle the request.
$status = $ah->handle_request($r);
# Commit the database transactions.
commit(1);

Bric::Util::Pref->use_user_prefs(0);
};

IIRC, we did this so that the dhandlers wouldn't have to, as we have
a lot of dhandlers and, at the time we wrote Bricolage, served
nothing but HTML. So could this be causing the problem? I'd like to
keep the setting of the content-type central like this, if possible.

Thanks,

David
Dave Rolsky
2007-08-20 14:55:09 UTC
Permalink
Well, we set the content type in Apache, rather than Mason. Our Apache
eval {
# Prevent browsers from caching pages.
$r->no_cache(1);
# Set up the language and content type headers.
$r->content_languages([$lang_name]);
$r->content_type('text/html; charset=' . lc $char_set);
Bric::Util::Pref->use_user_prefs(1);
# Start the database transactions.
begin(1);
# Handle the request.
$status = $ah->handle_request($r);
# Commit the database transactions.
commit(1);
Bric::Util::Pref->use_user_prefs(0);
};
IIRC, we did this so that the dhandlers wouldn't have to, as we have a lot of
dhandlers and, at the time we wrote Bricolage, served nothing but HTML. So
could this be causing the problem? I'd like to keep the setting of the
content-type central like this, if possible.
Hmm, this is weird. I don't see how this would've worked before, since
Mason would've overridden the content type when the request was for a
directory. And I definitely don't see why this would break after upgrading
Mason to 1.36, since that hasn't fundamentally changed.


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/
David E. Wheeler
2007-08-21 03:53:07 UTC
Permalink
Post by Dave Rolsky
Hmm, this is weird. I don't see how this would've worked before, since
Mason would've overridden the content type when the request was for a
directory. And I definitely don't see why this would break after upgrading
Mason to 1.36, since that hasn't fundamentally changed.
So how can we globally set it again?

David
Dave Rolsky
2007-08-21 19:00:53 UTC
Permalink
Post by David E. Wheeler
Post by Dave Rolsky
Hmm, this is weird. I don't see how this would've worked before, since
Mason would've overridden the content type when the request was for a
directory. And I definitely don't see why this would break after upgrading
Mason to 1.36, since that hasn't fundamentally changed.
So how can we globally set it again?
Globally set the content type, you mean?

My point was simply that while I don't understand how this code worked
with Mason 1.35, there's no reason it shouldn't continue to work with 1.36
;)


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/
David E. Wheeler
2007-08-21 23:42:46 UTC
Permalink
Post by Dave Rolsky
Post by David E. Wheeler
So how can we globally set it again?
Globally set the content type, you mean?
Yes.
Post by Dave Rolsky
My point was simply that while I don't understand how this code
worked with Mason 1.35, there's no reason it shouldn't continue to
work with 1.36 ;)
Um, the reverse is true, as David Beaudet pointed out.

Best,

David
Scott Lanning
2007-08-22 14:02:44 UTC
Permalink
Post by Dave Rolsky
My point was simply that while I don't understand how this code worked
with Mason 1.35, there's no reason it shouldn't continue to work with 1.36
Mason checks twice in HTML::Mason::ApacheHandler whether to send headers
during the request in Bricolage:
1) in the _set_mason_req_out_method, where it builds the out_method sub,
for mod_perl1 it sends headers out and saves whether they've been
sent in a $sent_headers closure/static variable
2) in exec, one condition for calling $r->send_http_header
is !HTML::Mason::ApacheHandler::http_header_sent($r).
This is where the second header comes out.
http_header_sent does this

sub http_header_sent { shift->headers_out->{"Content-type"} }

to supposedly determine if headers have been sent.

Mason version 1.35 behavior, Mason does $r->content_type(undef),
Dumper($r->headers_out):

$VAR1 = bless( {
'Set-Cookie' => 'BRICOLAGE_AUTH=....',
'Pragma' => 'no-cache',
'Cache-control' => 'no-cache',
'Connection' => 'close',
'Transfer-Encoding' => 'chunked',
'Content-Type' => 'text/html; charset=utf-8',
'Content-Language' => 'en_us',
'Expires' => 'Wed, 22 Aug 2007 13:11:21 GMT'
}, 'Apache::Table' );

Content-Type is 'text/html; charset=utf-8',
so http_header_sent returns a true value.


Mason version 1.36 behavior, Mason does $r->content_type(''),
Dumper($r->headers_out):

$VAR1 = bless( {
'Set-Cookie' => 'BRICOLAGE_AUTH=....',
'Pragma' => 'no-cache',
'Cache-control' => 'no-cache',
'Connection' => 'close',
'Transfer-Encoding' => 'chunked',
'Content-Type' => '',
'Content-Language' => 'en_us',
'Expires' => 'Wed, 22 Aug 2007 13:12:49 GMT'
}, 'Apache::Table' );

Content-Type is '',
so http_header_sent returns a false value.

In Apache, src/modules/perl/Apache.xs has

char *
content_type(r, ...)
Apache r

CODE:
get_set_PVp(r->content_type,r->pool);

OUTPUT:
RETVAL

where get_set_PVp is in src/modules/perl/mod_perl_xs.h

#define get_set_PVp(thing,p) \
RETVAL = (char*)thing; \
if(items > 1) \
thing = (char*)(SvOK(ST(1)) ? pstrdup(p, SvPV(ST(1),na)) : NULL)

So passing undef or an empty string to $r->content_type
are apparently two different things, undef doing nothing,
empty string overwriting.
(Though I admit I'm a little confused. I think items == 2,
which is greater than 1, in both cases. So SvOK(ST(1)) will return
true if it's an empty string, false if it's undef, but don't those
both set `thing', which is a pointer to r->content_type ?)
David E. Wheeler
2007-08-22 16:51:25 UTC
Permalink
Post by Scott Lanning
So passing undef or an empty string to $r->content_type
are apparently two different things, undef doing nothing,
empty string overwriting.
So, to translate, is this a Mason bug? Or a Mason feature? Does it
need to be fixed in Mason or in Bricolage?

Thanks,

David
Scott Lanning
2007-08-23 08:45:27 UTC
Permalink
Post by David E. Wheeler
Post by Scott Lanning
So passing undef or an empty string to $r->content_type
are apparently two different things, undef doing nothing,
empty string overwriting.
So, to translate, is this a Mason bug? Or a Mason feature? Does it
need to be fixed in Mason or in Bricolage?
Well, it seems like a Mason bug to me.

The handling of whether or not headers have been sent
appears to be broken in Mason, or at least sprained.
Presumably that needs cleaned up.
I pointed out the two different ways that Mason is checking
whether headers have been sent:
sometimes with a $sent_headers variable,
sometimes with a questionable http_header_sent function.

I guess Bricolage can implement an H::M::ApacheHandler subclass
that overrides the _request_fs_type function and uses
$r->content_type(undef) instead of $r->content_type('').
Though, I mean, that method has an underscore in front of it.
It's dirty thing, anyway, that a function that's
requesting the filesystem type has a side-effect
of resetting the Content-type header. Sometimes.
Bricolage isn't exactly immaculate itself, though. ;)
Dave Rolsky
2007-08-24 00:42:28 UTC
Permalink
Post by Scott Lanning
Well, it seems like a Mason bug to me.
The handling of whether or not headers have been sent appears to be
broken in Mason, or at least sprained. Presumably that needs cleaned up.
I pointed out the two different ways that Mason is checking whether
headers have been sent: sometimes with a $sent_headers variable,
sometimes with a questionable http_header_sent function.
There's a couple things going on here, I think.

First, I don't know that I entirely understand how this bug manifests, and
in particular I don't know why the changes from 1.35 to 1.36 would cause
it. Specifically, why would setting $r->content_type to '' instead of
undef be the trigger? Mason's (potentially broken) code to determine
whether headers were sent should work the same way in both of those cases.

That said, looking closely at how we determine if headers were sent, it
does seem pretty bogus, and I'd like to clean it up.

However, _before_ I do that I'd love to have a test case that demonstrates
the current bug.


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/
Beaudet, David P.
2007-08-24 02:59:37 UTC
Permalink
Post by Dave Rolsky
Specifically, why would setting $r->content_type to '' instead of
undef be the trigger? Mason's (potentially broken) code to determine
whether headers were sent should work the same way in both of those cases.
IMHO, this is definitely the crux of the issue. Apache accepts '' as the content_type which is evident in that it comes out as '' in the http headers when Mason's out method sends the headers the first time. However, as I pointed out in my original post, there's a check further down in the Mason code that query's Apache for the content type... Apache returns undef even though '' was sent for the content_type so Mason thinks no headers were sent and sends them again.
Scott Lanning
2007-08-24 14:03:42 UTC
Permalink
Post by Dave Rolsky
However, _before_ I do that I'd love to have a test case that
demonstrates the current bug.
I tested to see if there's a difference between $r->content_type('')
and $r->content_type(undef), but it seems those actually work
as expected.... So I made a test case in Mason,
as you will see here.

At bottom of /etc/apache-perl/httpd.conf :
-----

PerlModule HappyHandler
<Location /happy>
SetHandler perl-script
PerlHandler HappyHandler
</Location>

-----
Contents of /usr/local/lib/site_perl/HappyHandler.pm ,
which is in perl -le'print "@INC"' :
-----

package HappyHandler;
use strict;
use warnings;
use HTML::Mason::ApacheHandler;
use Apache::Constants qw(:common);

my $ah = HTML::Mason::ApacheHandler->new(
comp_root => '/var/www/happy',
data_dir => '/tmp/data',
decline_dirs => 0,
);

sub handler {
my ($r) = @_;

$r->content_type('text/html; charset=utf-8');
$ah->handle_request($r);

return OK;
}

1;

-----
Contents of /var/www/happy/autohandler :
-----

% $m->call_next();

-----
Contents of /var/www/happy/dhandler :
-----

dhandler: '<% defined($r->content_type) ? $r->content_type : 'undef' %>'

-----


With HTML::Mason 1.36, this is the output at
http://localhost/happy :
-----

dhandler: ''
HTTP/1.1 200 OK
Date: Fri, 24 Aug 2007 13:53:19 GMT
Server: Apache/1.3.34 (Ubuntu) PHP/4.4.2-1build1 mod_perl/1.29
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive, Keep-Alive
Transfer-Encoding: chunked, chunked
Content-Type:

-----
The component has an empty string as the Content-type,
and an additional set of headers is output.
But if I apply the following patch,
-----

--- /usr/local/share/perl/5.8.7/HTML/Mason/ApacheHandler.pm~ 2007-08-22
12:15:31.000000000 +0200
+++ /usr/local/share/perl/5.8.7/HTML/Mason/ApacheHandler.pm 2007-08-24
15:54:00.000000000 +0200
@@ -951,7 +951,8 @@
my $is_dir = -d $r->filename;

if ($is_dir && !$self->decline_dirs) {
- $r->content_type('');
+# $r->content_type('');
+ $r->content_type(undef);
}
return $is_dir ? 'dir' : -f _ ? 'file' : 'other';
}

-----
then this is the output at http://localhost/happy :
-----

dhandler: 'undef'

-----

That is, the Content-type is undef, and no additional
set of headers is output. Q.E.D. :)
Scott Lanning
2007-08-24 14:19:41 UTC
Permalink
On Fri, 24 Aug 2007, Scott Lanning wrote:
[...]
Post by Scott Lanning
That is, the Content-type is undef, and no additional
set of headers is output. Q.E.D. :)
On second thought... it's only half the story, I guess,
because the Content-type should be 'text/html; charset=utf-8'
not undef. It's still true that extra headers aren't sent
when Content-type is set to undef.
Dave Rolsky
2007-08-27 16:32:02 UTC
Permalink
Post by Scott Lanning
[...]
Post by Scott Lanning
That is, the Content-type is undef, and no additional
set of headers is output. Q.E.D. :)
On second thought... it's only half the story, I guess,
because the Content-type should be 'text/html; charset=utf-8'
not undef. It's still true that extra headers aren't sent
when Content-type is set to undef.
Well, like I said, if you set the Content-type header before passing
control to Mason, _and_ the request is for a directory, I would expect
that Content-type setting to get erased.

I think this can all be improved by having Mason simply use $r->notes to
track whether or not it has sent headers.


-dave

/*===================================================
VegGuide.Org www.BookIRead.com
Your guide to all that's veg. My book blog
===================================================*/

Loading...