Discussion:
More conf directive issues
Dave Rolsky
2000-12-22 21:59:14 UTC
Ok, I've got code together that should implement conf directives for all
of the parser, interp & apachehandler options.

One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).

However, it occurred to me that it might be good to think about some other
directives for things that are currently done in a handler.pl file.

For example:

MasonTextOnly

would reject all requests for non-text content.

We could even go crazy and do something like:

MasonSession File

although that may be getting into territory better left alone.


-dave

/*==================
www.urth.org
We await the New Sun
==================*/
Jonathan Swartz
2001-01-02 19:15:09 UTC
Post by Dave Rolsky
Ok, I've got code together that should implement conf directives for all
of the parser, interp & apachehandler options.
Cool.
Post by Dave Rolsky
One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).
Hm, too bad there isn't a non-XS version. I'd certainly trade compatibility
for efficiency here.

In the absence of Text::CSV_XS, everything except for your conf directives
should still be made to work.
Post by Dave Rolsky
However, it occurred to me that it might be good to think about some other
directives for things that are currently done in a handler.pl file.
I can't think of a reason to have httpd.conf directives that don't
correspond directly with Mason commands or Interp/ApacheHandler parameters.
If this means implementing new commands or parameters, so be it.

Suppose, for example, we implement a series of convenient "MasonSession"
directives that effectively replace a block of code with a single line.
There's no reason why this convenience and power shouldn't be available to
handler.pl developers.
Post by Dave Rolsky
MasonTextOnly
would reject all requests for non-text content.
Perhaps, though note that this can be done completely with Apache
directives, e.g.

<FilesMatch "\.(html|tpl|txt)$">
  SetHandler perl-script
  PerlHandler HTML::Mason
</FilesMatch>

This may actually be preferred, as it forces people to think about file
extensions and leads down the path to separating top-level/non-top-level
components by extension. I'm increasingly seeing this as an important
stylistic practice. Perhaps the documentation should change to favor (or at
least include) this method.

Jon
Dave Rolsky
2001-01-02 21:33:48 UTC
Post by Jonathan Swartz
Post by Dave Rolsky
One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).
Hm, too bad there isn't a non-XS version. I'd certainly trade compatibility
for efficiency here.
There is a Text::CSV but it is not maintained.

I've used Text::CSV_XS on both Linux and Solaris, at least.
Post by Jonathan Swartz
In the absence of Text::CSV_XS, everything except for your conf directives
should still be made to work.
That's no problem.
Post by Jonathan Swartz
Suppose, for example, we implement a series of convenient "MasonSession"
directives that effectively replace a block of code with a single line.
There's no reason why this convenience and power shouldn't be available to
handler.pl developers.
Good point.


-dave

/*==================
www.urth.org
We await the New Sun
==================*/
Gordon Henriksen
2001-01-03 00:52:40 UTC
Post by Jonathan Swartz
Post by Dave Rolsky
One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).
Hm, too bad there isn't a non-XS version. I'd certainly trade compatibility
for efficiency here.
There is a Text::CSV that doesn't use XS. It doesn't appear to have quite
the configurability of Text::CSV_XS, though.

It also seems extremely heavyweight (all sorts of pointless autoloading
and otherwise bloat) for something that could easily be accomplished with
a regular expression and a loop. I don't see any reason whatsoever to
introduce a dependency to get this:

sub split_value_list {
    # Localize $_ so the caller's copy is not clobbered.
    local $_ = $_[0];

    my @values;
    while (1) {
        if (m(\G
              (?:
                  ( \z )                           # EOS.
                | ( [\s,]+ )                       # Whitespace and commas.
                | "( [^"\\]* (?:\\.[^"\\]*)* )"    # A quoted string.
                | ( (?:[^\s",\\]+|\\.)+ )          # A word.
              )
             )gx)
        {
            my($eos, $ignore, $quoted, $word) = ($1, $2, $3, $4);

            if (defined $eos) {
                last;
            } elsif (defined $ignore) {
                next;
            } elsif (defined $quoted) {
                # Words and quoted strings can be treated the same.
                $word = $quoted;
            }

            # Replace each escape pair with the escaped character.
            $word =~ s/\\(.)/$1/g;
            push @values, $word;
        } else {
            die "Malformed list.\n";
        }
    }

    return @values;
}

ex:

'a "e" f "gh"' => ("a", "e", "f", "gh")

'' => ()

',,,' => ()

'/usr/local/bugzilla/apache, /home/dev/mason, '
. '"/Users/Gordon/\\\"Web Site\"\\"
=> ("/usr/local/bugzilla/apache",
"/home/dev/mason",
q(/Users/Gordon/\"Web Site"\))
--
Gordon Henriksen
***@actifunds.com
Gordon Henriksen
2001-01-03 20:28:04 UTC
Post by Jonathan Swartz
Er. I've got no preference, whichever Dave wants to do. It is nice to
lose the XS dependency. I only suggested Text::CSV to save us having
to write the code (which I knew to be non-trivial) and possibly go
through a debug cycle. The former is moot, since you've already
written it; the latter depends on how confident we are that we've
gotten every case.
Did you get this code from somewhere, or construct it yourself?
It's my own.
Post by Jonathan Swartz
Why did you decide to allow both whitespace and commas to separate
elements, btw? That seems like TMTOWTDI without much benefit. I'd be
happy with either whitespace or commas as separators. Apache's
standard seems to be whitespace, so maybe that's the right way to
go...
So any configuration files that were written against the Text::CSV_XS
version would continue to work and the expected httpd.conf behavior (no
commas) would also work. To remove comma delimiters:

- | ( [\s,]+ ) # Whitespace and commas.
+ | ( \s+ ) # Whitespace.
| "( [^"\\]* (?:\\.[^"\\]*)* )" # A quoted string.
- | ( (?:[^\s",\\]+|\\.)+ ) # A word.
+ | ( (?:[^\s"\\]+|\\.)+ ) # A word.
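
Applied to the whole subroutine, the change above might look like the sketch below. The name split_ws_list is made up for illustration, and the sketch matches against a lexical variable rather than $_; it is not necessarily what ended up in Mason.

```perl
use strict;
use warnings;

# Whitespace-only variant: commas are ordinary word characters here,
# as they are in Apache's own directive syntax.
sub split_ws_list {
    my ($string) = @_;
    my @values;

    while (1) {
        if ($string =~ m{\G
                (?:
                    ( \z )                           # End of string.
                  | ( \s+ )                          # Whitespace.
                  | "( [^"\\]* (?:\\.[^"\\]*)* )"    # A quoted string.
                  | ( (?:[^\s"\\]+|\\.)+ )           # A word.
                )
            }gx)
        {
            my ($eos, $ignore, $quoted, $word) = ($1, $2, $3, $4);

            last if defined $eos;
            next if defined $ignore;

            # Words and quoted strings are unescaped the same way.
            $word = $quoted if defined $quoted;
            $word =~ s/\\(.)/$1/g;
            push @values, $word;
        }
        else {
            die "Malformed list.\n";
        }
    }

    return @values;
}

# prints: foo/comp1|a b|c,d
print join('|', split_ws_list('foo/comp1 "a b" c,d')), "\n";
```

Note that "c,d" now comes back as a single value, so a config written with comma separators would silently change meaning under this variant.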

--

Gordon Henriksen
Jonathan Swartz
2001-01-03 19:19:50 UTC
Er. I've got no preference, whichever Dave wants to do. It is nice to lose
the XS dependency. I only suggested Text::CSV to save us having to write
the code (which I knew to be non-trivial) and possibly go through a debug
cycle. The former is moot, since you've already written it; the latter
depends on how confident we are that we've gotten every case.

Did you get this code from somewhere, or construct it yourself?

Why did you decide to allow both whitespace and commas to separate elements,
btw? That seems like TMTOWTDI without much benefit. I'd be happy with either
whitespace or commas as separators. Apache's standard seems to be
whitespace, so maybe that's the right way to go...
Dave Rolsky
2001-01-03 21:38:15 UTC
Post by Gordon Henriksen
It also seems extremely heavyweight (all sorts of pointless autoloading
and otherwise bloat) for something that could easily be accomplished with
a regular expression and a loop. I don't see any reason whatsoever to
My concern is that your code does not really handle all the possibilities.

To pick one obvious case, you don't allow for single quotes as a quote
delimiter. Since Apache expects double quotes around longer strings and
will strip them by itself, users may want to use single quotes.

I realize that you could tweak the code to handle this fairly simply. My
concern is that handling CSV type text flexibly and properly is not
trivial and it's easy to not realize you've missed something.

OTOH, if you can write a spec of exactly what your code does and does not
handle and people agree that that is good enough then I'm happy to use it.


-dave

/*==================
www.urth.org
We await the New Sun
==================*/
Gordon Henriksen
2001-01-03 23:01:30 UTC
Post by Dave Rolsky
Post by Gordon Henriksen
It also seems extremely heavyweight (all sorts of pointless autoloading
and otherwise bloat) for something that could easily be accomplished with
a regular expression and a loop. I don't see any reason whatsoever to
My concern is that your code does not really handle all the possibilities.
To pick one obvious case, you don't allow for single quotes as a quote
delimiter. Since Apache expects double quotes around longer strings and
will strip them by itself, users may want to use single quotes.
Most likely. The code wasn't meant to be taken as gospel. Change the four
double quotes to single quotes and that's done. CSV[_XS] only allow for
one quoting character, which defaults to be a double quote; I used the
same.
Post by Dave Rolsky
OTOH, if you can write a spec of exactly what your code does and does not
handle and people agree that that is good enough then I'm happy to use it.
The subroutine handles an arbitrary sequence of words, quoted
strings, and whitespace.

A word may contain anything but whitespace or quotes.

A quoted string may contain anything but a quote.

Whitespace was defined to include commas.

Characters within words and strings can be escaped with backslashes.
Escape pairs are simply replaced with the escaped character, so the only
characters which are meaningful to escape are " and \.

'word, word, word' -> ("word", "word", "word")
'word word word' -> ("word", "word", "word")
'"word" "word" "word"' -> ("word", "word", "word")
'"\r\n", \ \, ' -> ("rn", " ,")

The only weird-looking thing this does is to not require whitespace at the
boundaries of quoted strings.

'word"quoted"' -> ("word", "quoted")
'"word""word"' -> ("word", "word")

Its only error cases are unbalanced quotes and escapes at EOS:

'this is, too: \\' -> error!
'this is an error: "' -> error!

The specific quote and escape characters can be changed trivially; it's
only 20 lines of code, after all.
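
As a rough illustration of that last point, here is a sketch that takes the quote character as a parameter and runs some of the cases above as checks. make_splitter is a hypothetical name, and the token pattern is the one from the earlier post with the quote character interpolated.

```perl
use strict;
use warnings;

# Hypothetical helper: build a splitter for a given quote character.
sub make_splitter {
    my ($quote) = @_;
    my $q = quotemeta $quote;
    my $token = qr{\G
        (?:
            ( \z )                                  # End of string.
          | ( [\s,]+ )                              # Whitespace and commas.
          | $q ( [^$q\\]* (?:\\.[^$q\\]*)* ) $q     # A quoted string.
          | ( (?:[^\s$q,\\]+|\\.)+ )                # A word.
        )
    }x;

    return sub {
        my ($string) = @_;
        my @values;
        while ($string =~ /$token/g) {
            my ($eos, $ignore, $quoted, $word) = ($1, $2, $3, $4);
            return @values if defined $eos;
            next if defined $ignore;
            $word = $quoted if defined $quoted;
            $word =~ s/\\(.)/$1/g;    # Unescape \x to x.
            push @values, $word;
        }
        # No token matched at the current position.
        die "Malformed list.\n";
    };
}

my $split = make_splitter('"');

my @cases = (
    [ 'word, word, word'     => [qw(word word word)] ],
    [ 'word word word'       => [qw(word word word)] ],
    [ '"word" "word" "word"' => [qw(word word word)] ],
    [ 'word"quoted"'         => [qw(word quoted)] ],
    [ '"word""word"'         => [qw(word word)] ],
);
for my $case (@cases) {
    my ($input, $expected) = @$case;
    my @got = $split->($input);
    printf "%s - %s\n", ("@got" eq "@$expected" ? 'ok' : 'not ok'), $input;
}

# Unbalanced quotes and a trailing escape should both die.
for my $bad ('unbalanced "quote', 'trailing escape \\') {
    my $lived = eval { $split->($bad); 1 };
    printf "%s - rejects: %s\n", ($lived ? 'not ok' : 'ok'), $bad;
}
```

Calling make_splitter("'") gives the single-quote behaviour Dave asked about without touching the pattern by hand.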
--
Gordon Henriksen
***@actifunds.com
Ken Williams
2001-01-03 23:19:00 UTC
I haven't chimed in on this discussion yet, but I have a question. Why is it
necessary to put CSV code in Mason? Where is it used?
Post by Dave Rolsky
One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).
Wouldn't that be better implemented as repeated directives? I imagine you mean
things like:

PerlSetVar MasonPreload "'foo/comp/comp1', 'foo/comp/comp2'"

Which could be done much more clearly, and without new work, via

PerlSetVar MasonPreload foo/comp/comp1
PerlAddVar MasonPreload foo/comp/comp2
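
On the Perl side, reading those repeated values back could be as small as the sketch below. It assumes a mod_perl 1.x request object whose dir_config(), called with no arguments, returns a table whose get() yields every value for a key in list context. config_list, MockTable and MockRequest are made-up names; the mocks exist only so the sketch runs outside Apache.

```perl
use strict;
use warnings;

# Return every value set for $name via PerlSetVar/PerlAddVar.
# Assumption: $r->dir_config->get($name) returns all values in list
# context, as the mod_perl 1.x table interface is documented to do.
sub config_list {
    my ($r, $name) = @_;
    my @values = $r->dir_config->get($name);
    return @values;
}

# Stand-ins for the Apache objects (hypothetical, for illustration only).
package MockTable;
sub new { my ($class, %vars) = @_; return bless { %vars }, $class }
sub get { my ($self, $key) = @_; return @{ $self->{$key} || [] } }

package MockRequest;
sub new { my ($class, $table) = @_; return bless { table => $table }, $class }
sub dir_config { return $_[0]{table} }

package main;

my $r = MockRequest->new(
    MockTable->new(MasonPreload => [ 'foo/comp/comp1', 'foo/comp/comp2' ]),
);
print "$_\n" for config_list($r, 'MasonPreload');
```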

I just hate to see us going down a road I'm not convinced we need to
travel. If the needs are more complicated than the above, maybe this
stuff needs to be done in an httpd.conf <Perl> section, or in a
startup.pl that just sets some values.

Perhaps an example would help convince me.


------------------- -------------------
Ken Williams Last Bastion of Euclidity
***@forum.swarthmore.edu The Math Forum
Gordon Henriksen
2001-01-03 23:36:56 UTC
Post by Ken Williams
I haven't chimed in on this discussion yet, but I have a question. Why is it
necessary to put CSV code in Mason? Where is it used?
Post by Dave Rolsky
One note: This code requires Text::CSV_XS to correctly parse
comma-separated lists (for the preloads option, for example).
Wouldn't that be better implemented as repeated directives? I imagine you mean
PerlSetVar MasonPreload "'foo/comp/comp1', 'foo/comp/comp2'"
Which could be done much more clearly, and without new work, via
PerlSetVar MasonPreload foo/comp/comp1
PerlAddVar MasonPreload foo/comp/comp2
I just hate to see us going down a road I'm not convinced we need to
travel. If the needs are more complicated than the above, maybe this
stuff needs to be done in an httpd.conf <Perl> section, or in a
startup.pl that just sets some values.
Perhaps an example would help convince me.
I was wondering about this as well.
--
Gordon Henriksen
***@actifunds.com
Dave Rolsky
2001-01-04 00:04:23 UTC
Post by Ken Williams
I just hate to see us going down a road I'm not convinced we need to
travel. If the needs are more complicated than the above, maybe this
stuff needs to be done in an httpd.conf <Perl> section, or in a
startup.pl that just sets some values.
Perhaps an example would help convince me.
Doing it on one line was just what I thought of first. I actually prefer what
you proposed to long lists.

-dave

/*==================
www.urth.org
We await the New Sun
==================*/
Jonathan Swartz
2001-01-04 01:32:06 UTC
What about when we need to represent hashes, such as a comp_root value with
multiple component roots?

Again, I don't care what representation we pick as long as we stick to it
for all future hashes.
Dave Rolsky
2001-01-04 01:53:58 UTC
Post by Jonathan Swartz
What about when we need to represent hashes, such as a comp_root value with
multiple component roots?
maybe this:

PerlSetVar MasonCompRoot foo => /foo/bar
PerlAddVar MasonCompRoot bar => /bar/foo

I have no idea what this ends up looking like when I call
Apache->dir_config though.


-dave

/*==================
www.urth.org
We await the New Sun
==================*/
Dave Rolsky
2001-01-04 17:46:18 UTC
Post by Ken Williams
Which could be done much more clearly, and without new work, via
PerlSetVar MasonPreload foo/comp/comp1
PerlAddVar MasonPreload foo/comp/comp2
Ok, I've changed the code to handle this. The other option that may need
to take a list is MasonCompRoot, which can be one of two things.

PerlSetVar MasonCompRoot /foo/bar

...or....

PerlSetVar MasonCompRoot "foo_root => /foo/bar"
PerlAddVar MasonCompRoot "baz_root => /bar/quux"

This seems clear enough to use. I can't imagine many paths include
\s+=>\s+ in them.
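
A minimal sketch of how the two forms could be told apart, assuming the raw directive values arrive as a plain list of strings. parse_comp_root is a hypothetical name, and the return shape (a single path, or a reference to a list of [name, path] pairs) is an assumption about what the caller wants.

```perl
use strict;
use warnings;

# Hypothetical helper: turn raw MasonCompRoot values into either a
# single path or a reference to a list of [name, path] pairs.
sub parse_comp_root {
    my @values = @_;
    die "MasonCompRoot is not set\n" unless @values;

    # A lone value without "=>" is a plain component root.
    return $values[0] if @values == 1 && $values[0] !~ /\s+=>\s+/;

    my @roots;
    for my $value (@values) {
        my ($name, $path) = split /\s+=>\s+/, $value, 2;
        die "Expected 'name => path', got '$value'\n"
            unless defined($path) && length($name) && length($path);
        push @roots, [ $name, $path ];
    }
    return \@roots;
}

my $roots = parse_comp_root('foo_root => /foo/bar', 'baz_root => /bar/quux');
print "$_->[0] -> $_->[1]\n" for @$roots;
```

Mixing the two forms (one bare path plus one named root) is treated as an error here rather than guessed at.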


-dave

/*==================
www.urth.org
We await the New Sun
==================*/
