[WM] [Fwd: Improper conversion from &amp; to &]
jlm17
jlm17 at lucent.com
Fri Apr 30 20:37:48 IST 2004
Ok, I found where this is happening. It is in the final HTML cleanup, and the use of HTML::Parser.
From the HTML::Parser documentation:
$p->attr_encoded
$p->attr_encoded( $bool )
By default, the "attr" and @attr argspecs will have general entities
for attribute values decoded. Enabling this attribute leaves
entities alone.
So in the new() function of HTMLCleaner.pm I added this line:
$self->attr_encoded(1);
I'm not sure what the best place for this really is.
Let me know if you want a diff or patch or the actual file with the one line added.
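For anyone who wants to see the effect in isolation, here is a minimal sketch. It does not use HTMLCleaner.pm at all; it assumes only a stock HTML::Parser (version 3 API), and href_seen() is a hypothetical helper made up for this illustration. It parses one tag and reports the href value as the start handler receives it, with attr_encoded off and then on.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not part of WebMake): parse a snippet and return the
# href value of the last <a> tag exactly as the start handler receives it.
sub href_seen {
    my ($html, $keep_encoded) = @_;
    my $href;
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tagname, $attr) = @_;
                $href = $attr->{href} if $tagname eq 'a';
            },
            'tagname, attr',
        ],
    );
    # False (the default): entities in attribute values are decoded.
    # True: attribute values are passed through untouched.
    $p->attr_encoded($keep_encoded);
    $p->parse($html);
    $p->eof;
    return $href;
}

my $html = '<a href="foo.cgi?chapter=1&amp;section=2">test</a>';
print "default:      ", href_seen($html, 0), "\n";
print "attr_encoded: ", href_seen($html, 1), "\n";
```

With the default setting the handler should see a bare & in the href, which is the behavior reported in this thread; with attr_encoded(1) the &amp; should come through unchanged. Whether new() is the right place to set it inside HTMLCleaner.pm is still an open question.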
jlm17 wrote:
> I didn't look at the EtText code. I'm not using EtText. That made me
> think, I'm not putting format="text/html" in my <content> tags, and that
> maybe it was defaulting to "text/et". So I have a small example, and I
> put the format="text/html" into it and it still converts &amp; to &. For
> reference here is what I am doing:
>
> <webmake>
> <content name=dud format="text/html">
> <html><body>
> <a href="http://nowhere.com/nofile.pl?&amp;htqdb">test</a>
> </body></html>
> </content>
> <out name="dud" file="dud2.html">
> ${dud}
> </out>
>
> I'm running webmake through the perl debugger. I'll let you know if I
> find something.
>
> Robert Echlin wrote:
>
>> Thanks for that information, jlm.
>> I will watch for that.
>>
>> Did you check in the EtText code?
>>
>> Robert
>>
>> jlm17 wrote:
>>
>>> It appears that webmake is converting my &amp; entities into &. This
>>> is actually incorrect behavior:
>>>
>>>> Ampersands (&'s) in URLs
>>>>
>>>> Another common error occurs when including a URL which contains an
>>>> ampersand ("&"):
>>>>
>>>> <!-- This is invalid! --> <a href="foo.cgi?chapter=1&section=2">...</a>
>>>>
>>>> This example generates an error for "unknown entity section" because
>>>> the "&" is assumed to begin an entity. In many cases, browsers will
>>>> recover safely from the error, but the example used here will cause
>>>> the link to fail in Netscape 3.x (but not other versions of Netscape)
>>>> since it will assume that the author intended to write &sect;ion,
>>>> which is equivalent to §ion.
>>>
>>>>
>>>> To avoid problems with both validators and browsers, always use
>>>> &amp; in place of &:
>>>>
>>>> <a href="foo.cgi?chapter=1&amp;section=2">...</a>
>>>
>>>
>>>
>>>
>>> The above is from http://www.htmlhelp.com/tools/validator/problems.html
>>>
>>> So far I have been unlucky in finding out where in the webmake code
>>> this is actually happening.
>>> Everywhere that I see an &amp; in the code it is actually converting
>>> TO it instead of from it.
>>>
>>> _______________________________________________
>>> Webmake-talk mailing list
>>> Webmake-talk at taint.org
>>> http://webmake.taint.org/mailman/listinfo/webmake-talk
>>>
>>