PyXWF: Workarounds, Cookies and Cleanup • zombofant.net
posted on December 3, 16:48, 2012 UTC by Jonas Wielicki

Workarounds

Browsers don’t work. That’s a truth all web developers arrive at once in their life.
We at PyXWF had this insight some time ago; that’s one of the reasons we wanted
a coherent framework abstracting most of what can happen to you as a content
or application developer.

Over the last months, in which PyXWF was mostly… tested (i.e. in production
use), some problems with browsers appeared and were fixed or worked around. In
the majority of cases these were not bugs in PyXWF but rather buggy behaviour
of browsers (and machine clients).

The first was related to MSIE 9, which claims (along with
a lot of other document types) to have support for
application/xhtml+xml, but is still unable to deal with it properly.
Thus, we have a workaround for that, by explicitly checking for that
specific user agent (UA).

I thought, hey, this is MSIE, we’re used to it being evil. Then, a few weeks
later, another report of a broken website flew in. It was related to chromium.
In fact, chromium ≤ 7 (admittedly, not the most recent version, but apparently
still present in some debian versions) prefers application/xhtml+xml, but is
unable to render it correctly. So another exception was added.

Well yeah, old browsers and such, such things happen. Then people complained
(hi!) that they could not properly link our blog posts on g+, namely those who were
linking René’s posts on his SoCiS participation. Turns out, g+ is unable to
deal with XHTML. Stackoverflow couldn’t help me out there and a bug report
at the only publicly available bug tracker didn’t help either. Thus,
one more exception was added (quite odd that g+ identifies as Firefox 8
though).

Phew. Things seemed to work for now. Then we started to try to use JavaScript
on a website served from PyXWF. Turns out, in Firefox, it doesn’t work. As it
failed at a getElementsByTagName call, I suspected bad things. Sadly,
I was indeed right. We had had some issues with browsers being unable to deal
with prefixed XHTML before, but it was okay for me to disable XHTML support for
some terminal browsers. But most recent Firefox versions? I felt quite bad about
it. As a hotfix, I of course disabled XHTML altogether, but with some
help from friendly SO guys, we came up with an XSLT transforming prefixed
XHTML to unprefixed XHTML, which is then
used for all user agents which we have not verified to work correctly with
prefixed XHTML.

Such stuff sucks, but it’s really most of what you’re dealing with when
developing a web framework which is supposed to work in all cases. Well, finally
it does indeed work correctly; we do not know of any bugs right now. If
anyone spots a malfunction, please send me an e-mail or report it over at
the github issue tracker for PyXWF (we accept bug reports which relate
to the zombofant.net website there too, for the sake of convenience).

Cookies!

It’s christmas time, that means cookies! Actually, no, because the cookie
support actually dates back to September. PyXWF now has a simple interface to
create and use cookies. It takes care of everything, including the encoding and
decoding of the cookie value. For safety reasons, all values are encoded as
modified base64. What’s missing right now is support for encoding raw binary
values: the encoding input is expected to be either utf-8 or a python
unicode string.

Otherwise the cookie support covers all features present in the
current specification, RFC 6265. As the values are modified base64, you
can put literally everything in a cookie; PyXWF will take care that you don’t
accidentally the whole spec.

WebStack support crippled

I think I already mentioned it, but working with WebStack isn’t that much fun
anymore. We’re mostly relying on our own WSGI adapter right now, which does the
things you’d expect and works properly with unicode. Also, I really had no
motivation to deal with the WebStack interface to get cookies working
as seamlessly as they do for the WSGI backend.

Host-based awesomeness

The <host: />-namespaced classes have been around for some time, but they have
been reworked into a single, mighty class. For those who don’t know, these
classes allow for host-based settings and redirects. You can, for example, tell
PyXWF to redirect users typing www.zombofant.net to zombofant.net
(we’re doing that) using a permanent (301 Moved Permanently) redirect.

You can do even more magic. In the new version, it’s possible to set up host
pairs and set up “mobileness” based on the host which is used. Here’s a snippet
from the sitemap xml we’re using for some site, hostnames stripped. It
features an automatic redirect to m.example.com for known mobile clients,
but also allows overriding this behaviour if the client subsequently navigates
to the main host name. This is useful because it allows automatic redirects to the
appropriate version when users follow, for example, a link received through instant
messaging, while still letting users decide which version they want to use.

So let’s go through the code, just to show how easy it is to do complex things
with PyXWF. First of all, order matters here. The values from
<host:mobileness /> are used immediately upon declaration of any directive
which needs them. This means that you have to declare
<host:mobileness /> before any directives needing it, inside the
<tweaks /> node of the sitemap.

So we define the mobileness values for host-based mobileness detection.

<host:mobileness>
    <host:name mobile="true">m.example.com</host:name>
    <host:name mobile="false">example.com</host:name>
</host:mobileness>

On its own, this block does nothing but set some values. You won’t see a
difference. Redirects and everything else still have to be set up. We first set up
redirects to force users and search engines onto the non-www version:

<host:redirect src="www.example.com" dest="example.com"
    method="permanent" />
<host:redirect src="www.m.example.com" dest="m.example.com"
    method="permanent" />
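
To make the effect of these two directives a bit more tangible, here is a rough sketch of what a permanent host redirect boils down to at the WSGI level. This is not PyXWF code: the function name `host_redirect_middleware` and the table `PERMANENT_REDIRECTS` are made up for illustration, and the sketch assumes the redirect keeps path and query string and drops any port.

```python
from urllib.parse import quote

# Hypothetical table mirroring the two <host:redirect /> directives above.
PERMANENT_REDIRECTS = {
    "www.example.com": "example.com",
    "www.m.example.com": "m.example.com",
}


def host_redirect_middleware(app, redirects=PERMANENT_REDIRECTS):
    """Wrap a WSGI app; requests for a source host get a 301 to its destination."""

    def wrapper(environ, start_response):
        # Drop an optional port and normalise case before the lookup.
        host = environ.get("HTTP_HOST", "").split(":", 1)[0].lower()
        dest = redirects.get(host)
        if dest is None:
            # Not a redirected host: hand the request to the wrapped app.
            return app(environ, start_response)

        # Rebuild the URL on the destination host, keeping path and query.
        scheme = environ.get("wsgi.url_scheme", "http")
        path = quote(environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")) or "/"
        location = "{}://{}{}".format(scheme, dest, path)
        if environ.get("QUERY_STRING"):
            location += "?" + environ["QUERY_STRING"]

        start_response("301 Moved Permanently",
                       [("Location", location), ("Content-Length", "0")])
        return [b""]

    return wrapper
```

Wrapping an application with `host_redirect_middleware(app)` leaves requests for example.com untouched and answers requests for www.example.com with a 301 pointing at the same path on example.com.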

Now we redirect mobile/desktop users to the respective version of the site, but
only if a cookie we set afterwards is not found in the request. This allows
users to pick the version they want to see afterwards:

<host:mobile-redirect cookie="mobile-redirect">
    <host:pair
        first="m.example.com"
        second="example.com" />
</host:mobile-redirect>
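
The decision behind such a mobile redirect can be sketched as a small pure function. Again, this is only an illustration under assumptions, not the PyXWF implementation: `mobile_redirect_target`, `MOBILENESS` and `PAIRS` are hypothetical names, and the sketch assumes that the mere presence of the cookie suppresses the redirect and that the caller sets the cookie when it answers the request.

```python
# Hypothetical data mirroring the <host:mobileness /> and <host:pair /> declarations.
MOBILENESS = {"m.example.com": True, "example.com": False}
PAIRS = [("m.example.com", "example.com")]


def mobile_redirect_target(host, client_is_mobile, has_cookie,
                           pairs=PAIRS, mobileness=MOBILENESS):
    """Return the host to redirect to, or None to serve the request as-is."""
    if has_cookie:
        # Assumption: the cookie means the client was handled before, so a
        # deliberate visit to the "other" host is respected.
        return None
    for first, second in pairs:
        if host not in (first, second):
            continue
        # The order inside a pair does not matter; pick the partner host.
        other = second if host == first else first
        # Redirect only if this host does not fit the client but the partner does.
        if (mobileness.get(host) != client_is_mobile
                and mobileness.get(other) == client_is_mobile):
            return other
        return None
    return None
```

With this, a mobile client without the cookie asking for example.com is sent to m.example.com, while the same client with the cookie stays where it is.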

There can also be multiple <host:pair /> sets here. The order inside the
pair does not matter. The “mobileness” of the host is extracted from the
declaration we made at first.

<host:force-mobile />

This last directive is important to override the value of IsMobileClient
on the Context instance which is handed over to the transforms, so that
they’ll render correctly for each host. The default value of IsMobileClient
is extracted from the user agent string. This directive makes the host class
override that value if a host-based mobileness has been set up before.

Cleanup

Caches

Some tweaks have been made to caching. Firstly, there’s now some logging
available on cache activities. This is quite helpful if the performance isn’t
as good as expected and a cache limit is in place. A limit low enough to
make PyXWF un-cache a required transform on each request can be quite costly.

Secondly, more things are now held in the cache instead of statically stored in
objects. This mainly refers to the TransformNodes, which previously kept
a static reference to their transform result. That can become quite expensive
in terms of memory if you have a site which creates most of its pages using
TransformNodes (I do).

Use plugins for core functionality

I love modularity. This is why the main settings of PyXWF (including
<compatibility /> and <performance />, for example) are
now implemented using a TweakSitleton plugin (which is always loaded
implicitly, no need to reference it in the <plugins /> node). Note that the
referenced commit does not show the final state of the plugin and that this
moved a lot of code out of the Site class, which makes it a lot more
readable.

XSLT instead of python loops

Even more readability was gained when we
moved from python loops to an XSLT to perform the transformations of
attributes and nodes in the <py: /> namespace. This saves us a lot of
in-python loops over a whole document tree, and allows for more parallelization
(as lxml releases the GIL when doing such stuff). Also, we can now extend the
functionality more easily. Of course, there is a bunch of tests to make sure
the transform does exactly the same thing as the old in-python transform did
(and that the additional features work).

Script inclusion and other meta-tags

There have been several problems while trying to include JavaScript files in
templates which were not the top-level template. This is obviously not optimal,
which is why I fixed it. Even worse, it took something like
four commits to fix. This was related to the fact that I often
forgot that ElementTree nodes cannot have multiple parents, thus more copying
is required. Also, the handling of html-namespaced nodes in <py:meta /> tags
was not optimal and pretty restrictive.
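
The copying point is easy to trip over, so here is a small sketch of it. In lxml, appending an element that already has a parent moves it out of its old place. The standard library's xml.etree.ElementTree (used below so the snippet runs without extra packages) does not track parents and instead lets two trees share one object, which is just as surprising. In both cases the fix is to give each tree its own copy. The helper `include_script` and the element names are hypothetical, not taken from PyXWF.

```python
import copy
import xml.etree.ElementTree as ET


def include_script(head, script):
    """Append an independent copy of `script` to `head` and return the copy."""
    clone = copy.deepcopy(script)
    head.append(clone)
    return clone


# One script node that two templates (outer and nested) both want to include.
script = ET.Element("script", {"src": "/js/site.js"})
outer_head = ET.Element("head")
inner_head = ET.Element("head")

outer_copy = include_script(outer_head, script)
inner_copy = include_script(inner_head, script)

# Changing one copy leaves the other copy and the original untouched.
inner_copy.set("src", "/js/page.js")
```

Each head ends up with its own script element, so later modifications in one template cannot leak into another.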

Minor other fixes and features

Another crumb class: StaticCrumb, which only renders a static document.
A whole bunch of unittests against all those tiny bugs I squashed.
PEP8-iness
Clearer error messages in some places.
More HTTP compliance (we now set the Allow header correctly)

And finally, just for fun, the diffstat since the last blogpost (commit
approx. 7842f826):

77 files changed, 4542 insertions, 2465 deletions

tags: PyXWF, hacking, python
The content on this page is licensed under CC-BY-SA.