Nuggets of hard-won programming experience, brain-dumped for Googlability.

Tuesday, March 27, 2012

Py3k Porting Notes

I've been meaning to write this down for a while, but the wider release of PEP404 finally spurred me into getting it done. I spent a bunch of time porting a medium-sized (~2.5kLoC) Python library to Python 3, using two different approaches, and there are a few tips that might have saved me time if I'd known them in advance.


For the first pass (2fecd513a4..cc08d9b081), I followed the recommended method of running 2to3, then fixed up the Python 2 source until the Py3k generated version passed all of the unit tests.

This was mostly straightforward, but tips that might have saved me time in this process are:

  • Understanding the old style versus new style import locations would have helped. But then I still don't really understand the wrinkles here.
    [2fecd513a4, e92f459593]
  • Get in the habit of using self.assertEqual rather than self.assertEquals in unit tests. [c5688b8003]
  • Doctests don't work well with 2to3. It's possible to get around this by contorting the results, but that rather spoils the point of doctests being simple and clear.
  • I never did figure out a good way to name/mark packages for the Cheese Shop
    where you've got a Python 2.x version and a generated Py3k version.
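
To make the two test-related tips concrete, here's a minimal sketch (not code from the library itself). The label function and LabelTest class are hypothetical names, and the doctest shows just one possible "contortion": printing the value rather than showing its repr, so the expected output doesn't depend on whether the interpreter adds a u'' prefix.

```python
import doctest
import unittest


def label(name):
    """Return a greeting for name (hypothetical example function).

    Printing the result, instead of displaying its repr, keeps the
    expected output identical whether or not the string is Unicode.

    >>> print(label("world"))
    hello world
    """
    return "hello " + name


class LabelTest(unittest.TestCase):
    def test_label(self):
        # assertEqual is the preferred spelling; assertEquals is a
        # deprecated alias for the same check.
        self.assertEqual(label("x"), "hello x")
```

The cost is visible even in this tiny case: the doctest no longer shows what the function returns, only what it prints.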

Dual Source

The original Python 2.x code and the generated Py3k code actually looked pretty similar, so the second pass (e334775b77..c90d133bc0) converted the codebase to a single set of source that could be used in both Python 2.x and Py3k.

This was rather trickier, and involved a few contortions. However, note that several of the contortions are because the minimum Python version I wanted to target is 2.5 – if the target had been >= 2.6, more of them could have been handled with from __future__ import.

  • Unicode string literals are a pain. u"a string" is illegal in Py3k, but "a string" isn't Unicode in Python 2.x. In the meantime, there's a u("string that can include \u0101") function (which might be slow, so the code is tuned to avoid using it at runtime).
    (Would be fixed for a target of >= 2.6 with from __future__ import unicode_literals.)
    [e334775b77, 1897a0a30c, a49b35ead4, 59f4e2fef0, 5e6f1e2447, e019a990aa, b4e6759e67, 9246f92746, b4cb1e9246, 3ff89a806c, 319c33ed67]
  • Needed a prnt function to be used instead of print.
    (Would be fixed for a target of >= 2.6 with from __future__ import print_function.)
    [c15fd9cb8d, d32f784cb4]

  • Needed to use sys.exc_info to portably get at a caught exception.
    (Would be fixed for a target of >= 2.6 because the except ... as syntax is included there.)
  • There's no sys.maxint any more, and the suggested alternative
    (sys.maxsize) doesn't work on Python 2.5. So I just went for 65535.
    (Would be fixed for a target of >= 2.6 as sys.maxsize is included there.)
  • No L suffix for literal longs any more, so need a wrapper to force ints to longs.
  • Because the library includes auto-generated Python code, the generation tools needed to allow for Python 2.x/Py3k compatibility too. The new rpr function (sticking with disemvowelling as a way of naming things) converts generated code to use the u() utility.
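
For illustration, a u() helper along the lines described above could look like the sketch below. This is a guess at the general shape rather than the library's actual code; the PY3 flag and the GREETING constant are hypothetical names. It assumes that on Python 2.x the argument is a plain byte-string literal in which \uXXXX sequences are still unexpanded.

```python
import sys

# Hypothetical flag: True when running under Py3k.
PY3 = sys.version_info[0] >= 3

if PY3:
    def u(s):
        """Plain literals are already Unicode in Py3k, so return s as-is."""
        return s
else:
    def u(s):
        """Decode a byte-string literal, expanding \\uXXXX escapes."""
        return s.decode("unicode_escape")

# Calling u() once at import time keeps the (possibly slow) conversion
# out of frequently executed code paths.
GREETING = u("caf\u00e9")
```

One caveat with this approach: on Python 2.x any other backslash sequences in the literal get unescaped twice (once by the parser, once by the decode), so the sketch is only safe for simple literals.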

The net result of all of this is a small module that acts as a portability layer.
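
As a rough idea of what such a layer might contain (a sketch under assumptions, not the module from the commits above), the snippet below covers the print, exception, maxint and long items. Only prnt and the value 65535 come from the notes; PY3, MAXINT, to_long, caught and safe_div are hypothetical names, and everything is written to avoid syntax that Python 2.5 lacks.

```python
import sys

# Hypothetical flag: True when running under Py3k.
PY3 = sys.version_info[0] >= 3

# sys.maxint is gone in Py3k and sys.maxsize is missing on 2.5,
# so fall back to a fixed bound.
MAXINT = 65535

if PY3:
    def to_long(value):
        """Py3k ints are unbounded, so int() is enough."""
        return int(value)
else:
    def to_long(value):
        """Force a Python 2.x long, replacing the old L literal suffix."""
        return long(value)


def prnt(*args):
    """Write space-separated values plus a newline, like a basic print."""
    sys.stdout.write(" ".join([str(a) for a in args]) + "\n")


def caught():
    """Return the exception currently being handled, or None."""
    return sys.exc_info()[1]


def safe_div(a, b):
    """Example caller: neither 'except X, e' nor 'except X as e' needed."""
    try:
        return a / b
    except ZeroDivisionError:
        err = caught()
        prnt("division failed:", err)
        return None
```

Note that str() on non-ASCII Unicode text would fail under Python 2.x, so a real prnt would need more care than this sketch takes.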

Friday, August 12, 2011

Android Java Classloading Gotcha

TL;DR: If your Android app uses any external Java library that is not part of Android, then future Android versions may break your app in odd ways.

The main Android programming language is Java, and the Android OS comes with a whole collection of standard Java libraries, from the JRE and elsewhere. It's also straightforward to include other Java libraries in your Android application.

However, you'll get a problem if a future version of Android ever includes a version of the library that's embedded in your application – the Android classloader will always pull in the standard OS version of the library rather than your own.

What's more, the OS can bundle "silent" copies of libraries that don't appear as part of the advertised APIs. So there's no documentation that (say) Android 2.3 adds javax.sip to the system, or that Android 3.0 adds into the system namespace, at a higher priority than anything embedded in your app.

This means that your app can be working fine, and then fail miserably on the next version of Android for hard-to-debug reasons, because the OS now includes the library that your code depends on – but of a completely different version. All of a sudden, an .apk that works fine on Android version X fails with NoSuchMethodError on Android version X+1. (Here's an example of exactly this problem.)

This just seems like a straightforward bug in the Dalvik classloader to me, and as yet I've not found a good way round it. Others have suggested tools like JarJar to embed the library code more indivisibly, but I've not tried this yet. But of course doing that requires prescience: spotting in advance which Java library files might possibly get incorporated into Android in the future. So maybe the safest advice is to embed/rename any Java libraries that you pull in to your application.

[Update 2011-09-03: I've put source code for an Android app on GitHub that allows you to look at what libraries are available on the system, using Java reflection functionality. So at least it's easier to figure out when this problem is occurring.]

Monday, June 27, 2011

Android Custom Permissions

TL;DR: If your Android application can be used by other Android applications, don't use custom permissions to control this interaction.

If you're writing an Android application that exposes functionality that other applications can use, it's tempting to add your own custom permissions to control access to that functionality. By adding a <permission> tag to your AndroidManifest.xml, you can give a name and description for the permission, and can apparently leave it up to Android to handle the UI aspects of policing this permission.

But it's not that simple. Firstly, notice that the Android installer doesn't have any mechanism to indicate or enforce dependencies. The consequence of this is that an app that uses your BLOOP_FUNCTION might get installed before your app is.

The next thing to notice is that the Android package manager only exposes permissions to the user at the point an app is installed, and that unknown permissions are not presented to the user. This actually makes some sense – at this point, all the PackageManager knows is the internal name for the permission (e.g. com.example.myapp.ACCESS_BLOOP_FUNCTION), not anything user-visible.

If the user installs the other app, then later installs your app, then the user has never been asked to approve your custom permission, and PackageManager.checkPermission("com.example.myapp.ACCESS_BLOOP_FUNCTION","com.other.bloop_user") returns PERMISSION_DENIED. The permission is still visible if you manually enumerate the permissions of the other app (using PackageManager.getPackageInfo("com.other.bloop_user",GET_PERMISSIONS).requestedPermissions), but that doesn't really help – any attempt by the other app to access an activity or service that requires the permission will be rejected.

So the only hope is to try to detect the situation, and request that the user re-install the other application – hardly a palatable option.

Android Library Resources

TL;DR: Make sure any resources you define in an Android library project have globally unique names (e.g. by using a unique-prefix convention).

An Android library project is a "development project that holds shared Android source code and resources". But the resources are treated a bit differently than the source code: any project that uses the library gets the contents of LibraryProject/gen/com/example/ merged in with its own file.

This is inevitable, because the underlying resources are identified by an int, and as all of the resources need to be merged into that single numbering space, they all have to be renumbered to ensure there are no clashes.

But in turn, this means that the identifiers for the resources (@id/this_thing, @string/that_thing) should also have globally unique names, because they are sharing the same namespace. As the docs say:

Since the tools merge the resources of a library project with those of a dependent application project, a given resource ID might be defined in both projects. In this case, the tools select the resource from the application, or the library with highest priority, and discard the other resource.

So if your library has a string called "@string/name", the chances are that it will be hard
for a user of your library to get at the resource – and it will be hard to figure out why not. So
it's best to stick with "@string/my_lib_name".