Here's the problem with using an HTML solution.
The configuration API uses XML to fetch and store parameters. A typical configuration interface is a collection of controls and images, although, since the author has the full power of Flash available, it could include sound and animation as well.
Replacing this with an HTML-based interface would mean that the author would have to submit a collection of files, probably a mix of HTML, JavaScript, and images. The JavaScript would be required mainly to provide the pack/unpack code needed to support the XML-based API, but also to manage the UI entirely on the client.
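To give a rough idea of what that pack/unpack code involves, here is a minimal sketch. It's in Python rather than JavaScript purely to keep it short, and the element names (`widget_parameters`, `widget_parameter`, `name`, `value`) are made up for illustration - they are not the actual schema used by the configuration API.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def pack_params(params: Dict[str, str]) -> str:
    """Serialize name/value pairs into an XML string (hypothetical schema)."""
    root = ET.Element("widget_parameters")  # assumed root element name
    for name, value in params.items():
        param = ET.SubElement(root, "widget_parameter")
        ET.SubElement(param, "name").text = name
        ET.SubElement(param, "value").text = value
    # ElementTree escapes characters such as & and < in text content.
    return ET.tostring(root, encoding="unicode")


def unpack_params(xml_text: str) -> Dict[str, str]:
    """Parse the XML produced by pack_params back into a dictionary."""
    root = ET.fromstring(xml_text)
    return {
        param.findtext("name", default=""): param.findtext("value", default="")
        for param in root.iter("widget_parameter")
    }


if __name__ == "__main__":
    # Round trip: what the UI would send, and what it would read back.
    saved = pack_params({"zip_code": "92121", "caption": "Tom & Jerry"})
    print(saved)
    print(unpack_params(saved))
```

An HTML-based widget would need the equivalent of both functions in client-side script, plus the code to send and receive the XML.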
To show this on our site, we'd need to mix this HTML into our page, and the JavaScript would be running from our security domain. Those of you who are familiar with web site security can immediately see the profound danger in this - we would be allowing arbitrary JavaScript code from an untrusted third party to run on our site.
One approach that Facebook has adopted is to require the author to use a controlled markup dialect (which they call FBML) and severely restrict the capabilities of the widget. This is a possibility - at one point we were looking at adopting a high-level XML-based configuration dialog description which could be transcoded to HTML, Flash, whatever. However, we found that the bulk of what a configuration widget does is *functional*, which meant that we were going to need support for *some* programming language that could be executed entirely on the client.
So - what client-side programming system could we pick that would maximize availability on the user's platform? The choices come down to Flash or Java applets. Our widget authors are already using Flash, it's a much richer media environment, and the use of Flash means that there's a strong possibility that the configuration widgets can eventually run on the device itself.
This does, however, put devices that don't have Flash at a disadvantage, but there's simply no solution that works for everyone. Even an HTML-based solution would require capabilities beyond the browsers included in many phones - a full DOM with JavaScript, XMLHttpRequest, etc.
When this decision was made, the iPhone did not exist, but even if it had, I'm not sure we would have made a different choice. We're disappointed that Apple has not chosen to make Flash available on their device - hopefully that will change.
As far as Linux is concerned, as you probably know, we're pretty big Linux fans around here. We expect this missing cut/paste functionality will be addressed, as many such inconveniences eventually are.