A friend recently asked me for my thoughts on how one would create a context-aware generative user interface system. The idea would be that the system would automatically generate a user interface based on various parameters that can be unique for that particular user - such as what the user is trying to do, the user's profile, the user's history with the system, etc. I've never built anything like this, but it got my gears turning. How could one build such a system?
I am not a data scientist, but I hang out with data scientists a lot, so these are all just ruminations, and most likely would not work as described. But perhaps it is a place to start.
Let's break the system into big chunks:
1. Let's assume we have lots of metrics about lots of user interfaces, such as data about what components typically get used in an interface, their purpose, where they are located on the UI, and how often they are used. Therefore, we have a way to describe a UI. This may go beyond simply registering the coordinates of a component, and may also include semantic information describing the component and any number of other things.
2. Next, we would need to find a way to classify our contexts. We would probably want to use unsupervised learning for this. The assumption here is that most "contexts" will settle into a finite number of clusters. That is, we want to find out whether there is a finite set of context types. Any given context might be unique, but it may be close to other similar contexts. The challenge is to automatically find these clusters.
3. The next task - which is probably the most difficult - is somehow defining and labeling exemplars of what a "good" UI looks like for each context cluster. This means we need to label the raw data somehow. One way to do this might be to collect ratings from actual users by putting some kind of feedback widget in existing UIs. Another way might be to infer the success of the UI by measuring things like how long a user spends on the page, whether they successfully performed an action, etc. Yet another approach might be a focus group, where potential users are asked to rate variations on a UI in terms of their appeal and efficacy. Each UI would need to be labeled with both the context type (the cluster to which it belongs) and a success score.
4. Once we have a corpus of labeled data, we can then build a model for each of the contexts that exist in our corpus. Each model would be able to rate the success (according to our success metric) of any new unseen UI. For example, the model for "Context Group 214" would be able to rate any new UI in that group with a success score.
5. Then, we can use a genetic algorithm to generate random UIs derived from a library of standard components, for each of our context clusters. The fitness score for each UI in a generation is how well it scores against the model we created in step 4. The top scoring UI of each generation becomes the parent for the next generation - that is, its attributes are passed on to the next generation. Each child of the next generation has a bit of randomness thrown in, with the hope that this randomness might allow it to get a better score against our model. The number of generations we want to run depends on how much time we have. One caveat with this is that you want to avoid getting stuck in local maxima - but that is part of the art of genetic algorithms.
6. When a new UI is needed, we use our context classifier to determine which context cluster the request belongs to, and use the pre-determined template from step 5 to produce the optimized UI for that context.
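To make the clustering and lookup steps a bit more concrete, here is a minimal sketch using a hand-rolled k-means. Everything specific in it is an assumption made for illustration: the two-number context vectors, the choice of k-means itself (any unsupervised clustering method could stand in), and the `templates` dictionary that plays the role of the pre-generated UIs.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(dim) / len(group) for dim in zip(*group))
    return centroids

# Hypothetical context vectors: (task complexity, user experience), both 0..1.
contexts = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.25),
            (0.8, 0.9), (0.9, 0.85), (0.85, 0.95)]
centroids = kmeans(contexts, k=2)

# Hypothetical pre-generated templates, one per discovered cluster.
templates = {i: f"template-for-cluster-{i}" for i in range(len(centroids))}

# A new request: find its cluster, then look up that cluster's template.
new_context = (0.82, 0.88)
cluster = nearest(new_context, centroids)
print(cluster, templates[cluster])
```

Note that k-means needs the number of clusters up front, which is exactly the thing we said we wanted to discover, so a real system would probably try several values of `k` or use a method that infers the count.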
The end result of all this would be a set of auto-generated UIs, one for each context discovered by the unsupervised learning process, each optimized for whatever our success criteria might be.
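The generation loop that produces those UIs could look roughly like the sketch below. It follows the scheme described above (the single best UI of a generation becomes the parent of the next, and children are mutated copies), which is closer to a simple evolution strategy than to a full genetic algorithm with crossover. The component library, the four-slot layout, and especially `score_ui` are hypothetical stand-ins: in the real system the score would come from the trained per-context model, not a hard-coded target.

```python
import random

COMPONENTS = ["search", "nav", "form", "chart", "help"]  # hypothetical library
SLOTS = 4  # hypothetical: a UI is an ordered list of four component slots

def score_ui(ui):
    """Stand-in for the per-context model (hypothetical).

    Rewards a made-up target layout so the loop has something to climb.
    """
    target = ["nav", "search", "form", "help"]
    return sum(1 for got, want in zip(ui, target) if got == want)

def random_ui(rng):
    """A UI with a random component in every slot."""
    return [rng.choice(COMPONENTS) for _ in range(SLOTS)]

def mutate(ui, rng, rate=0.25):
    """Copy the parent, re-rolling each slot with probability `rate`."""
    return [rng.choice(COMPONENTS) if rng.random() < rate else c for c in ui]

def evolve(generations=50, children=20, seed=1):
    rng = random.Random(seed)
    parent = random_ui(rng)
    for _ in range(generations):
        brood = [mutate(parent, rng) for _ in range(children)]
        # Assumption: the parent stays in the running, so the best score
        # never drops from one generation to the next.
        parent = max(brood + [parent], key=score_ui)
    return parent

best = evolve()
print(best, score_ui(best))
```

With a single parent and small mutations, this loop is prone to exactly the local-maxima problem mentioned earlier; keeping several parents per generation or raising the mutation rate are the usual ways to loosen that.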
As we change the parameters in the above process, we would get different results, such as a different number of context clusters.
A variation on this might be to combine steps 3 and 5 by doing some automated, randomized A/B testing. We could introduce a slight bit of randomness into the UIs shown to actual users, and then score each variant against previous generations using whatever metric we use to define success, such as how long the user takes to perform some action. UIs that score worse are thrown out; UIs that score better are used as the basis for the next generation. In theory, the UI would optimize over time. The danger here, though, is that we may show actual users some randomness that ruins their experience, and our brand.
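Here is a rough sketch of that live-testing variation, with simulated users standing in for real ones. The UI parameters (`button_px`, `menu_items`), the timing formula inside `simulated_task_seconds`, and the session counts are all invented for illustration; the only part taken from the idea above is the loop itself: perturb slightly, measure, keep the variant only if it does better.

```python
import random
import statistics

def simulated_task_seconds(ui, rng):
    """Hypothetical stand-in for one real user session.

    Made-up rule: users are fastest with a 48px button and 5 menu items;
    Gaussian noise is added on top.
    """
    base = 10 + abs(ui["button_px"] - 48) * 0.2 + abs(ui["menu_items"] - 5) * 1.5
    return max(0.0, rng.gauss(base, 1.0))

def perturb(ui, rng):
    """Return a copy of `ui` with one small random change."""
    variant = dict(ui)
    if rng.random() < 0.5:
        variant["button_px"] = max(16, variant["button_px"] + rng.choice([-4, 4]))
    else:
        variant["menu_items"] = max(1, variant["menu_items"] + rng.choice([-1, 1]))
    return variant

def mean_time(ui, rng, sessions=200):
    """Average task time over a batch of simulated sessions."""
    return statistics.mean(simulated_task_seconds(ui, rng) for _ in range(sessions))

def optimize(ui, rounds=40, seed=2):
    rng = random.Random(seed)
    for _ in range(rounds):
        challenger = perturb(ui, rng)
        # Measure both variants in the same round; keep the faster one.
        if mean_time(challenger, rng) < mean_time(ui, rng):
            ui = challenger
    return ui

start = {"button_px": 24, "menu_items": 10}
result = optimize(start)
print(start, "->", result)
```

Keeping each perturbation small (one parameter, one step) is one way to limit how bad any single user's experience can get, at the cost of slower convergence.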
Of course, such a system would be mightily difficult to build well, in such a way as to make UIs that are competitive with a human designer. But, perhaps it could be used as a tool to give UI designers a starting point, or insight into different approaches that they might not have considered otherwise.
It looks like the Firefox team experimented with something similar at some point. Their results are dubious. (Hint: check the date of publication)