FormExtractor
From Proof-of-Concept to MVP, and beyond
Project Brief
FormX is a data extraction tool that turns physical documents into structured data. To accommodate our customers' ever-growing use cases and differentiate our product, we developed a self-serve model-building tool that lets users create and train their own extraction models.
Project Team
Me, design
Fung, Product Manager
Jason, Senior ML Developer
Ben, Junior Developer
Platform
Web
Duration
Feb 2021 - Jun 2021
My role
From 0 to 1, I designed three iterations to test our hypothesis and go to market. I focused on the logic of how to create a custom model, train it with samples, and use it across our platform. I designed the user flow, wireframes, and UI for this feature of our product.
Goal: Making AI accessible for everyone
FormX works with many businesses to automate their processes. Previously, we had to collect document samples from our customers, understand what they needed to extract, label all the samples, train the model, and then add the model individually to each customer's account.
What if we could let customers train the model themselves?
Faking the experience
With our proof of concept, we wanted to test whether:
Customers will want to create custom models themselves
Customers' needs are nuanced enough to warrant a custom model builder
Customers are willing to provide samples, some of them confidential, to train the model
We didn't need customers to actually create and train the model to test these assumptions, so we faked the experience by asking them to supply the samples and specify the information to extract. Our engineers then created the models for them.
Training the model (and our customer)
The most important part of model building is training the model, which requires adequate data and labeling. From our proof of concept, we learned that we must ensure our customers upload enough high-quality samples, or else the model will be inaccurate.
During our proof of concept, we asked some of our customers to label their samples using an open-source annotation tool. But most of them had never labeled sample data before, so we now guide them step by step through labeling their samples.
To structure or not to structure
Another insight from our proof of concept is that many of our customers' extraction requests involve structured entries (e.g. tables, lists). Therefore, we added tables alongside single-field data to help customers extract structured entries.
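To make the distinction concrete, here is a minimal sketch of how a custom model's extraction targets might be described with both single fields and tables. This is an illustration only, not FormX's actual data model: the names FieldSpec, TableSpec, ModelSchema, empty_result, and validate_row are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FieldSpec:
    """A single value to extract, e.g. an invoice number (hypothetical)."""
    name: str


@dataclass
class TableSpec:
    """A repeating, structured entry with named columns (hypothetical)."""
    name: str
    columns: List[str]


@dataclass
class ModelSchema:
    """Everything one custom model is asked to extract (hypothetical)."""
    fields: List[FieldSpec] = field(default_factory=list)
    tables: List[TableSpec] = field(default_factory=list)

    def empty_result(self) -> Dict[str, object]:
        # Single fields start as None; tables start as an empty list of rows.
        result: Dict[str, object] = {f.name: None for f in self.fields}
        for table in self.tables:
            result[table.name] = []
        return result


def validate_row(table: TableSpec, row: Dict[str, str]) -> bool:
    # A row is valid only if it has exactly the columns the table declares.
    return set(row) == set(table.columns)
```

A schema for an invoice could then declare one FieldSpec for the invoice number and one TableSpec for the line items, so each extracted line item is a row rather than a set of loosely related single fields.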
Turning Concept Into Product
The custom model extraction feature has gone through three major iterations to gradually address the customer's full range of needs.
Since the launch of version 1.0, we have continued to work on issues in our backlog and on additional features for specific customers' use cases, most prominently replacing the embedded open-source labeling tool with our in-house annotation tool and adding image recognition. Both features are still under development.
Outcome and Impact
The custom model builder differentiates FormExtractor from its competition and increases usage among customers who adopt it. Custom models have become a building block for solving more nuanced and complex extraction tasks without our engineers having to build custom features on top of our current suite.
3.7x
More API calls than customers not using custom model extraction
3/17
competitors offer custom model building
What I learned
Fake it till you really need to make it
When testing a hypothesis, it is important to strip away anything that is not necessary for the test, including functionality that seems to be the core of the feature. For example, although we were building a custom model builder, the ability to actually build a custom model was not necessary to test whether customers would use it.
Design + engineering = more possibility
When tackling the challenge of low-quality samples, my first idea as a designer was to give users feedback when they uploaded low-quality samples. But some users told us that these were all the samples they had. When I talked to our engineers, they proposed processing the samples to enhance them to an acceptable level and tweaking the model to require fewer samples for similar extraction accuracy, sparing users the friction of re-uploading samples or finding new ones. This solution required not only design but also engineering effort.
Balance between sequence and flexibility when working with an agile development team
I designed in parallel with the development sprints, so I had to adopt the team's agile workflow. At first, I wanted to flesh out the product from start to end before handing it off. However, the development team could not wait for me to finish. Instead, I first drew very rough wireframes of the entire feature and then worked on the features sprint by sprint. The rough wireframes gave my PM and me a comprehensive sense of the product, which helped us maintain consistency while retaining the flexibility to change the flow of features or the product during the sprints.