- Create your project
- Upload your data or select an existing dataset
- Defining your project's details
- Invite Annotators
- Annotate and Train Your Model
So you've identified a task that could benefit from natural language understanding. Here we'll show you how to get up and running with a trained model on Humanloop by creating a project.
It's good to have a dataset of the sentences or documents you want to classify to hand. A guide on data formats is available here. We also provide some example datasets.
Create your project
When you first open the Humanloop application you'll see the following page. Go ahead and click Create Project. A project combines your data, your model and your annotators.
Follow the project creation wizard to:
- Define the name and instructions for your project.
- Define the input data required by your model.
- Define the output you want your model to produce (e.g. document classification or span extraction).
You can edit the project later.
Upload your data or select an existing dataset
There are two ways to upload data into the platform:
- As a .csv or .json file uploaded through the user interface.
- Directly through our REST API.
The simplest CSV format has one text document per row in a single column, with any labels or metadata in separate columns.
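As a rough illustration of that layout, the sketch below writes three documents to CSV text with Python's standard csv module. The column names text, label and source and the example rows are assumptions made for illustration; they are not names Humanloop requires.

```python
import csv
import io

# Hypothetical example rows. The column names "text", "label" and "source"
# are illustrative assumptions, not names required by Humanloop.
rows = [
    {"text": "The delivery arrived two days early.", "label": "positive", "source": "email"},
    {"text": "The app crashes, every time I log in.", "label": "negative", "source": "chat"},
    {"text": "Where can I find my invoice?", "label": "", "source": "email"},  # unlabelled row
]


def to_csv(records):
    """Serialise records to CSV text with one document per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["text", "label", "source"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

Using a CSV writer rather than joining strings by hand means that documents containing commas or quotes (like the second row) are quoted correctly.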
For full details on data upload refer to
Defining your project's details
Once you've uploaded data, you'll be shown a preview of an individual data point. You'll then be asked to specify which text field to use as input to the model and what the possible target labels are (your "labelling taxonomy").
For example, if you were classifying sentiment, the possible labels could be "positive", "negative" and "neutral".
If you uploaded data that already has some labels, Humanloop will automatically extract the possible labels.
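To get a feel for what extracting labels from partially labelled data involves, here is a minimal sketch that collects the distinct non-empty values of a label column. It is not Humanloop's implementation; the function name extract_labels and the label field name are hypothetical.

```python
def extract_labels(records, label_field="label"):
    """Return the distinct non-empty labels, in order of first appearance."""
    seen = {}
    for record in records:
        value = (record.get(label_field) or "").strip()
        if value:
            seen.setdefault(value, None)  # dicts preserve insertion order
    return list(seen)


# Hypothetical, partially labelled data.
records = [
    {"text": "Great service", "label": "positive"},
    {"text": "Never again", "label": "negative"},
    {"text": "It was fine", "label": ""},
    {"text": "Loved it", "label": "positive"},
    {"text": "Arrived on Tuesday", "label": "neutral"},
]

taxonomy = extract_labels(records)
print(taxonomy)  # ['positive', 'negative', 'neutral']
```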
You can edit the labelling taxonomy using the Edit button beside Output. You can also edit it after you create your project (more on this later).
If you have no initial labels for your data, you can choose between two types of model from the 'Output data' dropdown:
- New Classification - This will train a model that takes text as input and returns one of the possible labels along with its confidence. To make this model "multi-label" (i.e. the model returns one or more of the possible labels), toggle the Allow multiple labels to be selected button.
- New Span Tagging - This will train a model that takes text as input and returns spans of text as output (e.g. for Named Entity Recognition).
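The difference between these outputs can be sketched in a few lines. The scores, the 0.5 threshold, the Span class and the label names below are all illustrative assumptions; they show the shape of each kind of output, not the format Humanloop returns.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int  # index of the first character of the span
    end: int    # index one past the last character of the span
    label: str


def single_label(scores):
    """Classification: return the highest-scoring label and its confidence."""
    label = max(scores, key=scores.get)
    return label, scores[label]


def multi_label(scores, threshold=0.5):
    """Multi-label: return every label scoring at or above the threshold."""
    return [label for label, score in scores.items() if score >= threshold]


# Hypothetical per-label scores for one document.
scores = {"positive": 0.7, "negative": 0.1, "neutral": 0.6}
print(single_label(scores))  # ('positive', 0.7)
print(multi_label(scores))   # ['positive', 'neutral']

# Span tagging marks regions of the input text instead of labelling all of it.
text = "Ada Lovelace was born in London."
spans = [Span(0, 12, "PERSON"), Span(25, 31, "LOCATION")]
for span in spans:
    print(span.label, text[span.start:span.end])
```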
You can also add short instructions for annotators. If you have longer annotation guidelines, these can be uploaded after project creation.
Once you are happy with the project details, click Create Project to get started teaching the model through annotation.
Invite Annotators
If you're working as an individual, you can simply start annotating your data. As you do, the model will learn from you and give you feedback on its performance in real time.
However, it's very common to have a team of subject matter experts who help with annotation. If this is the case, you can invite annotators to the project by selecting the Invite others button, or from the annotation interface:
- Click on the Project button in the top right of the annotation interface.
- Click on the tab labelled Humans on the left bar.
- Add the email addresses of the annotators you would like on the project. These people will receive an email that takes them through to the labelling instructions and then the annotation page.
From this project page, you can also define long-form Annotation Guidelines for your annotators in Markdown. These will be accessible from the annotation interface.
Annotate and Train Your Model
Once your annotators have logged in, they will arrive at the annotation interface.
- On the right side of the page you will see one document alongside the possible labels. Use the keyboard shortcuts to select the most appropriate label and submit it to the model.
- On the left side of the page you will see your prioritised list of documents, which you can search by keyword. This is handy if you have some insight into which words correspond to which labels.
- There is also an option to flag or skip a document to come back to later.
As you label, Humanloop will train a model in real time and report back its current performance, along with model predictions for both labelled and unlabelled data.
It will also use active learning to prioritise the unlabelled data points it thinks are most valuable, so that you're always labelling the data that helps the model most.
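To make the idea of prioritisation concrete, the sketch below implements least-confidence uncertainty sampling, one common active learning strategy. This guide does not say which strategy Humanloop uses, so treat this as an illustration of the general technique; the document IDs and probabilities are made up.

```python
def prioritise(predictions):
    """Order unlabelled documents so the least confident predictions come first."""
    def confidence(item):
        # Confidence is the probability of the most likely label.
        return max(item["probabilities"].values())

    return sorted(predictions, key=confidence)


# Hypothetical model predictions for three unlabelled documents.
unlabelled = [
    {"id": "doc-1", "probabilities": {"positive": 0.95, "negative": 0.05}},
    {"id": "doc-2", "probabilities": {"positive": 0.55, "negative": 0.45}},
    {"id": "doc-3", "probabilities": {"positive": 0.20, "negative": 0.80}},
]

queue = [item["id"] for item in prioritise(unlabelled)]
print(queue)  # ['doc-2', 'doc-3', 'doc-1']
```

The document the model is least sure about (doc-2) is shown first, because a human label there is likely to teach the model more than a label on a document it already predicts confidently.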
When the performance of the model reaches a level that you're happy with, you're ready to deploy.
To edit labels as you go, select the edit labels link. You can add, remove or edit existing labels. Any changes you make are pushed to all annotators, so this is reserved for project owners.
It can also be helpful to review your own annotation work and the work of others, and to reassign annotation tasks to different users.
Click on the Project button in the top right of the annotation interface, select the Tasks and data tab and then select the View all tasks button.
This gives you a tabular view of all the annotation tasks associated with your project, which you can filter on various criteria and assign to different users.
To deploy your model, click on the blue Project button in the top right of the annotation interface and then click on the Integrations tab.
On this page you'll be given an API key that lets you request predictions from your model. See here for full deployment API docs.
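As a rough sketch of what a prediction request might look like, the snippet below builds (but does not send) a JSON POST request using Python's standard library. The endpoint URL, the X-API-KEY header name and the text field in the payload are all placeholders assumed for illustration; take the real values from the Integrations tab and the deployment API docs.

```python
import json
import urllib.request

# Both values are placeholders: copy the real API key from the Integrations
# tab and the real endpoint from the deployment API docs.
API_KEY = "YOUR_API_KEY"
PREDICT_URL = "https://example.com/hypothetical/predict"


def build_prediction_request(text):
    """Build (but do not send) a JSON POST request carrying one document."""
    body = json.dumps({"text": text}).encode("utf-8")  # payload shape is assumed
    return urllib.request.Request(
        PREDICT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The header name is an assumption; check the API docs.
            "X-API-KEY": API_KEY,
        },
    )


request = build_prediction_request("The delivery arrived two days early.")
print(request.get_method(), request.full_url)
# Sending it would look like: urllib.request.urlopen(request)
```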
That's it! Your model is hosted and ready to be used.