In any automation procedure there is a trade-off between speed and quality. Coding text is no different, but Q has engineered a new way of coding that gets you the best of both worlds. Known as semi-automatic coding, our AI algorithms use language maps to find similar responses in a "smarter" way than merely searching for exact keywords. This helps you find what you are looking for faster, while still letting you manually control how things are coded. This article outlines some of these semi-automatic features.
Method 1 - Automatically Categorize your Data
Method 2 - Using Existing categories
Method 3 - Using the Auto function to Supplement Existing Categories
This article describes how to go from raw text data...
...to a state where the text responses have been semi-automatically categorized and can be used for further analysis:
Requirements
You will need a Text variable in order to perform manual coding. Text variables are represented by a small a next to the variable in the Variables and Questions tab:
Method 1 - Automatically Categorize your Data
- Select the text variable that you would like to code.
- Right-click and go to Insert Variables > Code Text > New Code Frame > Semi Automatic Categorization > Mutually Exclusive Categories OR Multiple Overlapping Categories. NOTE: the choice between Mutually Exclusive Categories (single response) and Multiple Overlapping Categories (multiple response) depends on the structure of your text data.
Semi-Automatic Categorization automatically creates an initial categorization for you.
Method 2 - Using Existing categories
By default, Semi-Automatic Categorization creates an initial categorization for you. If you would rather use existing categories and still make use of the features available in Semi-Automatic Categorization:
- Select the text variable that you would like to code.
- Right-click and go to Insert Variables > Code Text > New Code Frame > Semi Automatic Categorization > Mutually Exclusive Categories OR Multiple Overlapping Categories. NOTE: the choice between Mutually Exclusive Categories (single response) and Multiple Overlapping Categories (multiple response) depends on the structure of your text data.
- Click Cancel when you see this message:
The results are as follows:
- By default, Q automatically starts with one category: Missing Data.
- Read through the data to decide on your own categories, then add them to the category list on the right-hand side of the screen. To add a new category, right-click on the right side of the screen, select Add Category, and provide a name for the new category.
In this example, we added the following categories: Technology, I like it, Big.
You can then use our Sort by algorithms to help you manually code into these existing categories:
- Fuzzy match sorting uses keywords to find responses that are similar.
- Similarity to category will look at all the responses coded into a particular category and use those to find similar responses in the uncoded data. This algorithm also becomes "smarter" the more responses that are coded into a category.
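The two sort options above can be pictured with a small Python sketch. This is only an illustration of the general idea, not Q's actual algorithm: the function names are hypothetical, the fuzzy match is approximated with the standard library's difflib, and similarity to a category is approximated with simple word overlap.

```python
from difflib import SequenceMatcher


def fuzzy_score(keyword, response):
    """Best similarity (0 to 1) between the keyword and any word in the response."""
    words = response.lower().split()
    return max(
        (SequenceMatcher(None, keyword.lower(), w).ratio() for w in words),
        default=0.0,
    )


def fuzzy_sort(keyword, responses):
    """Order responses from most to least similar to the keyword."""
    return sorted(responses, key=lambda r: fuzzy_score(keyword, r), reverse=True)


def _tokens(text):
    return set(text.lower().split())


def category_score(coded_examples, response):
    """Highest word overlap (Jaccard) between the response and any coded example."""
    r = _tokens(response)
    best = 0.0
    for example in coded_examples:
        e = _tokens(example)
        union = r | e
        if union:
            best = max(best, len(r & e) / len(union))
    return best


def similarity_sort(coded_examples, uncoded):
    """Order uncoded responses by similarity to responses already in a category."""
    return sorted(uncoded, key=lambda r: category_score(coded_examples, r), reverse=True)


# Hypothetical responses, for illustration only.
responses = ["Great servce from staff", "It is big", "I like the technology"]
print(fuzzy_sort("service", responses))
```

Because the second function scores against responses already coded into a category, adding more coded responses gives it more examples to compare against, which is the sense in which such a sort can improve as coding progresses.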
Method 3 - Using the Auto function to Supplement Existing Categories
Note: The Auto function is only available in Q version 5.15 and onward.
There may be instances when you have existing categories, but still want the procedure to suggest other ones just in case you missed something. The Auto function can be used for this purpose.
- Select the text variable that you would like to code.
- Right-click and go to Insert Variables > Code Text > New Code Frame > Semi Automatic Categorization > Mutually Exclusive Categories OR Multiple Overlapping Categories. NOTE: the choice between Mutually Exclusive Categories (single response) and Multiple Overlapping Categories (multiple response) depends on the structure of your text data.
- Click Cancel when you see this message:
- By default, Q automatically starts with one category: Missing Data.
- To add a new category, right-click on the right side of the screen, select Add Category, and provide a name for the new category.
- In this example, we added the following categories: Technology, I like it, Big.
Perform Fuzzy Sort
Next, we need to identify the responses that can be categorized into these categories. We will use the Service category in this example.
- Type Service in the Fuzzy sort on box.
- Press Sort now.
Q will take some time to sort the items in the list according to their similarity to the term. In the screenshot below, we've done a fuzzy sort on the word "service". The orange bars show how similar each response is to the search term. The bars become narrower as the results become less exact.
The results are as follows:
Categorize the text
- Once you have identified the verbatim text responses to categorize, select one or multiple responses (using your Ctrl key).
- Click on the category or categories you wish to use on the right side of the screen. If you are coding multiple overlapping mentions, hold the Ctrl key to select multiple categories.
- OPTIONAL: If coding multiple overlapping mentions, press Categorize as. This will categorize the data into the selected categories and remove it from the list of all responses.
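As a rough mental model of the steps above, the following Python sketch shows what categorizing does to the data: the selected responses are assigned to the selected categories and removed from the list of uncoded responses. The function name, the data structures, and the mutually_exclusive flag are assumptions made for illustration and do not describe how Q stores a code frame.

```python
def categorize_as(uncoded, coded, selected_responses, selected_categories,
                  mutually_exclusive=False):
    """Assign the selected responses to the selected categories.

    uncoded: list of responses still waiting to be coded (modified in place)
    coded:   dict mapping each coded response to its set of categories
    """
    if mutually_exclusive and len(selected_categories) != 1:
        # Single-response coding allows exactly one category per response.
        raise ValueError("Mutually exclusive coding needs exactly one category")
    for response in selected_responses:
        coded.setdefault(response, set()).update(selected_categories)
        if response in uncoded:
            uncoded.remove(response)


# Hypothetical responses and categories, for illustration only.
uncoded = ["fast service", "big screen", "love it"]
coded = {}
categorize_as(uncoded, coded, ["fast service"], ["Service", "I like it"])
print(uncoded)
print(coded)
```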
Using the Auto function to add more categories
You can use the Auto function if you would like Semi-Automatic Categorization to suggest additional categories you may have missed.
- Set the Sort by: drop-down to Fuzzy match.
- Press the Auto button. This takes some time the first time you do it, as Q builds models in the background to automatically detect categories.
The results are as follows:
At this point, you will probably want to refine the categories. For example, you could combine categories, rename categories, or delete categories you don't want. For example:
- Once all coding assignments are complete, click the Save Categories button. A new variable will appear in your Data Sets tree with "Categorized" in the name:
Next
How to Do Automatic List Categorization of Text Data with Q
How to Back Code Other Specify Responses