Labeling process using ChatGPT. The three inputs are Typed description, Reference distribution, and Prompt; the two outputs are Reason and Adjusted distribution. Note that the reference distribution is computed from the number of votes each emotion class receives. In the raw annotations of one example, there are votes for disgust, contempt, fear, neutrality, and happiness (×6), resulting in values of 0.6 for happiness and 0.1 for each of the other emotions that appear.
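The vote-counting step described in the caption can be illustrated with a minimal sketch. The function name `reference_distribution` and the label strings are assumptions made for illustration and are not taken from the original work; the sketch only assumes that each annotator vote counts equally and that classes with no votes are omitted.

```python
from collections import Counter


def reference_distribution(votes):
    """Turn raw annotator votes into a distribution over emotion classes.

    Hypothetical helper: each vote contributes equally, and the counts are
    normalized by the total number of votes.
    """
    counts = Counter(votes)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# The example from the caption: four single votes plus six votes for happiness.
votes = ["disgust", "contempt", "fear", "neutral"] + ["happiness"] * 6
dist = reference_distribution(votes)
```

With these ten votes, happiness receives 6/10 of the mass and each of the other four classes receives 1/10, matching the values given in the caption.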
Performance Improvement with ChatGPT Relabeling
The table presents macro-F1 scores obtained using the ChatGPT-relabeled data.