TOASTS
Official
Summary
We analyse the influence of multi-task learning strategies using task families on the English abstractive text summarization task. We combine tasks under one of three strategies, i.e., sequential, simultaneous, and continual multi-task learning, and evaluate the trained models on two downstream tasks. We find that certain combinations of task families (e.g., advanced reading comprehension and natural language inference) positively impact downstream performance. Further, we find that the choice and combination of task families influences downstream performance more than the training scheme, supporting the use of task families for abstractive text summarization.
We investigate the role of multi-task learning in English abstractive text summarization. To this end, we organize 18 pre-selected training tasks into six higher-level, modular task families. Further, we compare three training schemes for the pre-finetuning stage and their respective mixing strategies based on changes across multiple evaluation scores.
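To make the three training schemes concrete, the sketch below shows one plausible reading of them purely as batch-mixing orders over task families; it is not taken from the paper, and all family and task names except natural language inference and advanced reading comprehension are placeholders.

```python
import random

# Illustrative only: the paper uses 18 tasks in six families; the names below
# are placeholders except NLI and advanced reading comprehension, which are
# mentioned in the summary.
TASK_FAMILIES = {
    "natural_language_inference": ["nli_task_1", "nli_task_2"],
    "advanced_reading_comprehension": ["rc_task_1", "rc_task_2"],
    "other_family": ["task_a", "task_b"],
}

def mixing_order(scheme, steps_per_family=2, seed=0):
    """Yield (family, task) pairs according to one of the three schemes."""
    rng = random.Random(seed)
    families = list(TASK_FAMILIES)
    if scheme == "sequential":            # finish one family before the next
        for fam in families:
            for _ in range(steps_per_family):
                yield fam, rng.choice(TASK_FAMILIES[fam])
    elif scheme == "simultaneous":        # sample from all families at every step
        for _ in range(steps_per_family * len(families)):
            fam = rng.choice(families)
            yield fam, rng.choice(TASK_FAMILIES[fam])
    elif scheme == "continual":           # grow the sampling pool family by family
        pool = []
        for fam in families:
            pool.append(fam)
            for _ in range(steps_per_family):
                f = rng.choice(pool)
                yield f, rng.choice(TASK_FAMILIES[f])
    else:
        raise ValueError(f"unknown scheme: {scheme}")

for scheme in ("sequential", "simultaneous", "continual"):
    print(scheme, list(mixing_order(scheme)))
```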
Architecture

Pre-Training
TOASTS groups selected pre-training tasks into task families and explores the correlation of these families, their influence on two downstream tasks, and their aggregation through three training schemes. To this end, we use pre-finetuning, a second, inexpensive training stage between pre-training and fine-tuning, which was recently proposed by Muppet and further tested by ExT5. Pre-finetuning has two main parts:
- task family setup - groups tasks and their related datasets into broader families according to their primary objective
- training strategies - tasks from these families are then combined following a training strategy, and the resulting model is evaluated on a final downstream task (a minimal sketch follows below)
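A minimal sketch of what such a pre-finetuning stage could look like, assuming a shared encoder with one lightweight head per task family; the encoder, dimensions, and dummy data are placeholders and do not reflect the models or datasets used in the paper.

```python
import torch
import torch.nn as nn

# Placeholder mapping family -> number of output classes; the real setup uses
# 18 tasks in six families on top of a pre-trained sequence-to-sequence model.
FAMILY_NUM_CLASSES = {"natural_language_inference": 3, "reading_comprehension": 2}

class PreFinetuneModel(nn.Module):
    """Shared encoder (stand-in for a pre-trained model) plus per-family heads."""
    def __init__(self, in_dim=64, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict(
            {fam: nn.Linear(hidden, n) for fam, n in FAMILY_NUM_CLASSES.items()}
        )

    def forward(self, x, family):
        return self.heads[family](self.encoder(x))

model = PreFinetuneModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss_fn = nn.CrossEntropyLoss()

# Pre-finetuning loop: each step draws a batch from one task family and updates
# the shared weights; the order of (family, batch) pairs is what the training
# strategy (sequential / simultaneous / continual) controls.
for family, n_classes in FAMILY_NUM_CLASSES.items():
    x = torch.randn(8, 64)                     # dummy inputs
    y = torch.randint(0, n_classes, (8,))      # dummy labels
    loss = loss_fn(model(x, family), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# After pre-finetuning, the shared encoder would be fine-tuned on the final
# abstractive summarization task.
```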


Experiments


Performance


Further Readings